Which Build Tool For A Bootstrappable Project?

Posted on April 1, 2024 by Jack Kelly
Tags: coding, c, autotools, meson, cmake

I have a nascent side project which is intended to participate in a bootstrap chain. This means it shouldn’t depend on too many things, that the transitive closure of its build dependencies must also be small, and at no point in the process should any build depend on an opaque binary blob.

Choices on the language side are pretty constrained. Zig is currently not a candidate (despite the language itself being rather promising), because it has removed its C++-based bootstrap in favour of keeping a WASM-based build of a previous compiler version. It’s great that their compiler output is reproducible — Zig-built-by-Zig is byte-for-byte identical with Zig-built-via-WASM — but for now, it’s not truly bootstrappable. (Andrew Kelley says he hopes someone writes a Zig compiler in C when Zig stabilises. I sincerely hope this happens.)

Rust is right out, for reasons described in the Zig article:

“Use a prior build of the compiler - This is the approach taken by Rust as well as many other languages.

“One big downside is losing the ability to build any commit from source without meta-complexity creeping in. For example, let’s say that you are trying to do git bisect. At some point, git checks out an older commit, but the script fails to build from source because the binary that is being used to build the compiler is now the wrong version. Sure, this can be addressed, but this introduces unwanted complexity that contributors would rather not deal with.

“Additionally, building the compiler is limited by what targets prior binaries are available for. For example, if there is not a riscv64 build of the compiler available, then you can’t build from source on riscv64 hardware.

“The bottom line here is that it does not adequately support the use case of being able to build any commit on any system.”

As far as I can see, the best choice for writing bootstrap-related software in 2024 is still C99, with as few dependencies as possible. Any (hopefully few) necessary dependencies should also be bootstrappable, written in C99 and ideally provide pkg-config-style .pc files to describe the necessary compiler/linker flags. But at least there are several C compilers as well as several implementations of pkg-config (the FreeDesktop one, pkgconf, u-config, etc.).
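
A minimal sketch of what that interchangeability looks like in practice, assuming a POSIX shell: the build script takes its flags from whichever pkg-config implementation the user points it at, and never hard-codes the command name. The `PKG_CONFIG` variable is the conventional override; `dep_flags`, `libfoo`, and `main.c` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Let the caller choose any pkg-config implementation via $PKG_CONFIG;
# fall back to whatever "pkg-config" is on PATH.
PKG_CONFIG=${PKG_CONFIG:-pkg-config}

# dep_flags NAME: print the compiler and linker flags for dependency NAME.
# Fails (non-zero exit) if the chosen implementation cannot find NAME's .pc file.
dep_flags() {
    "$PKG_CONFIG" --cflags --libs "$1"
}

# Hypothetical usage (libfoo and main.c are placeholders):
#   cc -std=c99 -o main main.c $(dep_flags libfoo)
```

Because the function only relies on `--cflags` and `--libs`, swapping one implementation for another should be a matter of setting `PKG_CONFIG` before running the build.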

Since we are compiling C, what should we use for the build system? Autotools is under scrutiny again in the wake of the xz-utils compromise, as code to trigger the payload was smuggled into the dist tarball as “autotools junk” that nobody looks at. Should bootstrappable projects still use autotools, or is there something better in 2024?

Autotools Is A Pain, But Not As Much As You Think

I have historically used automake (and the other autotools) as my go-to build environment for C projects. For simple projects (example: MudCore), it’s really not that much effort to set up, and you get a lot for free:

Many handwritten Makefiles fail at least one of these criteria.

That said, the autotools do have their warts, both for developers and users:

The Best Alternative Is meson And I Don’t Like It

I asked online for recommendations for a modern, portable build system for C. CMake is absolutely out: I still haven’t forgiven it for deciding that the space-separated output of pkg-config meant it was a “list”, then emitting that list as a semicolon-separated string into my build commands. CMake was a good intermediate step and brought some good ideas (interactive configuration, enforced out-of-tree builds), but I think we’ve surpassed it now.
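
To make that failure mode concrete, here is a small shell sketch (not CMake itself) of what happens when space-separated flags are treated as a list and then joined with semicolons. The flag values are hypothetical.

```shell
#!/bin/sh
# Flags as pkg-config prints them: one space-separated string.
flags='-I/opt/foo/include -L/opt/foo/lib -lfoo'

# Splitting that string into a "list" and joining the elements with ';'
# produces the following text.
as_list=$(printf '%s\n' "$flags" | tr ' ' ';')

printf 'pkg-config output: %s\n' "$flags"
printf 'joined as a list:  %s\n' "$as_list"

# If the joined text is pasted literally into a build command, e.g.
#   cc -std=c99 main.c -I/opt/foo/include;-L/opt/foo/lib;-lfoo
# a POSIX shell ends the cc command at the first ';' and then tries to
# run "-L/opt/foo/lib" and "-lfoo" as commands of their own.
```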

Meson seems to be the lead recommendation, and a lot of the GTK/Gnome-verse packages seem to have switched over to it. I think it has a lot of shortcomings with regard to writing bootstrappable software, and I think these are serious concerns for software projects in general:

I don’t want this to just be a rant. There is a lot I genuinely like about meson’s design:

So, that’s it, then? Meson seems to have a similar number of warts, and a much more attractive feature set? Well, it’s complicated. I consider its finite list of compilers, its Python dependency, and the transitive closure of its bootstrap footprint to be pretty severe drawbacks. There is a possible answer in the muon and Samurai projects, which are C99 reimplementations of meson and ninja, respectively. Do they get us out of the woods? Yes and no:

Where Does That Leave Us?

I think there are two decent paths forward, if one cares about writing bootstrappable software:

  1. Use Meson as the build language, but do all regular development using muon and Samurai. This ensures that the project is at least buildable and installable as part of a lean bootstrap chain. Use meson to run “project administration”-type tasks, like generating documentation and dist tarballs. Remember to generate .tar.gz tarballs, so that bootstrappers don’t have to get all the way to xz-utils just to start the build.

  2. Go back to automake and eat the additional warts in exchange for a guaranteed small set of dependencies. Despite everything, it still works well. Use non-recursive Makefile.am where you can (with include, %reldir%, and %canon_reldir%) and SUBDIRS where you must.

    I can think of a couple of ways to harden the build system against the sort of subversion that Jia Tan pulled:

    • For developers: stop passing --install to aclocal (i.e. in Makefile.am’s ACLOCAL_AMFLAGS). This prevents aclocal from copying the m4 files it uses into your build tree, which stops make dist from bringing them into your release tarballs. End users will still be able to build the package, but developers changing the build system will need to install autoconf-archive. That’s no problem; it’s available in most distros.

    • For distro packagers: unconditionally re-bootstrap the package (using autoreconf or similar) before building. People working on bootstrappable builds routinely do this, but I think it’s now something everyone should be doing. A package whose build system can’t be bootstrapped is nearly unmaintainable, and re-bootstrapping reduces the likelihood of surprises sneaking into the configure script or other generated files. I don’t know what’s going on in Pythonland, but they seriously tell people to regenerate configure using a Makefile target which runs something in Docker, which sounds utterly bonkers to my ears. Requiring simple, single-command bootstrap of packages’ build systems should stop things from getting too wild.

      autoreconf-ing everything unfortunately puts more work onto distro build farms, but I see many packages in nixpkgs doing it anyway, because of the patches they apply to their packages. It’s probably a price worth paying just for the peace of mind.
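
The packager-side step above could be sketched as a small shell helper. This is only an illustration, assuming a package whose build system lives in a top-level configure.ac; `rebootstrap` is a hypothetical name, and the `AUTORECONF` override is my own addition so that the command can be swapped or stubbed out.

```shell
#!/bin/sh
# rebootstrap DIR: regenerate configure and the other generated build files
# in DIR from configure.ac, instead of trusting the copies in the tarball.
rebootstrap() {
    dir=$1
    if [ ! -f "$dir/configure.ac" ]; then
        echo "rebootstrap: no configure.ac in $dir" >&2
        return 1
    fi
    # --force:   treat all generated files as out of date and remake them
    # --install: copy in any missing auxiliary files (install-sh and so on)
    # --verbose: report each tool as it runs
    (cd "$dir" && "${AUTORECONF:-autoreconf}" --force --install --verbose)
}

# Hypothetical usage in a package build script:
#   rebootstrap ./some-package-1.0 && (cd ./some-package-1.0 && ./configure && make)
```

Running the helper in a subshell keeps the caller's working directory unchanged, so it can be dropped into an existing build script without side effects.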

Whether using Meson or autotools, it’s probably also worth thinking about building distro packages against VCS release tags instead of tarballs. xz-utils, being an autotooled package, had a lot more dark corners for Jia to hide his payload, but there’s really nothing autotools-specific about this attack. Building against VCS tags also makes it possible to detect when upstream force-pushes over a release tag, which is a thing some maintainers do.
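
A minimal sketch of the force-push check, assuming the packager records the commit hash next to the tag name when the package is first built. `tag_commit` and `tag_unchanged` are hypothetical helper names.

```shell
#!/bin/sh
# tag_commit REPO TAG: print the commit that TAG currently points at in the
# clone at REPO. Prints nothing and fails if the tag does not exist.
tag_commit() {
    git -C "$1" rev-parse --verify --quiet "refs/tags/$2^{commit}"
}

# tag_unchanged REPO TAG EXPECTED: succeed only if TAG still resolves to the
# commit hash that was recorded at packaging time.
tag_unchanged() {
    [ "$(tag_commit "$1" "$2")" = "$3" ]
}

# Hypothetical usage, with the recorded hash kept in the package recipe:
#   tag_unchanged ./upstream-clone v1.2.3 "$recorded_hash" \
#       || echo "v1.2.3 no longer points at the commit we packaged" >&2
```

The check is meant to run against a fresh clone, so that a tag left over from an earlier fetch cannot mask a change upstream.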

Copyright © 2024 Jack Kelly