It’s really easy to misuse lazy I/O (e.g., hGetContents) in nontrivial
Haskell programs. You can accidentally close a Handle before the
computation which reads from it has been forced, and it’s hard to
predict exactly when data will be produced or consumed by IO actions.
Streaming libraries in Haskell avoid these problems by explicitly
interleaving the yielding of data and execution of effects, as well as
helping control the memory usage of a program by limiting the amount of
data “in flight”.
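
To make the closed-Handle pitfall concrete, here is a minimal sketch that uses only base. The names readLazily and readStrictly are hypothetical, chosen for illustration, and the strict variant is just one simple workaround, not what a streaming library does internally.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Pitfall: hGetContents returns a lazy String immediately, and withFile
-- closes the Handle as soon as the callback returns. Nothing has been
-- read yet, so the caller is left holding a String backed by a closed
-- Handle.
readLazily :: FilePath -> IO String
readLazily path = withFile path ReadMode hGetContents

-- Workaround: force the whole String while the Handle is still open.
-- This is safe, but it holds the entire file in memory at once.
readStrictly :: FilePath -> IO String
readStrictly path = withFile path ReadMode $ \h -> do
  contents <- hGetContents h
  _ <- evaluate (length contents) -- walks the String, reading to EOF
  pure contents
```

Forcing the result of readLazily after it returns goes wrong in a version-dependent way: as far as I know, recent versions of base raise an I/O error when the string is demanded, while older ones silently yield an empty string. readStrictly avoids that at the cost of reading everything up front, which is exactly the memory trade-off that streaming libraries are meant to improve on.
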
A number of veteran Haskellers have built streaming libraries, and
off the top of my head I’m aware of conduit, io-streams, iteratee,
machines, pipes, streaming, and streamly. Of those, I think conduit,
pipes, streaming, and streamly are the most commonly used ones today.
It can be hard to know which library to choose when there are so many
options, so here is my heuristic:
conduit); or
streaming.
I’ll explain why after the jump.
I have a nascent side project which is intended to participate in a bootstrap chain. This means it shouldn’t depend on too many things, that the transitive closure of its build dependencies must also be small, and at no point in the process should any build depend on an opaque binary blob.
Choices on the language side are pretty constrained. Zig is currently not a candidate (despite the language itself being rather promising), because it has removed its C++-based bootstrap in favour of keeping a WASM-based build of a previous compiler version. It’s great that their compiler output is reproducible — Zig-built-by-Zig is byte-for-byte identical with Zig-built-via-WASM — but for now, it’s not truly bootstrappable. (Andrew Kelley says he hopes someone writes a Zig compiler in C when Zig stabilises. I sincerely hope this happens.)
Rust is right out, for reasons described in the Zig article:
Use a prior build of the compiler - This is the approach taken by Rust as well as many other languages.
One big downside is losing the ability to build any commit from source without meta-complexity creeping in. For example, let’s say that you are trying to do git bisect. At some point, git checks out an older commit, but the script fails to build from source because the binary that is being used to build the compiler is now the wrong version. Sure, this can be addressed, but this introduces unwanted complexity that contributors would rather not deal with.
Additionally, building the compiler is limited by what targets prior binaries are available for. For example, if there is not a riscv64 build of the compiler available, then you can’t build from source on riscv64 hardware.
The bottom line here is that it does not adequately support the use case of being able to build any commit on any system.
As far as I can see, the best choice for writing bootstrap-related
software in 2024 is still C99, with as few dependencies as possible. Any
(hopefully few) necessary dependencies should also be bootstrappable,
written in C99 and ideally provide pkg-config-style .pc files to
describe the necessary compiler/linker flags.
But at least there are several C compilers as well as several
implementations of pkg-config (the FreeDesktop one, pkgconf, u-config,
etc.).
Since we are compiling C, what should we use for the build system?
Autotools is under scrutiny again in the wake of the xz-utils
compromise, as code to trigger the payload was smuggled into the
dist tarball as “autotools junk” that nobody looks at. Should
bootstrappable projects still use autotools, or is there something
better in 2024?