Reproducible builds mean that rustc always generates bit-for-bit identical
output for any given source code and configuration. It doesn’t do that today,
but it could, and will need to eventually.
Reproducible builds would help us ensure the security of Rust’s build
environment by allowing independent verification that the source code
for a Rust release produces the binaries we say it does.
I want people to be able to take our source code, run x.py verify-release,
and have the build system run x.py dist, then verify that the result is
the same one rust-lang signed and published.
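To make the comparison step concrete, here is a minimal sketch in Python (the language x.py is written in). It assumes the published release comes with a manifest of SHA-256 digests in sha256sum-style "digest  filename" lines; that manifest format, the function names, and the directory layout are all hypothetical, and signature checking is left out entirely.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text):
    """Parse 'hexdigest  filename' lines into {filename: digest}.

    The sha256sum-style layout is an assumption, not a format rust-lang
    is known to publish.
    """
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_release(dist_dir, manifest_text):
    """Compare locally built artifacts against the published digests.

    Returns {filename: status}, where status is 'ok', 'mismatch',
    or 'missing'.
    """
    dist_dir = Path(dist_dir)
    results = {}
    for name, digest in parse_manifest(manifest_text).items():
        artifact = dist_dir / name
        if not artifact.is_file():
            results[name] = "missing"
        elif sha256_of(artifact) == digest:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

A real verify-release step would also have to check the signature on the manifest itself before trusting any digest in it.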
With that we can set up independent build environments that confirm
that the official binaries were built, without tampering, by the
previous official binaries. We could provide a tool that would just
deploy the whole setup to EC2 and start verifying what the official
build machines are doing.
Having such a tool would give us more confidence in the integrity of
the compiler. The more releases in a row that other people verify,
the more confident people can be that the compiler is not subject
to a backdoor inserted by a compromised build server.
I could imagine that once we have independent verification of the
rustc builds going forward, we could also put together a project to
reconstruct the complete rustc bootstrap, from the original OCaml compiler
up to the point where it joins the binary-reproducible history. And then we would
have some strong confidence that there are no backdoors in rustc.
If this sounds cool to you, then why not go make rustc builds reproducible?
This is definitely a good idea, not just for security reasons.
Note that the continuous testing that we do for incremental compilation (https://travis-ci.org/rust-icci) already relies on binary reproducibility of LLVM bitcode and object files. This has worked reliably on Windows, Linux, and macOS for a couple of months now.
I’m not sure about crate metadata, but there have been some changes that should make it pretty much deterministic. This needs to be verified, though.
And then there’s the linking step. There’ve been reports that this might not be entirely deterministic on all platforms. But it could also just have been unstable crate metadata being misinterpreted as the linker’s fault.
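One way to start telling those two cases apart is to look at where two builds of the same binary actually differ. The sketch below is a hypothetical helper, not an existing tool, and only reports the differing byte ranges. Mapping those offsets back to a metadata section or to linker-generated data would need an object-file parser on top of it.

```python
def differing_ranges(a, b, max_ranges=20):
    """Return half-open (start, end) byte ranges where a and b differ.

    If the inputs have different lengths, the extra tail of the longer
    one is reported as a final range. At most max_ranges are returned.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
            if len(ranges) >= max_ranges:
                return ranges
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b) and len(ranges) < max_ranges:
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

Usage would look like differing_ranges(open(x, "rb").read(), open(y, "rb").read()) for two outputs of the same build step. A byte-by-byte loop in pure Python is slow on large binaries, so treat this as something for experimentation only.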
Yes, a call to action! We need a reproducible builds hero.
Ultimately, what we need is for our infrastructure to verify that two bootstraps of the same compiler produce the same installer tarballs, the same .msi files, and the same .pkg files. So there’s a clear goal, and I suspect a number of obstacles standing in the way.
If I were to start, I’d probably just do some experimentation to see what is and is not reproducible, then think about how to set up test cases to verify reproducibility at various levels.
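As a rough starting point for that experimentation, something like the following could compare the output directories of two independent builds and sort the files into reproducible and non-reproducible ones. It is only a sketch: the function names are made up, and it assumes both builds have already been run into separate directories.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_builds(first, second):
    """Classify the files of two build output trees by reproducibility."""
    a, b = tree_digests(first), tree_digests(second)
    shared = a.keys() & b.keys()
    return {
        "only_in_first": sorted(a.keys() - b.keys()),
        "only_in_second": sorted(b.keys() - a.keys()),
        "differing": sorted(n for n in shared if a[n] != b[n]),
        "identical": sorted(n for n in shared if a[n] == b[n]),
    }
```

Running this at different levels (object files, rlibs, final tarballs) would show which layer first introduces a difference, and the "differing" list could become the basis for a regression test once it is empty.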