Previous discussion (2016): Verifying rustc releases with reproducible builds
I'm creating this topic to share my understanding of the problem and the approaches available for building this capability. I'd also like to understand whether this is worth doing, as I don't have a handle on the complexity of the problem or on the dependencies outside the Rust source tree that could affect reproducibility.
By a reproducible build of rustc I mean that if I gave you a copy of the Rust source archive, the details of my build environment, and the build steps I used to create rustc, you should be able to create the same (i.e. bit-for-bit identical) rustc. This addresses the question of whether I, as a compiler distributor, can be trusted, and whether my build environment accidentally or maliciously inserted code or functionality that was not present in the Rust compiler source.
As a user of rustc, you (or other independent third parties) can verify that the binaries I distribute match what the source and my stated build process produce. Note, however, that a reproducible build gives no assurances about the code of the rustc compiler itself. If someone inserted a malicious backdoor into the source code of the Rust compiler (or a library that it depends on), having a reproducible build would not fix the issue. The assumption is that the source code for the rustc compiler has already been reviewed by other means.
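To make the bit-for-bit check concrete, here is a minimal sketch of how a third party might compare their own build output against a distributor's. It only hashes files and reports mismatches. The function names and the idea of comparing two unpacked directory trees are my own assumptions for illustration, not an existing Rust project tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(mine: Path, yours: Path) -> list[str]:
    """List relative paths that are missing on one side or differ in content."""
    a, b = tree_digests(mine), tree_digests(yours)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differing_files` means the two trees are bit-for-bit identical in file content. Anything else is a starting point for a closer look with a tool such as diffoscope.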
A number of folks have already done some work in this area (see: Testing out reproducible builds - The Rust Programming Language Forum). Sadly the test bench referenced in the discussion is no longer available.
More discussions here:
The Debian project has a set of tools (reprotest and diffoscope) that can help with checking how changes in environment, date/time, filesystem ordering, etc. affect a build process. The tooling is Debian/Linux specific and I currently don't have access to a beefy Linux instance to test it. In my initial testing on macOS, I couldn't get reprotest to work as it looks like it has a hard dependency on dpkg, even if you're not building Debian packages.
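For anyone unfamiliar with the idea behind this kind of tooling, the sketch below shows the general technique of building twice under deliberately different conditions and comparing the results. It is not how reprotest is implemented and does not use its interface. The variation list, the function names and the single-artifact assumption are all hypothetical choices of mine.

```python
import hashlib
import os
import subprocess
import tempfile
from pathlib import Path

# Hypothetical variations (timezone and locale only); real tools vary far more.
VARIATIONS = [
    {"TZ": "UTC", "LANG": "C"},
    {"TZ": "Pacific/Kiritimati", "LANG": "fr_FR.UTF-8"},
]


def build_once(command: list[str], artifact: str, extra_env: dict[str, str]) -> str:
    """Run `command` in a fresh temporary directory and hash one output file.

    Using a new directory per run also varies the build path as a side effect.
    """
    with tempfile.TemporaryDirectory() as workdir:
        env = {**os.environ, **extra_env}
        subprocess.run(command, cwd=workdir, env=env, check=True)
        return hashlib.sha256((Path(workdir) / artifact).read_bytes()).hexdigest()


def is_reproducible(command: list[str], artifact: str) -> bool:
    """True if every variation yields an artifact with the same digest."""
    digests = {build_once(command, artifact, env) for env in VARIATIONS}
    return len(digests) == 1
```

A build step that writes the current timezone, a timestamp or its own working directory into its output would fail this check, which is the kind of leak that would have to be found and removed from the rustc build.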
I don't know whether this fits the criteria for a GSoC-sized project, since there's a whole lot that I don't know, but it's an interesting problem that I'd like to tackle, starting with Debian as it already has the infra to test for this.
- Get a beefy Debian box and figure out whether it is possible to get reproducible builds of rustc by making changes to the build process
- If we learn that changes to the build process can make builds reproducible, propose adding them
- Figure out whether those changes also apply to macOS and other Linux distributions
- Tackle MS Windows (this would be the most daunting task; personally I have no experience here, but I think there are lots of people interested in getting this done!)
- If we learn that this is not possible today (e.g. due to the way LLVM is used or something), figure out what it would take to fix it, and weigh in on whether it is worth fixing
I'd love to hear from folks who have already done some work in this area or who have experience making builds reproducible.