I'm cross-compiling for Cortex-M4 and I noticed that rustc 1.80.0 running on aarch64-unknown-linux-gnu and on x86_64-unknown-linux-gnu hosts generates different names for the same function:
I believe that this in turn generates slightly different binaries. I would like my firmware to be reproducible so that others can confirm that the shipped binary is compiled from the open sources.
I thought those last numbers were a hash of the function's contents, but the contents didn't change, and the hash is different for every function in the compiled file.
Does anyone know if I can make these hashes deterministic?
Checking to confirm: you're using different host toolchains, but you're using the same cross-compilation targets, and getting different binaries?
That seems like a reasonable thing to expect. I think what may be happening is that the mangling is using a hash of the Rust toolchain, and so different compiled binaries of the same rustc sources are producing different ABIs. It would take some amount of additional work to make the hash that's used as input to the mangling be the same for two different builds of the same stable rustc sources, but it seems worth doing.
I tested 1.81.0-beta.5, but unfortunately the mangling still differs. The difference in the .text section is only 224 bytes now however, so that is much better.
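For anyone wanting to repeat this measurement, here is a rough sketch of one way to count how many bytes differ between two dumps of the .text section. The post does not say how the 224 bytes were measured, so this is only an assumption about the method; the helper name `count_differing_bytes` and all file names are made up for illustration.

```shell
# Hypothetical helper: count the byte positions at which two files differ.
# `cmp -l` prints one line per differing byte. If the files differ in
# length, it stops at the end of the shorter file and says so on stderr.
count_differing_bytes() {
  cmp -l "$1" "$2" | wc -l | tr -d '[:space:]'
}

# Example usage (objcopy binary name and file names are assumptions):
#   arm-none-eabi-objcopy -O binary --only-section=.text fw-x86_64.elf  text-x86_64.bin
#   arm-none-eabi-objcopy -O binary --only-section=.text fw-aarch64.elf text-aarch64.bin
#   count_differing_bytes text-x86_64.bin text-aarch64.bin
```

A byte-wise count like this overstates the difference if code merely shifted by a few bytes, so it is best read alongside a disassembly diff.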
For the different symbol names, can you build with -v and diff the rustc invocations between both builds? Using -j1 may help make the order deterministic. Otherwise, piping the output through sort would work too.
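A minimal sketch of that comparison, assuming the verbose output of each build was saved to a file on its own machine. The helper name `rustc_invocations` and the log file names are made up for illustration. It keeps only the `Running` lines that cargo prints with -v and sorts them, so build order no longer matters:

```shell
# Hypothetical helper: print the commands from the "Running `...`" lines of
# a `cargo build -v` log, sorted so that build order does not affect a diff.
rustc_invocations() {
  grep -E '^[[:space:]]*Running `' "$1" \
    | sed -E 's/^[[:space:]]*Running `//; s/`$//' \
    | LC_ALL=C sort
}

# Example usage (log file names are assumptions; cargo writes these lines
# to stderr, hence the 2> redirection):
#   cargo build --release -v 2> build-x86_64.log     # on the x86_64 host
#   cargo build --release -v 2> build-aarch64.log    # on the aarch64 host
#   rustc_invocations build-x86_64.log  > x86_64.txt
#   rustc_invocations build-aarch64.log > aarch64.txt
#   diff x86_64.txt aarch64.txt
```

Sorting under `LC_ALL=C` keeps the order identical on both machines even if their locales differ.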
For the .text changes, are you using --remap-path-prefix to ensure that paths that end up embedded in the binary are the same on both machines?
I checked and these folders are identical on both machines, except of course the path due to the toolchain name:
$ ls /opt/rustup/toolchains/beta-aarch64-unknown-linux-gnu/lib/rustlib/thumbv7em-none-eabi/lib/
liballoc-b8e640c80c99247d.rlib libcompiler_builtins-679ee573caf6d8a5.rlib libcore-afaa5e9723996f9c.rlib librustc_std_workspace_core-10319abd33b68d85.rlib
Yes I'm using --remap-path-prefix. Perhaps I need to use that for the rlibs as well?
edit:
Even if I add --remap-path-prefix=/opt/rustup/toolchains/beta-x86_64-unknown-linux-gnu= and --remap-path-prefix=/opt/rustup/toolchains/beta-aarch64-unknown-linux-gnu= the metadata hash still ends up being different. Anything else I can try?
The paths of rlibs never end up in anything anyway, so --remap-path-prefix doesn't help for them.
Are the source paths and -Cmetadata/-Cextra-filename arguments the only differences? If so the issue is somewhere in the way cargo determines the identity of the crates it builds.
Are you using -Zbuild-std? If not, the exact same rlibs for the standard library should be used. If you are, does it reproduce without -Zbuild-std (e.g. for a tier 1 or tier 2 target)? If it doesn't, the remaining difference may be specific to -Zbuild-std. Maybe it is the fact that -Zbuild-std doesn't lock the dependencies of the standard library and instead takes the latest semver-compatible versions from crates.io. Until very recently the standard library sources were not shipped with a lockfile that would be usable by -Zbuild-std. "Consider doing a forced lock of the standard library" (Issue #38 in rust-lang/wg-cargo-std-aware on GitHub) tracks cargo support now that we do ship a usable lockfile for the standard library on recent nightlies.
Are you using proc macros by any chance? It is possible they get a different ID depending on the host triple, which would then result in all dependent crates getting a different ID and thus a different -Cmetadata argument.
If proc-macros get different IDs depending on the host tuple, that sounds like a bug too. The same goes for other host crates needed by build scripts or as dependencies of proc-macros.
I noticed that panic-halt gets the same ID in both runs. And panic-halt doesn't depend on autocfg. So I guess all of the other ones get different IDs because autocfg does.
edit2: Hmm, looking a bit more carefully. Autocfg is compiled for the host compiler. Could the issue then be with num-traits? Could it be that autocfg generates something slightly different for num-traits?
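To see at a glance which crates get a different ID on each host, one option is to pull just the crate name and the -C metadata value out of each verbose build log. This is only a sketch: the helper name `crate_metadata` and the file names are made up, and it assumes cargo passes the ID as `-C metadata=<hex>` on the rustc command line.

```shell
# Hypothetical helper: list "crate-name metadata-hash" pairs found in the
# rustc invocations of a `cargo build -v` log, one pair per line.
crate_metadata() {
  sed -nE 's/.*Running `.*--crate-name ([^ ]+) .*-C metadata=([0-9a-f]+).*/\1 \2/p' "$1" \
    | LC_ALL=C sort -u
}

# Example usage (log file names are assumptions):
#   crate_metadata build-x86_64.log  > ids-x86_64.txt
#   crate_metadata build-aarch64.log > ids-aarch64.txt
#   diff ids-x86_64.txt ids-aarch64.txt
```

Crates missing from the diff (panic-halt, in the case above) have matching IDs, which narrows the search to the dependency chain of the ones that do appear.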