Stability of dylibs per compiler release

In a discussion that followed from the idea to provide a stabilized ABI environment as a Flatpak runtime extension, a question has arisen: is there a guarantee that two dylib artifacts created:

  • from the same source revision;
  • by the same release of the compiler (possibly built into different binaries from the source with the same configuration options);
  • for the same target;
  • with the same compiler flags;
  • on different machines or at different times;

have exactly the same ABI? For example, what goes into the magic hashes that the compiler embeds into sonames and dynamic symbols?

If such a guarantee can be provided, the problem of lack of ABI stability might be possible to solve elegantly in Flatpak and similar app containerization platforms, with no need for the Rust language developers to commit to any ABI stability above the minimum described above.


Thanks. The suggested approach, which such a guarantee facilitates, has the potential to reduce the surface area of the problem to one that seems manageable.


There are two separate things:

You have the SVH (strict version hash), which depends on the exact source, compiler version, target, most compiler flags and, if you don't use --remap-path-prefix, I think the exact paths of the files. This is used by rustc to determine whether a crate you pass in when compiling the executable/dylib matches the one used when compiling a dependency crate.

To help with incremental compilation the symbol names don't depend on the SVH, but only on a subset of the things that influence the SVH. This includes for example the function signatures and the value passed in -Cmetadata, but not the function contents.

For two dylib files to be ABI compatible the SVH must match. If only the symbol names match, then the executable that links to the dylib may have inlined a different version of a function during compilation than what the function would be during runtime. This may cause problems.
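To make the difference between the two hashes concrete, here is a toy model in Rust. It is not rustc's actual algorithm: `FnItem`, `symbol_hash` and `toy_svh` are hypothetical names, and the inputs are only the ones mentioned above. It just shows that changing a function body leaves the symbol hash alone while changing the SVH.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for one function in a crate (hypothetical, not a rustc type).
#[derive(Clone)]
struct FnItem {
    signature: &'static str,
    body: &'static str,
}

/// Toy "symbol hash": covers only the signature and the -Cmetadata value,
/// so editing the body does not rename the symbol.
fn symbol_hash(item: &FnItem, metadata: &str) -> u64 {
    let mut h = DefaultHasher::new();
    item.signature.hash(&mut h);
    metadata.hash(&mut h);
    h.finish()
}

/// Toy "SVH": additionally covers the function body and the compiler
/// version, so any change to either produces a different value.
fn toy_svh(item: &FnItem, metadata: &str, rustc_version: &str) -> u64 {
    let mut h = DefaultHasher::new();
    item.signature.hash(&mut h);
    item.body.hash(&mut h);
    metadata.hash(&mut h);
    rustc_version.hash(&mut h);
    h.finish()
}
```

With two versions of a function that differ only in the body, `symbol_hash` returns the same value for both, while `toy_svh` does not. That is exactly the case where an executable would still link against the dylib but might have inlined the old body.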

This would require reproducible builds. While nothing inherent in rustc (such as randomness) prevents this, there are many external influences that may work against it. For example, if you don't use --remap-path-prefix, then the full path of all files will end up in the crate metadata and debuginfo.
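A quick way to check whether two builds really are reproducible is to compare the artifacts byte for byte. The sketch below uses only the standard library; `first_difference` and `compare_files` are hypothetical helper names, and whole-file comparison is an assumption about what "identical" should mean here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the offset of the first byte at which the two artifacts differ,
/// or `None` if they are byte-for-byte identical. If one input is a strict
/// prefix of the other, the length of the shorter one is returned.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    let common = a.len().min(b.len());
    if let Some(i) = (0..common).find(|&i| a[i] != b[i]) {
        return Some(i);
    }
    if a.len() != b.len() {
        Some(common)
    } else {
        None
    }
}

/// Reads both files fully into memory and compares them.
fn compare_files(first: &Path, second: &Path) -> io::Result<Option<usize>> {
    let a = fs::read(first)?;
    let b = fs::read(second)?;
    Ok(first_difference(&a, &b))
}
```

Calling `compare_files` on the same dylib built on two machines gives `Ok(None)` when the build is reproducible. Otherwise the returned offset is a starting point for finding out which section differs.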


With this GitHub workflow, I got two shared libraries identical down to .note.gnu.build-id. I did not change things like the working directory or try with the compiler on a different host architecture, because I think these will not vary for Flatpak builds either.

Cargo does not put that into the name of a crate-type = ["dylib"] crate. Is there a way to do it outside of Rust's own build system, or is it only used for Rust system libraries?

Here's a more extensive test, now with variable build directories and on multiple OSes, too.

On Linux, all binaries produced with the same compiler and build flags are identical; the absolute path to the library is not baked in (the source tree location did not vary). This is satisfactory for Flatpak purposes.

On macOS, the absolute path to the dylib is baked into both the dylib itself and the binary that requires it. I can't tell right now if this is something important at runtime; if it is, --remap-path-prefix or maybe more is needed there to remove it.

On Windows, the absolute path to the .PDB (debug info file) is baked in, but seemingly not the path to the DLL.

There are other small differences in the macOS and Windows files, but I'd need to dig out development tools for these platforms to try to see what these are.
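Without platform tools at hand, one rough way to see whether a build directory has leaked into an artifact is to scan its bytes for the path. This is a minimal sketch, assuming the path is stored as plain bytes in the same encoding as the needle (it would miss, for example, a UTF-16 copy); `find_all` is a hypothetical helper.

```rust
/// Returns every offset at which `needle` occurs in `haystack`.
/// An empty needle is treated as "no matches".
fn find_all(haystack: &[u8], needle: &[u8]) -> Vec<usize> {
    if needle.is_empty() || needle.len() > haystack.len() {
        return Vec::new();
    }
    haystack
        .windows(needle.len())
        .enumerate()
        .filter(|(_, window)| *window == needle)
        .map(|(offset, _)| offset)
        .collect()
}
```

Reading the dylib with `std::fs::read` and passing the build directory as the needle gives the offsets of any baked-in copies. An empty result only shows that the path does not appear in that exact byte form.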

The SVH is put in the crate metadata, e.g. the .rustc section for dylibs. Cargo already tries to compute something similar, which it passes to rustc as -Cmetadata. This will end up in the filename. This metadata value contains the crate name and version and all compilation flags that can affect the output, including those of all dependencies, but it doesn't contain the exact source code.

It seems that for dylibs Cargo only passes -Cmetadata and not -Cextra-filename. This means that for dylibs it doesn't become part of the filename.
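As a rough illustration of the effect on filenames, the sketch below models how the output name is assembled from the platform prefix, the crate name, the optional -Cextra-filename value and the platform extension. `Os` and `dylib_filename` are hypothetical names, and the hash string in the usage note is made up.

```rust
/// Target operating systems covered by this sketch.
#[derive(Clone, Copy)]
enum Os {
    Linux,
    MacOs,
    Windows,
}

/// Builds the dylib filename: prefix + crate name + extra-filename + extension.
/// `extra_filename` models the value of -Cextra-filename; `None` models the
/// case described above, where the flag is not passed for dylibs.
fn dylib_filename(os: Os, crate_name: &str, extra_filename: Option<&str>) -> String {
    let (prefix, ext) = match os {
        Os::Linux => ("lib", "so"),
        Os::MacOs => ("lib", "dylib"),
        Os::Windows => ("", "dll"),
    };
    format!(
        "{}{}{}.{}",
        prefix,
        crate_name,
        extra_filename.unwrap_or(""),
        ext
    )
}
```

For a crate named `foo` on Linux, `dylib_filename(Os::Linux, "foo", None)` yields `libfoo.so`, whereas passing `Some("-1a2b3c")` would yield `libfoo-1a2b3c.so`.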


Nor is it embedded into the soname that dependent binaries list in NEEDED.

This is unfortunate, as this feature makes the Rust std libraries more protected against ABI drift, or at least the situation is more readily detectable with distro tools that check and catalogue soname dependencies.

This seems to be the reason dylibs don't contain the metadata in their filename.

Maybe the SVH could be embedded as a symbol in the dylib and then every dylib/executable could require that symbol to be present? That would prevent running an executable against an unexpected dylib version.
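A sketch of what that check would amount to, modelled in plain Rust rather than in the linker: the dylib exports a marker symbol whose name encodes the SVH, and a dependent binary requires exactly that name. The prefix `__rust_dylib_svh_`, the 64-bit hash width and both function names are assumptions made for illustration; nothing like this exists in rustc today.

```rust
/// Hypothetical stable prefix for the marker symbol.
const MARKER_PREFIX: &str = "__rust_dylib_svh_";

/// Name of the marker symbol a dylib with the given SVH would export.
fn marker_symbol(svh: u64) -> String {
    format!("{}{:016x}", MARKER_PREFIX, svh)
}

/// Models what the dynamic linker would enforce: the dylib's exported
/// symbols must contain the marker for the SVH the binary was built against.
fn check_dylib(exported: &[&str], expected_svh: u64) -> Result<(), String> {
    let wanted = marker_symbol(expected_svh);
    if exported.iter().any(|s| *s == wanted) {
        return Ok(());
    }
    match exported.iter().find(|s| s.starts_with(MARKER_PREFIX)) {
        Some(found) => Err(format!(
            "SVH mismatch: expected {}, found {}",
            wanted, found
        )),
        None => Err(String::from("no SVH marker symbol exported")),
    }
}
```

In a real implementation the failure would show up as an unresolved symbol at load time. The explicit error strings here only make the two failure cases visible.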


This is a good idea. The symbol name prefix could also be a distinctive string considered stable, so that DSO analysis tools could detect shared libraries that are Rust dylibs (this idea has come up in a discussion about automatically banning any such libraries from a distribution :smiling_imp:)


Thanks! I have learned about __CARGO_DEFAULT_LIB_METADATA, though as the comments mention, it probably should not be relied on anywhere outside building libstd.

I opened https://github.com/rust-lang/rust/issues/73917

There could be a cargo feature to always force the metadata as part of dylib filenames.


For my part, I have opened https://github.com/rust-lang/rust/issues/73932 to follow up on this discussion.

FYI I think the dylib can't have a hash in the filename because on macOS the name is encoded into the dylib itself.