Access to synthetic object file of exported and used symbols

Currently, the Rust compiler generates a synthetic object file that covers all `#[used]` and `#[no_mangle]` symbols and passes it to the linker (see: https://github.com/rust-lang/rust/pull/95604).

This file is linked in to ensure these symbols are not discarded during linking if they are not referenced in some other way (see also: "`no_mangle`/`used` static is only present in output when in reachable module", Issue #47384 in rust-lang/rust). This prevents issues with crates like linkme, which reach symbols in ways that the linker does not understand. Without the synthetic object file, statics registered through linkme could sometimes be discarded.
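To make the failure mode concrete, here is a minimal sketch of the kind of static involved. The `TestCase` type and all names are hypothetical, and linkme's actual macros additionally place each entry in a dedicated linker section, which is left out here to keep the sketch portable.

```rust
/// Hypothetical registry entry; not linkme's real type.
pub struct TestCase {
    pub name: &'static str,
    pub run: fn() -> bool,
}

fn addition_works() -> bool {
    1 + 1 == 2
}

// `#[used]` makes rustc keep this static in the emitted object file even
// though no Rust code references it. It does not, on its own, make the
// linker pull the containing object file out of a static archive.
#[used]
pub static ADDITION_TEST: TestCase = TestCase {
    name: "addition_works",
    run: addition_works,
};
```

Nothing in the program names `ADDITION_TEST` directly, so whether it survives depends on the linker loading the object file that contains it.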

For binaries built by Rust, the synthetic object file presents a good solution, as it's understood by all linkers and doesn't require setting any additional linker flags, apart from including the object file. When building Rust code into a staticlib however, a similar solution is currently not available, as the linker is then not invoked by Rust.

Especially when integrating Rust staticlibs into C/C++ build systems (e.g. CMake via Corrosion), access to the synthetic object file would allow the #[used] attribute to work across the boundary into any C/C++ (or other languages) build system. I encountered this issue myself when I tried to create a custom test harness that could be called by C++ and that collected tests with the linkme crate. After linking the Rust static library into the C++ executable, the test cases would sometimes disappear in Debug builds, as those use a different internal object file layout in the generated static library. Linking in the corresponding symbols.o file solves this, but that is currently not accessible in a stable directory.

Do you think it is reasonable for Rust to export the symbols.o file for each crate into the target/ directory when building "staticlib" crates? Or potentially only when given a specific flag?

Linking the staticlib with --whole-archive would work too, right?

Wouldn't that cause unused objects without the #[used] attribute to also be kept? Reading the GNU ld man page, that is my understanding of it at least.

Sure, the object files would all get pulled in from the archive, but that is the only thing symbols.o is meant to do anyway. You can pass --gc-sections to discard any unused functions and statics. You should do this in any case when linking Rust, as basically every object file rustc emits will contain both used and unused functions and statics. Rustc also passes --gc-sections when it runs the linker itself. --gc-sections still allows discarding #[used] statics, but the linker knows that .init_array statics and statics whose section is referenced by __start_*/__stop_* (as linkme uses) should be considered gc roots. symbols.o is only meant to pull in the object files such that the linker gets a chance to treat those statics as gc roots.
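As a rough illustration of the flag combination described above, the following sketch assembles the arguments for a GNU ld-compatible linker invoked through a C/C++ compiler driver (hence the `-Wl,` prefix). The function name and the library path are hypothetical, and other linkers such as MSVC's use different options.

```rust
/// Builds linker-driver arguments that load every object file from one
/// Rust staticlib and then let the linker garbage-collect unused sections.
/// Assumes a GNU ld-compatible linker behind a gcc/clang-style driver.
pub fn whole_archive_args(staticlib_path: &str) -> Vec<String> {
    vec![
        // Pull in all object files from the archives that follow.
        "-Wl,--whole-archive".to_string(),
        staticlib_path.to_string(),
        // Restore the default behaviour for any later archives.
        "-Wl,--no-whole-archive".to_string(),
        // Discard sections that are unreachable from any gc root.
        "-Wl,--gc-sections".to_string(),
    ]
}
```

A build system wrapper could splice the result of `whole_archive_args("libmy_rust_lib.a")` into the final link command in place of the plain library path.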

@bjorn3 Thank you for the quick reply.

We have used --whole-archive before when linking Rust staticlibs into C++ but ran into issues, as too much was being linked, which resulted in duplicate symbols in CXX-Qt. For example: "MSVC CI doesn't compile with Rust 1.78.0", Issue #958 in KDAB/cxx-qt. Because this failed to link in the first place, I don't think it can be fixed with --gc-sections.

In the end we now also generate an object file and link it in via our custom CMake code. In our case that is possible because we know the symbols that we need to reference, as they come from a build script. But this isn't yet possible for #[used] symbols defined in normal Rust code.

At this point I don't know whether the symbols.o file would also link in too much, but the approach is more fine-grained, so I'm hoping it won't cause such issues.

I'm willing to prototype this on a fork of the Rust compiler to confirm that it does indeed help, but want to know whether the compiler team is open to exporting the used symbols as public API if it proves useful.