Pre-RFC: Alternative approach to -Zno-link/-Zlink-only split linking

Issue 64919 proposes a mechanism to split the compilation and linking parts of building an executable (or any other crate type which requires linking).

It gives three reasons for doing this; I find the first two the most compelling:

  • Allow linked crates to participate in pipelined builds, by using .rmeta files for dependencies (and only requiring .rlibs for the actual linker invocation)
  • Improved caching, both because the crate build can use cached .rmeta files, and because the output of the Rust crate build can be reused (eg, if you're rebuilding something where only non-Rust dependencies have changed, the previously built Rust code can just be relinked).
  • (more control over linking, but the mechanism doesn't really allow for this)

Personally, I'm particularly interested in the possibilities of distributed builds, and aggressively caching and reusing build outputs.

The issue proposes the --no-link and --link-only flags, which are currently implemented as -Zno-link and -Zlink-only. The -Zno-link invocation generates a .rlink file and a set of .o files. -Zlink-only takes these files and uses them to invoke the linker. It's not clear to me whether the intent is that the command line of the -Zlink-only invocation is supposed to be identical to the -Zno-link one (aside from that option), or if -Zlink-only can be as minimal as rustc -Zlink-only thing.rlink.

This implementation works as far as it goes, but it has a number of significant limitations.

Firstly, all the .rlib paths are baked into the .rlink file. This means that they have to be supplied to the -Zno-link invocation. But this undermines the goal of supporting pipelined builds - a pipelined build would only need the .rmeta files for the dependencies, and the .rlib files are only needed for linking. This implies that while the -Zno-link invocation should be given .rmeta files for dependencies, the -Zlink-only phase should resolve the corresponding .rlib files so that the dependency coupling is as late as possible.

Secondly, since the .rlink file is a dump of internal compiler state, it must be treated as opaque. But as mentioned above, it contains a full set of paths to .rlib files that the executable depends on (directly and indirectly). These paths are absolute paths, which means that they're likely only meaningful on a single machine, implying that the -Zno-link and -Zlink-only phases can't be distributed.

Thirdly, a more minor problem is that the -Zlink-only invocation treats the .o files as compiler temporaries and deletes them after linking. This means that if you want to preserve them you need to copy them (or I think -Csave-temps will keep them along with any other temp files). -Zlink-only should treat all its inputs as inputs and preserve them. Failing to do either of these could corrupt a cache - or perhaps implies that the mechanism simply isn't intended to support reusing artifacts from -Zno-link multiple times.

I think the second problem could be mitigated by always using relative paths (and some way to deal with sysroot crates), and/or some defined tooling to allow the paths to be remapped/normalized without necessarily exposing the details of the file format.
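As a rough illustration of the "relative paths plus remapping" idea, here is a minimal sketch. The helper names `relativize` and `rebase` are hypothetical and are not part of rustc or of the .rlink format; the sketch only shows the kind of normalization a build system would want to be able to perform, and it ignores sysroot crates entirely.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: store `path` relative to `root` when it lies under
/// `root`; otherwise keep it unchanged (e.g. a sysroot crate).
fn relativize(path: &Path, root: &Path) -> PathBuf {
    match path.strip_prefix(root) {
        Ok(rel) => rel.to_path_buf(),
        Err(_) => path.to_path_buf(),
    }
}

/// Hypothetical helper: re-anchor a stored relative path under the root used
/// by the machine that performs the link. Absolute paths are left alone.
fn rebase(path: &Path, new_root: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        new_root.join(path)
    }
}
```

With something like this, the compile machine would record `deps/libfoo.rlib` rather than `/build/a/deps/libfoo.rlib`, and the link machine would rebase it onto its own output directory.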

The third problem is more or less a simple bugfix.

The first problem arises from binding too much state into the .rlink file too early. More could be deferred to the -Zlink-only phase, but I think this would require fairly large scale implementation changes.

Rather than that, I have an

Alternative Proposal

Rather than exposing internal compiler details via .rlink, I think it would be more straightforward to:

  1. build the linkable crate as a plain .rlib in (more or less) the normal way
  2. build the executable with the effective main.rs source of use lib_crate::main; (or re-export all the public symbols for a dylib/cdylib style crate).

This makes split linking look a lot more normal to the surrounding build system - it simply becomes two build rules invoking rustc, with a normal dependency relationship via a .rlib artifact (and all the other dependencies).
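The two steps can be sketched in a single file, with a module standing in for the separately built library crate. This is only an illustration under assumptions: `lib_crate` and `shim_main` are placeholder names, the shim forwards explicitly rather than via `use`, and in a real setup the module would be its own .rlib while `shim_main` would be the executable crate's `fn main`.

```rust
// Stand-in for the crate built in step 1 (normally compiled to its own .rlib).
mod lib_crate {
    // The only source change compared to a normal binary crate:
    // `fn main` becomes `pub fn main`.
    pub fn main() -> Result<(), String> {
        println!("hello from lib_crate::main");
        Ok(())
    }
}

// Stand-in for the executable crate built in step 2. In a real shim crate this
// would be named `main`; it does nothing except forward to the library.
fn shim_main() -> Result<(), String> {
    lib_crate::main()
}
```

The shim contains no logic of its own, so its build rule is trivially cheap and depends on the library only through the .rlib artifact.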

The main problem with this is that fn main() {} is special in that it has a variable signature and isn't public, so it would need special handling to be visible from step 1 to 2.

proc-macros may also pose a problem; I haven't looked into the details as closely.

At least for my use-cases, I don't see any loss of flexibility or functionality with respect to -Zno-link/link-only; it's strictly an improvement and easier to integrate. The particularly nice thing is that it's easy to prototype with no rustc changes at all, at least for executables, by manually changing fn main() to pub fn main(). But it would be nice to have rustc support so that this can be automatically applied without crate source changes.

fn main() {} is special in that it has a variable signature

No, it's not really variable, it would just be:

#![feature(termination_trait_lib)]
fn main() -> impl ::std::process::Termination {
  main_lib::main()
}
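For comparison, a similar forwarding shape can be written without the feature gate on recent stable toolchains, where (as far as I know) `std::process::Termination` and its `report` method are stable. This is a sketch, not what rustc generates: `forward` is a hypothetical helper name and `main_lib` is a placeholder module standing in for the library crate.

```rust
use std::process::{ExitCode, Termination};

// Placeholder for the library crate that owns the real entry point.
mod main_lib {
    pub fn main() -> Result<(), String> {
        Ok(())
    }
}

// Hypothetical helper: call any entry point whose return type implements
// `Termination` and convert the result into a process exit code.
fn forward<T: Termination, F: FnOnce() -> T>(entry: F) -> ExitCode {
    entry().report()
}
```

A shim's `fn main() -> ExitCode` could then simply be `forward(main_lib::main)`, regardless of whether the library's `main` returns `()`, a `Result`, or an `ExitCode`.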

Unfortunately

use main_lib::main; // not pub either

doesn't work; rustc complains main() is missing. If it did, then it might just be a matter of adding a -Cmake-main-pub option for the "main_lib" crate and having the "real" main.rs be:

pub use main_lib::*;

Yeah, this makes -Zno-link/-Zlink-only kind of useless.

I assumed that you had to pass in -L to -Zlink-only too for resolving crates, but indeed it uses the actual paths to dependencies. This will need to be changed. The used_crate_source, used_crates_static and used_crates_dynamic fields will need to be computed in the -Zlink-only pass. This will likely require storing the CrateHash for each crate and either storing a map from CrateNum to StableCrateId or even better store the StableCrateId directly in each field.

This makes sense in some way. Only rustc knows which temp files are stored where, so it is the only thing that can delete them at all. It does make sense to keep the temp files though both to keep all inputs immutable and to maybe in the future allow only re-executing the linking step when a dependency changed without changing the interface of the crate. (the crate interface here includes #[inline] and generic function bodies as changing those requires recompiling downstream crates to take effect)

This is the imported_main feature; see the tracking issue "Allow a re-export for `main`" (RFC 1260), rust-lang/rust issue #28937.

Isn't all this in the rmeta already? Wouldn't it be better to reuse that file format rather than come up with something new?

In other words, I'm not really proposing a huge change. It's essentially:

  • rustc -Zno-link main.rs <rmeta deps for building> => mybinary.rlib
  • rustc -Zlink-only mybinary.rlib <rlib deps for linking> => mybinary

Rlibs contain a lot of information that is unnecessary for linking, which will slow down -Zno-link/-Zlink-only. In addition it is valid to specify both rlib and bin or dylib at the same time. The rlib must then among other things not contain the allocator shim and dylib metadata object file, so the rlib produced by -Zno-link wouldn't be valid as link output. Also my proposed changes shouldn't be too hard to implement.

Is that actually a big concern, vs the overall cost of linking? I presume all the metadata still needs to be computed, so you're just talking about the cost of writing it into a file?

I'm not sure what you mean by this - can a crate be both a binary and a dylib? Is that actually useful? I can see what you mean about generating an rlib and dylib at the same time, but I assume that in principle a .rlib can be specialized into a .dylib at link time.

I'm less concerned about implementation complexity than about integration complexity with the surrounding build system. From my POV, simplest to most complex is:

  1. emit & consume a .rlib file
  2. emit & consume some other single file
  3. emit and consume a metadata file & objs (no other direct references to other files by path); references to objs are relative
  4. emit and consume a metadata & objs with references to other files (must allow pathnames to be enumerated and remapped)

What I'm proposing is 1 and the status quo is 4. I'm hoping your proposal amounts to 1 or 2, or at worst 3.

No, for executables no metadata is generated at all.

Yes, it can. You can pass --crate-type multiple times. In fact, libstd for example does this to get both an rlib and a dylib.

The problem is that it is the other way around. A dylib or executable contains more object files than rlibs.

Also, several of the inputs of the link process don't exist in the crate metadata at all. Only the rlink files contain them. In addition, having -Zno-link use rlibs would require the same code to get the file paths as not embedding the file paths in .rmeta files and computing them at the link stage.

But all the relevant data is still computed as part of the compilation - it just isn't serialized, right?

Yes, but you implied that it might make sense to specify bin and dylib. rlib and dylib makes perfect sense, but deriving a dylib from the contents of a rlib should be possible in the link phase, so long as the rlib has the right relocation model (which we'll assume it does in this case). I'm also assuming that the rlib generated by -Zno-link is pretty normal, and also suitable for use as normal.

Yeah, but I'm assuming the link phase can inject those while creating a dylib from an rlib.

Yeah, I really don't want those extra inputs to be passed from -Zno-link -> -Zlink-only - they should be derived from the -Zlink-only command line. Otherwise there's too much coupling between the two steps. From my POV, the whole point is to allow the compilation step of -Zno-link to be as decoupled as possible from the linking phase; if you need to know the linker inputs for -Zno-link then you lose that.

This seems relatively reasonable. I can imagine scenarios in which there's more linking complexity than just main and exported symbols, but the general concept seems like it could work.

Among the things encoded in the rlink files, but not on the command line or in rmeta files, is the name of the allocator module. In addition, the dependency format calculation really only works for the local crate. If you take an rlib as input, the local crate is no longer the crate for which you called -Zno-link but some dummy crate. Queries with LOCAL_CRATE as crate num are never forwarded to the metadata loader.

What I am trying to say is that using rlib is at least as complex as rlink and may even be more complex, because there is no way to communicate certain things. I have been working on the necessary refactoring to not embed lib paths in the rlink file. I already got it down from up to three copies of the lib path in the rlink file to only a single one. That single one should be removable too by encoding the SVH instead and using CrateLoader to load the libs in the -Zlink-only invocation.

Yeah, I had a go at implementing -Zlink-only with a .rlib and it definitely looked tricky to instantiate the rlib as the local crate. I also looked at the shim crate approach, but it wasn't clear to me what side-effects that would have (eg for proc-macro crates).

I'm happy to defer to your judgement that using a .rlib is too hard to implement; I'm primarily thinking about it from the perspective of a rustc user/integrator. A self-contained .rlink file (ie, another renaming of a .a file which contains the necessary metadata and .o files) would be just as simple to integrate.

Are you retaining the implementation of serializing CodegenResults but refactoring what it contains? Or taking another approach? It seemed to me that if you want to re-locate all the crate dependencies at link-only time, then CodegenResults as it stands is too late because it already has all the paths.

In the long term I'd like to find some way where rustc can provide enough information to completely construct an external link line, so that we don't have to rely on it to invoke the linker - ie so that a top-level C++ target can link in Rust code in a native way. This would allow us to eliminate cdylib/staticlib whose hermetic "link in all the deps" approach causes all sorts of problems at scale (eg, multiple layers of Rust depending on C++ with the same crate appearing both as a Rust dep and as a C++ dep).

That's a lot more complex than what we're talking about here, so I'm not really going to touch it until we have a clean way to split linking in the simple case.

Correct, I am refactoring what it contains. The codegen unit paths are already relative, so I only need to get rid of the dependency paths and compute them at link time instead, based on the command line and a list of all dependency names and SVHs (strict version hashes).
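To make the "names and hashes instead of paths" idea concrete, here is a small sketch of link-time resolution. Everything in it is an assumption for illustration: `DepRef`, `Candidate` and `resolve` are hypothetical names, the hash is modelled as a plain `u64`, and the candidate list stands in for whatever rustc would discover from the -Zlink-only command line. It is not how CrateLoader actually works.

```rust
use std::path::PathBuf;

/// Hypothetical record stored at -Zno-link time: no path, only identity.
struct DepRef {
    name: String,
    svh: u64,
}

/// Hypothetical library discovered at -Zlink-only time from the command line.
struct Candidate {
    name: String,
    svh: u64,
    path: PathBuf,
}

/// Map every recorded dependency to the path of the candidate with the same
/// name and hash, or report the first dependency that has no match.
fn resolve(deps: &[DepRef], found: &[Candidate]) -> Result<Vec<PathBuf>, String> {
    deps.iter()
        .map(|d| {
            found
                .iter()
                .find(|c| c.name == d.name && c.svh == d.svh)
                .map(|c| c.path.clone())
                .ok_or_else(|| format!("no library matching {} with hash {:x}", d.name, d.svh))
        })
        .collect()
}
```

Because the recorded data carries no paths, the same compile output could be linked on any machine that can supply libraries with matching names and hashes.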
