Issue 64919 proposes a mechanism to split the compilation and linking parts of building an executable (or any other crate type which requires linking).
It proposes 3 reasons for doing this, but I find the first two the most compelling:
- Allow linked crates to participate in pipelined builds, by using .rmeta files for dependencies (and only requiring .rlibs for the actual linker invocation)
- Improved caching, both because the crate build can use cached .rmeta files, and because the output of the Rust crate build can be reused (e.g., if you're rebuilding something where only non-Rust dependencies have changed, the previously built Rust code can just be relinked).
- (more control over linking, but the mechanism doesn't really allow for this)
Personally, I'm particularly interested in the possibilities of distributed builds, and aggressively caching and reusing build outputs.
The issue proposes the --no-link and --link-only flags, which are currently implemented as -Zno-link and -Zlink-only. The -Zno-link invocation generates a .rlink file and a set of .o files. -Zlink-only takes these files and uses them to invoke the linker. It's not clear to me whether the intent is that the command line of the -Zlink-only invocation is supposed to be identical to the -Zno-link one (aside from that option), or if -Zlink-only can be as minimal as rustc -Zlink-only thing.rlink.
This implementation works as far as it goes, but it has a number of significant limitations.
Firstly, all the .rlib paths are baked into the .rlink file. This means that they have to be supplied to the -Zno-link invocation. But this undermines the goal of supporting pipelined builds - a pipelined build would only need the .rmeta files for the dependencies, and the .rlib files are only needed for linking. This implies that while the -Zno-link invocation should be given .rmeta files for dependencies, the -Zlink-only phase should resolve the corresponding .rlib files so that the dependency coupling is as late as possible.
Secondly, since the .rlink file is a dump of internal compiler state, it must be treated as opaque. But as mentioned above, it contains a full set of paths to the .rlib files that the executable depends on (directly and indirectly). These are absolute paths, which means that they're likely only meaningful on a single machine, implying that the -Zno-link and -Zlink-only phases can't be distributed across machines.
Thirdly, a more minor problem is that the -Zlink-only invocation treats the .o files as compiler temporaries and deletes them after linking. This means that if you want to preserve them you need to copy them (or I think -Csave-temps will keep them along with any other temp files). -Zlink-only should treat all its inputs as inputs and preserve them. Failing to either copy the files or keep them some other way could corrupt a cache - or perhaps this implies that the mechanism simply isn't intended to support reusing artifacts from -Zno-link multiple times.
I think the second problem could be mitigated by always using relative paths (and some way to deal with sysroot crates), and/or some defined tooling to allow the paths to be remapped/normalized without necessarily exposing the details of the file format.
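To illustrate the kind of normalization meant here, the following is a minimal sketch that rewrites absolute .rlib paths relative to a build root. It does not read or write the real .rlink format, which is opaque. The function name `relativize`, the example paths, and the choice to leave paths outside the root (such as sysroot crates) untouched are all assumptions made for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: rewrite `path` relative to `root` when it lives
/// under `root`; otherwise return it unchanged (for example, a sysroot
/// crate that would need some other handling).
fn relativize(path: &Path, root: &Path) -> PathBuf {
    match path.strip_prefix(root) {
        Ok(rel) => rel.to_path_buf(),
        Err(_) => path.to_path_buf(),
    }
}

fn main() {
    // Illustrative paths only; nothing here is taken from a real build.
    let root = Path::new("/build/workspace");
    let deps = [
        Path::new("/build/workspace/target/deps/libfoo.rlib"),
        Path::new("/opt/rust/lib/libstd.rlib"),
    ];
    for dep in deps {
        println!("{}", relativize(dep, root).display());
    }
}
```

A path under the root becomes machine-independent, while a path outside it is left alone, which is where the "some way to deal with sysroot crates" question remains open.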
The third problem is more or less a simple bugfix.
The first problem arises from binding too much state into the .rlink file too early. More could be deferred to the -Zlink-only phase, but I think this would require fairly large-scale implementation changes.
Rather than that, I have an
Alternative Proposal
Rather than exposing internal compiler details via .rlink, I think it would be more straightforward to:
- build the linkable crate as a plain .rlib in (more or less) the normal way
- build the executable with the effective main.rs source of use lib_crate::main; (or re-export all the public symbols for a dylib/cdylib style crate).
This makes split linking look a lot more normal to the surrounding build system - it simply becomes two build rules invoking rustc, with a normal dependency relationship via a .rlib artifact (and all the other dependencies).
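A minimal single-file sketch of this two-crate layout follows. The module `lib_crate` stands in for the separately compiled .rlib (in a real build it would be its own crate), and the top-level `main` stands in for the stub main.rs. The `greeting` helper is hypothetical and exists only to give the example something observable to do. The stub forwards with an explicit call rather than a `use` import, purely as a conservative choice for the sketch.

```rust
// Stand-in for the linkable crate, which would be built as a plain .rlib.
// In a real build this would be a separate crate, not a module.
mod lib_crate {
    /// Hypothetical helper so the example has observable behaviour.
    pub fn greeting() -> String {
        String::from("hello from lib_crate")
    }

    /// The original `fn main()`, made `pub` so the stub crate can reach it.
    pub fn main() {
        println!("{}", greeting());
    }
}

// Stand-in for the stub main.rs of the executable crate: it contains no
// logic of its own and only forwards to the library's entry point.
fn main() {
    lib_crate::main();
}
```

With this shape, everything of substance is compiled in the library step, and the executable step is little more than a link.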
The main problem with this is that fn main() {} is special in that it has a variable signature and isn't public, so it would need special handling to be visible from step 1 to 2.
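The variable signature matters because a hand-written forwarding stub has to mirror whatever the library's entry point returns. The sketch below assumes a library whose main returns a Result; the module name `fallible_crate` and the `parse_port` helper are hypothetical, and the module again stands in for a separately built .rlib.

```rust
use std::num::ParseIntError;

// Stand-in for a library crate whose entry point returns a Result.
mod fallible_crate {
    use std::num::ParseIntError;

    /// Hypothetical helper: parse a port number from text.
    pub fn parse_port(s: &str) -> Result<u16, ParseIntError> {
        s.trim().parse()
    }

    /// The original entry point, made `pub` for the stub to call.
    pub fn main() -> Result<(), ParseIntError> {
        let port = parse_port("8080")?;
        println!("port {port}");
        Ok(())
    }
}

// The stub must declare the same return type as the library's main.
// For a library whose entry point is a plain `fn main()`, the stub
// would instead have no return type at all.
fn main() -> Result<(), ParseIntError> {
    fallible_crate::main()
}
```

A build system generating such stubs would therefore need to know the signature of each library's main, which is the kind of special handling rustc support could take care of.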
proc-macros may also pose a problem; I haven't looked into the details as closely.
At least for my use-cases, I don't see any loss of flexibility or functionality with respect to -Zno-link/link-only; it's strictly an improvement and easier to integrate. The particularly nice thing is that it's easy to prototype with no rustc changes at all, at least for executables, by manually changing fn main() to pub fn main(). But it would be nice to have rustc support so that this can be automatically applied without crate source changes.