Statistics of and Ideas for Lazy compilation

DragonDev1906 · October 7, 2024, 12:44pm

First off:

- This is nothing new, and there have probably been hundreds of people asking for this ^[1].
- Yes, this would be a massive change to the compiler (including the split between rustc and Cargo), but I think it could also have a big impact on compilation speed, one that may be really hard to achieve with incremental compilation alone ^[2].
- There have been posts on this topic, for example "Laziness in the compiler - #3 by SichangHe" (and probably many others I don't know of), but so far I couldn't find much on the two topics I'll mention below.

Impact Statistics

So far I couldn't find statistics on how much potential there is in lazily compiling crates (nor much discussion about it besides what I wrote above). Measuring the compile-time impact is of course really hard, but are there any statistics on how much of the dependencies' code ends up in the binary (e.g. in terms of LoC and/or compiled binary size) with the current compiler implementation? Alternatively, is there a way to run rustc/Cargo with a flag that reports this information? The best approach I could come up with so far is to use code coverage, but that of course only covers code that is actually executed at runtime or in tests ^[3].

I would not be surprised if in many cases less than 20% of the dependency code ends up in the binary ^[4], even when only considering basic dead code elimination such as checking whether an import or fully qualified name (<crate>::<module>::<mytype>) exists. Proc-macros are a problem here: they would likely need to be compiled and executed before compiling the dependencies (potentially slowing things down again). Still, I wouldn't be surprised if compiling these proc-macros became faster as well, since I doubt they use large parts of common crates like syn ^[5] (especially when they don't make good use of the feature flags there ^[6]).

Unfortunately there is no good way to evaluate that impact without statistics on how much of the dependencies is actually relevant. If the main goal of such a change is to improve compile times, one of the first steps should probably be a crater run that collects information on how many lines are currently compiled "needlessly" and thrown away later. That may require a few smaller changes in the compiler to collect the statistics, but it could give really valuable information and allow more informed decisions about both incremental and lazy compilation, as well as about the effectiveness of dead code elimination as it exists now. Collecting these statistics ^[7] would be useful for normal development as well, for example to hint that you could disable a feature flag you don't need because you use 0% of the code it enables.
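To make the kind of statistic I have in mind more concrete, here is a minimal sketch. It assumes we somehow had a graph of items (functions, types, ...) with their crate, line count and references; none of this is an existing rustc or Cargo API. `Item`, `Usage` and `usage_per_crate` are made-up names, and the reachability walk is only a rough stand-in for whatever the compiler would really do.

```rust
use std::collections::{HashMap, HashSet};

/// One item in the build graph. Hypothetical: rustc does not expose this today.
#[derive(Debug, Clone)]
pub struct Item {
    /// Name of the crate that defines the item.
    pub krate: &'static str,
    /// Lines of code of the item.
    pub lines: usize,
    /// Names of the items this item references.
    pub uses: Vec<&'static str>,
}

/// Per-crate statistic: lines reachable from the roots vs. lines in total.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Usage {
    pub used_lines: usize,
    pub total_lines: usize,
}

impl Usage {
    /// Percentage of the crate's lines that are reachable (0.0 for an empty crate).
    pub fn percent_used(&self) -> f64 {
        if self.total_lines == 0 {
            0.0
        } else {
            100.0 * self.used_lines as f64 / self.total_lines as f64
        }
    }
}

/// Walks the reference graph from `roots` (e.g. `main` and exported items)
/// and sums up, per crate, how many lines are reachable.
pub fn usage_per_crate(
    items: &HashMap<&'static str, Item>,
    roots: &[&'static str],
) -> HashMap<&'static str, Usage> {
    // Plain worklist reachability; unknown names are simply ignored.
    let mut reachable: HashSet<&'static str> = HashSet::new();
    let mut stack: Vec<&'static str> = roots.to_vec();
    while let Some(name) = stack.pop() {
        if !reachable.insert(name) {
            continue; // already visited
        }
        if let Some(item) = items.get(name) {
            stack.extend(item.uses.iter().copied());
        }
    }

    let mut stats: HashMap<&'static str, Usage> = HashMap::new();
    for (name, item) in items {
        let entry = stats.entry(item.krate).or_default();
        entry.total_lines += item.lines;
        if reachable.contains(name) {
            entry.used_lines += item.lines;
        }
    }
    stats
}
```

A crate (or the code behind a feature flag, if items were grouped that way) whose `percent_used()` is 0 would be a candidate for the "you could disable this" hint. Real numbers would of course need to come from the compiler itself, since generics, trait impls and macros make reachability much less clear-cut than in this toy graph.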

Ideas for combining incremental with lazy compilation

I don't know much about the compiler internals, so many of you can probably say a lot more about this ^[8]. If there have been discussions about the points below, please point me towards them: I'm interested in what you think about them and if/why they might have been dismissed so far ^[9].

- Most of the things I've seen so far have been an all-or-nothing approach to lazy compilation. Wouldn't it make sense to use the current "complete" compilation for the most indirect and/or most often depended-on (in the build plan) crates, which could start compiling immediately and from which you probably need a lot of things, while compiling proc-macros and "querying" what is needed from the other crates? That way you still have trivial parallelism while doing the harder-to-parallelize task of figuring out which parts of a crate need to be compiled. ^[10]
  - Another useful way to decide may be to use lazy compilation for indirect dependencies (except perhaps the first ones, to improve parallelism) and complete compilation for direct dependencies, as it's more likely you're going to start using a new function from them. In many cases this probably reduces the impact of lazy compilation (especially with many direct dependencies that don't depend on much else).
- Make use of a list of included symbols and/or bloom filters to decide whether a cached dependency/incremental compilation step can be used. Unless I'm mistaken, rustc/Cargo currently only uses a small hash covering the version, environment variables, conditional compilation and similar. This would for example allow using a cached dependency that contains more things than needed (late tree-shaking) instead of recompiling the same dependency with one function fewer. This is probably one of the most relevant changes to how dependencies and incremental compilation artifacts are stored/used, and to how useful lazy compilation is in general ^[11].

This means that starting to use a function may cause the dependency to recompile (thus slowing down compilation), but removing it again will not cause a rebuild. With incremental compilation this could potentially even be reduced to recompiling only that single dependency and your own crate (instead of the entire tree), but that may be hard to do.
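A minimal sketch of the reuse check described above, assuming a cached artifact records the symbols it was compiled with next to the existing hash. `CachedArtifact`, `Plan` and `plan_build` are hypothetical names, not how Cargo stores fingerprints today, and rebuilding with the union of old and new symbols is my own assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical cache entry for one lazily compiled dependency.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CachedArtifact {
    /// Stand-in for the existing hash (version, env vars, cfg, ...).
    pub fingerprint: u64,
    /// Symbols that were actually compiled into this artifact.
    pub symbols: BTreeSet<String>,
}

/// What the build should do for this dependency.
#[derive(Debug, PartialEq, Eq)]
pub enum Plan {
    /// The cached artifact already contains everything that is needed.
    Reuse,
    /// Compile (again) with exactly this set of symbols.
    Rebuild(BTreeSet<String>),
}

/// Reuse the artifact if the fingerprint matches and the needed symbols are
/// a subset of what was compiled; extra symbols in the cache are fine.
pub fn plan_build(
    cached: Option<&CachedArtifact>,
    fingerprint: u64,
    needed: &BTreeSet<String>,
) -> Plan {
    match cached {
        Some(artifact) if artifact.fingerprint == fingerprint => {
            if needed.is_subset(&artifact.symbols) {
                Plan::Reuse
            } else {
                // Assumption: keep the old symbols too, so that removing the
                // new function later does not trigger yet another rebuild.
                Plan::Rebuild(artifact.symbols.union(needed).cloned().collect())
            }
        }
        // No cache entry, or version/cfg/etc. changed: start from scratch.
        _ => Plan::Rebuild(needed.clone()),
    }
}

/// Small helper to build a symbol set from string literals.
pub fn symbols(names: &[&str]) -> BTreeSet<String> {
    names.iter().map(|s| s.to_string()).collect()
}
```

With a cache containing `dep::a` and `dep::b`, asking for only `dep::a` gives `Plan::Reuse`, while asking for `dep::a` and `dep::c` gives a rebuild with all three symbols. A bloom filter could only act as a cheap pre-check in front of the exact set here: it can rule a symbol out, but it cannot confirm that a symbol is present.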

Another option could be to just say "if you have to recompile a crate because a new function is used: do not use lazy compilation for that crate". That way the cost of recompiling the tree is reduced to only happening a single time, but once (for example) you update the dependency you get lazy compilation again.
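As a rough sketch of that fallback rule (all names are hypothetical, and keying the decision on a version string is just my simplification): remember the version at which a crate had to be recompiled because of a newly used symbol, and compile it completely for as long as that version stays the same.

```rust
use std::collections::HashMap;

/// How a dependency should be compiled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Lazy,
    Full,
}

/// Hypothetical per-workspace state for the fallback rule.
#[derive(Debug, Default)]
pub struct ModePolicy {
    /// Crate name -> version at which it was pinned to full compilation.
    pinned: HashMap<String, String>,
}

impl ModePolicy {
    /// Lazy by default; full if this exact version was pinned earlier.
    pub fn mode(&self, krate: &str, version: &str) -> Mode {
        match self.pinned.get(krate) {
            Some(pinned_version) if pinned_version.as_str() == version => Mode::Full,
            _ => Mode::Lazy,
        }
    }

    /// Call this when a lazily compiled crate had to be rebuilt because a
    /// symbol was needed that its cached artifact did not contain.
    pub fn record_symbol_miss(&mut self, krate: &str, version: &str) {
        self.pinned.insert(krate.to_string(), version.to_string());
    }
}
```

After `record_symbol_miss("dep", "1.0.0")`, `mode("dep", "1.0.0")` returns `Mode::Full`, while `mode("dep", "1.0.1")` is `Mode::Lazy` again, which matches the "once you update the dependency you get lazy compilation again" part.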

Regarding the use of feature flags to reduce compile times

I think this indicates that (although they are really useful for this) feature flags are not the right/best solution for early dead-code-elimination.

Regarding the current incremental progress to early dead-code-elimination

I think this is leading to a local optimum: one where the above-mentioned things probably still need to happen and would still have a significant impact on compile times. At that point you either accept the compile times as they are, or you have to make (potentially even more) changes that increase compile times to get from that local optimum to a place where compile times are even lower.

A documented, high-level (long-term) plan for lazy evaluation may help here. ^[12]

[1] If that's the case: Sorry for being yet another person to bring up this topic.
[2] If not impossible.
[3] Nor do I know a way to ask rustc/Cargo to warn/inform when a pub function/struct is used nowhere in the workspace or build graph, which would allow similar but less detailed statistics.
[4] At least for crates with many dependencies.
[5] As an aside: Are there statistics on, or is there a way to know, how much of the time spent on proc-macros goes into compiling them and how much into executing them?
[6] See quote at the end.
[7] Depending on how detailed they are of course.
[8] Which means the following could also be complete garbage.
[9] I doubt I'm the first one to come up with these.
[10] This complicates the build plan/graph of course.
[11] And probably having it be useful when combined with incremental compilation and cached dependencies in general.
[12] That probably exists, perhaps just in heads but not documented. Any links/references to this?

bjorn3 · October 7, 2024, 8:28pm

The keyword you are looking for is mir-only rlibs.

