Slow incremental compilation when changing small things or comments

Get a big crate. Change let x = "foo"; to let x = "bar";, or add a comment. Watch as your life slips between your fingers after you type cargo check, or even worse with cargo build.

One would think that this should be instant, especially with incremental compilation, but it's not. It is even worse when changing a crate that other crates in a workspace depend on.

Is it possible that this kind of change could be special-cased in incremental compilation?

Changing a constant and adding a comment differ a lot in their potential consequences.

In many cases, most of the time in long incremental builds is spent linking. The easiest way to improve on that is to switch to dynamic linking during development, which is much faster: https://robert.kra.hn/posts/2022-09-09-speeding-up-incremental-rust-compilation-with-dylibs/


> In many cases, most of the time in long incremental builds is spent linking

Not in check builds, where the problem also occurs.

One of the tricky bits of incremental compilation is that some things depend on spans (line and column) of source text: panic sites and debugger information. For example,

fn foo() {
    // delete this comment line?
}

fn bar() {
    panic!("hello world");
}

If you change the height of foo’s text, then the panic message produced by bar() changes its line number. Because that implicit string literal has changed, other things that depend on it have to be recomputed too. So, if you avoid span changes (e.g. by using more, smaller files, so that fewer things come after the code you edit), you can reduce the rebuild time. In my previous testing (unfortunately, I can’t recall where I might have written down the results) this was, I think, something like a 10% speedup in the scenario I was testing.

It might also help to disable debug info to further reduce dependence on line numbers (and reduce the work of compilation in general), but I haven’t tried this.
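In case anyone wants to try that experiment, a minimal sketch as a Cargo profile setting (debug is a standard Cargo profile key; whether it actually helps here is untested, as said above):

[profile.dev]
# Skip emitting debug info in dev builds. This removes one consumer of
# line/column spans and reduces codegen work, at the cost of backtrace
# detail and debugger support.
debug = false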

There is probably also potential to make the compiler better at partitioning this kind of thing from other queries that don't need to be recomputed, but I don't know the details of the compiler's subsystems, so I can't comment on how feasible that is. I know that it wouldn't be a matter of "well, don't recompute that when you don't need to", because the query system already handles that automatically for everything; it would involve changing what queries exist and how they depend on each other (probably breaking queries up into smaller ones).


This is being tracked in "Downstream dependencies of a crate are rebuilt despite the changes not being public-facing" (rust-lang/cargo issue #14604 on GitHub).


IIRC there was discussion of changing spans for proc-macro expansion caching. I'm unsure if it's limited to that case or will help in more of these cases.

@davidlattimore has done some investigation into incremental compilation performance. IIRC one of the problems is with the query system.

The span issue, mentioned by @kpreid, is something I've thought a bit about before. I'd really like it if spans could be made relative to the named item that contains them. So for example in the code given above, the panic message would contain a span relative to the start of the function bar and a DefId of bar, or some stable ID derived from the item path. At runtime, or when a panic occurs, the actual line number could be looked up by calling some function, passing the relative span, the DefId and a reference to a table that maps DefIds to their spans. Structured like this, when you make an edit, all that would need to change would be the function you edited and the table mapping DefIds to spans.
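A minimal sketch of that idea (every name here is hypothetical; none of this is an actual rustc API):

use std::collections::HashMap;

// Hypothetical stable identifier for a named item, e.g. derived from its path.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct DefId(u64);

// A span stored relative to the start of the item that contains it.
#[derive(Clone, Copy)]
struct RelativeSpan {
    lines_from_item_start: u32,
    column: u32,
}

// Table mapping each item to the absolute line where it starts. Editing one
// function changes only that function and this table; the relative spans
// baked into every other item stay identical, so nothing else needs
// recomputing.
struct SpanTable {
    item_start_lines: HashMap<DefId, u32>,
}

impl SpanTable {
    // Resolve a relative span to an absolute line number, e.g. at the
    // moment a panic actually occurs.
    fn absolute_line(&self, item: DefId, span: RelativeSpan) -> Option<u32> {
        self.item_start_lines
            .get(&item)
            .map(|start| start + span.lines_from_item_start)
    }
}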

A related issue is the size of the codegen unit. At the moment, even in debug builds, a lot of functions are packed together into a codegen unit. At least for a non-optimised (dev) build, it'd be ideal if each function were a separate codegen unit. That way, when a single function gets changed, only that one function needs to be recompiled. If that changed function was inlined into another function, then it too would need to be recompiled, but other functions that weren't changed shouldn't need to be. I think I recall @bjorn3 mentioning that the cranelift backend compiles each function separately.

Another thing related to incremental compilation that I've thought a bit about is whether it's primarily pull-based or push-based. In a pull-based model (which is what exists now), the compiler starts by effectively asking what it needs in order to build the binary. It parses everything, then runs queries, reusing cache hits from previous runs. One issue with this is that if you have a very large tree of queries, you need to traverse the tree right down to the leaves before you can determine that those leaves, and thus their parents in the tree, haven't actually changed. Another problem is that some things don't lend themselves to queries like this at all because they always change. An example is the list of monomorphised items, i.e. the list of all functions that need to be passed to codegen. Any code edit might have changed this list, so it doesn't make sense to cache it. Recomputing it from scratch on every compile takes time and is something that often shows up in -Ztime-passes.
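To caricature the pull-based shape (hypothetical names, nothing rustc-specific): even on a full cache hit, the "is this up to date?" question has to recurse all the way down to the leaves.

use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct QueryKey(u64);

struct PullCache {
    // For each query: its cached result and the queries it depended on.
    cached: HashMap<QueryKey, (String, Vec<QueryKey>)>,
}

impl PullCache {
    // Pull model: a cached result can only be trusted after validating
    // every dependency, transitively, down to the leaf inputs.
    fn is_up_to_date(&self, key: QueryKey, changed_inputs: &[QueryKey]) -> bool {
        if changed_inputs.contains(&key) {
            return false;
        }
        match self.cached.get(&key) {
            Some((_result, deps)) => deps
                .iter()
                .all(|dep| self.is_up_to_date(*dep, changed_inputs)),
            None => false,
        }
    }
}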

The alternative model (although potentially both models can be used together) is push-based. In a push-based model, the compiler starts by determining what inputs have changed, e.g. it looks at all its input files and finds that just one file has changed. It then reparses just that one file. Taking the parsed items, it pushes these changes through the stages of the compiler. Only the items that have actually changed need to be pushed. So once you get to, say, the list of monomorphised items, rather than computing it from scratch, you've got some deltas adding, removing or redefining some functions.
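The push-based counterpart, in the same hypothetical style: each stage consumes deltas and emits deltas, so items that didn't change are never even visited.

use std::collections::HashSet;

// Deltas describing how one stage's output changed, e.g. the list of
// monomorphised functions that gets handed to codegen.
enum Delta {
    Added(String),
    Removed(String),
    Redefined(String),
}

// Push model: keep the previous list of mono items and apply only the
// deltas pushed down from earlier stages, instead of recomputing the
// whole list from scratch on every build.
fn apply_deltas(mono_items: &mut HashSet<String>, deltas: Vec<Delta>) -> Vec<String> {
    let mut needs_codegen = Vec::new();
    for delta in deltas {
        match delta {
            Delta::Added(item) => {
                mono_items.insert(item.clone());
                needs_codegen.push(item);
            }
            Delta::Removed(item) => {
                mono_items.remove(&item);
            }
            Delta::Redefined(item) => {
                needs_codegen.push(item);
            }
        }
    }
    // Only these items need recompiling; an incremental linker could be
    // handed exactly this set rather than every object in the crate.
    needs_codegen
}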

A push-based model is also ideal for integrating with an incremental linker, since it can pass just the bits that have changed to the linker rather than passing everything and making the linker figure out what has changed. I've been writing a linker called Wild with the plan to make it incremental, so this has been on my mind.


Cranelift compiles one function at a time; however, cg_clif currently still compiles and caches a single object file for each codegen unit, the same way as cg_llvm. A single object file for each individual mono item would likely have too much overhead. In the future I may add caching for individual functions, however.


That's a really interesting point! For a long time I've assumed that pull-based would be ideal, but you're right that given clear dependency information, push-based has the potential to eliminate a lot of "is this up to date" checks.


Note that there's no need to re-architect the compiler to accomplish that output for the benefit of an incremental linker. We can instead keep everything as is except the monomorphization/codegen steps, where we can check whether the DefId of the item being generated has (transitively) changed and, if not, not evaluate it further. This doesn't bring all of the theoretical performance benefit, as some redundant work still happens before those stages, but it could significantly cut the work the linker has to do, without as big an engineering lift to get it working in the first place. I think it makes sense to attempt that first; once codegen and Wild are working correctly with each other, we can go back and reduce redundant work one stage at a time, earlier and earlier in the process. For the record, I don't think that parsing the full crate is ever a significant drain, but I would love it if we ever got to the point of "we only reparse files that got edited and take it from there".
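A rough sketch of that gate, with hypothetical names (the real compiler would track this through its query fingerprints rather than an explicit call-graph walk; the sketch also assumes an acyclic call graph for brevity):

use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct DefId(u64);

// Has this item, or anything it (transitively) calls, changed since the
// last build? No memoisation and no cycle handling, to keep it short.
fn transitively_changed(
    item: DefId,
    changed: &HashSet<DefId>,
    callees: &HashMap<DefId, Vec<DefId>>,
) -> bool {
    changed.contains(&item)
        || callees.get(&item).map_or(false, |deps| {
            deps.iter()
                .any(|dep| transitively_changed(*dep, changed, callees))
        })
}

// The gate before codegen: only items whose inputs changed are evaluated
// further and handed to the backend (and, later, an incremental linker).
fn items_to_codegen(
    all_items: &[DefId],
    changed: &HashSet<DefId>,
    callees: &HashMap<DefId, Vec<DefId>>,
) -> Vec<DefId> {
    all_items
        .iter()
        .copied()
        .filter(|item| transitively_changed(*item, changed, callees))
        .collect()
}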


Can we upstream this, or some version of this, into a cargo subcommand?

The unfortunate thing is that in this version you have to add your dependencies in a special way (it basically just creates a wrapper crate)... but surely cargo could do this transparently with your normally added dependencies.
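For anyone who hasn't read the linked post, the wrapper-crate trick looks roughly like this (the crate name and dependency are placeholders; crate-type = ["dylib"] is standard Cargo):

# Cargo.toml of a hypothetical wrapper crate. The heavy dependencies move
# here and get compiled once into a dynamic library, so the final link of
# the main binary has far less to do on each incremental build.
[package]
name = "deps-wrapper"   # placeholder name
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["dylib"]

[dependencies]
# The main crate then depends on deps-wrapper, whose lib.rs re-exports
# these (e.g. pub use serde;), instead of depending on them directly.
serde = "1"

This is exactly the manual step that cargo could in principle perform transparently.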