Intuition for rustc current incremental compilation?

I wonder if there's any write-up about what current incremental compilation can and can't do? I want to understand it better, so that I know which changes are "fast" and which changes are "slow", and to optimize the structure of my project for fast incremental compiles.

For example, at the moment I can't explain the following: changing a (non-doc) comment in rust-analyzer's middle-end crate and rebuilding the analyzer in "debug" mode takes 10s. I would expect this change to be effectively a no-op, tbh.

Here's what I am doing.

At this line I added `// ...` and I add or remove the `.` (so the line numbers stay intact). I then build the leaf crate using

λ rustc --version
rustc 1.49.0-beta.2 (bd26e4e54 2020-11-24)
λ cargo build -p rust-analyzer --bin rust-analyzer -Ztimings=info

There's debug=0 in Cargo.toml profile.
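For reference, the profile setting in question looks like this (a minimal sketch; rust-analyzer's actual workspace manifest has more settings):

```toml
# Skip emitting debuginfo in the dev profile to speed up builds.
[profile.dev]
debug = 0
```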

With all that, this is what I am seeing:

   Compiling hir v0.0.0 (/home/matklad/projects/rust-analyzer/crates/hir)
   Compiling ide_db v0.0.0 (/home/matklad/projects/rust-analyzer/crates/ide_db)
   Completed hir v0.0.0 in 0.7s
   Compiling assists v0.0.0 (/home/matklad/projects/rust-analyzer/crates/assists)
   Compiling completion v0.0.0 (/home/matklad/projects/rust-analyzer/crates/completion)
   Compiling ssr v0.0.0 (/home/matklad/projects/rust-analyzer/crates/ssr)
   Completed ide_db v0.0.0 in 0.9s
   Completed ssr v0.0.0 in 0.5s
   Completed completion v0.0.0 in 0.6s
   Compiling ide v0.0.0 (/home/matklad/projects/rust-analyzer/crates/ide)
   Completed assists v0.0.0 in 0.9s
   Compiling rust-analyzer v0.0.0 (/home/matklad/projects/rust-analyzer/crates/rust-analyzer)
   Completed ide v0.0.0 in 0.8s
   Completed rust-analyzer v0.0.0 in 1.6s
   Completed rust-analyzer v0.0.0 bin "rust-analyzer" in 5.8s
    Finished dev [unoptimized] target(s) in 10.06s

The hir crate is the middle crate being edited, rust-analyzer is the final leaf binary, and the other crates are front-end IDE bits.

I am surprised that the middle crates take up to a second to compile, as those rebuilds should be no-ops...


Unless something has changed since March, incremental compilation in rustc starts after HIR construction and ends before the linking step. That means that parsing, macro expansion, and name resolution are executed unconditionally, and that the linker will always be invoked with the full set of object files (i.e. linking is not incremental). This results in a certain amount of fixed overhead once cargo decides that rustc must be invoked.

There are also a number of additional inefficiencies that become more prominent the less work actually has to be redone. For example, crate metadata is always rebuilt in its entirety, so even if nothing changes, everything that goes into crate metadata has to be at least loaded from the incremental on-disk cache. The same is true for the dependency graph and the on-disk cache itself: they are also re-created from scratch on each subsequent rustc invocation.

These are all just general guesses, though. Self-profiling would give you a clearer picture of where the compiler actually spends its time (self-profiling also collects data on the overhead introduced by incremental compilation, such as time spent loading or persisting things from/to disk).
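A hypothetical self-profiling session might look like the following (assumes a nightly toolchain; the `summarize` tool comes from the rust-lang/measureme repository, and the output file name with its pid is a placeholder):

```shell
# Install the analysis tool for rustc's self-profile output.
cargo install --git https://github.com/rust-lang/measureme summarize

# Rebuild just the crate of interest with self-profiling enabled;
# the profile data is written to the current directory.
cargo rustc -p hir -- -Zself-profile

# Summarize the recorded query/activity timings.
summarize summarize hir-12345.mm_profdata
```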


Yeah, I think this is what I am observing here: monomorphization_collector_graph_walk and generate_crate_metadata are the biggest time sinks according to -Ztime-passes.

Yes, the monomorphization collector is a monolithic pass that causes a lot of MIR to be loaded (from upstream crates and the incremental cache). Somehow subdividing the work there so that it can be made incremental might speed up certain scenarios quite a bit.


The spans stored in, for example, MIR are relative to the start of the source file, which means that any change to a file will invalidate all functions defined after the point of the change.
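As an illustrative sketch of why this happens (not rust-analyzer code): spans record absolute positions within a file, so inserting even a single character above a function shifts the recorded span of everything below it, even though the code itself is untouched:

```rust
// Spans record absolute positions within the file. Editing anything
// above `late` (even the comment between the two functions) shifts
// `late`'s span, so the compiler sees it as changed and recomputes
// its `optimized_mir`, while `early` above the edit stays valid.
fn early() -> u32 {
    line!() // expands to the line number where this macro appears
}

// <-- adding or removing a line here shifts everything below

fn late() -> u32 {
    line!()
}

fn main() {
    println!("early at line {}, late at line {}", early(), late());
}
```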

In my experience most of the time attributed to the monomorphization collector is actually from the optimized_mir queries, which can easily be invalidated by a changed span.

You should probably use -Zself-profile. For queries, it is a much more reliable way of attributing time to specific actions than -Ztime-passes.


Yes, the performance of incremental compilation depends a lot on the shape of the dependency graph, and unnecessary/accidental dependencies can have a big impact, with span information being a particularly bad case because it also changes quite frequently (see
