What if we didn't compile dead code?


#1

In my post about compile times earlier this year, I wondered if we might get some benefit from a lazier compiler. At the time, I had a hunch that a lot of what gets compiled when we build a dependency isn’t actually used.

I did a quick test today on a project with a single dependency, which is far from a decent sample set. Still, I wanted to put what I found here for discussion. It’s, at the very least, interesting to see.

The transformation I tried was pretty simple: take an external dependency, move it into your code as a module, and fix up the references. At this point, it’ll build in the same amount of time as it did before. Makes sense. It’s the same amount of code, after all.

Next, I noticed that I had over 40 “never used” warnings for functions, methods, and structs at this point.

E.g.:

warning: struct is never used: `WavIntoSamples`
   --> src/main.rs:915:1
    |
915 | / pub struct WavIntoSamples<R, S> {
916 | |     reader: WavReader<R>,
917 | |     phantom_sample: marker::PhantomData<S>,
918 | | }
    | |_^

warning: method is never used: `read_wave_header`
   --> src/main.rs:924:5
    |
924 | /     fn read_wave_header(reader: &mut R) -> Result<u32> {
925 | |         // Every WAVE file starts with the four bytes 'RIFF' and a file length.
926 | |         // TODO: the old approach of having a slice on the stack and reading
927 | |         // into it is more cumbersome, but also avoids a heap allocation. Is
...   |
941 | |         Ok(file_len)
942 | |     }
    | |_____^

My theory in the original post was that we should be lazy when we compile dependencies, meaning we shouldn’t pull in functions or structs we don’t need. While this is important for responsiveness in IDEs, I thought it might also improve compile times.

I used these warnings to help me find dead code, which I removed. Here are some stats:

Before:

  • Debug: 1.0 secs
  • Release: 1.19 secs
  • Lines of code (incl. comments): 2335

After:

  • Debug: 0.69 secs
  • Release: 0.94 secs
  • Lines of code (incl. comments): 595

Note that my application is still the same. I just removed all the code I didn’t need to build it.

In the end, my app only needed about a quarter to a third of the code in the dependency. A quick look at why points to a few areas. All test functions can be trivially ignored, and some functionality that only the test cases needed could be safely removed once they were gone. Beyond that, my crate only needed some of the exported functionality. In particular, the crate I was depending on, Hound, works with a variety of .wav files, but I only needed one kind, so all the code supporting the other formats could be safely ignored. The same goes for reading .wav files: since my code was focused on writing rather than reading, the reading functionality wasn’t necessary.

I recognise this is an apples-to-oranges comparison. Today, the Rust compiler is thorough: it checks all dependent crates for any issues. My question is: does it need to? If you only use less than a third of your dependency’s code, do you really need to pay for the compilation time of the other two-thirds? Maybe there are tricks we can use in the compiler to get some of that time back.

Not sure how much this translates for other projects and their use of dependencies, but I think it’s worth a look.


#2

Aren’t dependencies only compiled when you first do cargo build? It is annoying when setting up a project, but I don’t see how this impacts IDE responsiveness.


#3

I don’t see how this impacts IDE responsiveness.

I also agree that IDE responsiveness isn’t really a benefit here, but for a different reason: in an IDE you don’t need to look at the bodies of methods at all (even if they are used), but you do have to look at the headers of even unused methods, for “navigate to symbol”, autoimport, and probably other features.

I wonder how this unused code affects linking time, though… For example, in Cargo we have a ton of integration tests, and each of these tests links to the Cargo library, which gives about 7GB of test artifacts… I bet not all the code there is actually used, and it would be nice not to write 7GB to disk on each cargo test.


#4

I meant that laziness as a general feature is good for IDEs. This specific application of it isn’t specific to IDEs, but rather to build time for the first build.


#5

#[inline] or generic code is a bit like this already, isn’t it? At least for some of the compilation passes, in particular the whole LLVM side: the monomorphized code is only ever compiled where it is used.

  • You can speed up your library’s compilation by applying #[inline] everywhere (only relevant for items without type parameters, since generic code already works this way)
  • But… now the usage side has more code to compile in its own crate, so its compiles take longer. You compile only what you use, but recompiles of just the usage crate are larger.

#6

This is a different model of compilation. One way to think of it is to imagine that cargo started with your project first and didn’t build any of the dependencies. Then, as your project pulled in a dependency, it went and found it and built only the parts it needed from that dependency. The rest, as @matklad mentions, are only parsed to the point that they absolutely need to be and no more.

The end result is that your project builds successfully, but you haven’t spent time checking or compiling parts of dependencies your project doesn’t use.


#7

I feel like this is a long-term goal of the query system (along with MIR-only rlibs) – or maybe that has just been in my head. Right now, when handling queries for a crate X, when we get to the dependencies of crate X, we wind up loading the answers we need from metadata. But I think longer term we can direct those queries to the RLS, which might then invoke a compiler for the dependency crate instead. The nifty thing is that we could do this without deep changes to the compiler itself – the RLS would be doing the routing, and moreover serving as a cache for incremental results.

Update: To be clear, by talking about the RLS, I’m not saying this is only useful for IDE scenarios. I’m just saying that I think we would want to factor out the “cross-crate coordination” from the “compile a single crate” job that the compiler currently does, and I’m saying that the architecture should allow for that fairly easily.


#8

If anyone is wondering “what query system” - Niko pointed me to this new doc: https://github.com/rust-lang/rust/blob/master/src/librustc/ty/maps/README.md