Help us benchmark incremental compilation!


#1

As we come down the home stretch for 2017, @michaelwoerister and others have been hard at work getting incremental compilation ready to ride the trains to stable. But we need your help to get it across the finish line!

Here’s how you can help:

  1. Grab a project with poor compile times
  2. Do cargo +nightly build --release
  3. Do:
    • rm -rf target, then
    • time cargo +nightly build --release
  4. Do:
    • rm -rf target, then
    • CARGO_INCREMENTAL=1 time cargo +nightly build --release
  5. Do:
    • touch lib.rs, then
    • CARGO_INCREMENTAL=1 time cargo +nightly build --release
  6. Do:
    • A trivial change to a method body (like adding a no-op), then
    • CARGO_INCREMENTAL=1 time cargo +nightly build --release
  7. Post the project and numbers here!
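For convenience, steps 3–5 can be sketched as a small shell script. This is just an illustrative sketch, not an official tool: it assumes you run it from a Cargo project root with a `src/lib.rs` layout and a nightly toolchain installed. Each command is echoed before it runs, and nothing is executed unless a `Cargo.toml` is present, so it is safe to dry-run elsewhere.

```shell
#!/bin/sh
# Sketch of steps 3-5 above. Each command is echoed first; commands are
# only actually executed when a Cargo.toml is present in the current
# directory, so running this outside a project just prints the plan.
run() {
    echo "+ $*"
    if [ -f Cargo.toml ]; then "$@"; fi
}

run rm -rf target                       # step 3: cold build, incremental off
run time cargo +nightly build --release

run rm -rf target                       # step 4: cold build, incremental on
run env CARGO_INCREMENTAL=1 time cargo +nightly build --release

run touch src/lib.rs                    # step 5: rebuild after touching the crate root
run env CARGO_INCREMENTAL=1 time cargo +nightly build --release
```

Step 6 (a trivial edit to a method body) has to be done by hand, since it changes source code.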

We want to hear about the timings you see, in part so that we can beef up the benchmark suite with a much wider range of examples. In the next several weeks, we expect some major improvements to incremental to land (e.g., being able to skip typeck), and we’d like to track the improvements to these numbers as that happens.

Note: these timings are based on the release profile because, today, incremental doesn’t help much with debug. That will be changing in the very near future.

For compiler team members: beyond extra benchmarks, what improvements to http://perf.rust-lang.org/ would be most helpful to track performance in incremental mode? Is it enough to have an easy-to-get “view”, using today’s knobs but not having to set them yourself? Or is there a view of the data you want but can’t get with today’s perf?

For everyone: if you have more thoughts about what/how we should be measuring or presenting measurements, please speak up!


#2

cc @michaelwoerister @nikomatsakis @alexcrichton @arielb1 @Mark_Simulacrum


#3

Project: cargo-script (specifically, commit 614e60e).

Results:

Kind        Run 1     Run 2     Run 3     Mean      SD
Full LTO    262.8 s   257.9 s   259.1 s   259.9 s   2.593
Full        189.7 s   189.3 s   188.9 s   189.3 s   0.411
Noop        23.5 s    23.4 s    23.9 s    23.6 s    0.288
Full Inc    96.7 s    105.7 s   102.5 s   101.6 s   4.568
Touch Inc   5.0 s     4.6 s     4.9 s     4.8 s     0.222
Noop Inc    7.5 s     7.3 s     7.4 s     7.4 s     0.108

The “Noop” change was to append fn _dummy() {} to the end of main.rs. Only the runs ending in “Inc” had incremental compilation turned on.
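The Mean and SD columns can be reproduced from the three runs, e.g. with Python's statistics module (this uses the sample standard deviation; the posted SDs were presumably computed from unrounded timings, which is why the last digits differ slightly):

```python
from statistics import mean, stdev  # stdev = sample standard deviation

# Three runs per configuration, copied from the table above.
runs = {
    "Full LTO":  [262.8, 257.9, 259.1],
    "Full":      [189.7, 189.3, 188.9],
    "Noop":      [23.5, 23.4, 23.9],
    "Full Inc":  [96.7, 105.7, 102.5],
    "Touch Inc": [5.0, 4.6, 4.9],
    "Noop Inc":  [7.5, 7.3, 7.4],
}

for kind, xs in runs.items():
    print(f"{kind:10} mean={mean(xs):6.1f} s  sd={stdev(xs):.3f}")
```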


#4

My (currently) private 3D project which utilizes Vulkano for point cloud visualization.

  3. 365.36 secs
  4. 235.30 secs
  5. 3.51 secs
  6. 8.51 secs (added to_le() to one variable)

Without incremental compilation, the trivial change takes 40–50 seconds to compile.


#5

Project: Gutenberg at commit 3f9d394.

Results (using the same numbers as in the original post):

  2. 723.96 secs
  3. cargo +nightly build --release 2065.99s user 25.46s system 238% cpu 14:36.46 total
  4. cargo +nightly build --release 1983.41s user 23.65s system 248% cpu 13:28.80 total
  5. cargo +nightly build --release 41.57s user 0.48s system 98% cpu 42.796 total
  6. cargo +nightly build --release 43.68s user 0.56s system 97% cpu 45.476 total

Touching one lib.rs of a workspace crate: cargo +nightly build --release 154.71s user 1.17s system 132% cpu 1:58.03 total


#6

@aturon Maybe 3. should explicitly disable incremental compilation just to be safe?

I think anything that uses lalrpop is probably a good benchmark; for example, the gluon parser. I ran the steps with CARGO_INCREMENTAL set explicitly each time:

3.)

parser git:(master) ✗ CARGO_INCREMENTAL=0 time cargo +nightly build --release => 421.18 secs

4.)

parser git:(master) ✗ CARGO_INCREMENTAL=1 time cargo +nightly build --release => 220.70 secs

5.)

parser git:(master) ✗ CARGO_INCREMENTAL=1 time cargo +nightly build --release => 7.72 secs

extra.) touch lib.rs but with CARGO_INCREMENTAL=0

parser git:(master) ✗ CARGO_INCREMENTAL=0 time cargo +nightly build --release => 24.95 secs

nightly

➜  parser git:(master) ✗ rustc --version --verbose
rustc 1.23.0-nightly (d9f124965 2017-10-27)
binary: rustc
commit-hash: d9f124965551e8fb9403def6a715e13dfd5e9c95
commit-date: 2017-10-27
host: x86_64-unknown-linux-gnu
release: 1.23.0-nightly
LLVM version: 4.0

#7

This is for a moderately sized project I’ve been developing at work (can’t share the repo though :frowning: ). Commit bcc8172.

$ unset CARGO_INCREMENTAL
$ rustc +nightly --version --verbose
rustc 1.23.0-nightly (2be4cc040 2017-11-01)
binary: rustc
commit-hash: 2be4cc040211a85b17f21e813ff62351ae4de642
commit-date: 2017-11-01
host: x86_64-unknown-linux-gnu
release: 1.23.0-nightly
LLVM version: 4.0

# point 3
$ rm -rf target
$ time cargo +nightly build --release
1570.75s user 10.33s system 271% cpu 9:42.47 total

# point 4
$ rm -rf target
$ CARGO_INCREMENTAL=1 time cargo +nightly build --release
1608.94s user 12.92s system 349% cpu 7:43.97 total


# point 5
$ touch lib.rs 
$ CARGO_INCREMENTAL=1 time cargo +nightly build --release
2.51s user 0.21s system 100% cpu 2.691 total

# Point 6
# Add "let _x = 5;" to one of the files
$ CARGO_INCREMENTAL=1 time cargo +nightly build --release
2.77s user 0.20s system 107% cpu 2.766 total

#8

Project: media-toc at ba3713b

rust version 1.23.0-nightly (2be4cc040 2017-11-01)

  2. 210.81 secs
  3. 218.64 secs
    real 3m38,946s user 7m55,918s sys 0m7,200s

  4. 165.49 secs
    532.42user 7.86system 2:45.70elapsed 326%CPU (0avgtext+0avgdata 1329836maxresident)k
    1360inputs+566688outputs (9major+2934617minor)pagefaults 0swaps

  5. 3.66 secs
    3.50user 0.37system 0:03.87elapsed 100%CPU (0avgtext+0avgdata 560688maxresident)k
    0inputs+60224outputs (0major+199198minor)pagefaults 0swaps

  6. 3.83 secs
    4.24user 0.37system 0:04.04elapsed 114%CPU (0avgtext+0avgdata 552592maxresident)k
    0inputs+60432outputs (0major+191183minor)pagefaults 0swaps
    

For 6, I added a println!.


#9

Here’s my results:

3.    Finished release [optimized] target(s) in 227.2 secs

real    3m47.627s
user    13m7.008s
sys     0m18.948s

4.    Finished release [optimized] target(s) in 192.88 secs
712.74user 17.50system 3:13.47elapsed 377%CPU (0avgtext+0avgdata 350900maxresident)k
0inputs+522008outputs (0major+1833555minor)pagefaults 0swaps

5.    Finished release [optimized] target(s) in 3.52 secs
3.58user 0.54system 0:04.14elapsed 99%CPU (0avgtext+0avgdata 329160maxresident)k
0inputs+31344outputs (0major+57270minor)pagefaults 0swaps

6.    Finished release [optimized] target(s) in 4.73 secs
5.71user 0.54system 0:05.32elapsed 117%CPU (0avgtext+0avgdata 326920maxresident)k
0inputs+31624outputs (0major+52478minor)pagefaults 0swaps

I’m not comfortable sharing my code, but it’s a slideshow viewer that depends on piston and image.


#10

Hello,

Here are the results for the Sōzu HTTP reverse proxy. Code changes are done in the command library that every other crate of the project uses:

  3. 222.40 secs
  4. 129.31 secs
  5. 13.37 secs
  6. 13.95 secs

That’s awesome!


#11

Personal game project based on piston:

  3. rm -rf target/ && time cargo +nightly build --release

real 1m43.398s user 7m44.359s sys 0m16.399s

  4. rm -rf target/ && CARGO_INCREMENTAL=1 time cargo +nightly build --release

    87.72 real       486.96 user        18.64 sys

  5. touch src/main.rs && CARGO_INCREMENTAL=1 time cargo +nightly build --release

     1.63 real         1.27 user         0.34 sys

  6. CARGO_INCREMENTAL=1 time cargo +nightly build --release

     2.51 real         2.75 user         0.36 sys

  7. time cargo +nightly build --release

     7.94 real         7.60 user         0.31 sys
    

The numbers for step 7 measure undoing the previous no-op change (I had added a (); line to a function in src/main.rs) and seeing how long the rebuild takes without incremental compilation enabled.


#12

Hi, I’m working on a private project that performs some data processing and computation. Here’s the compilation speed comparison:

  3. 94.353 total
  4. 59.860 total
  5. 3.761 total
  6. 3.956 total

I benchmark this app regularly, and it is worrying that the incrementally compiled version has significantly worse runtime performance than the normal one.

Normal release version processing time: 0.140sec total

CARGO_INCREMENTAL=1 release version processing time: 0.512sec total

That’s a 3.6x slowdown for the incrementally compiled version. Is this expected?


#14

Some slowdown is expected, although that’s a pretty big one.

Good news: ThinLTO should be enabled by default very soon, which should bring the performance of incremental builds on par with that of normal builds.


#15

Project Dora:

  3. 134s
  4. 170s
  5. 4.38s
  6. 6.11s

Somehow 4) is slower than 3) for me; for most projects posted here it seems to be the other way around…


#16

I decided to run it on a small crate with basically no dependencies, to make sure there are no significant regressions there. I’m also on Windows.

First build, just to give you a general idea.

> Measure-Command { cargo +nightly build --release }
   Compiling semver-parser v0.7.0
   Compiling semver v0.9.0 (file:///C:/Users/steve/src/semver)
    Finished release [optimized] target(s) in 4.7 secs


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 4
Milliseconds      : 541
Ticks             : 45413291
TotalDays         : 5.25616793981481E-05
TotalHours        : 0.00126148030555556
TotalMinutes      : 0.0756888183333333
TotalSeconds      : 4.5413291
TotalMilliseconds : 4541.3291

with incremental on:

> Measure-Command { cargo +nightly build --release }
   Compiling semver-parser v0.7.0
   Compiling semver v0.9.0 (file:///C:/Users/steve/src/semver)
    Finished release [optimized] target(s) in 4.43 secs


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 5
Milliseconds      : 71
Ticks             : 50710922
TotalDays         : 5.86931967592593E-05
TotalHours        : 0.00140863672222222
TotalMinutes      : 0.0845182033333333
TotalSeconds      : 5.0710922
TotalMilliseconds : 5071.0922

touching lib.rs:

> Measure-Command { cargo +nightly build --release }
   Compiling semver v0.9.0 (file:///C:/Users/steve/src/semver)
    Finished release [optimized] target(s) in 0.92 secs


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 1
Milliseconds      : 354
Ticks             : 13546301
TotalDays         : 1.56785891203704E-05
TotalHours        : 0.000376286138888889
TotalMinutes      : 0.0225771683333333
TotalSeconds      : 1.3546301
TotalMilliseconds : 1354.6301

adding a let binding to a method

> Measure-Command { cargo +nightly build --release }
   Compiling semver v0.9.0 (file:///C:/Users/steve/src/semver)
warning: unused variable: `x`
   --> src\version.rs:241:13
    |
241 |         let x = 1;
    |             ^
    |
    = note: #[warn(unused_variables)] on by default
    = note: to avoid this warning, consider using `_x` instead

    Finished release [optimized] target(s) in 1.10 secs


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 1
Milliseconds      : 590
Ticks             : 15908323
TotalDays         : 1.84124108796296E-05
TotalHours        : 0.000441897861111111
TotalMinutes      : 0.0265138716666667
TotalSeconds      : 1.5908323
TotalMilliseconds : 1590.8323

#17

Project: ggez

Compiler: rustc 1.23.0-nightly (2be4cc040 2017-11-01)

Times:

  1. 733.83user 11.20system 7:26.52elapsed 166%CPU (0avgtext+0avgdata 399868maxresident)k
  2. 701.60user 11.82system 6:22.74elapsed 186%CPU (0avgtext+0avgdata 551944maxresident)k
  3. 7.16user 0.47system 0:07.75elapsed 98%CPU (0avgtext+0avgdata 550680maxresident)k
  4. 7.51user 0.43system 0:07.77elapsed 102%CPU (0avgtext+0avgdata 538092maxresident)k

Change in 6 was just adding a println!


#18

Hello,

I tried a crate which used to compile strangely on my machines in the past: nrc’s rustw.

Version + platform:

$ rustc +nightly --version --verbose
rustc 1.23.0-nightly (2be4cc040 2017-11-01)
binary: rustc
commit-hash: 2be4cc040211a85b17f21e813ff62351ae4de642
commit-date: 2017-11-01
host: x86_64-pc-windows-msvc
release: 1.23.0-nightly
LLVM version: 4.0

  3. rm -rf target ; time cargo +nightly build --release
real    3m11.555s
user    0m0.046s
sys     0m0.000s
  4. rm -rf target ; CARGO_INCREMENTAL=1 time cargo +nightly build --release
real    2m8.805s
user    0m0.016s
sys     0m0.015s
  5. touch lib.rs ; CARGO_INCREMENTAL=1 time cargo +nightly build --release
real    0m7.507s
user    0m0.000s
sys     0m0.046s
  6. no-op change ; CARGO_INCREMENTAL=1 time cargo +nightly build --release
real    0m12.341s
user    0m0.000s
sys     0m0.015s
  7. undoing the no-op change and recompiling the crate without incremental
real    1m10.605s
user    0m0.031s
sys     0m0.015s

#19

Some thoughts off the top of my head:

  • The interface is ridiculously sluggish. The only way to use it is to click Hide All first thing and then enable various tests. Also, if you want to see why some spike occurred, you have to click-and-drag to zoom in and then select the exact point where performance dropped. It feels like these very common operations could be more streamlined. But honestly just being faster would be a big part of that.
  • It’d be useful, I think, to have the ability to group the tests in some logical way, so that one can enable/disable test suites instead of individual tests.
  • Further, right now for a single incremental test foo we actually have N measurements that are displayed as if they were individual tests – in particular, things like `foo@010-bar` and `foo@020-baz`. These are successive measurements of a single directory being built and rebuilt after changes have been applied. It would be very useful to group them together in some way so that they are enabled/disabled as a group.
    • The convention is:
      • 000-baseline is a non-incremental build
      • 010-incr is a “cold” incremental build with no incremental state
      • 020-clean is a second incremental build with no changes – kind of the “best case”
      • further changes are applying various diffs, always relative to the previous build
    • Therefore, comparing e.g. 020-clean to 000-baseline tells you how much faster an individual change was relative to today.
    • The default display normalizes all lines relative to themselves. This is perhaps useful for incremental tests but misleading: it seems like it would be useful to normalize foo@020-baz relative to foo@000-baseline – i.e., how long did it take to do some incremental change, relative to the base, non-incremental build? Or at least it ought to be easy to get that data out. (You can change to absolute measurements now, but then you have to guesstimate the percentages if you want that.)
  • It’d be great to have the ability to generate a (short) URL that can be given so that other people can see the same view that you are seeing.

Finally, not a UI thing, but:

  • It’d be great to measure disk space consumed by the incremental directory. Even better if we can chart that depending on the kind of file.
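The "normalize against the baseline" idea can be illustrated with a tiny sketch. The measurement names follow the convention described above; the timings themselves are invented for illustration, not real perf data:

```python
# Express each measurement in an incremental series as a percentage of
# the non-incremental 000-baseline build. Timings are made-up examples.
series = {
    "foo@000-baseline": 100.0,  # non-incremental build
    "foo@010-incr":     120.0,  # "cold" incremental build, no prior state
    "foo@020-clean":      5.0,  # incremental rebuild with no changes
}

def normalize(series, baseline="foo@000-baseline"):
    base = series[baseline]
    return {name: 100.0 * secs / base for name, secs in series.items()}

for name, pct in normalize(series).items():
    print(f"{name:18} {pct:6.1f}% of baseline")
```

With this view, foo@020-clean reads directly as "5% of a from-scratch build", which is the comparison the bullet above asks for.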

#20

For my own project with lots of crates:

3: non-incremental: 167.42s

4: fresh incremental: 163.55s

5: “touch” incremental: 1.34s

6: trivial change incremental: 1.58s


#21

Not expected, but we have seen incremental makes things slower, particularly in crates where there is just one module (which means that we don’t get good re-use, but we still pay the overhead of tracking). We plan to make some changes to address this – still, 3.6x is a lot! How private is this project? :slight_smile: Maybe we can have you run some more detailed commands to get further data? If nothing else, running with -Zincremental-info might be helpful.