Help test faster incremental debug macOS builds on nightly

Are you on macOS? Would you like a faster incremental debug build? Nightly has the answer for you! The Cargo team is looking for feedback on a new feature, which you can opt into with:

# inside of ~/.cargo/config.toml
[profile.dev]
split-debuginfo = 'unpacked'
[profile.test]
split-debuginfo = 'unpacked'

This can make your incremental debug builds several seconds faster, depending on the use case. This option is not turned on by default, but we would like to turn it on by default in the future! The Cargo team, however, would like to take some time to evaluate this change first. If you experience any breakage using this feature, please report it here or to the Cargo team.

Background

A historical pain point for macOS Rust users is that incremental debug builds can often be far slower than on other platforms. The reason is that when debuginfo is enabled (as it is by default for cargo build), rustc will automatically execute the dsymutil command line tool on the final executable. The purpose of dsymutil is to "link" debuginfo together. It works over your entire executable and creates a separate *.dSYM folder next to the binary itself; this .dSYM folder contains all the DWARF debug info you'll need for the binary.

The dsymutil tool, however, is not incremental. This means that it can take quite a long time to run, especially over larger projects that have more and more debug information. If you're not using debuginfo, or all you're using it for is filenames and line numbers in backtraces, this is often a hefty cost to pay! I've personally measured that, of a ~10s incremental build of Cargo itself on macOS, ~7s are spent purely in dsymutil. That means 70% of my incremental build time was spent doing something I rarely use!
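If you're curious how much of your own rebuild goes to this step, a rough check (sketch only; `mybin` is a placeholder for your crate's binary name) is to time dsymutil directly:

```shell
# Hypothetical sketch: measure dsymutil's cost on its own.
# "mybin" is a placeholder for your project's debug binary.
cargo build                       # produces target/debug/mybin and its *.dSYM
time dsymutil target/debug/mybin  # roughly the cost paid on every rebuild with packed debuginfo
```

This only approximates what rustc does on each link, but it gives a feel for how much of an incremental rebuild is dsymutil.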

Another downside of running dsymutil is that debuginfo takes up quite a lot of space on disk. By "linking" debug information you end up with two copies: one in the *.rlib and *.o files and one in the *.dSYM folders. Debug information can sometimes run into the gigabytes, so this can be a lot of space wasted.

Recently the -C split-debuginfo option was stabilized for macOS targets. This has been exposed in Cargo as well now through the split-debuginfo profile option. The defaults for all compilations have not changed at this time.

The Cargo team would like to change the default split-debuginfo setting for macOS to unpacked by default (it's currently packed). This might be a breaking change for some workflows, however, so that's where you come in!

  • Does changing to unpacked break your build?
  • Does it speed up your build? If so, by how much?
  • Does this make your target directory smaller? If so, by how much?
  • Do you have external tooling relying on *.dSYM files existing?
  • Do you use any tooling locally that requires *.dSYM to exist? (e.g. does your local debugger/tooling support the unpacked format of the object files)
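If you want to check what your tooling will actually see with the unpacked format, one sketch (assuming the macOS developer tools, which ship dwarfdump) is to confirm the DWARF now stays in the object files rather than a fresh .dSYM bundle:

```shell
# With split-debuginfo = 'unpacked' no new *.dSYM bundle should appear
# next to the binary; the DWARF remains in the object files.
ls -d target/debug/*.dSYM 2>/dev/null || echo "no dSYM bundle (expected with unpacked)"

# Spot-check that an object file still carries debug info:
dwarfdump --debug-info target/debug/deps/*.o | head
```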

In the future, when -C split-debuginfo is stabilized for other Unix platforms (those using ELF binaries, e.g. Linux), the Cargo team would also like to switch the default to unpacked. This will likely require another round of testing, however, and is blocked on the stabilization of -C split-debuginfo for those platforms in the first place (this is pseudo-tracked here and is known as "split DWARF" in the compiler itself).

How can I help?

First make sure you're building for macOS. Next make sure that you're on the nightly channel. Then configure Cargo to use split-debuginfo = 'unpacked'. This configuration can be done via TOML:

# inside of ~/.cargo/config.toml
[profile.dev]
split-debuginfo = 'unpacked'
[profile.test]
split-debuginfo = 'unpacked'

or you can export environment variables for your shell or build:

$ export CARGO_PROFILE_DEV_SPLIT_DEBUGINFO=unpacked
$ export CARGO_PROFILE_TEST_SPLIT_DEBUGINFO=unpacked

Next run cargo build. Then cargo test. Then incremental. Then anything else if you're feeling wild!
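Put together, the test loop looks something like the following shell session (a sketch; run it inside any Cargo project of yours, and touch src/lib.rs instead if your crate has no binary):

```shell
# Rough test loop for the unpacked split-debuginfo feature on macOS nightly.
cargo +nightly build        # full debug build
cargo +nightly test         # exercises the test profile too
touch src/main.rs           # invalidate one crate to force an incremental rebuild
time cargo +nightly build   # incremental rebuild -- previously dominated by dsymutil
```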

In other words we're looking to confirm that this is a build time win, shrinks the size of the target directory, and whether or not it breaks your tooling. If it breaks anything, we're interested in the details there too!


Minimal incremental WebRender wrench builds go from 8.7s to 3.0s for me with this change. Thanks a lot for making it happen.

What's the earliest nightly that's valid to test?

What do you mean exactly by "Then incremental"? Is incremental on by default now, so just continue working and building and testing as usual after the initial build and test?

Should we capture the size of the target directory before enabling these options? That is, between what commands do you hope to see the size of the target directory shrink?


nightly-2021-02-06 should be the first release where this is available. This will be in the 1.51 release, but unfortunately beta isn't building right now.

I think "incremental" means something like: run cargo build, then touch src/lib.rs, then time cargo build. The second call to build should use the incremental cache to rebuild the package very quickly.

When doing timing tests, I usually do them with the OS caches warm (run the build at least once and throw away that result).

For the size comparison, it would be something like run cargo build before setting the profile. Check the size of target. Then delete the target directory. Then enable the profile option and run cargo build again. Then compare the sizes.
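That methodology can be sketched as a shell session (assuming the environment-variable form of the option from the original post):

```shell
# Baseline: default (packed) debuginfo.
cargo build && du -sh target   # note the size

# Clean slate, then rebuild with unpacked debuginfo and compare.
rm -rf target
export CARGO_PROFILE_DEV_SPLIT_DEBUGINFO=unpacked
export CARGO_PROFILE_TEST_SPLIT_DEBUGINFO=unpacked
cargo build && du -sh target   # compare against the baseline size
```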

For me, building cargo itself, a full build is about 2x faster (164s to 87s) and an incremental build is over 6x faster (69s to 11s). The target directory size went from 1.1GB to 0.8GB.


pulldown-cmark
before: cargo build 10.61s user 1.52s system 199% cpu 6.064 total, target size: 46M
after: cargo build 9.93s user 1.35s system 216% cpu 5.215 total, target size: 34M

cargo
before: cargo build 349.55s user 33.93s system 484% cpu 1:19.11 total, target size: 1.1G
after: cargo build 330.23s user 31.25s system 506% cpu 1:11.38 total, target size: 823M

tide
before: cargo build 264.57s user 25.56s system 685% cpu 42.299 total, target size: 584M
after: cargo build 231.53s user 22.16s system 617% cpu 41.086 total, target size: 377M

druid
before: cargo build 114.77s user 10.37s system 434% cpu 28.782 total, target size: 323M
after: cargo build 113.19s user 9.85s system 429% cpu 28.668 total, target size: 248M

before:

Executed in   55.55 secs    fish           external
   usr time  287.68 secs  122.00 micros  287.68 secs
   sys time   23.46 secs  581.00 micros   23.46 secs

after:

Executed in   34.32 secs    fish           external
   usr time  249.38 secs  129.00 micros  249.38 secs
   sys time   18.66 secs  746.00 micros   18.66 secs

small personal project with rocket

Works great for me.

Before: Full build: 120s After: Full build: 110s

Before: Incremental build: 3.1s After: Incremental build: 1.9s

Project is 26k lines of Rust. The full build is dominated by building spirv_cross, but this is definitely helping the linking performance for me.

This is on influxdb_iox.

I'm using rustc 1.52.0-nightly (35dbef235 2021-03-02) on Big Sur 11.2.2.

$ touch src/main.rs
$ time cargo build
   Compiling influxdb_iox v0.1.0 
    Finished dev [unoptimized + debuginfo] target(s) in 29.98s

real	0m30.227s
user	0m26.794s
sys	0m3.865s
$ touch src/main.rs
$ time cargo build
   Compiling influxdb_iox v0.1.0 
    Finished dev [unoptimized + debuginfo] target(s) in 30.47s

real	0m30.672s
user	0m27.031s
sys	0m3.829s
$ du -hs target/
2.9G	target/
$ rm -rf target/
$ export CARGO_PROFILE_DEV_SPLIT_DEBUGINFO=unpacked
$ export CARGO_PROFILE_TEST_SPLIT_DEBUGINFO=unpacked
$ cargo build
... recompiling everything ...
$ du -hs target/
2.1G	target/
$ touch src/main.rs
$ time cargo build
   Compiling influxdb_iox v0.1.0
    Finished dev [unoptimized + debuginfo] target(s) in 13.54s

real	0m13.726s
user	0m11.291s
sys	0m2.347s
$ touch src/main.rs
$ time cargo build
   Compiling influxdb_iox v0.1.0 (/Users/carolnichols/rust/delorean)
    Finished dev [unoptimized + debuginfo] target(s) in 11.39s

real	0m11.579s
user	0m10.360s
sys	0m2.138s