Default opt-level for release builds


#1

Currently the Cargo release profile compiles with -C opt-level=3 by default. Should this be changed to opt-level=2 instead?

rustc and Clang both use -O2 as the “default” optimization level (the one you get if you just pass the -O flag). If I understand correctly, rustc’s optimization levels are approximately equivalent to Clang’s (i.e., they enable/disable most of the same LLVM passes). According to the Clang developers, Clang’s -O2 and -O3 flags roughly correspond to these goals:

  • -O2: optimized build, but should not explode in code size nor consume all resources while compiling
  • -O3: throw everything and hope it sticks

In many benchmarks, clang -O2 generates code that is as fast as or faster than clang -O3.

I did some brief tests of a couple of crates (regex and quickersort) with opt-level set to both 2 and 3 in the release and bench profiles, while keeping all other options unchanged. My results for these two crates showed that opt-level=2 reduced build time slightly (1–3%), used about the same amount of memory at build time (within 1%), and generated code that is about the same size. The opt-level=2 build was a few percent faster on several of the crates’ benchmarks, about the same speed on several more, and a few percent slower on some benchmarks.
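For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the profile override, assuming it goes in the Cargo.toml of the crate being built (the exact setup used for the tests above may have differed). Setting the value to 3 instead gives the current default behavior for comparison.

```toml
# Assumed location: the Cargo.toml at the root of the crate or workspace.
# Override the optimization level for `cargo build --release`.
[profile.release]
opt-level = 2

# Override the optimization level for `cargo bench`.
[profile.bench]
opt-level = 2
```

Rebuilding from a clean target directory after each change should help keep build-time and memory measurements comparable.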

I also recently tested the effects of opt-level=2 versus 3 on code size for Servo’s “stylo” library. In that case, opt-level=3 produced code that was about 6.5% larger than opt-level=2, but I don’t currently have performance measurements for the generated code.

I can continue gathering data on other crates, but first I’m curious if anyone has good arguments or data for or against changing the default opt-level.


#2

Honestly, I assumed it was at 2, until very recently when I noticed it was at 3. I’d support this.


#3

O3 in the GCC stack is notably complicated, but I had the impression that O3 in the LLVM stack was recommended.

Do we have a precise list of the differences between O2 and O3 for LLVM?