Let's talk about parallel codegen

Release builds (real release builds, that you would ship to people) need to be optimized and have full debug info. Giving the name "opt-debug" to a less-optimized option doesn't make sense to me.

I've changed my Cargo.toml files to hard-code LTO, opt-level 3, and codegen-units 1 for release/bench/test. I imagine other library developers will do the same once they learn that this is what you need to do for good performance.
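For concreteness, settings like the ones described above might look roughly like this in a Cargo.toml. This is a sketch, not the poster's actual file; the keys are standard Cargo profile keys, and the comments about inheritance describe current Cargo, which may differ from the version in use when this was written.

```toml
# Sketch only: hard-coding LTO, opt-level 3, and a single codegen unit.
[profile.release]
opt-level = 3
lto = true
codegen-units = 1

# In current Cargo, bench inherits from release by default,
# so repeating the keys here is only for explicitness.
[profile.bench]
opt-level = 3
lto = true
codegen-units = 1

# test inherits from dev by default, so the keys must be spelled out.
[profile.test]
opt-level = 3
lto = true
codegen-units = 1
```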

rustc and Servo are unusually large and complex compared to most Rust projects. IMO, the complexity for tuning compiler options should scale with the complexity of the project. Consequently, it doesn't make sense to me to change the defaults because compiling rustc and Servo is slow. Instead, it makes more sense to provide options for complex things like rustc and Servo to tune the trade-off, and keep the defaults sensible for smaller, less sophisticated projects.

BTW, earlier in the thread, it was mentioned that there are good reasons for not enabling LTO by default. IMO, LTO should be the default for release builds. It would be great to get a link to a list of good reasons why it is not.

This is getting a bit off-topic, but I would not characterize rustc as a "large" project; it's about a tenth the size of LLVM. Maybe I'm biased because I find rustc irritatingly slow even for debug builds. Has there been any work on parallelizing other parts of the compiler? typeck, etc. seem like they'd fit well.

@briansmith note that optimization and debug info are already decoupled today. You can do -O3 with debug info. Obviously it’s going to be garbage info because of how optimized it is, but it’s there.
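As a small illustration of that decoupling, a Cargo profile can ask for both at once. This is a minimal sketch using the standard `opt-level` and `debug` profile keys; the choice of the release profile is just an assumption for the example.

```toml
# Sketch: optimization and debug info requested independently.
[profile.release]
opt-level = 3   # full optimization
debug = true    # emit debug info as well
```

When invoking rustc directly, the rough equivalent is `-C opt-level=3 -g`.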

I meant rustc and libstd and everything else that gets built when you check out rust-lang/rust.

Right. I'm just pointing out that "opt-debug" and similar names are not good choices for a not-really-optimized configuration.

I hope the debug info for -O3 -g improves.

So I had some further thoughts here that I just wanted to write SOMEWHERE. I was thinking more about @alexcrichton’s vision that there should just be “debug” and “optimized”, and we should tweak those per-project. The main thing I wanted to point out is that there are actually a lot more knobs here than just how much we optimize the code: for example, how do we compile debug_assert? What about overflow checks (which are kind of a variety of debug_assert)? I’d also like to have some knobs for what to do in “impossible” cases, like matching an empty enum, so that in debug mode you might get a panic rather than just plain UB, etc. These all seem like things that an opt-debug sort of build might commonly toggle on.
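Two of these knobs can be sketched as Cargo profile keys. `debug-assertions` and `overflow-checks` exist in current Cargo; whether they were available in this form at the time of the thread, and which values an opt-debug build would pick, are assumptions here. No setting is shown for the "impossible cases" knob, since that is a wish rather than an existing option.

```toml
# Sketch: an optimized build that keeps the safety checks on.
[profile.release]
opt-level = 3
debug-assertions = true   # debug_assert! and friends stay compiled in
overflow-checks = true    # integer overflow panics instead of wrapping
```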

I guess this doesn’t change the basic debate. After having reflected for some time, I feel like we probably want:

  1. A collection of readily accessible presets. One of these corresponds to “maximal debug” – the most safety checks, the lowest level of optimization, etc. One of them corresponds to “maximal perf” – minimal safety checks, maximum optimization. This probably means LTO. And then it seems we want some levels in between. I am not sure how many, but I suspect we want at least two:

    • an “opt-debug” mode that does optimization, but uses multiple codegen units and enables debug assertions, etc.
    • a “typical release” mode that does not do LTO; it may or may not use multiple codegen units (this seems to be partly what we are debating here)
  2. Advanced users can make their own profiles which derive from the above profiles but tweak various settings.

  3. The standard “debug” and “release” modes can be easily changed to any of the “detailed” profiles above by advanced projects, but by default debug is probably “max debug” and “release” is “typical release”. Not quite sure about this part.
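The "profiles which derive from the above profiles" idea can be sketched with the custom-profile support in current Cargo, where a named profile uses `inherits` to start from a built-in one. The profile name `opt-debug` and the specific values are illustrative assumptions, not settings proposed in the thread.

```toml
# Sketch: a derived profile that starts from release and tweaks a few knobs.
[profile.opt-debug]
inherits = "release"
lto = false               # skip LTO to keep build times down
codegen-units = 16        # allow parallel codegen
debug-assertions = true
overflow-checks = true
```

A profile like this would be selected with `cargo build --profile opt-debug`.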

I just wanted to note one further argument in favor of enabling parallel codegen for the “typical release” setting – it’ll make incr. comp. possible as well.

Sorry this is a bit scattered.

I’ve been thinking similar things, though I think I would break it down into just three different default levels, and then allow tweaking exactly what those levels mean:

  1. debug, or what you call “max debug”; do as few optimizations as possible and retain as much debugging information as possible, for cases where you really want to have one-to-one correspondence between lines of Rust source and generated code, and have the maximum number of possible assertions.
  2. devel (similar to what you call “opt-debug”). This is the default mode that you get if you don’t specify anything. Optimizations that are reasonably quick to apply are turned on, but nothing that will affect compile speed too much, and things like parallel codegen can be on by default. The usual debug assertions are on. This is what you generally expect to do most development against, what you get if you just check out a project and do “cargo build”, etc.
  3. release, which defaults to no debug assertions, and fairly aggressive optimizations (possibly even including LTO). This actually defaults to no parallel codegen, since release builds are frequently done on dedicated build machines and reproducibility and final executable speed are generally considered more important than improving compile speed. This is what most package managers that build binary packages will use (I don’t know of any package managers that do incremental builds by default; they generally do a full clean build for the sake of reproducibility).

With these three default levels, and allowing you to tweak exactly what these levels mean or define your own other specialized profiles, I think there would be a reasonable set of defaults that would work for most projects, and you would avoid some of the “why is rust so slow” questions that are caused by doing default builds that have no optimization whatsoever (though to do real benchmarking you’d still want to compare release builds).
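Cargo has no built-in "devel" level, but a project can approximate one by tweaking the `dev` profile, which is what a plain `cargo build` uses. The values below are assumptions chosen to illustrate "cheap optimizations on, assertions on", not a recommendation from the thread.

```toml
# Sketch: making the default build lightly optimized.
[profile.dev]
opt-level = 1             # only the cheaper optimizations
debug = true              # keep debug info
debug-assertions = true   # keep the usual debug assertions
```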

I think most projects could live with those three levels as specified, while some that had particularly demanding needs could increase the release optimization level more, or people who wanted faster or incremental release builds could enable parallel codegen at that level, or people who really need more profiles for a project could define their own.

While it’s a separate issue, I also think that all levels should include symbols by default. I really, really dislike shipping production code that doesn’t have symbols available to make sense of a core dump that happens in the field, but is hard to reproduce. Most software I ship is in the form of dpkg or rpm, and the build tools for both of those (Debian, Fedora) have convenient helpers for stripping symbols from the package and collecting them into a separate package-dbg that you can then use to get useful information out of a core dump. That feature has made tracking down bugs in the field so much easier.
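On the Cargo side, a release build that keeps symbols might be sketched like this. `debug` and `split-debuginfo` are real profile keys in current Cargo, but support for `split-debuginfo` values varies by platform and toolchain version, so treat that line as an assumption to check against your target. With the dpkg or rpm helpers mentioned above, `debug = true` alone may be enough, since those tools do the stripping themselves.

```toml
# Sketch: optimized release build with debug info kept out of the binary.
[profile.release]
debug = true
split-debuginfo = "packed"   # platform-dependent; verify for your target
```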

My goal with adding more roles is mainly to make tweaking easy. Probably rather than more modes what is really wanted is just easier profile settings. I also think it's important that projects be able to make their own modes. i.e., a project might want a "valgrind-friendly" mode or something.

I agree with this desire and motivation, but I have also seen cases where debuginfo reduces LLVM's ability to optimize, like I mentioned here. GCC works harder to make sure this doesn't happen, so much so that there's even an option, -fcompare-debug, that you can use to assert there's no difference.

That's quite unfortunate, I hope LLVM can fix that. Given that, I don't think that release mode should include debuginfo, but I would really prefer to have optimizations and debuginfo for production code.
