At the Rust 2018 All Hands a couple of weeks ago, the compiler team discussed, among other things, the way forward for parallelizing query evaluation. The TL;DR is that
- we basically want to continue following the course laid out by @Zoxc here, but for the initial version we prefer to use a more traditional thread pool instead of a fiber-enhanced version of Rayon, and
- we will not pursue end-to-end querification (i.e. querifying parsing, macro expansion, and name resolution) in the immediate future, so as not to have too many things in flight at once.
We also discussed some of the particular instances of mutable state in the compiler and how they should be handled with respect to concurrent access. More details below.
State of Ongoing Work
At the beginning of our parallelization time slot, we revisited the basic concept of parallelizing the compiler at the query level and concluded that we still think it is a good strategy. We also discussed the current implementation approach – that is, slowly merging changes from @Zoxc’s branch into master, reviewing as we go, making sure that single-threaded performance does not regress, and taking note of necessary future refactorings. This approach still seems like a sensible one.
Threads, Rayon, Fibers
While talking about potential problems with the planned architecture, the use of Rayon-with-fibers instead of regular OS threads came up as the biggest unknown and a potential risk:
- No one in the compiler team (except for @Zoxc maybe) has any substantial experience using fibers.
- We want parallelization to integrate well with the Make jobserver. Retrofitting Rayon to do so seems harder than implementing it in a custom-tailored thread pool. (Although @nmatsakis says that this might be a long-term goal for Rayon.)
- @nagisa noted that, in their experience, fibers are well-suited for massively concurrent scenarios with blocking I/O, but less so for parallelizing workloads.
Overall, the team was more comfortable with using regular OS threads for the initial version of parallel queries. This does not preclude looking into a fiber-based version in the future.
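To make “more traditional thread pool” a bit more concrete, here is a minimal, standard-library-only sketch of the kind of worker pool we have in mind. The `QueryPool` type and everything else in it is made up for illustration; the real implementation would additionally have to negotiate its degree of parallelism with the Make jobserver (roughly one worker per token, e.g. via the `jobserver` crate) and hook into the query engine’s job tracking.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A queued unit of work; in the compiler this would be a query job.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Minimal fixed-size thread pool (illustrative only). The worker count is a
/// plain parameter here; the real pool would derive it from jobserver tokens.
struct QueryPool {
    sender: mpsc::Sender<Job>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl QueryPool {
    fn new(num_workers: usize) -> QueryPool {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..num_workers)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // Hold the lock only long enough to pull one job off the queue.
                    let job = match receiver.lock().unwrap().recv() {
                        Ok(job) => job,
                        Err(_) => break, // channel closed, no more work
                    };
                    job();
                })
            })
            .collect();
        QueryPool { sender, workers }
    }

    fn execute<F: FnOnce() + Send + 'static>(&self, f: F) {
        self.sender.send(Box::new(f)).unwrap();
    }

    fn shutdown(self) {
        drop(self.sender); // closing the channel lets the workers exit
        for worker in self.workers {
            worker.join().unwrap();
        }
    }
}

fn main() {
    let pool = QueryPool::new(4);
    for i in 0..16 {
        pool.execute(move || println!("evaluating query {}", i));
    }
    pool.shutdown();
}
```

The point of the sketch is only that plain OS threads pulling jobs off a shared queue are a well-understood building block, which is part of why the team is more comfortable starting there than with fibers.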
End-to-end queries
A topic that was discussed in a different session of the compiler team was “end-to-end queries”, i.e. making the entire compilation process query-based instead of just the middle part. This is desirable because only queries profit from incremental and parallel evaluation. However, end-to-end queries also require some major refactorings, and in the short term, querifying the early compilation pipeline would be rather ineffective: parsing, macro expansion, and name resolution would each become one monolithic query, practically nullifying the potential benefits of incrementalization and parallelization. Since the compiler usually spends only around 5% of its time in these phases, we decided that it is not worth the additional complexity of trying to pursue end-to-end queries and parallelization at the same time. Instead we’d like to push on parallelization first and tackle end-to-end queries at a later point in time.
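To illustrate why query granularity matters (all names below are hypothetical and do not mirror rustc’s actual query definitions): a per-item query caches and parallelizes naturally, while a whole-crate query does not.

```rust
use std::collections::HashMap;

// Hypothetical types, purely to illustrate query granularity.
type DefId = u32;
struct Ty;
struct Resolutions;

/// Fine-grained query cache: one entry per item. Editing a single item
/// invalidates one entry, and independent entries can be computed on
/// different threads.
#[derive(Default)]
struct FineGrained {
    type_of: HashMap<DefId, Ty>,
}

/// "Monolithic" query cache: a single all-or-nothing entry for the whole
/// crate. Any edit invalidates it, and there is no smaller unit of work to
/// hand out to other worker threads.
#[derive(Default)]
struct Monolithic {
    resolve_crate: Option<Resolutions>,
}
```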
Measuring Performance
We would like to start measuring the performance of the compiler built with `#[cfg(parallel_queries)]`. Long-term, we obviously want the compiler to efficiently use all available CPU cores, but a question of much more immediate interest is whether we can always build the compiler with locks in place without noticeably degrading single-threaded execution. Getting an idea of the expected synchronization overhead would be an important first step.
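As a rough illustration of what “synchronization overhead” means at the smallest scale, a single-threaded micro-benchmark can compare unsynchronized and lock-protected access. This is only a sketch of the idea; the numbers that actually matter would come from benchmarking rustc itself with the locks compiled in.

```rust
use std::cell::RefCell;
use std::sync::Mutex;
use std::time::Instant;

// Crude single-threaded micro-benchmark: how much does going through a lock
// cost compared to an unsynchronized RefCell? Illustration only; it says
// nothing about rustc's real access patterns.
fn main() {
    const N: u64 = 10_000_000;

    let cell = RefCell::new(0u64);
    let start = Instant::now();
    for _ in 0..N {
        *cell.borrow_mut() += 1;
    }
    let refcell_time = start.elapsed();

    let mutex = Mutex::new(0u64);
    let start = Instant::now();
    for _ in 0..N {
        *mutex.lock().unwrap() += 1;
    }
    let mutex_time = start.elapsed();

    // Print the counters so the loops are not optimized away entirely.
    println!("RefCell: {:?} (count {})", refcell_time, cell.into_inner());
    println!("Mutex:   {:?} (count {})", mutex_time, mutex.into_inner().unwrap());
}
```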
Diagnostics Emission
Some ideas were thrown around on how to handle diagnostics being generated by the compiler concurrently. The only solid conclusion was that we’ll need some kind of buffering so that the order of messages does not become too non-deterministic, but we definitely don’t want to buffer diagnostics until the compiler process ends. Limited-time buffering seems like a promising approach.
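For concreteness, here is one possible shape such limited-time buffering could take, sketched with standard-library channels. The `Diagnostic` struct and the emitter thread below are hypothetical and do not correspond to rustc’s actual diagnostics infrastructure; they only show the idea of collecting messages from worker threads, batching them briefly, and emitting each batch in a stable order.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical diagnostic: a message plus some key that gives a stable order
// (e.g. the source position of the item that produced it).
struct Diagnostic {
    source_order: u64,
    message: String,
}

// Dedicated emitter thread: buffer incoming diagnostics for a bounded amount
// of time, then sort the batch and print it, so output is reasonably stable
// without being delayed until the end of compilation.
fn spawn_emitter(rx: mpsc::Receiver<Diagnostic>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        let flush_interval = Duration::from_millis(100);
        let mut buffer: Vec<Diagnostic> = Vec::new();
        let mut last_flush = Instant::now();
        loop {
            // Wait a bounded amount of time for the next diagnostic.
            match rx.recv_timeout(flush_interval) {
                Ok(diag) => buffer.push(diag),
                Err(mpsc::RecvTimeoutError::Timeout) => {}
                Err(mpsc::RecvTimeoutError::Disconnected) => break,
            }
            if last_flush.elapsed() >= flush_interval && !buffer.is_empty() {
                buffer.sort_by_key(|d| d.source_order);
                for diag in buffer.drain(..) {
                    eprintln!("error: {}", diag.message);
                }
                last_flush = Instant::now();
            }
        }
        // Flush whatever is left once all senders are gone.
        buffer.sort_by_key(|d| d.source_order);
        for diag in buffer.drain(..) {
            eprintln!("error: {}", diag.message);
        }
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let emitter = spawn_emitter(rx);

    // Several worker threads reporting diagnostics in an arbitrary order.
    let handles: Vec<_> = (0..4u64)
        .map(|i| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(Diagnostic {
                    source_order: 10 - i,
                    message: format!("something went wrong in item {}", i),
                })
                .unwrap();
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    drop(tx); // all senders gone: emitter flushes and exits
    emitter.join().unwrap();
}
```

The trade-off in a scheme like this is the flush interval: a longer window gives more deterministic ordering, a shorter one gets errors in front of the user sooner.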
Miscellaneous Thread-Safety Issues
We also discussed particular instances where the compiler maintains mutable state that needs to be made thread-safe. However, I won’t go into this here. All discussed items are either listed in Parallelizing rustc using Rayon or in https://github.com/rust-lang/rust/issues/48685.
Overall, this was a very reassuring session, and everybody praised the work that @Zoxc has already put into making a parallel Rust compiler a reality.