Parallelizing rustc using Rayon

OK, I finished my sweep through @zoxc’s commit. Here is the full list of what I saw, grouped into four categories:

  • The first are the core structures that seem like they definitely require synchronization.
  • The second are things where I think we could/should refactor it away in a fairly obvious way – mostly this amounts to extending the query system.
  • The third are things where locks are wrong.
  • The last are cases where I’m unsure. In some cases, I linked into the commit.

core structures that will require synchronization

things that can be refactored away (and some notes on how)

  • hir-map (inlined bodies)
    • I think that @oli-obk’s work on miri makes this go away
  • session: (in general, this should go away and become the tcx)
    • buffered-lints
      • iirc, the only reason this exists is because the tcx isn’t around to produce the lint-level map query yet
    • recursion-limit
      • move to query
    • entry-fn, plugin crap, etc
      • move to query
    • crate disambiguator
      • move to query
    • features
      • move to query
    • etc
  • caching
    • MIR pred cache
      • this is linked to a MIR and cleared when it changes. Now that MIR moves linearly through the system, I suspect we can recode it to use &mut, maybe? Or maybe make the computation more explicit (e.g., the cache is populated via an explicit action and then queried, and if it is not populated we get a panic). Synchronization here seems really wasteful in any case.
    • trait system
      • ties in to the WG-traits plans; I would like to completely revamp how trait caching works anyway.
      • but something with locks is prob ok for now
  • all_traits (and this too)
    • move to query
  • MIR stealing
  • set of active features
    • move to query
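
To make the "explicit action" idea for the MIR pred cache concrete, here is a minimal sketch. Body, successors_mut, compute_predecessors and predecessors are hypothetical names for illustration, not rustc's actual types or methods. The point is only that mutation and cache population both go through &mut, so the read path needs no lock, and reading an unpopulated cache panics.

```rust
/// Hypothetical stand-in for a MIR body: each basic block lists its successors.
struct Body {
    successors: Vec<Vec<usize>>,
    /// `None` until `compute_predecessors` runs; reset whenever the body is mutated.
    predecessors: Option<Vec<Vec<usize>>>,
}

impl Body {
    fn new(successors: Vec<Vec<usize>>) -> Self {
        Body { successors, predecessors: None }
    }

    /// Mutation requires `&mut self`, so no reader can be holding the cache
    /// while it is invalidated.
    fn successors_mut(&mut self) -> &mut Vec<Vec<usize>> {
        self.predecessors = None;
        &mut self.successors
    }

    /// The explicit population step, also through `&mut self`.
    fn compute_predecessors(&mut self) {
        let mut preds = vec![Vec::new(); self.successors.len()];
        for (bb, succs) in self.successors.iter().enumerate() {
            for &s in succs {
                preds[s].push(bb);
            }
        }
        self.predecessors = Some(preds);
    }

    /// Read-only query; panics if the cache was never populated.
    fn predecessors(&self) -> &[Vec<usize>] {
        self.predecessors
            .as_deref()
            .expect("predecessors queried before compute_predecessors()")
    }
}
```

Since predecessors takes &self and never writes, many threads could share a populated Body without any synchronization on this path.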

locks seem wrong

  • layout-depth
    • I think this really wants to walk the stack and count the number of active layout frames, rather than being a counter. Or it could be carried along with the stack.
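
As a rough sketch of the "carried along with the stack" option: the depth becomes an argument of the recursive layout computation instead of shared mutable state. Ty, DepthExceeded and layout_size are hypothetical names, and the sizes are made up; this is not how rustc's layout code is actually structured.

```rust
/// Hypothetical type tree standing in for rustc's types.
enum Ty {
    Int,
    Wrapper(Box<Ty>),
    Pair(Box<Ty>, Box<Ty>),
}

#[derive(Debug, PartialEq)]
struct DepthExceeded;

/// Computes a (made-up) size in bytes. `depth` is the number of layout
/// frames currently active on this call stack; it is passed down by value,
/// so there is no shared counter and nothing to lock.
fn layout_size(ty: &Ty, depth: usize, limit: usize) -> Result<usize, DepthExceeded> {
    if depth > limit {
        return Err(DepthExceeded);
    }
    match ty {
        Ty::Int => Ok(4),
        Ty::Wrapper(inner) => layout_size(inner, depth + 1, limit),
        Ty::Pair(a, b) => {
            Ok(layout_size(a, depth + 1, limit)? + layout_size(b, depth + 1, limit)?)
        }
    }
}
```

Each thread sees only the depth of its own stack, which seems to be what the limit is meant to bound anyway.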

unclear, would like feedback

  • error emitter, crate metadata store, codemap, filemap, parser session
    • the main reason they use RefCell so extensively is because they are old and mutability in Rust used to be far easier. But maybe it’d be nice to be able to parse source files in parallel, in which case some amount of synchronization is needed? I’d love to see more of this stuff pushed back into queries though. Thoughts would be welcome here.
    • cc @petrochenkov, @eddyb, @estebank
  • session
    • lint-store etc
      • surely this doesn’t have to be as wacky as it is
      • cc @manishearth
    • next-node-id
      • no idea what this is all about
  • derive_macros
  • optimization fuel
    • this whole premise seems to require a single thread
    • I guess we should just lock to one thread if optimization fuel is given, to force deterministic ordering
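
A minimal sketch of that last idea, with hypothetical names throughout (Options, effective_threads and Fuel are not rustc's actual session types): if a fuel budget is given, force the thread count to one, at which point the fuel counter can be plain &mut state.

```rust
/// Hypothetical subset of the session options.
struct Options {
    /// `Some(n)` if an optimization fuel budget was given.
    fuel: Option<u64>,
    /// Thread count the user asked for.
    requested_threads: usize,
}

/// Fuel only makes sense with a deterministic ordering of optimizations,
/// so fall back to a single thread whenever it is set.
fn effective_threads(opts: &Options) -> usize {
    if opts.fuel.is_some() {
        1
    } else {
        opts.requested_threads.max(1)
    }
}

/// With a single thread, the counter needs no atomics or locks.
struct Fuel {
    remaining: u64,
}

impl Fuel {
    /// Returns true if an optimization may run, consuming one unit of fuel.
    fn consume(&mut self) -> bool {
        if self.remaining == 0 {
            return false;
        }
        self.remaining -= 1;
        true
    }
}
```

The value returned by effective_threads would then be what gets handed to whatever builds the thread pool.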