Experience report contributing to rust-lang/rust

I think there are two related, but very different issues here, and it's best to untangle them:

  • rust-lang/rust have certain extra checks to normalize trivial details (tidy)
  • rust-lang/rust is hard to build

Note that "hard to build" affects tidy, as it adds up to the time until you see a tidy error (one can run tidy separately from the main build/test, but one needs to be an expert user to do this). For this reason, I'd love to dive deeper into the second issue, and only touch the first one.

I do believe automated tidy checks are valuable. If you don't fix the trailing WS, someone else would fix it later, so this doesn't actually save any work. Moreover, meanwhile people will spend time reverted ws changes from their unrelated PR, which will create more friction overall. Similarly, as much as I don't like pretty-printing approach of rustfmt, it at least normalized the repository to a single consistent style. Adding new code before was horrible, because one needed to decide which one of the 2.5 styles from the current file to use.

But it is important to make this checks seamless and fast. rust-analyzer checks quite a few of code invariants (rustfmt, no trailing ws (rustfmt allows it in some cases), no todo, each module has a doc commt). However, they are all checked by a vanilla cargo test (we just shell-out to rustfmt from a #[test] fn), and the whole test suite takes 15 seconds. I think this is the experience we should strive for in rust-lang/rust

Here is the paragraph where I give praise to everyone behind original implementation and continuing maintenance of x.py and current cargo-based bootstrap. We should not forget that the current complicated build process is waaaaay better than the makefiles we had before.

However, I do believe that the current situation can be improved a lot. But this is from someone who is not intimately familiar with rustc build process, so take each "should" with a pinch of salt. What follows is a brain dump of things that could be done. It probably would read a little bit like criticizing, but I'd love to explicitly say that this is not true, and give :heart: to anyone involved in the build process, as this is a very annoying and fiddly work :slight_smile:

I do agree that rustc and stdlib are two separate worlds and should be kept as such. I wish we had /compiler/src and /runtime/src instead of a single unified /src, if only to make it easier to find the relevant code.

I think that the default ./x.py process is overly geared towards bootstrapping a compiler. We can assume that the person trying to do something with rust-lang/rust probably has rust already, so we should have .rustup-toolchain file which specifies the beta toolchain and use that by default. The "full bootstrap without existing rust" should be a special case for folks packaging rust. Similarly, default config.toml should be geared towards contributors and not packagers. There should be a separate config.release.toml for "production" use.

If we rely on beta toolchain existing by default, I think we can replace x.py with the xtask hack. It is as hacky as it gets, but it works reliably in rust-analyzer.

We should strive to make as much code as possible to be a boring run-of-the mill rust project buildable with cargo build. I think we can organize src/runtime such that cd src/runtime && cargo test just works, by using the current beta compiler?

For /src/compiler, I think we should strive to further subdive it into pure text-processing part (analysis), and "llvm, linkers, io and the rest of the horrible real world" part (emit). Most of the compiler can be a very boring rust workspace which has no need to require anything beyond cargo-check. Crucially, for the most of the compiler, you don't actually care about stdlib, as you are not generating the code.

So, in my ideal imaginary future,

$ git clone git@github.com:rust-lang/rust && cd rust
$ cargo test
  • automatically downlads beta cargo & rustc via rustup, due to toolchain file
  • compiles and tests all the crates in /runtime using the stock beta compiler
  • compiles and tests all the crates in /compiler/analysis using the stock beta compiler and not touching stdlib at all.
  • runs tidy-checks
  • prints a message "hey, we've skipped codegen and bootstrap tests, run cargo test -p emit to test them"

cargo test -p emit would do the really heavy lifting:

  • download/compile LLVM
  • compile the stdlib to rlib
  • compile the analysis, and link it with coden and LLVM
  • assemble sysroot
  • run the resulting compiler on the set of fully integrated tests.

I do belie that we can and should skip codegen tests by default, and that maybe we can defer them even to some tiered bors system, where

  • every merge is a rollup of N PRs
  • rollups run full bootstrapping test suite
  • before getting into a rollup, each PR passes a sans-codegen CI check.