Experience report contributing to rust-lang/rust

Can't this be "solved" by just advising to base the work on the nightly commit? I imagine for "small and infrequent" contributors the difference usually wouldn't matter.

1 Like

Just a quick thought: why not make std, core, alloc etc submodules and require that they work with yesterday's nightly instead of today? I think this would allow us to move it out of tree and then people could just contribute to that.

8 Likes

Also, I'll share my pain points from contributing:

  • Frequent painful rebasing: I've been doing a bunch of refactoring PRs lately, and I expect that they will have conflicts (I don't think there is any way around that). But I really wish there was a warning when multiple PRs in the queue will conflict with mine so that I can just rebase once and know that I won't have to do it again.
  • Disk space usage: Yesterday, I was doing an unrelated task and I got an ENOSPC error for something. It turns out that my 4 clones of the rust repo were taking up ~100GB of space (over half of my SSD). Most of this was various build artifacts. In particular, the incremental caches never seem to be cleared out, so they just grow monotonically forever until you delete them.
  • Hidden tests (e.g. --compare-mode=nll): as noted here and elsewhere, the full test suite takes forever to run. So I avoid running it like the plague. If anything, I just run ui tests, but often I just wait for CI to give me an error. However, I often find out late about the failure of other test suites like ui --compare-mode=nll (or more recently run-make-fulldeps). This means that after I think my PR is ready (and sometimes even when the PR is already in the queue), I find out otherwise.
  • Long PR queue: a PR can wait in the bors queue for weeks, which makes all of the above worse.

Also, an observation, which I'm not sure is helpful is that many of these issues would be less painful if boostrapping was faster or unnecessary somehow. But I have no ideas on how to do that.

11 Likes

Just a quick thought: why not make std , core , alloc etc submodules and require that they work with yesterday's nightly instead of today? I think this would allow us to move it out of tree and then people could just contribute to that.

This seems to make a lot of sense. I would guess that the contributors for std and the compiler internals are also broadly different groups, so separating this more over time would make sense. (Of course, further librarification in general would potentially improve the situation more.)

4 Likes

We used to have fast-moving bootstrap requirements, but it's challenging to keep that up, especially when it comes to building a final release. It's nicer to have a more stable base for building the next. If nightly breaks for some reason, we still bootstrap from a working beta, and beta from stable.

So really, a quick check of master std is easier with a beta system compiler, since that's what stage0 is designed for.

1 Like

I think there are two related, but very different issues here, and it's best to untangle them:

  • rust-lang/rust have certain extra checks to normalize trivial details (tidy)
  • rust-lang/rust is hard to build

Note that "hard to build" affects tidy, as it adds up to the time until you see a tidy error (one can run tidy separately from the main build/test, but one needs to be an expert user to do this). For this reason, I'd love to dive deeper into the second issue, and only touch the first one.

I do believe automated tidy checks are valuable. If you don't fix the trailing WS, someone else would fix it later, so this doesn't actually save any work. Moreover, meanwhile people will spend time reverted ws changes from their unrelated PR, which will create more friction overall. Similarly, as much as I don't like pretty-printing approach of rustfmt, it at least normalized the repository to a single consistent style. Adding new code before was horrible, because one needed to decide which one of the 2.5 styles from the current file to use.

But it is important to make this checks seamless and fast. rust-analyzer checks quite a few of code invariants (rustfmt, no trailing ws (rustfmt allows it in some cases), no todo, each module has a doc commt). However, they are all checked by a vanilla cargo test (we just shell-out to rustfmt from a #[test] fn), and the whole test suite takes 15 seconds. I think this is the experience we should strive for in rust-lang/rust

Here is the paragraph where I give praise to everyone behind original implementation and continuing maintenance of x.py and current cargo-based bootstrap. We should not forget that the current complicated build process is waaaaay better than the makefiles we had before.

However, I do believe that the current situation can be improved a lot. But this is from someone who is not intimately familiar with rustc build process, so take each "should" with a pinch of salt. What follows is a brain dump of things that could be done. It probably would read a little bit like criticizing, but I'd love to explicitly say that this is not true, and give :heart: to anyone involved in the build process, as this is a very annoying and fiddly work :slight_smile:

I do agree that rustc and stdlib are two separate worlds and should be kept as such. I wish we had /compiler/src and /runtime/src instead of a single unified /src, if only to make it easier to find the relevant code.

I think that the default ./x.py process is overly geared towards bootstrapping a compiler. We can assume that the person trying to do something with rust-lang/rust probably has rust already, so we should have .rustup-toolchain file which specifies the beta toolchain and use that by default. The "full bootstrap without existing rust" should be a special case for folks packaging rust. Similarly, default config.toml should be geared towards contributors and not packagers. There should be a separate config.release.toml for "production" use.

If we rely on beta toolchain existing by default, I think we can replace x.py with the xtask hack. It is as hacky as it gets, but it works reliably in rust-analyzer.

We should strive to make as much code as possible to be a boring run-of-the mill rust project buildable with cargo build. I think we can organize src/runtime such that cd src/runtime && cargo test just works, by using the current beta compiler?

For /src/compiler, I think we should strive to further subdive it into pure text-processing part (analysis), and "llvm, linkers, io and the rest of the horrible real world" part (emit). Most of the compiler can be a very boring rust workspace which has no need to require anything beyond cargo-check. Crucially, for the most of the compiler, you don't actually care about stdlib, as you are not generating the code.

So, in my ideal imaginary future,

$ git clone git@github.com:rust-lang/rust && cd rust
$ cargo test
  • automatically downlads beta cargo & rustc via rustup, due to toolchain file
  • compiles and tests all the crates in /runtime using the stock beta compiler
  • compiles and tests all the crates in /compiler/analysis using the stock beta compiler and not touching stdlib at all.
  • runs tidy-checks
  • prints a message "hey, we've skipped codegen and bootstrap tests, run cargo test -p emit to test them"

cargo test -p emit would do the really heavy lifting:

  • download/compile LLVM
  • compile the stdlib to rlib
  • compile the analysis, and link it with coden and LLVM
  • assemble sysroot
  • run the resulting compiler on the set of fully integrated tests.

I do belie that we can and should skip codegen tests by default, and that maybe we can defer them even to some tiered bors system, where

  • every merge is a rollup of N PRs
  • rollups run full bootstrapping test suite
  • before getting into a rollup, each PR passes a sans-codegen CI check.
30 Likes

My advice here is to avoid too many dependent PRs, or use --onto when rebasing. Also, prefer smaller PRs. If your PR is larger it reduces the chances of being included in a rollup.

I think we should retain the monorepo setup, as it becomes much harder to review a submodule or crate version bump (e.g., since hashbrown moved out of tree it has become harder).

I've asked about having --pass check --compare-mode=nll in the PR builder, but we have long build times already, so this will have to wait on GHA sadly.

Well there are two ways we can improve that situation: a) more rollups (so fewer toolstate PRs!), b) GHA.

Yeah I agree with this; especially as we're adding more crates to improve parallelism and modularism, it makes it frankly harder to scroll in VSCode.

2 Likes

I think autoformatting is a bad trend in software & I don't even have rustfmt installed (I specifically avoided including it in my toolchain bundle to not waste time downloading it). Of course I'm happy to install it to contribute to rustc, but it wasn't obvious to me that tidy was just running rustfmt or if it involved other checks, and so I didn't try to reproduce the check outside of the rust repo. I didn't investigate deeply, I just let CI tell me what changes to make.

3 Likes

Small correction: CI still requires tidy to pass, it's just that it no longer requires tidy to pass in order to run the rest of the rest suite.

7 Likes

Based on everything said here I think what would be the most immediately actionable improvement for me (and anyone else who mainly makes small contributions to std) would be an easy and documented way to build std and run tidy with one of the toolchains that can be installed through rustup. That would make the barrier to contributing pretty similar to any other large rust project.

16 Likes

As someone who has had to, for many years across multiple projects, languages, and industries, dive into and maintain others' software, written over multiple years with no consistent formatting or style, I could not disagree more.

My take is, unless you are the only person who will EVER work on the code base, then auto-formatting should be used. The specifics of the style don't matter all that much, but, consistency does.

27 Likes

Moderation note: Let's please not let this thread de-rail into a general argument about source formatting. Running rustfmt has already been acknowledged, personal opinions aside. I think that's enough for this particular topic.

17 Likes

This seems unfortunate to me and imposes unnecessary limitations. I also don't see this as applicable long-term.

Ideally, I'd like to see three separate projects similar to what @matklad described:

  1. The compiler - rustc
  2. The Standard library - this includes the interface with the Rust users. In other words, all the traits and vocabulary types that define a common language for the users.
  3. Runtime - this includes the interface with the underlying environment - the OS, the hardware, etc.

Each part above deals with separate concerns. Such as separation would expand supported use cases, such as:

  1. Specialised / hobby OSes and HWs - without forcing them to fit-in into the mainstream supported platforms.
  2. Competing implementations just like C/C++ have. allows to support different design tradeoffs without splitting the community. (I reckon Rust, the language, might need to enhance a few of its facilities to handle this gracefully). Ideally, I should be able to change a run-time while remaining compatible and not get affected by coherence.

Finally, this would allow us to better formalize the interface points between the separate parts and have better defined boundaries. This in turn means that there would not be any git sub-modules. Instead, the compiler would have an explicit dependency on a released version of std in its cargo.toml.

Because "std-aware cargo" is a thing that's happening and that everyone seems to agree we want, I figured it was a given that std would move to another repo once it literally was "just another crate". I assumed that was out of scope for this thread and the real "should we monorepo?" question intended here instead breaks down to:

  • does it make any sense to split away std in the short/medium-term, before "std-aware cargo" is fully functional? (presumably without turning it into yet another submodule)
  • can the runtime be put in a separate repo from the compiler? (without yet another submodule)
  • same question for other components like the lexer, parser, backtraces, chalk, polonius, etc.

As someone who doesn't currently contribute to Rust, I have no idea which of these are even possible to do, much less desirable. Perhaps it just boils down to the ongoing "librarification" of the compiler, and there's no special shortcut available at the moment.

1 Like

This sounds like a really awesome setup. If there is agreement on this plan, I can probably contribute some time to make it happen.

I wonder if we can then also do lighter-weight CI on the runtime parts.

6 Likes

I think this could definitely be improved:

$ # use a tmpfs and let's ignore disk IO for now
$ cd /tmp

$ # baseline
$ time git clone https://github.com/rust-lang/rust.git
Cloning into 'rust'...
remote: Enumerating objects: 1154546, done.
remote: Total 1154546 (delta 0), reused 0 (delta 0), pack-reused 1154546
Receiving objects: 100% (1154546/1154546), 490.76 MiB | 22.38 MiB/s, done.
Resolving deltas: 100% (935929/935929), done.

1m 15s

$ # clean up
$ rm -rf rust

$ # shallow clone
$ time git clone --depth=1 https://github.com/rust-lang/rust.git
Cloning into 'rust'...
remote: Enumerating objects: 22424, done.
remote: Counting objects: 100% (22424/22424), done.
remote: Compressing objects: 100% (21033/21033), done.
remote: Total 22424 (delta 1579), reused 6515 (delta 914), pack-reused 0
Receiving objects: 100% (22424/22424), 12.79 MiB | 3.89 MiB/s, done.
Resolving deltas: 100% (1579/1579), done.

8s

This could probably also be extended to the submodule handling in src/bootstrap/bootstrap.py. Does the build system rely on having all commits available (e.g. for revision counting)?

4 Likes

Unfortunately, this isn't quite true. Because std/alloc/core are special in that they can (and do) implement language items, which are a permanently unstable API between the compiler and std (and friends) that can and do change.

Intrinsics (i.e. "call some functionality the compiler provides) is probably the most tricky and variable lang item group, but even the lang items that allow e.g. implementing on primitive types can and do change.

You know how sometimes nightly doesn't have rustfmt/clippy because they haven't been updated to match refactors yet? If std (and friends) were a completely separate repository from the compiler, this would happen for std as well for every change to lang items.

6 Likes

To quote myself from above:

Having lang items is precisely the point I've been making. We already have an informal and unstable interface between these constituent parts because these are separate pieces of code that need to interact with each other in well defined ways.

As an aside, the lang items related to memory allocation would hopefully go away once the global/system allocator APIs are flashed out. I'd consider them mostly part of the runtime whereas abstractions such as Box that rely on them are part of the std component. If I develop a new OS and want to support Rust, I would need to implement the runtime component for my new OS but the std component providing the Box abstraction should work out of the box (excuse the pun..)

1 Like

Sure we can say "make lang items a “more formal” API surface," but that's much easier said than done. In a world where std is in a separate VCS world from the compiler, how would you orchestrate a cooperative change to the lang item interface or the intrinsics interface?

I'd say it's not a viable system if you have a broken state required because someone decided it's a better idea to have the size_of intrinsic return Option<usize> rather than usize (say, so that the intrinsic returns None for extern types and the library handles that edge case directly).

In the current, single-repo setup, that's just putting the #[cfg(bootstrap)] for the library in the same PR that does the language change. With std in a separate repo, you're requiring either std or rustc to push a commit to master that doesn't work until the other pushes the other half of the work. How are you going to get that tested through bors?

Putting core/alloc/std in a separate VCS history from the compiler makes that disconnect, and effectively means freezing the lang item / intrinsic interface. That's not a tractable decision that can be made.

The solution isn't to bifurcate the VCS tracking of the two intimately connected pieces. The better solution is to make testing the "std world" not require building the rest of the tree, but still VCS track them in sync for those #[cfg(not(bootstrap))] changes that are required.


(Side note: does std itself – libstd, not liballoc/libcore/stdarch, actually require any lang items or intrinsics, or are those confined to the liballoc and lower levels?)

4 Likes

Off-hand I know that std defines lang_start and f32/f64_runtime. ETA: also the panic runtime.