Did we start reviewing PRs slower?

Voultapher · December 29, 2022, 9:00am

To give a first hand experience. In August I opened my first Rust library PR ever. I've been working on optimizing sort implementations, and have ~1.5x speedups for both stable and unstable sort, without changing the underlying algorithms. The first PR took several weeks for each round of feedback and is now stuck in reviewer doesn't have the bandwidth to review. The suggestion that reviewer made to simplify the code review process is another PR that has changed hands after weeks of not getting any feedback, and is stuck again in 'need to think about'. Having poured hundred of hours into these PRs and overall research, it's quite disappointing to receive such slow and unwelcoming feedback.

I fully empathise with the reviewers and their immense work-load, to me it seems this part of the Rust team is understaffed.

DesmondWillowbrook · December 29, 2022, 4:05pm

A more formal process towards becoming a reviewer/team member seems like an obvious step to take. The suggestion of informally reviewing PRs seems disrespectful towards the folks who have already invested enough time into Rust to be otherwise trustworthy. A stronger onboarding process might assuage apprehensions of problems caused by inexperienced reviewers. As someone who's been an outsider for a longer time than someone who's been actively contributing, I think it would definitely encourage people if there was a clear outlined path towards becoming a reviewer or team member.

jyn514 · December 29, 2022, 5:31pm

As a team member who regularly does reviews, I would love for random people to leave informal review comments as long as they're accurate and not too discouraging. It would not be disrespectful at all, it would help a lot with my workload.

kpreid · December 29, 2022, 6:37pm

An anecdote of how it's not at all clear that unsolicited/unofficial reviews are helpful: I once commented on a problem I saw in a rust-lang/rust PR. That problem was fixed — and the author later pinged me asking what they needed to do to get my approval, so I had to tell them that I didn't have any approval authority. That went okay, but it would have been much better for me as a casual reviewer if I knew that what I was doing was encouraged, as opposed to that ping being evidence that my stepping in had created confusion and derailed the process.

Perhaps encouragement to perform reviews, and tips on how to do so, so could be made documented in places like rust/CONTRIBUTING.md at master · rust-lang/rust · GitHub and Review policies - Rust Forge. Another useful specific clarification would be: should such non-authoritative reviewers use the GitHub “approve”/“needs work” review function (separate from bors) or not?

jyn514 · December 29, 2022, 6:52pm

I started on a review checklist a while ago but never had time to finish it up: Add a review checklist and suggest reviews as a way to get started with the project by jyn514 · Pull Request #1463 · rust-lang/rustc-dev-guide · GitHub

If someone has time to clean that up and get it merged, that would be a big help.

tschuett · December 29, 2022, 8:08pm

Why is there no pull request template? Some checkpoints for the author. Welcome words for outsiders feel free to join. If the PR gets stuck, you can ask for help HERE.

gasche · December 31, 2022, 2:04pm

I'm an outsider to the Rust community, but I'm interested in this question as we have the same sort of lack-of-reviewers issue with OCaml.

Describing the current process

If I understand correctly, the current process is that:

some people people explicitly get invited as "approving reviewers" for some parts of the codebase
a triaging bot assigns each new PR to one approving reviewer (trying to select reviewers without a large review backlog already)
there is a manual process to get a random new reviewer or suggest someone else

This is documented in the rustc dev guide: reviewer rotation, and the configuration that the triaging bot uses is currently

github.com

rust-lang/rust/blob/36285c5de8915ecc00d91ae0baa79a87ed5858d5/triagebot.toml#L429-L577


      
          [assign.adhoc_groups]
          compiler-team = [
              "@cjgillot",
              "@estebank",
              "@petrochenkov",
              "@davidtwco",
              "@oli-obk",
              "@lcnr",
              "@nagisa",
              "@wesleywiser",
          ]
          compiler-team-contributors = [
              "@compiler-errors",
              "@eholk",
              "@jackh726",
              "@fee1-dead",
              "@TaKO8Ki",
          ]
          compiler = [
              "compiler-team",

This file has been truncated. show original

(first: definition of the review groups, then a series of glob to assign parts of the codebase to review groups.)

Naive questions

Two obvious questions come to mind:

Does the community expect that all heavy contributors (which generate "review load") should also be approving reviewers, with possibly some exceptions? (Maybe some people are bad at reviewing or reviewing is bad for them.) Or maybe that just a fraction of the heavy contributors would serve as reviewers? What's the expectation here?

Looking at rust-lang/rust contributors this year, it looks 6 out of the top 10 contributors get assigned reviews in some category, but (unless I misread the configuration) 4 do not (Ralf, Veykril, bjorn3, WaffleLapkin).
Giving people reviews in their core area of expertise makes a lot of sense, but doing only that can give a more rigid system where some areas are chronically under-staffed. (No one's area of expertise is "saying no to the myriads of people who want to improve the stdlib in ways we will probably regret later"). Assigning people outside their core area can also be a great way to spread expertise of the codebase. Have Rust people considered doing this? For example, the following could be considered:
- (a) when all reviewers in a given area reach a certain backlog, tentatively assign someone from another area instead (if this is a clear mismatch people can always force a new random assignment), or
- (b) all the time, randomly select someone from another area with a low probability, or
- (c), all the time, assign both someone from the area and someone from another area, which gives the "less-expert" the opportunity to get involved if there is interest but without putting them in the critical path

matklad · December 31, 2022, 3:24pm

That's misleading data: rust-lang/rust includes a bunch of other repos (rust-analyzer, miri, cranelift, etc) as subtrees, and contributors to those repos show up in that graph, although they review PRs outside of rust-lang/rust. Eg, Veykril is responsible for GitHub - rust-lang/rust-analyzer: A Rust compiler front-end for IDEs. In other words, yeah, people who do a lot of commits generally also do a lot of reviews.

Giving people reviews in their core area of expertise makes a lot of sense,

I think what happens in Rust is that review process is itself fragmented across the teams. Eg, I believe t-compiler and t-libs use somewhat different mechanisms for review, despite working on the same repo. And then there are things like rust-analyzer which, while a part of rust-project, exist in a separate repo with separate policies.

I think, yeah, the overall path mostly boils down to "you do PRs > soon you find yourself reviewing PRs". The libs team (guardian of the public APIs of stdlib) is a bit of an outlier here -- there's little additive "just write more code" work in this area, the work is indeed mostly about saying no to features, and building community consensus around tradeoffs. I think the historical path to getting onto T-libs team was by maintaining high-profile crates outside of stdlib.

Also, the two things are separate:

ability to do reviews and get the code merged
being on review rotation for new PRs

I think the set of people who can r+ when they are explicitly asked for is significantly larger than the set of people who are actively triaging new PR. I am certainly in the former category

ehuss · December 31, 2022, 6:48pm

Just to clarify on this point, the bot currently just does a random choice. It doesn't look at the backlog. GitHub has a more sophisticated lookup algorithm (summarized here), but we don't use it. Besides the complexity of implementing such a thing, I think it might be difficult to get it to work in an open-source environment. Some reviewers naturally have more bandwidth than others, but we don't want to over-weight them and lead to faster burnout. Also, some reviewers behavior is "spikey", where they may delay reviews for a week or two, and then do a bunch all at once.

I think it would be hard to come up with some algorithm that is significantly better than random choice, but I think it would be good to discuss if people want to try something more sophisticated.

The path-based weighting system (implemented here) definitely could use some improvement. However, in practice I haven't seen anyone complain about it.

scottmcm · December 31, 2022, 10:17pm

To elaborate on this a bit, AFAIK there's only one pool of "people who can r+ in rust-lang/rust". There's no technical restriction keeping people from approving things outside their usual area, just social expectations (which work well enough).

kornel · January 1, 2023, 4:34pm

The problem isn't limited to rust-lang/rust. Cargo, crates-io, and rustup have frustratingly long review times too.

bjorn3 · January 1, 2023, 5:27pm

There are several pools depending on which files got touched:

github.com

rust-lang/rust/blob/bb6e76df06dcbdb96e634eb28a49f161d70ab844/triagebot.toml#L557-L615


      
          [assign.owners]
          "/.github/workflows" =                       ["infra-ci"]
          "/Cargo.lock" =                              ["@Mark-Simulacrum"]
          "/Cargo.toml" =                              ["@Mark-Simulacrum"]
          "/compiler" =                                ["compiler"]
          "/compiler/rustc_apfloat" =                  ["@eddyb"]
          "/compiler/rustc_ast" =                      ["compiler", "parser"]
          "/compiler/rustc_ast_lowering" =             ["compiler", "ast_lowering"]
          "/compiler/rustc_hir_analysis" =             ["compiler", "types"]
          "/compiler/rustc_lexer" =                    ["compiler", "lexer"]
          "/compiler/rustc_llvm" =                     ["@cuviper"]
          "/compiler/rustc_middle/src/mir" =           ["compiler", "mir"]
          "/compiler/rustc_middle/src/traits" =        ["compiler", "types"]
          "/compiler/rustc_const_eval/src/interpret" = ["compiler", "mir"]
          "/compiler/rustc_const_eval/src/transform" = ["compiler", "mir-opt"]
          "/compiler/rustc_mir_build/src/build" =      ["compiler", "mir"]
          "/compiler/rustc_parse" =                    ["compiler", "parser"]
          "/compiler/rustc_parse/src/lexer" =          ["compiler", "lexer"]
          "/compiler/rustc_query_impl" =               ["compiler", "query-system"]
          "/compiler/rustc_query_system" =             ["compiler", "query-system"]

This file has been truncated. show original

scottmcm · January 1, 2023, 6:13pm

My understanding was that those were who could get auto-assigned, not that those are the only people who can r+ a change to those files?

bjorn3 · January 1, 2023, 7:09pm

Right, I thought you were talking about who would get auto-assigned.

the8472 · January 2, 2023, 8:32am

Consider applying for compiler- or libs-contributor membership in the team repo. That provides r+ rights. Once you have that it's ok you can assign yourself as reviewer on PRs where you feel confident.

PoignardAzur · January 3, 2023, 12:11pm

The following discussions kind of left this point to the wayside, but I think it's the more crucial one here.

If the problem is that volunteer bandwidth isn't cutting it, then the obvious solution is to hire full-time people. Even if no specific company wants to hire someone for stdlib triage because it doesn't give them an edge, companies could pool funding to hire a full-time dev.

This kind cross-company funding is within the mandate of the Rust foundation, right?

scottmcm · January 3, 2023, 6:01pm

I'm not sure that triage is exactly the issue. There's currently 18 PRs waiting on ACPs, for example. That means they're triaged fine, but need libs-api bandwidth.

(We could also make the numbers better by closing those PRs, and asking people to re-open if the ACP succeeds. But that gets to my previous point that someone would have to actually do that, and expose themselves to complaints from the PR author, so it's much easier to just leave them open and idle.)

cyukang · January 4, 2023, 3:27am

I'm a new contributor for Rust, yes, it's fustrating to wait for more than one month without any feedbacks. I'm trying to get used to it, and don't push the reviewer.

The workload is one issue, they just have a too long review queue. Another issue I find is that some maintainers are more willing to spend time programming rather than reviewing other people's code, which I can understand as programming is more interesting and satisfying.

Is it possible to implement a mechanism to avoid too long review queue to happen?

For example, we don't pick up reviewer randomly, if a maintainer's task list is too long, the bot should not assign new PR to him/her.

Similarly, if a PR has not received any attention for a week, the system could automatically reassign the PR to another maintainer.

This type of mechanism can help to ensure that tasks and PRs are being handled in a timely manner and that maintainers are not overwhelmed with too much work.

And of course, we need more reviewer and maybe more fulltime maintainer.

mathstuf · January 4, 2023, 4:05am

This is a great solution, but the problem is all in defining/measuring/quantifying that "task list is too long" part of the equation. Most reviewers either have other maestros (day jobs) or are volunteers (and dance to the beat of their own drum) and the bot just doesn't have the power to do much about that.

petrochenkov · January 4, 2023, 6:18am

I'd be interested in seeing some statistics on numbers of reviewed PRs and lengths of review queues.
I suspect that the general rule "20% people review 80% PRs" apply here as well (maybe with different specific numbers), and those 20% do not have long review queues.

Maybe there should be a leaderboard similar to https://thanks.rust-lang.org, but for maintainers, to motivate the subset of people who are motivated by such things (probably me included).

Topic		Replies	Views
Homu queue woes, and suggestions on how to fix them tools and infrastructure	50	5259	March 25, 2019
Handling pull requests of new contributors	3	802	December 18, 2023
Highfive rotation	14	992	May 21, 2020
Rust 2019: Address the "Big" Problem	32	18535	April 2, 2019
Paying developers for pull requests policy	40	2659	December 22, 2024

Did we start reviewing PRs slower?

Describing the current process

Naive questions

Related topics