An idea about bors times


#1

I’ve noticed complaints about bors times. I noticed them myself. I have an idea for improvement I’d like to share.

In my opinion, the problem is not the time it takes for a correct branch to get merged. The problem is when tests fail and one needs to iterate on a fix: each iteration takes a day or so.

This takes so long because the requests wait in line and are processed serially. But if there were enough computing power (I have no idea whether there is), an accepted branch could get tested right away. It could work like this:

  • If a branch is approved, take the current master and the branch, merge them into a separate temporary branch, and run the tests. This is the same as now, except that the build wouldn’t wait for a slot; it would run in parallel with others (maybe with some limit on how many parallel builds there can be).
  • If it fails, the author learns about the problems much sooner, so it is possible to iterate faster. Most of the pain is eliminated.
  • If it succeeds, it can’t be put into master, because likely there was something else in the meantime that moved master. However, the chance of this PR causing a failure is now much smaller. The PR moves to a 2nd queue.
  • The 2nd queue either works just as now (with the contention around master, so PRs need to wait), but with a better success rate, so most PRs would wait in this queue just once. Or, if the success rate is good enough, it could do automatic roll-ups of everything in the queue each time it starts a build (and fall back to some other algorithm if that still fails).
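The steps above can be sketched as a small simulation. This is only an illustration of the proposal, not how bors works: `PullRequest`, `pre_test`, `merge_queue`, and the two boolean flags standing in for real test results are all hypothetical names invented here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    # Stand-ins for real CI results (assumptions for this sketch):
    passes_alone: bool = True      # result against the master snapshot at approval time
    passes_on_master: bool = True  # result against master at actual merge time


def pre_test(approved, max_parallel):
    """Stage 1: test each approved PR right away, in batches of max_parallel."""
    second_queue, rejected = deque(), []
    pending = deque(approved)
    while pending:
        batch = [pending.popleft() for _ in range(min(max_parallel, len(pending)))]
        for pr in batch:  # a real system would run the batch concurrently
            (second_queue if pr.passes_alone else rejected).append(pr)
    return second_queue, rejected


def merge_queue(second_queue):
    """Stage 2: serial queue as today, but fed only with PRs that passed stage 1."""
    merged, bounced = [], []
    while second_queue:
        pr = second_queue.popleft()
        (merged if pr.passes_on_master else bounced).append(pr.number)
    return merged, bounced
```

A PR with `passes_alone=False` is rejected in stage 1 without ever occupying a slot in the serial queue, which is where the faster feedback would come from.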

Does that make sense? If so, what are the steps that need to be taken if this is to happen? Does bors also follow the RFC process, or should I go ask someone specific? Sorry if this is documented somewhere, but I found only a repository with the last commit in 2014.


#2

bors has support for a “try” command that, while it does need to be triggered manually, does pretty much what you’re asking for (runs the test suite without contending for master). It’s not used very often because Rust is starved for server resources to run CI on.


#3

The infra team looks after bors and if you have ideas we’d love to hear them - here or on #rust-infra is fine.

Unfortunately, as @notriddle mentioned, we’re close to capacity on our CI infra at the moment so I’m not sure how this would work within our existing constraints.


#4

I see. Is this generally known? Maybe some company would be willing to donate some power… (I don’t think I’ll be able to persuade mine just yet, since Rust adoption there is somewhat slow, but maybe one day.)

So, another idea. Is there a simple way (let’s say a script with docker compose) to run everything, or almost everything, locally on my computer? I tried to find such a thing about half a year ago, but failed. There are some Docker files scattered through the source code, but I couldn’t find a way to use them.


#5

Is there a way to just have bors rerun only tests that failed the last time? That would be a potentially much faster way to iterate without wasting resources. Then, bors can run the whole thing once more before merging.
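A minimal sketch of that flow, assuming tests can be selected by name (which is an assumption, not something bors or the Rust CI is known to offer). `run_tests` and `iterate` are hypothetical helpers, and each test is modelled as a callable returning True on success.

```python
def run_tests(tests, names):
    """Run the named tests and return the set of names that failed."""
    return {name for name in names if not tests[name]()}


def iterate(tests, previously_failed):
    """Re-run only the earlier failures; do the full run only once they pass."""
    still_failing = run_tests(tests, previously_failed)
    if still_failing:
        return still_failing  # fast feedback, the full run is skipped
    # The earlier failures are fixed, so confirm with the whole suite.
    return run_tests(tests, tests.keys())
```

An empty result means the full suite passed. A non-empty result after the full run shows a test that the fix itself broke, which the partial run alone would have missed.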


#6

I believe it will already do that (or at least it did back when it ran on buildbot), but only if nothing else merged between the first and second runs.


#7

Hmm… it might be nice to do that regardless of whether something has merged between runs, since something is very likely to merge between runs, right? This isn’t meant to be 100% correct (the final run will ensure that), but simply to improve dev velocity.


#8

> So, another idea. Is there a simple way (let’s say a script with docker compose) to run everything, or almost everything, locally on my computer? I tried to find such a thing about half a year ago, but failed. There are some Docker files scattered through the source code, but I couldn’t find a way to use them.

@vorner Yes, all (Linux) builders run in Docker, and should be trivially executable with ./src/ci/docker/run.sh <docker container name>, where the name is that of one of the directories under ./src/ci/docker/.
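To see which names are available, something like the following sketch could be run from a checkout. It relies only on the directory layout described above; `docker_image_names` and `run_command` are hypothetical helpers, and not every directory found is necessarily a runnable image.

```python
from pathlib import Path


def docker_image_names(repo_root):
    """Return the sorted names of the directories under src/ci/docker."""
    docker_dir = Path(repo_root) / "src" / "ci" / "docker"
    return sorted(p.name for p in docker_dir.iterdir() if p.is_dir())


def run_command(image):
    """Build the command line quoted above for one image name."""
    return ["./src/ci/docker/run.sh", image]
```

The list returned by `run_command` could be passed to `subprocess.run` with the checkout as the working directory.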

> Rust is starved for server resources to run CI on.

> I see. Is this generally known? Maybe some company would be willing to donate some power… (I don’t think I’ll be able to persuade mine just yet, since Rust adoption there is somewhat slow, but maybe one day.)

We don’t yet have a good way to outsource CI to build machines provided by companies, but it is something that we want to do (through a payment system, by accepting physical resources, or both). As of now, it’s not a huge priority of the Rust project, to my knowledge.

> Is there a way to just have bors rerun only tests that failed the last time? That would be a potentially much faster way to iterate without wasting resources. Then, bors can run the whole thing once more before merging.

We do not currently support any way to run a subset of the CI (either images or tests). Such a project would be interesting, but ultimately most of our builds complete close to simultaneously, and the longer running ones fail more often, so the gain from this would be fairly low.

I don’t think we’d be interested today in making only some tests run based on which tests failed the last time, as ultimately, such a state means almost nothing: Rust’s tests are diverse, and fixing a test may break another test. It is potentially more useful to cache the artifacts of a given CI run so we can reuse at least parts of the bootstrap compiler, but it is unknown whether this would be of benefit (since then you pay for network time copying bytes).
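The caching trade-off in the last paragraph comes down to a break-even comparison between download time and rebuild time. A minimal sketch, in which the function name and every number used with it are made up for illustration:

```python
def cache_is_worthwhile(artifact_bytes, bandwidth_bytes_per_s, rebuild_seconds):
    """True if downloading cached artifacts is faster than rebuilding them."""
    download_seconds = artifact_bytes / bandwidth_bytes_per_s
    return download_seconds < rebuild_seconds
```

With hypothetical figures, 2 GB of artifacts at 50 MB/s takes 40 seconds to download, which beats a 600-second rebuild, while the same artifacts at 1 MB/s take 2000 seconds and do not. Which side of the line the real CI falls on is exactly the unknown mentioned above.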


#9

I guess nobody complains much about it. People learn to accept it, or keep multiple branches to work on at the same time. But my guess is that overall productivity would improve.

But I see there’s no simple solution.


#10

Running only a subset of tests means that you waste fewer resources, which means that you can run more tests at the same time. We might not improve latency, but we can improve throughput. And this would reduce time people spend waiting in the bors queue.

Also, iterating on fixing a single failed test might be valuable. I personally don’t have a lot of experience to go on, but if people often need more than one iteration to get the same test right, it would help. Have people had that experience?


#11

Generally, I’d say it’s not too large a concern – we can’t merge until all tests pass, and tests, while 50% of the CI runtime on average, are run only after everything else has been built, which means that we’re primarily waiting on the compiler to be rebuilt, not for N tests to run before an “interesting” test runs.
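As rough arithmetic for this point: the tests start only after the build, so the wait before a failure becomes visible is the build time plus the share of the test phase that runs before the failing test. The function and the minute values below are hypothetical; only the 50% split comes from the paragraph above.

```python
def minutes_until_failure_seen(build_minutes, test_minutes, failing_test_position):
    """Wait until a failing test reports, with position in [0, 1] (0 = runs first)."""
    return build_minutes + test_minutes * failing_test_position
```

With an assumed 60 minutes of build and 60 minutes of tests, a failure halfway through the tests shows up after 90 minutes, and moving that test to the very front still leaves 60 minutes of waiting on the build.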

I don’t think we’d improve throughput either – people in the bors queue generally expect that tests will pass. If they don’t, it’s somewhat rare that a fix can be implemented immediately. So I don’t think that we’d gain much advantage. We could run failed tests first, which I suppose would be somewhat helpful, but we’d run into the problem of storing that data somewhere – there’s nowhere good today.
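Running failed tests first is, in itself, only a reordering. A minimal sketch, assuming the set of earlier failures is available from somewhere (the storage problem mentioned above is not solved here, and `order_tests` is a hypothetical name):

```python
def order_tests(all_tests, failed_last_time):
    """Put previously failed tests first, keeping the original order otherwise."""
    failed = [t for t in all_tests if t in failed_last_time]
    rest = [t for t in all_tests if t not in failed_last_time]
    return failed + rest
```

Every test still runs, so nothing is lost in correctness; the only change is how early a repeated failure would be reported.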