Running test crates in parallel

Cargo runs the tests within a single crate in parallel, but at the moment the test crates themselves are run one after another in series. I have a cargo PR that adds a --test-jobs arg to parallelise the test execution: [WIP] parallel test crates by gilescope · Pull Request #10052 · rust-lang/cargo · GitHub. At the moment the PR doesn't constrain the parallelism (aside from the fact that you can pass a coarse --test-threads to the test runners).

As projects get bigger this becomes more of an issue and many integration tests have just one test in them so have zero parallelism at the moment.

One possible direction is establishing whether the test driver supports the jobserver protocol and using that to manage the parallelism. I'm wondering what other options are on the table? Cargo running test crates in parallel can make an extremely big difference to how long your CI builds take, as well as locally if you have lots of cores (and the trend is towards more cores).
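To illustrate the idea of one shared budget that bounds how many test binaries run at once, here is a minimal in-process sketch. It is not the jobserver protocol itself and not what the PR does: `TokenPool` and `run_bounded` are hypothetical names, and plain closures stand in for test binaries.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Hypothetical token pool: a job may only run while it holds a token.
struct PoolState {
    available: usize, // tokens not currently handed out
    in_use: usize,    // tokens currently held by running jobs
    peak: usize,      // highest value `in_use` ever reached
}

pub struct TokenPool {
    state: Mutex<PoolState>,
    token_returned: Condvar,
}

// Holding a `Token` means holding one slot. Dropping it returns the slot,
// even if the job panics.
struct Token<'a>(&'a TokenPool);

impl TokenPool {
    pub fn new(tokens: usize) -> Self {
        assert!(tokens > 0, "a pool with zero tokens could never run anything");
        TokenPool {
            state: Mutex::new(PoolState { available: tokens, in_use: 0, peak: 0 }),
            token_returned: Condvar::new(),
        }
    }

    // Blocks until a token is free, then takes it.
    fn acquire(&self) -> Token<'_> {
        let mut s = self.state.lock().unwrap();
        while s.available == 0 {
            s = self.token_returned.wait(s).unwrap();
        }
        s.available -= 1;
        s.in_use += 1;
        s.peak = s.peak.max(s.in_use);
        Token(self)
    }
}

impl Drop for Token<'_> {
    fn drop(&mut self) {
        let mut s = self.0.state.lock().unwrap();
        s.available += 1;
        s.in_use -= 1;
        drop(s);
        // Wake one waiting job so it can pick up the returned token.
        self.0.token_returned.notify_one();
    }
}

/// Spawns one thread per job, but lets at most `tokens` jobs execute at the
/// same time. Returns the results in input order plus the peak concurrency.
pub fn run_bounded<T, F>(jobs: Vec<F>, tokens: usize) -> (Vec<T>, usize)
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let pool = Arc::new(TokenPool::new(tokens));
    let handles: Vec<_> = jobs
        .into_iter()
        .map(|job| {
            let pool = Arc::clone(&pool);
            thread::spawn(move || {
                let _token = pool.acquire(); // wait for a free slot
                job()
            })
        })
        .collect();
    let results = handles
        .into_iter()
        .map(|h| h.join().expect("job panicked"))
        .collect();
    let peak = pool.state.lock().unwrap().peak;
    (results, peak)
}
```

With a real jobserver the tokens would be shared across processes, so the test binaries' own threads could draw from the same budget. This sketch only bounds the outer level.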

Idk, maybe rust can't have nice things, but so far as a community we've generally managed to find a way to have our cake and eat it. There's a very nice cake here, but how do we eat it?

(The hack is of course to move all tests into one test crate but that feels quite a large hack)

This is a suboptimal setup: not only do such tests run sequentially, they also duplicate linking work. In general, my advice is to have one integration test crate for crates.io libraries, and zero integration test crates for workspaces/applications. See Delete Cargo Integration Tests.
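To make the suggestion concrete, here is a minimal sketch of what a single integration test crate could look like. The layout (tests/it/main.rs), the module names and the parse_pair helper are all made up for illustration. In a real project each mod would normally live in its own file next to main.rs rather than inline.

```rust
// tests/it/main.rs (hypothetical path): the only integration test target,
// so Cargo compiles and links one test binary instead of one per file.

// Shared helper code, compiled once for all test modules.
mod helpers {
    /// Parses "a,b" into a pair of integers (illustrative helper only).
    pub fn parse_pair(s: &str) -> Option<(i32, i32)> {
        let (a, b) = s.split_once(',')?;
        Some((a.trim().parse().ok()?, b.trim().parse().ok()?))
    }
}

// What might previously have been a separate tests/parsing.rs.
mod parsing {
    #[test]
    fn parses_two_numbers() {
        assert_eq!(super::helpers::parse_pair("1, 2"), Some((1, 2)));
    }
}

// What might previously have been a separate tests/errors.rs.
mod errors {
    #[test]
    fn rejects_missing_comma() {
        assert_eq!(super::helpers::parse_pair("12"), None);
    }
}
```

Assuming this layout, one area can still be run on its own with a name filter, for example cargo test --test it parsing:: (the whole binary is still compiled, though).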

I agree integration tests are a performance footgun and ideally should be aggregated. I also like the idea of turning off doc-tests for crates that don't have any - that's a quick win.

There's going to be people that want to structure their projects in particular ways that aren't necessarily optimal for compile times. In those cases running test crates in parallel would have a material impact.

Thanks for writing. I find that cargo test --test thing is quite useful to only compile and re-run the particular area that's needed. What do you usually do there?

On the topic of parallelism, my #1 request would be a flag that easily does what RUST_TEST_THREADS=1 does - I use it quite often to untangle test outputs. If a test is hanging, that's the "only" way to get the name of the test that's currently running as well.

To clarify, running test binaries in parallel is useful regardless. In a workspace, there's a separate test binary with unit-tests per crate, and serial execution is a bottleneck there as well.

It's just that if the specific problem is that there are many integration test crates, a more effective solution is to not have many integration test crates, as that helps with both the unnecessary sequential runtime and the redundant work during build time.

I feel there's suboptimality on Cargo's side there -- it is natural to just dump files into tests/, and it is natural to start each file with mod helpers; to share the code between them. There's clearly no pit of success there. If we were designing Rust/Cargo from scratch, I would feel very strongly about making sure that we have a "compilation unit = directory with source files" model, because today's "let's put several CUs into a single folder" creates confusion and harms build times.

Oh wow:

  • I think what you are looking for is the -- --test-threads 1 flag
  • TIL about the RUST_TEST_THREADS environment variable. Do we have it documented anywhere at all?

I usually have the following workflows for testing:

  • Running a single test when I am fixing a test failure -- here I use IDE functionality to run the test function at the cursor
  • Running a specific module/crate with tests, when I am doing feature work/refactoring -- here again I rely on an IDE to come up with a command to run all the tests in a module
  • Running all the tests as a sanity check -- I cargo t from the terminal. Now that I think about it, we should add cargo test --workspace as a runnable to rust-analyzer as well :slight_smile:

More generally, manually specifying all the flags I want in order to run a specific test (package, target within the package, name of the test, --exact if I want to run a single test, --nocapture because I want to see my debug prints) is tedious, so rust-analyzer has a "Copy Run Command Line" action which puts something like:

cargo test --package ide_assists --lib -- handlers::extract_function::tests --nocapture

into the clipboard.
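As a rough illustration of what assembling such a command involves, here is a small sketch. It is not rust-analyzer's actual implementation: Target and test_command_line are hypothetical names, and only the flags already mentioned in this thread are used.

```rust
/// Which target inside the package holds the test (hypothetical type).
pub enum Target<'a> {
    /// Unit tests in the library target: `--lib`.
    Lib,
    /// A named integration test target: `--test <name>`.
    IntegrationTest(&'a str),
}

/// Builds a `cargo test` command line for one test or one module of tests.
pub fn test_command_line(
    package: &str,
    target: Target<'_>,
    filter: &str,
    exact: bool,
    nocapture: bool,
) -> String {
    let mut args: Vec<String> = vec![
        "cargo".into(),
        "test".into(),
        "--package".into(),
        package.into(),
    ];
    match target {
        Target::Lib => args.push("--lib".into()),
        Target::IntegrationTest(name) => {
            args.push("--test".into());
            args.push(name.into());
        }
    }
    // Everything after `--` goes to the test binary rather than to Cargo.
    args.push("--".into());
    args.push(filter.into());
    if exact {
        args.push("--exact".into());
    }
    if nocapture {
        args.push("--nocapture".into());
    }
    args.join(" ")
}
```

Called with the package ide_assists, Target::Lib, the filter handlers::extract_function::tests, exact = false and nocapture = true, this produces the command line quoted above.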

Great, I have two major problems with cargo:

  1. Fast linking (llvm lld) and cross-compilation do not work together: build.rs should ignore RUSTFLAGS · Issue #6375 · rust-lang/cargo · GitHub
  2. cargo test for a workspace does not use as many cores as it could

And it is great that you have time to work on (2).

The environment variable is at least mentioned in -- --help too (Yeah I don't think any of the stuff hiding behind the -- is optimal). But --test-threads is there too, and I didn't know about it (maybe just used to the env var since years past).

I've tried merging all the tests/ of ndarray into one crate now, and it doesn't really show much of an immediate gain, and it slows down the rerun-single-file case a bit. It has to be tried in CI; maybe this is more of a CI optimization than a programmer-interactivity one.

I like the --test foo flag when developing something new or doing major changes to an area.

Yeah, the main thing is that for libraries the optimization is mostly irrelevant, as compile times are fast enough either way. The thing which becomes significantly faster is recompiling all the tests after changing the library. For ndarray, here's what I get for touch src/lib.rs && time cargo test --no-run -q in two versions

// Single CU https://github.com/matklad/ndarray/commit/1a3852bddfa141cccb476a112e39880900a1503a
real 2.28s
cpu  5.91s (4.67s user + 1.24s sys)
rss  342.99mb

// Many CUs https://github.com/matklad/ndarray/commit/1a3852bddfa141cccb476a112e39880900a1503a
real 2.47s
cpu  25.27s (19.71s user + 5.56s sys)
rss  275.68mb

The wall-clock win isn't substantial as this is on a 12 core machine, but the CPU time win is quite noticeable.

EDIT: another significant difference is the size of ./target, which is relevant for CI caching and small SSDs: 346M vs 780M

It is documented in the rustc book's test chapter: Tests - The rustc book

Would it be possible to have an (opt-in, probably) Cargo mode that would build individual tests as libraries and autogenerate a binary runner for them to avoid that overhead? Would that make sense?

Not without changes to rustc to allow generating such libraries, and to the test runner to expose the functions necessary to run the tests in those libraries. The test runner is currently an unstable implementation detail and may not necessarily be the best interface, so stabilizing it isn't the best idea IMHO. Also, cargo currently never generates Rust code. In addition, cargo rarely depends on unstable features of rustc for features that may become stabilized in cargo. -Zbuild-std is the only exception I know of.

If the test runner is an unstable interface then we are allowed to change it, right? We can’t have the worst of both worlds, where it is unstable and yet we can’t change it.

Before rust ossifies we really do need to do some plumbing so that cargo scales to xl sized projects. It feels like the work hasn’t been done yet because what we had was good enough for hundreds of crates. But rust is getting bigger, and thousands or tens of thousands of crates should be possible if we approach things in reasonably efficient ways. Such projects point out where the inefficiencies lie. Having a nightly mechanism to only link tests once sounds healthy even if it isn’t instantly stabilised. We now have a foundation with some long term funding - I assumed it was to invest in these kinds of things to take rust to the next level. At the moment people manually structure their projects to get around these problems, which is the tail wagging the dog.

These scaling problems are good problems to have but I think it’s time to stop dodging them. (I don’t yet know of any project that is using 10,000 crates. Maybe Google? But by Rust 2024 it won’t be unheard of.)

I think what I am really suggesting is that the cargo team should be a well funded team of 5-10 full-time devs + open source contributors. At the moment it feels like you and Alex hold the world on your shoulders. I mean no offence by this - please take it as the greatest of compliments.

Indeed, changing it isn't a problem. What worries me is cargo depending on an unstable rustc interface even once (if?) this feature is stabilized on cargo's side. It would require cargo to potentially handle multiple versions of the interface. The master branch of cargo supports both stable and nightly at the same time. I believe this is to make life easier for contributors by not requiring them to install nightly.

By the way I have only done a couple of contributions to cargo.

:+1: