Pre-RFC: Test groups

I haven’t been able to find anything like this either on IRLO or among existing RFCs. Please direct me to the proposal if it already exists.

The problem: cargo test / libtest currently implement coarse-grained test filtering. As far as I understand, these are the options:

  1. You can include/exclude all the tests (not) marked with #[ignore].
  2. You can exclude all the tests marked with #[should_panic].
  3. You can filter by name (by substring or full string matching).

There are cases where it is desirable to run only some tests. For example, there may be different reasons why a test is ignored by default. Consider this code:

#[test]
#[ignore]
fn im_ignored_because_im_long() { ... }

#[test]
#[ignore]
fn im_ignored_because_I_fail_in_runtime_outside_of_CI() { ... }

When developing locally, you may want to run the tests that are ignored because they take a long time, but not those that are guaranteed to fail. To do this currently, you need to either rely on by-name filtering or weed out unneeded tests with #[cfg]s (which requires recompilation).

Proposal: Add support for test groups in libtest and cargo test. Syntax is bikesheddable, but it should look something like this:

#[test(long)]
#[ignore]
fn im_ignored_because_im_long() { ... }

#[test(ci_only)]
#[ignore]
fn im_ignored_because_I_fail_in_runtime_outside_of_CI() { ... }

and then

cargo test -- --ignored --group=long

or

cargo test -- --enable-group=long

Using the same mechanism, you can disable certain tests if you want. For example, you may want to run long tests by default, but disable them for quick iteration with --disable-group=long.
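
To make the intended semantics concrete, here is a minimal sketch of how a runner might decide which tests to run. It is not libtest code: TestDesc, GroupOpts, and should_run are hypothetical names, and the precedence between the flags (disabling wins, --group narrows the run, --enable-group un-ignores) is an assumption that a formal RFC would need to pin down.

```rust
use std::collections::HashSet;

/// Hypothetical description of a test, as a runner might see it.
struct TestDesc {
    name: &'static str,
    ignored: bool,
    groups: &'static [&'static str],
}

/// Hypothetical runtime options mirroring the flags proposed above.
#[derive(Default)]
struct GroupOpts {
    /// `--ignored`: run only tests marked `#[ignore]`.
    ignored_only: bool,
    /// `--group=<name>`: restrict the run to tests in these groups.
    only: HashSet<&'static str>,
    /// `--enable-group=<name>`: also run ignored tests in these groups.
    enabled: HashSet<&'static str>,
    /// `--disable-group=<name>`: skip tests in these groups.
    disabled: HashSet<&'static str>,
}

/// Assumed precedence: disable > group restriction > ignore handling.
fn should_run(test: &TestDesc, opts: &GroupOpts) -> bool {
    let in_any = |set: &HashSet<&'static str>| test.groups.iter().any(|g| set.contains(g));

    if in_any(&opts.disabled) {
        return false;
    }
    if !opts.only.is_empty() && !in_any(&opts.only) {
        return false;
    }
    if opts.ignored_only {
        return test.ignored;
    }
    // Default run: non-ignored tests, plus ignored tests in an enabled group.
    !test.ignored || in_any(&opts.enabled)
}
```

Under these assumptions, --enable-group=long runs the normal tests plus the ignored tests in the long group, while ignored tests in ci_only stay skipped. The decision is made when the test binary starts, so switching groups does not require a rebuild.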

An alternative I considered: do groups only for ignored tests, like this:

#[test]
#[ignore = "long"]
fn im_ignored_because_im_long() { ... }

and then

cargo t -- --ignored=long

This still solves the “groups of ignored tests” problem, but is kinda less flexible.

What do you think? I could write a formal RFC sometime soon.

You can already use conditional compilation to include tests only when particular features are enabled, and then select them with cargo test --features.

Why does this need to be part of upstream libtest instead of a custom test framework on crates.io?

Conditional compilation requires recompilation. Test groups would be a runtime feature, removing the need to recompile everything (both the crate itself and the test crates, since there is no way to specify a feature only for tests) for every test run with a different group chosen.
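
For comparison, here is a minimal sketch of the feature-gated approach being discussed. It assumes a hypothetical long-tests feature declared in Cargo.toml (long-tests = [] under [features]); the function and module names are made up for illustration.

```rust
/// Returns true when the crate was compiled with the (hypothetical)
/// `long-tests` Cargo feature. The answer is fixed at compile time.
pub fn long_tests_compiled_in() -> bool {
    cfg!(feature = "long-tests")
}

// This module only exists in builds made with
// `cargo test --features long-tests`.
#[cfg(all(test, feature = "long-tests"))]
mod long_tests {
    #[test]
    fn im_long() {
        // Stand-in for an expensive check.
        assert_eq!((1..=10u64).sum::<u64>(), 55);
    }
}
```

Because the feature is part of the build configuration, switching between cargo test and cargo test --features long-tests rebuilds the crate, which is the cost a runtime group flag would avoid.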

Rust doesn’t have great support for custom test harnesses. You could do something like this with a proc macro, but how would you, for example, parse arguments passed to the test binary? You would need to completely replace the test harness, which is not optimal IMO.

Take a look at how https://nexte.st/ works. I agree it's less convenient at first, but it also lets you experiment with real code without first having to get an RFC approved.

It’s possible to implement this in user code, either by completely replacing the test runner (what nextest does, a lot of work), by passing an environment variable (easy, but not a great UI), or by making a wrapper around cargo test which passes said variable. These are all viable workarounds, but I don’t think they’re good enough to conclude that implementing this feature in libtest is not worthwhile.
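
A minimal sketch of the environment-variable workaround, to show what it looks like in practice. The variable name TEST_GROUPS and the helpers group_listed and group_enabled are hypothetical; nothing here is an existing libtest or cargo feature.

```rust
use std::env;

/// Hypothetical variable holding a comma-separated list of enabled groups.
const GROUPS_VAR: &str = "TEST_GROUPS";

/// Pure helper: is `group` listed in the comma-separated `spec`?
fn group_listed(spec: Option<&str>, group: &str) -> bool {
    spec.map_or(false, |s| s.split(',').any(|g| g.trim() == group))
}

/// Reads the environment when the test runs, so no recompilation is needed.
fn group_enabled(group: &str) -> bool {
    group_listed(env::var(GROUPS_VAR).ok().as_deref(), group)
}

#[test]
fn im_long() {
    if !group_enabled("long") {
        // As far as I know, stable libtest has no way to mark a test as
        // skipped at runtime, so this test is reported as passed.
        eprintln!("skipping: group `long` is not enabled");
        return;
    }
    // ... expensive assertions would go here ...
}
```

Running TEST_GROUPS=long cargo test would then execute the body. The early return is the UI problem in miniature: a skipped test is indistinguishable from a passing one in the summary.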

I'm saying something different - try this out with an env variable, get people to use it, and then that adoption shows a strong motivation for the RFC to be accepted.

Poor adoption of a solution with poor UX doesn’t say much about whether or not a solution with good UX should be implemented. It’s probably easier to use name patterns with by-name filtering, or #[cfg] attributes, than groups implemented in user code, for a few reasons:

  1. Implementing groups in user code would require a proc macro. Proc macros break tooling: they cause spurious rust-analyzer crashes, they’re poorly supported by IntelliJ Rust, and sometimes they cause compiler errors to appear in a weird place.
  2. Passing an env var is just awkward. A wrapper would work, but it’s a change in workflow and a tool that needs to be installed in CI, and it doesn’t work with other cargo plugins, like cargo-hack, cargo-miri and cargo-nextest.

I don’t expect good adoption of such a tool, because I personally wouldn’t use it for the reasons mentioned.

Is there prior art in other test frameworks (in other languages/ecosystems)?

Yes, notably @pytest.mark and JUnit @Category. Tasty in Haskell has test groups, but they’re different — all tests in one group are defined together.

CTest has the LABELS property for tests.

Poor adoption of a solution with poor UX doesn’t say much about whether or not a solution with good UX should be implemented.

I don’t expect good adoption of such a tool, because I personally wouldn’t use it for mentioned reasons.

I think what people are saying is: it's difficult to show this is a big problem if you're also claiming that a relatively small burden, such as an env var, is greater than the problem itself.

(FWIW I agree with you that this would be a good feature! I use it in python a lot. Though my prior is also that it doesn't necessarily belong in the language itself)

I don’t claim that this is a large burden. Test groups, if implemented, would be a QoL feature. I think that QoL features are important too, and there’s no reason to keep libtest barebones, especially when something can’t be cleanly implemented in user code.

Nearly any tooling can be implemented in user code in some form. You can have an external build system, test framework, or package manager (C++, Java and Haskell are some examples), but I think there’s value in having some batteries included, and I think this particular battery is a good candidate for inclusion.

This would be less of a problem if Rust properly supported custom test harnesses, of course.

Could normal (sub)modules be used for test grouping, using the existing filter-by-path/name functionality? That would preclude using the same test in two groups without some duplication or factoring out into a common function, though. Is that a desired feature?
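
A minimal sketch of what module-based grouping could look like with the existing name filter. The function checked_double and the module names long and quick are made up for illustration.

```rust
// Plain function under test; stands in for real crate code.
pub fn checked_double(x: u32) -> Option<u32> {
    x.checked_mul(2)
}

#[cfg(test)]
mod tests {
    // Tests here have `long::` in their path, so the substring filter
    // `cargo test long::` matches them (and any other path containing it).
    mod long {
        use super::super::checked_double;

        #[test]
        fn doubles_every_small_value() {
            for x in 0..1_000_000u32 {
                assert_eq!(checked_double(x), Some(x * 2));
            }
        }
    }

    // Matched by `cargo test quick::` in the same way.
    mod quick {
        use super::super::checked_double;

        #[test]
        fn overflow_is_reported() {
            assert_eq!(checked_double(u32::MAX), None);
        }
    }
}
```

Since the group is encoded in the module path, a test can only belong to one group, and the grouping cannot cut across module boundaries.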

That would work for the simplest cases. My main use case for this feature, though, is grouping tests that are declared in different crates within a workspace, which this approach makes impossible. It also prevents grouping unit tests together with integration tests, or even unit tests that need to be declared in different modules due to privacy rules.
