Pre-RFC `#[test]` on `mod`

yanganto · June 3, 2022, 9:53am

Introduction

(deleted)

Some people want to mark a test case as failed but still run the rest of the test, as mentioned here.

If Rust allowed #[test] on a module, we could more easily run all the functions in that module: the module becomes a single test case made up of the functions that need to be tested, with a single test result for the whole module. If some functions fail, the others still run. The following is what I want to propose.

Proposal

Let #[test] apply to a module, so that the module becomes a single integrated test case. All the functions in the module run, and if any function fails, the test case fails. If more than one function fails, the test case still fails once, and all the error messages are collected inside that test case, so it stays easy to debug while the summary stays as clean as possible.

#[cfg(test)]
#[test]
mod infra {
   fn infra_1_works () {...}
   fn infra_2_works () {...}
}
#[test]
fn it_works() {...}

running 2 tests
test infra ... ok
test it_works ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

If one function fails, the output will probably look like this.

#[cfg(test)]
#[test]
mod infra {
   fn infra_1_works () {...}
   fn infra_2_fails () {assert!(false)}
}
#[test]
fn it_works() {...}

running 2 tests
test infra ... FAILED
test it_works ... ok

failures:

---- infra stdout ----
thread 'infra::infra_2_fails' panicked at 'assertion failed: false', src/lib.rs:0:0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    infra

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

If more than one function fails, the output will probably look like this.

#[cfg(test)]
#[test]
mod infra {
   fn infra_1_fails () {assert!(false)}
   fn infra_2_fails () {assert!(false)}
}
#[test]
fn it_fails() {assert!(false)}

running 2 tests
test infra ... FAILED
test it_fails ... FAILED

failures:

---- infra stdout ----
thread 'infra::infra_1_fails' panicked at 'assertion failed: false', src/lib.rs:0:0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'infra::infra_2_fails' panicked at 'assertion failed: false', src/lib.rs:0:0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- it_fails stdout ----
thread 'it_fails' panicked at 'assertion failed: false', src/lib.rs:0:0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    infra
    it_fails

test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

As you can see, all the functions in the #[test] module run, and their error messages are collected under the test case for that module. The #[test] module is a single test case that gathers the error messages from each of its functions.
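The run-everything-and-collect behavior can be roughly approximated in today's Rust, which may help make the proposal concrete. The following is only a sketch under stated assumptions: `run_all` is a hypothetical helper, not part of the proposal or of the standard test harness, and it uses `std::panic::catch_unwind` so that a panic in one function does not stop the others.

```rust
use std::panic;

/// Hypothetical helper: runs every function in `cases`, catching panics so
/// that later functions still run, and returns the names of those that panicked.
fn run_all(cases: &[(&'static str, fn())]) -> Vec<&'static str> {
    let mut failed = Vec::new();
    for &(name, case) in cases {
        // A plain `fn()` pointer is unwind safe, so it can be passed directly.
        if panic::catch_unwind(case).is_err() {
            failed.push(name);
        }
    }
    failed
}

fn infra_1_works() {
    assert_eq!(1 + 1, 2);
}

fn infra_2_fails() {
    assert!(false, "infra_2_fails always fails");
}

fn infra_3_works() {
    assert!("abc".starts_with('a'));
}
```

A single ordinary #[test] function could call `run_all` with these functions and then assert that the returned list is empty, which gives one result in the summary while every panic message is still printed. Unlike the output sketched in the proposal, the panics would be reported under the thread of that single test.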

Functions Inside the `#[test]` mod

The #[test] mod is a test case made up of a set of test functions, and the test case passes only if all of those functions pass. All the test functions in the #[test] mod run independently. Naturally, functions that take no arguments are treated as tests, and helper functions take arguments, so there is no need to mark whether a function is a helper or not.

If there is a useful function without arguments that should not be run as part of a test mod, we could use #[test(not)] to exclude it. We also discussed this on Zulip. Currently, the RFC does not propose #[test(not)], because this kind of function should not be in a test mod.

The functions in a #[test] mod are not guaranteed to run in any particular way; they are run the same way as functions marked with #[test]. What #[test] on a mod changes is the test counting and the summary printed by cargo test. The proposal does not change how the test functions are run, and the functions inside the #[test] mod are parts of one test case. This means the test filter works on the #[test] mod but not on individual functions inside it. I think filtering on the inner functions may not be useful and would make the implementation more complex. If you strongly recommend supporting such a filter, please help me refine this.

Nesting issues

Currently, #[test] on a mod turns a set of functions into one test. The proposal does not address nested mods, just as nesting is not addressed today for #[test] on functions.

#[test]
fn it_works() {
    #[test]
    fn it_fails() {
        assert!(false)
    }
}

running 1 test
test it_works ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

The behavior of a nested #[test] mod will be the same: only the first-level functions without arguments are collected into the test case.

Other test-related attributes

Test-related attributes still work on the test functions; the #[test] on the mod only builds a test case out of the set of test functions. The following example works with #[should_panic].

#[cfg(test)]
#[test]
mod infra {
   fn it_works () {..}
   #[should_panic]
   fn it_panic () {panic!("...")}
}

Currently, the proposal does not lift test-related attributes to the mod level, because I think that may not be easy to read. If you strongly recommend doing this, please help me refine it.
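To illustrate what honoring #[should_panic] on an inner function could involve, here is a minimal sketch. It assumes the module-level test is driven by some runner that knows which functions are expected to panic; `case_passes` is a hypothetical name for illustration and is not how the real test harness is implemented.

```rust
use std::panic;

/// Hypothetical check: a function passes when it panics exactly if it was
/// expected to (the #[should_panic] case) and returns normally otherwise.
fn case_passes(case: fn(), should_panic: bool) -> bool {
    panic::catch_unwind(case).is_err() == should_panic
}

fn it_works() {
    assert_eq!(2 * 2, 4);
}

fn it_panic() {
    panic!("expected panic");
}
```

With this shape, an unexpected panic in `it_works` and a missing panic in `it_panic` would both count as failures of the enclosing test case.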

What are the benefits of this design

Keep the summary of cargo test clean and show the important things

As the number of test cases grows, we can remove the #[test] from the functions and put #[test] on the mod.

Find issues by running cargo test once

For the original purpose, we can test everything in functions and make sure all of them are tested, even if one of the functions fails.

mathstuf · June 3, 2022, 2:24pm

#[test] is implemented as a proc macro. I think describing this in terms of what it does is useful. In particular:

what test function is registered to the list of test functions?
how does it call each test?
how should the generated function guard against cascade failures (e.g., #[should_panic] and unexpected panics)?
can the #[test] on the module use #[should_panic] and friends?
does this nest? #[test] mod foo { #[test] mod bar { /* … */ } }
where should helper functions that are not tests live?

kpreid · June 3, 2022, 2:30pm

You linked a comment that says:

I've also heard folks requesting a feature to allow them to mark a test as failed without returning/exiting/panicking just yet.

I don't know about what the other folks want, but for myself, the reasons I would like to see a feature like this are not about tidying up the summary or shortening the code. They're about continuing to run a stateful process, logging multiple failures.

For example, I could write a loop over some possible input values, and make an “assertion” for each item. What I want at the end is a report of every failure (not just the first one), because the first failure may not be sufficiently informative; a pattern of failures can be much more so. Today, I can get that by having the loop explicitly print an error (which will be captured by the test harness) and set my own failure flag which I check at the end of the loop to decide whether to panic (and cause all the previous output to be reported). Improving on this verbose manual glue would be if I could just say test::fail() and continue producing results. (It'd also be possible to call more functions that did their own assertions in this style, without needing to explicitly pass success/failure back.)
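The manual glue described above might look roughly like the following sketch. Everything here is illustrative: `double` is a hypothetical function under test with a deliberate bug, and `check_all` stands in for the body of a test that logs each failure and keeps going.

```rust
/// Hypothetical function under test; deliberately wrong for multiples of 3.
fn double(x: u32) -> u32 {
    if x % 3 == 0 { x * 2 + 1 } else { x * 2 }
}

/// Checks every input, logs each mismatch, and returns the failing inputs
/// instead of panicking on the first one.
fn check_all(inputs: &[u32]) -> Vec<u32> {
    let mut failures = Vec::new();
    for &x in inputs {
        let got = double(x);
        let expected = x * 2;
        if got != expected {
            // Under the test harness this output is captured and shown
            // only if the test ends up failing.
            eprintln!("double({x}) = {got}, expected {expected}");
            failures.push(x);
        }
    }
    failures
}
```

A #[test] function would call `check_all` and finish with something like `assert!(failures.is_empty())`, so the single panic at the end reports the whole pattern of failures.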

Even better — for my purposes — would be if, optionally, this could also report multiple results in the summary, as if the test were multiple test functions. For example, right now I have a use for running a test for each variant of an enum. I resorted to using a macro to read the enum definition and derive #[test] functions — if single tests could produce multiple results, then this could just be a loop instead.

In all of these cases, I'm trying to produce information from a stateful process — that is, one function (and the functions it calls). A module-of-tests cannot be stateful in that way (without introducing mutable static state, which would work but is quite inelegant), so it doesn't help.
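For reference, the mutable static state mentioned above could be sketched with an atomic counter. The names `STATE`, `record_step`, and `steps_so_far` are made up for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// State shared by every function in the module.
static STATE: AtomicU32 = AtomicU32::new(0);

/// Records one step and returns the new total.
fn record_step() -> u32 {
    // fetch_add returns the previous value, so add 1 to get the new total.
    STATE.fetch_add(1, Ordering::SeqCst) + 1
}

/// Reads the number of steps recorded so far.
fn steps_so_far() -> u32 {
    STATE.load(Ordering::SeqCst)
}
```

Because the default harness runs tests in parallel and in no guaranteed order, separate test functions that rely on such state observing each other's writes in a particular sequence would be fragile, which is part of why this approach is inelegant.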

yanganto · June 4, 2022, 7:42am

Hi @kpreid,

Thanks for the explanation; there is something I could not get from your post. Could you help me think more deeply about this issue?

The functions in a #[test] mod may not always be simple things where a loop can vary some value to test. From my side, a test is a scenario, and it may be good to have fewer loops and ifs in the function and just plain statements. Also, tests without loops and ifs help beginners join a Rust project.

Could you help me understand why a module-of-tests cannot be stateful? I believe #[test] on the mod is just a collection of tests and has nothing to do with state.

// Invalid rust code
#[cfg(test)]
mod test {
    let mut state = 0;

    #[test]
    fn it_works() {
        state = 1;
        assert!(state == 0);
    }
    #[test]
    fn it_works_2() {
        assert!(state ==1);
    }
}

Rust is a language focused on performance, reliability, and productivity; the proposal tries to let developers write less code and keep things readable.

yanganto · June 4, 2022, 8:03am

Hi @mathstuf, thanks for the input. The pre-RFC has been updated regarding the functions inside the #[test] mod and the nesting issues. As for the implementation (the "how" questions), I hope I can figure these out. If I have an update on this, I will tag you again. If there is something the proposal does not answer, please kindly tell me.

kpreid · June 4, 2022, 1:25pm

I shall try to restate my point: You wrote

I am saying that your proposed mechanism (#[test] modules) does not provide the additional capabilities that being able to mark a test function failed without exiting that function would.

I agree with you that when possible, tests should be simple and straightforward in structure. However, sometimes more complex testing code is required for good coverage. Fail-but-continue would allow more useful reporting from such tests.

Your proposal does not offer this benefit; rather it is a shortening of the syntax and output of multiple test functions as we currently have them. Perhaps this is worthwhile, but it does not solve the problems which fail-but-continue would. That is all I am disagreeing with.

mathstuf · June 4, 2022, 2:26pm

How do I use the name filtering mechanism on these? Or can I only access the module name for this filtering behavior?

yanganto · June 5, 2022, 7:48am

Hi @kpreid,

Thanks for your response; it clears up my problem. I had assumptions, based on my experience, about why people run into problems with test cases and want fail-but-continue. It seems that is not related to your case, which is my bad. I will remove the description of that issue from the introduction section to avoid confusion, and keep the proposal unrelated to it. Furthermore, I will explain what I am thinking about all of this. If you are willing to spend more time on it, I will really appreciate it.

Suppose we are developing a storage system at version 0.1, there is a store instance, and we try to operate on it. The following test case tries to do the basic operations in one test case.

#[test]
fn test_basic_operations() {
  insert_chunk();
  edit_chunk(); // Possible fails
  tag_chunk_deleted();
}

The reason people want fail-but-continue is that, whether or not edit_chunk() works, we still want to know whether tag_chunk_deleted() works. #[test] on a mod solves this issue in the following way.

#[cfg(test)]
#[test]
mod test_basic_operations {
  fn insert_chunk() {...}
  fn edit_chunk() {...}
  fn tag_chunk_deleted() {...}
}

Furthermore, this makes it easy to manage test cases across releases that focus on different features, without touching the test cases themselves. For example, version 0.1 focuses on the basic chunk operations, and version 1.0 focuses on the storage interface.

// version 0.1
#[cfg(test)]
mod test_basic_operations {
  #[test]
  fn insert_chunk() {...}
  #[test]
  fn edit_chunk() {...}
  #[test]
  fn tag_chunk_deleted() {...}
}

// version 1.0
#[cfg(test)]
#[test]
mod test_basic_operations {
  fn insert_chunk() {...}
  fn edit_chunk() {...}
  fn tag_chunk_deleted() {...}
}

mod test_storage_interface {
  #[test]
  fn s3_protocal() {...}

  #[test]
  fn samba_protocal() {...}
}

From my side, this is a productivity gain and not only syntactic sugar; it is more about test management.

@josh also likes this proposal; he may have further good insights based on his experience building complex systems. I am also willing to hear about that.

If you have more ideas or suggestions on this, please tell me. Many thanks.

yanganto · June 5, 2022, 8:33am

Hi @mathstuf, thanks for pointing this out. I believe that you are an expert and know more about the mechanism.

The filter does not work on the functions inside the #[test] mod, because they are parts of a test case and not test cases themselves.

In the future, when we look more deeply at the implementation, there may be obstacles to overcome; please lend a hand if you can. Thanks in advance.

mathstuf · June 5, 2022, 4:17pm

I've been in the implementation in order to work on the impl for my own pre-RFC:

That's fine; I think it just needs to be considered and documented, that's all.

I can certainly offer my viewpoint, but I have no real weight to get anything merged. Being able to see corner cases and holes in combinations of things is just a knack I have.

system · September 3, 2022, 4:17pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
