#[test] and external test harnesses

`harness = false`

The test crate is unstable, and likely to be for a while. It’d be good to experiment a bit on crates.io. To support this, Cargo allows specifying in Cargo.toml:

```toml
[[test]]
name = "foo"
harness = false
```

(Note: when you do this Cargo stops enumerating tests/*.rs implicitly and you need an explicit [[test]] section for every test file.)

This causes cargo test to not pass --test to rustc when compiling that test file, and instead compile it as a plain executable with a main() function. It is then up to you to run your tests and exit with a non-zero status code on failure.
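As a concrete illustration, here is a minimal sketch of what such a harness-less test file might look like (the test function and its output format are made up, not prescribed by Cargo; all Cargo sees is the exit status):

```rust
// Sketch of a tests/foo.rs compiled with `harness = false`: a plain
// executable whose exit status signals success or failure.
use std::process::exit;

fn check_addition() -> Result<(), String> {
    if 2 + 2 == 4 {
        Ok(())
    } else {
        Err("2 + 2 != 4".to_string())
    }
}

fn main() {
    let mut failures = 0;
    match check_addition() {
        Ok(()) => println!("check_addition ... ok"),
        Err(e) => {
            println!("check_addition ... FAILED: {}", e);
            failures += 1;
        }
    }
    if failures > 0 {
        // cargo test treats a non-zero exit status as a failed test binary
        exit(1);
    }
}
```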

```toml
[dependencies]
rustc-test = "0.1"
```

Testing with nothing but main() and assert!() is not very nice.

I’ve extracted the test crate, forked it so that it runs on stable Rust*, and published it at https://crates.io/crates/rustc-test/. Please note that I’m not interested in maintaining or developing a test harness, this is mostly to demonstrate what’s possible today on the stable channel. I’d be happy to give ownership on crates.io if someone is interested.

(* With some caveats: capturing test output and the asm! implementation of black_box are behind Cargo features disabled by default since they require unstable features.)

With this, you can create a list of tests with the appropriate structs, call test::test_main(), and it works just like rustc --test. This is great for “data driven” tests that are generated dynamically from the same code with different input and expected output (which is not possible to do with rustc --test today). See html5ever’s tests/tree_builder.rs for an example. But if you have many test functions, this is not as nice as slapping #[test] on them.
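The “data driven” idea can be sketched without depending on the test crate’s exact API: one test closure is generated per (input, expected) pair, which is something the static #[test] list produced by rustc --test cannot express. (The function and case names here are invented for illustration.)

```rust
// Generate one named test closure per data case at runtime.
fn generate_tests() -> Vec<(String, Box<dyn Fn() -> bool>)> {
    let cases = [("1", 1), ("2", 2), ("10", 10)];
    cases
        .iter()
        .map(|&(input, expected)| {
            let name = format!("parse_{}", input);
            let f: Box<dyn Fn() -> bool> =
                Box::new(move || input.parse::<i32>() == Ok(expected));
            (name, f)
        })
        .collect()
}

fn main() {
    for (name, test) in generate_tests() {
        println!("{} ... {}", name, if test() { "ok" } else { "FAILED" });
    }
}
```

With the real test crate, each pair would instead become a TestDescAndFn with a dynamic test function, passed to test::test_main().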


Which leads to the point of this thread. Using an external test harness is possible today, but using #[test] with it rather than having to enumerate test functions yourself would be nice.

Most of what rustc --test does is implemented in src/libsyntax/test.rs. Namely, it generates a module that looks like:

```rust
mod __test {
    extern crate test (name = "test", vers = "...");
    fn main() {
        test::test_main_static(&::os::args()[], tests)
    }
    static tests: &'static [test::TestDescAndFn] = &[
        // ... the list of tests in the crate ...
    ];
}
```

So I’d like to (first introduce as unstable and then) stabilize building blocks for external test harnesses:

  • A simplified version of TestDescAndFn, with just enough to describe what rustc --test generates. (So probably without dynamic tests.)
  • Some way to override which crate/function is used instead of test::test_main_static. Maybe #[test_harness] extern crate fancy_test;? (If benchmarks are supported in this mechanism, TestDescAndFn would probably have to be generic over Bencher.)

Does this sound like it’s worth pursuing?


Support for external test harnesses is definitely a thing I’d like to see. Allowing external crates to hook into the current test expansion logic seems like a decent starting point, but it doesn’t seem like that would quite be sufficient to serve the needs of more featureful frameworks. It might almost want to be a special type of compiler plugin?

This is just one idea, maybe someone will come up with something completely different. I mostly want to get the conversation started.

We talked about this a bit at the most recent work week (and I actually did similar work a while ago), but we unfortunately didn’t reach many conclusions in our discussion. I’ll write down some of my thoughts at least, and I know that @brson has opinions about this as well!

  • #[test] is very ergonomic today, and I would like to strive as much as possible to have custom test frameworks just as ergonomic.
  • #[test] is also quite limiting, and there’s quite a laundry list of features others have wanted from test frameworks that aren’t quite well supported, such as:
    • Custom arguments to the function and/or return values (e.g. #[quickcheck])
    • setup/tear down, global, module, and perhaps even hierarchically based
    • state as input to a test
    • various signatures on a test, for a different testing experience. For example, instead of assert!-based failure, tests could have Result-based failure. Maybe I want to return something like Test::Pass to ignore a test at runtime.
    • There are other attributes at play here, like #[bench] and #[ignore], which we would also want to consider.
  • The primary point of src/libsyntax/test.rs is, in my opinion, the clever tricks necessary to reexport all test functions up to the crate namespace. That part basically puts this transformation outside the scope of a simple syntax extension.
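Two of the alternative signatures on the wish list above can be sketched as follows (TestOutcome is a made-up type for illustration, not part of today’s test crate):

```rust
// A hypothetical outcome type for tests that decide their fate at runtime.
#[derive(Debug, PartialEq)]
enum TestOutcome {
    Pass,
    Ignore,
}

// Result-based failure instead of assert!-based panics.
fn result_based() -> Result<(), String> {
    let parsed: i32 = "42".parse().map_err(|e| format!("{:?}", e))?;
    if parsed == 42 {
        Ok(())
    } else {
        Err("wrong value".into())
    }
}

// Deciding at runtime that a test should be ignored.
fn runtime_ignore() -> TestOutcome {
    if cfg!(windows) {
        return TestOutcome::Ignore; // e.g. skip where a feature is unavailable
    }
    TestOutcome::Pass
}

fn main() {
    println!("{:?}", result_based());
    println!("{:?}", runtime_ignore());
}
```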

In terms of ergonomics, my “absolute ideal” would be to add this to Cargo.toml

```toml
my-custom-test-framework = "1.0"
```

and be done with it. There are various ways that this could be made to work, and there are various reasons why this could be construed as “too magical”. I would think, however, that if custom test frameworks are much less ergonomic than #[test] they won’t actually end up getting any use. I personally wish to strive as close to this as possible.

I don’t really know what an API we’d want to stabilize would look like (I doubt it’s what it looks like today). I would want to make sure that it has the ability to do all the things I mentioned above (and more if possible). Essentially, the test harness should have as much control as it needs.


This sounds like an opportunity to introduce something which could be used to build #[test]; specifically, the “collecting items” behaviour.

There are various cases where you want some kind of index or registry of things defined elsewhere. Sub commands, tests, error types… possibly even on-startup functions.

Just off the top of my head thinking:

```rust
macro_rules! test {
    (() $(pub)* fn $test_fn_name:ident $($_tail:tt)*) => {
        // I'm assuming eager expansion here ($*):
        add_to_global_list! {
            tests,
            ($*module_path_as_path!(), $test_fn_name, false, false)
        }
    };
}

macro_rules! inject_test_harness {
    () => {
        mod __test {
            extern crate test;
            fn main() {
                test::test_main_static(&::os::args()[], tests)
            }
            inject_test_harness! { @tests [$*read_global_list!(tests)] }
        }
    };

    (@tests [$(($test_mod:path, $test_fn_name:ident,
                 $test_ignore:expr, $test_panic:expr))*]) => {
        static tests: &'static [test::TestDescAndFn] = &[
            $(
                test::TestDescAndFn {
                    desc: test::TestDesc {
                        name: test::TestName::StaticTestString(
                            concat!(stringify!($test_mod), "::",
                                    stringify!($test_fn_name))),
                        ignore: $test_ignore,
                        should_panic: $test_panic,
                    },
                    testfn: test::TestFn::StaticTestFn(
                        as_expr!($test_mod :: $test_fn_name)),
                }
            ),*
        ];
    };
}
```

Ok, ignore for a moment that we don’t have eager expansion, or the ability to use macros-as-attributes: those can be worked around. We’d need module_path_as_path! or something similar to be added as a built-in (currently, module_path! returns a string literal).

add_to_global_list! and read_global_list! would be used to append to and dump a global, named list of tokens. For sanity, it should be an error to add to a list that’s already been dumped (so that things don’t get lost); so long as the read is triggered at the end of the root module, it should be fine. It’s up to the user to sort out ordering issues with the actual contents. This allows us to construct the list of tests from the attributes.

Also, it’d be nice to have a way to inject a macro invocation at the end of the crate module. --test could just be a shorthand for --extern test=.. --macro-use-extern test --inject-macro inject_test_harness. This would also allow Cargo to specify one or more custom test harnesses.

Aside from the “there’s no stable way to capture output” issue, that should be enough to divorce testing from the compiler, and buy us a hugely convenient way to collect information from across a crate at the same time.


Oooh, very interesting. I didn’t think of making that connection, but yes, a general “collect things” mechanism would be very useful. I have something similar in Servo: CSS properties are declared in one place where they each have a name, a parsing function, an initial value, etc; and several other places generate code for each property (e.g. match on their names and dispatch to parsing functions). Currently this whole thing is a Python/Mako template… which is not pretty.

An important requirement for me that I rarely see mentioned is that I need to be able to decouple the test running from the test generation, for example I want to be able to run tests written in any Rust test framework in a Visual Studio GUI.

In this respect I like the internal design of Rust’s test framework - it is just a list of functions to run with some metadata about what the outcome is expected to be. If other test frameworks can compile down to a list of functions that makes the interface to the test runner very simple (e.g. we don’t have to define any specific setup/teardown interface if the frameworks can bake them into the test cases they generate).
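The “list of functions plus metadata” interface described above can be sketched roughly as follows (all names here are hypothetical; a real runner would carry more metadata about expected outcomes):

```rust
// A framework compiles its tests down to plain functions; the runner
// only ever sees (name, fn) pairs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TestResult {
    Pass,
    Fail,
}

fn run_all(tests: &[(&str, fn() -> TestResult)]) -> bool {
    let mut all_passed = true;
    for (name, test) in tests {
        let result = test();
        println!("{} ... {:?}", name, result);
        if result == TestResult::Fail {
            all_passed = false;
        }
    }
    all_passed
}

// A framework would generate functions like this, with any setup and
// teardown already baked into the body.
fn addition() -> TestResult {
    if 2 + 2 == 4 { TestResult::Pass } else { TestResult::Fail }
}

fn main() {
    let tests: &[(&str, fn() -> TestResult)] = &[("addition", addition)];
    std::process::exit(if run_all(tests) { 0 } else { 1 });
}
```

A GUI runner (such as the Visual Studio scenario above) would only need to understand this flat list, not any particular framework.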

Edit: I’m also less enthusiastic than @alexcrichton about adding implicit crate dependencies based on Cargo.toml.


I've been thinking about this, and it's similar to what I want. I'd like to have a separation between 'test framework' - the plugin that generates the list of tests at compile time - and 'test runner' - the thing that runs and reports them. Nearly all the code generation for testing currently in the compiler could be extracted into the 'default test framework plugin', while the 'test' crate becomes the 'default test runner'.

I think that the current TestDescAndFn type can probably be simplified. The information it's passing around about how to interpret tests can be test framework-specific, and just captured in the test functions themselves. The list passed to the test runner can just be pairs of (fn, name), where each function returns a TestResult of some kind.

For example, given the current syntax:

```rust
#[test]
fn mytest() { /* ... */ }
```

The default test framework would expand that to something vaguely like

```rust
fn run_my_test() -> TestResult {
    let res = catch_panic(mytest);
    if res.is_ok() { TestResult::Ok } else { TestResult::Err }
}
```

Or an ignored test to

```rust
fn run_my_test() -> TestResult {
    TestResult::Ignored
}
```

Hopefully test frameworks can just be plugins. Test runners though are something else - they are a crate that defines a main function that accepts some custom data. We may be able to abstract this concept in a way that the std runtime entry and the test runner entry use the same mechanism.

A big +1 to this from me; that'd make it trivial to run tests in an iOS app. My current solution is a Python script to copy all of my test code into a separate crate so I can run it on iOS... rust-objc/xtests/build.py at e034af276081fed2be46301450d091f3d890ac72 · SSheldon/rust-objc · GitHub

Hi Everyone,

This is my first post to internals, so please let me know if I make any faux pas. To introduce myself, I’m Sam. For the past 3 years now, over in Ruby land, I’ve been a maintainer of the RSpec testing framework. I’ve been very excited about Rust, and I’ve been working on a testing framework as a hobby project. The core part of the framework is called descriptor, and you can find the code here.

I’ve read through this thread and have a couple of thoughts that would make writing descriptor easier:

  • at the moment, describe (the function to create test groups) and it (the function to create tests) have to be invoked from main, or from something called from main. My understanding is that Rust doesn’t have a static context, and so we can’t just have files that have bare describes in them, but if there were an annotation a user could put in front of a describe, that’d be great.
  • source locations of tests: at the moment, I’m having to pull the source location with a macro. One problem with this approach is that if you nest block macros, you lose all line number information (see this example).
  • cargo test being a way to invoke descriptor_main(), preferably with any flags passed to cargo test being passed through. It’d also be nice if I could specify a directory (like spec/, perhaps user specified) where the compiler should look for test files.

I’m not sure how achievable any of this is, and I’d be happy to talk about my test framework’s design goals, how I see it going and so on. Do let me know if you’ve got any questions :slightly_smiling:


Oh, one further thing I forgot to mention earlier that might be useful. All my tests are dispatched in their own thread, which is pretty interesting I think, and might affect the design choices here a little.

Thanks for the reply @samphippen! The split @brson is talking about between a test framework and a test runner seems like it’d fit quite nicely into what you’re talking about. Specifically, the descriptor crate would in theory largely be a test framework, which in this case would manifest itself as a plugin. As a plugin you could have almost any kind of syntax you want, and you’d also naturally have access to info like file/line numbers.

The descriptor_main would probably be handled by your own test runner, but the split means that some could use the descriptor syntax for defining tests but something else for running them all (maybe to generate machine-readable output or something like that).

The stabilization plan for plugins is pretty long term at this point, but it provides us quite a nice vector for being extra flexible in how test frameworks are implemented!

Bump to rekindle discussion cause testing in Rust is really painful right now without setup and teardown.

We can get around per-test setup/teardown by wrapping the test bodies in a function that takes a closure, but is there any way to do a setup/teardown once for the entire test suite? I can achieve a global setup phase by abusing lazy_static, but AFAICT the destructor on the value created doesn’t get run at the end of the program, so I’m not sure how to achieve a global teardown phase right now.
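The closure-wrapping workaround for per-test setup/teardown can be sketched with a Drop guard, which gives teardown even if the test body panics (Fixture and with_fixture are made-up names, just a stand-in for whatever state the tests share):

```rust
// Per-test fixture with teardown guaranteed by Drop.
struct Fixture {
    data: Vec<i32>,
}

impl Fixture {
    fn setup() -> Fixture {
        println!("setup");
        Fixture { data: vec![1, 2, 3] }
    }
}

impl Drop for Fixture {
    fn drop(&mut self) {
        // runs even when the test body panics and the stack unwinds
        println!("teardown");
    }
}

fn with_fixture<F: FnOnce(&Fixture)>(body: F) {
    let fixture = Fixture::setup();
    body(&fixture);
    // teardown happens here, when `fixture` goes out of scope
}

fn main() {
    // each "test" wraps its body instead of being a bare #[test] fn
    with_fixture(|f| assert_eq!(f.data.len(), 3));
}
```

This does nothing for the suite-wide (global) teardown problem, which is exactly the gap described above.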

/cc @wycats

@jimmycuadra Anything added into Rust proper is gonna take time to be designed, developed, and stabilized. Assuming someone does start doing all that work.

In the meantime, I recommend doing as I describe at the start of this thread:


In Cargo.toml:

```toml
[[test]]
name = "foo"
harness = false

[dependencies]
rustc-test = "0.1"
```

In the test file:

```rust
extern crate test;

fn main() {
    let tests = collect_tests();
    test::test_main(&std::env::args().collect::<Vec<_>>(), tests);
}
```
To go further, you can fork https://github.com/SimonSapin/rustc-test, add your own functionality to the harness, and use your fork instead of rustc-test in [dependencies].


I would like to link this Cargo issue here: https://github.com/rust-lang/cargo/issues/1924; it seems relevant.

Just want to double check: Right now setting harness = false in Cargo.toml prevents Cargo from passing --test to rustc, which disables both the test runner and the libsyntax code that collects the tests and generates the module which calls the runner. AFAICT there is currently no way to disable just the test runner part and not the libsyntax part.

Edit: Nevermind, I see now that --test just does the code generation, and the generated code uses the test crate which contains the runner, but that crate doesn’t need to be “disabled” in any way. But it is true that the libsyntax stuff is all or nothing, so you can’t use its test collecting capabilities but write your own main function to kick off the runner.

Given that we’re talking about a larger overhaul of the testing system, I wonder if there would be any value in an RFC to add two separate flags to control both of those things separately, i.e. --test expands to --test-main and --test-collect, but you could control the behavior using only one of the latter two arguments if you wanted.


A few thoughts. [Just skimmed the thread so sorry if I am repeating things.]

Most importantly, I’d like to see everything in rustc related to tests turned into library (compiler plugin + runtime, probably). Because our current test framework is so ergonomic (yay!!), I think that is a superb, maybe even unparalleled, goal with which to stress the compiler plugin framework.

With that in place, there are a few fun things I’d like to add on top. Besides what is listed, it would also be nice to either run one process per test, or, more efficiently, continue with a new process where testing left off after a test failure. Finally, putting everything together, I’d like to ergonomically write unit tests for OS dev (or more generally, cross-compiled + emulator; e.g. this would be good for Servo on Android). System software has traditionally been more poorly tested than most, and I’d like to not only smash that problem, but smash it with today’s stellar ergonomics.


Would it make sense to allow libraries to auto-initialise themselves for tests? I’m just thinking of a simple way to get a logger initialised; `extern crate env_logger;` vs having to write `env_logger::init().unwrap();` in every test or create some kind of set-up function just to initialise the logger.

It might make more sense if this were configurable (for test harnesses or more generally), e.g.

```rust
#[init]
extern crate env_logger;
```

That is like C++'s “life before main” and very very scary IMO. Most initialization is a smell and indeed if we had needs-provides we wouldn’t need it for log.


Yes, life-before-main. But (1) libraries can document what they do and (2) it can be disabled simply by not including the #[init] line (in this case manual initialisation would be needed).

Alternatively, I suppose a test-driver could work with a user-provided main function which does the init then calls some “special” function to run the tests.