Discovering the current test (name)

I maintain a snapshot testing library (insta), and for it to work best it tries to discover the name of the current test by inspecting the thread name.
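
Roughly speaking, the trick boils down to something like this (a simplified sketch, not insta's actual code):

// Each #[test] normally runs on a thread named after the test, so the
// current test name can usually be recovered from the thread name:
fn current_test_name() -> Option<String> {
    std::thread::current().name().map(str::to_owned)
}

#[test]
fn discovers_its_own_name() {
    // With the default harness this prints the test's name
    // (possibly module-qualified); see the caveats below.
    println!("{:?}", current_test_name());
}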

Unfortunately this does not work if --test-threads is set to 1 for the Rust test harness (bug report here), and it can also run into platform limitations on the length of the thread name.

Since this issue is quite frustrating to me, I was trying to see whether there are better ways to expose this information. Ideally there would be a test support module one can import to access information about the current test, something like test::current_test_name().

Has there been some discussion already about providing minimal introspection to the integrated testing system?

Worst case, we could provide an #[insta::test] macro which sets our own thread-local for the test name. It wouldn't even require syn/quote, theoretically.

(Practically: using syn with just the minimal derive support to extract the function name would be way easier than finding it manually, but it'd be reasonably doable.)
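
A rough sketch of what that could look like (hypothetical; nothing like this ships today, and the names are made up):

use std::cell::RefCell;

thread_local! {
    // Hypothetical thread-local that an #[insta::test] attribute would populate.
    static CURRENT_TEST_NAME: RefCell<Option<&'static str>> = RefCell::new(None);
}

// Roughly what `#[insta::test] fn adds_numbers() { ... }` could expand to:
#[test]
fn adds_numbers() {
    CURRENT_TEST_NAME.with(|name| *name.borrow_mut() = Some("adds_numbers"));
    // ...original test body; snapshot assertions would read the
    // thread-local instead of relying on the thread name...
}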

This reminds me of the recent discussions about adding the ability to skip/ignore a test based on run-time criteria. And that in turn reminds me of the plans we had a couple of years back for custom test frameworks, to allow experimentation in the community.

I don't think there has been much progress on pluggable test frameworks since, and, personally, I quite like that Rust ended up having just one testing framework. I like that there's little variation in how testing is done across projects, and it seems that libtest is actually good enough (contrast with Python's unittest/pytest split). The "single test framework" goes against Rust's value of empowering the user to do whatever they want, but, much like the "single build system", I think in this case it brings a lot of productivity benefits.

So... maybe there's space for an RFC which proposes some plan for the evolution of the current libtest? As a couple of strawmen:

#[test]
fn my_new_test_param(t: &mut test::Tester) {
    if std::env::var("RUN_SLOW_TESTS").is_err() {
        eprintln!("skipping {}", t.name());
        t.skip();
        return;
    }
}

#[test]
fn my_new_test_thread_local() {
    if std::env::var("RUN_SLOW_TESTS").is_err() {
        let t = test::current().unwrap();
        eprintln!("skipping {}", t.name());
        t.skip();
        return;
    }
}

I'm with you in that it's nice that Rust has a de facto standard test framework these days. I would not want this thread to become a stepping stone back to the custom test frameworks discussion :slight_smile:

Of your two strawman proposals I vastly prefer the thread-local option, because a lot of systems already need to access this information very far removed from where the test is declared. I'm happy to try to work on an RFC for this.

As the author of the skippable test pre-RFC, I think that thread-locals are just about the worst possible solution here, because they mean one has to think differently about APIs such as async, thread::spawn, or just about anything under rayon. We already had issues with println!() and its thread-local usage behind the scenes. Let's please not add new ones.
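
To illustrate the problem (a minimal sketch; the thread-local here is hypothetical):

use std::cell::RefCell;
use std::thread;

thread_local! {
    static CURRENT_TEST_NAME: RefCell<Option<String>> = RefCell::new(None);
}

#[test]
fn spawning_loses_context() {
    CURRENT_TEST_NAME.with(|n| *n.borrow_mut() = Some("spawning_loses_context".into()));

    thread::spawn(|| {
        // The spawned thread has its own copy of the thread-local, so this
        // prints `None`, and any name/skip lookup based on it silently
        // misbehaves here.
        CURRENT_TEST_NAME.with(|n| println!("{:?}", n.borrow()));
    })
    .join()
    .unwrap();
}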

Personally, I think just having either a magic macro or symbol that is available in #[test] functions is suitable. I have crates which like to use the test name and I just pass it around explicitly from the top-level function (Vim's word-in-buffer completion helping out from there).
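
In other words, something along these lines (just a sketch of the explicit-passing style; the helper is made up):

// A helper that needs the name simply takes it as a parameter...
fn check_snapshot(test_name: &str, value: &str) {
    println!("checking snapshot for {test_name}: {value}");
}

#[test]
fn explicit_name_passing() {
    // ...and the top-level test function spells the name out once.
    check_snapshot("explicit_name_passing", "some output");
}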

I very much disagree: most of my libraries with good test coverage end up with massive, hacky macro_rules!-based test generators to do parameterized testing. Maybe it's possible to do this as some nice proc-macro framework that generates normal tests, but since you can't generate function names like add(1, 1) == 2, I still think there's a lot of room for more featureful custom test definitions.
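
The kind of hack I mean looks roughly like this (a generic sketch; note that the generated names have to be plain identifiers, so you cannot get a name like add(1, 1) == 2):

fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Generate one #[test] per case; each test name must be a valid identifier.
macro_rules! add_tests {
    ($($name:ident: ($a:expr, $b:expr) => $expected:expr;)*) => {
        $(
            #[test]
            fn $name() {
                assert_eq!(add($a, $b), $expected);
            }
        )*
    };
}

add_tests! {
    add_1_1: (1, 1) => 2;
    add_2_2: (2, 2) => 4;
    add_neg: (-1, 1) => 0;
}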

What are we contemplating here? Does the #[test] attribute have a stable interface that must be obeyed, or can we add to it in some way? If we can add to it, I'd love to see a trait defined that tests can implement that allows forward-compatible extension of tests. Purely as a strawman:

// Configurations

#[derive(Debug, Default)]
pub struct TestConfigurationVersion2 {
    test_name: String,
    // Whatever else is being contemplated for this version
}

#[derive(Debug)]
#[non_exhaustive]
pub enum TestConfiguration {
    Version1,
    Version2(TestConfigurationVersion2),
}

// Errors

#[derive(Debug)]
pub enum TestConfigurationErrorVersion2 {
    // Indicates that the test cannot handle the given version of
    // the test framework's incoming configuration.  Ideally, we'll
    // add some information here that the test framework can use
    // to try and redo the test using an earlier version of the
    // configuration, but it's entirely possible to simply try
    // all supported variants of `TestConfiguration` until one of
    // them works, and to report the test as a failure if that
    // doesn't happen.
    VersionError,

    Failed(), // Something that derives std::error::Error

    /// If this is returned, then the test passed, BUT the test has a
    /// suggestion for another test to try.  This makes it possible to
    /// build intelligent fuzzers that use the results of prior tests
    /// to narrow down what caused the test to pass.
    PassedTestNext(TestConfigurationVersion2),

    /// If this is returned, then the test failed, BUT the test has a
    /// suggestion for another test to try.  This makes it possible to
    /// build intelligent fuzzers that use the results of prior tests
    /// to narrow down what caused the test to fail.
    FailedTestNext(TestConfigurationVersion2),
}

#[derive(Debug)]
#[non_exhaustive]
pub enum TestConfigurationError {
    Version1,
    Version2(TestConfigurationErrorVersion2),
}

// The test trait

pub trait TestTrait {
    #[allow(unused_variables)]
    fn constructor(
        config: TestConfiguration,
    ) -> Result<Self, TestConfigurationError>
    where
        Self: Sized,
        Self: Default,
    {
        match config {
            TestConfiguration::Version1 => Ok(Default::default()),
            TestConfiguration::Version2(..) => {
                Err(TestConfigurationError::Version2(
                    TestConfigurationErrorVersion2::VersionError,
                ))
            },
        }
    }

    fn test(self) -> Result<(), TestConfigurationError>
    where
        Self: Sized;
}

The idea is that #[test] would become more like a derive macro, converting all current tests into objects that implement TestTrait, but which have a do-nothing constructor and a test method that is compatible with current test functions (there will be some fiddling to get this right; I banged the code out pretty fast without any real testing).
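
To make that concrete, an existing test like #[test] fn it_works() { assert!(true); } might desugar into something like this (purely illustrative, reusing the strawman trait above):

// Hypothetical expansion of `#[test] fn it_works() { assert!(true); }`:
#[derive(Default)]
struct ItWorks;

impl TestTrait for ItWorks {
    // `constructor` uses the provided do-nothing default, which accepts
    // `TestConfiguration::Version1` and rejects newer configurations.

    fn test(self) -> Result<(), TestConfigurationError> {
        assert!(true);
        Ok(())
    }
}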

The main idea is that the interface should be forward-compatible. Although the configurations for any given version of libtest are fixed, we can add in as many versions as we wish via new variants to TestConfiguration. Similar statements can be made for TestConfigurationError.

In addition, a test is able to return a new test configuration that can be used for a different iteration of the test. That could be useful for feedback-driven fuzzing tests where (once you've minimized the configuration causing the issues), you want to save the configuration in a regression database.

There are a whole host of other issues to solve here, like whether or not async tests would be a good idea, etc., but this may be a way forward that doesn't break older tests.

That assumes that test skipping will be done by random threads. IMO tests should only be skippable by the test thread itself. I've used test skipping a decent amount in Objective-C. I have never needed to skip a test somewhere deep within the test. My test skips have always been the first line of the test function (perhaps wrapped in an if-block). I'm baffled by the need or desire to skip tests deep within arbitrary code running on arbitrary threads.

I still prefer either -> ExitCode as a return type (with skipping based on the value) or attributes myself. The return-code approach does use "magic numbers", but it also works for test executables under tests/. The attribute approach is what I already have implemented :slight_smile:
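
For illustration, the return-code flavor might look something like this (the specific skip value is made up; today the harness would treat any non-success code as a plain failure):

use std::process::ExitCode;

// Hypothetical "magic" value the harness would interpret as "skipped".
const SKIPPED: u8 = 75;

#[test]
fn slow_integration_test() -> ExitCode {
    if std::env::var("RUN_SLOW_TESTS").is_err() {
        return ExitCode::from(SKIPPED);
    }
    // ...the actual slow test...
    ExitCode::SUCCESS
}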

I agree, but feel free to read through the thread for the arguments in favor of panic-based or thread-local settings for more details.

How about an enum TestResult that implements Try? It could look like this:

enum TestResult<T, E> {
    Ok(T),
    Err(E),
    Skip,
}

Or, if tests should always panic in case of an error (so a backtrace can be collected with RUST_BACKTRACE=1), it could be just

enum TestResult<T> {
    Ok(T),
    Skip,
}
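
Usage in a test might then look like this (a sketch only; the Try implementation is omitted and would need the unstable try_trait_v2 machinery, and the harness would also have to accept the type via Termination):

#[test]
fn only_on_ci() -> TestResult<()> {
    if std::env::var("CI").is_err() {
        // With a Try impl, helpers returning TestResult could use `?` too.
        return TestResult::Skip;
    }
    assert_eq!(2 + 2, 4);
    TestResult::Ok(())
}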

As mentioned, this comes with the problem that libtest is, AFAIK, perma-unstable. Any exposed type wouldn't be available on anything other than nightly.

Maybe, if we can't add a stable type to libtest, we could use ControlFlow from libcore for this.

I don't see how that is better than Result<T, E> (which is available today). impl Termination with a "magic" exit code value is about as close as I can think of right now. That also covers tests in the tests/ directory (which are separate binaries), though there will need to be some additional communication channel for such things anyway.

ControlFlow is also available today. Using Result::Err to indicate that the test was skipped is a bad idea for two reasons:

  1. The name Err doesn't make sense in this context
  2. Someone could accidentally use ? in the test. The test harness would then display "1 test skipped" instead of "1 test failed".

Sure, but the API today is that #[test] functions must return an impl Termination. How would you suggest this work?

I'm of two minds wrt. programmatic runtime skipping of tests. On one hand, unstructured / ad-hoc runtime skipping would be quite powerful. OTOH, generally speaking, structured stuff interacts better with other tooling because it's less opaque and usually inspectable without needing to run everything.

I'm not sure how that would work in practice in Rust. But as an example of what I mean, I'll mention TestNG, a Java testing library. You can assign tests to groups and say that you want to run tests by group, and you don't have to instantiate all the tests; the library just parses the classes and looks at the annotation data. I'm not sure what that would look like from a Rust perspective. I don't think it'd be a good idea to skip compiling tests that aren't going to be run, so it might not actually gain us a whole lot to have them inspectable before runtime.

OTOH, I look at the build.rs situation, where probably 90%+ of the use cases could be met in a more IDE-friendly way if we just provided some structured, non-free-form way to indicate what is wanted. And with testing, currently, if you want to do complex testing you hack around the test runner anyway. So maybe structured / annotation-driven would be better than ad-hoc / runtime-driven.

So, after rereading everything, I'd lean towards a structured approach and allow it to be injected, sort of like the t: &mut test::Tester example parameter that was given, though maybe as a newtype around a string instead of a full-on libtest-esque controlling object.
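
Something like this, perhaps (entirely hypothetical; the TestName type and the idea of the harness injecting it are made up for illustration):

/// Newtype the harness would construct and hand to the test, instead of
/// a full controlling object like `&mut test::Tester`.
pub struct TestName(pub String);

impl TestName {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Strawman: the harness injects the name, and richer behaviour
// (skipping, grouping) stays declarative via attributes.
#[test]
fn uses_injected_name(name: TestName) {
    eprintln!("running {}", name.as_str());
}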

Just to be sure, when you say impl Termination, you mean this trait, correct? Reusing code from my earlier comment:

// Since Termination is unstable, this only works on nightly.
use std::process::Termination;

// Configurations

#[derive(Debug, Default)]
pub struct TestConfigurationVersion2 {
    test_name: String,
    // Whatever else is being contemplated for this version
}

#[derive(Debug)]
#[non_exhaustive]
pub enum TestConfiguration {
    Version1,
    Version2(TestConfigurationVersion2),
}

// Errors

#[derive(Debug)]
pub enum TestConfigurationErrorVersion2 {
    // Indicates that the test cannot handle the given version of
    // the test framework's incoming configuration.  Ideally, we'll
    // add some information here that the test framework can use
    // to try and redo the test using an earlier version of the
    // configuration, but it's entirely possible to simply try
    // all supported variants of `TestConfiguration` until one of
    // them works, and to report the test as a failure if that
    // doesn't happen.
    VersionError,

    Failed(), // Something that derives std::error::Error

    /// If this is returned, then the test passed, BUT the test has a
    /// suggestion for another test to try.  This makes it possible to
    /// build intelligent fuzzers that use the results of prior tests
    /// to narrow down what caused the test to pass.
    PassedTestNext(TestConfigurationVersion2),

    /// If this is returned, then the test failed, BUT the test has a
    /// suggestion for another test to try.  This makes it possible to
    /// build intelligent fuzzers that use the results of prior tests
    /// to narrow down what caused the test to fail.
    FailedTestNext(TestConfigurationVersion2),
}

impl Termination for TestConfigurationErrorVersion2 {
    fn report(self) -> i32 {
        use TestConfigurationErrorVersion2::*;

        // Since the set of variants can increase, this match will
        // need to keep up over time.
        match self {
            VersionError => -1,
            Failed(..) => -2,
            PassedTestNext(_) => 1,
            FailedTestNext(_) => -3,
        }
    }
}

#[derive(Debug)]
#[non_exhaustive]
pub enum TestConfigurationError {
    Version1,
    Version2(TestConfigurationErrorVersion2),
}

impl Termination for TestConfigurationError {
    fn report(self) -> i32 {
        use TestConfigurationError::*;

        // Since the set of variants can increase, this match will
        // need to keep up over time.
        match self {
            Version1 => -1,
            Version2(err) => err.report(),
        }
    }
}

// The test trait

pub trait TestTrait {
    #[allow(unused_variables)]
    fn constructor(
        config: TestConfiguration,
    ) -> Result<Self, TestConfigurationError>
    where
        Self: Sized,
        Self: Default,
    {
        match config {
            TestConfiguration::Version1 => Ok(Default::default()),
            TestConfiguration::Version2(..) => {
                Err(TestConfigurationError::Version2(
                    TestConfigurationErrorVersion2::VersionError,
                ))
            },
        }
    }

    fn test(self) -> Result<(), TestConfigurationError>
    where
        Self: Sized;
}

(Some of that was banged out without testing, please forgive any errors I made)

There will also need to be a blanket implementation of Termination for Result<(), TestConfigurationError>, which will take some more thinking, but that gives the gist of what can be done.

A good case study here is Go's testing package: testing package - testing - pkg.go.dev. They seem to be able to do all the obviously missing things (test names, failing, skipping, hierarchical & dynamic tests) using a simple API and frugal language machinery.

2 Likes

We kind of just stopped talking about this... anyone have any further thoughts or suggestions?