Continuing the discussion from Testing the compiler:
I’d like to propose a Grand Unification of our testing system. In particular, I’d like to unify `run-pass`, `compile-fail`, and `ui` testing. What follows is a write-up of some of my reasoning. Presuming we get some agreement, I’m hoping to mentor people who want to work on this refactoring.
What do we want from a testing setup?
Let me start by elaborating the kinds of things I think we want from tests:
- We want to be sure that broken code fails to compile
  - `compile-fail` does this; `ui` does too
- We want to be able to check how we format our errors
  - `ui` tests do this today
- We want to maximize “happy accidents” in testing
  - e.g., if I screw up formatting errors in some subtle way, I might not find it from targeted tests, but if we are checking that all of our `compile-fail` tests continue to work as expected, we’re more likely to catch the problem
  - similarly, if I suddenly make `NOTE` and `HELP` messages go away, I probably want to know that too; `ui` tests, since they check the complete output, maximize the chance of catching subtle problems
- We want to make it easy to update tests
  - e.g., if we change from using `-` to `~` to do underlines, we should be able to easily update all of the reference files
  - similarly, if we change the test:
    - `compile-fail` is good if we add to the test, but bad if we make a systemic change to wording (or some other thing that it considers)
    - `ui` tests have this covered with the script
- We want to make it easy to review
  - `ui` tests make it easy to see what new errors etc. look like, and make it hard for small diffs to sneak in unnoticed
- We want to make it easy to read the tests and know which scenarios result in an error
  - `compile-fail` tests do this well, since the `//~ ERROR` annotations are inline in the test
  - `ui` tests require you to load a separate file and correlate line numbers, which is not good
- We want to check the JSON output
  - nothing really tests this now
  - `compile-fail` SORT OF does, but only sort of, because it only looks at some of the JSON fields
  - something like `ui` but for the JSON output would be better
One case that is not covered well today:
- We want to be able to easily enumerate related cases that should and should not work
  - nothing does this very well today
  - sometimes you can put them all into a single `compile-fail` test
  - if you need separate files, you either spread them across two directories (`run-pass`, `compile-fail`) or else use `#[rustc_error]`, which is annoying and a hack
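To make the hack concrete, here is roughly what the `#[rustc_error]` trick looks like (a sketch from memory, not a specific test in the tree): the code under test is all valid, and the internal attribute forces an error at `main` so that the file can live in `compile-fail` at all.

```rust
#![feature(rustc_attrs)]

// Everything above `main` is the code we actually want to check compiles;
// the attribute then deliberately aborts compilation with an error.
#[rustc_error]
fn main() {} //~ ERROR compilation successful
```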
What do I propose?
I think I would like to unify `run-pass`, `run-fail`, `compile-fail`, and `ui` tests into one sort of thing. However, I’d like to start by just unifying `compile-fail` and `ui` tests, and then move on to the others. Let me describe that first.
compile-fail and ui
These tests are very close. In fact, we could just copy all the `compile-fail` tests into the `ui` directory, run the “ui generation script”, and call it a day. But that would be a bit odd, because then we’d have all these `//~ ERROR` lines that have no semantic meaning anymore. That said, I think they are kind of useful, because (as I noted above) if you have a test like this one, it’s nice to be able to tell which of the functions are errors and which are not while reading the source itself, without having to cross-reference (this comes from `compile-fail/lifetime-bound-will-change-warning.rs`, which I picked at random):
```rust
fn test1<'a>(x: &'a Box<Fn()+'a>) {
    // just_ref will stay the same.
    just_ref(&**x)
}

fn test1cc<'a>(x: &'a Box<Fn()+'a>) {
    // same as test1, but cross-crate
    lib::just_ref(&**x)
}

fn test2<'a>(x: &'a Box<Fn()+'a>) {
    // but ref_obj will not, so warn.
    ref_obj(x) //~ ERROR mismatched types
}

fn test2cc<'a>(x: &'a Box<Fn()+'a>) {
    // same as test2, but cross crate
    lib::ref_obj(x) //~ ERROR mismatched types
}
```
So I propose that we extend the `ui` runner so that it ALSO scans the file and extracts the `//~ ERROR` annotations. It would then grep the output and check that we do indeed get errors on the lines where we say we will.
Note that this is a kind of “secondary” check. We don’t need to have `//~ NOTE` annotations or any of that jazz, because the `ui` test is there to check that the message is what we expected and that everything looks right. In fact, we could probably simplify the `//~` annotations so that they don’t even include a message, just `//~ ERROR`. It’s just there to help you when you skim a list of cases so you can see which ones result in errors and which don’t.
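As a sketch of that secondary check (hypothetical helper functions, not the actual compiletest code): scan the test source for `//~ ERROR` annotations, then verify that the compiler reported an error on each annotated line.

```rust
/// Collect the 1-based line numbers that carry a `//~ ERROR` annotation.
fn expected_error_lines(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains("//~ ERROR"))
        .map(|(idx, _)| idx + 1) // lines() is 0-based; annotations are 1-based
        .collect()
}

/// Compare the annotations against the line numbers the compiler actually
/// reported errors on; on failure, return the annotated lines with no error.
fn check_annotations(source: &str, reported: &[usize]) -> Result<(), Vec<usize>> {
    let missing: Vec<usize> = expected_error_lines(source)
        .into_iter()
        .filter(|line| !reported.contains(line))
        .collect();
    if missing.is_empty() {
        Ok(())
    } else {
        Err(missing)
    }
}
```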
So this is the precise change that I propose:

- Move all `compile-fail` tests to `ui` tests.
- Simplify the annotations to just `//~ ERROR` and `//~ WARNING`; go back to grepping the output instead of parsing JSON.
  - Also, just support matching the “main” message, which is easy to grep.
  - Remove all the `//~ NOTE`, `//~ HELP`, etc., because those are complicated to grep.
- Generate the reference files, naturally.
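Since only the “main” message would be matched, the grep can be quite dumb. A sketch (again a hypothetical helper, not actual compiletest code): keep only the top-level `error`/`warning` lines of the human-readable output, skipping notes, helps, underlines, and source snippets.

```rust
/// Extract the main messages (e.g. "error[E0308]: mismatched types") from
/// rustc's human-readable output, dropping spans, notes, and helps, whose
/// lines start with things like "-->", "|", "=", or a source line number.
fn main_messages(output: &str) -> Vec<&str> {
    output
        .lines()
        .map(str::trim)
        .filter(|line| line.starts_with("error") || line.starts_with("warning"))
        .collect()
}
```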
supporting run-pass, run-fail etc
Eventually I’d like to support other kinds of tests. I think I would want the sort of test to be determined by an annotation at the top of the file (e.g., `// run-pass`) rather than by the directory it lives in. For a run-pass test, I’d like to have a reference file that tells you the expected output when it is executed.
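Under that scheme, a run-pass test might look something like this (the header comment, and the companion reference file for the executed output, are the proposal, not an existing convention):

```rust
// run-pass
// ^ hypothetical header: the annotation, not the directory, picks the test kind

fn main() {
    // A reference file next to this test would record the expected
    // output of actually running the compiled program.
    println!("all cases handled");
}
```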
other kinds of tests
Finally, if we want to test JSON output, we can make tests that use `// compile-flags:--error-format=json` and just use the JSON output as the expected output. We might need to tweak the normalization code; I’m not sure.
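Such a JSON test could then be an ordinary `ui`-style test with one extra header; everything in this sketch is illustrative rather than an existing convention:

```rust
// compile-flags:--error-format=json

fn main() {
    let x: u32 = "hello"; //~ ERROR mismatched types
}
```

The expected-output file for this test would hold the (normalized) JSON diagnostics instead of the human-readable rendering.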
thoughts?
What do people think?
One concern that @eddyb raised is that he doesn’t like the way that `ui` tests rely on a script to be maintainable. In particular, the expected workflow is to run the compiler, check the diffs in the output, and run the script if you feel they are OK. He feels that this is too prone to error. I am sympathetic, but I cannot come up with another way to test all the notes and formatting completely and universally. However, my belief is that this concern is mitigated by two factors:
- First, the fact that `ui` tests would still check that `//~ ERROR` and `//~ WARNING` occur on the right lines means you can hand-craft the tests, to some degree, to ensure you get the expected results.
  - You just also run the script in that case, but it’s not the “primary line of defense”.
- Second, I find that changes to `ui` tests are a joy to review, so I do expect to rely somewhat on reviewers, and the ability to easily skim a `git diff`, to detect “surprising” changes.
  - If anything, I’d expect we’ll do a BETTER job than we do today, since we’ll notice when smaller notes or other details change that are not tested by `compile-fail`.