[pre-RFC] Implementing "<test_binary> --list --format json" for use by IDE test explorers / runners

  • Feature Name: Implementing "<test_binary> --list --format json"
  • Start Date: 2023-02-08
  • RFC PR: none yet (based however on discussions in #107307)
  • Rust Issue: rust-lang/rust#107307

Summary

<test_binary> --format pretty|terse|json|junit already exists. (See #49359)

As does <test_binary> --list.

This proposal put both together i.e. <test_binary> --list --format json for use by IDE test explorers / runners for superior user experience.

Until stabilization, a topic for an upcoming RFC, this will be available under -Zunstable-options.

Details

Typical IDE test infras (in vscode or vside e.g.) do things in 2 phases (a) discovery (b) test execution.

While it is possible to do the above today for Rust in vscode, we do not have a good user experience. E.g. vscode-rust-test-adapter) implements a test window but there is no way to navigate back to the source code from there.

The reason is vscode-rust-test-adapter) uses output of "cargo --list" to build a list of tests across the workspace but it lacks the location of each test in the source code as well as whether it is ignored or not.

This is a pain for projects that contain even 10s of tests.

Motivation

During the discovery phase, IDE examines the workspace to list the tests in a test window (see e.g. vscode-rust-test-adapter)

Users then use that listing to run some or all of the tests - upon completion of which the test explorer nicely displays the outcome / timing / errors etc. Users also use this UX to start debugging the tests if necessary. See the Visual Studio IDE test experience.

This proposal is for implementing <test_binary> --list --format json to spit out all the necessary information to provide the user with a superior testing experience within their IDEs.

Guide-level explanation

Today <test_binary> --list spits out the following

D:\src\dpt\tests\x\a> .\target\debug\deps\a-ffd6346f1b390104.exe --list
p2dchecks::should_fail: test
p2dchecks::should_success: bench
p2dchecks::should_ignore: test

With this preRFC i.e. <test_binary> --list --format json, we would get a machine readeable output with all necessary information during the discovery phase i.e. without running the tests.

D:\src\dpt\tests\x\a> .\target\debug\deps\a-ffd6346f1b390104.exe --list --format json
{ "type": "suite", "event": "discovery", "test_count": 2 }
{ "type": "test", "event": "discovered", "name": "p2dchecks::should_success", ignore: false, ignore_message: "", source_path: "src\\lib.rs", start_line: 54, start_col: 8, end_line: 54, end_col: 22  }
{ "type": "test", "event": "discovered", "name": "p2dchecks::should_fail", ignore: false, ignore_message: "", source_path: "src\\lib.rs", start_line: 59, start_col: 8, end_line: 59, end_col: 19 }
{ "type": "test", "event": "discovered", "name": "p2dchecks::should_ignore", ignore: true, ignore_message: "nyi", source_path: "src\\lib.rs", start_line: 64, start_col: 8, end_line: 64, end_col: 21  }
{ "type": "suite", "event": "completed", "total": 3, "ignored": 1 }

Note

  1. Only --format json will be implemented during the test discovery phase. i.e. <test_binary> --list --format pretty|terse|junit will not work.
  2. Until stabilization, a topic for an upcoming RFC, this will be available under -Zunstable-options.

Reference-level explanation

<test_binary> --format json has been implemented as part of #49359.

This proposal, builds on the same json format and introduces the following:

name type description
discovery event This is psuedo-event as the listing happens in on-shot. It is being introduced to retain consistency with the overall json format
discovered event - ditto -
ignore field true | falseindicating whether this test is marked with [#ignore]
ignore_message field the message in the ignore attribute. e.g. we need to fix root cause. if the attribute is #[ignore = "we need to fix root cause."]
start_line field start line of the test function identifier
start_col field start column of the test function identifier
end_line field end line of the test function identifier
end_col field end column of the test function identifier

Drawbacks

None I am aware of.

Rationale and alternatives

I have considered the following alternatives main things is none of these are cross platform solutions that will work across all IDEs (vi, sublime, vs, vscode etc.) without either degraded user experience or non-trivial work or both.

While the following are all fun hacking exercises, they are all net pain.

  1. RLS /codelens does spit out this information. However test runners are often extensions that do not have access to the RLS channels spawned by the IDE. See an example of what I mean. It is still doable by spawning another RLS but that will degrade the user experience - more disk io and RAM consumption - as now 2 x RLS will be inaction per workspace.

  2. Reading the machine code and using pdb reader API to location the source code.

    This is also doable. We know the main method is just a call test_main_static as follows test::test_main_static(&[&it_works, &it_works2]). From there we can work out the address of TestDescAndFn and from there it's testfn member and use PDB reader to locate the source code information.

    But like seriously?

  3. Parsing the source code myself. Also doable-ish kinda. It will require some parsing library for rust and it is made even more complicated by multiple parameterized test libraties existing in crates.io. e.g. test-case, rstest etc.

  4. Making changes in another test runner instead of rust / cargo. Doable. And recommendation from rust team was nextest-rs. This doesn't work for me because it requires users to add another crate to their workspace - rust-analyzer.vs should work out of the box i.e. 0 friction, as should any other tool.

Prior art

Discuss prior art, both the good and the bad, in relation to this proposal. A few examples of what this can include are:

For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience have their community had?

Yes we can see the architecture of the Visual Studio IDE test experience extensibility.

Unresolved questions

What parts of the design do you expect to resolve through the RFC process before this gets merged?

Mainly soliciting feedback from users and other tool writers.

What parts of the design do you expect to resolve through the implementation of this feature before stabilization?

None.

What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

Not aware of any.

Future possibilities

Some test runners and IDE e.g. VS take this to the next level - they continuously run tests in the background as the user types in the code. See. WallabyJs / NCrunch / VS Live Testing.

As the users type in the code they are informed about the test failures, code coverage etc. This is super super useful for folks infected with Kent Beck style TDD. (I can personally vouch that any large project can be developed that way - but this technique has a learning curve and does require some skill)

No reason why we should not have a similar experience for Rust as well - and it is my aspiration to create one. But one step at a time.

2 Likes

One thing that isn't clear to me is: how much of this is built into existing editors, and how much of this is extensions? As it stands this reads as mostly a feature request to help vscode-rust-test-adapter, which is ... unlikely to happen ("Add a feature to rust because a third party tool does not want to do it itself" is poor motivation). You do mention other editors but it's not clear what the status quo is, and we should probably explore that first (and see if they would benefit from this)

Reference-level explanation

Please fully specify the behavior of the flag, and specify when it does and doesn't work. What happens when tests fail? What happens in non --list mode? How do the events work?

You don't need to describe the code changes needed to make this work, you need to describe how it will work.

1 Like

As it stands this reads as mostly a feature request to help vscode-rust-test-adapter

Nope. Check out rust-analyzer.vs - I am unable to build a test experience in VS on par with other languages due to issues addressed by this RFC.

Please fully specify the behavior of the flag, and specify when it does and doesn't work. What happens when tests fail?

as explained above this is only for the discover phase i.e. "cargo build --no-run" > find the test exe > run it with --list --message-format json

What happens in non --list mode?

no changes.

FWIW, the information requested here is easily fairly standard metadata about the tests. This is, ultimately, just adding proper support for --message-format to cargo test --list.

There isn't any standard LSP interface for requesting test harness information (yet?), but any IDE that offers a test panel is going to want the list of tests, and where in the source the tests correspond to is a very non-controversial extension to that which while not strictly necessary seems easily universally applicable.

Obviously the language extension will need to interpret the output of the language server into the format expected internally by the editor... but that's the designed purpose of language servers. There's no IDE where you can just point it at a language server binary without some amount of extension specific configuration, at a minimum just to identify what files to hook up to this server via generic LSP support.

A proposal that was actually biased towards a specific IDE or test view extension would probably include some more interesting way of doing partial/incremental test discovery (e.g. vscode's interface likes "collect tests in file" as a unit); this proposal is just outputting more machine-readable information from the existing cargo test --list.

4 Likes

I would recommend using --format instead of --message-format to maintain consistency with the existing libtest flag. I understand it is unfortunate that different tools are using different flags (--format for libtest, --message-format for cargo, --error-format for rustc), but I would like to avoid adding another wrinkle. (Or perhaps that was not intentional?)

I presume the format here is from #49359? That doesn't seemed to be mentioned anywhere. Also, there's no readily available documentation for what #49359's format is that I know of, so it is a bit difficult to comment on it. I wasn't initially thinking these would be related, but I can see that it could be useful to use the same form. That may make it more difficult to put this on a path to stabilization, but probably seems like the right direction.

It would be helpful to explicitly state what the relationship is with the existing JSON format. The proposal seems like it adds a "discovered" event type, along with the filename/location information, but I'm needing to make guesses.

2 Likes

Folks, please respond on this thread if you have any feedback / objections to this pre-RFC. If not, I will formalize this soon and get started with the work.

@CAD97 @ehuss @Manishearth @kornel @kpreid thanks for your comments / confirmations.

Note: please don't spend time writing code for this until you have an accepted RFC, following the process on the RFCs repo. That will likely take a while; this is just a pre-RFC to get preliminary feedback.

I think one thing that would be very important to move things further is to reach out to the maintainers of various plugins and see if your solution matches their needs. Your stated motivation is predicated on tooling authors wanting this to work this way, and any actual RFC will need buy in from them to succeed.

This is not a specification. As it stands, this is still very underspecified.

This must be clear in the RFC.

1 Like

I think the proposal makes sense.

I'd also like to see a push for --message-format=json|junit (or whatever bikeshed name) to give stable machine-readable format covering execution timing, progress, and results of the tests. Just having integration for listing the tests, but not running them, seems incomplete.

1 Like

We already have unstable support for junit output of executed tests.

1 Like

hey @Manishearth

I think there are a 2 disconnect between us. Do you have a chat / call where we can resolve this in realtime as I feel I have been repeating myself? Or please let me know who the right person to speak to is. You could also drop by the tool's discord for a chat.

This is a rather straightforward change, is backwards compatible and there are positive responses on both this thread as well as the github one.

Tool author buyoff

I am the tool author for rust-analyzer.vs - yes it works well for me.

I'll get the buyoff from vscode-rust-test-adapter owner.

They are 2 different extensions for different IDEs. if you know of any other extension that lists tests today, let me know i'll get buyoff from them as well.

Behavior introduced by the proposal

  1. this preRFC is for changes only in the "--list" mode i.e. when the test binary just lists the tests without running any of them. "--list" argument is already present, this preRFC does not introduce it.
  2. for the test execution mode (i.e. without "--list" argument), this preRFC will have no impact.

Did that answer the following questions?

Please fully specify the behavior of the flag, and specify when it does and doesn't work. What happens when tests fail? What happens in non --list mode? How do the event s work?

"--list" works whenever it works today and does not whenever it doesn't. tests cannot fail as we are just listing them. events work exactly as they do now for "--list" mode. i am riding on the same infrastructure and not introducing anything - other than printing out in json when --message-format is passed.

does all that make sense? or did you mean write the above content in the preRFC?

I'm rather busy these days but I don't have any commitments today so happy to hop on a call: I'm manishearth#6784 on Discord, ping me and we cal talk.

It may be backwards compatible, but here's the thing: once this behavior is in the compiler, we must support it forever, future changes to it must be backwards compatible. This means getting it right. There are lots of areas in Rust where there's a low bar for new features (e.g. working on diagnostics, lints, etc), but adding a new output format like this ... not as much. This will be unchangeable, and must be supported forever: that's a pretty high bar, and the Rust project requires a lot of thought to be put into design before doing that.

So no, this is not straightforward.

No, I do not have the time nor the inclination to do that research: as an RFC author you need to demonstrate you've found and gotten buy in from the stakeholders.

Also you should definitely check with the people who maintain the official rust-analyzer extension, and extensions like it, even i

This is not a specification. I'm not asking for a fully formal specification, but the full scope and behavior of the flag should be evident from the RFC. For example, check out the output spec in the rustdoc JSON RFC. Just an example is not enough: someone implementing this should not have any questions about what choices to make.

Note that you don't have to work it into the pre-RFC above; a pre-RFC is just an informal way to get feedback and improve your RFC before you submit it. But you will need this for an accepted RFC.

I do kinda agree with @kornel here: it would be nice to have this be holistic; though that's a much bigger project. I think for this RFC to be accepted it would probably need to at least think about the overall problem and demonstrate that it gels well with most overall solutions.

hey @Manishearth thanks for spending time on call, explaining in detail the process and suggesting a a good plan for us to go forward! :heart:

my bad @ehuss - fixed up the proposal, please take a look.

FWIW, we (SourceGear) also have a Visual Studio 2022 (not Visual Studio code) extension for Rust, and we too have tried integrating its Test Explorer features with cargo test ---list, and we have encountered the same kinds of problems the OP did.

So, in very broad brush strokes, I see this idea as a step in the right direction.

That is not to say that I've done a proper detailed review of what's here so far. I see the concern about the time and effort to move this forward in the right way.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.