I'd like to propose a new feature for Cargo: a CARGO_RUN_ID environment variable. It would make it possible to identify which processes or threads were spawned by the same Cargo command.
Problem:
In projects using libraries like SQLx for database testing, it's crucial to coordinate cleanup operations across multiple test processes. Currently, there's no built-in way to determine whether separate test processes originated from the same Cargo invocation, which leads to race conditions when cleaning up the test database (see the related sqlx issue).
Additionally, this issue extends beyond testing. Any scenario where multiple processes need to coordinate based on their origin from a single Cargo command would benefit from this feature.
Proposal:
Add an environment variable CARGO_RUN_ID that is available for the crate being compiled, similar to other Cargo-set environment variables (Environment Variables - The Cargo Book).
Properties of the generated ID:
Unique across Cargo invocations
Chronologically ordered
Given these requirements, a UUID v7 seems like a suitable choice for generating this ID.
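To illustrate why UUID v7 gives both properties, here is a minimal sketch of the UUID v7 bit layout from RFC 9562: a 48-bit Unix timestamp in milliseconds, followed by version and variant bits and random data. The function name `uuid_v7_from_parts` is hypothetical, and the random bytes are passed in by the caller so the example stays deterministic. This is not how Cargo would necessarily generate the ID.

```rust
/// Builds a UUID v7 string from a millisecond timestamp and caller-supplied
/// random bytes. Hypothetical helper for illustration only; a real
/// implementation would draw `rand` from a secure random source.
fn uuid_v7_from_parts(unix_ms: u64, rand: [u8; 10]) -> String {
    let mut bytes = [0u8; 16];
    // Low 48 bits of the timestamp, big-endian, so that byte order
    // (and therefore string order) follows time order.
    bytes[..6].copy_from_slice(&unix_ms.to_be_bytes()[2..]);
    // Version nibble 0b0111, then 12 random bits.
    bytes[6] = 0x70 | (rand[0] & 0x0F);
    bytes[7] = rand[1];
    // Variant bits 0b10, then 62 random bits.
    bytes[8] = 0x80 | (rand[2] & 0x3F);
    bytes[9..].copy_from_slice(&rand[3..]);

    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}
```

Because the timestamp occupies the leading bytes, IDs created in different milliseconds compare in creation order as plain strings, while the random bits keep IDs from the same millisecond distinct with high probability.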
Benefits:
Improved coordination in multi-process scenarios (e.g., SQLx database cleanup)
Ability to group logs or artifacts from related processes
Enables more effective strategies for cleanup and resource management in complex builds and operations.
Provides a uniform method for tools and libraries to identify related processes, benefiting the entire Rust ecosystem.
Nextest has implemented a similar feature with NEXTEST_RUN_ID, demonstrating the value of such an identifier.
I'm willing to implement this feature myself if the proposal is accepted. Would this be something the Cargo team would consider? I'm open to feedback or suggestions to refine this proposal.
A custom Cargo test runner could spawn several processes that run in parallel (nextest does).
If our tests use #[sqlx::test], any leftover databases should be cleaned up the next time the cargo nextest command runs.
This creates a race condition, since sqlx wasn't designed to handle a multi-process runner.
Sometimes the current database gets dropped and the test aborts.
CARGO_RUN_ID would be a way to group processes that originate from the same cargo command.
It makes the coordination a lot simpler compared to handling multi-process locking etc.
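As a rough sketch of what that grouping could look like on the library side, suppose each test database records the run ID of the invocation that created it. The `TestDb` struct and `stale_databases` function below are hypothetical, not part of sqlx. The sketch also assumes that run IDs are chronologically ordered strings in a canonical form (as with UUID v7) and that no older invocation is still running.

```rust
/// Hypothetical record of a test database and the run that created it.
struct TestDb {
    name: String,
    run_id: String,
}

/// Returns the databases that belong to an earlier run than the current one.
/// Databases from the current run are never selected, so parallel test
/// processes from the same invocation cannot drop each other's databases.
fn stale_databases<'a>(dbs: &'a [TestDb], current_run_id: &str) -> Vec<&'a TestDb> {
    dbs.iter()
        // Assumption: string comparison matches chronological order.
        .filter(|db| db.run_id.as_str() < current_run_id)
        .collect()
}
```

Every process in the same run computes the same stale set, so it does not matter which of them performs the cleanup.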
In the context of nextest, CARGO_RUN_ID would replace NEXTEST_RUN_ID.
For libraries like SQLx we want to do a similar thing: keep only the databases from the latest test run.
So SQLx would read CARGO_RUN_ID instead of an unknown set of {NEXTEST_RUN_ID, OTHER_TEST_RUNNER_ID, ETC...}
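A minimal sketch of how a library might look up the run ID today versus with the proposal. CARGO_RUN_ID is the proposed variable and does not exist yet. Falling back to NEXTEST_RUN_ID is an assumption made for illustration, and the names `resolve_run_id` and `run_id_from_env` are hypothetical.

```rust
use std::env;

/// Variables to check, in priority order. Without a standard variable,
/// this list would have to grow with every test runner.
const RUN_ID_VARS: [&str; 2] = ["CARGO_RUN_ID", "NEXTEST_RUN_ID"];

/// Returns the first non-empty run ID found via `lookup`.
/// The lookup is injected so the logic can be tested without
/// touching the real process environment.
fn resolve_run_id<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    RUN_ID_VARS
        .iter()
        .find_map(|&name| lookup(name).filter(|value| !value.is_empty()))
}

/// Convenience wrapper that reads from the process environment.
fn run_id_from_env() -> Option<String> {
    resolve_run_id(|name| env::var(name).ok())
}
```

With a single standard variable, `RUN_ID_VARS` would shrink to one entry and libraries would no longer need to track runner-specific names.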
The command cargo nextest will run the binaries in parallel
So you're asking more for Cargo to support this now, so that it can be part of the expected interface that test runners (e.g. nextest) use when spawning test harnesses (e.g. test binaries)?
Thanks for the feedback, I hope to address some of the concerns.
Library-focused solution: CARGO_RUN_ID primarily benefits libraries like SQLx and Insta that need to coordinate across test processes. It solves real issues:
Test runner agnostic: CARGO_RUN_ID would allow these libraries to work consistently across different test runners. Instead of handling NEXTEST_RUN_ID, OTHER_TEST_RUNNER_ID, etc., they'd use one standard identifier.
Future-proofing:
When Cargo implements parallel test binary execution, libraries using CARGO_RUN_ID will be ready without changes.
Not limited to testing:
I chose CARGO_RUN_ID over CARGO_TEST_RUN_ID because its utility extends beyond testing. It can coordinate any processes from a single Cargo invocation, allowing for broader future applications.
Exposing CARGO_RUN_ID to the plugin
Would be nice, but is not the main focus currently.
As I said, saying there might be use cases is insufficient on their own. If we want to design for this and call it out, we need to identify real users who could benefit.
I think I understand a bit better what you're getting at, sorry for the misunderstanding.
You don't think I should mention the point, since it is speculative?
My intention in mentioning it was as an extra plus, not a core part.
I'd lean towards not adding knowledge of this in Cargo -- NEXTEST_RUN_ID is very useful but its semantics aren't fully fleshed out, and the relationship between it and CARGO_RUN_ID may not be 1:1.
For example:
Nextest will at some point let you rerun failed tests. We probably want to use separate run IDs but may need a parent ID that groups those reruns together.
A single nextest invocation might lead to more than one Cargo build in the future. Many devs ask for a single nextest run to encompass something like cargo hack's --feature-powerset.
Not committing to particular semantics for this within Cargo means that these more concrete details can be fleshed out over time.
Also note that Cargo only passes down a very limited set of environment variables to 3rd-party plugins.
Yes, I would imagine that if Cargo did support this, nextest would have to have a second implementation. But that's something nextest already has to do and it's not a huge deal.