Extraction of code as part of compiler integration tests for CI

Apologies if this post seems under-informed about the Rust CI. I am new to this forum, and it was suggested to me on Reddit that I present my ideas here. I would like to hear what you think about their usefulness.

One flaw I see in the current testing setup is that integration tests do not seem to be tracked in a structured manner, and it is not easy to extract the issue number or other meaningful information from them for a CI system. However, I may be wrong in my estimation of this, or simply uninformed about what already exists.

The ideas:

1. Use GitHub Issues to extract code as integration tests (and test it)

2. Run automatically chosen integration tests against the HEAD of nightly and stable

3. Show results as part of perf

What does the typical workflow for handling an issue look like?

  1. The issue comes in (usually reports are not this clean):

On nightly x86_64 this

fn main() {
    let mut u = (1,);
    *&mut u.0 = 5;
    assert_eq!( { u.0 }, 5);
}

gives

thread 'main' panicked at 'assertion failed: `(left == right)`
  ...

Output of rustc --version --verbose

rustc 1.47.0-nightly (39d5a61f2 2020-07-17)
binary: rustc
commit-hash: 39d5a61f2e4e237123837f5162cc275c2fd7e625
commit-date: 2020-07-17
host: x86_64-unknown-linux-gnu
release: 1.47.0-nightly
LLVM version: 10.0

  2. It is manually tested (via copy-paste to godbolt or to a local file) and confirmed.

  3. It is tagged accordingly.

  4. Hopefully, somebody soon fixes the issue.

  5. A regression test is written and a PR links to the issue.

  6. bors closes the issue when the PR lands.

The minimal test case can often be used, with minimal changes, as the regression test.
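As an illustration, the minimal test case above could become a regression test roughly like this. The path and issue number are hypothetical; the `// run-pass` header tells compiletest that the program must compile and run successfully.

```rust
// src/test/ui/issues/issue-NNNNN.rs (path and issue number are hypothetical)
// run-pass
// Regression test: writing through a reborrowed mutable reference to a
// tuple field must be observable on the next read of that field.

fn main() {
    let mut u = (1,);
    *&mut u.0 = 5;
    assert_eq!({ u.0 }, 5);
}
```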

What if we could remove steps 2 and 5, and make sure things such as backports are not overlooked?

It is better to use computers for repetitive tasks.

1. Use GitHub Issues to extract code as integration tests

  1. In an ideal case, the user creates an issue looking like this:

I have been staring at this code forever, and it looks very broken to me.

// run-pass
fn main() {
    let mut u = (1,);
    *&mut u.0 = 5;
    assert_eq!( { u.0 }, 5);
}
thread 'main' panicked at 'assertion failed: `(left == right)`
  ...

rustc --version --verbose

rustc 1.47.0-nightly (39d5a61f2 2020-07-17)
binary: rustc
commit-hash: 39d5a61f2e4e237123837f5162cc275c2fd7e625
commit-date: 2020-07-17
host: x86_64-unknown-linux-gnu
release: 1.47.0-nightly
LLVM version: 10.0

  2. The CI then parses the fields in a simplified manner, tests against the HEAD of nightly and stable, and marks the issue as either CONFIRMED or NEEDS-REVIEW.

Alternatively, the godbolt API could be used ([API.md in the compiler-explorer repository on GitHub](https://github.com/compiler-explorer/compiler-explorer)), but that likely takes more effort.
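A minimal sketch of what the parsing step could look like, assuming the issue body follows the simplified template above with the snippet in a Rust code fence. All function names here are made up for illustration; this is not a real rustc CI component.

```rust
// Hypothetical sketch: extract the reproduction snippet and the reported
// release line from a GitHub issue body that follows the template above.

fn extract_snippet(body: &str) -> Option<&str> {
    // Take the contents of the first Rust code fence.
    let open = "```rust";
    let start = body.find(open)? + open.len();
    let rest = &body[start..];
    let end = rest.find("```")?;
    Some(rest[..end].trim())
}

fn extract_release(body: &str) -> Option<&str> {
    // Pick the "release:" line out of the pasted rustc version block.
    body.lines()
        .find_map(|line| line.trim().strip_prefix("release:"))
        .map(str::trim)
}

fn main() {
    let body = "This looks very broken to me.\n```rust\nfn main() {}\n```\nrelease: 1.47.0-nightly\n";
    assert_eq!(extract_snippet(body), Some("fn main() {}"));
    assert_eq!(extract_release(body), Some("1.47.0-nightly"));
    println!("parsed OK");
}
```

A real implementation would of course need to handle missing fences, multiple snippets, and free-form reports, which is exactly what the NEEDS-REVIEW state is for.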

  3. The reviewer adds additional labels for later placement of the integration test, fixes up the integration test, and pings the right people.

  4. The fix gets committed.

  5. bors closes the issue.

2. Run automatically chosen integration tests against the HEAD of nightly and stable

  • Goal: keep track of issues and rarely, if ever, let regressions slip into stable/nightly.

    Presumably the CI currently has no logic for deciding which tests should be included and does not check for missing tests; otherwise such incidents could not have happened.

    Having no guideline for how issues must be linked to integration tests does not help either.

At around 25 issues per day, it will only become harder over time not to miss tedious details, and I have no idea how to estimate the complexity from the issues before it becomes unbearable.

Maybe I am missing a repo where all code issues are hosted?
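The core of such an automated check is small. Below is a sketch, assuming `rustc` is on PATH; the file names and the pass/fail classification are made up, and a real CI job would additionally pin the toolchain (nightly vs. stable) and sandbox the run.

```rust
use std::fs;
use std::process::{Command, Stdio};

// Hypothetical sketch: compile and run a reported snippet with the rustc
// on PATH and report whether it builds and exits successfully. Not a real
// rustc CI component.
fn snippet_passes(code: &str) -> std::io::Result<bool> {
    let dir = std::env::temp_dir();
    let src = dir.join("reported-snippet.rs");
    let bin = dir.join("reported-snippet-bin");
    fs::write(&src, code)?;

    // A compile failure is already a result worth recording.
    let compiled = Command::new("rustc")
        .arg(&src)
        .arg("-o")
        .arg(&bin)
        .stderr(Stdio::null())
        .status()?
        .success();
    if !compiled {
        return Ok(false);
    }

    // A panic (non-zero exit) would confirm a run-time misbehaviour report.
    Ok(Command::new(&bin).stderr(Stdio::null()).status()?.success())
}

fn main() -> std::io::Result<()> {
    println!("passes: {}", snippet_passes("fn main() { assert_eq!(2 + 2, 4); }")?);
    Ok(())
}
```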

3. Show results as part of perf

  • Goal: show that things work and do not regress.
  1. The known-working and the known-failing (but intended-to-be-fixed) integration tests could be part of sporadic perf results and releases. Alternatively, they could be scheduled at a fixed time interval.

How should tests be formatted?

Rust tests appear to be handcrafted, with the help of x.py, and are not added in a specifically [consistent way](https://rustc-dev-guide.rust-lang.org/tests/adding.html).

"For regression tests – basically, some random snippet of code that came in from the internet – we often name the test after the issue plus a short description."

Related

When things go wrong

  • Integration tests in src/test have no uniform way of linking to the corresponding issue for a regression (or is this handled by the filename?)

For example ui-fulldeps contains files like

  • issue-15149.rs (often without any explanation)

  • aux-build:issue-16822.rs (same folder, no explanation)

  • compiler-calls.rs without an .stderr (same folder)

  • plugin-args.rs with according plugin-args.stderr

The latter two are freestanding, but I found no overview or systematic explanation of what they are supposed to test. Likely I am missing something here.

Curious

Why are valgrind tests not removed or moved to an archive? (See the "How Rust is tested" overview.)

Things like `extern crate issue_9188;` are not explained in the dev guide.

Thank you for your input re: our CI. We (the infra team) appreciate it and will respond when we are able.