We love tests. Oh my, do we love tests. And now we've got more tests.
I've added two more auto builders to our CI. The 'auto' builders are the ones bors runs before merging to keep Rust from breaking; each represents a distinct configuration of Rust. Today we test every pull request against 36 configurations, 31 of which must pass before the PR can land.
They run a tool in the new build system called 'cargotest', which does one simple thing: it downloads fixed revisions of a few key crates, builds them, and tests them. The crates it builds are chosen because they are prominent projects with broad dependency trees. Today there are just two: Cargo and Iron. Between them they pull in 76 transitive dependencies, among the most battle-tested crates in the ecosystem, one of which is winapi.
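For a rough sense of what that looks like, here's a minimal sketch of a cargotest-style tool. This is not the actual cargotest source; the repositories and revisions below are placeholders, not the pins the real tool uses:

```rust
// A minimal sketch of a cargotest-style tool, assuming pinned
// (name, repository, revision) triples. All pins here are made up.
use std::process::Command;

fn run(dir: &str, cmd: &str, args: &[&str]) {
    let status = Command::new(cmd)
        .args(args)
        .current_dir(dir)
        .status()
        .expect("failed to spawn process");
    assert!(status.success(), "`{}` failed in {}", cmd, dir);
}

fn main() {
    // Pinned revisions keep the results reproducible across runs.
    let projects = [
        ("cargo", "https://github.com/rust-lang/cargo", "deadbeef"),
        ("iron", "https://github.com/iron/iron", "cafef00d"),
    ];

    for &(name, repo, rev) in &projects {
        // Fetch the project at a fixed revision...
        run(".", "git", &["clone", repo, name]);
        run(name, "git", &["checkout", rev]);
        // ...then build and test it with the candidate compiler on PATH.
        run(name, "cargo", &["build"]);
        run(name, "cargo", &["test"]);
    }
}
```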
bors gates on the cargotest builders, so it will be very hard to break any of these crates without some deep soul searching.
Automated regression detection is really, really hard. 10% is a decent threshold, but I wouldn't be surprised if noise nevertheless caused many false positives while temporarily hiding true regressions. So it should probably be a "heads up" notification, not a gating criterion; and since we already have http://www.ncameron.org/perf-rustc/ in that direction, there's little point in duplicating effort.
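To make the noise problem concrete, here's a toy sketch of a fixed-percentage check; the function name and all numbers are invented for illustration:

```rust
// Flag a regression when the measurement grows by more than `threshold`.
fn regressed(baseline_ns: f64, current_ns: f64, threshold: f64) -> bool {
    (current_ns - baseline_ns) / baseline_ns > threshold
}

fn main() {
    let threshold = 0.10; // the 10% cutoff discussed above

    // A genuine 15% regression trips the check...
    assert!(regressed(100.0, 115.0, threshold));
    // ...but so does 12% of ordinary run-to-run noise (a false positive),
    assert!(regressed(100.0, 112.0, threshold));
    // ...while a real 8% regression slips under the cutoff unnoticed.
    assert!(!regressed(100.0, 108.0, threshold));
}
```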
That's what I meant by 'nod'. The benefit would be faster turnaround time. If we already have the builders, it would be nice to get the most benefit out of them.
Great! Out of curiosity, I'd love to know which builders in particular (if any) tend to fail most often, i.e. which platforms present the largest support burden.
Would love to see Diesel added to this eventually. We tend to stress the edges of the type system with our trait usage, and have hit regressions with trait visibility or something similar almost every month. (Worth noting that the core Diesel crate doesn't rely on syntax extensions or anything unstable.)
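For a sense of what "stressing the edges" looks like, here is a hedged, self-contained sketch in the general style Diesel uses (a blanket impl over an associated type, plus a concrete impl that coherence must prove disjoint). It is not actual Diesel code, and all names are illustrative:

```rust
use std::marker::PhantomData;

// A local "SQL type" marker and the two traits at the heart of the pattern.
struct Integer;

trait Expression {
    type SqlType;
}

trait AsExpression<St> {
    type Expression: Expression<SqlType = St>;
    fn as_expression(self) -> Self::Expression;
}

// Any expression can stand in for itself: a blanket impl over all of them.
impl<T: Expression> AsExpression<T::SqlType> for T {
    type Expression = Self;
    fn as_expression(self) -> Self {
        self
    }
}

// A wrapper that turns a plain Rust value into a typed expression.
struct Bound<St, V>(V, PhantomData<St>);

impl<St, V> Expression for Bound<St, V> {
    type SqlType = St;
}

// Coherence must prove this can never overlap with the blanket impl above
// (i32 can never implement Expression); that kind of reasoning is exactly
// where compiler regressions tend to bite.
impl AsExpression<Integer> for i32 {
    type Expression = Bound<Integer, i32>;
    fn as_expression(self) -> Self::Expression {
        Bound(self, PhantomData)
    }
}

fn main() {
    let _expr = 1i32.as_expression();
}
```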