I’ve made some improvements to rustc-benchmarks recently that are worth announcing.

Most importantly, there is now a `compare.py` script that compares the speed of two different compilers on some or all of the benchmarks. To run all benchmarks you do this:

```
./compare.py $RUSTC1 $RUSTC2
```

Sample output:

```
futures-rs-test 4.689s vs 4.668s --> 1.004x faster (variance: 1.001x, 1.008x)
helloworld 0.232s vs 0.230s --> 1.007x faster (variance: 1.009x, 1.012x)
html5ever-2016- 7.670s vs 7.669s --> 1.000x faster (variance: 1.008x, 1.009x)
hyper.0.5.0 5.304s vs 5.308s --> 0.999x faster (variance: 1.007x, 1.005x)
inflate-0.1.0 4.849s vs 4.884s --> 0.993x faster (variance: 1.019x, 1.009x)
issue-32062-equ 0.400s vs 0.396s --> 1.009x faster (variance: 1.014x, 1.021x)
issue-32278-big 1.872s vs 1.833s --> 1.022x faster (variance: 1.021x, 1.018x)
jld-day15-parse 1.903s vs 1.875s --> 1.015x faster (variance: 1.006x, 1.002x)
piston-image-0. 12.910s vs 12.932s --> 0.998x faster (variance: 1.010x, 1.006x)
regex.0.1.30 2.622s vs 2.629s --> 0.997x faster (variance: 1.020x, 1.018x)
rust-encoding-0 3.269s vs 3.245s --> 1.007x faster (variance: 1.022x, 1.022x)
syntex-0.42.2 0.240s vs 0.242s --> 0.992x faster (variance: 1.011x, 1.004x)
syntex-0.42.2-i 48.252s vs 48.070s --> 1.004x faster (variance: 1.011x, 1.006x)
```

For each benchmark it runs each compiler three times and uses the fastest run for the comparison. The “variance” values show the ratio between the fastest run and the slowest run for the two compilers, which is useful to see because occasionally (on my machine, at least) you get surprising results caused by high variance.
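The arithmetic behind each line of that output can be sketched as follows. This is a minimal illustration, not the actual `compare.py` code; the function name `summarize` and the sample timings are made up:

```python
def summarize(times1, times2):
    """Compare two sets of benchmark timings (seconds), one per compiler.

    Returns the speedup of compiler 2 relative to compiler 1 (based on
    the fastest run of each), plus each compiler's variance ratio
    (slowest run divided by fastest run).
    """
    best1, best2 = min(times1), min(times2)
    speedup = best1 / best2            # > 1.0 means compiler 2 was faster
    variance1 = max(times1) / best1    # spread across compiler 1's runs
    variance2 = max(times2) / best2    # spread across compiler 2's runs
    return speedup, variance1, variance2

# Hypothetical timings from three runs of each compiler:
speedup, v1, v2 = summarize([2.0, 2.1, 2.2], [1.0, 1.1, 1.25])
print(f"{speedup:.3f}x faster (variance: {v1:.3f}x, {v2:.3f}x)")
```

Using the fastest of the three runs, rather than the mean, reduces the influence of one-off slowdowns from other activity on the machine; the variance ratios make it obvious when a comparison is too noisy to trust.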

You can also specify benchmark names as additional arguments if you want to run a subset, e.g.:

```
./compare.py $RUSTC1 $RUSTC2 helloworld hyper.0.5.0
```

Second, all the benchmarks are now Cargo-ified and the makefiles are more consistent.

Third, all the benchmarks now have `Cargo.lock` files, which ensure the versions of external crates won’t change.

Fourth, the two `syntex` benchmarks have been fixed. Previously, they measured compile time of the final crate, which is trivial, instead of the second-last crate, which is the interesting one. This explains the huge regression for those benchmarks on http://perf.rust-lang.org/ this week.

All these changes mean that it’s now easier to do methodical optimization work on rustc. Please ask if you have any questions.