I’ve just updated the compiler, now I have rustc 1.11.0-nightly (1c975eafa 2016-06-05), so I’ll wait for the next one.
That version (1c975eafa 2016-06-05) is nightly-2016-06-06.
Oh, OK, I was looking at the date… With this compiler using -Z orbit gives about the same run-time (it’s perhaps 1-2% slower).
I’ll do other tests on other programs. Sorry for the noise.
My comment on the PR:
Once MIR is on, the likelihood of us turning it back off seems very low, so I have to disagree with this. We must hold the line on performance: we have already regressed compile-time performance badly this year. Do we have evidence that compile time, run time, and code size are on par with oldtrans?
I see some encouraging runtime benchmarks in this thread, but not compile time, and not code size. Can somebody point me to anything definitive on them (apologies that I’m not familiar with all the measurements done so far)?
Quite simply, isn't making it the default the best way to discover and fix (compile-time and run-time) performance regressions? Can there not be a -Z launchpad for anyone who needs the latest nightly and is afraid of such regressions?
Here is the data I might expect to have before turning it on:
- Runtime of all test suites on crates.io, all nmatsakis’s test cases
- Compile time of all rustc-perf benchmarks, all crates on crates.io
- Code size of all crates on crates.io
The averages of these three (aggregated) numbers should be as good or better than oldtrans, and there should be no obviously horrible outliers. This seems to be a pretty reasonable and easy set of data to acquire.
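A rough sketch of how that check could look for any one of the three metrics, assuming the measurements have already been collected into per-crate dictionaries. The function name `compare`, the 1.5x outlier threshold, and the sample numbers are all made up for illustration; nothing here comes from an existing tool.

```python
from statistics import mean


def compare(baseline, candidate, outlier_ratio=1.5):
    """Compare per-crate measurements where lower is better.

    `baseline` and `candidate` map crate name -> measurement
    (seconds of compile time, seconds of run time, or bytes of code).
    `outlier_ratio` is an arbitrary illustrative threshold, not an agreed bar.
    """
    # Only crates measured under both configurations can be compared.
    common = [name for name in baseline if name in candidate]
    if not common:
        raise ValueError("no crates measured under both configurations")

    base_mean = mean(baseline[n] for n in common)
    cand_mean = mean(candidate[n] for n in common)

    # A crate is an outlier if it got worse by more than `outlier_ratio`.
    outliers = sorted(
        n for n in common
        if baseline[n] > 0 and candidate[n] / baseline[n] > outlier_ratio
    )
    return {
        "mean_baseline": base_mean,
        "mean_candidate": cand_mean,
        "on_par": cand_mean <= base_mean,
        "outliers": outliers,
    }


# Made-up numbers purely for illustration (not real measurements).
oldtrans = {"crate_a": 10.0, "crate_b": 20.0, "crate_c": 5.0}
mir = {"crate_a": 9.5, "crate_b": 19.0, "crate_c": 9.0}
print(compare(oldtrans, mir))
```

Reporting the outliers separately from the mean matters here: a good average can hide a single crate that regressed badly, which is exactly the case the list above is meant to catch.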
I think crater is probably not a sufficient bar for testing ‘correctness’ either. We should probably do a full run of all tests on crates.io on all three tier 1 platforms, especially Windows.
There is: in the PR I mention both -Z orbit=off and #[rustc_no_mir] (the latter was used in tests to allow us to bootstrap with -Z orbit before everything was implemented).
Eventually yes, but there’s no reason to take this step in a rush, when we can first work to mine data across a large corpus of Rust code like crates.io. Passing that bar first reduces potential pain/churn in the nightly ecosystem, and has already been serving well to spotlight issues with the transition.
Fair. Performance-wise, I assume Crater mainly gives information about compile-time performance, and perhaps some run-time performance information from benchmarks and tests?
I will try and get something running for correctness here. I agree it’d be good to cover windows better.
crater gives no information about timing of any kind.
I’ve been wondering how best to gather timing numbers. I’ve adapted a script from @jntrnr that can run and scrape timing information across all of crates.io, but I feel a bit nervous about building arbitrary crap from crates.io on my computer – seems like a security risk to do so, given that it may involve executing arbitrary code.
I can use an EC2 instance, but I’m not sure whether the numbers that would result are reliable in any way. Seems unlikely. The same might apply to a VM – though I’d expect the figures from a VM to be relatively ok, just a higher margin of error.
Perhaps a new user on my laptop would be a good compromise.
You could run the tool while booted from a live usb environment?
This seems like a good bet, and will work for at least Linux. You would want to do it while you are not connected to the Mozilla network. For Macs I can sacrifice one of the decommissioned bots, but we’d want to be careful about its network connection. Alternatively, we may be able to temporarily re-image a macstadium machine (which is off the Mozilla network), or commission a new one of those. Not sure if Windows has live images for this.
I prefer to use a VM manager. You could install Qubes (qubes-os.org) for Windows/Linux - a dedicated Xen provides fairly reliable timings.
Yeah, of course. A VM is secure enough.
Not the most scientific benchmark, but Diesel’s test suite seems to be compiling at around the same speed with MIR: 77.01s baseline, 76.81s with -Z orbit. Averaged over 3 runs; the difference between fastest and slowest was ~1s in both cases. Anyway, enough to say there appear to be no regressions there. (Full command was cargo clean && time cargo rustc --no-default-features --features="sqlite unstable" -- -Z orbit) Nice work!
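To put the Diesel figures in perspective, here is a small sketch that computes the relative change between the two averages and checks it against the reported ~1s spread between fastest and slowest runs. The helper names are hypothetical, and comparing against the spread is only a crude stand-in for a proper significance test.

```python
def relative_change(baseline, candidate):
    """Fractional change of `candidate` relative to `baseline` (negative = faster)."""
    return (candidate - baseline) / baseline


def within_noise(baseline, candidate, spread):
    """True if the difference is no larger than the observed run-to-run spread."""
    return abs(candidate - baseline) <= spread


# Averages and spread as reported in the post above.
baseline_s = 77.01   # without -Z orbit
orbit_s = 76.81      # with -Z orbit
spread_s = 1.0       # ~1s between fastest and slowest run

print(f"{relative_change(baseline_s, orbit_s):+.2%}")   # -0.26%
print(within_noise(baseline_s, orbit_s, spread_s))      # True
```

A 0.2s difference inside a ~1s spread supports the "no regression" reading, but it does not show that MIR is faster either.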
Compiling my personal project. Without MIR, 9m2.537s. With MIR, 7m52.177s.
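For anyone comparing numbers in this form, a minimal sketch that converts the 9m2.537s style (the format the shell's time builtin prints) into seconds and computes the saving. The function name `parse_time` is made up for illustration.

```python
import re


def parse_time(text):
    """Convert a duration like '9m2.537s' or '52.177s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Timings as reported in the post above.
without_mir = parse_time("9m2.537s")
with_mir = parse_time("7m52.177s")

saved = without_mir - with_mir
print(f"{saved:.2f}s saved, {saved / without_mir:.1%} faster")
# 70.36s saved, 13.0% faster
```

This is a single pair of measurements, so the roughly 13% figure says nothing about run-to-run variance.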
Similar reports compiling hyper: no perceivable difference. Same number of seconds.