So, sadly, the perf.rust-lang.org website has been broken for quite some time (basically since rustbuild landed). @Mark_Simulacrum has done some awesome work (which they can describe) building up a new system that uses the pre-built binaries produced by Travis. This offers the promise of very precise info about which PR triggered a regression.
I’d like to spark a discussion on two topics:
- What are the minimum steps we can take to get some kind of results available again?
  - I really dislike having no measurements
  - Is it just a matter of needing to get the old server up and going again, or what?
- How should we structure our test suite?
@Mark_Simulacrum and I have had quite a few conversations on the second point and it seems like a good idea to get broader input.
My current take is that there are roughly four kinds of measurements I would like:
- Compilation times that target very specific parts of the compiler and workflows
  - this includes incremental flows, i.e., build from scratch, apply diff, rebuild, etc. (a rough sketch of timing such a flow follows this list)
- Regression tests for known performance issues (these overlap somewhat with the first category)
- “Real-world” tests that correspond to frozen versions of actual crates
  - this includes incremental flows, i.e., build from scratch, apply diff, rebuild, etc.
- Performance of generated code
  - not currently measured at all; obviously somewhat different from compilation time, but perhaps it can share infrastructure
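For the incremental flows above, here is a very rough sketch (not part of any existing harness) of what timing a build-from-scratch / apply-diff / rebuild cycle could look like. The crate directory and diff file are made-up placeholders, and a real setup would pin the toolchain and average over multiple runs:

```rust
// Minimal sketch of timing a "build from scratch, apply diff, rebuild" flow
// by shelling out to cargo. All paths below are hypothetical placeholders.
use std::process::Command;
use std::time::Instant;

fn timed(label: &str, mut cmd: Command) {
    let start = Instant::now();
    let status = cmd.status().expect("failed to spawn command");
    assert!(status.success(), "{} failed", label);
    println!("{}: {:.2?}", label, start.elapsed());
}

fn main() {
    // Hypothetical checkout of a frozen benchmark crate.
    let crate_dir = "benchmarks/some-frozen-crate";

    // 1. Build from scratch.
    let mut clean = Command::new("cargo");
    clean.arg("clean").current_dir(crate_dir);
    assert!(clean.status().expect("failed to run cargo clean").success());

    let mut build = Command::new("cargo");
    build.arg("build").env("CARGO_INCREMENTAL", "1").current_dir(crate_dir);
    timed("initial build", build);

    // 2. Apply a small diff, simulating an edit-compile cycle.
    let mut patch = Command::new("git");
    patch.args(["apply", "changes.diff"]).current_dir(crate_dir); // hypothetical diff file
    assert!(patch.status().expect("failed to run git apply").success());

    // 3. Rebuild and time the incremental recompile.
    let mut rebuild = Command::new("cargo");
    rebuild.arg("build").env("CARGO_INCREMENTAL", "1").current_dir(crate_dir);
    timed("rebuild after diff", rebuild);
}
```

One design question this raises is which diffs to apply (whitespace-only, a method-body change, a signature change), since those stress quite different parts of the incremental machinery.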
Our current set of compilation-time benchmarks has grown somewhat organically and includes a smattering of the above categories. Perhaps we should carefully review them?
I made a brief stab at a set of runtime benchmarks as well, but that never quite got off the ground. Some more suggestions for entries there would be helpful.
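To give that discussion something concrete to poke at, here is one possible shape for a runtime-benchmark entry: just a hand-rolled timing loop with a placeholder workload (nothing here reflects the existing attempt). The real value would come from choosing workloads whose generated code we actually care about — iterators, trait objects, hashing, and so on:

```rust
// Sketch of a runtime-benchmark entry: time a fixed workload compiled by the
// candidate rustc and report wall-clock time. The workload is a placeholder.
use std::time::Instant;

fn workload() -> u64 {
    // Placeholder workload: sort a pseudo-randomly ordered vector and fold it.
    let mut v: Vec<u64> = (0..1_000_000u64)
        .map(|i| i.wrapping_mul(2_654_435_761))
        .collect();
    v.sort();
    v.iter().fold(0u64, |acc, &x| acc.wrapping_add(x))
}

fn main() {
    const RUNS: usize = 5;
    for run in 0..RUNS {
        let start = Instant::now();
        let checksum = workload();
        // Printing the checksum keeps the optimizer from discarding the work.
        println!("run {}: {:.2?} (checksum {})", run, start.elapsed(), checksum);
    }
}
```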
Thoughts?