Help Needed: corpus for measuring runtime performance of generated code

Update from my end:

2018-02-02 is the first nightly since 2018-01-01 on which all of the currently assembled benchmarks compile successfully, and I’m running a script on a spare machine to backfill data from that date over the last couple of months. Once more of those runs complete, I’m planning to start exploring a few strategies for how to present the data. At a minimum I think we need to identify some statistics which let us sort the benchmark graphs by some sort of “interestingness” metric, surfacing the most interesting graphs first. Right now I am assuming we want to know about a) runtime performance regressions and b) runtime performance improvements, and to focus on recent changes (within the last 6 weeks? 4?).
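
To make that concrete, here is a minimal sketch (not lolbench’s actual code) of one possible “interestingness” score: compare the median of the most recent window of nightlies against the median of the preceding baseline, and rank benchmarks by the relative shift. The window size and the nanoseconds-per-iteration input are assumptions for illustration.

```rust
/// Nanoseconds-per-iteration samples for one benchmark, ordered oldest -> newest.
/// Returns the relative shift between the recent window and the earlier baseline;
/// larger values mean a more "interesting" graph to surface.
fn interestingness(samples: &[f64], recent_window: usize) -> Option<f64> {
    if samples.len() < recent_window * 2 {
        return None; // not enough history to form a baseline
    }
    let (baseline, recent) = samples.split_at(samples.len() - recent_window);
    let shift = median(recent) - median(baseline);
    Some(shift.abs() / median(baseline))
}

fn median(xs: &[f64]) -> f64 {
    let mut v = xs.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mid = v.len() / 2;
    if v.len() % 2 == 0 { (v[mid - 1] + v[mid]) / 2.0 } else { v[mid] }
}
```

Keeping the sign of `shift` instead of taking its absolute value would also let the dashboard distinguish regressions from improvements when sorting.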

I don’t have much experience dealing with this kind of data, so I’ve begun a bit of research on what kinds of analysis might be appropriate for finding interesting benchmarks, keeping a few notes at https://github.com/anp/lolbench/issues/7. So far, I’m pretty sure that:

  • we want to be very confident that something is a regression/improvement, not just noise
  • we don’t need to take any automated action based on the metric, other than making it easy for humans to know which benchmarks to look at
  • we don’t want to have to define lots of parameters up front for different benchmarks (there are too many individual benchmarks)
  • a solution should be as simple as possible so it doesn’t become a weird black box

I’ve collected a few ideas on that issue (one is sketched below); it would be great to hear from anyone with more experience with statistics.
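
As one example of an approach that seems to fit those constraints, here is a sketch of a permutation test on the shift in medians: it’s nonparametric, needs no per-benchmark tuning (only a global significance threshold), and only flags a change when it’s unlikely to be noise. It assumes the `rand` crate and reuses the `median` helper from the earlier sketch; nothing here is settled for lolbench.

```rust
use rand::seq::SliceRandom;

/// Approximate p-value for "the recent window differs from the baseline":
/// the fraction of random relabelings whose median shift is at least as
/// large as the one we actually observed.
fn permutation_p_value(baseline: &[f64], recent: &[f64], iterations: usize) -> f64 {
    let observed = (median(recent) - median(baseline)).abs();
    let mut pool: Vec<f64> = baseline.iter().chain(recent.iter()).cloned().collect();
    let mut rng = rand::thread_rng();
    let mut at_least_as_extreme = 0;
    for _ in 0..iterations {
        pool.shuffle(&mut rng);
        let (b, r) = pool.split_at(baseline.len());
        if (median(r) - median(b)).abs() >= observed {
            at_least_as_extreme += 1;
        }
    }
    at_least_as_extreme as f64 / iterations as f64
}
```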

While the backfill benchmarks are running, I’m going to try to tackle:

  • polish perf_events and publish to crates.io
  • upstream PMU measurements to criterion
  • make it really easy to contribute new benchmarks (a rough sketch of what one might look like is below)
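
I haven’t pinned down the contribution workflow yet, but a contributed benchmark will presumably be an ordinary nightly-style `#[bench]` function exercising code from a real crate. The crate, function, and fixture names below are placeholders, not anything lolbench uses today.

```rust
#![feature(test)]
extern crate test;
extern crate some_real_crate; // hypothetical crate under benchmark

use test::Bencher;

#[bench]
fn parse_large_document(b: &mut Bencher) {
    // hypothetical fixture; real benchmarks would use representative inputs
    let input = include_str!("fixtures/large_document.txt");
    b.iter(|| {
        // black_box keeps the optimizer from discarding the measured work
        test::black_box(some_real_crate::parse(test::black_box(input)))
    });
}
```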