Pre-RFC: Stabilize `#[bench]`, `Bencher` and `black_box`

brson · January 11, 2017, 6:59pm

I wonder if we could future-proof the Bencher type to be more abstract and use the already-working trick of overriding the test crate to provide alternate benching strategies. At least if we do start thinking about stabilizing Bencher we should see how much of its interface we can get away with locking down. The current interface seems quite specific, and I don’t have much conception myself of requirements for generic benchmarking.

Ericson2314 · January 11, 2017, 7:09pm

I’m not keen on stabilizing more hard-coded annotations in rustc. Sure they make things easier now, but they will amount to extra unneeded complexity that never goes away once we have a beautiful custom test harness solution.

It’s already awkward that #[test] and #[bench] are always defined regardless of what crates one links and how one is compiling; that means we can’t backwards-compatibly define them solely in some library. For https://github.com/rust-lang/rfcs/pull/1133, it would be most elegant to make all testing stuff (runtime and macros) come from a single multi-phase test crate, or a test and test-macros pair of crates.

tomaka · January 11, 2017, 7:17pm

The plugins-based approach has the advantages that test harnesses are not treated in a special way by the compiler, and that you can use multiple test harnesses per crate.

brson · January 11, 2017, 7:20pm

In theory, sure. But it's quite hypothetical at the moment since there's not a concrete design, particularly for plugin-based benchmarking.

alexcrichton · January 11, 2017, 7:33pm

I personally feel that we should take @brson’s initial proposal and move forward with that, stabilizing Bencher in std::test (I’d be ok eliding black_box, it often isn’t necessary due to how the closure works).

I agree with @nikomatsakis that the high-order bit here is that the benchmarking support we stabilize needs to feel built-in. That means you shouldn’t have to edit Cargo.toml or add extern crate annotations to get it working. Even reaching for Bencher is a great stretch over how #[test] works.

I’ve got lots of reservations about what Bencher actually does, but the interface is so minimal today that I think we can easily stabilize it and then enhance it over time with more bells and whistles in a backwards-compatible way.
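To make concrete how small the surface under discussion is, here is a minimal sketch of a Bencher-like type that runs on stable Rust. It is not the libtest implementation: `SketchBencher`, `ns_per_iter` and `bench_sum` are hypothetical names, the fixed iteration count is an assumption (libtest chooses iteration counts adaptively), and it relies on `std::hint::black_box`, which was stabilized well after this thread. It also illustrates the remark about the closure: if `iter` passes the closure's return value through `black_box` itself, benchmark authors often do not need to call `black_box` directly.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `test::Bencher`; a sketch, not the libtest type.
pub struct SketchBencher {
    iterations: u32,
    elapsed: Duration,
}

impl SketchBencher {
    /// Assumption: a fixed iteration count chosen by the caller.
    pub fn new(iterations: u32) -> Self {
        SketchBencher { iterations, elapsed: Duration::ZERO }
    }

    /// Runs `inner` `iterations` times and records the total wall-clock time.
    pub fn iter<T, F>(&mut self, mut inner: F)
    where
        F: FnMut() -> T,
    {
        let start = Instant::now();
        for _ in 0..self.iterations {
            // Routing the result through `black_box` discourages the
            // optimizer from discarding the closure's work.
            black_box(inner());
        }
        self.elapsed = start.elapsed();
    }

    /// Average nanoseconds per iteration of the last `iter` call.
    pub fn ns_per_iter(&self) -> f64 {
        self.elapsed.as_nanos() as f64 / f64::from(self.iterations.max(1))
    }
}

/// A benchmark body written in the same shape as a `#[bench]` function.
pub fn bench_sum(b: &mut SketchBencher) {
    b.iter(|| (0..1000u64).sum::<u64>());
}
```

A caller would construct `SketchBencher::new(n)`, pass it to `bench_sum`, and read `ns_per_iter()`. Because the only interface a benchmark touches is `iter`, statistics and reporting could presumably be added later without changing benchmark code.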

I should also emphasize that replacing the test crate with your own is super unstable. This relies on the exact interface that libtest in-tree has today as the compiler will generate structures that libtest has. If you’re relying on --extern test=... shadowing the built-in test crate I would consider that a bug in the compiler that we didn’t properly gate it and I would also want to reserve the right to break it. The precise structures in libtest change over time, and there’s no reason they should be de facto stable.

brson · January 11, 2017, 7:44pm

I filed a bug about swapping the test crate: https://github.com/rust-lang/rust/issues/38998

carllerche · January 23, 2017, 6:29pm

IMO #[bench] and others should not be stabilized. It doesn’t make sense to add “magic” to rustc vs. getting it moved out to a crate. I would expect that there should be a way to get something working in a crate with macros 2.0 which should be coming soonish.

brson · January 26, 2017, 7:19pm

I do agree that the long-term path forward is to lean hard on macros 2.0 for custom test frameworks. But I believe that making macro-based custom benchmarks work via macros 2.0 will still require redesign within the Rust test harness (there is nowhere currently to macroize the reporting features of #[bench]) in order to properly integrate with cargo test. Also ISTR that last time I asked nrc about macros 2.0 I came away with the impression that the needs of custom test frameworks were not fully accounted for yet. In other words, custom test frameworks are not going to automatically fall out of macros 2.0 - there is additional work to do to make it fit together.

llogiq · January 28, 2017, 8:14pm

I was under the impression that black_box(_) was the thing keeping us from stabilizing. Otherwise bencher is a mostly good enough solution building on stable macros 1.0. Swapping those out with a proc_attr_macro seems within reach.

oln · February 15, 2017, 9:11pm

One thing that would make things a bit more ergonomic while waiting for this to be stabilised is if it was possible to do #[cfg(bench)] or similar, akin to #[cfg(test)] which would make the annotated block only be compiled when running cargo bench. (alternatively a way to use cfg to detect if using nightly, but that would presumably be more complex.) This would make it easier to add benchmarks for internal things like rustc has. Conveniently, annotating something with #[cfg(bench)] with the current compiler will ignore the annotated block.
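The suggestion can be sketched roughly as follows. This is an illustration under stated assumptions, not an existing Cargo feature: Cargo does not set a `bench` cfg, so it would have to be supplied some other way (for example by passing `--cfg bench` to rustc), and newer toolchains may warn about an unexpected cfg name. `checksum` and `time_checksum` are hypothetical names, and the timing loop uses `std::hint::black_box` on stable instead of `#[bench]`.

```rust
// Private helper that is not part of the crate's public API.
fn checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

// Hypothetical gate: this module is compiled only when a `bench` cfg is set.
// Without it, the compiler skips the whole block, as noted above.
#[cfg(bench)]
mod benches {
    use super::checksum;
    use std::hint::black_box;
    use std::time::{Duration, Instant};

    /// Times `iterations` calls of the private `checksum` function.
    pub fn time_checksum(iterations: u32) -> Duration {
        let data = vec![0xABu8; 4096];
        let start = Instant::now();
        for _ in 0..iterations {
            // `black_box` on input and output discourages the optimizer
            // from hoisting or removing the call.
            black_box(checksum(black_box(data.as_slice())));
        }
        start.elapsed()
    }
}
```

Because the module lives inside the crate, it can reach `checksum` through `super::` without the function being exported, which is the point of the proposal.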

skade · February 15, 2017, 10:28pm

If you are using cargo anyways, it’s advisable to put benchmarks in the benches folder and switch to nightly before benchmarking.

iopq · February 16, 2017, 5:13am

Can't you just use the same nightly that corresponds to the stable version? There are many reasons to use a Nightly compiler for development. clippy is one, although I guess you're supposed to use the latest nightly with it.

skade · February 16, 2017, 10:42am

That came up as a question yesterday in a course I gave: do we release a nightly compiler directly corresponding to a stable compiler?

oln · February 16, 2017, 10:52am

This requires the functions that are benchmarked to be exported publicly though. My suggestion for a #[cfg(bench)] was to make it easier to benchmark internal functions that are not meant to be exported while we are waiting for benchmarking on stable.

cuviper · February 16, 2017, 4:56pm

AFAIK nightlies always come from the master branch, so the closest would be the nightly just before beta is branched off. This won't exactly match the eventual stable release, as there are often additional patches that get backported from master to beta in the process.

If you must match the stable compiler exactly, you can cheat: Setting RUSTC_BOOTSTRAP=1 will enable the same unstable features as nightly would. This is unsupported, of course, only meant for building rustc.

skade · February 16, 2017, 6:24pm

That was what I thought, but I think it makes sense to cut them.

I know about that trick, but you really can't recommend that for production use.

cuviper · February 16, 2017, 6:30pm

Well, I wouldn't recommend nightly for production use either, so... :shrug:

/me hopes for bench stabilization

skade · February 16, 2017, 6:56pm

We do regularly recommend nightly for use around development, for things such as rustfmt and clippy, and your benchmarks are not your production program. This is a source of insecurity.

alkis · October 27, 2017, 9:21am

It is not wise to run benchmarks on nightly if you run production on stable. Both the production binaries and benchmarks should be compiled with the same compiler; otherwise we further risk benchmark results not being aligned with production (even more so than they already are).

bluss · October 27, 2017, 10:37pm

Totally agree — that’s why the crate bencher exists, so that it’s possible to measure and validate performance fixes for stable releases. Now it would be great if someone had the time to make a better benchmark runner for stable…
