An idea: "partial slice constants" to generalize testcase discovery

Sometimes there are things scattered across the codebase, and it's desirable to have a list of all things.

Examples: tests, entry points, web server routes, Google-style flags, performance counters, feature toggles.

Maintaining the list of things manually is error-prone. It is also problematic if things are in multiple crates.

Inspired by C# partial classes, I propose something like this:

partial const ALL_THINGS: &[Thing] = &[..];
// "[..]" means this is a definition of a partial slice constant.
// It's a compilation error to have more than one definition.

partial const ALL_THINGS = &[.., Thing(1), ..];
// "[.., x, ..]" means it's an extension of a partial slice constant.
// It has to be defined elsewhere.

partial const ALL_THINGS = &[.., Thing(2), ..];
// Another extension of the same constant.

The compiler transforms it into

const ALL_THINGS: &[Thing] = &[Thing(2), Thing(1)];
// order unspecified
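
For illustration, here is roughly what consuming the merged constant could look like in today's Rust. Since `partial const` is only a proposal, the merged slice is written out by hand; the derives on `Thing` and the `sorted_things` helper are assumptions made for this sketch. Because the element order is unspecified, the helper sorts a copy before relying on any order.

```rust
// Hand-written stand-in for what the compiler would generate from the
// `partial const` pieces above; `partial const` itself is not valid Rust.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Thing(pub u32);

pub const ALL_THINGS: &[Thing] = &[Thing(2), Thing(1)];

/// Hypothetical consumer: the merged order is unspecified, so sort a copy
/// before depending on any particular order.
pub fn sorted_things() -> Vec<Thing> {
    let mut things = ALL_THINGS.to_vec();
    things.sort();
    things
}
```

A caller would iterate over `sorted_things()` (or over `ALL_THINGS` directly when order does not matter) without knowing where each element was declared.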

Using this mechanism, testcase discovery (for example) could be achieved by having the #[test] macro expand

#[test]
fn my_test() {
    stuff
}

into

partial const ALL_TESTS = &[.., ("my_test", &my_test), ..];
fn my_test() {
    stuff
}
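
A minimal sketch of the consuming side, assuming the compiler had already merged every #[test] expansion into one slice. The slice is written by hand here, and `TestCase`, `run_all`, `my_test`, and `other_test` are hypothetical names; plain function pointers (`fn()`) are used in place of `&my_test`.

```rust
use std::panic;

/// (name, test function) pairs, as each expansion would contribute.
pub type TestCase = (&'static str, fn());

fn my_test() {
    assert_eq!(1 + 1, 2);
}

fn other_test() {
    assert!("abc".starts_with('a'));
}

// Hand-written stand-in for the merged constant; order is unspecified.
pub const ALL_TESTS: &[TestCase] = &[("other_test", other_test), ("my_test", my_test)];

/// Runs every registered test and returns the names of those that panicked.
/// Relies on unwinding panics; with `panic = "abort"` a failure ends the process.
pub fn run_all(tests: &[TestCase]) -> Vec<&'static str> {
    let mut failed = Vec::new();
    for &(name, test) in tests {
        if panic::catch_unwind(test).is_err() {
            failed.push(name);
        }
    }
    failed
}
```

The harness only ever sees `ALL_TESTS`; it never names an individual test function.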

I think this is dtolnay/inventory on GitHub: typed distributed plugin registration.

inventory is cool, but it relies on something Rust says it doesn't support: life before main. Put another way, it uses platform-specific code, so it is only as portable as the inventory crate itself.

Oh, I've never seen linkme before, but it's similar to what's being suggested here. Also platform dependent.

How does this help with manual bookkeeping? In your example, you still had to specify what exactly goes into the slice, didn't you?

Couldn't you solve this in a platform-independent way by means of a lazy_static! vector instead?

It helps with manual bookkeeping by letting you distribute the elements of the slice across the codebase, like #[test] annotations. Of course they all still must be specified, but at least they don't have to be specified in a single central location. This is basically exposing the linker's ability to combine items across a full binary even when they are defined in separate object files.

You cannot use lazy_static! to accomplish this. Where would the push calls go, and how would you ensure they all run before use of the vector?

There, the two problems actually help solve each other:

  • you need to put the push() calls in executable code
  • but that means you can put them just before the place where they are needed. Of course, this requires yet another layer of laziness (so that they are added only once, before first use), but that should be pretty easy to abstract away, too: perhaps a dummy OnceCell whose factory closure constructs no real value but pushes the required elements.
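
A rough sketch of that lazy approach, using `std::sync::Once` in place of the dummy OnceCell. All names here (`ALL_THINGS`, `REGISTER`, `register_module_a`, `register_module_b`, `all_things`) are hypothetical, and plain `u32` values stand in for the registered things.

```rust
use std::sync::{Mutex, Once};

// Registry that is filled lazily, at most once, before first use.
static ALL_THINGS: Mutex<Vec<u32>> = Mutex::new(Vec::new());
static REGISTER: Once = Once::new();

// Each producer module exposes a function that pushes its own elements.
fn register_module_a() {
    ALL_THINGS.lock().unwrap().push(1);
}

fn register_module_b() {
    ALL_THINGS.lock().unwrap().push(2);
}

/// Returns a snapshot of the registry, running the registrations on first call.
pub fn all_things() -> Vec<u32> {
    REGISTER.call_once(|| {
        // The consumer has to name every producer here.
        register_module_a();
        register_module_b();
    });
    ALL_THINGS.lock().unwrap().clone()
}
```

Calling `all_things()` twice yields the same contents, since `Once` guarantees the closure runs a single time. Note that the closure still has to list each registration function by name.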

You are missing the point entirely. "Just before the place wherever they are needed" is a single central location, exactly what we're trying to avoid. There is one consumer (e.g. the test harness itself) and many producers (e.g. #[test] declarations). Indeed, the producers don't actually need to be executed -- they are just data -- so ideally they would not be.

While this is pretty great, it suffers from the sadness that you need library constructors to run, which might not happen if you're trapped in an itty-bitty embedded environment. It would be great if the compiler could sort this out for us at link time, which could ostensibly be arranged by putting all these things in .rodata.ALL_THINGS or similar.

Unfortunately the catch would be getting pointers to the start and end of such a section, and ensuring the linker actually goes and finds them, neither of which is reasonably guaranteed. It would definitely require stuffing metadata into your rlibs.

linkme (also mentioned upthread) by the same author provides literally the distributed slice of statics, using linker shenanigans.

TL;DR what it does:

  • name a new linker section
  • put known zero-sized names at the beginning and end of the section
  • distributed slice members are statics located in the new section
  • create a slice from the start marker to the end marker

If you only care about platforms with linker support for doing that, linkme works perfectly. The maintainer(s) explicitly welcome PRs to support more platforms; they just don't have reason themselves to add support for other platforms they don't know as well.

Obviously compiler support would eliminate the need for linker support (and mean support for wasm?), so it's still better than ad hoc linker shenanigans. But people undersell linkme, which is mostly supposed to replace inventory for the distributed slice use case.


(Side note: the compiler already has distributed-slice-like support for #[test_case], though it is specialized for that one specific use case.)

Oh good, someone has literally done the thing I described. Excellent. I will put this to nefarious purpose at some point.

Similar idea which relies on attributes instead of introducing a new keyword and syntax:
