Make std's thread builder hookable

eggyal · June 18, 2022, 9:20am

In a discussion on the #t-libs Zulip stream around a year ago, @the8472 helpfully suggested making std's thread builder hookable as one possible solution to a problem that I was then (and am still now) having. I've finally returned to working on this and would like to move this idea forward.

My use case is probably quite niche, though one can imagine such a hook could be useful in other cases too.

In my case, I am building a test-runner that also provides a custom profiler runtime for use with -C instrument-coverage. This runtime is thereby invoked by instrumentation calls that the compiler injects into the code under test.

I would like to run tests in parallel, yet know from which test the instrumentation calls are received. I can set some thread-local for each test when it is spawned, which the profiler runtime can then inspect—this works well for tests that do not spawn any subthreads, but obviously the thread-local will not be set in any spawned subthreads. If std's thread builder was hookable, I could properly initialise any threads that are spawned using it.
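
A minimal sketch of the limitation described above, using hypothetical names (`CURRENT_TEST`, `observe`) in place of the real test-runner and profiler runtime. It only shows that a thread-local set on the test's thread is not visible in a subthread spawned from it:

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical per-thread marker naming the test that owns this thread.
    static CURRENT_TEST: Cell<Option<&'static str>> = Cell::new(None);
}

/// What a profiler runtime would see if it inspected the marker on this thread.
fn current_test() -> Option<&'static str> {
    CURRENT_TEST.with(|t| t.get())
}

/// Runs a stand-in "test" on its own thread and returns what the test thread
/// and a subthread spawned by it each observe.
fn observe(name: &'static str) -> (Option<&'static str>, Option<&'static str>) {
    thread::spawn(move || {
        // The runner sets the marker when the test thread starts.
        CURRENT_TEST.with(|t| t.set(Some(name)));
        let in_test = current_test();
        // The subthread starts with a fresh thread-local, so the marker is lost.
        let in_child = thread::spawn(current_test).join().unwrap();
        (in_test, in_child)
    })
    .join()
    .unwrap()
}

fn main() {
    let (in_test, in_child) = observe("my_test");
    // Prints: test thread: Some("my_test"), subthread: None
    println!("test thread: {:?}, subthread: {:?}", in_test, in_child);
}
```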

I currently see this hook working via two new attributes:

a #[lang = "thread_spawn_hook"] or similar lang item that one would use to decorate one's receiver; and
a crate-level #![no_thread_spawn_hook] or similar, which disables the default (no-op) receiver otherwise provided by stdlib.

std's thread builder would then call the registered hook when spawning a new thread. I think this design will be zero-cost, as calls to stdlib's (no-op) default receiver should be elided by an optimisation pass.

Is this a reasonable approach?

Obviously this cannot be stabilised without an RFC, but I also appreciate the bandwidth for new feature proposals is very limited—and this may be far too niche to justify such limited resource. If there's broader interest, I will work on a pre-RFC for more detailed feedback.

In any event, would I be able to merge experiments in this area into nightly?

bjorn3 · June 18, 2022, 11:37am

I think it should be a function like the panic hook (std::panic::set_hook). Having an attribute requires a lot of special code in the compiler that likely needs to be duplicated for each compiler backend. A function to set a hook can be implemented solely in libstd.

eggyal · June 18, 2022, 11:39am

True, but that's not zero cost as we'll need to store (the address of) the set hook somewhere and perform an indirect call to it every time a thread is spawned.

bjorn3 · June 18, 2022, 12:05pm

The cost of spawning a new thread completely dwarfs a single indirect call. If you are particularly worried about it, you can store a 0 value in an AtomicPtr in case there is no hook and check if it is set before doing the indirect call.
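
A rough sketch of that idea, written as ordinary library code. The names `SPAWN_HOOK`, `set_spawn_hook` and `spawn_hooked` are hypothetical and do not exist in std, and the sketch assumes the hook should run on the newly spawned thread. A null pointer means no hook is set, so the unhooked path costs one atomic load and a branch:

```rust
use std::mem;
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};
use std::thread;

// Hypothetical hook slot; null means "no hook set".
static SPAWN_HOOK: AtomicPtr<()> = AtomicPtr::new(ptr::null_mut());

/// Hypothetical setter, loosely modelled on `std::panic::set_hook`.
pub fn set_spawn_hook(hook: fn()) {
    SPAWN_HOOK.store(hook as *mut (), Ordering::Release);
}

fn run_spawn_hook() {
    let raw = SPAWN_HOOK.load(Ordering::Acquire);
    // Only pay for the indirect call when a hook has actually been set.
    if !raw.is_null() {
        // SAFETY: the only non-null values ever stored are `fn()` pointers.
        let hook: fn() = unsafe { mem::transmute::<*mut (), fn()>(raw) };
        hook();
    }
}

/// Stand-in for the thread builder: runs the hook first, on the new thread.
pub fn spawn_hooked<F, T>(f: F) -> thread::JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(move || {
        run_spawn_hook();
        f()
    })
}

// Example hook that just counts how many threads it has seen.
static HOOK_CALLS: AtomicUsize = AtomicUsize::new(0);

fn count_spawn() {
    HOOK_CALLS.fetch_add(1, Ordering::SeqCst);
}

fn main() {
    spawn_hooked(|| ()).join().unwrap(); // no hook set yet: nothing is called
    set_spawn_hook(count_spawn);
    spawn_hooked(|| ()).join().unwrap(); // hook runs once on the new thread
    println!("hook ran {} time(s)", HOOK_CALLS.load(Ordering::SeqCst));
}
```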

kornel · June 18, 2022, 2:35pm

I wish Rust had a better design pattern for "hooks" than a shared global mutable state. This approach breaks down as soon as there's more than one thing in the whole program that wants to use it, and since it's globally accessible, there's no way of preventing or controlling this.

eggyal · June 18, 2022, 3:04pm

stdlib could maintain (still in shared global mutable state) a collection of hooks into/from which you can register/unregister.
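
A minimal sketch of such a collection, again with hypothetical names (`HOOKS`, `register_hook`, `unregister_hook`, `spawn_hooked`) that are not part of std. It assumes hooks run on the new thread in registration order, and it needs Rust 1.63 or later for the `const` `Mutex::new`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

type Hook = Arc<dyn Fn() + Send + Sync>;

// Hypothetical global registry: still shared mutable state, but it can hold
// several hooks at once instead of a single slot.
static HOOKS: Mutex<Vec<(usize, Hook)>> = Mutex::new(Vec::new());
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// Registers a hook and returns an id that can later be used to remove it.
pub fn register_hook(hook: impl Fn() + Send + Sync + 'static) -> usize {
    let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    let hook: Hook = Arc::new(hook);
    HOOKS.lock().unwrap().push((id, hook));
    id
}

/// Removes a hook; returns false if the id was not registered.
pub fn unregister_hook(id: usize) -> bool {
    let mut hooks = HOOKS.lock().unwrap();
    let before = hooks.len();
    hooks.retain(|(hook_id, _)| *hook_id != id);
    hooks.len() != before
}

fn run_hooks() {
    // Clone the list so the lock is not held while the hooks themselves run.
    let hooks: Vec<Hook> = HOOKS
        .lock()
        .unwrap()
        .iter()
        .map(|(_, hook)| Arc::clone(hook))
        .collect();
    for hook in hooks {
        hook();
    }
}

/// Stand-in for the thread builder: runs every registered hook on the new thread.
pub fn spawn_hooked<F, T>(f: F) -> thread::JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(move || {
        run_hooks();
        f()
    })
}

static CALLS: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let id = register_hook(|| {
        CALLS.fetch_add(1, Ordering::SeqCst);
    });
    spawn_hooked(|| ()).join().unwrap(); // hook runs
    unregister_hook(id);
    spawn_hooked(|| ()).join().unwrap(); // hook no longer runs
    println!("hook ran {} time(s)", CALLS.load(Ordering::SeqCst));
}
```

This does not address the concern that anything in the program can still register or remove hooks; it only lets several users coexist.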

josh · June 18, 2022, 5:00pm

If you're profiling code, you likely don't just want to know about threads started by Rust; you also want to know about threads started by C or similar. Given that, I'd suggest a tracing mechanism that allows catching newly spawned threads from the outside, PTRACE_O_TRACECLONE for instance.

I don't think we should have an arbitrary Rust-specific hook mechanism running on thread creation.

eggyal · June 18, 2022, 5:03pm

Not profiling, but determining each test's source code coverage.

You're right that hooking std's thread builder would not catch threads spawned elsewhere, but (for my use case at least) that could be a documented limitation (with running tests serially or in child processes as an escape hatch).

I don't know enough about ptrace... I guess if I can ensure tracees break when they spawn threads (in order for the tracer to inject setup of thread-local data) that would work. I don't think determining coverage through ptrace would be practical. I see that PTRACE_O_TRACECLONE may be exactly what I need! Thanks

system · September 16, 2022, 5:04pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
