Global Executors

Don't GUI frameworks typically require that UI calls are made from a single thread? I'm not familiar with any Rust GUI crates, but it seems like most GUI applications would want a way to spawn UI tasks on a single-threaded executor and other tasks on a shared thread pool, so there would need to be a way to spawn tasks onto specific threads (or just the current thread, as suggested elsewhere) as well as onto the shared thread pool.
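For illustration, a rough sketch of that split using the futures 0.3 executors: a single-threaded LocalPool for "UI" tasks pinned to one thread, and a shared ThreadPool for everything else (the closures stand in for real UI/background work, and this assumes the futures crate's thread-pool feature is enabled):

    use futures::executor::{LocalPool, ThreadPool};
    use futures::task::{LocalSpawnExt, SpawnExt};

    fn main() {
        let shared = ThreadPool::new().expect("failed to build thread pool");
        let mut ui = LocalPool::new();

        // Background work goes to the shared pool (futures must be Send).
        shared.spawn(async { /* load data, do IO, ... */ }).unwrap();

        // "UI" work stays on this thread (futures may be !Send).
        ui.spawner()
            .spawn_local(async { /* touch single-threaded UI state here */ })
            .unwrap();

        // Drive the single-threaded tasks on the main thread.
        ui.run();
    }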

3 Likes

Finally, having some kind of executor available as part of the libstd would enable us to make the main function and test functions into async contexts, setting up an async environment for you automatically.

For reference, tokio already handles these cases just fine: async main, async tests. I don't see why std needs to do anything here.
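For example (a minimal sketch; assumes tokio with its macros and rt-multi-thread features enabled):

    #[tokio::main]
    async fn main() {
        println!("running inside tokio's runtime");
    }

    #[cfg(test)]
    mod tests {
        #[tokio::test]
        async fn it_works() {
            assert_eq!(async { 2 + 2 }.await, 4);
        }
    }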

2 Likes

I don't think a language feature for that is necessary here for a variety of reasons:

  • As others said, multiple executors running in one program will be a reality. I worked on a Java service in the past that included at least four event loops (Netty, NIO2 executors, and two homegrown things). On Android, having the Android main loop plus an IO framework is common. Some libraries will be tested against exactly one version of their dependencies before being released - and they then expect the dependency to work in production the same way as when tested. Switching the behavior depending on other dependencies can lead to unexpected results. Obviously the same is true for memory allocators, but I think for those the contracts are understood a lot better due to years of experience.
  • I think if people want the possibility to register and use a global executor, it can be done in a library which acts as a facade and registration point - log is a good example here (see the sketch after this list). Whatever executor is running could register itself as a Spawn implementation via a global function, and others could use it. But for the remaining reasons I am not even convinced that this is always a benefit.
  • Those implicit things have sometimes proven to be the hardest to understand and the most surprising for users. E.g. check questions around SynchronizationContext and TaskScheduler in .NET, and people wondering why their async code works in a WinForms application but not in a console application - even though they can await just fine in both environments.
  • Spawning is the smallest and least interesting problem. Most libraries do not just need to spawn tasks; they will also need to perform IO, use timers, etc. If they still need to provide an adaptation layer for that, then just adding one function for spawn is not that big of an addition anymore. And I doubt we can standardize IO, timers, etc. in the near future. We are taking baby steps in that direction (via Stream, AsyncRead and co.), but it will take time. I think if libraries want to make sure they are interoperable with multiple runtimes, they should simply provide their own IO traits and functions and let their users implement them - and/or provide compat layers for the common runtimes.
  • I like @CAD97's mention of structured concurrency. It actually does not have that much to do with the question and proposal itself, but if you follow structured concurrency principles strictly, then just forwarding spawn capabilities (or things like Kotlin's CoroutineScopes) is not that much of a burden anymore.
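A minimal sketch of what such a log-style facade could look like (all names here are hypothetical, and OnceLock merely stands in for whatever registration mechanism such a crate would actually pick):

    use std::future::Future;
    use std::pin::Pin;
    use std::sync::{Arc, OnceLock};

    /// The lowest-common-denominator capability a library would rely on.
    pub trait Spawn: Send + Sync {
        fn spawn_boxed(&self, fut: Pin<Box<dyn Future<Output = ()> + Send>>);
    }

    static GLOBAL_SPAWNER: OnceLock<Arc<dyn Spawn>> = OnceLock::new();

    /// Called once, by whatever executor the application chose (like `log::set_logger`).
    pub fn set_global_spawner(spawner: Arc<dyn Spawn>) -> Result<(), Arc<dyn Spawn>> {
        GLOBAL_SPAWNER.set(spawner)
    }

    /// Used by executor-agnostic libraries.
    pub fn spawn(fut: impl Future<Output = ()> + Send + 'static) {
        GLOBAL_SPAWNER
            .get()
            .expect("no global spawner was registered")
            .spawn_boxed(Box::pin(fut));
    }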
13 Likes

I agree that a global executor/spawn API probably needs more time to bake in the ecosystem before it is added to std. Particularly I feel that adding just task parallelism doesn't greatly increase the set of libraries that can be written agnostic of async runtime. There are a significant set of other global resources commonly used (TCP connecting/binding, timeouts, file IO). We need more experiments like runtime into how a global runtime can be abstracted over, either as a whole as runtime did it, or as a series of pick-and-choose components.

But even without the global executor, I think it would be worth exploring async fn main and #[test] async fn foo as soon as std::thread::block_on_task is available. There are a lot of libraries that don't (need to) use any of the globals mentioned above; they take in impl {Future, Stream, AsyncRead, ...}, return impl {Future, Stream, AsyncRead, ...}, and only use task-internal concurrency.
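A rough sketch of the desugaring this would imply, with futures::executor::block_on standing in for the proposed (not yet existing) std::thread::block_on_task:

    async fn async_main() {
        let value = async { 21 * 2 }.await;
        assert_eq!(value, 42);
    }

    fn main() {
        // `async fn main() { ... }` would expand to roughly this:
        futures::executor::block_on(async_main());
    }

    #[cfg(test)]
    mod tests {
        // `#[test] async fn foo() { ... }` would expand the same way:
        #[test]
        fn foo() {
            futures::executor::block_on(async {
                assert_eq!(async { 1 + 1 }.await, 2);
            });
        }
    }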

The primary benefit I see of async fn main is doc examples; futures 0.3 examples are full of

    # futures::executor::block_on(async {
    ...actual example
    # })

if rustdoc were to detect top-level await and implicitly change to using async fn main then this wrapper could be dropped. On the other hand in real code a lot of frameworks wouldn't want to use async fn main, e.g. GUI frameworks that need to own the main thread.

Runtime-agnostic libraries shouldn't need to be pulling in Tokio in order to run tests.

2 Likes

What about dynamic rather than global scope?

Not as a general language construct, but for a few chosen variables, like the allocator or the executor?

I agree that a global executor/spawn API probably needs more time to bake in the ecosystem before it is added to std . Particularly I feel that adding just task parallelism doesn't greatly increase the set of libraries that can be written agnostic of async runtime. There are a significant set of other global resources commonly used (TCP connecting/binding, timeouts, file IO).

To add a data point to this, I wrote the async client in redis-rs, and making the task spawning there executor-agnostic was simple, even without a common Spawn trait. Just return a Future and ask that the user spawn it or otherwise make sure it gets polled.

    #[cfg(feature = "tokio-executor")]
    pub async fn get_multiplexed_tokio_connection(
        &self,
    ) -> RedisResult<crate::aio::MultiplexedConnection> {
        let (connection, driver) = self.get_multiplexed_async_connection().await?;
        tokio_executor::spawn(driver);
        Ok(connection)
    }

    /// Returns an async multiplexed connection from the client and a future which must be polled
    /// to drive any requests submitted to it (see `get_multiplexed_tokio_connection`).
    ///
    /// A multiplexed connection can be cloned, allowing requests to be sent concurrently
    /// on the same underlying connection (tcp/unix socket).
    pub async fn get_multiplexed_async_connection(
        &self,
    ) -> RedisResult<(crate::aio::MultiplexedConnection, impl Future<Output = ()>)> {
        let con = self.get_async_connection().await?;
        Ok(crate::aio::MultiplexedConnection::new(con))
    }

A Spawn trait might be more convenient and a bit more efficient, but there isn't anything major preventing interop between executors today. The bigger problem with interop currently (which remains to be done in redis) is having common IO abstractions, as those need to be embedded deeper in the library.

Having a global executor, or even a "current" executor, would only help make it a little easier to make redis-rs runtime agnostic, and that is only if the library limits itself to spawning the task on the global/current executor. For maximum flexibility I still want a version of the above functions which allows the user to freely spawn the future on whatever executor they want, which brings me back to returning a "driver" future or taking a Spawn argument (*).

(*) With a current executor this could be done as

    with_current_executor(my_custom_executor, async {
        get_connection_with_current_executor().await
    }).await;

but that isn't especially pleasing.
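For illustration, the "taking a Spawn argument" option could look roughly like this, using the futures 0.3 Spawn/SpawnExt traits (Connection and connect_with_spawner are made-up stand-ins, not redis-rs API):

    use futures::task::{Spawn, SpawnError, SpawnExt};

    pub struct Connection;

    pub async fn connect_with_spawner(spawner: &impl Spawn) -> Result<Connection, SpawnError> {
        // The "driver" future that must be polled for the connection to make progress.
        let driver = async { /* pump requests over the socket here */ };
        // The caller's executor, whichever runtime it belongs to, runs the driver.
        spawner.spawn(driver)?;
        Ok(Connection)
    }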

2 Likes

The dynamic scope could be implemented in Rust by keeping a global stack of weak pointers to executors and using the lifetime system to pop said stack... (on mobile right now, I’ll elaborate later on if needed).

Edit: err, no, that wouldn’t fly with sync code...

Edit2: actually, it could work, if the executor pushed itself on the stack before resuming, and removed itself when done.

Edit3: well, you’d need one stack per thread...
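A rough sketch of that per-thread dynamic scope (all names here are hypothetical): the executor pushes itself onto a thread-local stack before resuming tasks and pops itself via a guard when done.

    use std::cell::RefCell;
    use std::rc::Weak;

    pub trait Executor {
        fn spawn_dyn(&self, fut: std::pin::Pin<Box<dyn std::future::Future<Output = ()>>>);
    }

    thread_local! {
        static EXECUTOR_STACK: RefCell<Vec<Weak<dyn Executor>>> = RefCell::new(Vec::new());
    }

    pub struct ScopeGuard;

    impl Drop for ScopeGuard {
        fn drop(&mut self) {
            EXECUTOR_STACK.with(|s| {
                s.borrow_mut().pop();
            });
        }
    }

    /// Called by an executor just before it resumes tasks on this thread.
    pub fn enter(executor: Weak<dyn Executor>) -> ScopeGuard {
        EXECUTOR_STACK.with(|s| s.borrow_mut().push(executor));
        ScopeGuard
    }

    /// The "current" executor for code running on this thread, if any.
    pub fn current() -> Option<Weak<dyn Executor>> {
        EXECUTOR_STACK.with(|s| s.borrow().last().cloned())
    }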

Please don't make these kinds of acrimonious accusations - there's not even an RFC here, let alone a fast track, and throughout my blog post I emphasized moving slowly and carefully. This kind of comment sows distrust and bad feelings, and it's absolutely baseless.

This blog post was part of a series (like the one on async destructors) of throwing out basic sketches of medium term improvements to async/await. The goal is to see what sticks and direct my work according to that. The response to this post has made clear that std::task::spawn should not be particularly high priority right now.

16 Likes

They don't need to; the point is that the implication that a std executor is needed for this is false, as it's already possible.

I believe it's possible to see bad aspects and good aspects of #[global_allocator]:

Bad ones:

  • It's a one-off hack, and not a small one at that (I just checked the compiler internals for the feature and it contains a lot of checks which would take real effort to specify).

  • It creates new considerations and complications for const fn (that were probably not considered at the time).

Good ones:

  • It makes an existing singleton configurable.

Looking at the actual implementation of #[global_allocator], or thinking about what a formal specification might look like, I do not see a whole lot of code & spec reuse for #[global_executor]. The additional challenge is therefore that we add roughly as much additional complexity as #[global_allocator] entailed. Worse, because we are not constrained by a general mechanism, there's also no check on the uniformity of the mechanism keeping us honest and ensuring we don't make mistakes.

So we introduce #[global_executor], and the next time we say "it's just three deprecated APIs instead of two". I don't agree with saying "just"; let's at minimum acknowledge the cost. (I should say that I personally have a low tolerance for being OK with foreseeing the deprecation of a language feature, but ymmv.)

Also, to elaborate on my note re. "keeping us honest" above: I think that with a one-off solution, there's a lot of temptation to use that capability to make the feature perfect for the specific use case but a poorer fit for a generalized language feature.

Moreover, while I'm generally not a fan of singletons, it's worth noting that the global allocator is an existing singleton as opposed to a new one like the global executor.

To me it feels strange to draw such opposition between these two mechanisms when in fact the language which originated type classes has support for exactly this feature in Backpack.

This comparison does not work for me at least. A global allocator is truly fundamental in a way I don't believe global executors would be. Imagine not having the former and instead having to thread the constructed allocator from fn main down everywhere. Given that some languages have a GC, I think such threading would be so onerous as to be a deal-breaker for most users. A global allocator is something almost all applications are interested in. Meanwhile, I don't think the same can be said for e.g. global thread pools or global executors for async.

9 Likes

Imagine not having the former and instead having to thread the constructed allocator from fn main down everywhere. Given that some languages have a GC, I think such threading would be so onerous as to be a deal-breaker for most users.

FWIW, Zig is doing exactly that. I definitely see something good in it - especially for use in embedded and low-level applications, where all allocation failures have to be handled and choosing between different allocators is important.
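A small nightly-only illustration of what explicitly threading an allocator through an API looks like in Rust terms (this assumes the unstable allocator_api feature; collect_squares is a made-up example function):

    #![feature(allocator_api)]
    use std::alloc::{Allocator, Global};

    fn collect_squares<A: Allocator>(n: usize, alloc: A) -> Vec<usize, A> {
        // The caller decides where this Vec's storage comes from.
        let mut v = Vec::with_capacity_in(n, alloc);
        for i in 0..n {
            v.push(i * i);
        }
        v
    }

    fn main() {
        // Every call site has to name an allocator; there is no implicit default.
        let squares = collect_squares(4, Global);
        assert_eq!(squares, [0, 1, 4, 9]);
    }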

However, Rust's model with infallible default allocations and a global default allocator certainly makes things easier for higher-layer projects - which is still where Rust is mostly used.

3 Likes

That's a good point. I can certainly see how not having a global allocator would be appealing for some embedded use cases, especially on the more memory-constrained targets. Games might also be another case (in favor of arena allocation), but only in some places. That said, it wouldn't be fun not having a global allocator in rustc's code-base. :wink:

2 Likes

As usual, I strongly oppose giving std magic abilities that I can't replicate if I'm designing my own std alternative (e.g. for a special environment like a kernel). That includes #[global_allocator].

However, I don't see the situation as urgent, because user code can get most of the same functionality already by abusing extern declarations. For the same reason, I'm not necessarily against adding more #[global_FOO] attributes.
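For illustration, a heavily hedged sketch of that extern-declaration trick (the same link-time idea behind #[global_allocator]; all symbol names here are made up, and the two halves would normally live in different crates):

    use std::future::Future;

    type BoxedTask = Box<dyn Future<Output = ()> + Send + 'static>;

    // "Library" side: declare a symbol someone else must define.
    extern "Rust" {
        fn __executor_spawn_boxed(task: *mut (dyn Future<Output = ()> + Send + 'static));
    }

    pub fn spawn(fut: impl Future<Output = ()> + Send + 'static) {
        let task: BoxedTask = Box::new(fut);
        // SAFETY: relies on exactly one definition of the symbol existing in the
        // final binary; that definition takes ownership of the raw pointer.
        unsafe { __executor_spawn_boxed(Box::into_raw(task)) }
    }

    // "Application" side: provide the single definition and pick the executor.
    mod app {
        use super::BoxedTask;

        #[no_mangle]
        fn __executor_spawn_boxed(
            task: *mut (dyn std::future::Future<Output = ()> + Send + 'static),
        ) {
            // SAFETY: the pointer was produced by Box::into_raw in `spawn` above.
            let task: BoxedTask = unsafe { Box::from_raw(task) };
            // Hand `task` to whatever runtime the application chose.
            drop(task);
        }
    }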

But I strongly believe the plan should be to eventually subsume them into a general feature that any library can use. In other words, I disagree with this:

That said, I agree that singletons are problematic. Maybe we don't need true singletons. It's just an idea, but I've thought about proposing some kind of "crate generic parameters" feature, where you could pass in things like allocators and executors as parameters that apply to an entire crate (or perhaps a module). Basically a form of dependency injection at compile time. It could come in handy for testing, since you could replace dependencies with mocks, without having to put generic parameters on all your types for that sole purpose...

Edit: So more or less similar to Backpack, which was previously mentioned :slight_smile:

10 Likes

On Windows and macOS, the default global executor could use the native thread pool. On Linux, libdispatch, for example, could serve. That would avoid controversy about a home-grown default and wouldn’t require as much effort.

That would still be a controversial choice.

3 Likes

I think it may also be reasonable to consider that, even though you may not want a single executor per program, you are unlikely to use more than one executor per operating system thread. A dynamic approach could better allow for systems where different executors are bound to certain operating system threads.

This turned into an unnecessarily long rant, so a TL;DR: a large amount of software and libraries will want to be able to execute asynchronously without much consideration of particular requirements, and will want to do so easily and portably. These will essentially expect to be able to use a global executor with an interface general enough for most cases. For software with specific executor requirements, there is nothing in this proposal preventing the use of additional executors with specific behaviour, which can simply be run on a dedicated OS thread; but the global executor must always be able to run, and blocking or long-running operations should not occur in tasks running on the global executor.


I think the general idea of a global executor makes a lot of sense. At some level in every application there are two kinds of resource that are inherently allocated from a global pool: space (memory) and time (time to execute on a CPU).

At higher levels both can be partitioned, and allocations from those partitions can be managed differently. At the interface of the operating system, memory is managed at the global level through sbrk, mmap, and their non-POSIX parallels, and time through operating system threads.

Both are abstracted in a way that provides a portable runtime interface in std, allowing code in all libraries to allocate either resource without interfering with each other. In order to maintain this interface whilst performing allocation in a way that minimises interactions with the operating system, allocators are used within the runtime, but the choice of allocator implementation is up to the user.

Whilst more difficult, a design that provides greater control of these resources for systems-level development would allow a developer of library A to opt out of the global allocators (using the operating-system-level primitives for allocation directly) without interfering with the use of the global allocators in library B within the same process. Note that this would be a requirement in addition to the portable global allocators rather than an alternative.

The issue with opting out of using either global allocator is that you can't stop a downstream dependency from using the global allocators without it providing a mechanism to specify an allocator explicitly.

A potential way around this is to limit the scope of 'global' for the purposes of allocating these resources from the entire process down to the current task (a la futures 0.2 contexts), thread (with TLS), or process, depending on the capabilities of the target platform. Doing so could allow spawn to inherit the allocators of the task that spawned it, and an additional spawn_with_executor to spawn a task with a specific executor, so that libraries with specific requirements could ensure that certain actions use executors that best fit those requirements without imposing those requirements on the rest of the system.
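A hypothetical sketch of that pair of entry points (none of these functions exist; the task-local lookup is stubbed out):

    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;

    pub trait Executor: Send + Sync {
        fn spawn_boxed(&self, fut: Pin<Box<dyn Future<Output = ()> + Send>>);
    }

    /// Spawn onto whatever executor the *current task* inherited.
    pub fn spawn(fut: impl Future<Output = ()> + Send + 'static) {
        current_executor().spawn_boxed(Box::pin(fut));
    }

    /// Spawn onto a specific executor; tasks spawned from inside it would inherit it.
    pub fn spawn_with_executor(
        exec: Arc<dyn Executor>,
        fut: impl Future<Output = ()> + Send + 'static,
    ) {
        exec.spawn_boxed(Box::pin(fut));
    }

    fn current_executor() -> Arc<dyn Executor> {
        // In a real design this would be looked up from task-local (or thread-local)
        // context set up by the runtime; stubbed out here.
        unimplemented!("illustrative only")
    }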

As memory is re-used in a manner that time is not, such a scheme wouldn't work for memory without tracking the origin of allocations dynamically.

I am aware that this was removed with good reason. Just highlighting the trade-offs made between tightly-coupled code, portability, and ease of development. It seems from that discussion that uncoupled libraries can't require dependencies to use a particular executor unless the dependency expects one in its interface (or a global one).

A global executor wouldn't prevent a different spawn mechanism being used if desired. Additionally, looking at other languages, it doesn't seem that there is a great requirement from the runtime for global management of memory, execution time, or inter-process communication (including files, streams, and synchronisation primitives) and so there is little expectation that this singleton approach will need to generalise much.

2 Likes

When I read the blog post, I practically convinced myself that a single global executor would be a good idea. But when I started reading the comments I realized that I have always known it wasn't such a good idea after all.

.NET has a global executor that is used by default when new tasks are created. When the program is a GUI program or a Blazor application, the executor ensures that the continuation is scheduled on the same thread that the async fn started executing on, so the UI thread can update the UI.

This means that every time you want to spawn a task in a function that 100% doesn't update the UI, you have to modify the task before awaiting it (or use a clever hack where there's a wrapper task around every public function of your library that erases the sync context).

So I have to say that a single global executor is a bad default, unless we can demonstrate that all libraries and library consumers can obtain the correct executor for their tasks without jumping through hoops.

2 Likes

What you're describing is quite different from this proposal: in .NET's runtime, all async tasks are immediately spawned. But in Rust, that is not the case (and much has been made of this fact). Nothing would be spawned onto the global executor unless explicitly spawned using task::spawn or the like.

That's true, but what should executor-agnostic libraries use to spawn tasks? If they use task::spawn, will this cause issues when they are called from a program that uses a GUI-friendly global executor?

1 Like