Case against “maybe `async`”

I was reading about Rust plans for 2023 (as outlined by Niko) and also about questions raised by ZiCog:

Why do we have synchronous calls to OS services at all? Why do read() and friends hang up the caller for ages rather than return immediately and then indicate completion later, with some kind of signal or event? Or even crudely providing something that can be polled for completion?

How would our programming languages look if we never had synchronous system calls?

How would the code look that those compilers generated?

Without synchronous, blocking system calls, normal threading as we know it (pthreads in C, std::thread in Rust, and many others) may never have needed to exist. Would all languages then be async by default?

And then I did a tiny experiment: what would happen if we pretend, for a moment, that there is no difference between sync and async functions?

Yeah, async functions return a reference to a function frame and make it possible to poll them to completion (at heart they are generators, not regular functions), but still, it's obvious how one may convert an async function into a sync one in an “extra naïve way”. So we can take something like this:

extern "C" {
    // Make sure the calculated value goes “somewhere outside” and is not optimized away
    fn my_c_function(x: *mut i32);
}

pub async fn square_async(num: i32) {
    unsafe { my_c_function(&mut (num * num)) }
}

And then make it sync with an extra-naïve, super-dumb executor:

use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, Waker};

pub fn square_async_to_sync(num: i32) {
    let waker = Waker::noop();
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(square_async(num));
    loop {
        if let Poll::Ready(res) = fut.as_mut().poll(&mut cx) {
            return res;
        }
    }
}

And then we compile it, compare to synchronous version… and we get the exact same machine code, of course. Simple dead code elimination, nothing magical.

But then… what's the point of the “maybe async” feature? It seriously looks like a solution in search of a problem.

What we really need is to solve a completely different task from the same post: “traits for being generic over runtimes”.

If we can do that, then we may provide a “naïve, synchronous runtime”, postulate that an attempt to use await on an async function from a normal, synchronous function injects a dummy loop with a noop waker… and voilà: done. Everything works. A tiny change to the compiler, a tiny change to the language.

And I may not understand some deep implications, but I couldn't see how any “maybe async” feature would be able to magically remove all this polling and wakeup machinery unless I were able to magically switch my function from poll_read to read, and if I'm not mistaken, the “maybe async” proposals don't include anything like that.

3 Likes

So I guess your proposal is that we don't need "maybe async" at the type level, because we can document when implementations are actually sync and block_on them instead? (Or now_or_never them, if you want to panic on an await instead of busy looping like you do, which is the absolute worst thing to do; or use a lightweight but not nothing executor, like block_on.)
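To make the "lightweight but not nothing executor" option concrete, here is a minimal sketch of a parking-based block_on using only the standard library. Unlike the busy loop in the original post, a Pending poll suspends the thread until the waker fires. All names here (ThreadWaker, block_on) are illustrative, not any particular crate's API.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// A waker that unparks the blocked thread instead of doing nothing,
// so the loop below suspends rather than spinning.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// A minimal `block_on`: lightweight, but not a no-op executor.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(), // sleep until woken, don't spin
        }
    }
}
```

Crates like futures and pollster implement essentially this shape, with more care around spurious unparks.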

It works in the sense that if you only write correct code things work as they would with proper "maybe async" support, but it immediately crumbles into a giant silent footgun the moment any actual async work is done... or the moment your "not actually async" implementation is called from code that wants to be async and now is suddenly blocking the executor.

The point of trait generics is writing generic code. If you're solely trying to fit through an existing interface, "async in name only" can be a reasonable internal, non-public implementation detail choice to make things work. But it's a horrible choice for public API that's supposed to play well and integrate with other code. The point of "maybe async" is that being async isn't just the permission to await (which includes the scheduling a wakeup) but it's also a communication of intent; intent that polling doesn't take any significant amount of time and it definitely shouldn't be calling OS functionality to block the thread.

If you have an environment where IO blocking doesn't exist, then everything "being async" is almost enough of a solution, sure. But Rust lives in an environment with blocking. Await cancellation point visibility is critical to writing unsafe code, because cancellation safety (leaking stack frames) is stricter than unwind safety (running stack destructors). Busy looping is the absolute worst way to handle "polling" an underlying blocking resource (you want to block/suspend execution instead of wasting electricity and starving out other threads). And it still doesn't handle actual compute loads starving out tasks (i.e. those structurally multiplexed via select! and/or join!, and thus polled from the same thread instead of being spawned as independent tasks).

I do fundamentally believe that "everything async" is the better choice for a new "scripting" language. Go is kind of that. But as much as "systems" versus "scripting" is a false dichotomy, Rust is firmly on the systems side that strongly wants if not needs to be able to reason about async effects at a type system level.

That's like dynamic typing: rather than detecting a programming mistake at compile time through the type system, compile anyway and let the code do whatever.

If this is done, I would expect it to at least raise a lint that is deny by default, because it would be very error-prone otherwise.

I may be misunderstanding, but what about tasks in async executors meant specifically for CPU bound work? Like Bevy's AsyncComputeTaskPool and ComputeTaskPool. In the spawned futures, since you do cpu-bound work between each await point, polling them is expensive.

Now, why would a CPU bound thread pool run a future, rather than a closure? Well, this is an easy way to enable calling async APIs, with no loss of generality (you may just not await anything if you don't want to). So if Rayon were written today, it could have been written as an async executor that ran CPU bound futures, rather than just taking closures. (Likewise, Bevy has ParallelIterator just like Rayon, but running on the async executor)

So most of the time a future means "this is probably IO bound and polling is supposed to be quite cheap", but other times it means "this is actually CPU bound, but encapsulated as a future so that we can easily call async APIs". (A further complication is that there is no threshold that makes code go from IO bound to CPU bound, and it's quite possible that nominally IO bound code has some anomalous input that makes it behave as CPU bound.)

Yeah, using async for CPU bound work is done, but it still runs into the pitfall that select!/join! style concurrency including a CPU bound task starves the other. It's generally better to use non-async contexts for CPU intensive work for that reason, using something like ctx.runtime().local_block_on(async_task()) to "switch colors," but it's also generally easiest to do so just by being async from the start and keeping that potential footgun in mind (since you already should be when doing async anyway).

It's a directionality thing. For "async input," accepting async { blocking_work() } gracefully is a relaxation of domain, and thus adequately handled by documentation. For "async output," producing a task which blocks the thread when polled is an expansion of range, thus a (safe but nonetheless) violation of the implicit expectation. It's more of an issue when talking about traits, thus an API surface which is expected to be generic over, of course. (It's just that Future is a vocabulary bit of generality, and not playing nice with select!/join! is unfortunate.)

There's of course no universal hard cutoff for when a poll is too CPU intensive. But a reasonable soft idea is "would show up in a profiler." At that point, you've starved other tasks on the thread for a time period on the order of the measurement granularity. The heuristic used for "cooperative preemption" (i.e. compiler inserted) is typically a potential yield at any control flow back edge.

2 Likes

But this is exactly the same story as when you take a function that is designed to work in Tokio and try to push it into async-std. Or, heck, even use code which assumes a multithreaded executor and run it in a Tokio test where that's not available. Like in this topic here.

Maybe. But we don't really have a choice. We already have that problem in Rust, right now, today. It's even in Niko's list under the “traits for being generic over runtimes” name.

The short summary: you are correct, of course, in saying that allowing calls to just any random async fn is dangerous, but it's no different from calling a normal async fn with the wrong executor.

And, even worse, a “true” “maybe async” wouldn't magically solve that issue! You would still be able to convert the “wrong” async future into synchronous code!

I agree, but “maybe async” is entirely the wrong abstraction! In the situation where “maybe async” would be useful (if we, somehow, could distinguish code designed to work with different executors) it's not needed, and if we couldn't do that, then “maybe async” doesn't add anything interesting to the table.

The short version of my post is: a purely synchronous executor is not special. You either have a way to keep “oil and water” separate (keep Tokio futures out of the async-std executor and/or keep “purely async code” from being executed in a synchronous environment) or you don't.

Special-casing “pure synchronous code” is not helpful! Yes, the problems that you are talking about are real, but “maybe async” doesn't solve them, and if we found a way to solve them (be it “trait transformers” or anything else) then “maybe async” would stop being useful!

2 Likes

Why would you assume that? The initial announcement of the Keyword Generics Initiative even has an example of a read trait that works both in sync and async:

I'd say that they do have switchable read and other essential traits in their design.
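For readers who haven't seen it, the shape of that design is roughly the following. This is illustrative pseudocode in the spirit of the Keyword Generics Initiative announcement, not accepted Rust syntax:

```
// A trait that is generic over "asyncness": `?async` marks items that
// may or may not be async depending on the calling context.
trait ?async Read {
    ?async fn read(&mut self, buf: &mut [u8]) -> Result<usize, Error>;
}

// A function that is async iff its `reader` argument is:
?async fn read_to_string(reader: &mut impl ?async Read) -> Result<String, Error> {
    // in an async context this lowers to `reader.read(..).await`,
    // in a sync context to a plain call
    ...
}
```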

Because I couldn't see any proposals to do that.

Yes, but they don't include the important part: how to make sure existing, already written code may use them without a rewrite.

And if their (pretty sensible, if you ask me) solution to that issue is to rewrite all the async crates, then the question immediately arises: why not adopt some other solution which would keep code written for different executors separate? Because that's badly needed, too; it's even in Niko's list under the “Traits for being generic over runtimes” name (except that, somehow, it comes after “maybe async” there, even though without these traits “maybe async” is pretty much useless).

Let me repeat what I already wrote: a purely synchronous executor is not special.

Any solution which would ensure that you couldn't mix futures designed to be executed with different executors would automatically make “maybe async” trivially implementable: just inline the async fn code into the synchronous function that calls it, and the dead code eliminator would do the rest.

And as you correctly noted, we would need these traits, would need to manually adopt them, and would need to do what's needed to deal with them… even without the “maybe async” proposal.

If “maybe async”, by itself, doesn't solve any real problems and doesn't make any other problems easier to solve… then why is there so much activity around that proposal?

Think about how you would use Tokio-based code and Embassy-based code in the same program; “maybe async” would just happen automatically if you had that.

4 Likes

Crates are already going to need to rewrite to use the async traits in std, once they exist.

I can appreciate the concept of "what if everyone wrote all code using async, including synchronous code". I also think that would be widely perceived as a usability issue, and in any case, doesn't help the mountain of existing synchronous code. It'd be a monumental amount of churn.

1 Like

Yes. And that's my point: if we need to rewrite everything anyway, then it's better to do that once, not twice.

No. It's the other way around: if we wrote code using async and made async behave like const does (allow use of a function in an async context, but also allow use of that same function in a sync context), then we wouldn't need any of that “maybe async” nonsense.

Why is that not a problem for const functions?

This reasoning doesn't work nicely because const functions restrict the capabilities of the function (you can only use features that are available in const contexts, which are a subset of what normal functions can do), while async functions enlarge the set of capabilities (you can now .await Futures).

From the point of view of the caller, you can call const functions in both const functions and normal functions, so they are less restricting. On the async side that role is taken by normal sync functions, because you can always call them in async functions.
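The caller-side asymmetry can be seen in a few lines. A const fn gives up capabilities inside its body, and in exchange is callable from both const and ordinary runtime contexts, which is the property the earlier post wants async to mirror:

```rust
// A const fn restricts what its body may do, and in exchange
// it is callable from both const and non-const contexts.
const fn square(x: i32) -> i32 {
    x * x
}

// const context: evaluated at compile time
const NINE: i32 = square(3);

// runtime context: an ordinary call with a runtime value
fn runtime_use(n: i32) -> i32 {
    square(n)
}
```

An async fn has the opposite shape: its body gains a capability (.await), so it is the sync caller that would need something extra, an executor, to consume it.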

The problem with this is that calling blocking sync functions in async code is bad, and even when it's not blocking it often doesn't compose well (think for example of Option::map: the closure inside needs to be sync even if map is called in an async context).
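The Option::map point can be shown directly. In the sketch below, fetch is a hypothetical async stand-in for real IO, and run is a one-shot poll helper included only so the example is self-contained (it assumes the future never returns Pending):

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical async operation standing in for real IO.
async fn fetch(x: i32) -> i32 {
    x + 1
}

async fn demo(opt: Option<i32>) -> Option<i32> {
    // This does NOT compile: the closure given to `map` is a plain sync
    // closure, so `.await` is not allowed inside it:
    //     opt.map(|x| fetch(x).await)
    // The workaround is to desugar the combinator by hand:
    match opt {
        Some(x) => Some(fetch(x).await),
        None => None,
    }
}

// Tiny helper to run a future that never actually pends.
fn run<F: Future>(fut: F) -> F::Output {
    struct Noop;
    impl Wake for Noop {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(Noop));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => unreachable!("these futures never pend"),
    }
}
```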

"Just let async functions be callable from sync functions" is not gonna work. You showed an example where you can just poll it once, but that's going to work only for those async functions that .await nothing meaningful and could just be sync functions (i.e. the most useless ones!). For all the others this is not gonna work and you will need an actual executor in order to not busy loop.

2 Likes

So what? We already have the problem that not all async functions are callable from all executors. What makes a synchronous executor special and deserving of special language support?

Also the only ones that make sense to call in a synchronous context.

Yes, but as we all know the problem is deeper: we can already cause lots of trouble by trying to mix async code designed to be run with different executors.

Making it possible to use a special dedicated 'static executor makes it neither better nor worse.

If anything, we may want to have a generic async keyword, but it shouldn't be picking between async and not-async.

Rather, I would say that what we want is something like async<Tokio>, async<Embassy> and async<Sync>, with the possibility of making that type parameter generic.

To add, the usual concept that Rust encourages is that if a function uses/requires some resource (e.g. the Tokio executor), it should either reach it by some function argument or, if it's a global/ambient resource, lazily initialize it when first used. The most notable example of this is rayon's compute thread pool, which is a global resource that spins up the first time it's needed unless you've manually initialized it beforehand. async_std also follows this pattern. lazy_static/once_cell exist to make implementing it more straightforward.

I conjecture that Tokio doesn't do this because they consider it important both that it's possible to run multiple executors side-by-side and that you don't end up doing so (or even initializing the global runtime) unintentionally. It's a reasonable enough choice to make, especially since Tokio has abundant performance relevant configuration available. It's mainly just unfortunate that it's not easily apparent what functionality is tied to the executor, and it's not fully non-controversial; some Tokio-async-using libraries (especially those providing a blocking API) wrap their functionality in an as-needed lazily initialized runtime.

Additionally, being async-generic is different than being executor-generic. In a perfect world (syntax irrelevant), async fn would mean “I may .await in an executor-agnostic manner” and async(tokio) would mean “I may .await in a way that requires the Tokio executor” (and/or reactor). It's also worth noting that the primary reason async code becomes executor-specific is spawn. Dynamic dispatch to spawn is fine (the subsequent polling is necessarily dynamic already), if a bit involved (to support alloc-constrained environments[1]), something there's vague intents on doing (Context is separate from Waker to enable passing additional context), and meaningfully simpler than maybe-async or executor generics.

If it helps, you could think about fn as shorthand for async(never) fn. In fact, if “maybe async” uses syntax along the lines of async<A>, executor dependencies could be slotted right into that A as an option alongside never and agnostic. (But on the other hand, support for implicit / ambient resources is useful even in purely sync code.)

As a final note, while now_or_nevering a future which cannot return pending optimizes the same as being written sync when everything can be inlined, that's no longer the case when function calls aren't inlined for any reason, be that code size or just that it's from a different compilation unit. And async functions can't do direct recursion either.


  1. To make fn spawn(impl Future + 'static) dynamic, the simplest answer is fn spawn(Box<dyn Future>), but that enforces the use of Box-allocation. The dynamic implementation thus needs to look something more like (pseudocode) fn spawn(&mut Option<dyn Future + Any>) so the provider can take the value and allocate it however they want, including into type-specific machinery like with embassy. (The std panic machinery has something similar with dyn BoxMeUp.) ↩︎

5 Likes

The blog post Let futures be futures seems to make a great argument against maybe[async]. IMHO, the points there are pretty reasonable.

1 Like

While I do agree with essentially all of boats' points — code doing non-trivial work (anything you wouldn't want to happen during a singular Future::poll) can't meaningfully be async generic — a surprising volume of code[1] boils down to just reshaping and moving data from point A to point B. Such "pipeline" code is functionally pure except for whatever effects are involved in taking data from the source and in giving it to the destination. It's the same procedure whether it's in a fn(impl Iterator<In>, impl Fn(Out)) or async fn(impl Stream<In>, impl Sink<Out>) shaped box.

But the real "solution" to this case is that such code shouldn't be BLUE or RED or GREEN; it can and should ideally be written in a "colorless" sans-IO form (e.g. fn(&mut State, In) -> impl '_ + Iterator<Out>). Interestingly, it comes down to essentially external iteration ("here's more data") versus internal iteration ("give me more data") again, as it often seems most big-picture design questions do.
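As a small illustration of that "colorless" push-based shape, here is a hypothetical sans-IO line framer: callers feed it raw chunks and it yields complete lines, with all actual IO left to whatever loop drives it (sync read calls or an async Stream, unchanged either way):

```rust
// Hypothetical sans-IO protocol state: no reads, no writes, no .await,
// just "here's more data" in and structured items out.
struct LineFramer {
    buf: String,
}

impl LineFramer {
    fn new() -> Self {
        LineFramer { buf: String::new() }
    }

    // Push one chunk of input; return every line completed by it.
    fn push(&mut self, chunk: &str) -> Vec<String> {
        self.buf.push_str(chunk);
        let mut out = Vec::new();
        while let Some(pos) = self.buf.find('\n') {
            let line: String = self.buf.drain(..=pos).collect();
            out.push(line.trim_end_matches('\n').to_string());
        }
        out
    }
}
```

A blocking loop calls push after each read(); an async loop calls the exact same push after each stream item. The framer itself never needs a color.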

An unfortunate side effect of the "anything could block and rarely advertises that" fact is async sometimes being misused for only "poll safe (won't block/starve your executor)" functionality instead of also "(potentially) utilizes async concurrency." Any guarantee of compatibility with the now_or_never executor is imo abusing the async qualifier. The use isn't completely without merit — as OP notes, it is the straightforward way to implement "maybe async" — but one of the takeaways of "let futures be futures" is that async as a marker of concurrency paradigm is useful, and doing this dilutes that.

We don't need generalization over monadic effects in order for the monadic patterns (e.g. map/filter/reduce) to be available and useful on our monadish types (e.g. Result/impl Iterator/impl Future/etc). In fact, not generalizing allows names to be more specialized. It's not the end of the world if monadic transformers need to be rewritten across the different monadic effects, as the combinators themselves aren't that complicated, so long as "domain logic" is written to be minimally effectful and the glue code for applying it in a monadic pipeline is minimal, straightforward, and obviously self evident.


I do wonder if language support for resumable semicoroutines will help here, as it allows writing an impl with internal iteration but exposing an API for external polling. If you're only potentially async because of callbacks, not work you're doing yourself — the exact case where "maybe async" is usefully desirable — then, modulo the impl details, a "library style" poll providing events to an outer event loop is, generally speaking, a preferable API to the "framework style" inner loop with callbacks. It's just much more convenient to write internal iteration than state machines, thus the popularity of async/await sugar.

The "full sync world" equivalent would be putting the work loop into a worker thread and mutex switching between the parent and child thread via a blocking bidi message handoff channel. That should even be able to support borrowing cross thread, since it's functionally synchronous... something to play with later, I guess.
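That "full sync world" equivalent can be sketched with two std mpsc channels acting as the blocking bidirectional handoff. The worker runs the inner loop; the parent blocks on each response, so control ping-pongs between the threads and execution is functionally synchronous. The squaring "work step" is a placeholder:

```rust
use std::sync::mpsc;
use std::thread;

// The "inner loop" lives on a worker thread; the parent thread blocks on
// each handoff, so only one side is ever making progress at a time.
fn run_pipeline(inputs: Vec<i32>) -> Vec<i32> {
    let (req_tx, req_rx) = mpsc::channel::<i32>();
    let (resp_tx, resp_rx) = mpsc::channel::<i32>();

    let worker = thread::spawn(move || {
        for x in req_rx {
            // the work loop: one step per request
            resp_tx.send(x * x).unwrap();
        }
    });

    let mut results = Vec::new();
    for x in inputs {
        req_tx.send(x).unwrap();
        results.push(resp_rx.recv().unwrap()); // parent blocks until the worker answers
    }
    drop(req_tx); // closing the channel ends the worker's loop
    worker.join().unwrap();
    results
}
```

Because the parent is parked during each step, the worker could in principle even borrow data owned by the parent's frame, which is the cross-thread-borrowing idea floated above (scoped threads would be needed to convince the compiler of that).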


  1. There's roughly three "kinds" of code:

    • Moving data from providers to consumers, maybe restructuring, aggregating, splitting the data; "pipeline" code.
    • Coordinating task fulfillment, maybe in response to some sort of event; "dispatch" code.
    • Everything else; arguably "system" code (as in mechanism, not as in "low level"). This is of course a large category, but you could be surprised by the volume of code that fits cleanly into one of the first two categories.
    ↩︎
6 Likes

I disagree.

boats' post is largely focused on "intra-task concurrency". As they say themselves:

So it is exactly these combinators for intra-task concurrency that readiness-based futures were designed to optimize.

As Rain has compellingly argued in the past, “heterogeneous select is the point of async Rust.”

They then go on to talk about reqwest, which apparently uses async Rust to "more easily able to abstract over the difference between multiplexing requests to multiple connections with HTTP/1 and multiplexing requests to the same connection with HTTP/2."

So I cloned the reqwest repo and looked for instances of select!, join!, merge!, and zip!, as well as other related APIs like futures_util::future::*. I also recursed into hyper, h2, h3, and quinn, which together seem to provide the actual backend for HTTP requests.

What did I find? Well, most of the crates had no uses at all, excluding test and example code. The exception was hyper, which didn't use any of the macros, but did have some uses of futures_util::future::select.

Thus, those crates, far from demonstrating the unique abilities of async, are largely examples of code doing non-trivial work that could be meaningfully async-generic. For the code that did use future::select, making it support both async and sync would probably require the sync version to use some kind of wrapper that creates threads for you. You might end up with a few threads per connection that don't really need to exist. But it would hardly be prohibitive.

From my perspective the biggest advantage of the async ecosystem is not intra-task concurrency but cancellation. I haven't looked into how these crates handle cancellation, but I think they do take advantage of straightforward async cancel propagation. While there are obvious sync analogues for channels and mutexes and to some extent even futures, there's no analog for cancellation. There could be a norm in the sync world that all work should be cancellable, not in a crazy abrupt way like pthread cancellation, but with a way to request cancellation that causes all blocking operations to start returning errors. OS kernels tend to work this way, because they're nonstandard environments that can invent their own synchronization primitives, and because you need to be able to kill processes even if they're in the middle of some kernel function that's blocking on a condition or something. But in typical Rust environments, no such norm exists.

I can imagine that if there was investment in making maybe-async work – in other words, investment in making a suite of APIs that are fully compatible between async and sync mode rather than just mostly compatible – then perhaps a bubble in the sync ecosystem could arise where there is a norm of cancellability. C++ recently added std::jthread and std::stop_token which are trying to do something similar. When operating in such a bubble, you would have to deal with the risk of accidentally calling a blocking function that doesn't support cancellation, but that's no worse than the risk of accidentally calling a blocking function in async code.
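A rough Rust analogue of that C++ std::stop_token idea is a shared flag that blocking wrappers check cooperatively. This is a sketch of the imagined norm, with hypothetical names throughout; real blocking OS calls would additionally need an interruption mechanism, which is exactly the hard part noted below:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical cancellation token, loosely analogous to C++ std::stop_token:
// purely cooperative and flag based.
#[derive(Clone)]
struct StopToken(Arc<AtomicBool>);

impl StopToken {
    fn new() -> Self {
        StopToken(Arc::new(AtomicBool::new(false)))
    }
    fn request_stop(&self) {
        self.0.store(true, Ordering::Relaxed);
    }
    fn stop_requested(&self) -> bool {
        self.0.load(Ordering::Relaxed)
    }
}

// Under the imagined norm, every blocking operation takes a token and
// starts returning errors once cancellation is requested.
fn cancellable_read(token: &StopToken) -> Result<Vec<u8>, &'static str> {
    if token.stop_requested() {
        return Err("operation cancelled");
    }
    Ok(vec![42]) // stand-in for actual blocking IO
}
```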

...But that's probably just a dream. After all, if blocking OS APIs don't themselves support cancellation, then the Rust wrappers for those APIs couldn't support it either, at least not without making their implementations much more complex and probably slower. And at that point, it's easy to argue that you might as well go all the way to async.

Personally, I just can't be happy with that solution, because there are just too many dissatisfying things about Rust's async design that make me never want to use it (debugger support, Pin, ABI stability / FFI compatibility, compile times, generated code quality, ease of accidentally failing to poll futures). But I digress.

3 Likes