Zig Colourless Async in Rust

A well-known problem of async in many languages, including Rust, is function colouring. The crux is that async and blocking code are not compatible with one another, leading to an ecosystem split between sync and async libraries. The Zig language's async system "solves" function colouring by parametrizing all code across blocking and async modes of execution. This post details Zig's solution. I want to explore how Zig's solution can apply to Rust.

Zig uses special keywords for async functions. The async keyword is used to call async functions and runs the function until it reaches its first await/suspension point. Unlike Rust, async function calls in Zig use different syntax from sync calls, which is an important point we'll touch on later. Zig also has await, which, like Rust, suspends the current async function to wait for an expression to complete. An async call in Zig looks like this:

fn foo() void {
  var frame = async read_socket(); // The frame is the "future" of the read_socket call
  var msg = await frame;  // Awaits the frame (Rust equivalent is frame.await)
}

This code makes sense if read_socket is async, but it also works if read_socket is sync. In that case, the async line blocks on read_socket as if it were called normally, and the await line returns the message from the socket without needing to suspend foo. foo also changes from an async to a sync function. The same thing applies to "synchronous" function calls. In Zig, a sync function call is simply written as read_socket(), which behaves like await async read_socket() if read_socket is async. This also makes foo async. Since the behaviour of all Zig code is defined for both sync and async environments, Zig provides a compile-time switch called io_mode that sets all I/O primitives as async or blocking. As a result, the whole library is compiled as either async or blocking, allowing the same codebase to support both "colours".

Zig's secret sauce for eliminating colouring is the fact that its code is generic across async and sync APIs. In Rust, the usual answer for this is effect systems, a well-explored topic on this forum. A simpler way to abstract across sync and async functions is to model both as special cases of generators. Sync functions can be thought of as generators that never yield; async functions are generators that keep yielding until completion. More specifically, sync and async functions can be modeled as functions that return generators. For example, the following functions:

fn foo(n: u32) -> u32 {
  n
}

async fn foo_async(n: u32) -> u32 {
  tokio::time::sleep(std::time::Duration::from_secs(1)).await;
  n
}

would desugar into:

fn foo(n: u32) -> impl Generator<Return = u32, Yield = ()> {
  move || { n }  // This is a generator, not a closure; `move` lets it own n
}

fn foo_async(n: u32) -> impl Generator<Return = u32, Yield = ()> {
  move || {
    let fut = tokio::time::sleep(std::time::Duration::from_secs(1));
    // This loop is the desugared await
    loop {
      if let Completed(val) = fut.resume(...) {
        break val;
      }
      yield;
    }
    n
  }
}

After desugaring, both foo and foo_async return generators with identical signatures, so it's possible to interact with both functions with the same surrounding code. However, there is one caveat regarding function calls. The desugared return type works with foo_async(), which returns a future that can be represented as a generator. However, the sync call foo() returns u32, which is the output of the generator, not the generator itself. The problem is that Rust's async and sync calls are semantically different operations sharing the same syntax. Async calls return the "frame" of the function without executing the body, while sync calls execute the function to completion. The solution is to separate "async calls" and "sync calls" into distinct language constructs, like Zig does. In Zig, async foo() always returns a frame that will be awaited later, even in sync contexts; foo() always drives the function to completion, even in async contexts. For this post, I will use async foo() for async calls and completion foo() for sync calls.
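To make the generator model a bit more concrete, here is a minimal sketch that runs on stable Rust. Generators are unstable, so `Step`, `MiniGen`, `SyncFoo`, `AsyncFoo` and `drive` are all hypothetical names invented for this illustration, and the timer is simulated with a tick counter. It is not what the compiler actually emits; it only shows that one piece of surrounding code can drive both shapes.

```rust
/// Result of resuming a generator once (hypothetical stand-in for the
/// unstable generator state type).
#[derive(Debug, PartialEq)]
enum Step<R> {
    Yielded,
    Completed(R),
}

/// Hypothetical minimal generator interface: resume until completion.
trait MiniGen {
    type Return;
    fn resume(&mut self) -> Step<Self::Return>;
}

/// Models `fn foo(n)`: a generator that never yields.
struct SyncFoo {
    n: u32,
}

impl MiniGen for SyncFoo {
    type Return = u32;
    fn resume(&mut self) -> Step<u32> {
        Step::Completed(self.n)
    }
}

/// Models `async fn foo_async(n)`: yields while the simulated timer is
/// pending, then completes.
struct AsyncFoo {
    n: u32,
    ticks_left: u32,
}

impl MiniGen for AsyncFoo {
    type Return = u32;
    fn resume(&mut self) -> Step<u32> {
        if self.ticks_left > 0 {
            self.ticks_left -= 1;
            Step::Yielded
        } else {
            Step::Completed(self.n)
        }
    }
}

/// The same surrounding code drives both kinds of generator.
/// Returns the result and how many times the generator yielded.
fn drive<G: MiniGen>(mut generator: G) -> (G::Return, u32) {
    let mut yields = 0;
    loop {
        match generator.resume() {
            Step::Completed(value) => return (value, yields),
            Step::Yielded => yields += 1,
        }
    }
}
```

With these definitions, `drive(SyncFoo { n: 7 })` completes on the first resume, while `drive(AsyncFoo { n: 7, ticks_left: 3 })` yields three times before producing the same value.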

With the separation of async and sync calls into different operations, we have 3 async-related language constructs: fut.await, async foo() and completion foo(). async foo() desugars to foo(), which returns the function's frame/generator. fut.await desugars into something like:

loop {
  if let Completed(val) = fut.resume(...) {
    break val;
  }
  yield;
}

Note that if fut is a "sync generator" that never yields, fut.await also won't yield. Lastly, completion foo() desugars to (async foo()).await. Like with await, the expression is only async if foo is async.
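As a rough illustration of how asynchrony would propagate through `completion`, the sketch below hand-writes the state machine for a caller whose body is `(completion inner()) + 4`. All names (`Step`, `MiniGen`, `Immediate`, `Delayed`, `AddFour`, `run`) are hypothetical and only model the idea on stable Rust; they are not part of any real API.

```rust
enum Step<R> {
    Yielded,
    Completed(R),
}

trait MiniGen {
    type Return;
    fn resume(&mut self) -> Step<Self::Return>;
}

/// A "sync" leaf: completes on the first resume.
struct Immediate(u32);

impl MiniGen for Immediate {
    type Return = u32;
    fn resume(&mut self) -> Step<u32> {
        Step::Completed(self.0)
    }
}

/// An "async" leaf: yields `pending` times before completing.
struct Delayed {
    value: u32,
    pending: u32,
}

impl MiniGen for Delayed {
    type Return = u32;
    fn resume(&mut self) -> Step<u32> {
        if self.pending == 0 {
            Step::Completed(self.value)
        } else {
            self.pending -= 1;
            Step::Yielded
        }
    }
}

/// Models a caller whose body is `(completion inner()) + 4`.
/// Each `resume` runs one iteration of the desugared await loop.
struct AddFour<G> {
    inner: G,
}

impl<G: MiniGen<Return = u32>> MiniGen for AddFour<G> {
    type Return = u32;
    fn resume(&mut self) -> Step<u32> {
        match self.inner.resume() {
            Step::Completed(value) => Step::Completed(value + 4),
            // The caller only yields when the callee yields.
            Step::Yielded => Step::Yielded,
        }
    }
}

/// Resumes `generator` to completion, counting how often it yielded.
fn run<G: MiniGen>(mut generator: G) -> (G::Return, u32) {
    let mut yields = 0;
    loop {
        match generator.resume() {
            Step::Completed(value) => return (value, yields),
            Step::Yielded => yields += 1,
        }
    }
}
```

Wrapping `Immediate` in `AddFour` never yields, and wrapping `Delayed` yields exactly as often as `Delayed` does, so the caller inherits the asynchrony of its callee without any change to its own code.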

Since we already model both sync and async functions as generators, and the async language constructs all desugar into code that works with generators, our language constructs work on both sync and async functions. Furthermore, for constructs like await and completion, the code's asynchrony depends on the asynchrony of the functions it's calling. This allows the asynchrony of an entire application to be determined by the asynchrony of its I/O primitives, since asynchrony propagates all the way up the call stack until main, which cannot be async. As such, main is responsible for driving the entry point of the application, which may or may not be async, to completion. Also note that the compiler can determine the asynchrony of a function statically by checking whether it's possible for the function to yield. If a function contains an await or completion for any async function calls, then it is marked as async even if it never actually awaits. This allows sync functions to be optimized into regular calls instead of generator state machines. Additionally, main can determine whether the application's entry point is sync or async at compile time and drive the function differently using that information (e.g. throw the function on Tokio if it's async, otherwise just do blocking execution).

I'm still not sure how to set the asynchrony of I/O primitives. Zig uses a global switch, but that requires their standard library to provide sync and async versions of I/O functions. Doing this in Rust means merging async-std into std, which is a big change. Also, I don't know how executor-dependent APIs like tasks will work under this system (what's a synchronous version of tasks? Threads?). On a related note, some code patterns just don't work in both sync and async contexts, even if they do compile. For example, something like join!(foo(), bar()) will likely call foo and bar sequentially if both functions are blocking, which may lead to deadlocks if the code was written with async in mind. Zig treats this as a fact of life, which may not be OK for us. Lastly, my proposed distinction between sync and async function calls will break almost all existing async code, so it can only be introduced in a new edition.
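The join! concern can be sketched with plain std futures. Below, a hand-rolled two-way join (a stand-in for join!, not the real macro) polls two tasks in turn. `YieldOnce`, `task`, `join_order` and the `cooperative` flag are hypothetical names for this illustration; the flag decides whether a task suspends between its two steps or runs straight through, as blocking code would.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the loop below simply polls again.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that returns `Pending` once, then `Ready`.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<String>>>;

/// Two-step task. When `cooperative` is false it never suspends,
/// which models a blocking implementation.
async fn task(name: &'static str, cooperative: bool, log: Log) {
    log.borrow_mut().push(format!("{name}:start"));
    if cooperative {
        YieldOnce(false).await;
    }
    log.borrow_mut().push(format!("{name}:end"));
}

/// Polls both tasks in turn until both finish and returns the event order.
fn join_order(cooperative: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut a = Box::pin(task("foo", cooperative, log.clone()));
    let mut b = Box::pin(task("bar", cooperative, log.clone()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let (mut a_done, mut b_done) = (false, false);
    while !(a_done && b_done) {
        if !a_done {
            a_done = a.as_mut().poll(&mut cx).is_ready();
        }
        if !b_done {
            b_done = b.as_mut().poll(&mut cx).is_ready();
        }
    }
    let events = log.borrow().clone();
    events
}
```

With `join_order(true)` the two tasks interleave (both start before either ends), whereas `join_order(false)` runs foo to completion before bar even starts. If foo were waiting on something only bar can produce, the second ordering would never make progress.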

First of all, the original article that coined colours of functions described a few different issues with them, and the problems are not equal. In my understanding the main two points were:

  1. Hard split between colors of code, where functions of one color can't call functions of the other color (I assume in reference to sync JS being unable to wait for an async result).

  2. Sync and async calls use a different syntax.

In Rust, problem #1 is mostly solved. Async runtimes can provide block_on(async) and spawn_blocking(sync) to bridge the two. Generically abstracting over sync and async methods may be cumbersome, but it is doable if you really need it.
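For reference, the sync-to-async half of that bridge needs nothing beyond std. This is only a minimal sketch of what a `block_on` can look like, following the pattern shown in the `std::task::Wake` documentation; real runtimes do considerably more. `ThreadWaker`, `double` and `sync_caller` are names made up for the example.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread by unparking it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal `block_on`: polls the future on the current thread and parks
/// the thread until the waker fires.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

async fn double(n: u32) -> u32 {
    n * 2
}

/// A synchronous function calling async code through the bridge.
fn sync_caller() -> u32 {
    block_on(double(21))
}
```

The cost is that the calling thread is occupied for the whole duration of the future, which is fine at the top of a sync program but not inside another executor's worker thread.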

And problem #2 is IMHO not a problem at all. For example, languages with unchecked exceptions are "colorless" over errors. You call throwing and non-throwing functions the same way, using the same syntax. Is that a good thing? Rust disagrees. Rust users appreciate the explicit (colored) try operator ? and the Result return type.

In the same sense explicit .await in Rust is not a limitation, but a feature. Presence of a suspension point in a function is significant, because there's a return statement behind it!


https://without.boats/blog/the-problem-of-effects/

Using block_on and spawn_blocking to mix sync and async gets hairy when you have call stacks of sync-async-sync or async-sync-async calls. In both cases you end up with an I/O call blocking another thread due to spawn_blocking, which could have been avoided if sync and async weren't mixed. Also, spawn_blocking isn't feasible in no_std environments and on single-threaded executors. So while it is possible to mix colours in Rust, doing so is far from ideal.

I agree that #2 is not a problem on its own. The problem is that, because sync and async code must be written differently, you end up with sync and async versions of the same APIs (e.g. reqwest::blocking and reqwest). Code that composes over these I/O APIs must in turn choose either the sync or async version, leading to an ecosystem split. This problem also shows up in error-land where libraries need to export try_* versions of their methods to compose over fallible functions as well as infallible functions. Colouring is not as big of an issue for fallibility because the Rust ecosystem decided that any library API that can fail should return a Result instead of panicking (with allocation failures being a notable exception). This essentially paints all fallible APIs with the same colour, sidestepping the colour mixing issue. The async equivalent would be to make all I/O libraries and APIs async (not a bad idea IMO).

My proposed syntax does preserve explicit suspensions. The only statements that can yield are .await and completion.

I've linked to withoutboats' post, because they make a case that fallibility and asynchrony are both an "effect", and have the same kinds of "coloring" problem in Rust.

Result splits functions into "fallible" and "infallible", and you have regular APIs, and try_ APIs. If you have an infallible function and you want to make it fallible, you need to change the return type, which in turn requires all your callers to change how they call that function. The callers may become fallible too, or use .unwrap().

Async is very similar. It splits functions into "sync" and "async", and you have two kinds of APIs. If you have a sync function and want to make it async, you need to change the return type, which requires all your callers to change how they call that function. The callers may become async too, or use block_on().

And to stretch this analogy further, the Zig solution is to make everything possible to use as async. This is similar to auto-wrapping function results in Ok (Result<T, Infallible>), so that you can handle all fallible and infallible functions with ? and try_ APIs.
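The fallibility side of the analogy can already be written by hand in today's Rust. In this sketch the infallible function returns `Result<u32, Infallible>`, and the caller uses `?` on it exactly as it does on the fallible one. `AppError`, `double`, `parse` and `parse_and_double` are hypothetical names for the example; the `From<Infallible>` impl is the manual step that an "auto-wrapping" language feature would presumably take care of.

```rust
use std::convert::Infallible;
use std::num::ParseIntError;

#[derive(Debug)]
enum AppError {
    Parse(ParseIntError),
}

impl From<ParseIntError> for AppError {
    fn from(err: ParseIntError) -> Self {
        AppError::Parse(err)
    }
}

/// `Infallible` has no values, so this conversion can never actually run.
impl From<Infallible> for AppError {
    fn from(never: Infallible) -> Self {
        match never {}
    }
}

/// An infallible function whose result is wrapped in `Ok`.
fn double(n: u32) -> Result<u32, Infallible> {
    Ok(n * 2)
}

/// A genuinely fallible function.
fn parse(s: &str) -> Result<u32, ParseIntError> {
    s.parse()
}

/// The caller uses `?` uniformly for both kinds of callee.
fn parse_and_double(s: &str) -> Result<u32, AppError> {
    let n = parse(s)?;
    let doubled = double(n)?;
    Ok(doubled)
}
```

If `double` later becomes fallible, only its error type changes; `parse_and_double` keeps compiling as long as `AppError` can be built from the new error.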

Yeah fallibility and asynchrony being parallels of each other makes sense. The Zig async solution, applied to fallibility, would involve making error-primitives like ? work with fallible and infallible functions and allow all code to be written with proper error-handling idioms even when they're composing infallible functions. My proposal only tackles asynchrony because I believe that the async colouring problem is much bigger than the result colouring problem in terms of ecosystem impact. IMO fallibility colouring problems don't come up enough to warrant a fix, but async does.

withoutboats' post mentions effect systems and TCP-preserving closures (TCP here being Tennent's Correspondence Principle, not the network protocol) as solutions for a special case of the colouring problem. I don't think Zig's approach is an effect system since it doesn't really modify the type system. It does achieve the same result as TCP closures by allowing a closure's asynchrony to be inherited by its caller, and it doesn't require inlining.

It's worth noting that this was discussed in the await bikeshed; this is effectively "implicit await, explicit async".

In that model, if you want to immediately synchronously call a function, it's foo(). If you want to get a thunk to evaluate it later, it's || foo(). This works identically for sync or async code.

I honestly quite like this design. It's very similar to what Kotlin uses, and I think near optimal for a "scripting target" language. I really would love to see a language emerge that uses this model pervasively.

But it's not the right design for Rust. First of all, Rust is a "systems target" language, which means that the behavior of the program should be much more predictable than it could acceptably be in a scripting language. More importantly, though, .await has a significant impact on borrow checking and what you can do with a thunk (closure, future), so it makes sense that this is explicitly marked and different from pure sync code.

Inasmuch as you have a single global switch between async and sync, yes. But only because there's a single global switch; as soon as you have both modes in the same program, you have the problem of code being one "color" or the other, even if using them is the exact same syntax (since they'll still need to be codegen'd at least twice).

Again, if there's a single global switch, yes. But if there's not, you have to monomorphize for any mix of sync and async you end up using.

Such environments can implement this as just "run the closure". There's no way to do work concurrently anyway, so your options are to ban sync operations entirely, or to allow them and just block the executor. There's no magic that can do sync operations without blocking concurrency, other than blocking using the OS parallelism primitive (threads).

I think maybe the disconnect here is that you expect a way to swap the entire program to be fully async or fully sync. That's not really a good option for Rust; a key ability of Rust async is the ability to mix sync and async code. (Plus, even if you swapped every std blocking function for a nonblocking version transparently, you still have code fully able to block, whether on an expensive CPU bound computation or just making their own calls into blocking FFI/OS routines directly.)

What's your opinion on Zig, a system programming language meant to replace C, adopting this async model?

In my proposal the only suspension points in async code are await and completion (in the "implicit await" model, this is foo()). await is already an explicit marker. The problem is with completion, since it looks exactly like a normal function call even though it can suspend. If we mark completion with more explicit syntax to indicate that it can suspend, it would solve the problem of predictability. Going this route will require all sync calls to be marked the same way, which is a massive back-compat hazard.

Also, what impact does await have on the borrow-checking of the containing function? Can't async functions borrow fine across await points?

With regards to monomorphization, Zig's approach acts like an extra type parameter. I guess the fact that the code is generic across sync and async I/O does make it sound like an effect system. I'm in favour of having a global switch because it makes all I/O monocolour. Doing so avoids mixing sync and async in the same program, which isn't ideal.

In this situation I'd use the global switch to make all the I/O sync or async. This is equivalent to banning all sync operations in async context (and vice versa).

Why do we want to mix sync and async code (by sync I mean blocking synchronous I/O)? The workarounds used to mix sync and async I/O aren't ideal and people generally recommend against mixing sync and async. You are right that blocking code will always exist in async contexts, so function colouring is an inescapable problem, but there's still value to a solution that eliminates most cases of the issue.

By the way do you have a link to the thread about "implicit await, explicit async"?

I haven't written/seen enough Zig to have a developed opinion. If what you've said is accurate, though, Zig runs the sync function synchronously at the point of async call() (runs to first await point), so awaiting it is a no-op. In my hypothetical model, both synchronous and asynchronous code would be equally deferred when you create the thunk, and only evaluated once you evaluate them (notably, using e.g. join([f1, f2]) to evaluate multiple thunks concurrently).

My hypothetical runtime would also transparently push work into worker threads if a blocking call had to be made. Ideally, it would even shift coroutine threads into the spawn_blocking thread pool if they don't yield quickly enough, with all the management complexity that brings. The caller should not be able to tell the difference between a function implemented via blocking OS calls or expensive computation; they should be functionally identical to cooperative multitasking (which should be the default).

I did say this is a scripting language fantasy, not a systems language fantasy :slightly_smiling_face:

Because it is. Just one where you've artificially limited it to a single choice at compilation time (which I should also note prevents any dynamic linking between libs that made incompatible choices (which I guess is no worse than the Rust status quo of all of the compiler flags have to be exactly equal?)).

This is unavoidable, though; there is no consumer OS which provides an async option for every synchronous call. Even if there were, you still need to support OSes where they don't, or even perhaps just the last version of the OS from before everything could be async that is still in support. (A major feature draw of Rust making it suitable for performance sensitive work is that it doesn't really have a runtime (any more than C does, anyway). That's actually a major selling point of our async design: that you can bring your own executor. Perhaps your executor is specially tuned to your hardware and beats the standard ones; we shouldn't preclude you using it.) Plus, this doesn't address locks at all; especially that some locks might want to be released during suspension, and some locks might need to be held over suspension points.

This is the main thread, and there's this other thread as well. Unfortunately, though, much of the discussion is scattered around the entire async/await bikeshedding, so there's a lot lost inside all of that.

It's also potentially worth checking out Kotlin's suspend fun/[semi]coroutines documentation.

Does it not? What does Zig do when you have an async-enabled program, and use the same function foo() in both sync and async contexts?

If it always compiles foo()'s body as async, then for a truly sync call (e.g. from a C callback) it would need to generate some extra wrapper function that creates an ad-hoc event loop and drives it to completion for foo(). Such runtime-dependent thunks are quite "magic" for a systems language.

Or if it compiles foo() twice, once with a state machine transformation, and once without it, then that sounds like Rust's monomorphising generics with a hidden <A: AsyncCaller> type parameter.
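The "hidden type parameter" reading can be written out explicitly in current Rust. This is only a sketch under invented names: `IoMode`, `Blocking`, `NonBlocking`, `SuspendOnce` and `count_suspensions` are hypothetical, and the "I/O" is faked. The point is that one source definition of `foo` gets compiled once per mode, and only the non-blocking copy ever suspends.

```rust
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that suspends once before producing its value.
struct SuspendOnce {
    value: u32,
    suspended: bool,
}

impl Future for SuspendOnce {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.suspended {
            Poll::Ready(self.value)
        } else {
            self.suspended = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hypothetical stand-in for the hidden "async caller" parameter.
trait IoMode {
    type Read: Future<Output = u32>;
    fn read() -> Self::Read;
}

struct Blocking;
struct NonBlocking;

impl IoMode for Blocking {
    // A future that is ready on the first poll, i.e. it never suspends.
    type Read = Ready<u32>;
    fn read() -> Self::Read {
        ready(1)
    }
}

impl IoMode for NonBlocking {
    type Read = SuspendOnce;
    fn read() -> Self::Read {
        SuspendOnce { value: 1, suspended: false }
    }
}

/// One source definition; the compiler emits one copy per `M`.
async fn foo<M: IoMode>() -> u32 {
    M::read().await + 4
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` to completion and counts how many polls were `Pending`.
fn count_suspensions<F: Future>(future: F) -> (F::Output, u32) {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut suspensions = 0;
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return (output, suspensions),
            Poll::Pending => suspensions += 1,
        }
    }
}
```

Here `foo::<Blocking>()` finishes on its first poll and `foo::<NonBlocking>()` suspends once, yet both return the same value from the same source text.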

That's true. Changing a function from async to sync changes the monomorphization of the entire call tree. Note that changing between different async functions also produces different monomorphizations. Even without my proposal you can still end up with multiple monomorphizations of the same function when composing over different async calls.

As long as it's impossible for foo() to suspend, it will always compile to a sync function, regardless of whether it's called from sync or async context. The only way for there to be two monomorphizations of foo() is if it takes a Fn parameter and it is called with both sync and async functions. Same thing would happen if foo() is called with two different async functions.

Under my proposal, sync functions are semantically equivalent to async functions that don't suspend. The fact that sync functions don't compile into state machines is an optimization detail, not a facet of the type system. The type of a sync function is no more different from an async function than an async function is from another async function. Put another way, a sync function is a concrete type that implements Future. Therefore I don't see it as an effect system.

Yeah locks are IMO a major weakness of Zig's async system, since sync locks mix badly with async. The only solution I can think of is to extend the asynchrony effect to all locks that cross await points, by turning them into async-aware locks in async context. This is admittedly hacky.

Also true now that I think about it. All major async runtimes that support file APIs use spawn_blocking with the sync std::fs calls for this exact reason. However, many other operations can be done both synchronously and asynchronously. Even if there are some cases where mixing is necessary, there's still value in solving the colouring problem for the other cases (e.g. TCP, channels). Being able to unify both colours for most operations and mix colours when necessary would be ideal.

The linked proposals don't address the colouring problem. They instead aim to replace await with sync function call syntax. This is a key similarity (and shared weakness) between the linked proposals and mine. However, my proposal also preserves await and "ports" it to sync contexts. This means that async code can still be written using explicit await, even if it's not necessary, and the code would still work in sync contexts. This addresses the biggest concern raised with the linked proposals, which is that suspension points are implicit. Perhaps we can recommend I/O libraries to use await instead of sync call syntax, solving both the colouring and implicit async problems.

But isn't an async function a function that returns a Future, rather than directly implementing it? Also, how would you represent closures under this proposal? Sync closures implement a well defined type, but async ones don't because the return type is different for most of them.

Also what about a function that always needs to be sync (i.e. never yield, for safety reasons) and takes a closure? Can that closure be async?

It seems like the next "step" in colorless/bicolor async would have to at least have that

  • Sync code can continue calling previously sync functions
  • Async code can continue calling previously sync functions (with a warning!)
  • Async code can call into async versions of previously sync-only functions

As a sketch:

// foo will be monomorphized for sync code and for async code.
async? fn foo() -> usize {
    let n = something_else().await; // something_else must be <async?>
    n + 4
}

fn sync_foo() {
    let _n0: usize = foo(); // works fine
    let _n1 = foo().await; // error
    ...
}

async fn async_foo() {
    let _n0: usize = foo(); // works fine but calls the sync version, warns.
    let _n1 = foo(); // doesn't work! We don't know if n1 is a usize or a Future. Inference should make this rare.
    let _n2 = foo().await; // works fine, we know we want a Future.
    ...
}

I'd say if a function is async? it should be fine to call block_on(foo()), it's not going to have any branching tree of execution. So join! needs to call into some pure async function. This would mean that

  • async? functions can call sync and other async? functions
  • async functions can call sync, async, and async? functions
  • sync functions can call async? and other sync functions.

IMO this is not true. You've left out the passing of Context/Waker callback objects down async calls. Effectively, in case of an async function, the Generator has a different R (resume) type, a Context instead of (). That is, if you wanted colorless functions, you'd probably have to make sync functions-generators accept a Context resume object as well (and just not use it), or change the way wakeup is done completely... and I don't think either of that is realistic.
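A small std-only sketch of that difference: the suspending future below has to reach into the `Context` to arrange its own wake-up, while a future that is immediately ready never touches it. `CountingWaker`, `SuspendOnce` and `wakes_requested` are hypothetical names for this illustration, and calling the waker straight away stands in for a real event source.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that records how many times it was invoked.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Future that suspends once and uses the `Context` to ask for a re-poll.
struct SuspendOnce(bool);

impl Future for SuspendOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            // Without this call nothing would tell the executor to poll again.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Polls `future` to completion and returns how many wake-ups it requested.
fn wakes_requested<F: Future>(future: F) -> usize {
    let counter = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    while future.as_mut().poll(&mut cx).is_pending() {}
    counter.wakes.load(Ordering::SeqCst)
}
```

`wakes_requested(std::future::ready(5))` reports zero wake-ups, while each `SuspendOnce` awaited contributes one. A "sync generator" would have to accept the same `Context` argument and simply ignore it.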

I don't know how Zig does this and I remember being disappointed by that blogpost back when it was published, as it doesn't go into these important details at all.

Okay, so I've looked into Zig and looks like basically they're going the C++ way and using LLVM coroutine support to implement async functions. This means that
Edit: Looks like this was changed / they don't use LLVM coroutines any more. Some of the points below may not apply.

  • Each async fn needs an allocation of its coroutine object, called "frame" in Zig, and the language relies on compiler magic to elide these allocations where possible (which is not always).
  • Suspension is done by accessing this frame object via compiler intrinsics and passing it to some event-loop machinery as a pointer.

This is very different and a lot more heavy-weight compared to Rust's approach where all a Future needs to function is to be passed a Waker object reference, which is lightweight, designed to be agnostic to the runtime and it's just a regular function parameter, no compiler intrinsics or magic needed. This also makes Futures a lot more portable between compilers. It's bad enough already that we have to deal with the exception handling machinery to implement panics (I've been bitten by this).

This is why I don't like those "Colored functions are bad!" blogposts, as they present colorless functions as a clear win without explaining the tradeoffs and costs. I suppose in runtime-rich languages this is a lot more clear cut, as those costs are largely already paid for (and to be honest I'm not sure what the point of async/await is in JavaScript), but this is not the case of Rust and I believe having colored functions actually ends up being less trouble.

Pretty much the only doubt I have about Rust's approach to async is how well it'll fare with io_uring and completion-based APIs in general (as opposed to readiness-based APIs such as epoll). This is an area of active development, so I suppose we shall see.

I believe they migrated away from LLVM coroutines in 2019: rework async function semantics by andrewrk · Pull Request #3033 · ziglang/zig · GitHub

More information here: The Coroutine Rewrite Issue · Issue #2377 · ziglang/zig · GitHub

It seems they now attach a "result location" to each expression.

Ah, interesting. So if I understand right, the way they do it is kind of similar to Rust except they use one type for several things and the pointer is passed down the call chain kind of out of band...

From what I understand Zig async doesn't poll using a Waker (when Zig coroutines suspend they can communicate directly with the global environment, which is how wakeups are coordinated). That's why Zig can unify the generators. Unfortunately this, as you said, doesn't actually work for Rust. This means sync functions can't actually desugar into Generators like async functions. It should still be possible to unify sync and async functions on the syntax level, and have different monomorphizations.

Edit: Actually, if you make sync function-generators accept an unused waker, you can just have the compiler optimize the waker out. Every call site of resume for sync functions would be just like a normal sync call. The compiler probably already does this for non-suspending async calls. If the compiler can guarantee this optimization it can also keep the ABI of sync functions the same as before. While the use of a specialized ABI for non-suspending generators may appear strange, Rust already does this with Option-like enums. The only situation where the waker optimization can't be applied is when the sync generator is inside a trait object, which shouldn't happen too often.