On why await shouldn't be a method

No, it really cannot, because shallow coroutines are a fundamental part of the design of async/await (and indirectly, of Rust in general, but that's a bigger conversation).

Theoretically, Rust could introduce a completely new system, like Scheme's call-with-current-continuation (or some sort of effect system), and then in that case we could implement await as a method.

But even then, that would be a new system; the old async/await system would still behave in the old way (because of backwards compatibility).

And it seems really weird to suggest that we implement a trait method now (even though it doesn't behave at all like a trait or a method), just because hypothetically 5 years from now we might get some fancy systems that make trait methods possible.

No, because .await is far less magic. It's just a keyword. It's not any more magic than return or break. It's easy to explain, and easy to understand.

But a trait method involves huge restrictions and/or massive compiler magic to support (and likely completely new features which haven't even been RFC'd yet). It's incredibly difficult to explain, especially with regard to all the restrictions.

I agree, effect systems are very cool. But you seem to be misunderstanding something: the Rust community has been waiting for async/await for literally years.

There is strong pressure from important projects that want to use async/await, and there are deadlines to meet.

We're not going to wait for 5 years to debate a complex effect system and then implement async/await.

And that's assuming that an effect system would even be accepted (it likely wouldn't, due to the massive ramifications throughout the entire language).

That's not true. The .await operator inserts a loop and creates a branch in the state machine. A method cannot do that.

By its very nature it must affect the control flow, because of the fact that it needs to yield.

This is not just some implementation detail, it's a fundamental part of how async/await works (just like how return is a fundamental part of how ? works).

It's also not the same as synchronous blocking, because .await actually transforms the control flow of the async function (unlike synchronous blocking, which is just an ordinary function call, which doesn't affect the caller).
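To make the "loop plus branch" claim concrete, here is a minimal sketch of what a hand-written equivalent of `async { first.await + second.await }` might look like, using only the standard library. The names `Countdown`, `Sum`, `NoopWake`, `drive` and `sum_demo` are hypothetical and chosen for illustration; the code the compiler actually generates differs in detail.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: Pending for `remaining` polls, then Ready(value).
#[derive(Clone, Copy)]
struct Countdown {
    remaining: u32,
    value: i32,
}

impl Future for Countdown {
    type Output = i32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i32> {
        if self.remaining == 0 {
            return Poll::Ready(self.value);
        }
        self.remaining -= 1;
        cx.waker().wake_by_ref(); // ask to be polled again
        Poll::Pending
    }
}

/// Hand-written stand-in for `async { first.await + second.await }`.
/// Each await point becomes one variant (a branch in the state machine).
enum Sum {
    AwaitingFirst { first: Countdown, second: Countdown },
    AwaitingSecond { first_value: i32, second: Countdown },
}

impl Future for Sum {
    type Output = i32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i32> {
        // The loop lets us fall through to the next state without returning.
        loop {
            let next = match &mut *self {
                Sum::AwaitingFirst { first, second } => match Pin::new(first).poll(cx) {
                    // The implicit return hidden behind an await.
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(first_value) => Sum::AwaitingSecond {
                        first_value,
                        second: *second,
                    },
                },
                Sum::AwaitingSecond { first_value, second } => {
                    return match Pin::new(second).poll(cx) {
                        Poll::Pending => Poll::Pending,
                        Poll::Ready(v) => Poll::Ready(*first_value + v),
                    };
                }
            };
            *self = next;
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` to completion; returns (number of Pending results, output).
fn drive<F: Future + Unpin>(mut fut: F) -> (u32, F::Output) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut pendings = 0;
    loop {
        match Pin::new(&mut fut).poll(&mut cx) {
            Poll::Ready(out) => return (pendings, out),
            Poll::Pending => pendings += 1,
        }
    }
}

fn sum_demo() -> (u32, i32) {
    drive(Sum::AwaitingFirst {
        first: Countdown { remaining: 1, value: 2 },
        second: Countdown { remaining: 2, value: 3 },
    })
}
```

Calling `sum_demo()` drives the future by hand: every `Pending` it counts is a point where `poll` really returned to its caller, which is something an ordinary method call cannot do on behalf of the function that calls it.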

await could just as easily be a keyword inside an .await() construct. But if such a construct is adopted, there's less need to make it a keyword in the first place; keeping it a normal identifier would let it be subjected to ordinary name-resolution rules.

I'm not sure what you mean by this. If you mean that it would be a method that expands to compiler magic, I explained a few posts above why that doesn't work.

I mean, it would be a keyword like any other: its usage syntax would just coincide with method call syntax. The compiler would recognise the construct at parse time instead of after name resolution, and you wouldn’t be able to actually declare a method named await, because await would be a reserved word. Likewise you wouldn’t be able to declare variables, functions or other items named await. In other words, something similar to the sizeof keyword in C, which (most often) looks like a function call when used.

Sure, that’s perfectly valid. In that case it’s essentially just a stylistic debate between .await (which is inconsistent with field access) and .await() (which is inconsistent with methods).

Since they’re both inconsistent, there’s no objective benefit to one or the other.

The Rust team feels that the extra () is just unnecessary noise, so it’s better to go with the shorter option.

I'll concede this point because it's not actually important to me. It's @HeroicKatora who seems to be arguing for await truly being implemented as a trait method. I'm probably never going to write an executor myself, so the existing Future API is sufficient to my needs.

What I care about is the notation and the surface implications of the notation. I should have written "It seems to me as if everyone who is arguing that the await operation is too magical to use method-like notation should like field access-like notation even less." (boldface words changed, everything else is the same).

I will stipulate that await is a keyword. I do not see how it follows that f.await is easier to explain than f.await(), particularly since there are a bunch of people saying that they find f.await confusingly similar to field access syntax.

My argument has always been that I think f.await() is more likely than f.await to indicate, to people who are reading async functions, the correct mental model for how it behaves. Namely: autoderef applies; it invokes code that depends on the concrete dynamic type of f; it may perform an arbitrarily lengthy operation before returning to the caller; and it does not introduce a branch into the surface control flow.

(It is also a suspension point and a cancellation point for the coroutine, as @RalfJung points out; this is communicated by the name await.)

It creates a branch in the state machine, but this is invisible in the surface control flow. Again, my argument has been that it is harder to debug async code if you have to think about the state machine; as evidence for this I point at what Promise-ful code looks like in JavaScript, particularly if you don't use .then and arrow functions. So I see a primary virtue of async/await as being that it hides the state machine, and I am in favor of notation that reinforces that hiding.

I don't think I ever said we should do that? I just meant to indicate that, in the absence of an effect system, the programmer has to be aware that there's a bunch of existing things written with function or method call notation that perform synchronous I/O or unbounded computation, and therefore must not be used from an async context. So I don't see "we don't have an effect system, and the await operation can block" as a valid argument for "the await operation shouldn't look like a method call".

(You wrote this while I was replying.) I believe I have explained why I don't see the () as unnecessary noise. Have you or some other team member written out an explanation of why you do see it as unnecessary noise? I don't recall seeing that in withoutboats' blog post. It could have gone by in one of these threads and I missed it; if so I would appreciate a pointer.

Thank you for noting this. The point of arguing for .await() is not that we must then make it a trait method, but that it is less special than the field operation. Arguably, methods 'appearing on types' and returning values are something users of the language are already more comfortable with than fields that can only be accessed by value (which is not possible for fields added via Deref etc.). This, I think, applies regardless of whether .await() would be integrated into auto-deref or not.

I may be content with .await being adopted, provided that fut.await() is never possible, i.e. explicitly forbidden, even if Output turns out to impl FnOnce() -> T. Otherwise, that would be inconsistent with field syntax and it would make it much harder to ever modify the syntax.

unsafe fn does not behave as fn itself—as previously mentioned. And a keyword in front of the function declaration that changes the result type seems like a more puzzling functionality to learn than only being allowed to use a method in a particular way.

Not the yield keyword itself as a function, but a function containing it: foo.map(|| yield 0) will behave very differently from foo.map(|| 0). That is the realization of the above argument: the transformation that yield applies to the function affects the result type and is much more implicit than whatever semantics await fn would get, precisely because it is completely implicit and often drop-in. Meanwhile, await fn very explicitly announces in its declaration that it's not a standard function, but the function head would still be the correct interpretation. However, that is still secondary to the syntactical considerations of .await() itself.

And good to see you agree that the syntax, without yet having a trait or any of the other side-loaded stuff, is mostly a stylistic debate :) See the first paragraph for the reason why I think the noise makes the semantics clearer, and the second and third for why I think it is more evolvable and a smaller commitment for the future. I may overvalue that, but generators and resumption arguments seemed like the intuitive follow-up to await and would highly benefit from the extensibility.

Oh, well that's an entirely different argument, then! I was only arguing against trait methods.

I don't have a problem with people who prefer the .await() syntax.

But none of that is true:

  • .await does not auto-deref, and .await() wouldn't auto-deref either.

  • Field syntax can run arbitrary user code and can block.

  • .await does introduce a branch into the surface control flow.

    It must do this because the async fn returns a Future, and the Future's poll method must return to the Executor. It cannot synchronously block.

    Thus, when a .await happens in the code, there is an implicit return (just like ?), and an implicit loop.

It's fine to prefer the .await() syntax, but you should do it for the right reasons, not based on misinformation.

How is that any different from saying that ? creates a return which is invisible to the surface control flow? In both cases they're creating implicit returns.

To be clear, the "blocking" that an .await does is very different from the normal synchronous blocking (e.g. synchronous I/O).

It's different both from a mental standpoint, and also very different from an implementation standpoint.

Rust likes to make the programmer aware of low-level details, so of course Rust makes a distinction between synchronous and asynchronous blocking. Just like how it makes a distinction between the stack and heap.

Niko wrote a couple very good summary posts in the "A final proposal for await syntax" thread:

There's also a lot of other information floating around, but it's mostly in the actual discussion threads (which happened months ago, and covered essentially every possible option, and all the pros and cons of every option):

foo.map(|| yield 0) is currently (on nightly, with the generator feature enabled) a type error because map (at least the commonly known methods called map) takes some form of Fn while that's a Generator. In the future once generators are stabilized I would expect it to be a syntax error because I see almost no chance of the "looks like a closure but includes a yield" syntax for generators being stabilized. There's not much point in using generator syntax as any kind of example, given how far off its stabilization is.

It doesn't now, but the consensus of the thread specifically about that seems to be that it should, so I am assuming that change will be made.

I understood you to be writing deliberately contrived code in that post. Yes, in principle field access can run arbitrary code and block, but that’s not what Deref is for.

  • .await does introduce a branch into the surface control flow. It must do this because the async fn returns a Future, and the Future's poll method must return to the Executor. It cannot synchronously block. Thus, when a .await happens in the code, there is an implicit return (just like ?), and an implicit loop.

Saying this indicates to me that either you do not understand the distinction I am trying to make between the “surface control flow” and the “state machine,” or you reject it. Let me be excruciatingly precise about what I mean so we can rule out the first possibility.

Here’s a simple async function. Just to be stubborn, I’m writing it with my preferred syntax.

async fn a_then_b(ctx: &Context) -> Result<Data, IoError> {
    let a_output = do_a(ctx).await()?;
    let b_output = do_b(ctx, a_output).await()?;
    postprocess(b_output)
}

The surface control flow of this function is the same as the control flow of a hypothetical synchronous version:

fn sync_a_then_b(ctx: &Context) -> Result<Data, IoError> {
    let a_output = sync_do_a(ctx)?;
    let b_output = sync_do_b(ctx, a_output)?;
    postprocess(b_output)
}

The only early returns in the surface control flow are due to the ? operators. Early returns due to await exist only in the state machine, which is something not entirely unlike

enum AThenBState {
    BeforeA(DoA),
    BeforeB(DoB),
}

struct AThenB<'a> {
    state: AThenBState,
    ctx: &'a Context,
}

fn a_then_b<'a>(ctx: &'a Context) -> AThenB<'a> {
    AThenB { state: AThenBState::BeforeA(do_a(ctx)), ctx }
}

// Simplified: the real Future::poll takes a pinned self and a task context.
impl<'a> Future for AThenB<'a> {
    type Output = Result<Data, IoError>;

    fn poll(&mut self) -> Poll<Self::Output> {
        match &mut self.state {
            AThenBState::BeforeA(future_a) => {
                if let Poll::Ready(a_output) = future_a.poll() {
                    match a_output {
                        Err(e) => return Poll::Ready(Err(e)),
                        Ok(a_output) => {
                            self.state = AThenBState::BeforeB(do_b(self.ctx, a_output));
                            return Poll::Pending;
                        }
                    }
                } else {
                    return Poll::Pending;
                }
            },
            AThenBState::BeforeB(future_b) => {
                if let Poll::Ready(b_output) = future_b.poll() {
                    match b_output {
                        Err(e) => return Poll::Ready(Err(e)),
                        Ok(b_output) => {
                            return Poll::Ready(Ok(postprocess(b_output)))
                        }
                    }
                } else {
                    return Poll::Pending;
                }
            }
        }
    }
}

Skimming the docs for futures-rs gives me the impression that it would be even more complicated than that in reality, but this should be sufficient for illustration. Details of how you write an impl Future by hand are not the point. The point is first that await’s early returns exist in the state machine and not in the surface control flow, and second that the state machine is a big ball of hair that you don't want to have to think about most of the time. In fact, it's so hairy that thinking about it is liable to confuse you into writing bugs.

It creates a branch in the state machine, but this is invisible in the surface control flow. How is that any different from saying that ? creates a return which is invisible to the surface control flow? In both cases they’re creating implicit returns.

The difference is what you have to be aware of, in order to understand what the function does. The ? desugar turns sync_a_then_b into

fn sync_a_then_b_desugar(ctx: &Context) -> Result<Data, IoError> {
    match sync_do_a(ctx) {
        Err(e) => Err(e),
        Ok(a_output) => match sync_do_b(ctx, a_output) {
            Err(e) => Err(e),
            Ok(b_output) => Ok(postprocess(b_output))
        }
    }
}

and you have to know that in order to understand the function. Therefore, the early returns created by ? are part of the surface control flow.

By contrast, you do not have to be aware of the state machine in order to understand what await does, so its early returns are not part of the surface control flow.

To be clear, the “blocking” that an .await does is very different from the normal synchronous blocking (e.g. synchronous I/O).

It’s different both from a mental standpoint, and also very different from an implementation standpoint.

Yes, it is very different in implementation, but I do not agree that it is very different in terms of the mental model that an end programmer needs to have. In fact, I think an end programmer—by which I mean anyone who isn’t involved with writing the executor itself—should use a mental model in which async functions are running synchronously, in a cooperative multitasking environment in which await is the only blocking system call.

I think this because my experience with writing async code myself and with debugging async code written by other people (specifically in Python and JavaScript) says that this is the most useful mental model for bug finding purposes. For concreteness, look at the bug described by @theduke (more details) over in the other thread. The key insight they needed to have, in order to resolve the bug, was that an await in the wrong place caused their coroutine to block waiting for the wrong (combination of) events. This insight would have been easier to come to if they had been constantly aware that await is a suspension point. It would have been harder to come to if they had been constantly aware that await causes an early return within a complicated state machine that is created by async.

Rust likes to make the programmer aware of low-level details

I’m going to point at withoutboats’ third requirement for zero-cost abstractions here:

Improve users’ experience: The point of abstraction is to provide a new tool, assembled from lower level components, which enable users to more easily write the programs they want to write. A zero cost abstraction, like all abstractions, must actually offer a better experience than the alternative.

Not having to be aware of the state machine is what makes async/await a valuable abstraction over raw futures. But if we go too far, and make it too easy to forget that await blocks the surface control flow, then it stops being a better experience again.

Have you or some other team member written out an explanation of why you do see it as unnecessary noise? Niko wrote a couple very good summary posts in the “A final proposal for await syntax” thread: […]

Thanks, I saw these go by but had forgotten what they said due to the length and speed of the discussion.

You will not be surprised to hear that I very much agree with the observation that

since I just spent a bunch of time arguing that that is the correct intuition for people to have, that it facilitates debugging in a way that no other mental model achieves; and that I think Niko is wrong when they say, a little further down,

When writing Async I/O code, I imagine one has to do a lot of awaiting, and most of the time you don’t want to think about it very much.

My experience has been that you do have to do a lot of awaiting and you do want to think about it every single time, because “the scheduler may choose to run another thread here” translates directly to “so you better be holding the correct set of locks at this point, and you better be maintaining all the invariants observable by others, and you better have chosen the correct thing to wait for.”
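A small sketch of the "invariants observable by others" concern, polling by hand with only the standard library. Everything here (`Account`, `YieldOnce`, `deposit`, `observe_deposit`, `NoopWake`) is a hypothetical name made up for illustration, and the "other task" is just the code that runs between the two polls.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical shared state. Intended invariant: balance == ledger_total.
struct Account {
    balance: i64,
    ledger_total: i64,
}

/// Hypothetical future that suspends exactly once.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn deposit(account: Rc<RefCell<Account>>, amount: i64) {
    account.borrow_mut().balance += amount;
    // Suspension point: the invariant is broken while other code may run.
    YieldOnce { yielded: false }.await;
    account.borrow_mut().ledger_total += amount;
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn is_consistent(account: &RefCell<Account>) -> bool {
    let a = account.borrow();
    a.balance == a.ledger_total
}

/// Returns (consistent while suspended, consistent after completion).
fn observe_deposit() -> (bool, bool) {
    let account = Rc::new(RefCell::new(Account { balance: 0, ledger_total: 0 }));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut task = Box::pin(deposit(account.clone(), 10));
    let _ = task.as_mut().poll(&mut cx); // runs up to the suspension point
    let during = is_consistent(&account); // what another task would see
    let _ = task.as_mut().poll(&mut cx); // resumes and finishes
    let after = is_consistent(&account);
    (during, after)
}
```

`observe_deposit()` reports that the invariant does not hold while `deposit` is suspended and holds again once it completes, which is why each await deserves the same scrutiny as a point where a scheduler may switch threads.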

That's not a good assumption. Random people liking an idea is not "consensus". In fact I don't even see consensus in that thread, just discussions.

The only "consensus" seems to be that it's possible to add auto-deref in the future in a backwards compatible way, if it is ever added.

Also, the only consensus that matters is the consensus of the Rust Lang team. And at least one member of the Rust Lang team is against it (at least for the time being).

It's true that it's a bit abusive to use Deref in that way; nonetheless, field syntax can execute type-specific user code. And it can block.
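As a minimal sketch of field syntax running user code, the following uses a hypothetical wrapper (`Noisy`, with an inner type `Inner`; both names are invented here) whose Deref impl has a visible side effect. A counter stands in for anything more expensive, such as taking a lock or doing I/O.

```rust
use std::cell::Cell;
use std::ops::Deref;

struct Inner {
    value: i32,
}

/// Hypothetical wrapper whose Deref impl runs user code.
struct Noisy {
    inner: Inner,
    derefs: Cell<u32>,
}

impl Deref for Noisy {
    type Target = Inner;

    fn deref(&self) -> &Inner {
        // Arbitrary user code: here it only counts, but it could block.
        self.derefs.set(self.derefs.get() + 1);
        &self.inner
    }
}

/// Plain field syntax; `value` lives on `Inner`, so this goes through `deref`.
fn read_value(n: &Noisy) -> i32 {
    n.value
}

fn deref_demo() -> (i32, u32) {
    let n = Noisy {
        inner: Inner { value: 7 },
        derefs: Cell::new(0),
    };
    let v = read_value(&n);
    (v, n.derefs.get())
}
```

`deref_demo()` shows that the single `n.value` expression invoked `deref` once, even though nothing in the syntax looks like a call.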

Yes it was contrived, it was just the simplest example I could give. Some crates use Deref to do interesting (non-contrived) things. I make no comment on whether they should do these things, but it's certainly possible.

And as I explained in another post, a lot of popular languages have the ability to run custom code when accessing/setting a field. This is widely accepted and considered idiomatic! So the idea of field access being equivalent to a method is already quite common (just not in Rust).

Quite right. And the same is true for async / await.

Let me explain very clearly, step by step.

Let's look at this function:

async fn foo() -> i32 {
    println!("Before");
    bar().await;
    println!("After");
    return 5;
}

We're going to ignore all the implementation details about unsafe pointers and the state machine and all that. Let's just focus on what the code does, from the programmer's perspective.

First, the async fn gets transformed into fn foo() -> impl Future<Output = i32>. The mechanics of how this is done don't matter. The point is that it returns a Future.

So when you call foo(), it returns a Future, which you can then poll (polling is a part of the public API of Future):

let x = foo();

// Needed for pin safety, but irrelevant to this discussion.
pin_mut!(x);

// Gets a Context somehow. There's many ways to do this, depending on what you're trying to do. It's also irrelevant to this discussion.
let mut cx = ...;

let result = x.poll(&mut cx);

println!("Done");

What happens when it runs x.poll(&mut cx)? Well, first it prints Before, then it calls the bar() function, and then it awaits. This awaiting will check whether the bar() Future is ready or not.

If it's ready, it then continues with the rest of the foo() function (which prints After and returns 5). And so result is now Poll::Ready(5).

On the other hand, if it's not ready, then await immediately returns Poll::Pending, which means result is now Poll::Pending.

And lastly it prints Done. So, assuming bar() wasn't ready, that means it printed Before, and then Done (not After).

This is not what would happen if await was a blocking call: in that case it would wait for bar() and foo() to finish before printing Done, so it would always print Before -> After -> Done.

That means that the await really did return, and that return is visible from outside of the async fn.

It isn't just an implementation detail of the state machine. It's a part of the public API of Future, which the programmer can observe.
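For anyone who wants to check this walkthrough, here is a self-contained sketch that uses only the standard library. It makes a few assumptions that are not in the example above: `bar()` is replaced by a hypothetical `Bar` future that is pending on its first poll, printing is replaced by pushing to a shared log so the order can be inspected, and the Context comes from a do-nothing waker (`NoopWake`).

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for `bar()`: Pending on the first poll, Ready on the second.
struct Bar {
    polled: bool,
}

impl Future for Bar {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.polled {
            Poll::Ready(())
        } else {
            self.polled = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn foo(log: Log) -> i32 {
    log.borrow_mut().push("Before");
    Bar { polled: false }.await;
    log.borrow_mut().push("After");
    5
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `foo` twice by hand, logging "Done" after the first poll.
fn poll_foo_by_hand() -> (Vec<&'static str>, Poll<i32>, Poll<i32>) {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut x = Box::pin(foo(log.clone()));
    let first = x.as_mut().poll(&mut cx);
    log.borrow_mut().push("Done");
    let second = x.as_mut().poll(&mut cx);

    let entries = log.borrow().clone();
    (entries, first, second)
}
```

In this sketch the first poll comes back as `Poll::Pending` before "After" is ever logged, and only the second poll produces `Poll::Ready(5)`, so the recorded order is Before, Done, After.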

This is exactly the same as an fn which uses ?: the ? really will cause a real return, which is visible outside of the fn.

Does the programmer need to always be aware of this? No, most of the time they don't, but sometimes they do.

For example, it's important when understanding the interaction between async / await and multi-threaded or blocking code (and why it's bad to use blocking code inside of async).

I don't think that mental model is correct, because unlike languages like Python and JavaScript, Rust is multi-threaded.

So there really is a distinction between async/await and real threads, which you need to be aware of when dealing with multi-threaded code.

It's fine to have a more vague concept most of the time (ignoring details can be useful!), but sometimes the distinction really does matter.

I have no problem with people thinking of .await as a "suspension point", without any regard to the actual implementation of how it works. That's perfectly fine.

But your point was that field access cannot block, but method calls can block, and await is basically the same thing as blocking, so therefore it makes more sense as a method call.

But my point is that the await isn't actually blocking at all: it's immediately returning! It's actually the opposite of blocking. That's why it needs to use a complicated state machine transformation (unlike blocking method calls).

So I see no problem with await having weird syntax, since what it's doing is fundamentally weird.

Most of the time you can ignore that weirdness and pretend that it's like synchronous blocking, but there really is a difference, and sometimes that difference is important.

I agree, but I don't think that has anything to do with .await vs .await(). The state machine will be the same either way. People can get used to it either way. People need to be aware of it either way.

Absolutely, I just disagree with your claim that .await() will be significantly better in that regard than .await.

I think that after the initial shock has worn off and familiarity has worn in, people will make mistakes with either syntax. People will become complacent with either syntax. Adding a bit of syntactic salt with () won't necessarily make people more alert.

I could be wrong on that! But that's my opinion right now.

(As a meta note, I'm glad that we're able to have a civil debate about this, rather than it devolving into logical fallacies and ad hominems)

Under the mental model where async is a cooperative multitasking environment, calling an async function and not immediately awaiting it is equivalent to spawning a new thread, so it still makes sense from that perspective that you don’t see “After” before “Done”.

In fact, I’ve rewritten your example to use the old SysV context switching functions:

Playground Link

The interesting parts:

fn bar() {
    // Context switch back to MAIN_UCONTEXT
    unsafe { swapcontext(FOO_UCONTEXT.get(), MAIN_UCONTEXT.get()); }
}
extern "C" fn foo() {
    println!("Before");
    bar();
    println!("After");
}

fn main() {
    unsafe {
        FOO_UCONTEXT.init_for_makecontext();
        // Set up FOO_UCONTEXT to run foo()
        makecontext(FOO_UCONTEXT.get(), foo, 0);
        // Context switch to FOO_UCONTEXT (similar to calling poll)
        swapcontext(MAIN_UCONTEXT.get(), FOO_UCONTEXT.get());
        println!("Done");
    }
}

No async or await here. But just as in your example, it deterministically prints “Before” followed by “Done”, and never “After”. In a cooperatively multitasking environment, blocking is observable.

Whether the results of using crufty old APIs are interesting is left to the reader. But on the other hand, you wouldn’t normally call poll manually in real async code, so the fact that await is observable if you do so is not all that meaningful either, I’d say…

That's very interesting, but it kind of proves my point: you needed to use an unusual system in order to achieve the same thing that async / await achieves, thus proving that async / await is a weird / different system compared to normal blocking functions in Rust.

It's true that poll is rarely used in regular code, however I don't think it's meaningless. After all, it's literally the fundamental API which is the foundation for the entire Future system. So it's important to understand it, even if you don't actually call poll very much.

As a more practical example of the differences, you can run many Futures concurrently on a single thread. That implies that some sort of return is happening (to clear out the call stack so it can start running a different Future). You can't do that with blocking calls (well, normally anyways).
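A toy illustration of that point: a single-threaded round-robin loop, written with only the standard library, that interleaves two futures. This is a sketch and not how a real executor works (it ignores wakeups and simply re-polls); `YieldNow`, `worker`, `NoopWake` and `run_round_robin` are hypothetical names.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical future that suspends exactly once.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn worker(name: &'static str, steps: u32, log: Log) {
    for i in 0..steps {
        log.borrow_mut().push(format!("{}{}", name, i));
        // Returning Pending here is what frees the thread for the other task.
        YieldNow { yielded: false }.await;
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Runs two workers on the current thread, polling them in turn.
fn run_round_robin() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut queue: VecDeque<Pin<Box<dyn Future<Output = ()>>>> = VecDeque::new();
    queue.push_back(Box::pin(worker("a", 2, log.clone())));
    queue.push_back(Box::pin(worker("b", 2, log.clone())));

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while let Some(mut task) = queue.pop_front() {
        // A Pending result means poll returned; requeue and run the next task.
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task);
        }
    }

    let entries = log.borrow().clone();
    entries
}
```

The log comes out interleaved (a0, b0, a1, b1), which is only possible because each await gave control back to the loop instead of occupying the call stack.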

This is not correct. Both are monadic bind operations, and you can happily live inside the monad without worrying about the extra control flow they introduce (whether it is short-circuiting for ? or a loop for await). The reason you see a difference is that for await you position yourself "inside" the monad, keeping the abstraction opaque so you don't have to worry about how futures are driven/pulled, and so on; while for ? you take an "outside" look, you insist on caring about the outermost "driver" that has to do something with the error case.

If you keep the same level of abstraction, what happens in the error case of ? is just as out-of-scope as the loop driving a future to completion. Both are part of the "runtime".

(And FWIW, many other languages consider the ? monad to be omnipresent enough that they do fully merge it into the language. That's called exceptions, of course. This clearly demonstrates that ? can be just a function call -- it is, in languages like C++ or Java. And the way in which some people love the fact that Rust does not make this choice, the way Rust makes the monadic bind for ? explicit, is exactly the same way in which await makes the monadic bind for futures explicit.)
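To illustrate the "monadic bind" reading of ?, here is a small sketch in which the same computation is written once with ? and once with `Result::and_then`. The function names are hypothetical; only `str::parse` and `and_then` come from the standard library.

```rust
use std::num::ParseIntError;

/// Uses `?`: short-circuits with the error, otherwise continues with the value.
fn double_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n * 2)
}

/// The same computation as an explicit bind.
fn double_with_bind(s: &str) -> Result<i32, ParseIntError> {
    s.parse::<i32>().and_then(|n| Ok(n * 2))
}
```

Both versions agree on success and on failure; the ? form just keeps the continuation inline instead of nesting it in a closure.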

Matthias247 mentioned a very interesting area where async / await differs from synchronous blocking functions.

Basically, because Futures can be dropped, your normal intuition that functions always run to completion is wrong.

So it actually is important to understand that .await implicitly returns, even if you choose to ignore the state machine and the Executor runtime.
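A minimal sketch of that situation, assuming nothing beyond the standard library: an async fn is polled once and then dropped, so the code after its await never runs, while destructors of its live locals do. `Guard`, `job`, `NoopWake` and `cancel_midway` are hypothetical names.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records when it is dropped.
struct Guard(Log);

impl Drop for Guard {
    fn drop(&mut self) {
        self.0.borrow_mut().push("guard dropped");
    }
}

async fn job(log: Log) {
    let _guard = Guard(log.clone());
    log.borrow_mut().push("started");
    // A future that never completes, standing in for slow I/O.
    std::future::pending::<()>().await;
    log.borrow_mut().push("finished"); // never reached in this sketch
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `job` once, then drops it while it is suspended at the await.
fn cancel_midway() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(job(log.clone()));
    let _ = fut.as_mut().poll(&mut cx);
    drop(fut); // cancellation: locals alive at the await are dropped

    let entries = log.borrow().clone();
    entries
}
```

The log ends up with "started" and "guard dropped" but never "finished": the function stopped at its await and was torn down from there.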

This is the same thing as panic (unwind) safety, though: if a function that you call panics, you need to clean up state in destructors. The linked code would have the same behavior if the await translation were a synchronous blocking call with the possibility of panicking.

Right, but I think there is a difference in practice between them: people tend to treat panics as “something went wrong, abort the program”. So they’re only used when the program is buggy, and they rarely happen. In other words, they’re an exception.

But dropping Futures is quite common, and happens all the time. It doesn’t indicate that there is a bug in the program, it’s just normal and expected.

Another difference is that people don’t usually catch and recover from panics, but recovering from a dropped Future is normal and expected.

Another difference is that panics are loud, so panic unsafety is found quickly. But drop unsafety is very subtle, and hard to find, and hard to debug.

So people are used to pretending that panics don’t exist, but now they are confronted with a situation which is similar to panics, except it’s normal and not considered an error, so they get confused.

In other words, people don’t usually think about panic safety, but they have to think about Drop safety with async.

Or to put it another way: using await is like using panic! (and thus involves control flow and unwinding). But people certainly don’t think of it that way!

While I agree that “Drop safety” in async functions is much more likely to be exercised, since it is part of the “happy path”, than “Panic safety” in synchronous functions (and async functions), if you’re doing something unsafe you are required to be panic safe. The single exception is if you are the binary and you set panic=abort.

And panics aren’t at the process level, they’re at the task level. It’s quite common to have your worker thread (and potentially an async task now!) wrapped in a catch_unwind that will log the error and recover to continue on working with other tasks.
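A rough sketch of that pattern with `std::panic::catch_unwind`, assuming the default panic=unwind setting. `risky_task` and `run_contained` are hypothetical names, and a real worker would log the message rather than just return it.

```rust
use std::panic;

/// Hypothetical task body: panics for even ids.
fn risky_task(id: u32) -> u32 {
    if id % 2 == 0 {
        panic!("task {} failed", id);
    }
    id * 10
}

/// Runs one task, turning a panic into an error value so the caller can continue.
fn run_contained(id: u32) -> Result<u32, String> {
    panic::catch_unwind(|| risky_task(id)).map_err(|payload| {
        // Panic payloads are usually a String or a &str.
        payload
            .downcast_ref::<String>()
            .cloned()
            .or_else(|| payload.downcast_ref::<&str>().map(|s| s.to_string()))
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}

/// The "worker": keeps going after a task panics.
fn run_all(ids: &[u32]) -> Vec<Result<u32, String>> {
    ids.iter().map(|&id| run_contained(id)).collect()
}
```

With this in place, `run_all(&[1, 2, 3])` still processes task 3 after task 2 panics; the panic message is printed to stderr by the default hook and also returned as the `Err` value.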

So where dropping a future is an external actor telling you to cancel, a panic can be seen as this task saying “something went wrong I don’t know how to deal with, cancel me aggressively.”

If you aren’t doing unsafe, the “worst” that can happen with either is a deadlock, which is equally bad in a panicking situation, as you’re not going to find out what went wrong, and the failure is propagated when it should be contained.

I agree that dropping a future is basically another unwinding mechanism. People need to consider it more, even though Rust’s explicit errors make it easier to ignore as “just when something terminal happened”.
