Pre-Pre-RFC: async methods & bounding async fns

We’ve discovered an important fact.

The “normative” use case of async/await expects that all futures are Send. The default API for executors allows the executor to assume that it can move the future between threads, and so it bounds the future as Future + Send. Exceptions, like embedded programming, don’t use that default API.

What this means is that the vast majority of async fns will have to produce Send futures. For free async functions, this isn’t a problem: every time you call one, we have the concrete type available, and so we can typecheck that it is Send. But for trait methods called on generics, we don’t know. That’s the whole motivation of bounding async fns in the first place.
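A rough sketch of the generic case, written in today's Rust (which, unlike the Rust of this thread, accepts `-> impl Future` in trait methods). The names `Fetch`, `Fixed`, `require_send`, `run_on_pool` and the tiny `block_on` driver are all made up for illustration; `require_send` only stands in for an executor API that bounds its argument as Future + Send.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait: the Send bound is part of the method's contract,
// so generic callers can rely on it without knowing the concrete type.
trait Fetch {
    fn fetch(&self) -> impl Future<Output = i32> + Send;
}

struct Fixed(i32);

impl Fetch for Fixed {
    fn fetch(&self) -> impl Future<Output = i32> + Send {
        let value = self.0;
        async move { value }
    }
}

// Stand-in for an executor entry point that requires Send futures.
fn require_send<F: Future + Send>(fut: F) -> F {
    fut
}

// Generic over T: this only typechecks because the trait promises Send.
fn run_on_pool<T: Fetch>(source: &T) -> i32 {
    block_on(require_send(source.fetch()))
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

If the `+ Send` were removed from the trait method, `run_on_pool` would stop compiling, because nothing would tell the compiler that an arbitrary `T::fetch` future is Send.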

That is to say, with the original proposal, pretty much every trait would write its async methods as async(Send) fn. This seems bad: if 95% of use cases are going to go one way, that way should probably be the default.

But if we decide that async fns return impl Future + Send by default, we need a way to opt out. @aturon proposes that users who need to opt out would just not use the sugar, and instead write the impl Future version:

trait Foo {
    // no Send bound
    fn foo(&self) -> impl Future<Output = i32> + 'all {
        async {
            // body
        }
    }
}
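For comparison, a minimal sketch of the same opt-out in today's Rust, where there is no `'all` lifetime and (as far as I know) a trait method returning impl Future captures its input lifetimes automatically. `Local`, `run_local` and the small `block_on` driver are hypothetical names; the Rc is there only to make the future non-Send.

```rust
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Foo {
    // No Send bound: implementations may return thread-local futures.
    fn foo(&self) -> impl Future<Output = i32>;
}

// Hypothetical implementor whose future holds an Rc and so is not Send.
struct Local(Rc<i32>);

impl Foo for Local {
    fn foo(&self) -> impl Future<Output = i32> {
        let shared = Rc::clone(&self.0);
        async move { *shared }
    }
}

// Drives the future on the current thread, like a single-threaded executor.
fn run_local() -> i32 {
    let local = Local(Rc::new(7));
    block_on(local.foo())
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```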

An alternative would be to support a ?Send bound in this position:

trait Foo {
    async(?Send) fn foo(&self) -> i32 {
        // body
    }
}

@Nemo157 I’m especially interested in hearing your thoughts about this because I think in your embedded use case you’re using a single threaded executor. Are you taking advantage of it with non-Send futures?

5 Likes

@withoutboats Interesting. If we’re going with a Send default which should cover most use cases and an explicit syntax exists for more general needs, then perhaps we can leave the sugar out initially, letting people use the salty version and then see what sugar we really need?

Oh & I forgot that there’s pretty much no need for typeof or outputof if the default is to be Send: there’s no reason to support something like outputof(T::foo): ?Send; it’s only at the definition site of the trait where you could meaningfully make this change.

So if we go with the Send default, all of the questions around typeof can be dropped out of this discussion and postponed; we just have to decide on the definition site opt out mechanism.

Here’s a list with all the reasons why I think the “inner return type” approach is the wrong choice:

  • Pro: It makes our code a tiny bit shorter
  • Undecided: Learnability - The “outer return type” approach requires an explanation for what impl Trait and the 'all lifetime mean. The “inner return type” approach is shorter but it requires an explanation about the return type and its lifetime as well. Explaining it as part of the signature is not possible.
  • Con: The “inner return type” approach is unlike the rest of the language. Other function signatures are simply what they are. There is AFAIK no precedent for a keyword changing the rest of a signature in such a way that it could be expressed by another signature.
  • Con: Incompatibility with abstract types. The abstract types RFC has been merged. It is very likely that we would want to use the feature with async functions after it has been implemented. The “inner return type” approach will prevent us from using it.
  • Con: Choosing Send as default bound is problematic. I always imagined async functions to be a great way to implement cooperative multitasking solutions in embedded programming. { async { … }} is an acceptable workaround, but should we really discuss a workaround for a feature that is not even fully implemented? Do we really want this notational split between asynchronous functions that return a type that is Send and those that don’t? Edit: I moved this into the last point. The Send thing is just a complicated rule.
  • Con: Inability to specify bounds. Specifying bounds cannot work the same way as it does for other functions. An async(Trait) syntax was proposed to allow specifying bounds, but it has downsides: it’s at the front, which makes it look like it affects the function (when it really affects the return type), and any such notation is inherently inconsistent with how bounds are usually written.
  • Con: Notational split between async fn and initialization pattern. As a result, newcomers will think the initialization pattern to be an advanced feature. In reality the difference isn’t big at all.
  • Con: We need to remember a set of rules to be able to tell what the signature will look like after the transformation. Especially the rule around Send that only applies for an async method declaration in a trait definition is difficult! We should think twice about introducing indirection that could easily be avoided!

Edit: I revised the “inability to specify bounds” point and the “remember a set of rules” point


I like calling the lifetime 'all instead of 'in. I think both are good name choices.

6 Likes

This is not what it's about. It's about the mental model of async-as-an-effect. async fn f() -> T is significantly clearer in that sense than async fn f() -> impl Future<Output = T>, and doubly so in the presence of lifetime parameters, even/especially if we had some magical 'in lifetime.

async is and should be usable without first learning about impl Trait, or anonymous types, or futures; just as sync functions are usable without first learning about the call stack or the function traits. And for that matter, even once you've learned those things it shouldn't be necessary to deal with them in the general case.

In this sense, the inner return type approach is very much like the rest of the language. The function provides a T when its body has completed, not a future. The future is only the mechanism by which this is accomplished, and you only need to care if you need concurrency. (This is the same reasoning I have for explicit async/implicit await.)

Like futures, abstract types are a means to an end. Given another way to name an async fn's associated type, they are unnecessary. Such a mechanism may arguably be better: abstract types, again, involve learning about anonymous types and inference, while directly writing typeof(foo)::Output or even just foo::Output or foo is quite a bit more straightforward.

The outer return type approach does not help here. In standalone async fns, auto traits already "leak," and any other traits may/must be implemented explicitly by naming the type. In traits, the bound must be part of the trait and not the method, or else implementations will not have to fulfill it.

1 Like

I really disagree with this. As someone without much knowledge of async functions, I would think this is a function that does something in the background and returns a T. This is not at all what is happening, and I think it makes it less clear when we don't specify the output of a function directly. This would mean we would have to remember precisely in what way async changes the output instead of just looking behind the -> like with all other functions.

I think impl Trait is quite intuitive even without knowing about anonymous types. It just returns something that implements Trait.

async fn is not the only place an abstract type would be used, and I think it would be greatly beneficial to learnability to have async use the same syntax as other things that do a similar thing instead of having unique syntax that isn't really used anywhere else.

I think fn foo() -> impl Future<Output = T> + Trait (the async can be left out of the docs) is a lot clearer than async(Trait) fn foo() -> T. A large factor in this is that the Trait bound is at the normal place for the return type, not at the start of the function. At the start it looks like something about the function as a whole instead of the return type.

Another big benefit is that other functions that return a future have the same signature.

4 Likes

impl Future<Output = T> for me is a good indicator that you have to do something before you have your T. This is even without knowing what impl means precisely.

Without this I think we would see many questions about why this async function does not return a T as written in the return type of the function. I can see many people trying to use the value as a T, only to get an error indicating that it is not a T at all, and still having to learn about futures.

2 Likes

And you would be almost entirely correct. Just replace "in the background" with "when polled" or "when awaited" and that's exactly correct!

But poll is not defined on T, and other things that do this for you take a Future<Output = T>. I understand where you're coming from, but I think leaving the future out of the signature makes it less clear overall.

1 Like

Suppose try fn foo() -> i32 would produce something like:

fn foo() -> impl Try

As a caller you would know you’re getting a fallible i32 result. It’s arguably still a kind of i32.

Of course the two must be composable:

async try fn foo() -> i32

or:

try async fn foo() -> i32

Composing the two is definitely something we should consider—that is one major benefit of effects-as-language-features over effects-as-wrapped-return types.

2 Likes

I think that mental model is a legit way to look at it. However, I consider it a leaky mental model given how async programming works in Rust. At the end of the day, the function returns a future and every user needs to learn about that fact because:

  • manual future implementations will always exist and they return futures
  • when using the initialization pattern the fact that a future is returned surfaces in the signature
  • you always need to pass your future to an executor in order to start it
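The last point can be seen in a small sketch: calling an async fn only builds the future, and the body runs once something polls it. `mark`, `demo` and the minimal `block_on` driver below are hypothetical names used for illustration; a real program would hand the future to a proper executor.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Sets the flag only when the body actually runs.
async fn mark(flag: &Cell<bool>) -> i32 {
    flag.set(true);
    42
}

// Returns (flag before polling, result, flag after polling).
fn demo() -> (bool, i32, bool) {
    let flag = Cell::new(false);
    let fut = mark(&flag); // nothing has executed yet
    let before = flag.get();
    let value = block_on(fut); // the body runs here
    let after = flag.get();
    (before, value, after)
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```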

Hiding the underlying mechanism cannot be done, and doing it partly just muddies the waters.

I program a lot in JavaScript and it's the same thing with JavaScript promises: They represent eventual values, but they're objects in their own right because there's plenty of stuff that you can do with them directly. That's why I consider JavaScript async functions as simply normal functions that return promises. This way of thinking has worked well for me over the years.


(Edit: Removed incorrect response to @rpjohnst )


Exactly!

I think it should be this way around because the future would wrap the result. I don't see any compatibility issues with try fn no matter whether we use the outer return type or inner return type approach. Interestingly, the most recent proposal for try fn used an outer return type approach for try fn. (Off-topic side note: I'm currently unconvinced that try fns are a good idea)

2 Likes

Thanks for clarifying this to me.

I'm keeping the point about bounds, but I rephrased it. Here's how I understand it now:

  • Bounds on async method declarations in trait definitions ensure that all implementations fulfill them
  • Bounds on async functions and method definitions are useful to get compiler errors if they're not fulfilled. However, if they are fulfilled and we left them out it would still compile because auto-traits leak.
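A small sketch of what "auto-traits leak" means for free functions: neither signature below mentions Send, yet both futures pass a Send check at the call site. `assert_send`, `sendable`, `also_sendable`, `check` and the `block_on` driver are illustrative names, not anything proposed in this thread.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Compiles only if T is Send.
fn assert_send<T: Send>(_: &T) {}

// Neither signature says anything about Send.
async fn sendable() -> i32 {
    1
}

fn also_sendable() -> impl Future<Output = i32> {
    async { 2 }
}

fn check() -> i32 {
    let a = sendable();
    let b = also_sendable();
    // Both calls typecheck: the auto trait is visible through the opaque type.
    // A future that kept an Rc alive across an await would be rejected here.
    assert_send(&a);
    assert_send(&b);
    block_on(a) + block_on(b)
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```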

I don’t think that returning the outer return type is advantageous. @MajorBreakfast has done a good job enumerating the arguments for the outer return type; here is the argument, from my perspective, for using the inner return type (which mostly has nothing to do with the number of characters).

It puts complexity front and center

Looking at this signature:

async fn foo() -> impl Future<Output = i32> + 'in

This signature is quite complex - framing it as “more characters” is in my opinion doing a disservice to how complex this is:

  • It uses impl Trait
  • The trait has an associated type
  • It has multiple bounds conjoined with +
  • One of those bounds is an explicit lifetime

This is an advanced signature to understand, and I think it would be very intimidating for new users. Claims about how users need to understand all of this to know how async works anyway are ignoring the fact that users can start with a very fuzzy model (“it returns a future”) and only gradually, as they gain a better grasp of Rust’s type system, fill in the precise definition.

It appears configurable, but isn’t

Looking at that return type again:

impl Future<Output = i32> + 'in

This has several components that appear incorrectly to be configurable; that is, based on Rust’s grammar, users would imagine that they could modify many parts of this signature, but they can’t. I’ll enumerate.

The lifetime is not meaningfully configurable

Users might imagine they could drop the + 'in, or otherwise specify a different set of lifetimes. This is incorrect. Well, maybe not, because we haven’t specified our requirements - if we allow any lifetime signature that “captures all input lifetimes,” it becomes configurable but in an even more confusing way: what lifetime you can put in the future (or if you need to put one at all) depends on what the input lifetime is. That is, all of these signatures are valid, syntactically different, and semantically equivalent:

// no input lifetimes, so no output lifetime required
async fn foo() -> impl Future<Output = i32>

// only one lifetime, so you can capture just that instead of 'in
async fn foo<'a>(x: &'a i32) -> impl Future<Output = i32> + 'a

// Lifetime elision would have the same meaning as 'in here
async fn foo(&self) -> impl Future<Output = i32> + '_

// Because 'a and 'b both outlive 'r, using + 'r is fine
//
// (I might have gotten this wrong and it might be supposed to be:
//     'r: 'a + 'b
//  I can never remember..)
async fn foo<'a, 'b, 'r>(x: &'a i32, y: &'b i32) -> impl Future<Output = i32> + 'r
     where 'a: 'r, 'b: 'r

In contrast, these very similar signatures are invalid:

// There's an elided input lifetime, and no output lifetime
async fn foo(x: &i32) -> impl Future<Output = i32>

// You've used '_, but the elision defaults don't capture
// the lifetime of the x variable
async fn foo(&self, x: &i32) -> impl Future<Output = i32> + '_

// Even though you used the only named lifetime,
// there is an elided lifetime here
async fn foo<'a>(x: &'a i32, y: &i32) -> impl Future<Output = i32> + 'a

Ultimately, none of this configurability is valuable either: the only valid signatures are all functionally equivalent. We could sweep away configurability by mandating that only the 'in signature is allowed, but that only mitigates, not eliminates, the underlying problem that you must specify, every time, that the return type has a particular lifetime, unlike any other kind of function.
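For reference, a sketch of the single-lifetime case in today's Rust, where `'in` does not exist: the sugared form captures the input lifetime implicitly, while the hand-written form names it. `sugared`, `explicit` and `block_on` are hypothetical names. As far as I know, dropping the `+ 'a` from `explicit` is rejected on the 2021 edition, while the 2024 edition captures in-scope lifetimes by default.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Sugared form: the returned future borrows `x` without any lifetime syntax.
async fn sugared(x: &i32) -> i32 {
    *x + 1
}

// Hand-written form: the borrow has to be spelled out with a named lifetime.
fn explicit<'a>(x: &'a i32) -> impl Future<Output = i32> + 'a {
    async move { *x + 1 }
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```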

The return type is not meaningfully configurable

You might imagine that you could replace impl Future with something else, in a few directions:

// Maybe you want to use a trait object:
async fn foo() -> Box<Future<Output = i32> + 'in>

// Maybe you think you'd be able to return a stream:
async fn foo() -> impl Stream<Item = i32> + 'in

// Maybe you think you can add any trait:
async fn foo() -> impl Future<Output = i32> + Copy + 'in

None of these work, and I want to pause on the last one. Given that the whole point of this change is that impl Future<Output = i32> + Send + 'in would work, you have to know that the only additional bound you can add is an auto trait. By using the normal bound syntax, we introduce an expectation that any trait will work here. But it won’t, because only the auto traits can be inferred for the anonymous futures.

Well, that’s not quite true: if there’s a blanket impl of the trait for all futures, you could (according to the rules of our type system) return that, since your type will implement it. For example:

// It is the case that every Future implements IntoFuture
async fn foo() -> impl IntoFuture<Output = i32> + 'in

Presumably, with the return type of this function, you can’t treat it as a future, only as an IntoFuture. This wouldn’t be useful at all, since given that you have an impl Future, the compiler can figure out that that type implements IntoFuture.
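A sketch of that situation using `std::future::IntoFuture` as it exists in today's standard library, with a plain fn and an async block instead of the hypothetical `async fn ... + 'in` signature. `as_into_future`, `run` and `block_on` are made-up names. The caller sees only the IntoFuture promise, so it has to convert before it can drive anything.

```rust
use std::future::{Future, IntoFuture};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The async block is a Future, and every Future is also IntoFuture,
// so this weaker return type is accepted.
fn as_into_future() -> impl IntoFuture<Output = i32> {
    async { 5 }
}

fn run() -> i32 {
    // The opaque type only promises IntoFuture, so convert before polling.
    block_on(as_into_future().into_future())
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```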

Once again, there is some configurability, but you have to deeply understand both the language and the Future API to know what you can and can’t do, and the things you can do are overall not useful - that is, the only useful thing is adding + Send.

It is less ergonomic in several respects

Originally, we leaned heavily on the unergonomics of losing lifetime elision to justify the internal return type. @aturon has introduced the special lifetime syntax variably called 'in or 'all or 'input in this thread, which mitigates, but does not completely eliminate, that unergonomics - that is, even with this feature, you do have to write + 'in, which is less ergonomic.

However, in my opinion, the 'in does not carry its weight. Its primary use case would be to be the mandatory lifetime you have to write when writing out an async fn. I think it would be better to use a syntax in which you don’t have to write a lifetime at all than coming up with a special lifetime to make writing the lifetime easier. Outside of this use case, it doesn’t have a very compelling motivation.

Then there’s the argument that can be reduced to character count, but the truth is just that -> impl Future<Output = ?> + 'in is a lot of additional code to add to the signature in addition to the async keyword. Rust function signatures already often run to multiple lines; we don’t have a lot of real estate to spare in the function signature.

There’s a last point of ergonomics that I don’t think has ever been brought up: omitting the return type. If you wrote:

async fn foo() -> impl Future + 'in

You might reasonably imagine that this works just like omitting the return type of the function normally: it defaults to (). But that’s not true: any interior return type will be accepted here, so you could return an i32 or anything else and it will still compile. Moreover, when polling this future, the compiler won’t know what type it actually returns, so calls to this will ultimately not typecheck.

This is both a potential pitfall for users expecting a different behavior, and less ergonomic even if you have the correct expectation. If you want to write an async fn that returns (), you have to write:

async fn foo() -> impl Future<Output = ()> + 'in
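The pitfall can be approximated in today's Rust with a plain fn and an async block (the `async fn ... -> impl Future + 'in` form above is hypothetical syntax). `opaque_output`, `unit_output`, `run_both` and `block_on` are illustrative names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Output is left unspecified: the body may produce any type, here an i32.
fn opaque_output() -> impl Future {
    async { 5_i32 }
}

// To promise `()`, the Output has to be written out.
fn unit_output() -> impl Future<Output = ()> {
    async {}
}

fn run_both() {
    // This compiles, but the caller only sees an unnamed output type,
    // so something like `_hidden + 1` would not typecheck.
    let _hidden = block_on(opaque_output());
    let () = block_on(unit_output());
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```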

It renders inoperable a compelling mental model for async

@aturon has emphasized to me the importance, for him, of supporting a mental model in which you can understand async fn as sugar for fn -> impl Future with an async block inside of it. For me (and I think similarly for @rpjohnst), the compelling model for learning how async works is rather different.

I see async as a modifier which can be applied to different syntactic forms. Those forms look the same as they did before, but have two differences:

  • Instead of evaluating to T, they evaluate to a future of T
  • You can await other expressions inside of them.

That is, the desugaring of async fn to something with async block inside of it is not an important early point of understanding: the initial, intuitive mental model, is that you stick async on the front of a block, a function, or a closure, and instead of evaluating normally, it has a delayed evaluation.

This mental model relies on the function signature looking like a normal, synchronous function signature, and matching the interior return type. You can take any function you have, add async to the front of it, and the resulting code will still be valid Rust, only now you can await futures inside of that function. That’s a very powerful tool for understanding in my opinion, and it becomes diminished by instead returning the outer return type.
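A minimal sketch of that model with the inner return type, as async fn works in today's Rust: the signature stays the same apart from the keyword, and the async version can additionally await. `double`, `double_async`, `quadruple` and `block_on` are made-up names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A plain synchronous function.
fn double(x: i32) -> i32 {
    x * 2
}

// The same signature with `async` in front; the body is unchanged.
async fn double_async(x: i32) -> i32 {
    x * 2
}

// Inside an async fn, other futures can be awaited.
async fn quadruple(x: i32) -> i32 {
    let doubled = double_async(x).await;
    double_async(doubled).await
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```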

Conclusion

This is the same reasoning I used when writing the RFC, and the only new problem introduced in this thread since the RFC is the problem of bounding an async function in a trait definition. The outer return type provides an obvious way to solve that problem (though I’ve argued above that that solution introduces more confusion), but there are also several syntaxes proposed in this thread that solve the problem for the inner return type syntax. I don’t think that the bounds in traits problem is enough to shift the balance away from the decision we made in the RFC thread.

22 Likes

I think there’s a shorter and higher level abstraction of the argument I’ve made above:

When talking about explicitness, we often focus on what it reveals to the programmer. But explicitness has another dimension: explicitness gives control to the programmer. By making something explicit, programmers have direct access to make choices about it (i.e. they could write the code one way or another). When we don’t give programmers meaningful choices, explicitness is often a hindrance - this is where people start talking about boilerplate, when you have verbosity without control. (The classic example of boilerplate to me is Java’s public static void main - reveals a lot without ceding control.)

Using the outer return type brings information about that return type to the surface in the code, but it creates for users an expectation of control that will be confounded, causing, I fear, both confusion and frustration.

11 Likes

I believe that the learnability aspect works out even better with the outer return type approach. Imagine that we get an error message like this:

error[E12345]: type mismatch resolving `i32`
  --> src/main.rs:42:38
   |
42 | async fn foo(a: &i32, b: &i32) -> i32 {
   |                                   ^^^ expected impl Future<Output = i32> + 'in, found i32
   |
   = note: the return type of an async function must be wrapped in `impl Future`

If a beginner types the wrong return type, the compiler (e.g. through RLS) suggests the correct one. It immediately shows the beginner that there's something to learn here. She could either copy-paste it and continue or open the user guide and see the explanation for what it means. The point is that it's not hidden. Ignorance is bliss until it bites you later on.

I agree that other configurations do not make sense. The big benefit is that the initialization pattern is just a tiny step away (as it should be, because there's hardly a difference!):

async fn foo(a: &i32, b: &i32) -> impl Future<Output = i32> + 'in { ... }

// Convert to initialization pattern: Add async block instead
fn foo(a: &i32, b: &i32) -> impl Future<Output = i32> + 'in { async { ... } }
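One practical difference between the two forms, sketched in today's Rust without the hypothetical `'in`: work done before the async block runs at call time, so the initialization pattern can copy what it needs out of its borrows and return a future that no longer depends on them. `borrowed`, `initialized`, `require_static`, `run_initialized` and `block_on` are illustrative names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Sugared form: the future keeps borrowing `a` and `b` until it completes.
async fn borrowed(a: &i32, b: &i32) -> i32 {
    *a + *b
}

// Initialization pattern: the copies happen immediately at call time,
// so the returned future owns its data and can be 'static.
fn initialized(a: &i32, b: &i32) -> impl Future<Output = i32> + 'static {
    let (a, b) = (*a, *b);
    async move { a + b }
}

// Compiles only for values that do not borrow local data.
fn require_static<T: 'static>(value: T) -> T {
    value
}

fn run_initialized() -> i32 {
    let (a, b) = (1, 2);
    // `require_static(borrowed(&a, &b))` would be rejected: that future borrows locals.
    let fut = require_static(initialized(&a, &b));
    block_on(fut)
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```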

Any trait that is implemented by the anonymous future type can be specified. E.g. Unpin could be inferred automatically if the future type is not self-referential. Specifying it would then be valid. (Edit: Listed as an example in this thread's initial post)


It's true that this mental model is compelling. I just think that it is not worth the price of what we need to give up to get it.

Considering async as a modifier that transforms functions/closures/blocks into (returning) a future is IMO more compelling. The only real addition to the language would be the async block and everything else can be explained with it.

The integration with the rest of the language would work better. Method signatures would work like they usually do. There would be no requirement to know about any hidden behaviors, such as methods declared with the async syntax in trait definitions silently specifying the Send bound on their output. And it's not just the bounds problem; it's also that the inner return type approach makes the abstract types RFC not work (it's not clear whether its alternative, the ::Output approach, will work). The decision against this should be very deliberate because there's no going back after it ships.


Thanks for your detailed response!

3 Likes

The init_foo() above still carries all the complexity and boilerplate mentioned earlier.

I think something like the following would be simpler:

impl Foo {
    fn new(a: i32, b:i32) -> Foo {...}
    async fn foo(self) -> i32 {...}
}
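
Filled in as a runnable sketch (the field names, the method bodies and the `block_on` driver are assumptions added for illustration): the synchronous constructor does the setup, and the async method takes self by value, so its future owns the state and needs no lifetime in the signature.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Foo {
    a: i32,
    b: i32,
}

impl Foo {
    // Synchronous setup happens in an ordinary constructor.
    fn new(a: i32, b: i32) -> Foo {
        Foo { a, b }
    }

    // Taking `self` by value means the future owns its state.
    async fn foo(self) -> i32 {
        self.a + self.b
    }
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```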

2 Likes

You make very good points about impl Future<Output = i32> + 'in appearing configurable without actually being configurable. You have convinced me that some kind of sugar is needed; however, having

async(Send, Sync) fn foo(&self, x: &i32) -> i32 {/*..*/}

desugar to

fn foo<'a>(&'a self, x: &'a i32) -> impl Future<Output = i32> + Send + Sync + 'a {/*..*/}

is not intuitive to me at all. I feel like using async should be an implementation detail and not something that should be visible in the documentation. If you choose to refactor your code later to construct the future in another way, it completely changes the signature of the code. This would be a backwards-compatible change but would not feel like one at all. Another point is that two very similar functions can have completely different signatures depending on the way they create their future. Ultimately I feel like changing the signature this significantly is too much sugar and can go against the learnability of the general case. In the end there would still have to be a point where you have to use this.

Another option to consider.

What would be more natural to me is changing the lifetime inference for async functions. We could have

async fn foo(&self, x: &i32) -> impl Future<Output = i32> {/*..*/}

desugar to

fn foo<'a>(&'a self, x: &'a i32) -> impl Future<Output = i32> + 'a {/*..*/}

More concretely, an async function returns a future that has the lifetime of its input variables. This is still some sugar when using async functions, but I feel like this is a lot more intuitive. You really don't have to think about the lifetimes at all, because the input variables of an async function will always have to outlive the returned future. This is not only the most intuitive default but the only option for lifetimes in async functions. (The only exception I can think of is having an input and not using it at all. In that case the input technically does not have to outlive the future.) We already have special rules for using self in the signature, and I feel like they really help with being able to forget about lifetimes in most cases. I feel like this change is in the same category of using the obvious meaning by default.
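For what it's worth, this is roughly how async fn with the inner return type behaves in today's Rust: the returned future captures every input lifetime without any of them being named. A small sketch, with `Offset`, `add` and `block_on` as made-up names:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Offset {
    base: i32,
}

impl Offset {
    // Two borrowed inputs, no lifetime syntax: the future borrows both.
    async fn add(&self, x: &i32) -> i32 {
        self.base + *x
    }
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```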

About trait bounds

I feel like this is something we would want eventually, and I don't think the proposed syntax would be good for this. A signature like

async(Clone) fn foo() -> i32

feels like the Clone is about the function as a whole not the output. A big benefit of

fn foo() -> impl Future<Output = i32> + Clone

is using the same syntax as the rest of the language. It is immediately clear what is happening, and the Clone feels like it is in the right place at the end of the signature.
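A sketch of a case where such a bound is satisfiable in today's Rust: the future is not an async block but `std::future::ready`, whose return type implements Clone when its payload does. As far as I know an async block would not satisfy `+ Clone`. `cloneable`, `run_twice` and `block_on` are illustrative names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The bound sits in the usual place, at the end of the return type.
fn cloneable() -> impl Future<Output = i32> + Clone {
    std::future::ready(5)
}

// Clones the future and drives both copies to completion.
fn run_twice() -> (i32, i32) {
    let first = cloneable();
    let second = first.clone();
    (block_on(first), block_on(second))
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```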

To summarize I feel like we need some sugar but I am not satisfied with such a drastic measure. I would love to see some more bikeshedding for this before we move forward.

1 Like

I went back to the RFC to look at the interior/exterior type discussion: https://github.com/rust-lang/rfcs/blob/master/text/2394-async_await.md#the-return-type-t-instead-of-impl-futureoutput--t

Lifetime Elision: Now that we’re talking about 'in, what about just changing the lifetime elision rules for async functions? I don’t think that makes async any harder to learn, since it still needs to capture those lifetimes anyway.

Polymorphic Return: Rather than “a non-factor for us”, this thread reads to me like it is a factor, just that the polymorphism is on additional trait qualifications rather than the carrier type. (Personally, I really don’t like the look of async(Copy).) And the RFC does mention three different uses of a polymorphic return today, just arguing that they won’t be needed eventually.

Learnability: The RFC says there are arguments “in favor of both”.

Also, with trait aliases, couldn’t something like trait Async<T = ()> = Future<Output = T>; resolve a bunch of the problems? Like async fn foo() -> impl Async to return (), and async fn foo() -> impl Async<i32> to actually return something.
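Trait aliases are not available on stable Rust as far as I know, but the idea can be approximated with a subtrait plus a blanket impl. This is only a sketch of the shape of the proposal, using plain fns with async blocks rather than the `async fn ... -> impl Async` form; `Async`, `nothing`, `number` and `block_on` are hypothetical names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in for `trait Async<T = ()> = Future<Output = T>;`
trait Async<T = ()>: Future<Output = T> {}

// Every future with the right output is automatically an Async.
impl<T, F: Future<Output = T>> Async<T> for F {}

// With the default parameter, `impl Async` means "resolves to ()".
fn nothing() -> impl Async {
    async {}
}

fn number() -> impl Async<i32> {
    async { 7 }
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```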

I also liked the questions about combining async with try, though I don’t have answers here. It feels like there are situations for both -> io::Result<impl Async<i32>> and -> impl Async<io::Result<i32>>.
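The two orderings behave differently, which a sketch in today's Rust can show. It uses `Result<_, String>` and `impl Future` in place of `io::Result` and the hypothetical `Async` alias; `eager_check`, `deferred_check` and `block_on` are made-up names. In the first form the error is available before any future exists; in the second it only appears once the future is driven.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Result outside: validation runs at call time, before a future is built.
fn eager_check(input: i32) -> Result<impl Future<Output = i32>, String> {
    if input < 0 {
        return Err("negative input".to_string());
    }
    Ok(async move { input * 2 })
}

// Result inside: the check is part of the future and runs when it is polled.
async fn deferred_check(input: i32) -> Result<i32, String> {
    if input < 0 {
        return Err("negative input".to_string());
    }
    Ok(input * 2)
}

// Minimal std-only driver, included just so the sketch can run.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```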

(If this has all been discussed before, please let me know; I don’t want to re-open things if I’m not adding anything. I didn’t follow all these discussions that closely.)

2 Likes

@withoutboats: Given all these conceptual difficulties that async fn is running into, perhaps we should revisit the idea that adding keywords will make this stuff ergonomic?

If we introduced only the bare minimum needed to support coroutines into the language and outsourced all desugaring to macros, as I’d proposed back in the day, we wouldn’t need to design a single perfect keyword that covers all use cases. The minimalistic design was much more flexible and composable.

IMO, we should introduce async functions to users as regular functions whose return type implements the Future trait, and then add sugary macros like #[async] on top of that. That way no one will be confused about what’s actually going on underneath.

3 Likes