Pre-RFC: Generator resume args

I’ve been looking into how futures/async/await work in Rust, and the use of TLS seems very inelegant to me. For one, it requires host support for thread-local storage, which makes porting to #![no_std] more difficult than it should be.

Summary

Add arguments to generators. This also means that the yield keyword will “return” those arguments in a tuple.

Motivation

There are two major motivations here.

The primary reason to add generator resume args is to remove the requirement on TLS for async/await.

Currently, the await! macro is implemented as follows:

macro_rules! r#await {
    ($e:expr) => { {
        let mut pinned = $e;
        loop {
            if let $crate::task::Poll::Ready(x) =
                $crate::future::poll_with_tls_context(unsafe {
                    $crate::pin::Pin::new_unchecked(&mut pinned)
                })
            {
                break x;
            }
            // FIXME(cramertj) prior to stabilizing await, we have to ensure that this
            // can't be used to create a generator on stable via `|| await!()`.
            yield
        }
    } }
}

Since the macro has no direct way of retrieving the Context<'_> from the resumer’s scope, it must grab it from thread-local storage (TLS). While benchmarks have shown that TLS isn’t particularly detrimental to performance, its usage creates confusing control flow and may prevent some compiler optimizations.

The other motivation is that generators in other languages generally support returning values from a yield; to have a complete implementation, Rust should as well.

Guide-level explanation

This RFC is largely internal. The only user-exposed change would be that generators support resume args.

It’s important to note that if the Resume associated type is a tuple, the tuple is unpacked for the generator’s entry point. For example, if Resume is an empty tuple, the generator takes no resume arguments (the resume method would still take an empty tuple as the args parameter, however).

An example of using this feature in a simple generator:

// `example_gen` implements `Generator<Yield=i32, Return=i32, Resume=(i32, i32)>`.
let mut example_gen = |a, b| {
    let (c, d) = yield a + b;
    c + d
};

match Pin::new(&mut example_gen).resume((1, 2)) {
    GeneratorState::Yielded(3) => {}
    _ => unreachable!(),
}

match Pin::new(&mut example_gen).resume((4, 5)) {
    GeneratorState::Complete(9) => {}
    _ => unreachable!(),
}
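
On a stable toolchain the same behaviour can be approximated by hand. The following is a minimal sketch, not what the compiler actually emits: the `Generator` trait and `GeneratorState` enum are local stand-ins for the proposed unstable items, and `ExampleGen` is a hypothetical name for the state machine that `example_gen` might lower to.

```rust
use std::pin::Pin;

// Local stand-in for the unstable `GeneratorState` enum.
#[derive(Debug, PartialEq)]
pub enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

// Local stand-in for the trait shape proposed in this RFC.
pub trait Generator {
    type Yield;
    type Return;
    type Resume;
    fn resume(self: Pin<&mut Self>, args: Self::Resume)
        -> GeneratorState<Self::Yield, Self::Return>;
}

// Hypothetical lowering of `|a, b| { let (c, d) = yield a + b; c + d }`.
pub enum ExampleGen {
    Start,     // not resumed yet: the first resume supplies `(a, b)`
    Suspended, // parked at the `yield`: the next resume supplies `(c, d)`
    Done,      // returned: resuming again panics
}

impl Generator for ExampleGen {
    type Yield = i32;
    type Return = i32;
    type Resume = (i32, i32);

    fn resume(self: Pin<&mut Self>, args: (i32, i32)) -> GeneratorState<i32, i32> {
        // `ExampleGen` is `Unpin`, so a plain `&mut` can be taken safely.
        let this = self.get_mut();
        // Take the current state out, leaving `Done` behind as a default.
        match std::mem::replace(this, ExampleGen::Done) {
            ExampleGen::Start => {
                let (a, b) = args;
                *this = ExampleGen::Suspended;
                GeneratorState::Yielded(a + b)
            }
            ExampleGen::Suspended => {
                // The resume arguments become the value of the `yield` expression.
                let (c, d) = args;
                GeneratorState::Complete(c + d)
            }
            ExampleGen::Done => panic!("generator resumed after completion"),
        }
    }
}
```

Starting from `ExampleGen::Start`, resuming with `(1, 2)` and then `(4, 5)` mirrors the two `match` statements above.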

Reference-level explanation

This RFC modifies the Generator trait slightly to make room for arguments:

pub trait Generator {
    /// The type of value this generator yields.
    ///
    /// This associated type corresponds to the `yield` expression and the
    /// values which are allowed to be returned each time a generator yields.
    /// For example an iterator-as-a-generator would likely have this type as
    /// `T`, the type being iterated over.
    type Yield;

    /// The type of value this generator returns.
    ///
    /// This corresponds to the type returned from a generator either with a
    /// `return` statement or implicitly as the last expression of a generator
    /// literal. For example futures would use this as `Result<T, E>` as it
    /// represents a completed future.
    type Return;

    /// The type of value that this generator can be resumed with.
    ///
    /// This corresponds to the type that the `yield` keyword evaluates to,
    /// as well as to the type of the closure-like arguments to the generator.
    type Resume;

    /// Resumes the execution of this generator.
    ///
    /// This function will resume execution of the generator or start execution
    /// if it hasn't already with the supplied resume arguments. This call will
    /// return back into the generator's last suspension point, resuming
    /// execution from the latest `yield`. The generator will continue executing
    /// until it either yields or returns, at which point this function will return.
    ///
    /// # Return value
    ///
    /// The `GeneratorState` enum returned from this function indicates what
    /// state the generator is in upon returning. If the `Yielded` variant is
    /// returned then the generator has reached a suspension point and a value
    /// has been yielded out. Generators in this state are available for
    /// resumption at a later point.
    ///
    /// If `Complete` is returned then the generator has completely finished
    /// with the value provided. It is invalid for the generator to be resumed
    /// again.
    ///
    /// # Panics
    ///
    /// This function may panic if it is called after the `Complete` variant has
    /// been returned previously. While generator literals in the language are
    /// guaranteed to panic on resuming after `Complete`, this is not guaranteed
    /// for all implementations of the `Generator` trait.
    fn resume(self: Pin<&mut Self>, args: Self::Resume) -> GeneratorState<Self::Yield, Self::Return>;
}

Using resume args to replace TLS in async/await

Currently, the poll method for GenFuture, which is how a Generator is converted into a Future, temporarily moves the passed context into TLS so that the await! macro above can retrieve it.

fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
    // ...
    set_task_context(cx, || match gen.resume() {
       // ...
    })
}

If generators could have resume args, this method could instead be:

fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
    // ...
    match gen.resume(NonNull::from(cx).cast()) {
        // ...
    }
}

The Resume associated type for that example would be NonNull<Context<'static>>.

Unfortunately, await would require a little compiler-magic to work: the async function transform would need to keep track of the most recent Context returned from yield. The macro, if it were written in user code, would look something like the following:

macro_rules! r#await {
    ($e:expr) => { {
        let mut pinned = $e;
        loop {
            let cx = /* get most recent context */;
            if let $crate::task::Poll::Ready(x) =
                unsafe { $crate::pin::Pin::new_unchecked(&mut pinned).poll(cx.cast().as_mut()) }
            {
                break x;
            }
            /* set most recent context */ = yield;
        }
    } }
}
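
To make the data flow concrete without nightly features, here is a rough sketch that runs on stable Rust. Everything in it is an assumption for illustration: `Generator` and `GeneratorState` are local stand-ins for the unstable items, `GenFuture` is a simplified version of the adapter described above, and `AwaitOnce`, `YieldOnce`, `NoopWake` and `poll_twice` are hypothetical names. `AwaitOnce` is written by hand in place of the compiler-generated state machine for an `async` block that awaits one future and adds 1 to its result.

```rust
use std::future::Future;
use std::pin::Pin;
use std::ptr::NonNull;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-ins for the proposed (unstable) items.
pub enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

pub trait Generator {
    type Yield;
    type Return;
    type Resume;
    fn resume(self: Pin<&mut Self>, args: Self::Resume)
        -> GeneratorState<Self::Yield, Self::Return>;
}

// The lifetime-erased context pointer used as the resume argument.
pub type RawCx = NonNull<Context<'static>>;

// Simplified adapter: a generator resumed with a context pointer is a future.
pub struct GenFuture<G>(pub G);

impl<G> Future for GenFuture<G>
where
    G: Generator<Yield = (), Resume = RawCx> + Unpin,
{
    type Output = G::Return;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let generator = Pin::new(&mut self.0);
        // No TLS: the context travels into the generator as the resume argument.
        match generator.resume(NonNull::from(cx).cast()) {
            GeneratorState::Yielded(()) => Poll::Pending,
            GeneratorState::Complete(x) => Poll::Ready(x),
        }
    }
}

// Leaf future: pending on the first poll, ready with 7 on the second.
pub struct YieldOnce(pub bool);

impl Future for YieldOnce {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.0 {
            Poll::Ready(7)
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

// Hand-written stand-in for the generator behind an async block that awaits
// `inner` and adds 1 to the result.
pub struct AwaitOnce {
    pub inner: YieldOnce,
}

impl Generator for AwaitOnce {
    type Yield = ();
    type Return = u32;
    type Resume = RawCx;

    fn resume(mut self: Pin<&mut Self>, mut cx: RawCx) -> GeneratorState<(), u32> {
        // SAFETY: `GenFuture::poll` passes a pointer to a context that is live
        // for the whole `resume` call, and the pointer is not kept afterwards.
        let cx = unsafe { cx.as_mut() };
        // Each resume delivers the most recent context, which is what the
        // `yield` in the macro above would evaluate to.
        match Pin::new(&mut self.inner).poll(cx) {
            Poll::Ready(x) => GeneratorState::Complete(x + 1),
            Poll::Pending => GeneratorState::Yielded(()),
        }
    }
}

// Waker that does nothing, enough to drive the sketch by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls the adapter twice and returns both results.
pub fn poll_twice() -> (Poll<u32>, Poll<u32>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = GenFuture(AwaitOnce { inner: YieldOnce(false) });
    let first = Pin::new(&mut fut).poll(&mut cx);
    let second = Pin::new(&mut fut).poll(&mut cx);
    (first, second)
}
```

The sketch sidesteps pinning by requiring `Unpin`; a real implementation would need pin projection, since compiler-generated generators can be self-referential.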

Drawbacks

  • This feature adds more complexity to the Generator trait and slightly hurts its usability (by forcing the user to pass an empty tuple to the resume method even when no resume arguments are used).
  • There are a few places where compiler-magic is required.

Rationale and alternatives

Rationale

  • This feature would allow async/await to “just work” with #![no_std] and it may be more favourable towards optimizations.

Alternatives

  • Just don’t implement generator resume arguments.
    • I believe that the positives of implementing it outweigh the negatives.

Prior art

I remember seeing a post somewhere about generator resume arguments, but I can’t find it anymore unfortunately.

I’d like to expand this section.

Unresolved questions

  • Are there any issues with lifetimes or borrowing across yield points that I may not have thought of here?
  • Is there a better way of representing resume arguments in the Generator trait?
  • Should the initial arguments to the generator be different from the yield result type?

Future possibilities

I’m not sure, would like some suggestions here.

Two minor notes:

Generators are currently “experimentally accepted”, which means that they’ll need another RFC before stabilization, and that, practically, they currently only exist as an implementation detail of async. There is no guarantee that they’ll ever be available for stable usage.

Whatever form generator resume arguments take, they should probably work like Fn arguments do; that is, they’re “unpacked” in the source call (though currently the “actual” argument is a tuple, as an implementation detail only).

Generators are currently “experimentally accepted”, …

Oh, thank you, I didn’t realize that. In a way, that makes the situation better, since user-friendliness isn’t a huge factor.

Whatever form generator resume arguments take, they should probably work like Fn arguments do

I was thinking about that too; basically I don’t care either way, but if it isn’t exposed to the user, the less the magic the better, right?

I’ve updated an old playground from a previous discussion to work with the latest nightly (and I guess this is likely the last time it needs changing till generators themselves are RFCed since futures_api and pin are stable :smile:). This implements a basic generator with resume arguments along with the mapping from Future to Generator using it.

I woke up this morning and for no other reason tried to figure out how to do the underlying implementation in a way that would make sense and resolved it would have to look like this.

I think it would work better conceptually if we considered the additional values passed to the generator to be ‘context’ rather than ‘args’, and made the type of that context an associated type rather than a type argument of the trait. I say this because the args/context in a first-class generator would be accessed through some operator/keyword that acts like a variable within the generator, the type of which could be inferred to be a single concrete type.

In practice, I would expect to have some additional keyword/operation to access the current context of a generator within a given generator function or block which could be used to pass the context to the poll function.

It would also be nice to preserve the current generator API by doing something simple like defaulting the type of context on first-class generators to () and having an additional generator trait that subclasses the version with arguments/context where it has the type () that doesn’t require a context to be passed.

trait GeneratorWithContext {
    type Yield;
    type Return;
    type Context;

    fn resume(self: Pin<&mut Self>, context: Self::Context) -> GeneratorState<Self::Yield, Self::Return>;
}

trait Generator: GeneratorWithContext<Context = ()> {
    fn resume(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return>;
}
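
One way this layering could be expressed on stable Rust is a blanket impl, sketched below under some assumptions. The context-free method is named `resume_unit` here (a hypothetical name) because giving both traits a method called `resume` would make plain method calls ambiguous; `GeneratorState` is a local stand-in for the unstable enum, and `Countdown` is a made-up generator written by hand.

```rust
use std::pin::Pin;

// Local stand-in for the unstable `GeneratorState` enum.
#[derive(Debug, PartialEq)]
pub enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

pub trait GeneratorWithContext {
    type Yield;
    type Return;
    type Context;

    fn resume(self: Pin<&mut Self>, context: Self::Context)
        -> GeneratorState<Self::Yield, Self::Return>;
}

// Context-free convenience layer over generators whose context is `()`.
pub trait Generator: GeneratorWithContext<Context = ()> {
    // Hypothetical name, chosen to avoid clashing with `resume` above.
    fn resume_unit(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return> {
        self.resume(())
    }
}

// Every generator with a unit context gets the convenience method for free.
impl<G: GeneratorWithContext<Context = ()> + ?Sized> Generator for G {}

// Hand-written example generator: yields n, n-1, ..., 1, then completes.
pub struct Countdown(pub u32);

impl GeneratorWithContext for Countdown {
    type Yield = u32;
    type Return = &'static str;
    type Context = ();

    fn resume(mut self: Pin<&mut Self>, _context: ()) -> GeneratorState<u32, &'static str> {
        if self.0 == 0 {
            GeneratorState::Complete("liftoff")
        } else {
            self.0 -= 1;
            GeneratorState::Yielded(self.0 + 1)
        }
    }
}
```

With this in place, `Pin::new(&mut Countdown(1)).resume_unit()` can be called without passing `()` explicitly, while generators with a non-unit context only expose `resume`.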

I think that making the arguments both the value produced by a yield and the values passed in initially is somewhat misaligned with what a yield represents conceptually; arguments should be considered distinct from the yield operation, despite having lifetimes that align with it. Making arguments the result of an internal operation is somewhat confusing.

Ideally I think we should’ve done something like this:

trait GeneratorArg {
    type Item;
    type Return;
    type Arg;

    fn resume_arg(self: Pin<&mut Self>, arg: Self::Arg)
        -> GeneratorState<Self::Item, Self::Return>;
    // generator methods
    fn resume(self: Pin<&mut Self>)
        -> GeneratorState<Self::Item, Self::Return>
        where Self: Generator
    { self.resume_arg(())  }
    // iterator methods
    fn next(self: Pin<&mut Self>)
        -> Option<Self::Item>
        where Self: Iterator
    {
         match self.resume_arg(()) {
            GeneratorState::Yielded(val) => Some(val),
            GeneratorState::Complete(()) => None,
        }
    }
    fn collect<B>(self) -> B
        where Self: Iterator, B: FromIterator<Self::Item>
    { .. }
}

trait Generator = GeneratorArg<Arg=()>;
trait Iterator = GeneratorArg<Arg=(), Return=()>;

But unfortunately I highly doubt it will be possible to pull off in a backwards compatible way. :frowning:

Generators are unstable and don’t even have an RFC that allows them to become stable. As long as it doesn’t break the async transform using them, it should be possible to completely change the trait definition. (Whether it’s possible to backwards-compatibly integrate Iterator with them, I don’t know.)

It is theoretically valid to make Iterator a trait alias so long as implementations can use the same verbiage and provided trait methods remain provided trait methods.

In effect, I don’t think Iterator can be a subcase of Generator, but I think the correct way would be a default impl to interop between the two anyway.

And given the current thinking about const Trait, I think the better design would have actually been for the Iterator combinators to have been in a trait IteratorExt : ?const Iterator anyway, so that combinators’ constness is separate from the next constness.

Assumes that Arg = () for all generators. Providing a default implementation of resume without an argument would require a separate trait with an impl GeneratorResume for Generator<Arg = ()> or similar.

No, it does not. Please read the provided code snippet more carefully: the resume method of GeneratorArg is only available for Generator, which is a trait alias for GeneratorArg<Arg=()>. In other words (assuming we have trait aliases), we do not need separate traits and blanket impls for restricted trait variants.

Oh wow. I didn’t realize that was either valid or possible.

Having thought about this it seems very problematic to conflate generators of the form

|| {
   for i  in 0..10 {
       yield i;
   }
   "Hello"
}

with async futures of the form

async || {
    let line = input.readline().await?;
    Ok(line.to_uppercase())
}

The current proposal leaves them non-composable, meaning you couldn’t have a construct of the form

let lines = async || {
    while let Some(line) = stdio.readline().await? {
        yield Ok(line);
    }
};

while let Yielded(line) = lines.await? {
    writeline(line).await?;
}

It is also strange that async closures require explicit annotations but generator closures do not (maybe a gen keyword used in similar places to the async keyword would make this clearer?).

In the standard instance the result of the future must be the Return of the GeneratorArg whereas in the second case the result of the future may either be the Yield or the Return of the GeneratorArg.

The two could probably be composed so that an asynchronous generator closure is a GeneratorArg where the Item is Intermediate<T>.

enum Intermediate<T> {
    Pending,
    Yield(T),
}

And there exists some gen_await that is equivalent to

macro_rules! gen_await {
    ($e:expr) => {
        match ($e).resume_args($arg) {
            Yielded(Pending) => async_yield,
            Yielded(Yield(v)) => Yielded(v),
            Complete(v) => Complete(v),
        }
    }
}

This would mean that a generator would be GeneratorArgs<Args = ()>, an async closure would be GeneratorArgs<Args = &mut Context, Item = ()>, and some dual of the two would be GeneratorArgs<Args = &mut Context, Item = Intermediate<T>>. Additionally, awaiting an async closure would use the existing async syntax and resolve to Return and awaiting an async generator would use some new async_gen syntax and resolve to GeneratorState<Yielded, Return>.

Additionally, a yield in a generator would need to be distinct from the yield effected by an await. yield in a generator would result in a Yielded(v), whereas yield in an asynchronous generator would result in a Yielded(Yield(v)). await in an asynchronous closure would result in a Yielded(()), and await in an asynchronous generator would result in a Yielded(Pending).

The above consideration may be grounds for further discussion and is beyond the scope of this RFC.

This seems to indicate that you cannot actually implement Iterator for a generator but you can implement it for a pinned generator. I suspect any backwards-compatible way of using a generator as an iterator would require an explicit call to an into_iter method that accepts self: Pin<&mut Self>.


It seems to me that further consideration is required before generators themselves are stabilized but at that point we may want generators that don’t accept resume arguments as the default case so the solution posed by @newpavlov for distinguishing the two cases may be preferable.

I don’t think we are likely to ever have a direct implementation of iterators with generators; we will always require some explicit conversion between the two, as generators must remain pinned across all resumes but iterators in general do not.
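
As a rough illustration of such an explicit conversion, the sketch below wraps a pinned generator in an adapter that implements `Iterator`. It is written against a local, argument-free stand-in for the `Generator` trait so that it runs on stable Rust; `GenIter`, `CountTo` and `collect_counts` are hypothetical names, and `CountTo` is a hand-written state machine rather than a real generator literal.

```rust
use std::pin::{pin, Pin};

// Local stand-ins for the unstable items (no resume arguments here).
pub enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

pub trait Generator {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return>;
}

// Hypothetical adapter: iterates a generator that is already pinned.
// The slot becomes `None` once the generator completes, so it is never
// resumed after `Complete`.
pub struct GenIter<'a, G>(Option<Pin<&'a mut G>>);

impl<'a, G> GenIter<'a, G> {
    pub fn new(generator: Pin<&'a mut G>) -> Self {
        GenIter(Some(generator))
    }
}

impl<'a, G: Generator<Return = ()>> Iterator for GenIter<'a, G> {
    type Item = G::Yield;

    fn next(&mut self) -> Option<G::Yield> {
        let generator = self.0.as_mut()?;
        // `Pin::as_mut` reborrows the pinned reference for a single resume.
        match generator.as_mut().resume() {
            GeneratorState::Yielded(value) => Some(value),
            GeneratorState::Complete(()) => {
                self.0 = None;
                None
            }
        }
    }
}

// Hand-written generator that yields `next`, `next + 1`, ... up to `end - 1`.
pub struct CountTo {
    pub next: u32,
    pub end: u32,
}

impl Generator for CountTo {
    type Yield = u32;
    type Return = ();

    fn resume(mut self: Pin<&mut Self>) -> GeneratorState<u32, ()> {
        if self.next < self.end {
            let value = self.next;
            self.next += 1;
            GeneratorState::Yielded(value)
        } else {
            GeneratorState::Complete(())
        }
    }
}

// Pins a generator on the stack, then iterates it through the adapter.
pub fn collect_counts(end: u32) -> Vec<u32> {
    let generator = pin!(CountTo { next: 0, end });
    GenIter::new(generator).collect()
}
```

The adapter itself can be moved freely, because it only holds a pinned reference; the generator stays where it was pinned.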

The remaining question is as to whether generators of this form can be used to implement futures as we require them (which does seem to be the case) and whether generators can continue to be developed using this form (which also seems to be the case). Additionally, I think it will be possible to develop a composition mechanism for the two using this. (For bonus points, impl<A, B> Fn(A) -> B for GeneratorArgs<Args = A, Return = B, Item = ()> and you may be able to generalize closures as well).

In implementing this we will need to correctly distinguish between closure objects, generator objects, async closure objects, and possibly, in the future, async generator objects. Each will need consideration of how it can be constructed syntactically (this is already stated for closures and async closures, although this change may lead to reconsideration of how this is done for generators), how the yield and await operators apply to and operate with each, and how they operate to implement the given trait. async closures, for example, could not be implemented directly on top of generators with arguments, as the context would be in the resume arguments and would need to be correctly concealed. I’m also not sure if generators with arguments should be exposed as something that can be achieved with the top-level syntax (this would remove the possibility of composing yield with resume arguments with async/await, which I don’t think could be resolved).

Not all of the above must necessarily be considered by this RFC, but I think with slight alterations to the trait naming and some description of how closures, async closures, and generators would implement it, this may be sufficient for an actual proposal.

You can use Poll<T> as the item type of async generators. Using the generator trait definition from here (pretty similar to the above, but it uses an input type parameter for the arguments), I have a full bidirectional mapping between Future/Stream <-> Generator.
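
The Stream direction of that mapping might look roughly like the following. This is only a sketch of the idea, not the code from the linked playground: `GeneratorState` is a local stand-in for the unstable enum, and `to_poll_next` is a hypothetical helper that converts one resume result of an async generator (yielding `Poll<T>` and returning `()`) into the `Poll<Option<T>>` shape that a stream's `poll_next` returns.

```rust
use std::task::Poll;

// Local stand-in for the unstable `GeneratorState` enum.
pub enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

// Hypothetical helper: map one resume result onto a stream poll result.
pub fn to_poll_next<T>(state: GeneratorState<Poll<T>, ()>) -> Poll<Option<T>> {
    match state {
        // An inner `await` was not ready: the stream is pending too.
        GeneratorState::Yielded(Poll::Pending) => Poll::Pending,
        // A user-level `yield v`: the stream produces its next item.
        GeneratorState::Yielded(Poll::Ready(value)) => Poll::Ready(Some(value)),
        // The generator returned: the stream is exhausted.
        GeneratorState::Complete(()) => Poll::Ready(None),
    }
}
```

Here `Poll::Pending` plays the role of `Intermediate::Pending` from the earlier post, and `Poll::Ready(v)` the role of `Intermediate::Yield(v)`.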


This seems to indicate that you cannot actually implement Iterator for a generator but you can implement it for a pinned generator.

That was my feeling as well, so I started playing around with a PinIterator trait for iterators that require a pinned reference. Now that Pin is stabilised I really should get back to seeing whether there’s a subset of that which is useful without generators.
