No return for generators

Disallow returning from generators. This covers both explicit return expressions and "reaching the end" of the body, i.e. the body must have type ! (the never type).

Remove the GeneratorState type. The output type is the type of the argument of the yield expression.

  • Currently, generators panic when resumed after completion. Panics should be explicit and visible, not default and invisible.
  • Fewer types. Only the FnPin trait is needed.
  • The expected "no Yielded after Complete" invariant is not enforced by the type system.
  • The Future trait used to have an Error type. It has since been removed in favor of spelling the error out explicitly. The situation is somewhat similar.
  • More symmetry between coroutines and functions (routines). It also fits the name "generator" better.
  • Once try blocks are adopted, the ? operator can be used for this kind of control flow.
// Generators implement this trait.
trait FnPin<Args> : FnOnce<Args> {
	fn call_pin(self: Pin<&mut Self>, args: Args) -> Self::Output;
}

// implements FnPin<(), Output=Option<i32>>
let iterator_like = || {
	yield Some(1);
	yield Some(2);
	loop { yield None }	// This is naturally a fused iterator.
};

// implements for<'a> FnPin<&'a mut Context, Output=Poll<T>>
let future_like = |ctx| {
	let value = loop {
		match sub_future.poll(ctx) {
			Poll::Ready(value) => break value,
			Poll::Pending => yield Poll::Pending,
		}
	};
	let result = some_work(value);
	yield Poll::Ready(result);
	panic!("polled after ready");
};
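
For readers who want to try the idea on stable Rust, here is a minimal sketch of what iterator_like could lower to if written by hand. The CallPin trait is a hypothetical stand-in for the proposed FnPin (no arguments, no FnOnce supertrait), and the IteratorLike enum and resume_n helper are names invented for this illustration, not part of the proposal.

```rust
use std::pin::Pin;

/// Hypothetical stand-in for the proposed `FnPin` trait, without arguments
/// and without the `FnOnce` supertrait (which cannot be named on stable).
trait CallPin {
    type Output;
    fn call_pin(self: Pin<&mut Self>) -> Self::Output;
}

/// Hand-written state machine for:
///     || { yield Some(1); yield Some(2); loop { yield None } }
enum IteratorLike {
    Start,
    AfterFirst,
    Exhausted,
}

impl CallPin for IteratorLike {
    type Output = Option<i32>;

    fn call_pin(self: Pin<&mut Self>) -> Option<i32> {
        // The enum holds no self-references, so it is `Unpin` and
        // `get_mut` is available.
        let this = self.get_mut();
        match *this {
            IteratorLike::Start => {
                *this = IteratorLike::AfterFirst;
                Some(1)
            }
            IteratorLike::AfterFirst => {
                *this = IteratorLike::Exhausted;
                Some(2)
            }
            // The trailing `loop { yield None }` becomes a state that
            // never changes again.
            IteratorLike::Exhausted => None,
        }
    }
}

/// Resumes a fresh state machine `n` times and collects what it produces.
fn resume_n(n: usize) -> Vec<Option<i32>> {
    let mut machine = IteratorLike::Start;
    let mut out = Vec::new();
    for _ in 0..n {
        out.push(Pin::new(&mut machine).call_pin());
    }
    out
}
```

Resuming past the end keeps producing None instead of panicking, because the final state has no exit.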

This doesn't appear to be affected by this change, future_like(cx) will still panic if called after Poll::Ready is returned from it.

In fact, this seems to make the fact that Poll::Ready is the end of a future-like generator more implicit, compared to the mapping Future<Output = T> ≈ for<'a> Generator<&'a mut Context, Yield = Poll<!>, Return = T>.

I wrote this from the point of view of a consumer of the generator, I guess having to explicitly panic inside the generator makes it more explicit to the creator, but I think in terms of panic visibility the external API is much more important than the internal API.

Why should we disallow returning from a generator? IMO it makes a lot of sense to have a generator that uses its yield points to "store" state and get input back from them (like in Python, for instance), and then returns the final value once it has finished its computation.

Is there any reason we can't guarantee that in the type system, using the typestate pattern? Specifically, can we make generator functions internally use a type that consumes the generator and returns either the generator or a return value?
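
To make the typestate question concrete, here is one possible sketch on stable Rust. Everything in it (the Step enum, the TypestateGenerator trait, the Counter type and the drain helper) is hypothetical and only illustrates the shape being asked about: resume consumes the generator and hands it back only when it yielded.

```rust
/// Result of one resume step: either the generator comes back together with
/// a yielded value, or it is consumed and only the final value remains.
enum Step<G, Y, R> {
    Yielded(G, Y),
    Complete(R),
}

/// Hypothetical typestate-style generator trait: `resume` takes `self` by
/// value, so a completed generator cannot be resumed again.
trait TypestateGenerator: Sized {
    type Yield;
    type Return;
    fn resume(self) -> Step<Self, Self::Yield, Self::Return>;
}

/// Yields `next`, `next + 1`, ... up to `end` (exclusive), then returns
/// how many values were yielded.
struct Counter {
    next: i32,
    end: i32,
    yielded: usize,
}

impl TypestateGenerator for Counter {
    type Yield = i32;
    type Return = usize;

    fn resume(mut self) -> Step<Self, i32, usize> {
        if self.next < self.end {
            let value = self.next;
            self.next += 1;
            self.yielded += 1;
            Step::Yielded(self, value)
        } else {
            Step::Complete(self.yielded)
        }
    }
}

/// Drives a `Counter` to completion, collecting yielded values.
fn drain(mut counter: Counter) -> (Vec<i32>, usize) {
    let mut values = Vec::new();
    loop {
        match counter.resume() {
            Step::Yielded(rest, value) => {
                values.push(value);
                counter = rest; // the generator is handed back to us
            }
            // No generator is left here, so "resume after complete"
            // cannot even be written.
            Step::Complete(count) => return (values, count),
        }
    }
}
```

Taking self by value moves the generator on every step, which does not obviously combine with the Pin<&mut Self> receiver used elsewhere in this thread.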

Please give an example of what a generator that produces a finite number of i32 values would look like, under your proposal: both the implementation, and code that consumes those values directly (not via any iterator-like facade).

Yeah, we could have something like this to encode the type-state, but this isn't worth it. It is too complex for just making sure that we don't yield after we return. This is like making sure that we don't return Some(_) after None in an iterator, I think it would be better served by an adapter like Fuse.

I disagree. The thing is, Generator<Yield = Poll<!>, Return = T> and FnPin<Output = Poll<T>> are equivalent at the type level. Yes, one has a more appealing name, but that is a lie.

I wrote this from the point of view of a consumer of the generator, I guess having to explicitly panic inside the generator makes it more explicit to the creator, but I think in terms of panic visibility the external API is much more important than the internal API.

My view is that

  • For creators, it is more explicit.
  • For consumers, the new way is as implicit as the old way.

So the situation has strictly improved.

The generator implementation is iterator_like in my post. It produces a finite number of values and then produces None to signal the end. Consuming this generator is the same as using an iterator:

pin_mut!(iterator_like);
while let Some(value) = iterator_like.call_pin(()) {
	println!("{}", value);
}

More elaboration of the motivation

Personally, the most appealing motivation is the symmetry between generator and closure.

A closure returns multiple values when called multiple times, a generator produces multiple values when resumed multiple times. The additional return seems an unnecessary extra bit.

Another way to see the symmetry is the equivalence FnPin + Unpin = FnMut + Unpin.
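
The closure side of that symmetry can already be tried on stable Rust. The sketch below writes iterator_like as an ordinary FnMut closure that keeps its position in a captured variable instead of at yield points; the function names are made up for this example, while std::iter::from_fn is the standard library adapter that turns any FnMut() -> Option<T> into an iterator.

```rust
/// Returns a closure that produces Some(1), Some(2) and then `None` forever,
/// mirroring `iterator_like` with captured state instead of yield points.
fn iterator_like() -> impl FnMut() -> Option<i32> {
    let mut next = 1;
    move || {
        if next <= 2 {
            let value = next;
            next += 1;
            Some(value)
        } else {
            None
        }
    }
}

/// Consumes the closure through an iterator facade.
fn collect_all() -> Vec<i32> {
    // `from_fn` calls the closure on every `next()`; `collect` stops at
    // the first `None`.
    std::iter::from_fn(iterator_like()).collect()
}
```

Calling the closure directly corresponds to call_pin in the proposal, with the difference that no pinning is needed because the captured state is Unpin.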

Specifying the yield type

One issue with the generator syntax was that there was no way to specify the Yield type. Disallowing return from generators solves this problem.

// FnPin<(Arg,), Output = Output>
|arg: Arg| -> Output { /* contains `yield (output : Output)` */ }

The right-hand side of -> consistently represents the Output associated type for all Fn* traits, including the FnPin trait.

Infinite vs finite

// Assume `FnPin($($arg),*) -> $out` is sugar for `FnPin<($($arg,)*), Output = $out>`.
// Also assume another feature: automatic pin projection of captured variables.

fn zip<F: FnPin(), G: FnPin()>(mut f: F, mut g: G) -> impl FnPin() -> (F::Output, G::Output) {
    move || loop {
        yield (f(), g());
    }
}

fn zip_finite<A, B, F: FnPin() -> Option<A>, G: FnPin() -> Option<B>>(
    mut f: F,
    mut g: G,
) -> impl FnPin() -> Option<(A, B)> {
    move || {
        while let (Some(a), Some(b)) = (f(), g()) {
            yield Some((a, b));
        }
        loop {
            yield None;
        }
    }
}

The observation is that the "finite" variant is a bit more complicated than the "infinite" variant. The finite variant can also be expressed with the infinite variant plus combinators. Going the opposite way requires wrapping and unwrapping.

zip_finite(f, g) = zip(f, g)
    .map(|(a, b)| a.and_then(|a| b.map(|b| (a, b))))
    .fuse();    // Similar to `Iterator::fuse`

Generator with return represents a finite variant by default. Generator without return represents an infinite variant by default.

For me, it makes sense to prefer a simpler thing by default.
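
The zip_finite = zip + map + fuse identity can be checked on stable Rust by using FnMut closures in place of FnPin generators. This is only a sketch under that substitution: the free functions zip, map, fuse and count_to are hypothetical helpers written for this example, not an existing combinator library.

```rust
/// Infinite zip: call both closures on every resume.
fn zip<A, B>(mut f: impl FnMut() -> A, mut g: impl FnMut() -> B) -> impl FnMut() -> (A, B) {
    move || (f(), g())
}

/// Applies `m` to every produced value.
fn map<A, B>(mut f: impl FnMut() -> A, mut m: impl FnMut(A) -> B) -> impl FnMut() -> B {
    move || m(f())
}

/// After the first `None`, stops calling `f` and keeps producing `None`.
fn fuse<T>(mut f: impl FnMut() -> Option<T>) -> impl FnMut() -> Option<T> {
    let mut done = false;
    move || {
        if done {
            return None;
        }
        let item = f();
        done = item.is_none();
        item
    }
}

/// The finite variant, built only from the infinite `zip` and combinators.
fn zip_finite<A, B>(
    f: impl FnMut() -> Option<A>,
    g: impl FnMut() -> Option<B>,
) -> impl FnMut() -> Option<(A, B)> {
    // `Option::zip` is `Some` only when both sides are `Some`.
    fuse(map(zip(f, g), |(a, b): (Option<A>, Option<B>)| a.zip(b)))
}

/// Produces Some(1), ..., Some(n), then `None` forever.
fn count_to(n: u32) -> impl FnMut() -> Option<u32> {
    let mut i = 0;
    move || {
        if i < n {
            i += 1;
            Some(i)
        } else {
            None
        }
    }
}
```

Without the fuse step the combined closure would keep calling f and g after the shorter side has finished, which is exactly the wrapping the finite variant has to pay for.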

The concrete types might be effectively equivalent to the reader, but a generator trait with associated types makes it possible to distinguish between infinite and finite generators in trait bounds. For example, chaining arbitrary generators:

fn chain<Arg, A, B>(a: A, b: B) -> impl Generator<Arg, Yield = A::Yield, Return = B::Return>
where
    A: Generator<Arg, Return = ()>,
    B: Generator<Arg, Yield = A::Yield>,
{
    |arg| {
        // With a `yield from` operator the implementation is much simpler:
        //     yield from a;
        //     yield from b
        loop {
            match a.resume(arg) {
                GeneratorState::Yielded(val) => yield val,
                GeneratorState::Complete(()) => break,
            }
        }
        loop {
            match b.resume(arg) {
                GeneratorState::Yielded(val) => yield val,
                GeneratorState::Complete(res) => break res,
            }
        }
    }
}

This function can require A to be a finite generator (it may still loop forever and never return, but the signature makes clear that it is expected to finish) and propagates the finite or infinite nature of B to the returned generator. The yield from operator mentioned in the comments also relies on being able to tell generically when a generator is complete, instead of having to know that an iterator-generator yields None to terminate and a future-generator yields Poll::Ready. This operator is equivalent to await with the Future mapping I mentioned earlier, and it is the main reason to use something like Poll<!> even though it's fundamentally equivalent to ().
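
Since the Generator trait is unstable, here is a sketch of the same chain on stable Rust with hand-written stand-ins. The GeneratorState enum and Generator trait below are local copies that ignore resume arguments, and Chain, Span and run are hypothetical names for this example. The Unpin bounds are an extra assumption that keeps the sketch free of unsafe code.

```rust
use std::pin::Pin;

/// Local stand-in for the unstable `GeneratorState`.
enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

/// Local stand-in for the unstable `Generator` trait, without resume arguments.
trait Generator {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return>;
}

/// Runs `a` to completion, then `b`, and returns whatever `b` returns.
struct Chain<A, B> {
    a: Option<A>, // `None` once `a` has completed
    b: B,
}

impl<A, B> Generator for Chain<A, B>
where
    A: Generator<Return = ()> + Unpin, // `Return = ()` asks for a finite generator
    B: Generator<Yield = A::Yield> + Unpin,
{
    type Yield = A::Yield;
    type Return = B::Return;

    fn resume(self: Pin<&mut Self>) -> GeneratorState<A::Yield, B::Return> {
        let this = self.get_mut();
        let a_done = match this.a.as_mut() {
            Some(a) => match Pin::new(a).resume() {
                GeneratorState::Yielded(val) => return GeneratorState::Yielded(val),
                GeneratorState::Complete(()) => true,
            },
            None => false,
        };
        if a_done {
            this.a = None;
        }
        Pin::new(&mut this.b).resume()
    }
}

/// Yields the numbers in `next..end`, then completes with `ret`.
struct Span<R> {
    next: u32,
    end: u32,
    ret: R,
}

impl<R: Clone + Unpin> Generator for Span<R> {
    type Yield = u32;
    type Return = R;

    fn resume(self: Pin<&mut Self>) -> GeneratorState<u32, R> {
        let this = self.get_mut();
        if this.next < this.end {
            this.next += 1;
            GeneratorState::Yielded(this.next - 1)
        } else {
            GeneratorState::Complete(this.ret.clone())
        }
    }
}

/// Drives any generator to completion, collecting yields and the return value.
fn run<G: Generator + Unpin>(mut g: G) -> (Vec<G::Yield>, G::Return) {
    let mut yielded = Vec::new();
    loop {
        match Pin::new(&mut g).resume() {
            GeneratorState::Yielded(v) => yielded.push(v),
            GeneratorState::Complete(r) => return (yielded, r),
        }
    }
}
```

The bound A: Generator<Return = ()> is the part that has no direct counterpart when everything is an FnPin with a single Output type.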

As long as generators are just a hidden feature used to implement the async transform (and maybe eventually some subset of iterator/stream/sink transforms) this is probably not useful. But I still hold out hope for (and assume that this thread is based on) generators becoming a first-class feature one day to power other use cases. If that happens, those use cases are likely to want a library of generic utilities, and such utilities can be more powerful if they have more information from the type system to work with.

The yield from operator mentioned also relies on being able to generically distinguish when the generator is complete

That is a good point. However, I would argue that yield from is not a particularly good construct:

  1. It cannot be used for await when a context argument is used. We have to invent another form to support generator arguments. await is now a built-in syntax, anyway.

  2. The desugared code is not long at all.

// Three lines, or a one-liner, depending on formatting
while let Some(x) = gen() { yield Some(x) }

  3. One difference from the similar for expression is the .into_iter() call. yield from has to accept something like Iterator, not IntoIterator. It is not clear whether the target expression will be evaluated multiple times or not.

If generator resume arguments become a thing my imagined desugaring for yield from is something like:

let mut gen = $EXPR;
loop {
    match unsafe { Pin::new_unchecked(&mut gen) }.resume($ARGS) {
        GeneratorState::Yielded(val) => yield val,
        GeneratorState::Complete(val) => break val,
    }
}

where $ARGS refers to the arguments passed into the generator that the yield from is inside of. This is one of the main reasons to support such an operator instead of just doing it via a macro, it allows implicitly capturing the arguments and forwarding them to the inner generator. (This is mostly consistent with how the yield from operator is defined in Python).

That is exactly equivalent to the await desugaring modulo type names and the current TLS hack to emulate resume args.

I wasn't aware of the idea of passing resume arguments automatically. It is good for await, but are there other use cases?

Rust doesn't have an implicit argument feature anywhere else. Implicit arguments have been requested for normal functions too, so having them only for generators would be inconsistent, I guess.

Still, I don't see this construct as worth an additional language feature in Rust. Python seems to have corner cases for exceptions etc. when iterating manually, but in Rust the simple loop is correct.

Just one small clarification that I think some may have overlooked:

This suggestion is specifically about generalizing generators from

trait Generator {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return>;
}

to

trait FnPin {
    type Output;
    /*extern "rust-call"*/ fn call_pin(self: Pin<&mut Self>) -> Self::Output;
}

(when ignoring resume arguments) such that the current Generator trait becomes FnPin<Output = GeneratorState<Yield, Return>>, just like Future was generalized from

trait Future {
    type Item;
    type Error;
    fn poll(&mut self) -> Poll<Self::Item, Self::Error>;
}

to

trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output>;
}

such that the old Future trait became Future<Output = Result<Item, Error>>.


The rationale of the change is that it makes generators "less special", as they're now "just" a pinned closure with a state machine transform applied to yield points. An "Iterator-like" FnPin is one that sets Output=Option<Item>, and a "Future-like" FnPin is one that sets Output=Poll<T>.
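
The claim that the current Generator trait is a special case of FnPin can be written down as a blanket impl. The sketch below uses local stand-in traits on stable Rust (both without arguments, as in the simplified definitions above), and OneThenDone is a hypothetical hand-written generator used only to exercise the impl.

```rust
use std::pin::Pin;

/// Local stand-in for the unstable `GeneratorState`.
#[derive(Debug, PartialEq)]
enum GeneratorState<Y, R> {
    Yielded(Y),
    Complete(R),
}

/// Local stand-in for the current `Generator` trait (no resume arguments).
trait Generator {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>) -> GeneratorState<Self::Yield, Self::Return>;
}

/// Local stand-in for the proposed `FnPin` trait (no arguments).
trait FnPin {
    type Output;
    fn call_pin(self: Pin<&mut Self>) -> Self::Output;
}

/// Every `Generator` is an `FnPin` whose output is its `GeneratorState`.
impl<G: Generator> FnPin for G {
    type Output = GeneratorState<G::Yield, G::Return>;

    fn call_pin(self: Pin<&mut Self>) -> Self::Output {
        Generator::resume(self)
    }
}

/// Hand-written generator: yields 1 once, then completes with "done".
struct OneThenDone {
    yielded: bool,
}

impl Generator for OneThenDone {
    type Yield = i32;
    type Return = &'static str;

    fn resume(self: Pin<&mut Self>) -> GeneratorState<i32, &'static str> {
        let this = self.get_mut();
        if this.yielded {
            GeneratorState::Complete("done")
        } else {
            this.yielded = true;
            GeneratorState::Yielded(1)
        }
    }
}
```

The impl only goes one way: an arbitrary FnPin carries no Yield/Return split, which is the information the rest of the thread is debating.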


All of that said, the type-level encoding of "don't call me after GeneratorState::Complete", even if not strictly enforced at the type level, is a useful thing to have and to reason about in general (just like Iterator's "don't call me after None"). All in all I don't really have a preference on this, because for me the extra information from trait Generator weighs about as much as the theoretical simplicity of FnPin, so I can't choose one over the other.


Side note digression: should Iterator be compared to Generator<Yield=Item, Return=()> or Generator<Yield=Option<Item>, Return=!> (or both)? FnPin makes the mapping obvious, as only FnPin<Output=Option<Item>> makes sense.

According to the docs, iterators should not yield values after they yield None, so that aligns with the semantics of Generator<Yield=Item, Return=()>. Generator<Yield=Option<Item>, Return=!> has the semantics that it is OK for it to yield after it yields None, which is different from Iterator.

Note, though, that Iterator does not require that, and in fact, exposes Iterator::fuse to define what happens after receiving a None. A concrete Iterator type is perfectly allowed to specify what happens when calling next after the first None (as Fuse<impl Iterator> does).

The restriction is that with a generic, unknown Iterator, anything, including a panic (but excluding UB), is correct behavior for the Iterator to return after the first None, and no assumptions can be made.

The way I see it, Iterator is Generator<Yield=Option<Item>, Return=!> and FusedIterator is Generator<Yield=Item, Return=()> (plus fused, calling resume again yields Complete(()), not a panic).
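
The difference between a plain Iterator and a fused one is easy to observe on stable Rust. In this small sketch, Flicker and first_three are made-up names; Iterator::fuse is the standard library adapter mentioned above.

```rust
/// An iterator that alternates between `Some` and `None` forever,
/// which the `Iterator` contract permits.
struct Flicker {
    calls: u32,
}

impl Iterator for Flicker {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.calls += 1;
        if self.calls % 2 == 1 {
            Some(self.calls)
        } else {
            None
        }
    }
}

/// Calls `next` three times, even past a `None`.
fn first_three(mut it: impl Iterator<Item = u32>) -> [Option<u32>; 3] {
    [it.next(), it.next(), it.next()]
}
```

Called on Flicker { calls: 0 }, first_three gives [Some(1), None, Some(3)]; on Flicker { calls: 0 }.fuse() it gives [Some(1), None, None], because Fuse stops forwarding to the inner iterator after the first None.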

Yes, and it would be perfectly valid to panic after the first None. Also there is no way to validate any values in Rust beyond checking it yourself, so I don't find that argument convincing.

Exactly, so in general an Iterator behaves like Generator<Yield=Item, Return=()> because it is valid to panic after the first None, but it would not be valid for Generator<Yield=Option<Item>, Return=!> to ever panic.

I would swap them due to panic behaviour, because it is valid to panic after a generator completes and only after a generator completes.
