I’ve started writing a new blog series about generators, trying to identify and resolve questions necessary to stabilize the feature for its most important use cases.
I’ll post links to my blog posts in this thread. I want to be very clear that I don’t think a free-form, open-ended discussion about generators as a feature would be productive at this point. I’ll use moderation tools to prevent this thread from becoming a discussion like that. Please feel free to respond to the blog posts here with specific questions, concerns or other feedback.
use ::std::{
    *,
    ops::{
        GeneratorState,
        Try,
    },
};

impl<Y, R> Try for GeneratorState<Y, R>
where
    Y : Sized,
    R : Try,
{
    type Ok = GeneratorState<Y, R::Ok>;
    type Error = R::Error;

    // i.e.:
    //   Yielded(y)       <=> Ok(Yielded(y)),
    //   Complete(Ok(r))  <=> Ok(Complete(r)),
    //   Complete(Err(e)) <=> Err(e),
    fn into_result (self) -> Result<Self::Ok, Self::Error> {
        match self {
            GeneratorState::Yielded(y) => Ok(GeneratorState::Yielded(y)),
            GeneratorState::Complete(r) => r.into_result().map(GeneratorState::Complete),
        }
    }

    fn from_ok (ok: Self::Ok) -> Self {
        match ok {
            GeneratorState::Yielded(y) => GeneratorState::Yielded(y),
            GeneratorState::Complete(r) => GeneratorState::Complete(R::from_ok(r)),
        }
    }

    fn from_error (err: Self::Error) -> Self {
        GeneratorState::Complete(R::from_error(err))
    }
}
And then a fallible generator (with error E) would just impl Generator<Return = Result<(), E>>.
Of course, that differs from the current Iterator::next(_) -> Option<_>, but I'd expect the following kind of trait to emerge:
pub trait TryIterator
{
    type Item : Sized;
    type Error : Sized /* + ::std::error::Error ? */;

    fn try_next (
        self: &mut Self,
    ) -> GeneratorState< Self::Item, Result<(), Self::Error> >;
    // or a new IteratorState that would mirror the GeneratorState dichotomy
}
A last idea would be to flatten GeneratorState< Y, Result<T, E> > (3-branch enum) like Futures initially did, but since Futures "unflattened" afterwards (I guess the never type is still not mature enough), I don't know if we want to go down the same path.
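For concreteness, the flattened 3-branch shape could be sketched like this (all type, variant, and function names here are made up for illustration; `Nested` stands in for the unstable `GeneratorState<Y, Result<T, E>>`):

```rust
// Hypothetical flattened state, analogous to the original 3-variant Poll
// from the futures 0.1 days.
#[derive(Debug, PartialEq)]
enum TryGeneratorState<Y, T, E> {
    Yielded(Y),  // was GeneratorState::Yielded(y)
    Complete(T), // was GeneratorState::Complete(Ok(t))
    Failed(E),   // was GeneratorState::Complete(Err(e))
}

// The nested shape it would replace, modeled locally since
// GeneratorState is unstable:
enum Nested<Y, T, E> {
    Yielded(Y),
    Complete(Result<T, E>),
}

fn flatten<Y, T, E>(state: Nested<Y, T, E>) -> TryGeneratorState<Y, T, E> {
    match state {
        Nested::Yielded(y) => TryGeneratorState::Yielded(y),
        Nested::Complete(Ok(t)) => TryGeneratorState::Complete(t),
        Nested::Complete(Err(e)) => TryGeneratorState::Failed(e),
    }
}
```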
It’s also worth noting that the futures-async-await library previously mentioned implemented a version of “async generators” that compile to streams, and so we can take some lessons from the experience with that library.
In other words, generators with a separate return and yield type, as a feature, require that we make trait bounds with disjoint associated types into disjoint bounds.
I think that, especially for an MVP, Solution 1: Function adapters would work perfectly fine.
This would be a rather unpleasant outcome, in my opinion. You’d have to write this very unnecessary seeming iter::try_gen(generator()) wrapper every time you tried to iterate through a generator.
I honestly haven't spent much thought on how I would use generators, however my intuition is that in most cases you'll have a function that returns something like impl Iterator and just uses a generator internally. Having an iter::try_gen(...) in such a function as an implementation detail doesn't seem too bad to me.
I don't know why this is your intuition (it isn't mine). Possibly it's because you imagine generators being limited to closures, but the intention is definitely to add named generators with some syntax still to be determined, like fn foo(...) yield i32 or gen foo(...) -> i32 or something. That is, the motivating example in this post could've been written as a standalone item, not just a closure-like expression.
The last time I came into contact with generators was in Python a few years back, where, IIRC, they were closely linked to iterators? In any case, I don't really remember (though I am aware that generators will be supported as standalone items).
Ignoring my intuition, I think it would be good to consider intent: If someone writes a function that returns a generator, I would expect their intent to be that it is used as a generator. In this case, I think it is fine for it not to be automatically treated as an iterator. If, on the other hand, someone wants the generator to be used as an iterator, they can return an impl Iterator directly, wrapping the generator internally.
While writing the above, I thought of a way to phrase things which made it clearer to me why you'd want the ability to treat any generator as an iterator: this is not necessarily about generators being used to implement Iterator, but rather about a convenient way to access a generator's values using, for example, a for loop.
If that is indeed the case, I would still favor Solution 1, however with a slight alteration, if possible: Would it be possible to have methods like .try_iter() on all generators?
I think that would be very nice from both an ergonomic and explicitness perspective.
Generator is a trait, so we would need trait methods returning -> impl Iterator<Item=Self::Yield>. This is not possible yet, but might be allowed in the future. Maybe we could work around this by adding an associated Iterator type to the Generator trait, so that these methods return -> Self::Iterator instead, but at this point this starts to feel too messy: we'd have to support all of these "workarounds" forever, while if these were just functions, we could easily deprecate them once solution 2 arrives, keeping the Generator trait "lean".
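The "associated Iterator type" workaround could be sketched roughly like this, using a stand-in Generator trait since the real one is unstable (trait shape, method name, and the Counter example are all my own invention):

```rust
trait Generator {
    type Yield;
    type Return;
    // Workaround: name the iterator type as an associated type instead of
    // returning `impl Iterator` from a trait method (which isn't allowed yet).
    type TryIter: Iterator<Item = Self::Yield>;
    fn try_iter(self) -> Self::TryIter;
}

// A toy implementation just to show the shape:
struct Counter(u32);
impl Generator for Counter {
    type Yield = u32;
    type Return = ();
    type TryIter = std::ops::Range<u32>;
    fn try_iter(self) -> Self::TryIter {
        0..self.0
    }
}
```

The downside the post alludes to: every implementer now has to name a concrete iterator type, and the associated type would have to stay in the trait forever for backwards compatibility.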
It is unclear to me whether @withoutboats is considering "named" generators (the blog post only talks about "anonymous" ones), but if we could have "named" generators:
struct MyGen(...); // handwaves a lot

impl Generator for MyGen { ... }

impl MyGen {
    fn try_iter(...) -> impl Iterator { ... }
}
Adding methods to them, including .try_iter(), would be trivial (maybe even automatically deriving them), and we could still move to solution 2 in the future. For "anonymous" generators, we can't name their types, so we can't implement methods manually for each of them; we have to either put them in the Generator trait, or implement a trait for all types that implement this Generator trait (or add generic functions, etc.).
I think not having a return type is clearly the best solution.
The reason is that current “with-return-type” generators give back a GeneratorState enum when invoked; a “without-return-type” generator could itself yield a GeneratorState value to emulate a “with-return-type” generator, but it can also yield a simple type if such functionality is not desired.
So generators without a return type are effectively strictly more powerful than those with one, and they also fix this problem.
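The emulation argument can be sketched on stable Rust by modeling a return-less generator as a plain iterator whose items are GeneratorState-like values (all names here are made up; `State` stands in for the unstable GeneratorState):

```rust
#[derive(Debug, PartialEq)]
enum State<Y, R> {
    Yielded(Y),
    Complete(R),
}

// A generator with Yield = u32 and Return = &str, emulated by a
// return-less generator (modeled here as an iterator) whose yielded
// items are State values: ordinary yields become State::Yielded, and
// the final return value becomes one last State::Complete item.
fn emulated() -> impl Iterator<Item = State<u32, &'static str>> {
    vec![
        State::Yielded(1),
        State::Yielded(2),
        State::Complete("done"),
    ]
    .into_iter()
}
```

Consumers that care about the return value match on the items; consumers that don't can simply use a generator that yields plain values.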
I would propose that “return x” in a generator yields x and then stops the generator (as if it was a break out of a loop enclosing the whole generator body).
If you don’t have a return type, then what is the difference between generators and iterators? I don’t see the point of a generator without a return type.
Honestly I am baffled by the last blog post. The whole premise sounds wrong. I think here:
But we want to be able to make this into an Iterator with an Item of io::Result<usize>
The author confuses “how we want to” with “how we are used to”.
I would like to argue that most of the code which uses Iterator<Item=Result<T, E>> stops iteration after the first encountered error. For example some of my code is plagued with lines like these:
for record in iterator {
    let record = record?;
    // process record
}
This is why I wrote the following proposal:
And to me it looks like the author wants to set in stone automatic Generator -> Iterator conversion, while the linked proposal argues that Iterator<T> -> Generator<T, ()> is a much more natural generalization. It's especially strange to me considering that @withoutboats participated in the Pre-RFC discussion.
But indeed there are cases when we want to convert a generator into an iterator. Why not just add methods to the Generator trait which do the conversion? So instead of iter::try_gen(generator()) we would, for example, write generator().into_iterator() for an Iterator that ignores the result, and generator().into_try_iterator() for an iterator that converts Generator<T1, Result<T2, E>> into Iterator<Result<T1, E>>.
Why can't we just write a wrapper struct GeneratorTryWrapper<G: Generator<T1, Result<T2, E>>> that implements Iterator<Result<T1, E>>?
I can't comment much on generators in Rust because I am not sure what they are supposed to be at the end of the day, but I can comment on generators in Python quite a bit.
Generators in Python are great, but got overloaded in weird ways over the years which ultimately all turned out to be a mistake in my mind. Generators in Python are effectively just iterators (which have a pretty standard protocol which is: call __next__ until a StopIteration error is raised).
Unfortunately, generators are a bit more. For a start, a StopIteration error can carry a return value, which is produced if someone raises StopIteration with an argument or if a return value is used in a generator block. This feature turned out to be really only useful for using generators as coroutines, which was deprecated a while back.
The second thing a generator does in Python is that the yield keyword is an expression with a return value that lets you send a value back into the generator by invoking send instead of __next__. This, too, was really only useful for the coroutine use of generators, which again was superseded by proper coroutines.
The new coroutines (async/await based) are largely separate from generators. There are quite a few reasons for this which are outlined in the PEPs.
Nowadays, for all intents and purposes, the yield generator syntax effectively produces an iterator, and all other features are best left unused.
We can, but then the methods are on GeneratorTryWrapper and not on Generator. I'm unsure how this makes anything better than the fn try_gen(gen()) approach. Does the user need to write GeneratorTryWrapper(my_gen()) or what does this enable?
Why is that? I think we can write code like this without any problems:
trait Generator {
    type Yield;
    type Result;

    fn resume(&mut self) -> GeneratorState<Self::Yield, Self::Result>;

    fn try_iter<Y, T, E>(self) -> GeneratorTryWrapper<Self>
    where
        Self: Generator<Yield = Y, Result = Result<T, E>> + Sized,
    {
        GeneratorTryWrapper { gen: self }
    }
}

struct GeneratorTryWrapper<G> {
    gen: G,
}

impl<Y, T, E, G> Iterator for GeneratorTryWrapper<G>
where
    G: Generator<Yield = Y, Result = Result<T, E>>,
{
    type Item = Result<Y, E>;

    fn next(&mut self) -> Option<Self::Item> {
        match self.gen.resume() {
            GeneratorState::Complete(Ok(_)) => None,
            GeneratorState::Complete(Err(err)) => Some(Err(err)),
            GeneratorState::Yielded(val) => Some(Ok(val)),
        }
    }
}
Though I would prefer to hide GeneratorTryWrapper behind impl Trait if possible.
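The impl Trait variant could look something like this sketch, written against a stand-in Generator trait since the real one is unstable (all names, including the Toy example, are illustrative):

```rust
enum GenState<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait Generator {
    type Yield;
    type Return;
    fn resume(&mut self) -> GenState<Self::Yield, Self::Return>;
}

// The wrapper never appears in the signature: callers only see `impl Iterator`.
fn try_iter<G, Y, T, E>(mut gen: G) -> impl Iterator<Item = Result<Y, E>>
where
    G: Generator<Yield = Y, Return = Result<T, E>>,
{
    std::iter::from_fn(move || match gen.resume() {
        GenState::Yielded(val) => Some(Ok(val)),
        GenState::Complete(Ok(_)) => None,
        GenState::Complete(Err(err)) => Some(Err(err)),
    })
}

// A toy "generator" that yields 1 and 2, then completes with an error:
struct Toy(u32);
impl Generator for Toy {
    type Yield = u32;
    type Return = Result<(), &'static str>;
    fn resume(&mut self) -> GenState<u32, Result<(), &'static str>> {
        if self.0 < 2 {
            self.0 += 1;
            GenState::Yielded(self.0)
        } else {
            GenState::Complete(Err("boom"))
        }
    }
}
```

Using std::iter::from_fn here keeps the wrapper type entirely out of the public API, at the cost of boxing the closure state into an opaque type.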
UPD: Small offtopic: I think using Iterator instead of a more general Generator (or maybe even a generator with a parametrised resume?) was one of the mistakes of the Rust 1.0 release. Coupled with trait aliases, the ergonomic impact would've been negligible, and code could've been more expressive and flexible. But right now the ecosystem is too Iterator-centric, and making Iterator an alias for Generator<T, ()> in a backwards-compatible way will be very difficult, if not impossible. At the very least we would have to start by renaming Yield to Item before Generator stabilisation.
Also a relevant question: are we sure we want to use the same syntax for generators and closures? IIUC the only way to disambiguate them is to look inside the closure body for the yield keyword. So if we remove the last yield, will the generator suddenly become a closure? I think that could be quite surprising; plus, to me personally it feels somewhat wrong to have such implicitness in Rust.
This trait can be implemented for () and Result<(), E> without overlap, and then a single generic implementation of Iterator for G: Generator, G::Return: IterableGeneratorReturn can be provided.
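A sketch of that trait (the trait name is taken from the post above; the method name and the choice of Infallible are my own assumptions):

```rust
use std::convert::Infallible;

// Marks return types whose completion can end iteration:
// `()` means iteration simply stops, `Result<(), E>` means
// iteration may stop with an error.
trait IterableGeneratorReturn {
    type Error;
    fn into_iteration_end(self) -> Result<(), Self::Error>;
}

impl IterableGeneratorReturn for () {
    type Error = Infallible;
    fn into_iteration_end(self) -> Result<(), Self::Error> {
        Ok(())
    }
}

impl<E> IterableGeneratorReturn for Result<(), E> {
    type Error = E;
    fn into_iteration_end(self) -> Result<(), E> {
        self
    }
}
```

A single blanket impl of Iterator for G: Generator with G::Return: IterableGeneratorReturn would then cover both the infallible and fallible cases without overlapping impls.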
Would it be possible for solution 2 that we hack in a fix within the current coherence solver rather than waiting for the chalk integration to complete?
I mean, I think we have already experimented with negative reasoning in chalk, so we are confident enough that disjointness based on associated items can work. And @nikomatsakis and @aturon expressed interest in trying to work on important features within the existing trait solver rather than relying too much on “chalk will solve it, we just need to integrate it into the compiler”.
Maybe this is one of those important features that are actually hackable without chalk, but honestly I don't know.
Strongly agree with @newpavlov here. I think the real root problem here is that using impl Iterator<Item=Result<T, E>> to represent iterators that may fail rather than iterators over a sequence of Results is a lossy and awkward pattern to begin with. Generators as the more general abstraction present an opportunity to fix that, as illustrated e.g. in @newpavlov’s pre-RFC, and we shouldn’t throw that away.
I think this is very useful information when thinking about what we might want generators for in Rust. The first blog post started out defining the goal of generators in Rust as "allowing imperative control flow to create Iterators and Streams the same way async fn allows imperative control flow to create a Future", but it's not super clear to me what concrete use cases there are for this.
In my mind the imperative control flow for Iterator is a pretty clear use case, but at the same time it doesn't seem like a very big win -- translating imperative control flow to explicit state seems pretty straightforward. (By all means tell me if I missed something here!)
Python moved away from using generators for coroutines, so do we have any concrete/practical use cases in that space that motivate what seems to potentially be a substantial addition of complexity to the language?