Streaming Generators?


#1

Background: Streaming Iterators

Rust’s Iterator trait has a well-known restriction: Iterator::next only borrows the iterator temporarily, rather than for the life of the Item it yields. This makes methods like collect possible (because you can call next again while previously-yielded items are still alive), but on the other hand it’s not possible for an iterator to yield references to values that it owns, or to require that previous-yielded values are gone before a new value is yielded.

This means that certain types of iteration (called “streaming iterators” in Gankro’s OMG ITERATORS talk and elsewhere) can’t use the Iterator trait, and must use alternate traits like StreamingIterator. Streaming iterator traits are difficult to express generically without language extensions such as associated type constructors (ATC).

Streaming Generators

The new Generator trait suffers from the same limitation, for the same reasons: Generator::resume cannot borrow the generator for the lifetime of the yielded value. With the ATC syntax proposed by RFC 1598, we could instead write a generator trait that allows arbitrary “streaming” generators:

pub trait Generator {
    type Yield<'item>;
    type Return<'item>;
    fn resume<'item>(&'item mut self) -> GeneratorState<Self::Yield<'item>, Self::Return<'item>>;
}

The compiler would need the ability to infer appropriate lifetimes for the associated types of generator literals. (This is a problem that doesn’t apply to streaming iterators, since we don’t have iterator literals.)

Allowing streaming generators would complicate generic code that only works with non-streaming generators, for example collect-like functions that repeatedly call a generator while keeping previously-yielded values alive. Such functions would need a bound to express that the Yield and/or Return type is not limited to the 'item lifetime. On IRC, @withoutboats gave a possible solution using implication bounds (which are not part of the language today, but are possible in Chalk).

Motivating Examples

These are examples of code that someone might want to write, but can’t with the current experimental generator features.

Example 1 (playground): A generator that returns references to data that it owns, to avoid allocating new resources for each yielded item:

    let lines = || {
        let mut f = BufReader::new(File::open("data.txt")?);
        let mut s = String::new();
        loop {
            s.clear();
            match f.read_line(&mut s) {
                Err(e) => return Err(e),
                Ok(0)  => return Ok(()),
                Ok(_)  => yield &s[..],
            }
        }
    };

Example 2 (playground): A generator that yields mutable references to the same data more than once, with a user that keeps only one yielded item alive at a time:

    struct Player;
    
    impl Player {
        fn take_turn(&mut self) {}
    }

    let mut player1 = Player;
    let mut player2 = Player;
    
    let mut turns = || {
        loop {
            yield &mut player1;
            yield &mut player2;
        }
    };
    
    while let Yielded(player) = turns.resume() {
        player.take_turn();
    }

I think these examples show that if streaming generators were available, they could also address many of the use cases for streaming iterators.

Borrow Checking

Currently, the body of a generator cannot contain any borrows that persist across a yield, because these may be invalidated if the generator is moved while suspended. (This affects “Example 1” above.) However, if resume borrows the generator for the lifetime of the yielded item, then the generator is guaranteed not to move for at least that lifetime. Therefore, I think, this restriction could be relaxed as long as any borrows that exist at a yield point have lifetimes no longer than the yielded type.

I think non-lexical lifetimes might also be important for allowing code like the examples above, but I haven’t dug into this further.

Questions

  1. Are streaming generators both feasible and worth supporting?
  2. If so, should there be a single Generator trait that supports both streaming and non-streaming generators, or two separate traits?
  3. If the current non-streaming Generator trait were stabilized as-is, would it cause any problems for adding streaming generator support later?

#2

Would that work? If I drop the reference I got from resume(), the generator becomes “unborrowed”, so I’d be free to move it, wouldn’t I?


#3

On a related note, I just happened to accidentally visit the issues page, where I noticed this:

https://github.com/rust-lang/rust/issues/44265

I think we need some sort of town crier to sound a big, blaring blow horn whenever an RFC gets accepted.