Background: Streaming Iterators
Rust’s Iterator
trait has a well-known restriction: Iterator::next
only borrows the iterator temporarily, rather than for the life of the Item
it yields. This makes methods like collect
possible (because you can call next
again while previously-yielded items are still alive), but on the other hand it’s not possible for an iterator to yield references to values that it owns, or to require that previous-yielded values are gone before a new value is yielded.
This means that certain types of iteration (called “streaming iterators” in Gankro’s OMG ITERATORS talk and elsewhere) can’t use the Iterator
trait, and must use alternate traits like StreamingIterator. Streaming iterator traits are difficult to express generically without language extensions such as associated type constructors (ATC).
Streaming Generators
The new Generator trait suffers from the same limitation, for the same reasons: Generator::resume
cannot borrow the generator for the lifetime of the yielded value. With the ATC syntax proposed by RFC 1598, we could instead write a generator trait that allows arbitrary “streaming” generators:
pub trait Generator {
type Yield<'item>;
type Return<'item>;
fn resume<'item>(&'item mut self) -> GeneratorState<Self::Yield<'item>, Self::Return<'item>>;
}
The compiler would need the ability to infer appropriate lifetimes for the associated types of generator literals. (This is a problem that doesn’t apply to streaming iterators, since we don’t have iterator literals.)
Allowing streaming generators would complicate generic code that only works with non-streaming generators, for example collect
-like functions that repeatedly call a generator while keeping previously-yielded values alive. Such functions would need a bound to express that the Yield
and/or Return
type is not limited to the 'item
lifetime. On IRC, @withoutboats gave a possible solution using implication bounds (which are not part of the language today, but are possible in Chalk).
Motivating Examples
These are examples of code that someone might want to write, but can’t with the current experimental generator features.
Example 1 (playground): A generator that returns references to data that it owns, to avoid allocating new resources for each yielded item:
let lines = || {
let mut f = BufReader::new(File::open("data.txt")?);
let mut s = String::new();
loop {
s.clear();
match f.read_line(&mut s) {
Err(e) => return Err(e),
Ok(0) => return Ok(()),
Ok(_) => yield &s[..],
}
}
};
Example 2 (playground): A generator that yields mutable references to the same data more than once, with a user that keeps only one yielded item alive at a time:
struct Player;
impl Player {
fn take_turn(&mut self) {}
}
let mut player1 = Player;
let mut player2 = Player;
let mut turns = || {
loop {
yield &mut player1;
yield &mut player2;
}
};
while let Yielded(player) = turns.resume() {
player.take_turn();
}
I think these examples show that if streaming generators were available, they could also address many of the use cases for streaming iterators.
Borrow Checking
Currently, the body of a generator cannot contain any borrows that persist across a yield
, because these may be invalidated if the generator is moved while suspended. (This affects “Example 1” above.) However, if resume
borrows the generator for the lifetime of the yielded item, then the generator is guaranteed not to move for at least that lifetime. Therefore, I think, this restriction could be relaxed as long as any borrows that exist at a yield point have lifetimes no longer than the yielded type.
I think non-lexical lifetimes might also be important for allowing code like the examples above, but I haven’t dug into this further.
Questions
- Are streaming generators both feasible and worth supporting?
- If so, should there be a single
Generator
trait that supports both streaming and non-streaming generators, or two separate traits? - If the current non-streaming
Generator
trait were stabilized as-is, would it cause any problems for adding streaming generator support later?