Do we need pinned sync iterators?

Recently, there has been some discussion about what it would take to support pinned synchronous iterators. One consideration is how useful such iterators would be. If they are useful, one would expect to see people using workarounds to emulate them. Of the potential workarounds, the simplest is to implement Iterator on Pin<&mut IterType> instead of on IterType directly.
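
For concreteness, a minimal sketch of what that workaround looks like. Counter is a made-up type; pretend it holds a self-reference and therefore must not move while it is being iterated:

use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Made-up iterator type for illustration; imagine it contains a
// self-reference and so must stay at a fixed address while iterating.
struct Counter {
    remaining: u32,
    _pin: PhantomPinned,
}

// The workaround: implement Iterator for Pin<&mut Counter> rather than for
// Counter itself, so `next` can only be called once the value is pinned.
impl<'a> Iterator for Pin<&'a mut Counter> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Sound because we only mutate in place and never move the value out.
        let this = unsafe { self.as_mut().get_unchecked_mut() };
        if this.remaining == 0 {
            return None;
        }
        this.remaining -= 1;
        Some(this.remaining)
    }
}

fn main() {
    let counter = pin!(Counter { remaining: 3, _pin: PhantomPinned });
    for n in counter {
        println!("{n}");
    }
}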

And, according to a Sourcegraph search, it seems that… zero people are doing it. Not even one person has reached for this workaround, anywhere online.

This is pretty conclusive evidence, to my eyes, that core Rust shouldn't make disruptive changes to accommodate this non-use case.

How many futures that are implemented by hand (as opposed to through async fn) require pinning? This feature would make generator functions much more expressive by allowing self-references.

Your argument ignores the fact that the major use case for pinned iterators, namely generators, is a nightly-only feature that almost nobody uses because it's unfinished. And one of the biggest reasons it is unfinished is the open question of pinning ergonomics, hence this whole discussion.

By the way, a similar pattern is implemented by the genawaiter and yield-return crates. In both cases the pinning required to implement Iterator is done internally. The former allows pinning on either the stack or the heap, but both require a bunch of macros (especially the stack case). The latter only allows pinning on the heap. I would consider both suboptimal, since you have to choose between horrible ergonomics and non-zero cost.
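
For anyone unfamiliar with those crates, here is a rough, simplified sketch of the heap-pinning strategy (not the actual code of either crate): the address-sensitive generator state is pinned inside a Box, and the plain Iterator impl lives on the owning wrapper, so callers never see Pin at all.

use std::pin::Pin;

// Rough sketch, not the real crates' code: pinned state inside, plain
// Iterator outside, at the cost of one heap allocation per generator.
struct Inner {
    // A real generator's (possibly self-referential) state would live here.
    next: u32,
}

impl Inner {
    // The pin-aware resume function; the real crates poll a future here.
    fn resume(self: Pin<&mut Self>, limit: u32) -> Option<u32> {
        // Sound because we only mutate in place and never move the value out.
        let this = unsafe { self.get_unchecked_mut() };
        if this.next < limit {
            this.next += 1;
            Some(this.next)
        } else {
            None
        }
    }
}

// The public wrapper owns the pinned state and exposes an ordinary Iterator.
struct HeapGen {
    inner: Pin<Box<Inner>>,
    limit: u32,
}

impl Iterator for HeapGen {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.inner.as_mut().resume(self.limit)
    }
}

fn main() {
    let g = HeapGen { inner: Box::pin(Inner { next: 0 }), limit: 3 };
    assert_eq!(g.collect::<Vec<_>>(), vec![1, 2, 3]);
}

The stack-pinning variant avoids the allocation but needs macros to set up the pinned slot in the caller's frame, which is where the ergonomics suffer.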

I would argue that Iterators are fundamentally different from Futures, in a way that means Iterator generators won't have much need for pinning either.

  • Top-level Futures need to be 'static to be spawned. Because of this, they must own all their data. This makes it difficult to compose futures while avoiding self-references, which was a problem even before async/await.

  • Futures are usually "heterogeneous". They do thing 1, then await, then thing 2, then await again, then a third thing, await once more. Each step is unique and requires its own distinct set of borrowed state. In contrast, with Iterators, you generally have one thing, or a small set of possible things, that you are doing over and over. This limits the complexity of setting up all the borrows you need, and destroying them before yielding.
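
To make that contrast concrete, here is a made-up sketch; the function names and the trivial ready(()).await stand-ins are placeholders, not real I/O or real combinators:

use std::future::ready;

// The async fn creates new local data at each step and holds a *different*
// borrow of it across each await, which is what pushes futures toward
// self-references; the iterator repeats one small step and keeps nothing
// alive between calls to `next`.
async fn heterogeneous(input: String) -> usize {
    let trimmed = input.trim();                      // borrows `input`...
    ready(()).await;                                 // ...held across this await
    let words: Vec<&str> = trimmed.split_whitespace().collect();
    ready(()).await;                                 // a different borrow held here
    words.len()
}

fn homogeneous<'a>(input: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    // One operation, repeated; nothing has to survive between `next` calls.
    input.split_whitespace()
}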

Even if generators don't have the same restriction, that doesn't mean it's undesirable for generators to own their state, especially state that happens to be generated internally. The alternative would be for the user to always pass everything the generator needs into it, which greatly limits what generators can do by themselves.

If you need to repeat the same thing over and over, then isn't a normal iterator enough? The whole point of generators is that sometimes you need to express heterogeneous computations, and iterators are really bad for that. If you prioritize the case where computations are simple and already expressible with iterators, then what's the point of generators?

I feel you've got the reasoning backward. The fact that existing implementations don't do "complex" things is a result of Iterator being unpinned.

I, for one, need generators for some of my projects. I just invented my own trait for that and have something similar to impl<T: Iterator> Generator for T. I don't know why you would conclude that something currently very cumbersome to do is a "non-use case" from a single search alone.
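
Roughly this shape (simplified, not the exact code): resuming either yields a value or signals completion, and a blanket impl lets any plain Iterator act as a generator with no final return value.

// Simplified sketch of a homegrown generator trait.
enum State<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait Generator {
    type Yield;
    type Return;

    fn resume(&mut self) -> State<Self::Yield, Self::Return>;
}

// Blanket impl: every Iterator is a generator that yields its items and
// finishes with ().
impl<T: Iterator> Generator for T {
    type Yield = T::Item;
    type Return = ();

    fn resume(&mut self) -> State<T::Item, ()> {
        match self.next() {
            Some(item) => State::Yielded(item),
            None => State::Complete(()),
        }
    }
}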

Edit: Additionally, your search term is quite misleading. If you search for Iterator for Pin without quotes, you get some results that match fully. It's even suggested on your search results page.

(NOT A CONTRIBUTION)

It's complicated.

First of all, the need for self-references in async code was immediately evident. But it didn't look like futures being hand-written with self-references: it looked like future combinators having to put all their state in Arc<Mutex<_>> to capture it in two different closures, functionally "holding it across the await point." I never see Iterator combinator code that looks like that.
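
For readers who never wrote futures 0.1 code, here is a std-only caricature of that pattern; the real code chained combinators rather than calling plain closures, but the shape is the same:

use std::sync::{Arc, Mutex};

// Two separate closures both need the same request state, so it gets wrapped
// in Arc<Mutex<_>> instead of simply living across an await point.
fn main() {
    let state = Arc::new(Mutex::new(Vec::<u8>::new()));

    let state_for_send = Arc::clone(&state);
    let send = move || {
        state_for_send.lock().unwrap().extend_from_slice(b"request body");
    };
    let receive = move || {
        println!("buffered {} bytes", state.lock().unwrap().len());
    };

    // In real combinator code these would be passed to chained `and_then`
    // calls; here we just invoke them in order.
    send();
    receive();
}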

On the other hand, it would probably look different for Iterator. The case I can most easily imagine is needing to collect an initial iterator into a vector for some reason and then iterate over that collection, which would likely end up self-referential if written as a straightforward generator.
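
Using the same hypothetical gen fn syntax as the sketch further down, that problematic version might look something like this (map_type_ref is a made-up by-reference variant of map_type):

gen fn process(iter: impl Iterator<Item = MyType>) -> MyOtherType {
    let mut v: Vec<MyType> = iter.collect();
    v.sort();
    for item in &v {
        // `item` borrows `v`, which lives inside the generator's own state,
        // so this is a self-reference held across the yield point.
        yield map_type_ref(item)
    }
}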

A solution for that might be to do something like this, to move the vec out of the generator:

gen fn process_vec(mut v: Vec<MyType>) -> MyOtherType {
    v.sort();
   for item in v {
       yield map_type(item)
   }
}

fn generator(iter: impl Iterator<Item = MyType>) -> impl Iterator<Item = MyOtherType> {
    process_vec(iter.collect())
}

I'm not sure if this would actually work or not. But it would be promising if there were patterns like this that solve the problem, and it would also be promising if the problem simply didn't come up much. Right now there's not enough information to make this decision.

One option would be to disallow self-references in non-async generators in 2024, and then if that ends up being the wrong choice, do the more disruptive fix to make iterators pinned in 2027.
