Pseudo RFC? Interleave Iterator

Gankra · July 23, 2014, 3:45pm

Hey all, I’ve been directed to move this here since it doesn’t need a full-blown RFC.

I’m adding an interleaving iterator to the set of core iterator adaptors. That is a_iter.interleave(b_iter) yields a1,b1,a2,b2,... until one terminates.

There’s no easy way to accomplish this with the current set of iterator adaptors, and there are enough subtle corner cases that I wouldn’t expect someone to trivially implement it correctly when they need it.

Off the top of my head, this is desirable for when something expects a single iterator, but you want to interleave the contents of multiple iterators. For instance, if you wanted to create a stream of loosely unsorted data, you could do something like sorted.interleave(sorted.rev())

I’ve actually fully implemented this in this pull request, this is sort of a retroactive RFC-ish thing to see if this is in fact desirable to have in libcore at all.

My biggest open question on the whole thing is that it’s possible to have interleave not terminate when any one of the source iterators terminates, but rather just output the rest of the remaining iterator. This would increase complexity, and is not clearly desirable. In this vein, Interleave could potentially be augmented to provide all kinds of combinations of behaviour, such as offsetting the splicing of the iterators (a1, a2, a3, b1, a4, b2, a5, b3, ...) or unbalanced usage of the sources (a1, a2, b1, a3, a4, b2...).

Would we want any of these in addition or instead of my current default behaviour?

cmr · July 23, 2014, 4:29pm

zip also doesn’t continue to yield after any of the iterators is exhausted, so I think it’s fine if interleave doesn’t. This seems like a fine adapter to me. Anything more complex than the simple interleave shouldn’t be in core, they seem incredibly niche.

pczarn · July 23, 2014, 7:26pm

I think there has to be a general approach that would fit the current set of iterator adaptors. Interleaving is flattening after transposition. Perhaps all we need is a new method, transpose.

// a, b, c: Vec<T>
a.iter().zip(b.iter()); // move_iter can't work on tuples or arrays...

(a.iter(), b.iter(), c.iter()).transpose()
                                  // -> Iterator<Iterator<&T>>
                              .flat_map(|it| it)
                                  // -> Iterator<&T>

transpose could be implemented for a trait IteratorTuple3<T, T, T> = Tuple3<Iterator<T>, Iterator<T>, Iterator<T>>.

lilyball · July 23, 2014, 7:33pm

I’m generally in favor of interleave. I hadn’t considered allowing it to yield the remaining elements after one iterator is exhausted. That actually may be useful. There have in fact been multiple times where I wish you could at least test what caused Zip to stop yielding elements, though I don’t know a good way to do that (short of having the iterator yield Option<(Option<A>, Option<B>)> which isn’t very good).

To allow such behavior, perhaps just a method .chain_remainder() that changes the behavior to only stop once both iterators are exhausted. A boolean flag would be sufficient for the next() impl, but the downside is this may significantly complicate the impls of the other methods. This could also yield a separate type entirely, which would leave the methods simpler but would still require writing a lot of code similar to the current code. Because of this, I think it’s reasonable to not do this now, but I wouldn’t rule out adding such a method down the road if someone has a good use for it.

bachm · July 23, 2014, 8:50pm

What I miss is the ability to create iterators in a flat_map that yield a (short) sequence of values. This makes it easy to write things like interleaving on the fly. zip, flat_map, then creating an iterator that converts the pair into a sequence of two values we’re there. The naive approach of creating a fixed-size array in the flat_map, then iterating over it doesn’t work due to lifetimes issues. I’m guessing that a move_iter off a fixed-size array is all that’s needed to make this work.

Gankra · July 23, 2014, 9:01pm

Note that somehow transforming the contents of a zip will never work here.

Given:

let a = [1,3,5,6]
let b = [2,4]

Zip will only yield 1-4, but interleave should yield 1-5 (as currently defined).

dobkeratops · July 23, 2014, 10:50pm

something related that would be nice IMO is “intersperse” e.g. pretty printing values interspersing a separator. (is there an existing elegant way of doing that?) eg something like [1,2,3,4].iter().map(|x|x.to_str()).intersperse(",").concatenate() == “1,2,3,4”

pczarn · July 23, 2014, 10:59pm

I like interleave without any additional behavior. I'm in favor of adding it to libcore.

Note that somehow transforming the contents of a zip will never work here.

Good point. But a zip might work after extending the second iterator...

a.zip(b.map(Some).chain(Some(None).move_iter())).flat_map(...)

something related that would be nice IMO is "intersperse"

Something like .interleave(Repeat::new(",")) should work... almost, producing "1,2,3,4," with the trailing comma.

Gankra · July 23, 2014, 11:14pm

If you have exact size, you can do it perfectly:

fn intersperse <T:Clone, I: ExactSize<T>> (it:I, seperator:T) -> Interleave<I, Take<Repeat<T>>>{
    let len = it.len();
    it.interleave(Repeat::new(seperator).take(len-1))
}

matthieum · July 26, 2014, 6:28pm

Actually, I think that the idea of intersperse being implemented in terms of interleave actually reveals why stopping whenever on of the two streams stop is so useful: it lets you use infinite streams.

However, I wonder if we need not consider the case of more than 2 streams, chaining interleave calls does not work, it seems:

a.iter().interleave(b.iter()).interleave(c.iter())

will produce a stream:

a0, c0, b0, c1, a1, c2, b2, c3, ...

which differs from what interleave(a.iter(), b.iter(), c.iter()) would produce:

a0, b0, c0, a1, b1, c1, a2, b2, c2, ...

Thus, I do wonder if interleave should be added to the iterator trait, or if it would be better to instead define a free function that would accept any number of iterators and produce an interleaving iterator out of them ? Of course, the biggest issue would be the “any number of iterators”.

Gankra · July 26, 2014, 6:44pm

Oh, interleave has been formally rejected from Rust (as of last night) for being not sufficiently useful to justify supporting it as part of std.

bluss · July 27, 2014, 4:21pm

Maybe we should make a separate cargo crate for extra iterator tools

I’ve got a cartesian product (IterA x IterB) implementation that would be useful sometimes, and macros for arbitrary product (IterA x IterB x … x IterZ), and an iterator comprehension macro.

Gankra · July 28, 2014, 6:04pm

Blake, if you want to make a library, you’re more than free to take the code I moved to: https://github.com/Gankro/rust/tree/interleave

licensed under Rust’s license, I guess.

bluss · July 28, 2014, 9:05pm

I’m working on the library now just for fun. I’m collecting all the iterator stuff I have in various files. I do have an interleave impl from 2013, and it’s very similar (except it runs both iterators to the end). Don’t be offended but I’m going to start with my code even for interleave. (Now pushed as rust-itertools)

I guess I’ll come back to your impl to look at the next_back impl.

Gankra · July 28, 2014, 9:27pm

No worries man, have fun. I only wrote this to learn.

pczarn · July 28, 2014, 9:42pm

@matthieum Exactly my thoughts, your interleave(a.iter(), b.iter(), c.iter()) would have been (a.iter(), b.iter(), c.iter()).interleave() in my example, because flattening zip/transpose has slightly different behavior, as pointed out.

I think remainder handling is the only distinction between intersperse / flattened zip / interleave / interleave with chain_remainder.

bjz · July 29, 2014, 1:42pm

Something like this would be super useful for things like fmt::Show implementations for collection-like things. I’ve really missed it in the past, for example:

impl<T: fmt::Show> fmt::Show for RingBuf<T> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        try!(write!(f, "["));
        for (i, e) in self.iter().enumerate() {
            if i != 0 { try!(write!(f, ", ")); }
            try!(write!(f, "{}", *e));
        }
        write!(f, "]")
    }
}

Stuff like this litter the standard libs. Ick.

lilyball · July 29, 2014, 7:49pm

Maybe we could have a std::io helper function that looks like

fn write_sequence<T: fmt::Show, It: Iterator<T>, W: Writer>(w: &mut W, mut it: It, sep: &str) -> IoResult {
    for (i, e) in it.enumerate() {
        if i != 0 { try!(write!(w, ", ")) }
        try!(write!(w, "{}", e));
    }
    Ok(())
}

This way you can say write_sequence(f, self.iter(), ", ") (this differs subtlety from your function in that it writes e instead of *e but &T should render the same as T; alternatively it could take an Iterator<&T> instead and show *e).

Although you still need to map IoResult to fmt::Result, but I guess that’s not hard.

lilyball · July 29, 2014, 8:34pm

That’s not to say that interleave wouldn’t be helpful here, though I don’t think it actually would work right for this. It requires both iterators have the same element type (whereas you want to mix arbitrary T: fmt::Show and &str elements), and assuming you interleave with an infinite cycle of ", " elements it would actually print a trailing ", " after the last element.

system · March 25, 2019, 8:22am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
[Pre-RFC] Replace IteratorExt::zip with tuple iteration ideas (deprecated)	25	5087	March 25, 2019
Pre-RFC: Add a chunk iterator to libcore libs	18	2543	January 13, 2022
[Pre-RFC] Iterator adaptor for handling intermediate Results libs	12	1029	August 26, 2019
Blog Post: Futures Concurrency III language design	2	924	May 16, 2022
Hints about primitive and standard type's functions implemented in another crates in std docs documentation	2	938	March 25, 2019

Pseudo RFC? Interleave Iterator

Related topics