[Pre-RFC] Replace IteratorExt::zip with tuple iteration

oli-obk · February 2, 2015, 3:55pm

Start Date: 2015-01-30
RFC PR #: (leave this empty)
Rust Issue #: (leave this empty)

Summary

Implement IntoIterator (since RFC 235 has landed) for the common tuple types and remove the zip function from Iterator.

Motivation

The zip function is convenient for iterating over two iterators at the same time. But when iterating over more iterators simultaneously causes unreadable code. A user might expect the following for loop to work:

for (x, y, z) in (1..10, 2..11, 3..13) {
    println!("{}", (x, y, z));
}

but instead is required to write

for ((x, y), z) in (1..10).zip(2..11).zip(3..12) {
    println!("{}, {}, {}", x, y, z);
}

Detailed Design

Remove IteratorExt::zip
replace std::iter::Zip by a struct and some macro-tuple-magic

Not quite finished implementation (+ all features zip had). WIP-implementation to be found at my repository. Will shout loudly at you because I haven’t figured out IntoIterator for references yet.

#[must_use = "iterator adaptors are lazy and do nothing unless consumed"]
pub struct TupleIterStruct<T> {
    inner : T
}

macro_rules! head {
    ($head:ident, $($tail:ident,)*) => {
        $head
    }
}

macro_rules! impl_ii_tuple {
    ( $($name:ident,)+) => (
        impl<$($name,)*> Iterator for TupleIterStruct<($($name,)*)>
            where $($name: Iterator,)*{
            type Item = ($(<$name as Iterator>::Item,)*);

            #[allow(non_snake_case)]
            #[inline]
            fn next(&mut self) -> Option<<Self as Iterator>::Item> {
                let ($(ref mut $name,)*) = self.inner;
                // lots of confusing brackets
                // Some -> tuple -> macro argument expansion -> if/else block
                Some(($(
                    if let Some(x) = $name.next() {
                        x
                    } else {
                        // WARNING: partial consume possible
                        // Zip worked the same.
                        return None;
                    }
                ,)*))
            }

            #[inline]
            fn size_hint(&self) -> (usize, Option<usize>) {
                let ($(ref mut $name,)*) = self.inner;
                $(let $name = $name.size_hint();)*

                let lower = head!($($name,)*).0;
                $(let lower = cmp::min($name.0, lower);)*

                let upper = head!($($name,)*).1;
                $(
                    let upper = match ($name.1, upper) {
                        (Some(x), Some(y)) => Some(cmp::min(x,y)),
                        (Some(x), None) => Some(x),
                        (None, Some(y)) => Some(y),
                        (None, None) => None
                    };
                )*

                (lower, upper)
            }
        }

        impl<$($name,)*> IntoIterator for ($($name,)*)
            where $($name : IntoIterator,)* {
            type Iter = TupleIterStruct<($(<$name as IntoIterator>::Iter,)*)>;
            #[allow(non_snake_case)]
            fn into_iter(self) -> <Self as IntoIterator>::Iter {
                let ($($name,)*) = self;
                TupleIterStruct {
                    inner : ($($name.into_iter(),)*)
                }
            }
        }

        impl<$($name,)*> ExactSizeIterator for TupleIterStruct<($($name,)*)>
            where $($name : ExactSizeIterator,)* {}

        impl<$($name,)*> DoubleEndedIterator for TupleIterStruct<($($name,)*)> where
            $($name: DoubleEndedIterator + ExactSizeIterator,)*
        {
            #[inline]
            fn next_back(&mut self) -> Option<<Self as Iterator>::Item> {
                let ($(ref mut $name,)*) = self.inner;
                let len = head!($($name,)*).len();
                $(let len = cmp::min($name.len(), len);)*
                $(
                    for _ in 0..$name.len() - len {$name.next_back(); }
                )*
                // lots of confusing brackets
                // Some -> tuple -> macro argument expansion -> if/else block
                Some(($(
                    if let Some(x) = $name.next_back() {
                        x
                    } else {
                        // WARNING: partial consume not possible here
                        // but code does not reflect that
                        return None;
                    }
                ,)*))
            }
        }
        impl<$($name,)*> RandomAccessIterator for TupleIterStruct<($($name,)*)> where
            $($name: RandomAccessIterator,)*
        {
            #[inline]
            fn indexable(&self) -> usize {
                let ($(ref $name,)*) = self.inner;
                $(let $name = $name.indexable();)*

                let lower = head!($($name,)*);
                $(let lower = cmp::min($name, lower);)*
                lower
            }

            #[inline]
            fn idx(&mut self, index: usize) -> Option<<Self as RandomAccessIterator>::Item> {
                let ($(ref mut $name,)*) = self.inner;
                // lots of confusing brackets
                // Some -> tuple -> macro argument expansion -> if/else block
                Some(($(
                    if let Some(x) = $name.idx(index) {
                        x
                    } else {
                        // WARNING: partial consume possible here
                        return None;
                    }
                ,)*))
            }
        }
    );
}

macro_rules! peel_ii_tuple {
    () => ();
    ($name:ident, $($other:ident,)*) => (
        impl_ii_tuple! { $name, $($other,)* }
        peel_ii_tuple! { $($other,)* }
    )
}

peel_ii_tuple! { T0, T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, }

#Alternatives# ##Keep zip## don’t change anything ##Extend zip## Extend zip to allow more than two items. ##impl Iterator for tuples## simple to implement (tested, works) Still requires .iter() and similar calls for the tuple elements.

#Unresolved questions# I have not thought about mixed move, ref and mut ref tuples.

japaric · February 2, 2015, 4:58pm

Nice, I have done something similar before but with a macro, which is not as nice, but can be implemented out of tree. Althought, I’ll like to see something like this in the stdlib, I’m not sure if core devs would stabilize something like this, because I think a feature like variadics generics would make it possible to do generic programming over tuples and would let us implement this in a cleaner way (for any arity); which would modify/deprecate the TupleIterStruct and I think that last part would be backwards-incompatible [1].

[1] Unless… we get abstract unboxed types before 1.0. Then, we could convert Iter into an “abstract” assoc type, that way the user would only know that into_iter() returns an iterator, but he won’t know what’s the concrete type. That way we don’t need to make TupleIterStruct public nor need to stabilize it. I know that @aturon wants this to avoid stabilizing/exposing lots of Iter structs, but I don’t if it’ll be implemented in time for 1.0.

TL;DR +1 from me, if this can be expanded to n-arity in a backward compatible way.

bluss · February 3, 2015, 12:21am

The “tuple-Zip” also exists in the itertools crate, I think it’s just like your implementation.

DanielKeep · February 3, 2015, 5:44am

A very big from me.

At first, I thought you meant that (1, 2, 3) should be iterable and produce a sequence 1, 2, 3. But this is just… completely unexpected black magic. In every other context, for e in s iterates over each element in a sequence. I can’t imagine that anyone would expect this to suddenly change to for e in s.flat_map_elements() for one specific kind of type in the language.

If you want this, just implement a FlatMapElements trait or something on tuples. Then it’s explicit.

oli-obk · February 3, 2015, 7:14am

phew... didn't think about that, as Zip<A, B> is already exposed. If negative trait bounds will come in the future, then only implementing Iterator for tuples of Iterators would be backwards compatible. In the future IntoIterator for all tuples whose elements implement IntoIterator but not Iterator (or where at least one element does not implement Iterator) could be implemented. That change would also get rid of the Zip type completely, since it does not require its own Iterator struct.

looks very similar, but has fewer traits forwarded and too many mutable variables for my taste. Thanks, will probably send some PRs that way

That would be impossible, the types could be completely different. Therefore I don't think anyone would ever expect that to happen.

So does this. The sequence is the zip of the tuple

DanielKeep · February 3, 2015, 9:04am

How would it be impossible for (i32, i32, i32)? In general, I concur, but it's incorrect to say it's impossible, period.

No, it doesn't. The elements of the sequence (a, b, c) are a, b, and c, not (the elements of a, the elements of b, and the elements of c) all concatenated.

It would be like doing v.iter() where v: Vec<&[u8]> and getting an Iterator<u8> out of it. Just because an operation isn't valid on every possible type doesn't mean those "holes" should be filled with unrelated functionality. Heck, having + mean sequence concatenation is bad enough as it is.

oli-obk · February 3, 2015, 9:22am

correction: it is impossible to check with generics, and we already have [i32, 3] for that.

I think i found our disagreement. It's not a disagreement but a misunderstanding: I don't want to iterate over a, then b, then c. I want to iterate over all of them simultaneously like zip does.

so the first iteration yields a tuple of the first elements of a, b and c. The second iteration yields a tuple of the second elements of a, b and c. ... and so on, until either a or b or c stop returning elements.

DanielKeep · February 3, 2015, 9:48am

You're correct in that I completely brain-farted on "zip".

However, I still believe this is bad for exactly the same reasons. So it's still a disagreement.

Gankra · February 3, 2015, 2:55pm

tuple-iteration would be cool, but surely it’s a discovery hazard (like a lot of other cool ideas)?

oli-obk · February 3, 2015, 4:02pm

what’s a discovery hazard?

tbu · February 3, 2015, 4:10pm

That it's hard to discover that tupling zips the iterators if you search for a way of zipping iterators.

oli-obk · February 3, 2015, 4:12pm

true… although I’d assume stack-overflow solves this after the question has been asked a few times

naicode · February 3, 2015, 5:33pm

I thing a possible solution would be to extend the iterator returned by zip with a additional method witch is like zip but includes the new item into the retuned tuples e.g.:

 for (x,y,z) in (1..10).zip(2..11).and(3..12) {
     //...
 }

(and is just a example name) note that I an not sure how feasible it is to implement this due to the need of generically Generating tuples preferable without macros. Neverless it should be possible.

Note that a (1..10, 2..11, 3..12) is not possible because it is a touple of iterators and iterating over a touple would (if it would be possible) lead to yielding each value respectivly, what might be possible is something like

 for (x,y,z) in (1..10, 2..11, 3..12).into_combined_iterator() {
    //...
 }

(You might have a shorter name)

Also note that changing the zip method on a iterator created by ziping to behave different Would be unsound. E.g if someone might have some iterator over tuples and then zip it the result should be independent whether or not the first iterator was created by e.g. iter. over a Vec of tuples or by zipping 2 iterators.

oli-obk · February 3, 2015, 7:00pm

chaining won't work without variadic generics or some very evil macro magic that doesn't implement and for the "longest" tuple.

oh it's quite possible, i implemented it already, it's just a question of backwards compatibility and convention (whether we want it or not).

yea that would be madness

netvl · February 3, 2015, 7:01pm

This is a very nice suggestion. Scala already does this, but only for tuples of size 2 and 3:

val x = Vector(1, 2, 3, 4)
val y = List(5, 6, 7, 8)
(x, y).zipped.map((x, y) => (x + y)/2).reduce(_ + _)

However, I do like that a decorator method should be called on a tuple in order to obtain something iterable better than just using tuples as an iterator.

netvl · February 3, 2015, 7:02pm

Oh, I see that @naicode already has suggested this, nice!

Gankra · February 3, 2015, 7:10pm

One can simply provide IntoIterator<(a, b)> for (A, B) where A: IntoIterator<a>, B: IntoIterator<b>

nikomatsakis · February 4, 2015, 1:19pm

I’ve often thought that foo.zip(bar) doesn’t read as nicely as I would like, so I kind of like this idea, or some variant thereof. Discoverability is definitely an objection, though we play similar games for collect() and I’m pretty happy with the result. Definitely iterators are a tool one has to learn to write idiomatic Rust.

I have to think about japaric’s concern about backwards compat, but it doesn’t worry me overly much. For one thing, i’d probably be happy just supporting a small number of arities (say 2-12, as usual). You can always build up arbitrary arity by using nested 2-tuples after all. If we do add variadic generics, then we could still use them to add support for arities from 13+ in the future. It’d be a bit uneven but meh.

glaebhoerl · February 5, 2015, 1:00am

Probably my Haskell showing, but my first instinct was to interpret this as a cartesian product ((1,2,3), (1,2,4), (1,2,5) ... (1,2,13), (1,3,3), (1,3,4), ... etc.) rather than a zip.

oli-obk · February 17, 2015, 12:11pm

rfc on github: https://github.com/rust-lang/rfcs/pull/870

Topic		Replies	Views
Non-truncating, more usable zip() language design	12	5304	March 25, 2019
Feature request: expand zip() and unzip() methods on iterator to N-tuple and N iterators libs	7	1850	November 26, 2023
Pre-RFC: Zip.pairmap	13	2976	March 25, 2019
Order of evaluation for zip iterator	9	1632	March 25, 2019
Pseudo RFC? Interleave Iterator libs	19	4000	March 25, 2019

[Pre-RFC] Replace IteratorExt::zip with tuple iteration

Summary

Motivation

Detailed Design

Related topics