What do you think about making a and b fields of std::iter::Zip public?

ouaicava · March 14, 2021, 5:50pm

The iterator zip stops returning elements once one of the two iterators is empty.

There is no easy way to consume the remaining elements of the longer iterator.

A solution to this would be making the a and b fields of the Zip struct public.

Another solution would be to add a rest method that keeps returning elements from the longer iterator. This could cause a problem with next_back.

What do you think about this?

steffahn · March 14, 2021, 6:10pm

You need to address the problem raised in the URLO thread you opened yesterday.

Just making the fields public doesn't solve the problem of skipping a value for the longer iterator, at least if iterator a is the longer one.
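To make the skipped value concrete, here is a minimal sketch using by_ref(). The helper name zip_and_leftovers is made up for this illustration, and it drives the Zip with a plain for loop so that every step goes through .next().

```rust
/// Zips `a` and `b` through `by_ref()` and reports what each side still
/// holds afterwards. (`zip_and_leftovers` is a hypothetical helper name.)
fn zip_and_leftovers(a: Vec<i32>, b: Vec<i32>) -> (Vec<(i32, i32)>, Vec<i32>, Vec<i32>) {
    let mut a = a.into_iter();
    let mut b = b.into_iter();
    let mut pairs = Vec::new();

    // `by_ref()` only lends the iterators to `zip`, so both are still
    // usable once the loop is done.
    for pair in a.by_ref().zip(b.by_ref()) {
        pairs.push(pair);
    }

    // Whatever was not consumed by the zip is collected here.
    (pairs, a.collect(), b.collect())
}
```

With a = [1, 2, 3, 4] and b = [10, 20], the pairs are (1, 10) and (2, 20), but a is left with only 4: the 3 was pulled out of a before Zip noticed that b was empty, and then dropped. With the lengths swapped, nothing is lost from b, because a is polled first.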

Of course it's feasible to extend Zip in a way that saves this otherwise forgotten value and offers some access to the remaining elements, e.g. through another iterator or some other kinds of methods like you proposed.

steffahn · March 14, 2021, 7:04pm

Another small problem is that not immediately dropping the first element of a on the last call to .next() would make a small but observable change to the behavior of Zip.

A bigger problem is the DoubleEndedIterator implementation. Currently, if you zip two DoubleEndedIterator + ExactSizeIterator iterators, the resulting Zip also supports DoubleEndedIterator operations such as next_back(). To do this, the implementation first trims the longer iterator on the first call to next_back so that both have the same length, with the goal that the iterators are still, logically, being “zipped from the front” (i.e. you really get the same list of values, just backwards, along with reversed side effects). The problem is that an iterator does not, in general, allow you to just “leave” the last n elements alone and save them for later. The only way to keep the rest/remainder of Zip after a call to next_back would be to store those elements in a new Vec or something similar. While caching a single element is reasonable, doing a whole, potentially large allocation “just in case it’s needed” for an uncommon use case seems very unreasonable.
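A small sketch of that trimming, using integer ranges because they are both DoubleEndedIterator and ExactSizeIterator. The function names are made up for this illustration.

```rust
/// Zips two ranges of different lengths and walks the result backwards.
/// (`zip_backwards` is a hypothetical helper name.)
fn zip_backwards() -> Vec<(i32, i32)> {
    // 0..5 has five elements, 10..13 has three; the pairs are the same
    // ones a front-to-back zip would produce, in reverse order.
    (0..5).zip(10..13).rev().collect()
}

/// Calls `next_back()` once on a borrowed zip and returns the pair together
/// with what is still left in the longer iterator afterwards.
/// (`trimmed_by_next_back` is a hypothetical helper name.)
fn trimmed_by_next_back() -> (Option<(i32, i32)>, Vec<i32>) {
    let mut a = 0..5;
    let mut b = 10..13;
    // The first `next_back()` discards the surplus tail of `a` before
    // pairing up the last elements.
    let last = a.by_ref().zip(b.by_ref()).next_back();
    (last, a.collect())
}
```

In the second function the pair is (2, 12) and a is left with 0 and 1; the elements 3 and 4 were consumed by the trimming step and cannot be recovered afterwards, which is exactly what a rest method would have to work around.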

Possible solutions include panicking or otherwise failing when a potential rest/remainder method is called after next_back has been called. Maybe the more reasonable approach is to just introduce an alternative to zip that offers such a rest method (and doesn’t implement DoubleEndedIterator), and maybe do this only in a third-party crate.
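One possible shape for such an alternative, as a rough sketch only: the type ZipWithRest and its rest method are invented here and exist in neither std nor (as far as this sketch assumes) any particular crate. It caches the one element that the plain Zip would drop and deliberately does not implement DoubleEndedIterator.

```rust
/// Hypothetical zip alternative that remembers the element pulled from `a`
/// when `b` turns out to be exhausted.
struct ZipWithRest<A: Iterator, B: Iterator> {
    a: A,
    b: B,
    // Element already taken from `a` that could not be paired.
    leftover: Option<A::Item>,
}

impl<A: Iterator, B: Iterator> ZipWithRest<A, B> {
    fn new(a: A, b: B) -> Self {
        ZipWithRest { a, b, leftover: None }
    }

    /// Consumes the adapter and returns the unconsumed part of each side,
    /// with the cached element put back in front of `a`.
    fn rest(self) -> (impl Iterator<Item = A::Item>, B) {
        (self.leftover.into_iter().chain(self.a), self.b)
    }
}

impl<A: Iterator, B: Iterator> Iterator for ZipWithRest<A, B> {
    type Item = (A::Item, B::Item);

    fn next(&mut self) -> Option<Self::Item> {
        // Reuse a cached element before pulling a new one from `a`.
        let x = match self.leftover.take() {
            Some(x) => x,
            None => self.a.next()?,
        };
        match self.b.next() {
            Some(y) => Some((x, y)),
            None => {
                // Keep `x` instead of dropping it.
                self.leftover = Some(x);
                None
            }
        }
    }
}
```

Note that this changes when the unpaired element is dropped, which is the small observable difference from Zip mentioned above.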

scottmcm · March 14, 2021, 7:27pm

My first instinct here is that I can't think of any other iterator adapter with a public field, so I'm skeptical. Maybe a method to expose something (like the slice iterators have methods to give out slices), but probably not a field.

But broader than that, it makes me wonder if Zip is even the right thing for this use case. Can you describe why you want this? Are the iterators ExactSizeIterator (ESI)? Could it be better done with zip_longest? Etc.
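For readers who have not seen zip_longest: the itertools crate provides it, yielding an EitherOrBoth value per step. The following is only a std-only sketch of the same idea; the Pair enum and the zip_longest_sketch function are stand-ins invented here, not the itertools API.

```rust
/// Hypothetical stand-in for the value type a "zip longest" yields.
#[derive(Debug, PartialEq)]
enum Pair<A, B> {
    Both(A, B),
    Left(A),
    Right(B),
}

/// Pairs elements while both sides have some, then keeps going with
/// whichever side is longer. (`zip_longest_sketch` is a made-up name.)
fn zip_longest_sketch<A, B>(a: A, b: B) -> Vec<Pair<A::Item, B::Item>>
where
    A: IntoIterator,
    B: IntoIterator,
{
    // `fuse()` makes it safe to keep polling a side after it returned `None`.
    let mut a = a.into_iter().fuse();
    let mut b = b.into_iter().fuse();
    let mut out = Vec::new();
    loop {
        match (a.next(), b.next()) {
            (Some(x), Some(y)) => out.push(Pair::Both(x, y)),
            (Some(x), None) => out.push(Pair::Left(x)),
            (None, Some(y)) => out.push(Pair::Right(y)),
            (None, None) => break,
        }
    }
    out
}
```

No element is ever dropped here, because an unpaired value is returned as Left or Right instead of being discarded.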

H2CO3 · March 14, 2021, 7:38pm

Do also note that it's not an accident the fields are private. In general, types that need to uphold their internal invariants can't just expose fields as public, because anyone could then mutate those fields in a way that would cause the state of the type to be inconsistent.

jhpratt · March 14, 2021, 7:53pm

^cough use case for read only fields

H2CO3 · March 14, 2021, 8:34pm

Meh. You could just write a getter for that. But more importantly, what do you do with a general read-only iterator? You can't even .count() it unless you own the whole Zip.

cuviper · March 15, 2021, 12:49am

I fear what public fields would mean for all the unsafe impl and specialization used by Zip. I don't know any specific way that would be a problem at the moment, but the risk would be increased.

steffahn · March 15, 2021, 1:25am

Oh, I’m certain that it’s unsound to make the fields public with the current Zip implementation. For example, look at Vec::IntoIter’s implementation of __iterator_get_unchecked and how Zip uses it and think about the lovely ways it could read beyond the end of the vector’s allocation if you threw a few direct calls to .next() (increasing self.ptr) on Zip’s fields into the mix if they were public.

2e71828 · March 15, 2021, 9:10am

An alternative would be an into_inner(self: Zip<A,B>)->(A,B) method to get the unconsumed portion of the original iterators back. There's no danger of violating Zip's invariants if the Zip object doesn't exist anymore.
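A minimal sketch of what that could look like, on a stand-in type, since methods cannot be added to std::iter::Zip from outside std. SimpleZip is hypothetical and has none of the unsafe specializations of the real Zip.

```rust
/// Hypothetical minimal zip adapter with private fields
/// (not the real `std::iter::Zip`).
struct SimpleZip<A, B> {
    a: A,
    b: B,
}

impl<A: Iterator, B: Iterator> SimpleZip<A, B> {
    fn new(a: A, b: B) -> Self {
        SimpleZip { a, b }
    }

    /// Consumes the adapter and hands back both inner iterators.
    /// Since `self` is gone afterwards, no invariant of the adapter
    /// can be observed in a broken state.
    fn into_inner(self) -> (A, B) {
        (self.a, self.b)
    }
}

impl<A: Iterator, B: Iterator> Iterator for SimpleZip<A, B> {
    type Item = (A::Item, B::Item);

    fn next(&mut self) -> Option<Self::Item> {
        // Same polling order as described above: `a` first, then `b`.
        let x = self.a.next()?;
        let y = self.b.next()?;
        Some((x, y))
    }
}
```

As with by_ref, this alone does not recover an element of a that was already pulled and dropped when b ran out first.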

matklad · March 15, 2021, 9:14am

I believe this is handled by the by_ref method? (modulo the issue that one element is skipped anyway).

2e71828 · March 15, 2021, 9:24am

by_ref serves a similar purpose, but must be called before creating the Zip. It can't be used to destructure a Zip that was received as an argument or returned from a function.

