Applying `take()` on an infinite iterator should make it exactsizediterator is related.
Take<Repeat>>
already implements it, so Take<Cycle
should as well.
Applying `take()` on an infinite iterator should make it exactsizediterator is related.
Take<Repeat>>
already implements it, so Take<Cycle
should as well.
Take<Cycle>
returns 0 elements if the inner iterator has 0 elements, even when you try to take more.
I guess it would still work for Take<Cycle<impl ExactSizeIterator>>
.
Unless the Clone
implementation doesn't actually produce an equivalent iterator. However, that does already break Cycle
's expected behaviour.
playground (defining an I: ExactSizeIterator
so that Cycle<I>
produces a handful of elements, but no more)
“Evil” implementations don’t matter, as none of this is relevant beyond logic errors. Also, ExactSizeIterator
only requires (not for soundness, but for correctness) that .size_hint()
returns a result of the form (l, Some(l))
, where lower and upper bound match and upper bound is not None
. Everything else (i.e. the bounds being correct themselves) is already a requirement (not for soundness, but for correctness) of Iterator
itself.
And looking at the .size_hint()
implementation of Cycle<I>
already
pub struct Cycle<I> {
orig: I,
iter: I,
}
impl<I> Iterator for Cycle<I>
where
I: Clone + Iterator,
{
fn size_hint(&self) -> (usize, Option<usize>) {
// the cycle iterator is either empty or infinite
match self.orig.size_hint() {
sz @ (0, Some(0)) => sz,
(0, _) => (0, None),
_ => (usize::MAX, None),
}
}
…
}
this implementation does:
(0, Some(0))
when the original iterator (passed to Iterator::cycle
) returns size (0, Some(0))
(usize::MAX, None)
when the original iterator returns some size (l, u)
with lower bound strictly greater than 0
, i.e. l > 0
(0, None)
in the remaining case, lower-bound is 0
, upper-bound is non-0
For a correct ExactSizeIterator
inner iterator, the last case should never happen, so this boils down to
(0, Some(0))
when the original iterator has .len() == 0
(usize::MAX, None)
when the original iterator has .len() > 0
Now, wrap with Take
:
pub struct Take<I> {
iter: I,
n: usize,
}
impl<I> Iterator for Take<I>
where
I: Iterator,
{
fn size_hint(&self) -> (usize, Option<usize>) {
if self.n == 0 {
return (0, Some(0));
}
let (lower, upper) = self.iter.size_hint();
let lower = cmp::min(lower, self.n);
let upper = match upper {
Some(x) if x < self.n => Some(x),
_ => Some(self.n),
};
(lower, upper)
}
…
}
whose size_hint
does essentially simply
(min(n, l), min(n, r))
for an inner iterator size-hint of (l, r)
if we interpret r == None
like r == ∞
n == 0
because it saves the inner size_hint
callNone
case manually, since the Option
API does not offer a way to interpret None
like some kind of ∞
Applying this to the two possible outcomes of Cycle<impl ExactSizeIterator>
:
Take
always maps (0, Some(0))
to (0, Some(0)
cmp::min(lower, self.n)
is always 0
for lower == 0
match upper …
for upper == Some(0)
either results in Some(0)
when 0 < self.n
or in Some(self.n)
if self.n == 0
, but that’s also Some(0)
Take
always maps (usize::MAX, None)
to (self.n, Some(self.n))
self.n == 0
case, so (0, Some(0)
is (self.n, Some(self.n))
cmp::min(lower, self.n)
is always self.n
for lower == usize::MAX
match upper …
case for upper == None
takes _ =>
branch for None
producing Some(self.n)
Combining these facts characterizes the full .size_hint()
behavior of Take<Cycle<I>>
for (correct) contained I: ExactSizeIterator
, namely:
(0, Some(0))
when the original iterator has .len() == 0
(self.n, Some(self.n))
when the original iterator has .len() > 0
all that really matters to us though is simply that both cases do have the form (l, Some(l))
i.e. lower and upper bound are the same and upper bound is not None
.
The assumption that a correct “Clone
” should preserve the size_hint
is already relied upon for the correct implementation of size_hint
of Cycle
.
It looks like this assumption isn’t properly documented, as far as I could tell; perhaps someone should add this to the docs of fn size_hint
? Or perhaps on the whole trait or module; I believe the common practical assumption, with API such as
Cycle
is that .clone()
ing an iterator should generally return another iterator that will produce the “same” list of items.
It’s certainly something that can be relevant for API design, e.g. a receiver iterator for a mpmc
channel should not implement Clone
because what you can get (by sharing or cloning a Receiver
) is two iterators that share the load of items (in the sense that one receives exactly those items the other doesn’t receive). Whereas I suppose a broadcast channel could consider this[1].
depending on the exact flavor of broadcast channel implementation ↩︎
Great explanation! Very well thought out. It sounds like we can add
impl<T: ExactSizedIterator> ExactSizedIterator for Take<Cycle<T>>