Should we implement ExactSizeIterator for Chain?

// Return none if the iterator does not contain at least 3 items, 
// and an iterator contains exactly 3 items otherwise
fn take_exact_3<T>(i: &mut impl Iterator<Item = T>) 
    -> Option<impl ExactSizeIterator<Item = T>> {
    let t1 = i.next()?;
    let t2 = i.next()?;
    let t3 = i.next()?;
    Some(once(t1).chain(once(t2)).chain(once(t3)))
}

The above didn't compile because Chain<T1, T2> didn't implement ExactSizeIterator even when T1 and T1 both implements it.

In theory though, if T1 and T2 all have exact sizes, the chained iterator should also have exact size.

Is this a missing piece of the standard library? I am looking forward to write a generic macro for those take_exact_n functions.

I believe the main reason it doesn't have this already is that adding two lengths may overflow.

It is a bit weird since we can create ExactSizeIterator with 0 or 1 items, but we couldn't even create one that have exactly 2/3 items unless defining a newtype.

In general though, rust does not prevent this kind of errors: if you access an Vec with out-of-bound index you panic. So I don't know why we couldn't do the same here - if the user access the len method and it is overflowed, just let it panic.

Interestingly, a.chain(b).take(usize::MAX) would be able to be ExactSizeIterator but that would require all kinds of specialization. I wonder if the precedent here is enough to introduce this as its own concept?

You can go a lot further with all kinds of logic like this.

For example this one is always of length 10, no matter what kind of iterator i is:

once(x).chain(i).cycle().take(10)

Edit: This example would be particularly hard to archieve based on the type system alone since the type of the iterator that once produces also allows for the iterator to be empty (after next was called).

That's true in theory, where we have infinite-precision integers. But it's not true in practice where we only have usize.

This is the very nuanced difference between the rules for ExactSizeIterator -- where the len must be exact -- and TrustedLen -- where the size hint is allowed to be infinite -- which is why their rules for various adapters are different.

2 Likes

Is panicking acceptable in such case? Would presence of panic negatively affect optimizations?

Maybe it could saturate? Behaving like .take(usize::MAX) shouldn't be limiting for practical applications, since it's already problematic to process this many items on 32-bit machines, and outright impossible on 64-bit.

And it can never overflow with any collections that are actually in memory, because by definition there's not enough address space for two collections to be that big.

I'm going to lazily reply with links... :sweat_smile:

4 Likes

This response makes me think that Rust would benefit from having a technical FAQ.

6 Likes

Well, I bet a PR to describe this in Chain's docs would be welcome.

3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.