Pre-RFC: Add a chunk iterator to libcore

I'm drafting up an RFC to add an iterator that groups elements it receives into arrays and then returns those arrays. Could I add anything to this RFC?

1 Like

Perhaps the new method should be called chunks_exact, because it behaves like slice.chunks_exact() rather than like slice.chunks().

It could also use a method like ChunksExact::remainder to access the left-over elements.

  • remainder is definitely a concern, and it's probably the most complicated part of the API:
    • What should it return?
      • a (possibly mutable) slice?
      • some kinds of ArrayVec?
      • an iterator over owned items? Could this be an already advanced array::IntoIter?
    • Even if we ignored it, it can still affect other methods
      • For example implementing DoubleEndedIterator should require the underlying iterator to implement DoubleEndedIterator, otherwise calling .next_back() would yield different chunks than .next()
  • Do we really need ChunkBuffer? Couldn't this reuse collect_into_array?
  • N possibly being 0 is also a concern. Note that this is currently a blocker for stabilizing the array_chunks feature.

One thing you might consider is how you're going to get the chunks, and whether that's well optimized today. Even on a simple iterator like a slice one, ends up having poor codegen.

So consider whether a first step here might be a next_chunk method on Iterator so that different implementations have have it behave more efficiently that the default try_fold-based one would.

(Or maybe you've found a smarter implementation than I could for it and none of this is a concern.)

I guess, but it doesn't really behave like chunks_exact either, since it returns arrays instead of slices.

I've added remainder and remainder_mut as methods. They return a slice and a mutable slice, respectively.

ChunkBuffer provides functionality that is now used in the remainder method, as well as making folding/try-folding much easier.

I've added a check to the chunks() method that panics if N is 0, since that's what array_chunks currently does.

There's probably a better way of using ChunkBuffer, I agree. I wonder if using InPlaceIterable would help at all, like it does in Zip.

It is quite unfortunate that they don't give owned access to those items. I would at least mention this.

I would still consider extending collect_into_array to allow those functionalities, or maybe a potential ArrayVec, whenever that will be introduced. Anyway, even if it doesn't end up like this, I think this is something that should be discussed, so I would put it under a "possible concerns" or "unresolved questions" section.

Note that this is not supposed to be the correct/final behaviour, but you make it look like it is. I would mention that it will temporarily panic if N is 0, but it is supposed to give a compile time error and that this should be a blocker for stabilization.

Zip doesn't use InPlaceIterable, that trait is used for in place collection. You might be thinking of TrustedRandomAccess. I think it might be worth specializing on TrustedLen and/or TrustedRandomAccess. The former might be more supported, but the latter may enable more optimizations.

By the way the proposed implementation of advance_by is wrong, it discards the remainder. The implementation of DoubleEndedIterator also ignores the remainder.

I've added an into_remainder method to Chunks that gives the remainder in the form of an owned iterator.

I've noted that in the unresolved questions section.

Also noted this.

Chunks already specializes on TrustedLen.

Could you elaborate on this? How could I fix it so that it doesn't ignore the remainder?

I didn't mean to implement TrustedLen on Chunks, but to specialize Chunks::next for iterators that implement TrustedLen, this way it doesn't need to do a check for each item, you can just check that the number of remaining items is bigger than the chunk size and then you can just repeatedly call .next().unwrap_unchecked(). Similar for TrustedRandomAccess, except in that case you can call __iterator_get_unchecked.

For DoubledEndedIterator I think you have to require that the underlying iterator implements ExactSizeIterator, then use its size hint to know how many items will be in the remainder. Then you can first take the remainder and then call .next_back() N times.

For advance_by I think it may be a bit more complex. In the worst case you could keep the default implementation.

Yeah I see that now. I've reworked the proposal as a whole to accommodate this.

I still think advance_by's implementation is wrong. For example I would expect this to succeed, instead the assertion after the advance_by fails:

fn test_advance_by() {
    let elems = &[1, 2, 3, 4, 5, 6, 7];

    // Manually advance 4 times
    let mut chunks = elems.iter().copied().chunks::<2>();;;;;
    assert_eq!(chunks.remainder(), &[7]);

    // Use advance_by(4)
    let mut chunks = elems.iter().copied().chunks::<2>();
    assert_eq!(chunks.remainder(), &[7]);

I see. I've removed the optimized advance_by for the time being until I can think of a better solution.