[crazy] Replace Most of the Collections API with ranges and iterators


#1

This is not particularly well-thought-out, and requires a lot of stuff we don’t have, but it’s super cool.

Summary

Make remove, get, and get_mut accept ranges, and insert accept IntoIterators. Deprecate iter, iter_mut, clear, drain, and, truncate in favour of these.

That is:

  • foo.clear(); -> foo.remove(..);
  • foo.truncate(a); -> foo.remove(a..);
  • let iter = foo.drain(); -> let iter = foo.remove(..);
  • let iter = foo.iter(); -> let iter = foo.get(..);
  • let iter = foo.iter_mut(); -> foo.get_mut(..);

Additional interesting patterns include:

  • foo.insert(i, some_iterable) inserts all of the elements of the iterable at the given index. This suggests some simple patterns like foo.insert(i, vec![1,2,3,4]), and some weird patterns like inserting an Option unconditionally to “maybe” insert.
  • foo.remove(a..b) gets a drain over an arbitrary subrange, performing the back-shift and length change only when the iterator is dropped.
  • foo.get_mut(a..b) get an arbitrary subrange iterator. Combined with insert this allows arbitrary subranges to be replaced with arbitrary subranges effeciently and cleanly.

Note that remove, insert, and get would continue to provide the same behaviour for integer indexing. This implies that we add some kind of trait for each of these with associated returns types, and implement them for the appropriate inputs, just like Index.

Also note that things like HashMap would not be able to accept arbitrary subranges. Only indices and Full ranges. Also Sorted collections still would like to have range methods for all the inclusive/exclusive combinations.

An interesting side-effect is that remove(i...i).next() is a “non-panicking” remove on a single index (here using a hypothetical inclusive range syntax with 3 dots).

As far as I can tell, there should be no perf regressions by moving to this more condensed iterator-returning form, as e.g. truncate and clear need to iterate the collection anyway to drop the elements, and the different variants should be statically dispatched to.

Requirements

Good to Have


#2

That’s kind of a non-starter, isn’t it? Will we see that before 1.0?


#3

Yeah this is “far future”. Would basically be a 2.0 API.


#4

I have a feeling there should be a trait like:

trait IsRange { // only because `Range` is already taken
    type Index;
    fn start(&self) -> Option<&Index>; // just to support BigNums, I guess
    fn end(&self) -> Option<&Index>;   // alternately, `Option<Index>` and require `Index: Clone` on impls?
}

impl<T> IsRange for {Range,RangeFrom,RangeTo,FullRange}<T> {
    type Index = T;
    /* ... methods in the obvious way ... */
}

Perhaps this would help reduce duplication for collection methods which can work with arbitrary subranges?

Would the behavior of methods with scalar indexes be the same as that of the range methods where the range contains exactly that single index? (I.e., does foo.method(i) = foo.method(i...i)?) For one thing: I think it would cause surprises if this weren’t to hold. Conversely, if it does, then perhaps duplication could be further reduced by having impls of IsRange for the various scalar number types, returning Some(self) from both start() and end(). I’m not sure whether or not this would actually be a good idea.


#5

We discussed a Range trait (basically everything that you wrote), but I think we want to have a different return type for indices and ranges (value vs Iterator), and potentially even a different return type for a.. vs a..b. The fact that i...i returns a different result than i seems quite reasonable to me.

This trait might be good for internal use, though. Basically all the specializations defer to a single get_internal<R: Range> method, and newtype/mangle the result accordingly.


#6

But note that such a generalization could be added backwards-compatibly, since it just changes concrete methods to use generics. (Of course, the elimination of the “redundant” methods would require deprecation, but personally I don’t mind having some overlap with convenience methods in general.)


#7

Oh, one thing I forgot to mention: this gives a cleaner story for deques, because you want e.g. truncate_back and truncate_front, which are just remove(a..) and remove(..len.saturating_sub(a))


#8

I don’t think it can be introduced in a backward-compatible fashion, because currently the type inference can determine that an argument to a get is a uint, but cannot do so after this change.

For example, I’m not sure even the most simple form .get(0) will continue to work.


#9

With default type parameter and inference fallback (https://github.com/rust-lang/rfcs/pull/213) I think we can recover this behavior. (Also, in this case the int fallback would trigger.)


#10

Is the I: for<'a> IterableTrait<'a> thing enough with regards to higher-kinded lifetimes?


#11

Probably, it would be better, if such operations were methods of view objects:

foo.clear() -> foo.view().clear() foo.insert(i, iter) -> foo.view().skip(i).insert_front(iter) foo.retain(pred) -> foo.view().filter(!pred).clear()

Similar API exists in Java collections library: List.subList returns a List which is a mutable view of original list. You may clear, sort, modify it whichever way you want, and modifiations will be reflected in the original list. (However, it is rarely used in practice).


#12

I really like the proposal overall but this might be a little problematic. The following looks like it shouldn’t panic but it can depending on the definition of some_fn:

v.insert(0, some_fn());
do_something(v[0]); // might panic!

How about a splice(offset, values) method? This way, insert and push always increase the size of the vector and splice and extend may increase the size of the vector.

Actually, splice could be implemented over a range as well and act like JavaScript’s splice (remove the specified range of elements, insert elements from the passed iterator, and return the removed elements in an iterator): fn splice(&mut self, range: Range<usize>, items: IntoIterator<T>) -> Iterator<T>. This could even avoid unnecessary allocations by lazily replacing items in the collection as the returned iterator is consumed. Unfortunately, the returned iterator would have to implement drop to insert/remove any remaining items.