Fallible alternative to Extend

The core::iter::Extend trait is defined as:

pub trait Extend<A> {
    fn extend<T: IntoIterator<Item = A>>(&mut self, iter: T);
}

That's fine for containers like alloc::vec::Vec. The problem is when this trait is impl'd on fixed-size containers like heapless::Vec or tinyvec::ArrayVec. In either case, they panic if there's insufficient space in their underlying buffer.

What'd be nice if if there were a fallible alternative to this trait, impl'd on alloc::vec::Vec, which could be used to abstract over it versus the heapless/fixed-sized types. Something like this:

pub trait TryExtend<A> {
    type Error;
    
    fn try_extend<T>(&mut self, iter: T) -> Result<(), Self::Error>
    where T: IntoIterator<Item = A>;
}

This would simplify writing code that's abstract over these different container types, allowing the use of heapless::Vec or tinyvec::ArrayVec on no_std, or alloc::vec::Vec with liballoc, but also ensuring that the out-of-buffer situation is panic-free in all cases.

Here's an example of such a trait being used internally within one of my crates.

8 Likes

An alternative to fallible extension is simply stopping when the container is full which covers at least the fixed-size containers as well as adding to a Vec without reallocation. Here is that idea as a crate: https://crates.io/crates/fill (Note that you might treat Option and Result as very simple limited containers in that sense).

What are the added benefits of placing such a trait in the standard library?

Usage example:

use fill::Fill;
let mut option = None;

let spilled = option.fill_and_keep_tail(42..);
assert_eq!(option, Some(42));
assert_eq!(spilled.start, 43);
2 Likes

It's certainly the sort of thing that can (and probably should) start in an external crate like fill.

As for the reason for including it in libcore: I think something like TryExtend be a nice corollary to Extend in the same way TryFrom/TryInto share a similar relationship with From/Into. But I can see how others might just want a crate for this.

2 Likes

The problem with that is a silent failure when trying to extend a container that's filled to capacity. Silent failures in general are a recipe for disaster.

1 Like

It's not silent, it can quite easily return the iterator and allows you to inspect it to deal with remaining elements. In fact, the potential to also poll an iterator via a reference without consuming it allows a great deal of freedom with how exactly the filled state is handled:

  • fn fill<I>(&mut self, iter: I). The base case, all other can be implemented in its terms. Return nothing, for cases where exhaustion is not a failure case but expected. For example, simply filling the remaining allocated but uninitialized portion of a Vector with some default representation has no fail case and is then simply
    vector.fill(iter::repeat_with(Default::default))
    
    The iterator is also dropped so any logic will be executed, such as emptying a drain and dropping elements. This is not completely silent in any case.
  • fn fill_and_keep_tail<I>(&mut self, iter: I) -> I::IntoIter. Simply return the remaining portion of the iterator, so that you can use your own recovery strategy. Useful if you want to use some less efficient algorithm for all overflowing elements for example.
  • fn fill_count<I>(&mut self, iter: I) -> usize . Just count how many elements were inserted.
  • fn checked_fill<I>(&mut self, iter: I) -> Option<I::IntoIter> . Returns Some(_) if the insertion stopped from a full container and None if it stopped from the iterator being drained.
9 Likes

Suppose we insert 40 elements using x.try_extend(y), and it failed at item 15. What should happen to the 15 elements added to x (rollback? keep them in x?). How to recover the remaining 25 elements?

In the implementations I've done so far, they're kept in x.

I can imagine ways of making this all-or-nothing, possibly in conjunction with some sort of specialization for ExactSizeIterator (which could potentially be generic if there were a TryReserve trait of some sort).

With the API specified above, they'd be lost. You could imagine a slightly different API that doesn't consume the iterator, though:

pub trait TryExtend<A> {
    type Error;
    
    fn try_extend<T>(&mut self, iter: &mut T) -> Result<(), Self::Error>
    where T: Iterator<Item = A>;
}

An API like that would allow you to retain T in the event the underlying buffer lacks the capacity to store its entire contents.

Unecessary, we have impl Iterator for &'_ mut impl Iterator. In fact, in case it was not clear enough, the method fn fill<I>(&mut self, iter: I) enables all other proposed signatures and these methods can be implemented in its terms. I would expect similar possibilities are available when returning a Result<(), Error> instead.

But the rollback question makes it even less appealing to use a result for this operation, in my opinion. The failure case has already performed a meaningful operation on its target. I would rather expect a try_ method to be available only on containers that know their exact capacity, operate on ExactSizeIterator and then check the condition before performing the mutation.

2 Likes

For what it's worth, I prototyped a crate called collectable with a number of different traits for fallible collection interactions. I released a version of it which consumes the iterators, then went back and changed the signatures to borrow the underlying iterators instead (and therefore be bounded by Iterator instead of IntoIterator per above).

I was able to add a TryFromIterator and a corresponding TryCollect extension trait to Iterator both implemented in terms of blanket impls (for T: TryExtend in the case of the former).

This might be a reasonable place to start. I'll have to look into all the use cases I want this API for are covered by ExactSizeIterators or not though.

I think the implementor can decide the precise semantics. The goal & default would be to keep elements one-by-one (they succeed one-by-one).

Elements can always be recovered in such a scenario - even if .extend() takes an IntoIterator, you can always pass it a &mut I where I is an iterator and have it available after the method returns. (Fusing caveat maybe, but it shouldn't apply if we return with an error?)

Aah, good point!

To note: Extend for ArrayVec already does implement "extend only to available space".

2 Likes

Yeah that's a very old choice to do so, not sure the API should be that way, but we don't have any good error handling choices for Extend (see this topic :slight_smile: )

1 Like

On a slightly different note, the TryExtend trait proposed here commits the same mistake as the Extend trait with regard to the receiver: it forces &mut self.

It's relatively natural to expect that the receiver will be &mut self, as that's the common case, however it leaves on the table all the shared collections, such as concurrent collections, which is problematic.

It would seem better, instead, to specify the receiver as self and let the user implement the trait for &mut Collection or &Collection as is suitable for their usecase.

That would be much harder to use, it's better to take &mut self, and then implement the trait for &Collection if possible. This keeps the common case simple and the general case possible.

3 Likes

It seems like this broadly falls under the header of fallible collection allocation (https://github.com/rust-lang/rfcs/blob/master/text/2116-alloc-me-maybe.md, https://github.com/rust-lang/rust/issues/48043, others) except as a slightly special case (allocation always fails!). I think all the same reasoning applies though.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.