What do you think about Iterator::single?

Using C#'s LINQ at work, I like its Single method: if there is exactly one item in the iterator, it is returned; otherwise an exception is thrown.

What do you think of an equivalent in Rust? An Iterator method like:

fn single(self) -> Option<Self::Item> { /* ... */ }

It could also be like:

fn single(self) -> Result<Self::Item, IteratorSingleError> { /* ... */ }

enum IteratorSingleError {
    MoreThanOneItem,
    EmptyIterator,
}

But I am not sure if it is useful. It seems too verbose.
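For reference, here is a minimal sketch of the Result-returning variant as an extension trait. The trait name SingleExt and the blanket impl are assumptions made for illustration; nothing like this exists in std.

```rust
/// Why `single` failed. Mirrors the enum proposed in this post.
#[derive(Debug, PartialEq, Eq)]
pub enum IteratorSingleError {
    MoreThanOneItem,
    EmptyIterator,
}

/// Hypothetical extension trait; std has no such method.
pub trait SingleExt: Iterator + Sized {
    /// Returns the only item, or an error saying why there is not exactly one.
    fn single(mut self) -> Result<Self::Item, IteratorSingleError> {
        let first = self.next().ok_or(IteratorSingleError::EmptyIterator)?;
        match self.next() {
            None => Ok(first),
            Some(_) => Err(IteratorSingleError::MoreThanOneItem),
        }
    }
}

// Blanket impl so every iterator gets the method.
impl<I: Iterator> SingleExt for I {}
```

With the trait in scope, `vec![7].into_iter().single()` gives `Ok(7)`, while an empty or two-element vector gives the matching error variant.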

I’m curious about the conditions under which one would even take an iterator expecting one element… What problem are we solving where we’re trying to create an iterator with one item and then throwing the iterator away by pulling the item out? Why not just create the item without any iterator?

One might not be in control of the value source. It could be a stream, for example. There might also be multiple values involved in the whole process, but it ends with "and then get exactly one more item" for it to be correct.

I don't think I've personally come across enough usage for it to be a bother to not have it inside std. When I come across the situations I usually want some more custom behavior, so writing the "check the rest of the iterator" part isn't much of a burden.

However, if I were to use a std provided version, I'd probably prefer it to return a Result with an error value that provides me with a new iterator that I can use to inspect the wrong set of values for error reporting or fallbacks and such.

For example, that could look something like:

enum IteratorSingleError<I> {
    Empty,
    Many(IteratorSingleErrorIterator<I>),
}

and have IteratorSingleErrorIterator (I know, awful name) contain the first item, next item, and rest iterator, combining them into a full iterator again.

Maybe this would be a candidate for an itertools addition?
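A rough sketch of that idea, assuming a free function named single and using a plain Chain over a two-element array in place of a dedicated IteratorSingleErrorIterator type (both choices are mine, just to keep the example short):

```rust
use std::array;
use std::iter::Chain;

/// Error for a hypothetical `single` that hands the items back on failure.
pub enum IteratorSingleError<I: Iterator> {
    Empty,
    /// Replays the two items already pulled, then the rest of `I`.
    Many(Chain<array::IntoIter<I::Item, 2>, I>),
}

pub fn single<I: Iterator>(mut iter: I) -> Result<I::Item, IteratorSingleError<I>> {
    let first = match iter.next() {
        Some(item) => item,
        None => return Err(IteratorSingleError::Empty),
    };
    match iter.next() {
        None => Ok(first),
        Some(second) => {
            // Put the two consumed items back in front of the remaining iterator.
            let all = IntoIterator::into_iter([first, second]).chain(iter);
            Err(IteratorSingleError::Many(all))
        }
    }
}
```

On failure the caller can iterate over the Many payload and see every item in the original order, which is what makes it usable for error reporting or fallbacks.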

It’s very useful for reporting errors, along the lines of:

let item = match things.single() {
    Some(item) => item,
    None => bail!("duplicated item definitions"),
};

Here’s some example usages from Intellij-Rust: https://github.com/intellij-rust/intellij-rust/search?q=singleOrNull&unscoped_q=singleOrNull

Itertools has a similar method, which turns the iterator into a tuple, checking that there are no excess elements: https://docs.rs/itertools/0.7.8/itertools/trait.Itertools.html#method.collect_tuple

In addition to the other answers, I want to add that it is useful in combination with filter, for example, to verify that exactly one item matches a property and to retrieve it.

Suppose that a library returns a whole bunch of items, and only one of them is of interest to you:

let my_item = match get_items().filter(|item| item.some_property == reference).single() {
    Some(item) => item,
    None => return Error,
};
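A std-only sketch of the same pattern, in case it helps. The Item struct and the single_matching helper are hypothetical stand-ins, not part of any library; the helper selects by predicate and only succeeds when exactly one element qualifies.

```rust
/// Hypothetical item type standing in for whatever the library returns.
#[derive(Debug, PartialEq)]
pub struct Item {
    pub id: u32,
    pub some_property: &'static str,
}

/// Returns the item only if exactly one element satisfies `pred`.
pub fn single_matching<I, P>(items: I, pred: P) -> Option<I::Item>
where
    I: IntoIterator,
    P: FnMut(&I::Item) -> bool,
{
    let mut found = items.into_iter().filter(pred);
    let first = found.next()?;
    // A second match means the property was not unique.
    match found.next() {
        None => Some(first),
        Some(_) => None,
    }
}
```

Note that this Option-based form cannot tell "no match" apart from "several matches"; both come back as None.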

I like the idea of retrieving the original iterator in case of failure, BTW.

rustc has a similar method used in its parser:

I made a crate for this two months ago:

https://crates.io/crates/single

(Giving back everything in the error case is hard and inflates the error fast.)

I like the idea of retrieving the original iterator in case of failure, BTW.

The original iterator can be retrieved when single is called on a by_ref iterator. Special handling in the single function is not necessary.

What if the iterator is reading from the network?

Edit: Also, ByRef just wraps &mut I, so it will still consume the first and second items and the original iterator will not have them.
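To make the by_ref behavior concrete, here is a small sketch. The free function single and the helper leftover_after_single are hypothetical names used only for this demonstration.

```rust
/// Minimal `single` for illustration: `Some` only for exactly one item.
pub fn single<I: Iterator>(mut iter: I) -> Option<I::Item> {
    let first = iter.next()?;
    match iter.next() {
        None => Some(first),
        Some(_) => None,
    }
}

/// Runs `single` through `by_ref` and returns what is still left afterwards.
pub fn leftover_after_single(values: Vec<u32>) -> (Option<u32>, Vec<u32>) {
    let mut iter = values.into_iter();
    // `by_ref` lends `&mut iter`, so `iter` itself stays usable afterwards...
    let result = single(iter.by_ref());
    // ...but the items that `single` pulled are gone from it.
    (result, iter.collect())
}
```

For the input `vec![1, 2, 3, 4]` this yields `(None, vec![3, 4])`: the iterator survives, but the first two items do not, which is the concern with a network-backed source.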

ByRef just wraps &mut I

Yes, so the original iterator is not consumed.

it will still consume the first and second items and the original iterator will not have them

@Boiethios was talking about retrieving the iterator, not retrieving elements.

Two consumed elements should be stored in single::Error::MultipleElements(first, second). (Edit: I'm not sure these two elements need to be preserved, but that's not very important).
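Roughly what I have in mind, as a sketch. The names Error, NoElements and single_or_all are placeholders of mine; only MultipleElements(first, second) comes from the suggestion itself.

```rust
/// Hypothetical error that keeps the two items pulled before failing.
#[derive(Debug, PartialEq)]
pub enum Error<T> {
    NoElements,
    MultipleElements(T, T),
}

pub fn single<I: Iterator>(mut iter: I) -> Result<I::Item, Error<I::Item>> {
    let first = match iter.next() {
        Some(item) => item,
        None => return Err(Error::NoElements),
    };
    match iter.next() {
        None => Ok(first),
        Some(second) => Err(Error::MultipleElements(first, second)),
    }
}

/// Caller-side recovery: rebuild the full sequence from the error
/// plus the iterator that was only borrowed via `by_ref`.
pub fn single_or_all(values: Vec<u32>) -> Result<u32, Vec<u32>> {
    let mut iter = values.into_iter();
    let result = single(iter.by_ref());
    match result {
        Ok(item) => Ok(item),
        Err(Error::NoElements) => Err(Vec::new()),
        Err(Error::MultipleElements(first, second)) => {
            Err(IntoIterator::into_iter([first, second]).chain(iter).collect())
        }
    }
}
```

The single function stays small, and a caller who needs everything back does the chaining itself.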

(BTW I also love the idea of single iterator fn)

Sure, that would work. But then you'll possibly have to re-assemble the items yourselves in your own error.

Maybe a generic error type or type parameter that can take what it wants via From<(Item, Item, I)> is the best solution.
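A sketch of how that could look, under my own assumptions: the names single_with, SingleError, Discard and Count are invented for illustration, and the empty case is kept as a separate variant because a From<(Item, Item, I)> conversion has nothing to convert when there are no items.

```rust
/// Outcome of a hypothetical customizable `single`.
pub enum SingleError<E> {
    Empty,
    /// `E` decides how much of (first, second, rest) to keep.
    Many(E),
}

pub fn single_with<I, E>(mut iter: I) -> Result<I::Item, SingleError<E>>
where
    I: Iterator,
    E: From<(I::Item, I::Item, I)>,
{
    let first = match iter.next() {
        Some(item) => item,
        None => return Err(SingleError::Empty),
    };
    match iter.next() {
        None => Ok(first),
        Some(second) => Err(SingleError::Many(E::from((first, second, iter)))),
    }
}

/// Consumer choice 1: keep nothing.
pub struct Discard;
impl<T, I> From<(T, T, I)> for Discard {
    fn from(_: (T, T, I)) -> Self {
        Discard
    }
}

/// Consumer choice 2: keep only the total number of items.
pub struct Count(pub usize);
impl<T, I: Iterator<Item = T>> From<(T, T, I)> for Count {
    fn from((_, _, rest): (T, T, I)) -> Self {
        // Two items were already pulled; count whatever remains.
        Count(2 + rest.count())
    }
}
```

The caller picks the payload through the expected type, for example `let r: Result<u32, SingleError<Count>> = single_with(items.into_iter());`.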

But then you’ll possibly have to re-assemble the items yourselves in your own error

And retrieving the original iterator from the single function won't help, because you can't push elements back into the iterator. Returning the original iterator from single doesn't solve any problem.

Maybe a generic error type or type parameter that can take what it wants via From<(Item, Item, I)> is the best solution.

From my experience, if something can be done without generic magic, it's better to do it without generics. Otherwise, it leads to code verbosity and compilation errors.

Which is why I said

and

There can be big value in providing customization points via generics. In this case, you can let the library consumer decide how much information they'll want collected.

Result with an error value that provides me with a new iterator

Iterator of a different type than the original iterator.

Looks like overengineering to me.

There can be big value in providing customization points via generics. In this case, you can let the library consumer decide how much information they’ll want collected.

I believe that in 99% of cases people just want to panic with a meaningful message (empty or more than one). The rest can be done in utility functions in client code.

I’ve needed a “is this a singleton iterator” a few times when writing procedural macros. Essentially:

fn singleton<T>(mut iter: impl Iterator<Item = T>) -> Option<T> {
    // Bail out on an empty iterator, then require that nothing follows.
    let n = iter.next()?;
    match iter.next() {
        None => Some(n),
        Some(_) => None,
    }
}

I assume you don't mean actually panic. But anyway, I don't think I ever wanted a version of this where I didn't customize things depending on context and the other elements.

Why?

I assume you don’t mean actually panic.

Sometimes a panic is fine, e.g.:

fn fetch_exact_number_of_objects(count: usize) -> Vec<Object> { ... }

let single_object = fetch_exact_number_of_objects(1).into_iter().single().unwrap();

Looks like overengineering to me. Why?

Because I think an API should be minimal and cover 99% of cases. Anything beyond that is overengineering.

(And for the remaining 1% of cases you probably don't even want to call single in any form, but rather call next() twice.)
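For example, a hand-rolled version with context-specific handling might look like the sketch below. The function name and the error messages are made up for illustration.

```rust
/// Expects exactly one definition name; on failure the message is
/// tailored to the situation instead of coming from a generic `single`.
pub fn exactly_one_name<I>(names: I) -> Result<String, String>
where
    I: IntoIterator<Item = String>,
{
    let mut iter = names.into_iter();
    let first = match iter.next() {
        Some(name) => name,
        None => return Err("no definition found".to_string()),
    };
    match iter.next() {
        None => Ok(first),
        Some(second) => {
            // Custom handling: list every conflicting definition in the message.
            let mut all = vec![first, second];
            all.extend(iter);
            Err(format!("duplicated item definitions: {}", all.join(", ")))
        }
    }
}
```

The two next() calls are the whole "single" logic; everything else is the custom part.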

I can with certainty say that I never wanted to have this functionality panic.

There are no usage statistics for this kind of thing. You can also make an adaptable generic version and provide convenient shortcuts for the common cases, such as type aliases and helper methods.

And if we're talking about the standard library or a widespread crate, 1% can represent a lot of users.

I can with certainty say that I never wanted to have this functionality panic.

Oh, there's a misunderstanding. single itself should not panic; it should return an error, which can be turned into a panic with .unwrap(). That is almost exactly what @CAD97 implemented.
