Add NonEmptyIterator to std?

I'm working on a project and I was suprised with current implementation on all/any on empty iterators. Their results make sense if you think about it like a monoid fold although it still suprising from the user perspective.

I wonder if anyone else think it worth adding into std. Or probably I should add it to itertool or my custom library first, although in the latter case the feature won't get its auditory even if it's worthy.

Here is a small PoC: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0c1d918753fdfa940a88a2d7b6bf71a2

The general idea is add new NonEmptyIterator which is an iterator with at least one element and add mapping Iterator<Item=T> -> Option<NonEmptyIterator<Item=T>> so every iteratior maybe is non empty.

It helps in cases we want to perform specific operation if iterator produces no elements. For example, if we want xs.any(predicate) return true if xs are empty. Current implementation returns false in this case and we don't have any chance telling if collection was empty or predicate returned false for every element.

Why would you want to know that? For example when I have user preferences where one could filter incoming messages by tag. Okay, we write:

let is_message_should_be_displayed = 
   user_preferences.mesage_filter_tags.any(|x| message.tags.contains(x))

But if user has empty mesage_filter_tags it means he doesn't want to filter anything so is_message_should_be_displayed should be true. Unfortunately the current implementation will return false.

You can hack it in several ways depending if you know the backing collection behind mesage_filter_tags so you could check its length or if you don't then you could write something like

let is_message_should_be_displayed = 
   user_preferences.mesage_filter_tags.filter(|x| message.tags.contains(x))
      .map(|_| true).next().unwrap_or(true)

But it all looks quite hacky and ugly. I find my proposal more general and clean.

I don't think that this is useful because there is no way to check if an iterator is empty without mutating it. Also, all iterators can exhaust themselves and become empty, so encoding non-empty in the type system is meaningless.


To be more clear, you can easily exaust a "NonemptyIterator", so it lies about itself.

3 Likes

I don't think the behavior is that surprising: in essence all and any are just a forall and an exists. The predicate "all unicorns are purple" is true, because there are no unicorns, while "there exists a unicorn that is purple" is false, again because there are no unicorns :unicorn:.

When xs.any(predicate) returns true I expect there to be at least one x in xs for which the predicate indeed holds and which I could then use. In an empty iterator this is not possible, so I would expect false.

As for this specific use case, I would just make the two checks explicit:

let is_message_should_be_displayed = 
   user_preferences.mesage_filter_tags.isEmpty() || user_preferences.mesage_filter_tags.any(|x| message.tags.contains(x))
4 Likes

This proposal is obviously wrong: if there are no elements then the predicate is not true for any element and thus any() must return false.

This is how you can implement your desired function:

fn any_or_empty<T: Iterator>(x: T, f: impl FnMut(T::Item) -> bool) -> bool
{
   let p = x.peekable();
   p.peek().is_none() || p.any(f)
}
2 Likes

I don't think that this is useful because there is no way to check if an iterator is empty without mutating it.

It's true although it's fine to peek the first element. However with peekable it looks somewhat better.

I don't think the behavior is that surprising: in essence all and any are just a forall and an exists . The predicate "all unicorns are purple" is true, because there are no unicorns, while "there exists a unicorn that is purple" is false, again because there are no unicorns :unicorn:.

Right but doesn't \any x P(x) => \exist y P(y) hold for any P?

As for this specific use case, I would just make the two checks explicit:

imagine user_preferences.mesage_filter_tags = (1..10000000).filter(|x| x < 0), then two checks are disastrous. Peekable solves it however it's somewhat verbose.


I see that the type would lie about itself when it's exausted but I'm not sure how to work it around. Although I have a strong opinion it's useful to have such a type. It just have to be designed better than I did.

:confused:

3 Likes

It could look like:

fn into_non_empty<T>(x: impl Iterator<Item=T>) -> Option<impl Iterator<Item=T>> {
    let mut p = x.peekable();
    if p.peek().is_none() {
        None
    } else {
        Some(p)
    }
}


fn main() {
    let foo: Vec<i32> = vec![];
    println!("{:?}", into_non_empty(foo.iter()).map(|mut x| x.any(|y| *y == 0)));

    let foo: Vec<i32> = vec![0, 1, 2];
    println!("{:?}", into_non_empty(foo.iter()).map(|mut x| x.any(|y| *y == 0)));

    let foo: Vec<i32> = vec![1, 2];
    println!("{:?}", into_non_empty(foo.iter()).map(|mut x| x.any(|y| *y == 0)));
}

We don't have any lying type names here but it basically shows the same idea. After you called into_non_empty you can be sure that iterator contains at least one element. It looks uglier because you cannot write an extension trait but it still the same.

For a set X we do not have \forall (x \in X) P(x) => \exist (y \in X) P(y) which is more applicable then that you wrote. The set would consist of the values of the iterator in this case.

Please describe how one would consume a non-empty iterator, and what takes away your instance of a NonEmptyIterator type once it has no more items, such that you can't pass it to something else wanting a NonEmptyIterator.

(It looks, in your PoC, like I can just while it.next().is_some() {} and have a NonEmptyIterator that's actually empty.)

I don't think it's that surprising. any(p) == true means "I found an an example where p holds", and there are no examples in an empty set. all(p) == true means "I was unable to find a counterexample", and there are no counterexamples in an empty set.

1 Like

Speculating, but if rust had gone this direction I assume .next() would have to return both an Option(T) and an enum NonEmptyIterator | EmptyIterator. Or perhaps using a NonEmptyIterator would downgrade it to just an Iterator with some kindof linear type magic (you could re-test it to upgrade it back to non-empty?).

Please describe how one would consume a non-empty iterator, and what takes away your instance of a NonEmptyIterator type once it has no more items, such that you can't pass it to something else wanting a NonEmptyIterator .

You pass it into some combinators. I aggree that it probably not very useful if you're doing manual nexts, but it is when you only pass it into combinators. Another use is group_by where every group is guaranteed to have at least one element. Currently you have to do foo.next().unwrap() to see the first value while the group_by contract is that it will never panic.

(It looks, in your PoC, like I can just while it.next().is_some() {} and have a NonEmptyIterator that's actually empty.)

Right, probably I'd better to call it IteratorThatWasNonEmptyWhenItWasCreated but afaik rust community doesn't like java-like names so I shortened it a bit at the expense of naming precision.

You could just not implement iterator for it and write split(self) -> (T, impl Iterator<Item=T>) but in this case you won't get any combinators and it eliminates the whole idea.