Adding Iterator::extrema, argmin, argmax

I think it would be convenient to add some methods to the Iterator protocol for adding extrema (both min and max at once), argmin (index of smallest value), argmax (index of largest value), argextrema (indices of min and max). Each of these would then have a corresponding method that is parametrized by the comparison function (like min_by takes a comparison function).

These names are not great but I think that the functionality is useful for enough people.

Like the itertools methods Itertools::minmax[_by[_key]], Itertools::position_min[_by[_key]], Itertools::position_max[_by[_key]], Itertools::position_minmax[_by[_key]]?

3 Likes

The problem with extrema is that a good return type is non-trivial; see MinMaxResult in itertools - Rust.

As for argmin/argmax, note that .zip(0..).min()/.zip(0..).max() work pretty well for them.

My instinct is that Iterator already has a ton of stuff on it, so another 9 methods might not be desirable. And thus it's fine for them to be left to something more statistic-oriented, which could also have mean+stddev and such.

But you could always open an ACP see what libs-api has to say.

4 Likes

What I really want is a general API for composing folds by making a single pass in the iterator.

Because, there is already Iterator::max and Iterator::min (and Iterator::fold in general), but those don't compose and then itertools has to define Itertools::minmax.

Maybe the only way to compose folds is to build a "fold" that is independent from the iterator, and then have some API that applies all folds (and returns a tuple). But then this fold is just a closure (or a struct that wraps a closure).

One issue is that folds usually get ownership but for running multiple folds in the same pass we should borrow (either & or &mut is ok).

But it would be something like

    let x = build_fold(.., |acc, x| ..);
    let y = build_fold(.., |acc, x| ..);

    // returns a tuple (s, q) by running making a single pass on iterator
    x.compose(y).fold(iterator)

Perhaps another way is to have a "composable fold" that is tied to an iterator, and a way to process them in lockstep that guarantees that, if those folds are processing the same iterator, they should access the iterator just one time per element iterated (but if they are from different iterators, they will access different elements). Something like

    let x = a.iter().composable_fold(.., |acc, x| ..); // or a.iter().fold_compose(..)
    let y = a.iter().composable_fold(.., |acc, x| ..);
    let z = b.iter().composable_fold(.., |acc, x| ..);

    // makes a single pass zipping a and b, returning a triple (p, q, r) of folded elements
    x.compose(y).compose(z).fold()

Now, how does the API could possibly know that x and y are folding over a single iterator, and z is iterating over another iterator? Maybe it isn't possible to detect this.

1 Like

Java int/float-streams have a sort of middle ground where they provide a specific summary-statistics struct that aggregates a few things:

2 Likes

I agree with this. Make an aggregator that can be passed more values to add to the aggregation (and make them mergeable for use with rayon), then just use Iterator::for_each to put things from the iterator into the aggregator.

That makes the code far more reusable than something that only works for a pass over a single iterator. For example, you might make a type for https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm.

You mean something like Java Collector?

+1, Java Collectors are a very clean and powerful API.

But even better, because traits can express things like "this type can be merged with another value of the same exact type (not anything implementing the interface)" that interfaces can't.

1 Like

Related: Add minmax […] to core::cmp

If these are added, they arguably should have Iterator counterparts too – although their existence would also make it straightforward to simply write a reduce as needed.

1 Like

@dlight this was my attempt a while back at designing a "composable fold" type, similar to what you're describing (IIUC): reductor - Rust (sorry for the shameless self-plug :slight_smile: )

2 Likes

I'll look into it, thanks!

I liked that you divided this in two traits, one for the "reductors" (which is the composable fold itself) and another called Reduce (which has a blanked impl impl<I> Reduce for I where I: Iterator). Only thing that seems missing is a free standing function to build a reductor from a closure (or I guess, two, and an initial state), rather than defining a struct.

Alas, without variadic generics we can't have something that builds up tuples like that

Similarly, the use of pattern matching is almost nice, but the needed nesting to make it work could quickly get ugly.

Maybe another option is to lean on the borrow-checker and put the results into the source and then get them back out once iteration is done.

E.g.

let mut x = Fold(initial, |acc, val| { .. });
let mut y = Fold::arg_min();
let mut z = Fold::avg();

// and() takes &mut
iter.for_each(x.and(y).and(z));

println!("results: {} {} {}", x.val(), y.val(), z.val());

The downside of this pattern is that it makes them stateful so one can't (re)use it in a declarative fashion across many (possibly concurrent) iterations which might otherwise be interesting for things like rayon.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.