Pre-RFC: another Option combinator (merge_with)

FZs · December 11, 2022, 4:44pm

I posted this originally on the Rust Users Forum.

Option combinator methods make working with Options much smoother, but I found a case which isn't covered by them.

It is when I have two Option<T>'s and I want to get an Option<T> that has a Some if either of the inputs is Some, and combine the values if both are Some. So, it would be similar to Option::or(), but when both inputs are Some, it would call a closure to determine the output.

It's easier to describe what I mean with a snippet of code:

fn merge_with<T, F>(a: Option<T>, b: Option<T>, f: F) -> Option<T> 
where F: FnOnce(T, T) -> T
{
    match (a, b) {
        (None,     None)     => None,
        (Some(v),  None)     => Some(v),
        (None,     Some(v))  => Some(v),
        (Some(v1), Some(v2)) => Some(f(v1, v2))
    }
}

It could be used to perform aggregation operations on Option-wrapped values without unwrapping them first.

I can imagine two kinds of operations being used with this:

An operation that selects one of its operands:

let a = Some(2);
let b = Some(5);

let greater = a.merge_with(b, i32::max);

Or, an operation that somehow merges them:

use core::ops::Add;

let a = Some(2);
let b = Some(5);

let sum = a.merge_with(b, Add::add);
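Since Option has no merge_with method today, the method-call syntax in the two snippets above only compiles if something provides it. A minimal sketch, assuming a hypothetical extension trait named MergeWith (the trait name is made up for illustration and is not part of std):

```rust
/// Hypothetical extension trait (not in std) that adds `merge_with` to `Option`.
trait MergeWith<T> {
    fn merge_with<F>(self, other: Option<T>, f: F) -> Option<T>
    where
        F: FnOnce(T, T) -> T;
}

impl<T> MergeWith<T> for Option<T> {
    fn merge_with<F>(self, other: Option<T>, f: F) -> Option<T>
    where
        F: FnOnce(T, T) -> T,
    {
        match (self, other) {
            // Both present: let the caller decide how to combine them.
            (Some(a), Some(b)) => Some(f(a, b)),
            // At most one side is `Some`, so `or` returns whichever is present.
            (a, b) => a.or(b),
        }
    }
}
```

With the trait in scope, both `a.merge_with(b, i32::max)` and `a.merge_with(b, Add::add)` should type-check as written.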

As pointed out by @kpreid on the original post, it's currently possible to achieve the same effect without a match by converting to iterators:

[a, b].into_iter().flatten().reduce(f)

or

a.into_iter().chain(b).reduce(f)
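To compare the three formulations side by side, here is a sketch that wraps each one in a function. The names merge_flatten and merge_chain are made up for this example. The array version is written as IntoIterator::into_iter([a, b]) so that it iterates by value on every edition; on editions before 2021, the method-call form [a, b].into_iter() iterates by reference instead.

```rust
/// The `match`-based version from the original post, repeated for comparison.
fn merge_with<T, F>(a: Option<T>, b: Option<T>, f: F) -> Option<T>
where
    F: FnOnce(T, T) -> T,
{
    match (a, b) {
        (None, None) => None,
        (Some(v), None) | (None, Some(v)) => Some(v),
        (Some(v1), Some(v2)) => Some(f(v1, v2)),
    }
}

/// Same result via an array of options: `flatten` drops the `None`s.
fn merge_flatten<T, F>(a: Option<T>, b: Option<T>, f: F) -> Option<T>
where
    F: FnMut(T, T) -> T,
{
    IntoIterator::into_iter([a, b]).flatten().reduce(f)
}

/// Same result via `chain`: each `Option` is an iterator of zero or one item.
fn merge_chain<T, F>(a: Option<T>, b: Option<T>, f: F) -> Option<T>
where
    F: FnMut(T, T) -> T,
{
    a.into_iter().chain(b).reduce(f)
}
```

One small difference: Iterator::reduce requires an FnMut closure, whereas the match version only needs FnOnce because it calls the closure at most once.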

There was also a discussion about some similar thing in Haskell (which I'm not qualified to talk about), see the original post.

I see several use cases for this, but what do you think? Would it be useful? Would it be worthwhile to add this to std? I see that this is a minor feature, but so are many other combinators, and I think it would improve the ergonomics of working with Option-wrapped values.

Also, I couldn't come up with a better name than merge_with but I feel like it's very generic, so if you have a more descriptive name for this, don't keep it to yourself.

scottmcm · December 11, 2022, 5:27pm

I would always suggest describing those use cases, along with the context for the operation. Where did the options come from in the first place?

elidupree · December 11, 2022, 5:34pm

I've used this function!

github.com/elidupree/time-steward/blob/5c6401eb76dde2c25e09342a131d94cb12a3d2ba/time-steward/src/support/trajectories.rs#L289-L293

          combine_options(
            trajectory.next_time_significantly_le (range, input_shift, bounds [0].coordinate (dimension) + four),
            trajectory.next_time_significantly_ge (range, input_shift, bounds [1].coordinate (dimension) - four),
            |a,b| min(a,b)
          )

The scenario is that I have multiple "times when something is going to happen" which are Option, because the thing might not ever happen. And I need to get "the first thing that happens", i.e. the minimum of them, as a new Option.

…although, looking at it again, maybe it would be best to use Iterator::min() for this. (It's only happenstance that there are exactly 2 of them in this context; it would naturally generalize to more than 2.)
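A rough sketch of that Iterator::min() idea, assuming the times are plain u64 values (the real time type in time-steward is not shown here) and using a made-up function name:

```rust
/// Earliest of any number of optional event times; `None` if nothing ever happens.
/// `flatten` skips the `None`s, and `min` returns `None` for an empty iterator.
fn first_event<I>(times: I) -> Option<u64>
where
    I: IntoIterator<Item = Option<u64>>,
{
    times.into_iter().flatten().min()
}
```

The flatten step matters: Option<T> orders None before every Some, so calling min() directly on the options would return None as soon as any single event never happens.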

FZs · December 11, 2022, 9:07pm

1. The first one is the one that made me write this post. My solution for today's LeetCode challenge (Binary Tree Maximum Path Sum) used it with i32::max.

   The point was that None and Some(0) are both valid values and mean different things. The code had to find the path with the greatest sum in a binary tree. Although an empty path has a sum of 0, it cannot be chosen (the problem description forbids it), even if all the paths are negative-valued. So empty paths were marked by None and valid paths were marked with Some(n), where n was the sum of the path.
2. Some time ago I worked on a project where there were many Option-wrapped integers. They were Option-wrapped because many parts of the code had to handle the case where there was no valid data yet. That part is actually pretty similar to the first use case (though None meant something else). The other difference is that the numbers were added most of the time and multiplied at other times.
3. I also used merge_with (it was called join_options and it didn't take a closure because it was used only once) as an Iterator::fold() callback, like this:
```
iter.fold(None, |opt, val| join_options(opt, val))
```
But now I actually see that it makes no sense in this case, because it could be easily replaced with flatten and reduce. I've never really thought of Options as Iterators before.
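For illustration, the fold and its flatten-and-reduce replacement might look like this. The body of join_options is not shown in the post, so the version here is a hypothetical stand-in, and addition is an assumed operation:

```rust
/// Hypothetical stand-in for the post's `join_options`; addition is assumed.
fn join_options(a: Option<i32>, b: Option<i32>) -> Option<i32> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x + y),
        (x, y) => x.or(y),
    }
}

/// The fold from the post: start from `None` and merge each item in.
fn total_fold(items: &[Option<i32>]) -> Option<i32> {
    items.iter().copied().fold(None, join_options)
}

/// The replacement: drop the `None`s, then reduce the remaining values.
fn total_reduce(items: &[Option<i32>]) -> Option<i32> {
    items.iter().copied().flatten().reduce(|x, y| x + y)
}
```

Both return None when the input is empty or contains only None.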

As I had to enumerate these use cases I feel less confident that merge_with is useful enough. To summarize, in all cases it makes sense when a None means something very different from a Some(identity) where identity is the identity value of the operation (0 for addition, 1 for multiplication, type::MIN for maximization, etc.) and as soon as we have any value, we prefer to have it over a None.
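To make that distinction concrete, here is a small sketch using maximization. The function names are made up for this example, and the default of 0 stands for the empty path's sum from the LeetCode case:

```rust
/// None-aware maximum: "no value" stays distinct from every real value.
fn best(a: Option<i32>, b: Option<i32>) -> Option<i32> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.max(y)),
        (x, y) => x.or(y),
    }
}

/// Naive alternative: treat a missing value as 0 before comparing.
fn best_with_default(a: Option<i32>, b: Option<i32>) -> i32 {
    a.unwrap_or(0).max(b.unwrap_or(0))
}
```

For the inputs Some(-3) and None, best keeps Some(-3), while best_with_default reports 0, a value that never occurred in the data.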

rkuhn · December 11, 2022, 11:27pm

What you describe is that some type T shall be used as a (mathematical) monoid but lacks a neutral element, so Option::None to the rescue. While I cannot exclude that there may be valid use-cases for this, my own experience is that my code was better off formulating T in a suitable fashion: Instead of using None to mean “never” I prefer a proper type with specific semantics. Using Option to mean more than “value may be absent” is a code smell in my book.
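One way to read that suggestion, as a sketch only: a dedicated type where "never" is an explicit variant. The NextEvent name and the u64 payload are assumptions made for this example, not something taken from the thread.

```rust
/// Hypothetical domain type. Derived ordering follows declaration order,
/// so every `At(_)` compares less than `Never`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum NextEvent {
    At(u64),
    Never,
}

/// The earlier of two events; `Never` acts as the neutral element of `min`.
fn earliest(a: NextEvent, b: NextEvent) -> NextEvent {
    a.min(b)
}
```

With this shape, the plain Ord::min already does what merge_with(a, b, min) would do on options, and no special combinator is needed.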

max · December 19, 2022, 1:30am

A use case for this that I ran into recently was an optional config that limits an optional user-passed Vec to only contain certain values:

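fn constrain(user_val: Option<Vec<i64>>, constraint: Option<Vec<i64>>) -> Option<Vec<i64>> {
    match (user_val, constraint) {
        (None, None) => None,
        (None, val) | (val, None) => val,
        // Both present: keep only the user values that the constraint allows.
        // (Vec has no built-in `intersect`, so the intersection is spelled out.)
        (Some(vals), Some(allowed)) => {
            Some(vals.into_iter().filter(|v| allowed.contains(v)).collect())
        }
    }
}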

This would be a lot easier as

user_val.merge_with(constraint, Intersect::intersect)

I can't just use empty Vecs instead of Nones because an empty constraint Vec would mean that no values are allowed, and the rest of my code assumes optional values in general and I'd rather not special-case this.

rkuhn · December 21, 2022, 3:50pm

This proves my point: using generic collection types offers far too big an API for the data structure at hand. A user would think that user_val.or(constraint) also makes sense, since the compiler accepts it. My advice is to use Option, Vec, and friends only where their whole API is adequate, otherwise there will be bugs. Rust offers zero cost wrappers to allow you to express precisely what you want.

system · March 21, 2023, 3:50pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
