Add a `within` method to `PartialOrd`

The RangeBounds trait offers a contains method which... I am not much a fan of.

I typically favor not getting hung up on syntax, but I just find it hard to read:

!(0.0..=1.0).contains(&x)

In descending order of importance:

  • Yoda notation: I need to reach the end of the expression to learn which variable is being concerned (x) when that's the most important part, really.
  • Blending: that ! next to ( really blends in, making it harder to notice.
  • Noise: an extra set of parentheses (around the range) and an extra &, just boilerplate that needs to be skipped.

I was glad to learn with the 1.73 release that I am not the only one who is not a fan. Though I had no solution to offer.

/u/kibwen did, however, offer a fairly compelling alternative:

!x.within(0.0..=1.0)

This actually solves all the issues I had above, which I honestly didn't think was possible:

  • x is first, immediately denoting which variable we're interested in.
  • ! stands out quite a bit more.
  • No extra set of parentheses, no extra &.

And here I am, thinking that it would be a worthwhile addition to the PartialOrd trait:

fn within<R>(&self, range: R) -> bool
where
    //  Somehow reproduce the bounds of RangeBounds::contains
    R: RangeBounds<Rhs>,
    Rhs: PartialOrd<Self>,
{
    range.contains(self)
}

Pure syntactic sugar, really, yet oh so sweet.


Comparison of various syntaxes, side to side:

if scale < 0.0 || scale > 1.0

if scale < 0.0 || 1.0 < scale

if !(0.0 <= scale && scale <= 1.0)

if !RangeBounds::contains(0.0..=1.0, &scale)

if !(0.0..=1.0).contains(&scale)

if !scale.within(0.0..=1.0)
21 Likes

I think some careful thought should be put on PartialOrd here. If it were Ord, I'd have no reservations, but there are questions that PartialOrd raises:

  • With floating point, what is the behavior around NaN?
  • If one has a type Point2d(T, T); impl<T: PartialOrd> PartialOrd for Point2d<T>, the behavior of within depends on which coordinate is compared first.

Then again, I suppose that contains already has these problems…so maybe my concerns are already "handled" by existing behavior.

6 Likes

I support this idea. Ranges as arguments look a bit weird at first, but are necessary to ensure all possibilities by comparison operators are also possible. Apart from that, this method looks very similar to Ord::clamp - still, it should not be limited to types that implement Ord imo as all floats, on which these operation seem to have the most usefulness, could not benefit from it.

7 Likes

I went with PartialOrd because contains is already predicated on PartialOrd, so this decision has already been made in a way.

Do note that the motivating example uses floats, which don't implement Ord, and also note that Ord does not allow mixed-comparisons, such as checking whether a T is in a range of U.

1 Like

You can always do !RangeBounds::contains(0.0..=1.0, x) instead

I like the idea. The contains Clippy lint is relatively annoying and I would be happy to replace it with such method.

A bit of bikeshedding: I think is_within could be a better name, since it clearly indicates that return result is bool.

9 Likes

Beware that PartialOrd is pervasive, so this will add the method to all kinds of things, potentially being (de jure "minor") breaking for method resolution. (This was a problem with Ord::clamp, for example.)

So a particularly-strong proposal for this will look at other uses of within in the ecosystem to see if this conflict would be bad. Though of course a check-only crater run can also assess impact of a PR.

8 Likes

What exactly is the semantics for contains/within wrt floats? I would expect that NaN is never contained in anything and +/-inf is only contained if they are the upper/lower boundary respectively. But I can't find any documentation on this.

https://doc.rust-lang.org/std/ops/struct.Range.html#method.contains has a bunch of examples to show how it behaves, at least.

clamp wants Ord, because it needs to know higher or lower to know which end to clamp to, but contains can just return false when it's not relatively ordered to both ends.

2 Likes

My bad, missed the examples. Still, a couple of examples is not really a proper specification of behaviour. And inf isn't mentioned at all.

Inf isn't a problem, since the infinities are still properly ordered against everything that's not a NAN.

Isn't the behavior of NaN here fixed by the principle that any comparison involving NaN is false? Therefore, NaN is not within any range, and a range where either end is NaN contains nothing. Which is, I think, what that documentation is saying is already the case.

3 Likes

There's a few types representing mathematical sets that already have a contains(&T) function:

{1,7,3} => HashSet.contains
{1,3,7} => BTreeSet.contains

{x| -∞ < x < ∞ } => .. => RangeFull.contains
{x| -∞ < x < 7 } => ..7 => RangeTo.contains
{x| -∞ < x <= 7 } => ..=7 => RangeToInclusive.contains
{x| 1 <= x <= 7 } => 1..=7 => RangeInclusive.contains
{x| 1 <= x < 7 } => 1..7 => Range.contains
{x| 1 <= x < ∞ } => 1.. => RangeFrom.contains

And a few sets that can't be represented so easily:

{x| 1 < x < ∞ } => (Bound::Excluded(1), Bound::Unbounded)
{x| 1 < x <= 7 } => (Bound::Excluded(1), Bound::Inclusive(7))
{x| 1 < x < 7 } => (Bound::Excluded(1), Bound::Exclusive(7))  

{x| x % 2 == 0} => PredicateSet (doesn't exist)

If theses types could all implement a common Set<T> trait, such as:

pub trait Set<T>
{
    /// query whether the set contains a value
    fn contains(&self, val: &T) -> bool;
}

Then an opposite blanket trait ContainedIn<T> (or trait In<T> with fn in(...)) could be implemented for the reverse:

pub trait ContainedIn<T> {
    fn contained_in(&self, set: &impl Set<T>) -> bool;
}

impl <T> ContainedIn<T> for T {
    fn contained_in(&self, set: &impl Set<T>) -> bool {
        set.contains(self)
    }
}

let set1 = HashSet::from([1,4,7,8]);
let set2 = (15..30); //or RangeSet::from(15..30) or RangeSet::from((Bound::Excluded(4), Bound::Excluded(7));
let set3 = PredicateSet::new(|x| x % 2 == 0);
// implement union, intersection, difference, symmetric difference
// all of which also implement Set<T>. The cartesian product could implement Set<(T1,T2)>
let combined_set = (set1 | set2) & set3;

if 7.contained_in(&combined_set) {
    ...
}

if combined_set.contains(&7) {
    ...
}

The name "within" wouldn't make much sense for a set of enumerated values eg. {1, 4, 7} or if you had a set defined by a predicate, because the value might be within the range of min and max values, but not contained in the set. Using "in", or "contained_in" (for the reverse of "contains") might make more sense.

Aside: Sets can also define relationships (such as the binary relationships <, <=, >, >=) between 2 or more ordered values. The GreaterThanSet could contain all pairs (a, b) where a > b, eg. (2,1), (20,7), (123,94), but not (4, 7).

If there was an In<T> trait defined, then perhaps this would indicate that the "in" keyword could be reused:

pub trait In<T> {
    fn in(&self, set: &impl Set<T>) -> bool;
}

let hashset = HashSet::from([1,4,6]);
if 2 in hashset { ... };
if 7 in 4..9 { ... };

instead of:

if 2.in(hashset) { ... };
if 7.in(4..9) { ... }
3 Likes

It depends on whether you implement it as low <= x && x < high or !(x < low || high <= x). NaN.within(low..high) will be false and true for these two implementations.

I'm open to different names. within was an off-the-cuff suggestion, though I do disagree that it doesn't suit {1, 4, 7}, there's nothing in within which suggests it only considers bounds.

I do find the consideration of sets (or collections, really) interesting, and indeed it may be worth extending beyond PartialOrd and ranges...

I find keyword reuse problematic, as in making approaching and learning a language more difficult. It hampers discoverability, by making it more difficult for a newcomer to immediately be guided to the one concept expressed by the keyword.


I'm not that invested in names, in general. I'd much rather focus on semantics first, then we can figure out what's the most appropriate name for the agreed upon semantics.

Discussing names first is just a waste of time, if the semantics change, we may need to discuss again.

The suggestion here is to implement it as range.contains(&x) hence the behavior of NaN is fixed by the implementation of contains, which is itself fixed for backwards compatibility reasons by now.

Following @DanB's suggestion, I had a look at the various contains methods in std:

  • RangeBounds: heterogeneous single element.
  • Collections: homogeneous single element.
  • str: heterogeneous pattern.

It seems it would be possible to implement a more generic within method, though I'm not clear how useful it would be.

I'm curious what people think, and why:

  • A. within should only be available on ranges.
  • B. within should only be available for single elements, on ranges and collections.
  • C. within should be available whenever contains is.
0 voters
1 Like

I think is_in would be the best name for this. Short and sweet, makes clear the result is a bool, and is a general inverse of contains not specific to ranges. There's also some precedent, like Python’s in keyword as a predicate.

5 Likes

Adding support for types like HashSet probably would require addition of a new trait, since HashSet::contains is an inherent method and the RangeBounds trait does not look like a good fit for these types.

As for the name, is_in sounds like a good and concise option.

I believe what I'm saying is that the latter is wrong.