Clamp function for primitive types

Xaeroxe · March 25, 2017, 9:25pm

This is a small change to the primitive types for Rust so I wasn’t sure if an RFC was appropriate. There is a constant need for the ability to restrict values to be between a certain minimum and maximum in several libraries. Some syntax patterns I’ve observed to accomplish this are

if input > max {
    max
}
else if input < min {
    min
} else {
    input
}

and

input.max(min).min(max);

and even

match input {
        c if c >  max =>  max,
        c if c < min => min,
        c              =>  c,
    }

This is a very common task but there isn’t an idiomatic way to do this. Thus I propose clamp whose general form is:

pub fn clamp(self, min: Self, max: Self) -> Self {
    if self < min {
        min
    }
    else if self > max {
        max
    }
    else {
        self
    }
}

I’ve added this function to the applicable primitive types in a branch on my own Rust repo. I’d like to have them incorporated into the main compiler but I’m not entirely sure what my next step should be to accomplish that.

https://github.com/Xaeroxe/rust/tree/clamp_patch

scottmcm · March 26, 2017, 1:12am

Default answer for a new API is to make an RFC: https://github.com/rust-lang/rfcs/blob/master/libs_changes.md#is-an-rfc-required. I don’t know whether this is a case where a PR would suffice; that doc implies you could just submit one and find out.

I do think it sounds like a useful thing to have; I know I’ve made similar things before.

Some questions to ponder, regarding the implementation:

Why only on [ui]N? Why not on anything Ord? (like std::cmp::min)
Should one of the Range types be involved, since the parameters are in essence a range?
fN aren’t Ord, so the custom method on them needs to talk about NaN behaviour, justify it, and test it.
If there’s a good answer for NaN, can that be extended to a good answer for PartialOrd?
What happens if the requested range is uninhabitable? 4.clamp(11, 9), for example.

Edit: I probably should have said RangeInclusive, not just Range. Though part of me thinks that assert_eq!( (3.14f64).clamp(0.0..1.0), 0.9999999999999999) is awesome (if probably not that useful).

kornel · March 26, 2017, 1:17am

Ord doesn't work with floats

scottmcm · March 26, 2017, 1:19am

Yes, hence the note about "the custom method on fN", also like min/max.

kornel · March 26, 2017, 1:23am

I write a version of clamp() in almost every program I write, but most often I use it when converting to a smaller type, e.g. f32.clamp() -> u8.

What are your use-cases for it?

Xaeroxe · March 26, 2017, 1:52am

Thanks for your advice on the PR. I’ll give this a second pass based on your feedback then submit the PR and see what happens.

Use cases are very wide and varied but it typically has to do with interfacing with APIs that take normalized values or when sending output to hardware that expects values to be in a certain range, such as audio samples or painting to pixels on a display.

I had not considered using one of the Range types but I will certainly look into it.

I thought of implementing this for a trait such as PartialOrd but prior precedence showed similar functions lived at these locations so I decided to go with that. That being said a historical reason is not a good reason so I’ll see if there’s a better home for these that makes them more broadly applicable.

There is no good answer for NaN types other than panicking or returning some kind of error value. Returning an error value would complicate the signature, and we also have prior precedence from invalid indexes into arrays demonstrating that panicking on unusual inputs is acceptable so I’m of the belief that panicking would be preferable.

If it’s going to be implemented based on a trait I’d rather not use Ord but PartialOrd instead, as floats are a type that needs to be restricted often as well.

There is also a case to be made that if this can only handle Ord types correctly then maybe it should only be implemented for Ord with a special case for floats.

Xaeroxe · March 26, 2017, 3:06am

I’ve decided not to use the range type, as it being exclusive on the upper bound is problematic for the clamp function.

Fylwind · March 26, 2017, 6:25am

There’s a RangeToInclusive BTW.

Xaeroxe · March 26, 2017, 6:32am

I appreciate the thought but it’s also not fit for this for two reasons, the first being that it is unstable, the second being that it doesn’t define a beginning, only an end.

colin_kiegel · March 26, 2017, 8:41am

Also not stable, but it has the lower bound too RangeInclusive. I also believe std can and does use unstable features internally. So it should be possible to use ranges. Users will be able to construct ranges via a...b on stable, but only std can then access the unstable API on that range, destructure, etc. After all, it’s defined in std.

hanna-kruppe · March 26, 2017, 12:28pm

A sensible alternative is to return NaN when the value to be clamped is NaN. This matches what fN::min and fN::max do, and the more general principle that NaN should be propagated rather than swallowed by most operations. A NaN min or max is much stranger, I'm not sure whether x.clamp(NaN, y) should return NaN, or if it should panicks just as if the lower bound is larger than the upper bound. I mixed this up, thanks @bluss

In any case, this would require making a dedicated clamp method that works slightly differently from the clam or Ord types, but this too matches the min/max situation.

bluss · March 26, 2017, 1:31pm

fN::min and max are actually always propagating the non-NaN operand, so they do the inverse of that.

Xaeroxe · March 26, 2017, 3:25pm

So I decided to make clamp a more general purpose function that lives in std::cmp::clamp and uses Ord with RangeInclusive. Since it is also oftentimes desirable to clamp floats as well I left the current float implementation but modified it to use RangeInclusive. RangeInclusive also contains an “Empty” variant whose purpose I believe to be for iteration, but I’m not iterating the range so I created code branches for Empty values that should never be executed.

Clamp and NaN: Here is the behavior the current implementation guarantees:

If an upper bound of NaN is used then no maximum is applied. If a lower bound of NaN is used then no minimum is applied. If this function is called on NaN then NaN will be returned.

Commit for review: https://github.com/Xaeroxe/rust/commit/e1f90513e6cd525f1bd75e5557ef53d470a9251c

bluss · March 26, 2017, 3:27pm

numpy has a function called np.clip that clamps all the values in an array. It leaves NaNs in place.

it accepts min and max, and passing NaN for either of those behaves as if passing None (no clip at the corresponding lower/upper boundary).

Xaeroxe · March 26, 2017, 4:18pm

This clamp code should behave in a similar way. If a value can’t be compared (such as NaN) the comparison will return false. That means that if NaN is passed for min no minimum is enforced. The same is true of max. If NaN is passed as self then NaN will be returned because it can’t be compared.

kornel · March 26, 2017, 7:29pm

The semantics should probably match:

maxsd   %xmm0, %xmm1
minsd   %xmm1, %xmm2
movaps  %xmm2, %xmm0

otherwise the implementation may be much slower than it needs to be.

scottmcm · March 26, 2017, 8:09pm

Oh man, floating point and its billion NaNs These all give totally different ASM:

     if x < start { return start; }
     if x > end { return end; }
     return x;

    x.max(start).min(end)

    // this gives @kornel's instructions
    if x < start { x = start }
    if x > end { x = end }
    x

Unimportant aside: they also give different instructions for integers, but not meaningfully (just different combinations of cmov(l|le|g|ge)).

Xaeroxe · March 26, 2017, 9:38pm

The exact assembly produced is one area in which my knowledge is definitely lacking. Assuming the code is functionally equivalent I’ll take whatever is decided to be the most optimal by you people.

bluss · March 27, 2017, 1:39am

If I put exactly this into ndarray’s mapv_inplace, the loop autovectorizes pretty well (Element type is f64). But it can also be ndarray’s own responsibility to provide an array wide clamp.

a.mapv_inplace(|mut x| {
    if x < min { x = min };
    if x > max { x = max };
    x
})

(AVX instructions)

│140:┌─→vmovup xmm2,XMMWORD PTR [rcx-0x20]
│    │  vmovup xmm3,XMMWORD PTR [rcx]
│    │  vinser ymm2,ymm2,XMMWORD PTR [rcx-0x10],0x1
│    │  vinser ymm3,ymm3,XMMWORD PTR [rcx+0x10],0x1
│    │  vmaxpd ymm2,ymm0,ymm2                     
│    │  vmaxpd ymm3,ymm0,ymm3                    
│    │  vminpd ymm2,ymm1,ymm2                   
│    │  vminpd ymm3,ymm1,ymm3                  
│    │  vextra XMMWORD PTR [rcx-0x10],ymm2,0x1
│    │  vmovup XMMWORD PTR [rcx-0x20],xmm2   
│    │  vextra XMMWORD PTR [rcx+0x10],ymm3,0x1
│    │  vmovup XMMWORD PTR [rcx],xmm3        
│    │  add    rcx,0x40                     
│    │  add    rdx,0xfffffffffffffff8      
│    └──jne    140

Xaeroxe · March 27, 2017, 5:34am

Clearly I underestimated the amount of discussion this function warrants, so I created an RFC here: https://github.com/rust-lang/rfcs/pull/1961

Topic		Replies	Views
(Pre-?)Pre-RFC: Range-restricting wrappers for floating-point types language design	13	3974	March 25, 2019
Pre-pre-RFC: Extrema traits language design	9	1494	February 24, 2022
[Pre-RFC] Clamped types language design	12	1087	May 4, 2024
[DRAFT] RFC: or patterns language design	4	891	March 25, 2019
Older RFCs up for discussion this week	4	1734	March 25, 2019

Clamp function for primitive types

Related topics