Ergonomics of wrapping operations

One thing that was deliberately postponed in the overflow RFC was the question of how to make wrapping operations more ergonomic. The current RFC simply includes methods like wrapping_add and so forth that can be applied to integers. For example, one can do:

fn something(y: i32, z: i32) {
    let x = y.wrapping_add(z);
}

This is fine for a small number of operations, but it can be quite painful when done at scale. The RFC also added a Wrapping type:

use std::num::wrapping::Wrapping;

fn something(y: i32, z: i32) {
    let x = Wrapping(y) + Wrapping(z);
}

However, this option too suffers from some ergonomic downsides:

  1. += doesn't work yet with overloaded operators. (But see RFC 953, which I hope we can approve relatively soon and have available in Rust 1.1.)
  2. Wrapping doesn't (currently, at least) interoperate well with integer literals, so you can't write x + 1 where x has type Wrapping<i32>; you have to write x + Wrapping(1). (See the sketch after this list.)
  3. Some people feel that the "new types for modulo" approach isn't right, because the use of wrapping arithmetic is more a property of the calculation being done than of the types flowing in or out. (For example, having a hash function take &mut [Wrapping<u8>] instead of &mut [u8] feels like exposing implementation details.) Of course, others feel the opposite. I have some sympathy with both sides.
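To make downside 2 concrete, here is a minimal sketch of what does and doesn't compile (shown with the current std::num::Wrapping path):

use std::num::Wrapping;

fn increment(x: Wrapping<i32>) -> Wrapping<i32> {
    // x + 1          // does not compile: nothing adds a bare i32 to a Wrapping<i32>
    x + Wrapping(1)   // the literal must be wrapped explicitly
}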

Design space

It's clear we can do better. I see three options.

A. Nothing major; incremental improvements to Wrapping.

We adopt RFC 953 so that x += y works. We add impls that allow a Wrapping<T> to be added to a plain T, so that x + 1 works.
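Such a mixed-operand impl might look roughly like the sketch below. The stand-in type W is used purely for illustration, since coherence rules prevent adding this impl to std's Wrapping from outside the standard library:

use std::ops::Add;

// Local stand-in for std's Wrapping type.
struct W(i32);

impl Add<i32> for W {
    type Output = W;
    fn add(self, rhs: i32) -> W {
        // With an impl like this, w + 1 works without wrapping the literal.
        W(self.0.wrapping_add(rhs))
    }
}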

B. Swift-like operators.

Swift has introduced a parallel set of operators, &+ and so forth, to indicate modulo operations. We could do this, but it is not clear to me that this use case is sufficiently large to warrant a complete parallel set of operators -- this is a lot of operator real estate. For example, if we were going to add variations on +, maybe we would want variations that target SIMD or other collective operations, etc.
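For reference, a Swift-style &+ in Rust would presumably just desugar to the existing wrapping method (a sketch of the assumed desugaring; &+ is not real syntax):

fn sketch(y: i32, z: i32) -> i32 {
    // Hypothetical: y &+ z would desugar to the call below, just as
    // y + z desugars to Add::add(y, z) today.
    y.wrapping_add(z)
}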

C. Scoped wrapping semantics of some kind.

The original RFC proposed a kind of lint-like scoping system that allowed overflow checks to be controlled at a fine granularity. It intentionally did not, however, allow overflow checks to be completely disabled in this way. The reasons against were well summarized by @rkjnsn (but do read the whole thing):

Using scoped annotations to change the semantic meaning of operations seems very, very wrong. As you mention, one would have to search all enclosing contexts to determine whether operations were meant to be wrapping, and it still wouldn't be clear if it were for performance reasons (overflow is still incorrect) or algorithmic reasons (overflow is expected). Combined with the other points you make, I would find using scoped annotations to determine whether wrapping is desired to be completely unreasonable.

Considerations

Some other considerations to keep in mind, no matter the solution:

  1. What happens with overloaded +?

A specific proposal

One option I've been turning over in my head is a variant on option C. The idea would be to permit a #[wrapping] annotation to be placed on blocks or fns (but not modules or crates). When placed on a fn, it would affect the code in that fn, but not the code in nested fns. It would change the semantics of potentially overflowing operations within that block, disabling checks and enabling the "fallback" semantics. It would interact with overloading by changing the + operator so that it is connected not to the Add trait but rather to a WrappingAdd trait (and so on for the other operators). Here is an example of how this might look:

fn foo(y: i32, z: i32) {
    #[wrapping] {
        let x = y + z;
        ...
    }
}
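Since + inside #[wrapping] code would resolve to WrappingAdd rather than Add, that trait might look roughly like this (a sketch; the name comes from the proposal above, no such trait exists in std, and the i32 impl simply defers to the inherent method):

trait WrappingAdd<Rhs = Self> {
    type Output;
    fn wrapping_add(self, rhs: Rhs) -> Self::Output;
}

impl WrappingAdd for i32 {
    type Output = i32;
    fn wrapping_add(self, rhs: i32) -> i32 {
        // Resolves to the inherent i32::wrapping_add, which takes
        // precedence over this trait method.
        self.wrapping_add(rhs)
    }
}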

The reason I find this approach appealing is that:

  1. it doesn't require new syntax (which seems like overkill, given the magnitude of this problem);
  2. particularly for large blocks of code where lots of math is going on, it requires just one annotation, rather than requiring you to convert every operator. It seems less prone to errors where most of the operations use &+ but a few of them are accidentally left as +;
  3. for individual operations, y.wrapping_add(z) remains an option;
  4. limiting the attribute to fns and blocks means that it should not be hard to decide whether a given operation is potentially overflowing or not.

One downside that is not addressed is the contention that people may use this attribute to mark perf-sensitive bits of code and then turn on overflow checks everywhere else, leading to semantic ambiguity (was that overflow anticipated or not?). This is a risk. It could be addressed by providing a means to control overflow checking in optimized builds -- basically saying "we want checks even when optimized, but not here". This mechanism would not affect checks in debug builds, making it clear that overflow is still not an anticipated result. This mechanism may also be overkill; I'm not sure.

Timeframe

It's not clear to me just how fast we have to act on this. We have a coherent story, but also known ergonomic problems. In particular, I'd like to make sure we have a sense of the full set of problems we want to address -- that might help inform decisions about how much control to put into the design. For example, I don't know whether we will need the ability to request (and then also disable) overflow checks in optimized code. (It would also be useful to have a firm figure on the runtime cost of overflow checking; the current implementation is fairly naive, and it's likely we can optimize quite a bit if we choose.) But I'd still like to kick off this conversation so that we get a sense of what ideas are out there.


One of my own pet projects is currently a Game Boy emulator, and I haven't found the time to update it to nightly since the overflow changes landed, due to the massive rewrites needed. First, I want to be specific about the pain points I've found. The high-level summary is that all of these operations are compile errors:

  • Wrapping<T> += T (e.g. a += 3)
  • Wrapping<T> = T (e.g. a = 3)
  • Wrapping<T> as T (e.g. a as T)
  • foo(Wrapping<T>) where foo wants a T instead

I've omitted the Wrapping<T> += Wrapping<T> case due to RFC 953, as well as Wrapping<T> + T due to the possibility of adding new Add impls. What I have personally found is that if you really want to use wrapping arithmetic everywhere, those two measures unfortunately may not even get you to the halfway point.
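Concretely, the remaining cases look roughly like this (a sketch; the error notes are paraphrased, and every workaround goes through the newtype):

use std::num::Wrapping;

fn takes_u8(_x: u8) {}

fn demo(mut a: Wrapping<u8>) {
    // a += 3;          // error: cannot add-assign a u8 to a Wrapping<u8>
    // a = 3;           // error: mismatched types
    // let b = a as u8; // error: non-primitive cast
    // takes_u8(a);     // error: expected u8, found Wrapping<u8>

    a = Wrapping(3);    // assign via the newtype constructor instead
    a += Wrapping(40);  // compound assignment, RFC 953-style (in std today)
    let b = a.0;        // extract the inner value instead of casting
    takes_u8(b);
}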

So, on to your three options:

On option A: I unfortunately don't think that this will solve our problems. I think we'd need some more language constructs or traits, not all of which I'd be comfortable adding!

On option B: this I think is a pretty plausible course of action. I believe the "overhead" of using the wrapping operators is far less than that of dealing with the different types. That being said, I haven't actually migrated code to use it, so I'm not sure how it would pan out.

On option C: this is what I think we may need to add for cases like an emulator. Overall I like your proposal; it's precisely the kind of scoped control of wrapping operations that I would like. Some nits/questions:

  • To clarify: the reason you only allow it on functions/blocks, and don't have it recurse, is in light of @rkjnsn's comment? Ergonomically I would prefer to tag the entire cpu module of an emulator with "wrapping semantics" rather than each function within it, but I also understand the concern of determining whether a given operation is wrapping or not.
  • Adding a whole new set of WrappingFoo traits would be a little unfortunate. I wonder if wrapping_foo default methods could be added to the existing Foo traits, defaulting to just self.foo(other) (see the sketch after this list). Another possibility at least!
  • For using #[wrapping] for perf-sensitive code, we may not have to worry too much, because -O strips all overflow checks by default.
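The default-method idea might look roughly like this (a sketch on a stand-in trait, so the example stays self-contained; std's Add is what would actually gain the method):

// Stand-in for std::ops::Add with a defaulted wrapping method added.
trait MyAdd<Rhs = Self> {
    type Output;
    fn add(self, rhs: Rhs) -> Self::Output;

    // Types with genuine wrapping behavior would override this; everything
    // else falls back to ordinary (checked-in-debug) addition.
    fn wrapping_add(self, rhs: Rhs) -> Self::Output
    where
        Self: Sized,
    {
        self.add(rhs)
    }
}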

I'm against an attribute such as #[wrapping]. If you write new code in a file, you then need to remember whether you're currently in a scope with #[wrapping] or not, and copying code around changes semantics even though the types stay the same.

I think that instead the current ergonomics should be improved -- this would help not only this particular kind of operation (wrapping), but could also support others implemented in third-party libraries later (think one's-complement integers, e.g. for IP headers).

As you say, 1) is addressed soon, and 2) is a pain point not only for wrapping operations; see my example above.

For 3), I think the right approach would be to make &mut [Wrapping<u8>] convertible to &mut [u8] and vice versa; this way everyone (not only the standard library) gets better type conversions.
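A minimal sketch of one direction of that conversion, assuming Wrapping<T> is guaranteed to share T's representation (true today via #[repr(transparent)], though it was not promised at the time):

use std::num::Wrapping;

fn as_wrapping(bytes: &mut [u8]) -> &mut [Wrapping<u8>] {
    // Sound only because Wrapping<u8> has exactly u8's layout.
    unsafe { &mut *(bytes as *mut [u8] as *mut [Wrapping<u8>]) }
}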

I feel like rushing to get a fundamentally new solution into 1.0 is wrong; it should be either incremental improvements now, or a new (and easier) way to deal with wrapping operations in 1.1.

It could also be limited to the next statement.

#[wrapping]
let x = y + z; // wrapping
let a = x + y; // not wrapping

The only issue with such a system is that it cannot be infectious. Calling a function inside a wrapping block may not, under any circumstances, cause wrapping inside that function. The only thing allowed to happen is that the basic operators call a different function depending on whether the operator is inside a wrapping block or not.

Also: the wrapping types could be in their own crate for everyone who likes modulo types.

Side note: Ada has modulo types, doesn't have +=, and requires all types in an operation to be the same type. So it's exactly like the current situation. Ada modulo types wrap at arbitrary values, though, not just unsigned::max.

That's a different issue entirely. It's like having to write Int::zero() whenever working with generic types. Maybe after const-fn this can be made better through a FromLiteral trait.

Yes, my concern was that it would be hard to know the semantics of any given +. I was basing the scoping rules roughly around unsafe fns/blocks, though the analogy is not exact since #[wrapping] is not part of the interface of a fn, unlike unsafe.

Another, simpler possibility would be to disallow overloaded ops inside #[wrapping] code. Maybe they aren't really needed. Or we could just let overloaded ops behave as normal; but I find that messes with the "a + b is just Add::add(a, b)" mentality -- of course, so does disallowing overloaded ops. I guess having fn wrapping_add(self, other: RHS) -> Self::Output { self.add(other) } in the Add trait is also a fairly painless addition. In any case, this is a minor point.

I think we should first exhaust the possibilities for improving the ergonomics within the general language before we resort to adding some kind of special-purpose features (operators, scopes, whatever). We could add typedefs like type u32w = Wrapping<u32>. We should add “heterogeneous” impls of the various operations, so that e.g. (foo: Wrapping<T> + bar: T): Wrapping<T> (to the extent permissible under coherence rules). We could add an easier way to wrap things in Wrapping, whether fn W<T>(n: T) -> Wrapping<T>, or if we grow something like const fn, we could have #[lang(from_literal)] trait FromLiteral { type From; const fn from_literal(From) -> Self; } impl<T> FromLiteral for Wrapping<T> { type From = T; ... } as suggested by @ker. Something analogous to Haskell’s Control.Newtype might be useful. If we’ve done all that and still feel that the ergonomics are in dire need of improvement, then we should think harder about adding some kind of special language support for it.
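Written out, the FromLiteral idea sketched inline above might look like this (hypothetical; the lang-item wiring is assumed, and it is shown without const fn, which does not exist yet):

use std::num::Wrapping;

// Hypothetical: the compiler would route integer literals through this trait.
trait FromLiteral: Sized {
    type From;
    fn from_literal(lit: Self::From) -> Self;
}

impl<T> FromLiteral for Wrapping<T> {
    type From = T;
    fn from_literal(lit: T) -> Wrapping<T> {
        Wrapping(lit)
    }
}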


Wrapping (and maybe saturating) operators make the most sense to me in general. The wrapping annotation would need to be hooked into by so much stuff, and would be super confusing no matter what you decide.

Does #[wrapping] hook into 2.pow(3), 2 << 3, and my_int + my_int? What happens if MyInt fails to impl wrapping add? Does it error? Silently ignore?

Having explicit methods/traits/operators makes what is and isn't supported, as well as what is and isn't happening, more clear. You can also do things like iter.fold(0, SaturatingAdd::saturating_add) (or .fold(0, saturating_add) with super-UFCS).
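For what it's worth, that fold idiom is already expressible with today's inherent methods (a small sketch, using a closure in place of the hypothetical SaturatingAdd trait):

fn sum_saturating(xs: &[i32]) -> i32 {
    // Each step clamps at i32::MAX / i32::MIN instead of overflowing.
    xs.iter().fold(0i32, |acc, &x| acc.saturating_add(x))
}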

On the other hand, are there types that should only have wrapping or saturating semantics? Does that imply that they only impl +~ and not + or something?

I haven't thought this through as much as is necessary, but another option might be a newtype facility that promises that a destination type shares its representation with a source type, allowing bidirectional as casts between X<T> and X<newtype(T)>. (I don't believe the current struct(T) "newtype" facility makes the shared-representation guarantee.) Then newtype Wrapping<T> = T where T: WrappingOps;, or something. Regarding @alexcrichton's pain points:

  • Wrapping<T> += T might be a += 3 as _
  • Wrapping<T> = T might be a = 3 as _
  • Wrapping<T> as T would just work.
  • foo(Wrapping<T>) might be foo(Wrapping<T> as _).

Still painful, but in, I think, a good way. (In general, I'm not sure you want to silently change an operand from one type to another: a = 3; will mean you want the result to be type_of(a), and ideally type inference should make sure that 3 is well typed in that statement. But do we want a + b to be possibly different from b + a in corner cases? E.g. a: Saturating<u8> = 255, b: Wrapping<u8> = 2.)

I’ve been thinking along these lines (explicit newtype keyword) as a way of avoiding unnecessary code duplication due to monomorphisation of container types, but the idea isn’t as fleshed out as it wants to be.

I think that @nikomatsakis's proposal doesn't actually suffer from this problem, because it has the same scoping rules as unsafe (e.g. no unsafe modules, just functions, and no reach into nested functions). Or were you thinking of a different kind of scoping?

I definitely agree that we should think through what language features might help out the ergonomics here, but I do also think we should keep in mind that this is a very real ergonomic problem today and adding all the necessary language features may take a good deal of time.

This is a good point: what #[wrapping] applies to would need to be precisely defined. I think the answers to your questions would largely depend on the precise design (which is all up for debate), but they are certainly good things to keep in mind!

I would still be opposed to this solution, even restricted to functions and blocks. I really don't like the idea of changing code semantics based on attributes. It makes it easy to make mistakes such as copy-pasting code that suddenly has a different meaning, not checking operations that should be checked because they're in a #[wrapping] function and one either forgets or finds it too unergonomic to explicitly use debug-checked operations for the block, etc.

It could also be confusing to figure out what, exactly, it applies to. I'll echo @Gankra here. Does it apply to operations provided by methods (like pow)? Or the potential cast method discussed in this thread? (Probably not, in either case, but one could be forgiven for expecting otherwise.) What about using an operator like + on custom types? Your proposal seems to suggest that it would call wrapping_add instead of add, which may or may not be surprising, depending on the type. What if a type implements Add but not WrappingAdd? Does the operator fall back? (Again, this may or may not be desirable, depending on the type and its semantics.) Is it a compile error?

Finally, I still don't like the risk for ambiguity between "for performance" and "semantically want wrapping".

I do think it would be worthwhile to have attributes to enable and disable overflow checking for different regions in optimized builds (debug would still check, or at least have the option to) to tweak security-vs-performance trade-offs. But I think that can probably wait until more optimized "release-mode" checks are available (potentially aggregating checks, skipping checks for values that won't be used, not checking overflow for parts of expressions where the final result is guaranteed to be correct, etc.).

It does suffer from this problem. Unsafe doesn't have this problem because it doesn't change the meaning of any code; it just allows you to do more things. #[wrapping] would change the semantics of identical code based on an attribute of its containing function or block. Thus copying code into or out of the function would actually change the meaning of the code. Copying code into an unsafe block doesn't change its meaning, and code copied out will either do the exact same thing or result in a compiler error. This is not at all true for #[wrapping].

Given all of the drawbacks of #[wrapping], if we don't want to wait for the language features to solve this generally, I think adding wrapping operators is the only viable short-term solution. I might personally prefer something like +% (I like having the main operator before the modifier, and the modulo reference makes a bit more sense to me than Swift's mask reference), but I don't think that's particularly important. To respond to some of the stated downsides:

If the use case is not sufficiently large, I think we should definitely focus on incremental improvements and general language features that would improve the ergonomics and efficiency of working with wrapping types and operations, as others have suggested. The premise of having any kind of special-case solution (including #[wrapping]) is that this specific problem is common enough and currently painful enough that it's worth a specialized solution.

While technically this may be true, on a conceptual level, it really doesn't feel like it. Rather than feeling like a bunch of new operators, it really feels like adding a single operator "modifier", where typing +% calls wrapping_add instead of add, and the other operators are analogous. Thus, I don't think it would add a significant mental burden.

Since the premise of adding a specialized feature is that the current issues with wrapping arithmetic are too pressing to wait for general solutions, I'm not sure it makes sense to worry about other situations that may want ergonomic improvements in the future. For the specific example of SIMD, I think one would want SIMD-specific types due to specific size and alignment requirements, and thus it would be fine for it to overload standard operators. Given the existence of wrapping operators, such types could even choose to only implement wrapping operators to convey that the operations wouldn't be checked.

I don't think other collective operations would ever want their own operators unless you had a situation where both collective and non-collective operations made sense for a given type, but this seems rare.

I think this is fairly obvious for the wrapping-operators approach: + calls add, +% calls wrapping_add, and using either kind of operator on a type that doesn't implement the respective trait would be a compile error.

This assumes that all usages in a function or block will be either wrapping or not wrapping, which I'm not sure is generally the case. It's been a while since I worked on the interpreter I wrote, but as I recall, operations where overflow was desired were often intermixed with operations where it would have been an error. Wrapping operators allow each wrapping operation to be individually annotated without significant burden. Without a lightweight way to do this, there will once again be an ambiguity: were all of the operations in this function meant to be wrapping, or was it only 20%, with the annotation added because it was much easier than using wrapping_* all over the place or making lots of little blocks to annotate with #[wrapping]?


You can do this today:

use std::num::wrapping::Wrapping as W;
fn main() {
    let x = W(5);
}

It seems to me like the "nice" solution to Wrapping<T> would be to just have more support for them from the language:

  • Numeric literals should convert to Wrapping<T> as easily as to T, with the same rules that apply to numeric literals converting to u32, u64, f32, f64.
    • This solves Alex's operations 1 and 2.
  • Support coercion for a as T.
    • This directly solves operation 3, and semi-solves 4 with foo(a as T).

I don't know if allowing foo(a: Wrapping<T>) makes sense, because you certainly want to disallow (a: Wrapping<T>) + (b: T), which is in some sense a.add(b) and a subset of the function-coercion issue. It also might make some sense to keep it explicit: if you are doing something like emulation, and foo(T) doesn't use wrapping, it might be good to be reminded that foo(a) might be doing math in non-wrapping ways.

That all seems sane to me, and also seems like it would solve much of the ergonomics. What do you think?

The other problem with this is that it's easy to miss one and not notice. With types, you are forced either to use the same wrapping operators consistently, or to explicitly mark instances as not using wrapping operators.

As an example of this, Matlab has something similar with the matrix operators *, / and ^, and the "element-wise" operators .*, ./ and .^. Then you have some big expression like this:

c = a .* ((1 + b)*(1 - b)).^2 + cos(3 .* a./b)

And if by accident I use * in the middle instead of .*, there will be no error at compile time or even run time: the two operators output the same types, so the result will be silently wrong. With a type system backing up Wrapping, that sort of mistake would be much harder to make.
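In Rust terms, a minimal sketch of how the Wrapping type surfaces that kind of mistake:

use std::num::Wrapping;

fn demo(a: Wrapping<u8>, b: u8) -> Wrapping<u8> {
    // a + b          // refuses to compile: wrapping and plain values don't mix silently
    a + Wrapping(b)   // the wrapping intent has to be written out
}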


The problem is that you might not need a wrapping operation in one place but still use one, because you don't bother to distinguish. This doesn't happen with unsafe, because it only allows more operations and does not change existing ones.

But my main problem with this proposal is that it favors Wrapping over all other newtypes (which I think of as a hack), rather than trying to make newtype wrappers more convenient in general, so third-party newtypes can benefit from this too.


I think distinct operators are the way to go, because wrapping arithmetic is mathematically and semantically distinct from non-wrapping arithmetic. I mean, you can’t just bitcast a Vec<i32> to a Vec<Wrapping<i32>> and expect sum to still be meaningful. Wrapping types are a kludge. I don’t regard annotations as being much better; the question of what they do and don’t affect is a problem.

On the other hand, most people are not going to need wrapping (let alone saturating) arithmetic, so it seems a little dubious to dedicate operators and associated traits to them.

So, crazy idea: custom operators. (. causes the parser to go into “operator” mode, which works like parsing a macro invocation: it eats tokens until it finds the matching .) token. Everything in between is taken as an operator. Precedence is decided by the first sub-token that matches an existing operator, so if you see (.+%.), then it has the same precedence as +. The trait would need to be associated via some mechanism like so:

#[operator="+%"]
trait AddUnderMod<Rhs> {
    type Output;
    fn add_under_mod(self, rhs: Rhs) -> Self::Output;
}

Of course, you could always make the precedence configurable as well, but this seems like a nicer, more predictable rule.

Thus, code that needs wrapping arithmetic can just use the associated trait. Same with saturating arithmetic, linear algebra, etc.

Edit: I just realised that, by accident, this has some nice properties: editors and the like can recognise custom operators, and determine precedence without having to do semantic analysis, or even have all files available. It also safely namespaces custom operators from “builtin” ones, so we don’t have to worry about forward compatibility.

Downside: it opens the door to “special snowflake” syndrome where every codebase has its own set of completely non-standard operators.


A few weeks ago, an event I like to refer to as the wrapocalypse occurred: Rust started panic!()ing on wrapping math. This event may not have been noticed by many in the Rust community, but as the maintainer of Rust-Crypto I, unfortunately, was quite aware of it (ideally I would have dealt with it earlier, but c'est la vie). Anyway, the result was that Rust-Crypto no longer worked in debug mode, due to the large number of operations in Rust-Crypto that require wrapping math.

So, I set out to fix it. Attempt #1 was to wrap all the variables that needed wrapping behavior in the Wrapping newtype. As others have pointed out, Wrapping has a variety of ergonomic issues. After struggling with that for a while, I gave up on that approach. Instead, I converted all of the operations that needed wrapping behavior to use wrapping_{add,sub,mul}, which solved the problem.

Based on that experience, my strong suggestion is to get rid of Wrapping altogether. It has all the ergonomic issues mentioned above. Worse, it's highly un-ergonomic to transform a u32 into a Wrapping<u32>, do some math, and then transform it back out again. In fact, that's the primary reason I gave up on it. Any variable that needs wrapping math has to come from somewhere before you operate on it, and once you're done with whatever calculations you need to do, it has to go somewhere. The problem is that whatever function produced it (for example, a function reading a vector of u32s out of a [u8]) probably wants to return a [u32]. And whatever function you want to pass it to next (maybe a function writing it to another [u8]) probably wants a u32 as well. Having to newtype the value before working with it meant that there was a tremendous amount of boilerplate code just wrapping and unwrapping numbers. The wrapping_{add,sub,mul} functions don't have this problem and, at least in the case of Rust-Crypto, are much easier to use.
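A small sketch of the contrast being described (hypothetical mixing function, not actual Rust-Crypto code):

use std::num::Wrapping;

// With the newtype: wrap every operand on the way in, unwrap on the way out.
fn mix_newtype(words: &[u32]) -> u32 {
    let mut acc = Wrapping(0u32);
    for &w in words {
        acc = acc + Wrapping(w);
    }
    acc.0
}

// With the methods: plain u32 flows straight through.
fn mix_methods(words: &[u32]) -> u32 {
    words.iter().fold(0u32, |acc, &w| acc.wrapping_add(w))
}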

I think the basic problem with Wrapping is, as others have suggested, that wrapping arithmetic is a kind of operation on a number: it's a property of the operation as opposed to a property of the type. Right now, when you put a (key, value) into a HashMap, if the key already exists in the map, then the existing value is replaced with the new value. As an alternative, we could reject the update. If we wanted to provide such behavior, I think the way we'd do it would be via a new method with those semantics. I don't think we'd create a newtype wrapper for HashMaps that modifies the behavior of the put operation -- it's un-ergonomic, and it hides what the operation does at the point where it's used, based on the declaration of a type which might not be anywhere near the use.

The suggestion for #[wrapping] might make sense for annotating a bit of code that is performance critical. But for the more general case of trying to do wrapping math, I think it's awkward. Just because I want 51% of the math operations in a function to be wrapping doesn't mean that I want the remaining 49% to have that behavior as well. It's also a kind of "spooky action at a distance": I can't just look at the operation being performed to see what's happening; I also have to look at the function declaration. It's basically declaring a new subset of Rust that some functions can opt into but others won't. It may solve some ergonomic issues, but I think it will introduce much bigger, more profound ones.

So, I described it as the "wrapocalypse" -- but the end result of all that work is that Rust-Crypto is now much better: if you read through the code, it is explicitly clear which operations wrap and which don't. This is a huge win. But I think there is too much focus on "how can we make newtypes work" -- I don't think they can. wrapping_{add,sub,mul} is not especially ergonomic, but even in a code base that makes unusually heavy use of wrapping operations, I'd say that the transition to these operations was a big win. I think the solution to the ergonomic problem is to treat the wrapping ops as distinct ops -- give them their own operators à la Swift. That fixes all of the ergonomic issues (in my opinion) while keeping the code clear and not inflicting wrapping math on operations that shouldn't wrap just because they happen to be located in a function that does lots of wrapping math.


It fixes all these problems for wrapping operations. While wrapping operations certainly are common, I worry that we "close the door" on other operations that deserve the same treatment, e.g. my example from above: one's-complement arithmetic for dealing with IP headers.


How about allowing arbitrary operators? Define a set of symbols, +-/*%^.=, that can be combined in any manner to create a new operator. To implement an operator, one defines a new trait that is tied to that operator.

#[operator="+%"]
trait BinaryOperator {
    type RHS;
    type Result;
    #[operator_fn]
    fn wrapping_add(lhs: Self, rhs: Self::RHS) -> Self::Result;
}

I think this is the crux of your argument, but I have to somewhat disagree: a type representing an odometer should never use non-wrapping operations. (To make this more relevant to the domains Rust is interested in, instead of an odometer, think TimeInMilliseconds: u16, where you're interested in the difference between two times but accept that the time can wrap around pretty quickly.) A type system is for statically declaring which operations make sense against which types of data, so an odometer-reading type should be prevented at the type level from using non-wrapping operations. Most numbers should never use the wrapping operations, and the type system should enforce that, too. Crypto is the special case, in that it should support both kinds of operations at once, and you've used the correct way of representing that (by naming the operations differently, rather than the types).
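A minimal sketch of such a wrap-only type (the name and shape are hypothetical, taken from the example above):

use std::ops::Sub;

#[derive(Clone, Copy, PartialEq, Eq)]
struct TimeInMilliseconds(u16);

impl Sub for TimeInMilliseconds {
    type Output = u16;
    // Subtraction is the only arithmetic offered, and it always wraps, so
    // checked arithmetic on time deltas is ruled out at the type level.
    fn sub(self, rhs: TimeInMilliseconds) -> u16 {
        self.0.wrapping_sub(rhs.0)
    }
}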

I understand that this means there are ergonomic issues with converting types between different arithmetic semantics. Those should be solved. It is, IMO, a general problem that Vec<NewType<T>> cannot be safely and explicitly cast to Vec<T>, and there should be a general solution to that problem.