Heterogeneous math

From my limited Rust experience, the inability to elegantly and safely mix signed and unsigned types is rather frustrating.

I can’t find a good reason to forbid comparison of different numeric data types (e.g. comparing i32 and u64). The answer is always well-defined, after all.

I’ve seen some mentions of heterogeneous comparisons on this forum, but couldn’t find any related RFCs. Are there any? I’ll write and implement one if there are none.


On a side note, can anybody think of a reason to not have heterogeneous compound operators?

“u16 + i8” is problematic not because there is no right way to compute it, but because we don’t know the desired return type. It is always possible to return i32, but that’s not very practical. In this case the user probably wants i16 or u16, but we don’t know which one. Math could be made polymorphic in the return type, but that really over-complicates everything.

Compound operators don’t have this problem because the result type is always known. There is only one way to perform “i16 += u8”: widen to 16 bits, add, and check the overflow flag. The same goes for all other heterogeneous compound operators.
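As a rough illustration of the steps just described, here is a minimal sketch of “i16 += u8” written as a free function. The name add_assign_u8 is hypothetical, not a std API, and it reports overflow through a bool where a real operator would presumably follow the usual overflow rules. It is a function rather than an AddAssign impl because a std trait cannot be implemented for std types outside the standard library.

```rust
/// Hypothetical helper: what "i16 += u8" could boil down to.
/// Widens the u8 losslessly to i16, then performs a checked add.
/// Returns false (leaving `acc` untouched) if the sum does not fit in i16.
fn add_assign_u8(acc: &mut i16, rhs: u8) -> bool {
    match acc.checked_add(i16::from(rhs)) {
        Some(sum) => {
            *acc = sum;
            true
        }
        None => false,
    }
}
```

Starting from -10 and adding 255 leaves 245 in the accumulator; adding 2 to i16::MAX - 1 is reported as an overflow.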

Example:

Adding a signed delta to an unsigned value is a relatively common operation (if you use unsigned types at all, that is). Right now the only correct way to do it is with an “if”:

let mut base: u32 = 100;
let delta: i32 = -2;
// base = (base as i32 + delta) as u32; // cannot be used, incorrect overflow check
if delta >= 0 {
    base += delta as u32;
} else {
    // unsigned_abs avoids the overflow that -delta hits when delta == i32::MIN
    base -= delta.unsigned_abs();
}

A heterogeneous compound operator is much nicer and just as safe:

let mut base: u32 = 100;
let delta: i32 = -2;
base += delta; // proposed syntax; the compiler rejects this today
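For comparison, newer standard libraries (Rust 1.66 and later, to the best of my knowledge) offer this operation as integer methods rather than as an operator. A minimal sketch, where apply_delta is a hypothetical wrapper name:

```rust
/// Hypothetical wrapper: adds a signed delta to an unsigned base.
/// Returns None if the result would fall outside the u32 range.
fn apply_delta(base: u32, delta: i32) -> Option<u32> {
    base.checked_add_signed(delta)
}
```

apply_delta(100, -2) yields Some(98), while apply_delta(1, -2) yields None instead of wrapping around.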

I think it’s an inference problem. x < 2 can get the type for 2 from x because there’s no mixing allowed, but if it could be x < 2u32 or x < 2i32 (assuming 32-bit x), then it would be ambiguous without the type suffix on the literal.

Note the inference breaks caused by the following PR, for example: https://github.com/rust-lang/rust/pull/41336

I feel your pain. Casting all over the place is annoying.

Instead of “as”, try using .into(). The “as” operator is dangerous and error-prone, since it won’t warn about accidental lossy casts (such as x as i16 where you were originally widening a u8, but at some point the expression was refactored and x is now an i32).
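A small sketch of the difference, using hypothetical function names. The cast silently keeps the low 16 bits, the checked conversion reports the problem, and the lossless conversion only compiles while the source type is narrow enough:

```rust
use std::convert::TryFrom;

/// Lossy: "as" silently truncates when the value does not fit.
fn narrow_with_as(x: i32) -> i16 {
    x as i16
}

/// Checked: returns None instead of a truncated value.
fn narrow_checked(x: i32) -> Option<i16> {
    i16::try_from(x).ok()
}

/// Lossless widening: into() only compiles when no value can be lost,
/// so changing the parameter to i32 would become a compile error.
fn widen(x: u8) -> i16 {
    x.into()
}
```

With x = 70_000, narrow_with_as returns 4464 (70_000 - 65_536), while narrow_checked returns None.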

Sadly, many Rust users assume that any change to Rust’s semantics would make it just as bad as C, and oppose even the most conservative relaxation of the rules.

Is it really a problem? The result is the same regardless of the type you choose. Rust doesn't have a problem picking one specific type when I write:

println!("{}", 1)

Into is somewhat better than as, but using it still breaks all overflow checks.

let a = 255u8;
let b = -1i8;
let c = a + b; // this will overflow if you just use "as" or "into"
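To make the failure concrete: with the values above the mathematically correct result is 254, but reinterpreting b as u8 turns -1 into 255, and the unsigned addition overflows. A sketch with hypothetical function names, where the second version widens both operands to i16 (which holds every u8 and i8 value) and narrows back at the end:

```rust
use std::convert::TryFrom;

/// Naive version: casts the signed operand with "as", so -1 becomes 255.
/// Returns None when the u8 addition overflows.
fn add_via_cast(a: u8, b: i8) -> Option<u8> {
    a.checked_add(b as u8)
}

/// Widening version: i16 can represent every u8 and every i8,
/// so the sum is exact and only the final narrowing can fail.
fn add_via_widening(a: u8, b: i8) -> Option<u8> {
    u8::try_from(i16::from(a) + i16::from(b)).ok()
}
```

add_via_cast(255, -1) reports an overflow even though the true result fits, whereas add_via_widening(255, -1) returns Some(254).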

This particular case has a slight invisible complexity. There is no machine type that can represent both a: i32 and b: u64, so the translation would have to be more code than expected (i.e. a == b -> (a >= 0) && ((a as u64) == b)).
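Spelled out as code, the translation might look like the sketch below. Both function names are hypothetical; naive_eq is included only to show what a plain cast does with a negative operand.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: exact comparison of an i32 against a u64.
fn cmp_i32_u64(a: i32, b: u64) -> Ordering {
    if a < 0 {
        // Any negative number is below every unsigned value.
        Ordering::Less
    } else {
        // Non-negative i32 values convert to u64 without loss.
        (a as u64).cmp(&b)
    }
}

/// What a plain cast does instead: -1 sign-extends to u64::MAX.
fn naive_eq(a: i32, b: u64) -> bool {
    (a as u64) == b
}
```

The extra cost over a same-type comparison is one sign check.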

It's a very small price to pay for the convenience though.

That's true, but there is no faster way to perform it correctly anyway. Right now most people would just cast it, potentially getting an incorrect result.

It's probably a bad idea, but heterogeneous ops could be optional. From the language's perspective they are just a set of traits. They don't have to be automatically imported/used.

For what it’s worth, I totally agree that those things should be allowed.

That's a different feature, where a literal defaults to i32 if it's unconstrained, mostly so that little examples like that work.

But it's not picking it smartly:

fn main() {
    println!("{:?}", 4_333_222_111);
}

Will tell you

warning: literal out of range for i32
 --> <anon>:2:22
  |
2 |     println!("{:?}", 4_333_222_111);
  |                      ^^^^^^^^^^^^^
  |
  = note: #[warn(overflowing_literals)] on by default

and print 38254815.
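The printed value is the literal reduced modulo 2^32: 4_333_222_111 - 4_294_967_296 = 38_254_815. The same truncation can be reproduced with an explicit cast (a sketch; wrap_to_i32 is a hypothetical name):

```rust
/// Keeps only the low 32 bits, as happened to the out-of-range literal.
fn wrap_to_i32(v: i64) -> i32 {
    v as i32
}
```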

It could work the same way for unconstrained comparisons.

Rust could be a bit smarter about literals, but that's a rather minor inconvenience in my book.

What about a “safenum” or “midnum”, as opposed to “bignum”, package from which anything could be converted cleanly?

It could represent numerical types as an enum of the actual machine types and do the correct comparisons based on how the enum destructured?

We should avoid storing a “safenum” on the heap due to the overhead of the enum and runtime checks, but if we create and consume them transiently then inlining should eliminate those runtime checks.

In general, conversions from a safenum would tend to produce an Option or Result.
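A minimal sketch of what such a type might look like, assuming just two variants for brevity. SafeNum, compare and to_u32 are all hypothetical names, not an existing crate:

```rust
use std::cmp::Ordering;
use std::convert::TryFrom;

/// Hypothetical "safenum": a transient wrapper over machine integers.
#[derive(Clone, Copy, Debug)]
enum SafeNum {
    Signed(i64),
    Unsigned(u64),
}

impl From<i32> for SafeNum {
    fn from(v: i32) -> Self {
        SafeNum::Signed(i64::from(v))
    }
}

impl From<u64> for SafeNum {
    fn from(v: u64) -> Self {
        SafeNum::Unsigned(v)
    }
}

impl SafeNum {
    /// Exact comparison, chosen by destructuring both operands.
    fn compare(self, other: SafeNum) -> Ordering {
        match (self, other) {
            (SafeNum::Signed(a), SafeNum::Signed(b)) => a.cmp(&b),
            (SafeNum::Unsigned(a), SafeNum::Unsigned(b)) => a.cmp(&b),
            (SafeNum::Signed(a), SafeNum::Unsigned(b)) => {
                if a < 0 { Ordering::Less } else { (a as u64).cmp(&b) }
            }
            (SafeNum::Unsigned(a), SafeNum::Signed(b)) => {
                if b < 0 { Ordering::Greater } else { a.cmp(&(b as u64)) }
            }
        }
    }

    /// Conversion back to a machine type is fallible, hence Option.
    fn to_u32(self) -> Option<u32> {
        match self {
            SafeNum::Signed(v) => u32::try_from(v).ok(),
            SafeNum::Unsigned(v) => u32::try_from(v).ok(),
        }
    }
}
```

When the values are created and consumed within one expression, the variant is known at compile time, so the match arms should be easy for the optimizer to resolve after inlining.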

In the long run, I’d love to see numerically constrained types make this stuff work more cleanly.

I can see how it can be useful if you need to check for overflows, but don't want to check every single operation.

That's somewhat orthogonal functionality, though. Heterogeneous comparisons can be implemented efficiently without any wrappers.

I think this could be worth multiple different RFCs:

  • Expanding the set of implementations of AddAssign and other compound operators is a clear win: the result type is obvious, and the implementation takes care of performing the correct underflow and overflow checks where the user could stumble. It might cause inference issues, but those are excluded from the backward compatibility guarantees.
  • Expanding the set of implementations of Add and other non-compound operators is more problematic. Some cases have an obvious result type (when one type can be “promoted” losslessly to the other), but other combinations do not (i32 + i16 and i32 + u16 are obvious, u32 + i16 is not). Widening to a larger type is somewhat surprising.
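The “obvious” cases in the second point are the ones where std already provides a lossless From conversion. A sketch of that rule as a generic helper (add_promoted is a hypothetical name, not a proposed API):

```rust
use std::ops::Add;

/// Hypothetical helper: adds `b` to `a` after promoting `b` to type A.
/// Only compiles when a lossless `From<B> for A` exists.
fn add_promoted<A, B>(a: A, b: B) -> A
where
    A: From<B> + Add<Output = A>,
{
    a + A::from(b)
}
```

add_promoted(1_000_000i32, -5i16) and add_promoted(1i32, 65_535u16) both compile, because i32 implements From for i16 and u16. A call such as add_promoted(1u32, -5i16) is rejected at compile time, since u32 does not implement From<i16>, which matches the problematic case named above.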

So I would propose starting with a first RFC for compound operators: it’s immediately actionable, and doesn’t preclude discussing the others.

There’s some previous discussion here. (I called it “integer promotion” there.)
