I’ve stumbled upon yet another bug in C++ code due to mixing signed and unsigned types, and decided to check how it might be handled in Rust.
Actually, I’m a bit surprised.
fn main() {
    let a = -2;
    let b: u32 = 2;
    let c = a / b;
    println!("{} / {} = {}", a, b, c);
}
Output
<anon>:2:10: 2:12 error: unary negation of unsigned integers may be removed in the future
<anon>:2 let a = -2;
^~
error: aborting due to previous error
So far so good, but it looks like a special case. With minor changes, we get a terrible result from mixing signed and unsigned types.
fn main() {
    let a = 2;
    let b = 6;
    let c: u32 = 2;
    let d = (a - b) / c;
    println!("({} - {}) / {} = {}", a, b, c, d);
}
Output
(2 - 6) / 2 = 2147483646
Why is such unsafe code possible in such a safe language?
Also, it is hard to reason about a piece of code like let a = 2, as it is not possible to know whether a is signed or unsigned.
If you had compiled your example in debug mode, it would simply have panicked at run time.
thread '<main>' panicked at 'arithmetic operation overflowed'
There was a very heated debate some time ago about overflow checks in release builds, run-time costs, and related issues. It was decided that overflows should be checked in debug builds for testing, but not in release builds because of the performance impact.
Compile-time constraints would make every arithmetic expression a mess, because every non-constant expression can possibly overflow.
Actually, your code does not mix signed and unsigned types: a, b, c, and d are all u32. Subtraction of unsigned integers is defined in Rust. Moreover, integer overflow is defined in Rust (unlike signed integer overflow in C++): it produces a panic! in a debug build and wraps in a release build.
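To make the debug/release difference concrete, here is a minimal sketch that spells out both behaviors explicitly instead of relying on the build profile. The helper names `wrapping_result` and `checked_result` are made up for illustration; `wrapping_sub`, `checked_sub`, and `checked_div` are the standard methods on `u32`.

```rust
// Hypothetical helpers; the values mirror the (2 - 6) / 2 example above.

/// Wraps around on overflow in every build profile,
/// which is what the release build did in the example.
fn wrapping_result(a: u32, b: u32, c: u32) -> u32 {
    a.wrapping_sub(b) / c
}

/// Returns None if the subtraction overflows or if `c` is zero,
/// so the caller has to handle the failure explicitly.
fn checked_result(a: u32, b: u32, c: u32) -> Option<u32> {
    a.checked_sub(b)?.checked_div(c)
}

fn main() {
    // 2 - 6 wraps to 4294967292, and half of that is 2147483646.
    println!("{}", wrapping_result(2, 6, 2));
    // The checked variant reports the overflow instead of wrapping.
    println!("{:?}", checked_result(2, 6, 2));
    println!("{:?}", checked_result(6, 2, 2));
}
```

With these, the behavior no longer depends on whether overflow checks are compiled in.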
I'm aware of the overflow checks in debug builds and their absence in release builds, and that is obviously a good solution.
I'm wondering why this code is not rejected by the compiler. It is error-prone code, isn't it? I would rather have to add some explicit type annotation to make it compile.
Yes, I understand, but even if it is due to type inference rather than integral promotion as in C++, it has exactly the same result: an implicit switch from signed to unsigned arithmetic.
This code leads to an unexpected switch from signed to unsigned arithmetic; it is not about overflow or underflow.
As far as I understand, there is no signed arithmetic in your example. All initial values and all intermediate values are u32. Maybe there is a terminology issue here? What is your definition of unsigned and signed arithmetic?
I'm wondering why this code is not rejected by the compiler.
It is impossible to predict at compile time whether a certain operation will overflow, hence the run-time checks.
Also, it is hard to reason about a piece of code like let a = 2
You can use a literal suffix to make this obvious:
let a = 2u32;
let b = 2u64;
let c = 2i64;
let d = 2isize;
let e = 2usize;
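As a rough sketch of what that buys you: once the operands carry explicit signed types, the compiler refuses to mix them with a u32 divisor, and you have to choose a conversion yourself. The function name `signed_quotient` is hypothetical, and widening to i64 is just one possible choice.

```rust
/// Hypothetical helper: divides a signed difference by an unsigned count.
/// Writing `(a - b) / c` directly would not compile here, because
/// i32 and u32 cannot be mixed in one arithmetic expression.
fn signed_quotient(a: i32, b: i32, c: u32) -> i64 {
    // i64 can represent every i32 and every u32 value, so these
    // conversions are lossless and the subtraction cannot overflow.
    (i64::from(a) - i64::from(b)) / i64::from(c)
}

fn main() {
    let a = 2i32;
    let b = 6i32;
    let c = 2u32;
    println!("({} - {}) / {} = {}", a, b, c, signed_quotient(a, b, c));
}
```

This still panics if `c` is zero, as any integer division does.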
Ok, read this code. It is pretty straightforward, isn’t it?
fn print_foo(foo: Foo) {
    let a = 2;
    let b = 6;
    let c = a - b;
    let d = c / foo.count;
    println!("{}\n", d);
}
fn main() {
    let foo = Foo { count: 2 };
    print_foo(foo);
}
Sure, Foo is defined in another module/file/library, as it usually is in non-trivial code. One morning someone changed the definition of Foo from
Hm, I've found one more potential source of confusion here. Integer literals without suffixes are polymorphic. Their type is inferred from use, if it is unambiguous, and defaults to i32 (am I correct here?) if it is not constrained.
That is, in the following code
let c = 92;
Here c can have any integral type, and you need to see the usages of c to determine its precise type. In your first example,
let a = -2;
let b:u32 = 2;
let c = a / b;
all three variables are typed as u32 because of the explicit annotation for b.
So this
Is not always true. There is not enough information to say whether these are signed or unsigned numbers. If the next line is, say, let d = c / 2u32, then these are unsigned 32-bit numbers. But if the next line is let d = c * 92i64, these are signed 64-bit numbers.
This looks complicated, but in practice it is rarely a problem. Oftentimes you have an explicitly typed variable in the expression, and you can always use suffixes. Please also note that the issue is not with the type of an arithmetic expression, but with the type of a literal.
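If you want to see the inference at work, here is a small sketch that prints the type the compiler picked for the same literal in three contexts. `type_name_of` and the three functions are hypothetical helpers; `std::any::type_name` is a real standard-library function, but its exact output is documented as best-effort, so treat the printed strings as diagnostic only.

```rust
use std::any::type_name;

/// Hypothetical helper: reports the inferred type of a value.
fn type_name_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

/// Nothing constrains the literal, so the i32 fallback applies.
fn unconstrained() -> &'static str {
    let c = 92;
    type_name_of(&c)
}

/// Dividing by a u32 forces `c` to be u32 as well.
fn divided_by_u32() -> &'static str {
    let c = 92;
    let _d = c / 2u32;
    type_name_of(&c)
}

/// Multiplying by an i64 forces `c` to be i64.
fn multiplied_by_i64() -> &'static str {
    let c = 92;
    let _d = c * 92i64;
    type_name_of(&c)
}

fn main() {
    println!("{}", unconstrained());
    println!("{}", divided_by_u32());
    println!("{}", multiplied_by_i64());
}
```

The literal is written identically in all three functions; only the later usage differs.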
Yes, in such a case the only guarantee is a type annotation on a variable or a suffix on a literal. I think it’s a dilemma of static polymorphism in general:
if you change a type, you don’t need to change all its usages, and that’s good.
if you change a type, the semantics of each usage silently change and you are not warned by a compiler error, and that’s bad.
Here is an example of a similar issue without integers:
struct A;
struct B;
impl Default for A {
    fn default() -> A { A }
}
impl Default for B {
    fn default() -> B { B }
}
trait T {
    fn do_something(&self);
}
impl T for A {
    fn do_something(&self) {
        // Make something good.
    }
}
impl T for B {
    fn do_something(&self) {
        panic!("Destroy the world.")
    }
}
struct Foo {
    field: A
}
fn act(foo: Foo) {
    foo.field.do_something();
}
fn main() {
    let foo = Foo { field: Default::default() };
    act(foo);
}
FWIW, there was a period of time when there was no implicit fallback type at all. If an unsuffixed literal couldn’t be inferred, you’d get an error. So your example of “let a = 2; let b = 6; let c = a - b;” would have failed if there was nothing else to infer the exact type.
RFC 212 restored the fallback as i32, which might be interesting reading for you (including PR comments).
To be clear: default inference basically only kicks in for examples or unit
tests. All other code I’ve ever seen pretty quickly forces the types to be
concrete (either by interacting with a struct or a function). For instance,
if you index into an array, it’s gotta be a usize.
The fact that you can get silly things in toy code is not particularly
concerning to me.
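A quick sketch of that point, with made-up names: passing an unsuffixed literal to a function or storing it in a struct field pins its type immediately. The `Foo` here is a stand-in whose `count: u32` field is an assumption, and `nth` and `type_name_of` are hypothetical helpers; the strings from `std::any::type_name` are best-effort diagnostics.

```rust
use std::any::type_name;

/// Hypothetical helper: reports the inferred type of a value.
fn type_name_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

/// Stand-in struct; the u32 field type is assumed for illustration.
struct Foo {
    count: u32,
}

/// Hypothetical function whose signature demands a usize index.
fn nth(xs: &[i32], i: usize) -> i32 {
    xs[i]
}

/// The literal becomes usize because `nth` takes a usize.
fn pinned_by_function() -> &'static str {
    let i = 1;
    let _value = nth(&[10, 20, 30], i);
    type_name_of(&i)
}

/// The literal becomes u32 because the field is declared as u32.
fn pinned_by_struct() -> &'static str {
    let n = 2;
    let foo = Foo { count: n };
    let _count = foo.count;
    type_name_of(&n)
}

fn main() {
    println!("{}", pinned_by_function());
    println!("{}", pinned_by_struct());
}
```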
One minor caveat / question:
I haven’t checked this but I’d personally expect some kind of lint/warning when a negative literal is inferred to be unsigned. I know that C programmers think that:
unsigned a = -1;
is perfectly fine and idiomatic but even in C this is a type unsafe way of writing the equivalent type safe: