(Name/syntax to be bikeshedded; in this post I use T as U as an abbreviation of let x: T; x as U.)
Currently the as operator used for casting does one of two things:
Lossless, reliable conversion to a compatible type of same or larger size (e.g. u32 as u64).
Potentially lossy and dangerous truncation to a smaller type (e.g. usize as u8).
The problem is that I often only want the first one, but I have no way of preventing the second case from unintentionally happening.
Although unexpected integer overflow is "safe" per Rust's strict definition, it's still a problem that can cause bugs, data loss and panics in Rust code, and cause memory safety issues when wrong values are passed through FFI.
For example, usize as u32 is correct on 32-bit architectures, but on 64-bit architectures it will compile without warnings to code that may be subtly broken (and vice versa with u64 as usize).
It's easy to say "well, just be careful with types you cast", but that's not so easy in practice:
Sizes of aliased type names are non-obvious. How do you know whether somelibrary::Distance as usize is correct (lossless) on all platforms?
And you can't know whether it'll remain correct in v2.0 of the library (but the cast will, unfortunately, compile without warnings even when it becomes incorrect).
Sizes of usize and C types can vary, especially on architectures that I don't test on.
I suppose all of my Rust code would compile on 16-bit platforms, but would fail miserably at runtime. I'd prefer it to fail to compile on some architectures if some uses of as that were lossless on 64-bit became lossy on the smaller architecture, as the compile errors would help in reviewing and fixing the code.
For integers, Rust has overflow-checked and wrapping variants of arithmetic. I think of the current as operator as the wrapping variant of type conversion, and I'd like to have a variant that prevents unintentional overflows.
By the way, while the case may be clear with u32 as u8, there are a good number of cases where the information loss may not be obvious (or may only happen for certain values).
But how do you distinguish between deliberate loss and accidental loss?
There are cases where truncation is desired, e.g. hashing and other bit-twiddling. There are cases where the loss won't happen in practice (in the examples below it's obvious, but in practice the check/rounding could happen in another part of the program).
if x < 256 { x as u8 } else { 255u8 };
let y: i64 = float.round() as i64;
I'm planning to write an RFC based on this experiment, with into() for lossless integer conversions and checked_cast() / wrapping_cast() for potentially lossy integer conversions, but the issue didn't seem pressing, so I was in no hurry. Now that there's at least one more interested person, maybe it's time to start writing.
But how do you distinguish between deliberate loss and accidental loss?
I don't. My suggestion is to allow the lint by default, and you can set it to warn wherever you have code that should not have such conversions. Also note that your second example could overflow and doesn't handle NaN.
You still got that backwards: it would be allow by default, and lint only if you #![warn(lossy_cast)]. Look into the clippy issue; it has more detailed information.
That said, I'm pretty happy with the idea of using Into for this. As I said elsewhere, whoever designed this is a genius.
The problem with "allow by default" lints is that you can be entirely unaware of their existence. IMHO, such lints as "lint_as" should be warn by default: not actually stopping you from compiling, but letting you know there's something to check.
@kstep: Agreed. But in this case, the lint would either be ridiculously complex (it would have to check value ranges for all bindings throughout the program, using symbolic execution to check all code paths, etc., and would still get false positives) or catch a lot of false positives (i.e. anywhere we do a narrowing cast).
If we choose the former, the lint will probably never get done at all.
If however we choose the latter, weāll likely split it into three (as in the clippy issue):
possible overflow (e.g. u32 as u8): this will probably come up so often that people will just #![allow] it should we decide to Warn by default.
loss of sign (e.g. i32 as u64): this is probably uncommon enough that Warn would be OK.
loss of precision (e.g. i32 as f32): I believe this should be Allow by default, for it isn't too unusual, and the loss of precision is acceptable for a large set of programs.
As an idea, maybe the lint should be split up into several ones? E.g. #[allow(narrowing_cast)], #[allow(sign_losing_cast)], #[allow(precision_losing_cast)] (not sure about naming, though).
Without syntax in the language or traits in core, there isn't even a good way for a linter to distinguish between cases of lossy-cast-intended-to-be-lossy and lossy-cast-by-accident, so I don't think a linter can be a solution to this problem.
Optional lints, especially from 3rd-party packages that are not part of the language, end up being a "nice to have" thing, and such warnings are generally treated as a matter of opinion; but IMHO, casts causing data loss where the programmer required a lossless conversion should always be hard compilation errors.