`lossless_as` operator

kornel · July 14, 2015, 10:46am

(Name/syntax to be bikeshedded. In this post I use `T as U` as an abbreviation of `let x: T; x as U`.)

Currently the as operator used for casting does one of two things:

Lossless, reliable conversion to a compatible type of same or larger size (e.g. u32 as u64).
Potentially lossy and dangerous truncation to a smaller type (e.g. usize as u8).
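To make the two behaviours concrete, here is a minimal sketch. The function names `widen` and `narrow` are only illustrative and are not part of any proposal.

```rust
/// Widening cast: every u32 value fits in a u64, so nothing is lost.
fn widen(x: u32) -> u64 {
    x as u64
}

/// Narrowing cast: only the low 8 bits survive, with no warning.
fn narrow(x: usize) -> u8 {
    x as u8
}

fn main() {
    assert_eq!(widen(u32::MAX), 4_294_967_295u64);
    assert_eq!(narrow(255), 255);
    assert_eq!(narrow(256), 0); // wraps around silently
    assert_eq!(narrow(300), 44); // 300 mod 256
}
```

Both functions use the same `as` syntax, so nothing at the call site tells a reader which kind of conversion is happening.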

The problem is that I often only want the first one, but I have no way of preventing the second case from unintentionally happening.

Although unexpected integer overflow is “safe” per Rust’s strict definition, it’s still a problem that can cause bugs, data loss and panics in Rust code, and cause memory safety issues when wrong values are passed through FFI.

For example usize as u32 is correct on 32-bit architectures, but on 64-bit architectures it will compile without warnings to code that may be subtly broken (and vice-versa with u64 as usize).

It’s easy to say “well, just be careful with types you cast”, but that’s not so easy in practice:

Sizes of aliased type names are non-obvious. How do you know whether somelibrary::Distance as usize is correct (lossless), on all platforms?
And you can’t know whether it’ll remain correct in v2.0 of the library (but the cast will, unfortunately, compile without warnings even when it becomes incorrect).
Size of usize and C types can vary, especially on architectures that I don’t test on.
I suppose all of my Rust code would compile on 16-bit platforms, but would fail miserably at runtime. I’d prefer it to fail to compile on some architectures if some uses of as that were lossless on 64-bit became lossy on the smaller architecture, as the compile errors would help reviewing and fixing the code.
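One way to approximate the "fail to compile on the wrong architecture" behaviour in today's Rust is a compile-time size assertion next to the cast. This is only a sketch of a workaround, not the operator proposed here; `index_from_u32` is a hypothetical helper, and `assert!` in a `const` context needs a reasonably recent compiler.

```rust
use std::mem::size_of;

// The build fails on targets where usize is narrower than u32
// (for example a 16-bit target), instead of truncating at runtime.
const _: () = assert!(size_of::<usize>() >= size_of::<u32>());

/// Lossless only because of the assertion above.
fn index_from_u32(i: u32) -> usize {
    i as usize
}

fn main() {
    let data = [10, 20, 30];
    assert_eq!(data[index_from_u32(2)], 30);
}
```

The drawback is that the assertion and the cast are only connected by convention: nothing forces every `as usize` in the code base to have a matching check.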

For integer arithmetic Rust has overflow-checked and wrapping variants. I think of the current as operator as the wrapping variant of type conversion, and I’d like to have a variant that prevents unintentional overflows.
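The analogy can be sketched with the standard `checked_add` and `wrapping_add` methods on `u8`; the helper `cast_wrapping` is just an illustrative name for a plain `as` cast.

```rust
/// A plain `as` cast behaves like the wrapping flavour of arithmetic.
fn cast_wrapping(x: u32) -> u8 {
    x as u8
}

fn main() {
    let a: u8 = 200;

    // Checked arithmetic reports the overflow...
    assert_eq!(a.checked_add(100), None);
    assert_eq!(a.checked_add(50), Some(250));

    // ...wrapping arithmetic silently reduces modulo 256...
    assert_eq!(a.wrapping_add(100), 44);

    // ...and `as` gives the same wrapped result for the same value.
    assert_eq!(cast_wrapping(300), 44);
}
```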

llogiq · July 14, 2015, 1:31pm

Would a lint warning of lossy type conversions be sufficient (this could even have variants to set word size to 16/32 bits)?

By the way, while the case may be clear with u32 as u8, there are a good number of cases where the information loss may not be obvious (or may only happen for certain values).

Examples:

u32 as i32 – what about numbers > 2³¹?
u32 as f32 – what about numbers > 2²⁴?
i16 as u32 – what about the sign?

So there are three kinds of information that may get lost:

overflow
loss of sign
loss of precision
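A small sketch of the three examples above, with one value for each where the cast changes the number. The function names are made up for illustration.

```rust
/// u32 -> i32: values above i32::MAX come out negative.
fn u32_to_i32(x: u32) -> i32 {
    x as i32
}

/// u32 -> f32: an f32 has a 24-bit significand, so large values get rounded.
fn u32_to_f32(x: u32) -> f32 {
    x as f32
}

/// i16 -> u32: a negative value is sign-extended, then reinterpreted.
fn i16_to_u32(x: i16) -> u32 {
    x as u32
}

fn main() {
    assert_eq!(u32_to_i32(3_000_000_000), -1_294_967_296);
    assert_eq!(u32_to_f32(16_777_217), 16_777_216.0); // 2^24 + 1 is not representable
    assert_eq!(i16_to_u32(-1), 4_294_967_295);
}
```

In all three cases small non-negative inputs convert exactly, which is why the loss can go unnoticed in testing.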

kornel · July 14, 2015, 3:29pm

But how do you distinguish between deliberate loss and accidental loss?

There are cases where truncation is desired, e.g. hashing and other bit-twiddling. There are also cases where the loss won’t happen in practice (in the examples below that is obvious, but in real code the check or rounding could happen in another part of the program).

if x < 256 { x as u8 } else { 255u8 };
let y: i64 = float.round() as i64;

petrochenkov · July 14, 2015, 3:42pm

I’m planning to write an RFC based on this experiment, with into() for lossless integer conversions and checked_cast() / wrapping_cast() for potentially lossy integer conversions, but the issue didn’t seem pressing, so I was in no hurry. Now that there’s at least one more interested person, maybe it’s time to start writing.
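For readers on a current toolchain, the split can be sketched with what the standard library offers today. Note the assumptions: `checked_cast()` and `wrapping_cast()` are names from the planned RFC, not std methods, so this sketch uses `TryFrom` and a plain `as` cast as stand-ins. The helper names are hypothetical.

```rust
use std::convert::TryFrom; // already in the prelude from the 2021 edition on

/// Lossless: compiles only because `From<u32>` is implemented for `u64`.
fn widen(x: u32) -> u64 {
    x.into()
}

/// Stand-in for a checked cast: `None` when the value does not fit.
fn to_u8_checked(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

/// Stand-in for a wrapping cast: keeps the low 8 bits, deliberately.
fn to_u8_wrapping(x: u32) -> u8 {
    x as u8
}

fn main() {
    assert_eq!(widen(7), 7u64);
    assert_eq!(to_u8_checked(200), Some(200));
    assert_eq!(to_u8_checked(300), None);
    assert_eq!(to_u8_wrapping(300), 44);
    // let bad: u8 = 300u32.into(); // does not compile: no `From<u32>` for `u8`
}
```

The point of the split is that the lossless form is rejected at compile time when the types stop being compatible, while the two lossy forms state at the call site which behaviour is intended.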

llogiq · July 14, 2015, 7:10pm

But how do you distinguish between deliberate loss and accidental loss?

I don't. My suggestion is to Allow the lint by default, and you can set it to warn wherever you have code that should not have such conversions. Also note that your second example could overflow and doesn't handle NaN.
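To illustrate the remark about the second example: assuming a recent compiler (Rust 1.45 or later, where float-to-integer `as` casts saturate and map NaN to 0), the cast never fails, it just quietly picks a value. The helper `checked_round_to_i64` is a hypothetical alternative, not an existing API.

```rust
/// The cast from the example above, wrapped in a function.
fn round_to_i64(x: f64) -> i64 {
    x.round() as i64
}

/// Hypothetical checked variant: `None` for NaN and out-of-range values.
fn checked_round_to_i64(x: f64) -> Option<i64> {
    let r = x.round();
    let min = i64::MIN as f64; // exactly -2^63
    let limit = -min; // exactly 2^63, the first value that is too large
    if r >= min && r < limit {
        Some(r as i64)
    } else {
        None // NaN fails both comparisons and ends up here
    }
}

fn main() {
    assert_eq!(round_to_i64(2.5), 3);
    assert_eq!(round_to_i64(f64::NAN), 0);
    assert_eq!(round_to_i64(1e30), i64::MAX);

    assert_eq!(checked_round_to_i64(2.5), Some(3));
    assert_eq!(checked_round_to_i64(f64::NAN), None);
    assert_eq!(checked_round_to_i64(1e30), None);
}
```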

kornel · July 14, 2015, 7:51pm

In that case I think the .into() syntax is easier to use and more robust than any sort of lint syntax.

Oops! Another argument for safer casts

llogiq · July 14, 2015, 9:55pm

You still got that backwards – it would be allow by default, and lint only if you #![warn(lossy_cast)]. Look into the clippy issue, it has more detailed information.

That said, I'm pretty happy with the idea of using Into for this. As I said elsewhere, whoever designed this is a genius.

kstep · July 14, 2015, 10:35pm

The problem with “allow by default” lints is that you may not even be aware of their existence. IMHO, lints such as “lint_as” should be warn by default: they don’t actually stop you from compiling, but they let you know there’s something to check.

llogiq · July 15, 2015, 10:24am

@kstep: Agreed. But in this case, the lint would either be ridiculously complex (it would have to check value ranges for all bindings throughout the program, using symbolic execution to check all code paths, etc. – and would still get false positives) or catch a lot of false positives (i.e. anywhere we do a narrowing cast).

If we choose the former, the lint will probably never get done at all.

If however we choose the latter, we’ll likely split it into three (as in the clippy issue):

possible overflow (e.g. u32 as u8) – This will probably come up so often that people will just #![allow] it should we decide to Warn by default.
loss of sign (e.g. i32 as u64) – This is probably uncommon enough that Warn would be OK.
loss of precision (e.g. i32 as f32) – I believe this should be Allow by default, for it isn’t too unusual, and the loss of precision is acceptable for a large set of programs.
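As a sketch of how such a three-way split looks in use, the example below opts in to one lint per function. The lint names are the ones in present-day Clippy (`cast_possible_truncation`, `cast_sign_loss`, `cast_precision_loss`) and are an assumption relative to this thread, which had not settled on names. Plain rustc ignores `clippy::` lints, so the warnings only appear under `cargo clippy`.

```rust
/// Possible overflow: warns under Clippy once the lint is enabled.
#[warn(clippy::cast_possible_truncation)]
fn truncate(x: u32) -> u8 {
    x as u8
}

/// Loss of sign: negative inputs become large positive values.
#[warn(clippy::cast_sign_loss)]
fn lose_sign(x: i32) -> u64 {
    x as u64
}

/// Loss of precision: an f32 cannot hold every i32 exactly.
#[warn(clippy::cast_precision_loss)]
fn lose_precision(x: i32) -> f32 {
    x as f32
}

fn main() {
    assert_eq!(truncate(300), 44);
    assert_eq!(lose_sign(-1), u64::MAX);
    assert_eq!(lose_precision(16_777_217), 16_777_216.0);
}
```

The lints flag every such cast regardless of whether the loss is intended, which is the false-positive trade-off described above.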

kstep · July 16, 2015, 9:01pm

As an idea, maybe the lint should be split up into several ones? E.g. #[allow(narrowing_cast)], #[allow(sign_losing_cast)], #[allow(precision_losing_cast)] (not sure about naming, though).

llogiq · July 16, 2015, 9:30pm

This is exactly what I proposed in the clippy issue, just with different names.

kornel · July 16, 2015, 9:55pm

Without syntax in the language or traits in core there isn’t even a good way for a linter to distinguish between cases of lossy-cast-intended-to-be-lossy and lossy-cast-by-accident, so I don’t think a linter can be a solution to this problem.

Optional lints, especially from 3rd party packages that are not part of the language, end up being a “nice to have” thing, and such warnings are generally treated as a matter of opinion, but IMHO casts causing data loss where the programmer required a lossless conversion should always be hard compilation errors.

petrochenkov · July 19, 2015, 7:26pm

Here it is: https://github.com/petrochenkov/rfcs/blob/intconv/text/0000-integer-conversions.md
(Please report any grammar mistakes you see, preferably on GitHub. I’ll fix them and submit the RFC PR.)

petrochenkov · July 20, 2015, 9:32pm

Submitted:

system · March 25, 2019, 8:10am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
