`lossless_as` operator

(Name/syntax to be bikeshedded; in this post I use T as U as an abbreviation of let x: T; x as U.)

Currently the as operator used for casting does one of two things:

  1. Lossless, reliable conversion to a compatible type of the same or larger size (e.g. u32 as u64).
  2. Potentially lossy and dangerous truncation to a smaller type (e.g. usize as u8).
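
A minimal illustration of the two behaviours with concrete values. The helper names `widen` and `truncate` are made up for this example:

```rust
/// Widening: every u32 value fits in a u64, so nothing is lost.
fn widen(x: u32) -> u64 {
    x as u64
}

/// Narrowing: only the low 8 bits survive, without any warning.
fn truncate(x: usize) -> u8 {
    x as u8
}

fn main() {
    assert_eq!(widen(u32::MAX), 4_294_967_295u64);
    // 300 = 0x12C; the low byte is 0x2C = 44.
    assert_eq!(truncate(300), 44);
}
```

Both functions use the same `as` syntax, so nothing at the call site tells the two cases apart.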

The problem is that I often only want the first one, but I have no way of preventing the second case from unintentionally happening.

Although unexpected integer overflow is “safe” per Rust’s strict definition, it’s still a problem that can cause bugs, data loss and panics in Rust code, and cause memory safety issues when wrong values are passed through FFI.

For example usize as u32 is correct on 32-bit architectures, but on 64-bit architectures it will compile without warnings to code that may be subtly broken (and vice-versa with u64 as usize).

It’s easy to say “well, just be careful with types you cast”, but that’s not so easy in practice:

  • Sizes of aliased type names are non-obvious. How do you know whether somelibrary::Distance as usize is correct (lossless), on all platforms?
    And you can’t know whether it’ll remain correct in v2.0 of the library (but the cast will, unfortunately, compile without warnings even when it becomes incorrect).

  • Size of usize and C types can vary, especially on architectures that I don’t test on :)
    I suppose all of my Rust code would compile on 16-bit platforms, but would fail miserably at runtime. I’d prefer it to fail to compile on some architectures if some uses of as that were lossless on 64-bit became lossy on the smaller architecture, as the compile errors would help reviewing and fixing the code.
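
In current Rust, one way to approximate that "fail to compile" behaviour by hand is a static assertion on the pointer width. This is only a sketch: `index_from_u32` is a hypothetical helper, and the assertion guards a single assumption rather than checking every cast the way the proposed operator would.

```rust
// Refuse to build on targets where usize is narrower than 32 bits,
// instead of truncating at runtime.
const _: () = assert!(
    usize::BITS >= 32,
    "this code assumes usize is at least 32 bits"
);

/// Hypothetical helper: the cast is lossless only because of the
/// compile-time assertion above.
fn index_from_u32(i: u32) -> usize {
    i as usize
}

fn main() {
    assert_eq!(index_from_u32(70_000), 70_000usize);
}
```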

For integers Rust has overflow-checked and wrapping variants. I think of the current as operator as the wrapping variant of type conversion, and I’d like to have a variant that prevents unintentional overflows.
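
To make the analogy concrete, here is a small sketch comparing the explicit arithmetic variants with what `as` does on a narrowing cast. The helper `narrow` is made up for this example:

```rust
/// Narrowing cast with `as`: behaves like wrapping arithmetic (modulo 256).
fn narrow(x: usize) -> u8 {
    x as u8
}

fn main() {
    // Arithmetic lets the caller pick the overflow behaviour explicitly.
    assert_eq!(250u8.checked_add(10), None);
    assert_eq!(250u8.wrapping_add(10), 4);

    // The cast gives the wrapping result and offers no checked spelling.
    assert_eq!(narrow(260), 4);
}
```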


Would a lint warning of lossy type conversions be sufficient (this could even have variants to set word size to 16/32 bits)?

By the way, while the case may be clear with u32 as u8, there are a good number of cases where the information loss may not be obvious (or may only happen for certain values).

Examples:

  • u32 as i32 – what about numbers ≥ 2³¹?
  • u32 as f32 – what about numbers > 2²⁴?
  • i16 as u32 – what about the sign?

So there are three kinds of information that may get lost:

  • overflow
  • loss of sign
  • loss of precision
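
One concrete value for each of the three kinds of loss, as a sketch (the function names are invented for illustration):

```rust
/// Overflow: values at or above 2^31 do not fit in an i32.
fn u32_to_i32(x: u32) -> i32 {
    x as i32
}

/// Loss of sign: a negative value turns into a large positive one.
fn i16_to_u32(x: i16) -> u32 {
    x as u32
}

/// Loss of precision: f32 has a 24-bit significand.
fn u32_to_f32(x: u32) -> f32 {
    x as f32
}

fn main() {
    assert_eq!(u32_to_i32(3_000_000_000), -1_294_967_296);
    assert_eq!(i16_to_u32(-1), 4_294_967_295);
    // 2^24 + 1 is not representable and rounds to 2^24.
    assert_eq!(u32_to_f32(16_777_217), 16_777_216.0);
}
```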

But how do you distinguish between deliberate loss and accidental loss?

There are cases where truncation is desired, e.g. hashing and other bit-twiddling. There are cases where the loss won’t happen in practice (in examples below it’s obvious, but in practice the check/rounding could happen in another part of the program).

if x < 256 { x as u8 } else { 255u8 };
let y: i64 = float.round() as i64;

I’m planning to write an RFC based on this experiment with into() for lossless integer conversions and checked_cast() / wrapping_cast() for potentially lossy integer conversions, but the issue didn’t seem pressing, so I was in no hurry. Now that there’s at least one more interested person, maybe it’s time to start writing.
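
A rough sketch of how such an API might feel in use. The trait `CastU8` below is hypothetical and defined only for this example; the method names follow the post, but the signatures are assumptions and not the RFC's actual design. Only `into()` (via the standard `From`/`Into` traits) and `u8::try_from` are real standard-library calls here.

```rust
use std::convert::TryFrom;

/// Hypothetical trait for illustration only; signatures are assumptions.
trait CastU8 {
    fn checked_cast(self) -> Option<u8>;
    fn wrapping_cast(self) -> u8;
}

impl CastU8 for u32 {
    fn checked_cast(self) -> Option<u8> {
        // None when the value does not fit in a u8.
        u8::try_from(self).ok()
    }

    fn wrapping_cast(self) -> u8 {
        // Explicitly requested truncation to the low 8 bits.
        self as u8
    }
}

fn main() {
    // Lossless: `into()` only compiles when the target holds every source value.
    let wide: u64 = 7u32.into();
    assert_eq!(wide, 7);

    assert_eq!(200u32.checked_cast(), Some(200));
    assert_eq!(300u32.checked_cast(), None);
    assert_eq!(300u32.wrapping_cast(), 44);
}
```

With this split, the caller states at each conversion whether loss is impossible, checked, or intended.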


But how do you distinguish between deliberate loss and accidental loss?

I don't. My suggestion is to Allow the lint by default, and you can set it to warn wherever you have code that should not have such conversions. Also note that your second example could overflow and doesn't handle NaN.
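
To spell out the overflow and NaN point: in current Rust (1.45 and later) a float-to-integer `as` cast saturates and maps NaN to 0, so the result is defined but can still be wrong without any warning. A minimal sketch of an explicit check follows; `round_to_i64` is a made-up helper, not a standard function.

```rust
/// Rounds and converts with an explicit range check.
/// Returns None for NaN, infinities and out-of-range values.
fn round_to_i64(x: f64) -> Option<i64> {
    let r = x.round();
    // i64::MIN is exactly representable as f64; i64::MAX is not (it rounds
    // up to 2^63), hence the strict upper bound. NaN fails both comparisons.
    if r >= i64::MIN as f64 && r < i64::MAX as f64 {
        Some(r as i64)
    } else {
        None
    }
}

fn main() {
    assert_eq!(round_to_i64(2.6), Some(3));
    assert_eq!(round_to_i64(f64::NAN), None);
    assert_eq!(round_to_i64(1e300), None);

    // What the bare cast does instead:
    assert_eq!(f64::NAN as i64, 0);
    assert_eq!(1e300_f64 as i64, i64::MAX);
}
```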

In that case I think syntax of .into() is easier to use and more robust than (edit: any sort of lint syntax)

Oops! Another argument for safer casts :)

You still got that backwards – it would be allow by default, and lint only if you #![warn(lossy_cast)]. Look into the clippy issue, it has more detailed information.

That said, I'm pretty happy with the idea of using Into for this. As I said elsewhere, whoever designed this is a genius.

The problem with “allow by default” lints is that you may not even be aware of their existence. IMHO, such lints as “lint_as” should be warn by default: not actually stop you from compiling, but letting you know there’s something to check.

@kstep: Agreed. But in this case, the lint would either be ridiculously complex (it would have to check value ranges for all bindings during the program, using symbolic execution to check all code paths, etc. – and would still get false positives) or catch a lot of false positives (i.e. anywhere we do a narrowing cast).

If we choose the former, the lint will probably never get done at all.

If however we choose the latter, we’ll likely split it into three (as in the clippy issue):

  • possible overflow (e.g. u32 as u8) – This will probably come up so often that people will just #![allow] it should we decide to Warn by default.
  • loss of sign (e.g. i32 as u64) – This is probably uncommon enough that Warn would be OK.
  • loss of precision (e.g. i32 as f32) – I believe this should be Allow by default, for it isn’t too unusual, and the loss of precision is acceptable for a large set of programs.

As an idea, maybe the lint should be split up into several ones? E.g. #[allow(narrowing_cast)], #[allow(sign_losing_cast)], #[allow(precision_losing_cast)] (not sure about naming, though).

This is exactly what I proposed in the clippy issue, just with different names.
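
For reference, a sketch of what such a three-way split looks like with the cast lints clippy ships today. The lint names below are clippy's current ones, not the names floated in this thread; they sit in clippy's `pedantic` group, so they are allowed unless opted into. The module and function names are invented for the example.

```rust
// Lint levels can be set per item. Under `cargo clippy` the casts inside
// this module are checked; plain `rustc` accepts but ignores clippy lints.
#[warn(
    clippy::cast_possible_truncation,
    clippy::cast_sign_loss,
    clippy::cast_precision_loss
)]
mod strict {
    /// Deliberate truncation, silenced locally.
    #[allow(clippy::cast_possible_truncation)]
    pub fn low_byte(hash: u64) -> u8 {
        hash as u8
    }

    /// Expected to be reported by `cast_sign_loss` when run under clippy.
    pub fn to_unsigned(x: i32) -> u64 {
        x as u64
    }
}

fn main() {
    assert_eq!(strict::low_byte(0x1234), 0x34);
    // -1 is sign-extended and reinterpreted as the largest u64.
    assert_eq!(strict::to_unsigned(-1), u64::MAX);
}
```

The local `#[allow]` is how an intended lossy cast gets marked, which is the distinction the lint cannot infer on its own.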

Without syntax in the language or traits in core there isn’t even a good way for a linter to distinguish between cases of lossy-cast-intended-to-be-lossy and lossy-cast-by-accident, so I don’t think a linter can be a solution to this problem.

Optional lints, especially from 3rd party packages that are not part of the language, end up being a “nice to have” thing, and such warnings are generally treated as a matter of opinion, but IMHO casts causing data loss where the programmer required lossless conversion should always be hard compilation errors.


Here it is: https://github.com/petrochenkov/rfcs/blob/intconv/text/0000-integer-conversions.md
(Please report any grammar mistakes you see, preferably on GitHub. I’ll fix them and submit the RFC PR.)


Submitted:
