Proposal: integer conversion methods

I would’ve expected this to be covered by From/Into impls between Wrapping<u8> and u16 and all the other combinations, but apparently there are no such impls.

Is there any reason we couldn’t add those impls, as an alternative to adding all of OP’s inherent methods?

Just a note on potentially relevant prior art: the conv crate (disclaimer: I wrote it) tried to address this by defining two extra conversion trait pairs: Value{From,Into} (for value-preserving conversions, which is mostly subsumed by the current Try{From,Into}), and Approx{From,Into}. The Approx* traits also required the user to specify an approximation scheme as a generic parameter (DefaultApprox, Wrapping, plus RoundTo* for float → integer).

Altogether, that gave you lossy casts (with as), checked casts (with Value*), wrapping casts (with Approx* + Wrapping scheme), and saturating casts (with Value* + saturate/unwrap_or_saturate extension methods on the result).
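For comparison, here is a rough sketch of how those behaviours can be approximated with only the standard library. This is not the conv API; the function names are made up for illustration. For integer-to-integer conversions the wrapping cast and the plain as cast behave the same, so only three functions are shown.

```rust
use std::convert::TryFrom;

/// Lossy (and, for integers, wrapping) cast: `as` keeps only the low 8 bits.
fn lossy(x: u32) -> u8 {
    x as u8
}

/// Checked cast: `None` when the value does not fit in the target type.
fn checked(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

/// Saturating cast: an unsigned source can only fail by being too large,
/// so any failure clamps to the target's maximum.
fn saturating(x: u32) -> u8 {
    u8::try_from(x).unwrap_or(u8::MAX)
}
```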

Also as a note, the OP’s impls for usize and isize are not forward-looking, and thus IMO are incomplete, because they do not account for usize == u128, which the RISC-V architecture supports as the logical endpoint of continuing growth in storage size. The missing impls are:

impl usize {
    fn checked_to_u64(self) -> Option<u64>;
    fn saturating_to_u64(self) -> u64;
    fn wrapping_to_u64(self) -> u64;
}

impl isize {
    fn checked_to_i64(self) -> Option<i64>;
    fn saturating_to_i64(self) -> i64;
    fn wrapping_to_i64(self) -> i64;
}

and the following potentially-silently-truncating impls, which presume that Rust will never support usize larger than u64, should be removed:

impl usize {
    fn to_u64(self) -> u64;
}

impl isize {
    fn to_i64(self) -> i64;
}
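As far as I know, std already takes the cautious route here: there is a TryFrom<usize> impl for u64 but no From<usize> impl. A minimal sketch of the three proposed methods as free functions on top of that (the bodies are my assumption of the intended semantics, not part of the OP's proposal):

```rust
use std::convert::TryFrom;

/// Checked: can only fail on a target where `usize` is wider than 64 bits.
fn checked_to_u64(x: usize) -> Option<u64> {
    u64::try_from(x).ok()
}

/// Saturating: clamps to `u64::MAX` if the value does not fit.
fn saturating_to_u64(x: usize) -> u64 {
    u64::try_from(x).unwrap_or(u64::MAX)
}

/// Wrapping: keeps the low 64 bits, which is what `as` does.
fn wrapping_to_u64(x: usize) -> u64 {
    x as u64
}
```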

For those who think that this is too many functions, they could all be replaced by two additional trait pairs, alongside the two that already exist:

  1. From/Into (existing, lossless conversion)
  2. TryFrom/TryInto (existing, checking conversion)
  3. TruncateFrom/TruncateInto (new, truncating/wrapping conversion)
  4. SaturateFrom/SaturateInto (new, saturating conversion)

Bikeshedding aside, this separates the debate about adding lots of inherent methods from that about having a way to do truncating/saturating conversions without as being a foot gun.
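A minimal sketch of what the two new traits might look like, with impls for a couple of pairs. The method names truncate_from and saturate_from are placeholders I picked, and only the From half of each pair is shown.

```rust
/// Hypothetical trait: keep only the low-order bits that fit in `Self`.
trait TruncateFrom<T> {
    fn truncate_from(value: T) -> Self;
}

/// Hypothetical trait: clamp to the nearest value representable in `Self`.
trait SaturateFrom<T> {
    fn saturate_from(value: T) -> Self;
}

impl TruncateFrom<u32> for u8 {
    fn truncate_from(value: u32) -> Self {
        value as u8
    }
}

impl SaturateFrom<u32> for u8 {
    fn saturate_from(value: u32) -> Self {
        if value > u8::MAX as u32 { u8::MAX } else { value as u8 }
    }
}

impl SaturateFrom<i32> for u8 {
    fn saturate_from(value: i32) -> Self {
        // A signed source can fall outside the range on either side.
        value.clamp(0, u8::MAX as i32) as u8
    }
}
```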

Note: An amusing application for TruncateFrom would be converting from a [u8] into a [u8; N].


It could also be done with three new types, (bikeshed names)

struct Saturating<T>(pub T);
struct Wrapping<T>(pub T);
struct Truncating<T>(pub T);

Then have impls

impl From<Saturating<{integer}>> for {integer} { ... }
impl From<Wrapping<{integer}>> for {integer} { ... }
impl From<Truncating<{integer}>> for {integer} { ... }

for all pairs of integer types. This way we can use the existing From machinery, and each conversion mode gets documented separately.


With this we could also have the normal arithmetic operators set up to safely do the operations with the given protocol.
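A rough sketch of the wrapper idea for a single pair, u32 to u8. The wrapper types are defined locally here (they are not the std::num types of the same name), and the arithmetic-operator part is left out.

```rust
/// Hypothetical marker wrappers, defined locally for this sketch.
struct Saturating<T>(pub T);
struct Truncating<T>(pub T);

impl From<Saturating<u32>> for u8 {
    fn from(src: Saturating<u32>) -> u8 {
        // Clamp to the largest u8 before narrowing.
        src.0.min(u8::MAX as u32) as u8
    }
}

impl From<Truncating<u32>> for u8 {
    fn from(src: Truncating<u32>) -> u8 {
        // Keep only the low 8 bits.
        src.0 as u8
    }
}
```

The call site then names the conversion mode through the wrapper, for example u8::from(Saturating(x)) or Truncating(x).into().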


Open RFC in this area:

I’ve been meaning to try and propose a subset of that to avoid the floating point questions, so just

trait IntegerFrom<T> {
    fn wrapping_from(_: T) -> Self;
    fn saturating_from(_: T) -> Self;
    fn checked_from(_: T) -> Option<Self>;
}
trait IntegerInto<T> {
    ... I bet you know ...
}

Aping the naming of all the inherent methods. (And yes, checked_from is just try_from().ok(), but I’d still have it there to follow the pattern.)
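To make that concrete, here is a sketch of the trait with one impl (i64 to u8). The Sized bound is something I added because I believe Option<Self> needs it, and the IntegerInto half with its blanket impl is my guess at the elided trait, mirroring how std relates From and Into.

```rust
use std::convert::TryFrom;

trait IntegerFrom<T>: Sized {
    fn wrapping_from(value: T) -> Self;
    fn saturating_from(value: T) -> Self;
    fn checked_from(value: T) -> Option<Self>;
}

/// Assumed counterpart, by analogy with `Into`.
trait IntegerInto<T> {
    fn wrapping_into(self) -> T;
    fn saturating_into(self) -> T;
    fn checked_into(self) -> Option<T>;
}

// Every `IntegerFrom` impl provides the matching `IntegerInto` for free.
impl<T, U: IntegerFrom<T>> IntegerInto<U> for T {
    fn wrapping_into(self) -> U { U::wrapping_from(self) }
    fn saturating_into(self) -> U { U::saturating_from(self) }
    fn checked_into(self) -> Option<U> { U::checked_from(self) }
}

impl IntegerFrom<i64> for u8 {
    fn wrapping_from(value: i64) -> Self { value as u8 }
    fn saturating_from(value: i64) -> Self { value.clamp(0, u8::MAX as i64) as u8 }
    fn checked_from(value: i64) -> Option<Self> { u8::try_from(value).ok() }
}
```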

I really want this so that, like we have with from, we can propose clearer-intent method calls instead of as – especially as _.


I don’t mind using my own traits until consensus stabilizes, but I really don’t like using fn names that will differ from those of the eventual standardized trait. Can we get past the bikeshed naming phase and decide on the various fn names now, particularly for those fns usable with method syntax?

I’d really like to see more focus on reducing the number of conversions necessary at all. Most of the time I know that the conversion I’m doing should never fail, but I still have to pick from among a whole bunch of different options. Typically I’m just dealing with small positive integers but still have to go through the motions of converting them between u32, u64, usize and sometimes even i64, all because different functions/libraries have different requirements.

One example where this is really all pointless is when indexing a vector. There’s no reason why I shouldn’t be able to use any unsigned integer type: there already has to be a bounds check, why should I need to convert to usize first?
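Until something like that exists in the language, a small helper gets part of the way there. This is only a sketch; get_at is a made-up name, and it folds "does not fit in usize" and "out of bounds" into the same None.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: index a slice with any integer type that can be
/// converted to `usize`, checked.
fn get_at<T, I>(items: &[T], index: I) -> Option<&T>
where
    usize: TryFrom<I>,
{
    // A value that does not fit in `usize` cannot be a valid index anyway,
    // so both failure modes end up as `None`.
    usize::try_from(index).ok().and_then(|i| items.get(i))
}
```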


I don’t think we need even more traits. I almost always know the exact integer types I’m dealing with and don’t need the genericness of a trait, I just need a way to convert from one integer type to the other concisely and descriptively.

TryFromLossy in particular just looks waay too vague: “convert something into something else in a way that might lose information and could also fail”. The entire point of traits is to act as bounds on generic types. Why would I ever write <T: TryFromLossy<Foo>> without knowing anything about the nature of the conversion, why it could fail, or the implications of the lossiness? If you’re only ever calling your trait methods with concrete, non-generic types, then there’s no reason whatsoever to be using a trait at all. Your code would be clearer and more flexible if you turned all the trait impls into inherent impls. In the case of TryFromLossy that clarity and flexibility means having method names that describe the kind of conversion being performed, and being able to have multiple different methods for different kinds of lossy-and-possibly-failing conversions.

Why are people averse to adding more inherent methods?

@bascule

I’m aware I can already perform most of these conversions with from and friends. My gripe is that I’m often writing code with lots of fiddly little integer conversions, and compared to .to_u16():

  • u16::from(x) can’t be written as a method call
  • .into() doesn’t say what it’s converting into
  • .into::<u16>() is longer and uglier

I’m aware that most of these inherent methods would be technically redundant, but I don’t think that matters and I certainly don’t think it’s more idiomatic to use a trait method where there are no generic types involved.

Me too. Specifically I do cryptography, where I often want different semantics (truncation/overflow/wrapping/checked) for different circumstances.

…but into() can…

…so add an explicit type annotation if that makes you feel better?

…so use a type annotation instead of the turbofish?

As someone who cares deeply about language aesthetics, these seem like extremely minor syntactic nits which do not justify adding 310(!!!) new methods to the language which both duplicate and are incompatible with existing functionality, and invent a separate, parallel, bespoke language of method names instead of modeling the same concepts with types and traits.

Adding a small number of traits and/or trait impls which can interoperate with the existing types like Wrapping would make an awful lot more sense to me.


Note that x.into::<u16>() is not a valid spelling (fn into() itself doesn’t have generic parameters so you can’t turbofish it). You need to write one of

let y: u16 = x.into();
// or
let y = Into::<u16>::into(x);

If RFC 2522 (type ascription) is accepted this would be written as

let y = x.into(): u16;
// vs.
let y = x.to_u16();
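For what it's worth, the method-call spelling can be had on stable today with a user-defined extension trait, at least for the lossless cases. A sketch (ToU16 and spellings are names I made up; only types with an Into<u16> impl are covered):

```rust
/// Hypothetical extension trait; `ToU16` and `to_u16` are illustrative names.
trait ToU16 {
    fn to_u16(self) -> u16;
}

// Covers every type with a lossless `Into<u16>` impl, such as u8.
impl<T: Into<u16>> ToU16 for T {
    fn to_u16(self) -> u16 {
        self.into()
    }
}

/// The spellings side by side; all four produce the same value.
fn spellings(x: u8) -> [u16; 4] {
    let a: u16 = x.into();        // annotated binding
    let b = Into::<u16>::into(x); // fully qualified trait call
    let c = u16::from(x);         // associated function
    let d = x.to_u16();           // extension method
    [a, b, c, d]
}
```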

It would be helpful to have a couple of code samples with a lot of casts in them. It is really hard to judge the ergonomics of any of these options looking at only a single conversion…

Edit: One possible example from my own code.

Here’s a simple example that shows the reduced readability (vs as):

use core::num::Wrapping;
impl crypto_support for u32 {
  // Define arithmetic operator on u32s for 32b x 32b -> 32b pseudo-modulo
  // multiply that approximates multiplication modulo G(x) ≈ x.pow(32) + 1
  #[inline(always)]
  fn mul_mod_gx(self, rhs: &Self) -> Self {
    let t: u64 = self as u64 * rhs as u64;     // 32b x 32b -> 64b multiply
    (t as u32).wrapping_sub((t >> 32) as u32)  // subtract MS half from LS half
  }
}

versus

    (t.to_wrapping_u32()).wrapping_sub((t >> 32).to_wrapping_u32())

That situation doesn’t look like a cast problem to me, but instead like

where it would be (using my current favourite proposal)

let (low, high) = self.mul_with_carry(*rhs, 0);
low.wrapping_sub(high)
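For anyone who wants to try this out, here is a stand-in written as free functions. mul_with_carry is not a std method as far as I know; this version is my assumption of the intended semantics (a * b + carry computed in 64 bits, returned as low and high halves).

```rust
/// Hypothetical stand-in for the proposed method: computes `a * b + carry`
/// without overflow and returns the `(low, high)` 32-bit halves.
fn mul_with_carry(a: u32, b: u32, carry: u32) -> (u32, u32) {
    // The largest possible result is 2^64 - 2^32, so it always fits in a u64.
    let wide = u64::from(a) * u64::from(b) + u64::from(carry);
    (wide as u32, (wide >> 32) as u32)
}

/// The example from above, with the casts confined to `mul_with_carry`.
fn mul_mod_gx(a: u32, b: u32) -> u32 {
    let (low, high) = mul_with_carry(a, b, 0);
    low.wrapping_sub(high)
}
```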

And that snippet is a great example of why as scares me:

That’s casting &_ to u64, which feels to me like it could be a ptr2int cast. I assume it’s not, and it’s doing what you wanted, but it’s really quite hard to be sure.
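As far as I can tell, rustc rejects a direct cast from &u32 to u64, so the line would need an explicit dereference to compile at all. A sketch that keeps the by-reference parameter but makes the dereference and the lossless widening explicit (the function name is mine):

```rust
/// Same computation, taking `rhs` by reference as in the quoted example.
fn mul_mod_gx_by_ref(lhs: u32, rhs: &u32) -> u32 {
    // Dereference first, then widen with `From`; nothing here can be
    // mistaken for a pointer-to-integer cast.
    let t: u64 = u64::from(lhs) * u64::from(*rhs);
    // The two narrowing casts are the intentional truncations.
    (t as u32).wrapping_sub((t >> 32) as u32)
}
```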


This was offered only as an example. In actual use the inputs are wrapped u32s (w32s). I’m fully aware that Rust’s references and unsafe raw pointers are not simply integers (unlike in many other languages).

Some architectures implement 32b x 32b -> 64b multiply in one instruction, others implement it in two related instructions (one generating the low-order 32b and the other the high-order 32b of the 64b result), while still others can’t generate the high-order 32b in hardware at all. It’s low-level differences such as that which drive much crypto to be written in assembly.

Addendum: I should have pointed out that LLVM does not provide the needed n x n -> 2n multiply, even though most architectures support it.

(I still think we should allow implicit widening, that is, coerce for example u8 to u16.)


… or anything not larger than usize, for indexing.

Am I off base in thinking all of these conversion functions can be accomplished with a single macro?
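A macro can certainly stamp out the impls. Here is a sketch for the saturating case over a list of unsigned pairs; SaturatingFrom and impl_saturating_from are made-up names. My understanding is that macro_rules on its own cannot build method names like to_u16 from a type, so inherent-style names would have to be passed in explicitly or generated with a helper.

```rust
use std::convert::TryFrom;

/// Hypothetical trait used as the target of the macro.
trait SaturatingFrom<T> {
    fn saturating_from(value: T) -> Self;
}

/// Generates one impl per `source => target` pair.
macro_rules! impl_saturating_from {
    ($($src:ty => $dst:ty),* $(,)?) => {
        $(
            impl SaturatingFrom<$src> for $dst {
                fn saturating_from(value: $src) -> Self {
                    // Valid for unsigned sources only: any failure means
                    // the value was too large for the target.
                    <$dst>::try_from(value).unwrap_or(<$dst>::MAX)
                }
            }
        )*
    };
}

impl_saturating_from!(u16 => u8, u32 => u8, u32 => u16, u64 => u32);
```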
