`TryFrom` for `f64`

Now that Try{From|Into} have stabilized, have there been any discussions about how these might be implemented for f{32|64}? It’s not obvious how (or if) you’d do it:

Choices (e.g. u64 to f64)

  1. Nearest match (no failure) - the conversion would pick the nearest f64 to the u64. This is undesirable IMO because the semantics don’t match expectations, and round-tripping would not be the identity ((big_num as f64) as u64 == big_num does not hold for every big_num).
  2. Fail on numbers bigger than (1<<53)-1 - this is a conservative choice - there are some numbers bigger than (1<<53)-1 that can be represented as f64, but they are not dense (i.e. there are gaps).
  3. (My preference I think) succeed if there is an f64 that is exactly equal to your u64, fail otherwise. Doing sums with your f64 will lead to errors, but that’s expected with f64 anyway.
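
To make the three choices concrete, here is a rough sketch for u64 to f64. The function names are made up for illustration and none of this is a std API. For option 3 the comparison is done in u128, on the assumption that we want to avoid any interference from the saturating float-to-integer cast.

```rust
// Hypothetical helper names; none of these exist in std.

/// Option 1: nearest match, never fails (this is what `as` does).
fn nearest(x: u64) -> f64 {
    x as f64
}

/// Option 2: fail for anything above (1 << 53) - 1.
fn conservative(x: u64) -> Option<f64> {
    const MAX_SAFE: u64 = (1 << 53) - 1;
    if x <= MAX_SAFE {
        Some(x as f64)
    } else {
        None
    }
}

/// Option 3: succeed only if some f64 is exactly equal to `x`.
/// The comparison happens in u128, which can hold every value a u64
/// can round to (including 2^64).
fn exact(x: u64) -> Option<f64> {
    let f = x as f64;
    if f as u128 == x as u128 {
        Some(f)
    } else {
        None
    }
}
```

With these definitions, (1 << 53) + 1 is rounded by option 1, rejected by options 2 and 3, while (1 << 53) + 2 is rejected only by option 2.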

I think that this should be implemented in some way (either TryFrom or special methods), because as well as providing functionality, it gives us a place to raise this issue with the programmer. The programmer will search for the conversion, and then read the docs and learn about the intricacies of working with f64. In some ways I think this teaching opportunity is more important than actually providing the functionality.

5 Likes
2 Likes

This expectation makes me think #1 (nearest match) is fine.

1 Like

Just be careful in the definition of "nearest". As with almost everything in IEEE floating point, such mathematical concepts are somewhat imprecise when realized within the constraints of the encoding. Is +0.0 "nearer" to 0u64 than -0.0? Is rounding "down" toward zero nearer than rounding "up"?

3 Likes

Given that there are multiple reasonable interpretations of these conversions, I think separate methods would probably make more sense than a TryFrom implementation.

6 Likes

That RFC covers the space nicely - the aim is to have `as` used as little as possible and to make the different conversion traits (like From, FromLossy etc.) rigorous.

For my use case (numbers in javascript are all f64), I really need TryFrom rather than FromLossy for f64->i64, for example. I want to error in the case that the number isn’t an exact integer.
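
As a sketch of what I mean for f64->i64 (a hypothetical free function, not a std API, and only one possible way to define "exact integer"):

```rust
/// Hypothetical helper: convert an f64 to i64 only when the float
/// holds an exact integer that fits in i64.
fn f64_to_i64_exact(x: f64) -> Option<i64> {
    // 2^63 is exactly representable as f64; i64 covers [-2^63, 2^63).
    const TWO_63: f64 = 9223372036854775808.0;
    // NaN and the infinities fail the `fract() == 0.0` test, because
    // their fractional part is NaN.
    if x.fract() == 0.0 && x >= -TWO_63 && x < TWO_63 {
        Some(x as i64)
    } else {
        None
    }
}
```

Here 42.0 converts, while 1.5, NaN, infinity and 2^63 all produce None.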

The RFC is very ambitious and while I’m totally in favor of exploring the design space it is a big job and would require sustained effort from a team/working group to resolve everything I think. So we’ll have to wait until it becomes a priority for enough people.

1 Like

I understand the reasoning behind this, but I strongly disagree with it. It is too easy for someone to write a function similar to the following:

pub fn trivial(incoming: u64) -> u64 {
    (incoming as f64) as u64
}

which completely hides what is happening on the inside, making it impossible for consumers of the API to understand why in some cases they get the correct results, while in other cases they don't. My personal preference is for option #3. This also makes it easier for doing formal analysis of programs in the future.
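
To illustrate, here is the same function again (without `pub`, so the snippet stands alone) with a few inputs worth trying. The specific inputs are my own picks, chosen around 2^53 and u64::MAX.

```rust
// Same body as the function above; repeated so this snippet compiles
// on its own.
fn trivial(incoming: u64) -> u64 {
    (incoming as f64) as u64
}

/// Hypothetical helper: true when `trivial` hands back its input.
fn is_unchanged(x: u64) -> bool {
    trivial(x) == x
}
```

With this, is_unchanged(1 << 53) is true but is_unchanged((1 << 53) + 1) is false, and trivial(u64::MAX - 1) comes back as u64::MAX, with nothing in the signature hinting at any of it.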

7 Likes

f64’s are fuzzy. I think converting a ~20 digit precision u64 to a ~15+ digit f64 should explicitly be an exercise in rounding. Heck, I’d go so far as to say that exposing it as a rounding function might be best.

I find the notion of #3 and #2 - sometimes fail - problematic. First, because the computational cost of the extra equality branches and double-conversion value checks adds up quickly, and most of the time it isn’t needed. And secondly, ‘sometimes fail’ introduces error behavior that both the user and programmer may have to deal with.

In my ideal and perfect world rust would have f128 primitives - not just from cargo crate ‘f128’ - but given a mere 812 downloads I think my own “why wouldn’t everyone want to do scientific computing with rust at better than part per quadrillion accuracy?” notions are minority.

7 Likes

I'd never thought of it that way, but that's a great way to characterize the issue. Every float value approximates a small interval of arithmetically-nearby real values, with an infinite number of real values in that interval. Floating-point "precision" determines the size of that small interval, relative to the value's magnitude (modified somewhat when the value is denormalized), and thus the "fuzziness" of the specified value.

6 Likes

I think it depends on use-case. For someone writing greenfield applications, you would just never use floats when you want integer accuracy, so rounding is the correct behavior. However, if you're working with a technology that uses f64 for integers (e.g. ecmascript), you might prefer the checking behavior. I think @sfackler is probably correct when he says there should be different methods for different use-cases, although I think the semantic meaning of TryFrom is that it fails when the conversion is not possible exactly, so if TryFrom were to be implemented for f64 -> i64 say, it should use #3.

I think a good idea here would be to use “safe integer” bounds as JavaScript does - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/isSafeInteger - and allow TryFrom conversion to u64/i64 to succeed for such numbers.

Within these bounds, such conversion is guaranteed to be precise and lossless.
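
Roughly, something like the following sketch. Both function names are hypothetical, and the bound is my reading of the JavaScript definition (integers with absolute value at most 2^53 - 1).

```rust
/// Hypothetical helper mirroring the idea of JavaScript's
/// Number.isSafeInteger: finite, integral, and |x| <= 2^53 - 1.
fn is_safe_integer(x: f64) -> bool {
    const MAX_SAFE: f64 = 9007199254740991.0; // 2^53 - 1
    x.is_finite() && x.fract() == 0.0 && x.abs() <= MAX_SAFE
}

/// Hypothetical conversion that succeeds only for safe integers.
fn safe_f64_to_i64(x: f64) -> Option<i64> {
    if is_safe_integer(x) {
        Some(x as i64)
    } else {
        None
    }
}
```

Note that this rejects 2^53 itself even though it is exactly representable, which is the conservative side of the trade-off.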

Do we have a list of use-cases for this to evaluate against?


Many float to integer conversions I use end up being rounded in a specific way and clamped, so I wouldn’t use try_from for them.

I can imagine using (integer as f32 * float).round().try_from() as a way to scale an integer by a fraction, and use try_from() to catch overflows.

There are no unsigned floats, so .round().try_from() to u32 may also be a way to check for values < 0. In some calculations I also need values > 0, so I might want to have f32 -> NonZero<u32> conversion, but at this point it’s getting weird. float as u32 > 0 seems good enough, and much clearer.

I can’t think of any example where I wouldn’t combine .round() (or floor() etc.) with .try_from(). Given that 0.1 + 0.2 != 0.3, floats just can’t be trusted to remain exactly integers (and if I didn’t need fractional values anywhere, I’d use integer types).
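
A sketch of the "scale, round, then check" pattern, written with an explicit range check since no such try_from exists today. The helper name is made up, and unlike my f32 example above it computes in f64, on the assumption that we want every u32 to be represented exactly.

```rust
/// Hypothetical helper: scale `n` by `factor`, round to the nearest
/// integer, and return None for NaN, negative, or too-large results.
fn scale(n: u32, factor: f32) -> Option<u32> {
    // f64 holds every u32 and every f32 exactly.
    let r = (n as f64 * factor as f64).round();
    let max = u32::MAX as f64;
    // NaN fails both comparisons, so it ends up in the None branch.
    if r >= 0.0 && r <= max {
        Some(r as u32)
    } else {
        None
    }
}
```

So scale(10, 0.5) gives Some(5), while scaling u32::MAX by 2.0 or anything by a negative factor large enough to round below zero gives None.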

I’m not sure about aligning with JavaScript here. The 2^53 range is for consecutive integers, but f64 can express integers larger than this.

This is the difference between #2 and #3, #2 is to fail for all big numbers, while #3 is to only fail for big numbers unrepresentable as a float. I suspect you could make #3 quite fast with bit fiddling (mask to get the mantissa, check the distance between the first and last '1' is small enough, or something like that).
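
For u64 the bit-fiddling check could look something like this sketch (the function name is hypothetical). It relies on f64 having a 53-bit significand, and on the f64 exponent range being far larger than anything a u64 needs.

```rust
/// Hypothetical helper: true if `x` is exactly representable as f64.
/// An f64 significand has 53 bits (52 stored plus the implicit leading
/// one), so `x` fits when the span from its highest set bit to its
/// lowest set bit is at most 53 bits wide.
fn fits_in_f64(x: u64) -> bool {
    if x == 0 {
        return true;
    }
    let span = 64 - x.leading_zeros() - x.trailing_zeros();
    span <= f64::MANTISSA_DIGITS
}
```

This needs no float conversion at all: (1 << 53) + 1 has a 54-bit span and is rejected, while (1 << 53) + 2 and 1 << 63 are accepted.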

It should be #3, on the principle that it should succeed for all values where the inverse conversion gives the original value.

Regarding implementation: probably first have a fast path for small integers, by adding 2^K and checking whether the result, taken as unsigned, is < 2^(K+1). Otherwise convert to float, check that it is finite, convert back, and check for equality; or possibly do the whole check bitwise on integers, if that’s faster.

EDIT: converting back and forth and checking doesn’t work, because the float-to-integer conversion saturates, yet the maximum integer value is not representable as a same-sized float.
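
A small sketch of the pitfall for u64 and f64. Both helpers are hypothetical, and the widened variant is just one possible way around the problem, assuming a wider integer type is available.

```rust
/// The naive check: convert to f64 and back, then compare.
fn naive_round_trip_ok(x: u64) -> bool {
    (x as f64) as u64 == x
}

/// Same idea, but converting back into u128 so that 2^64 is not
/// clamped to u64::MAX by the saturating cast.
fn widened_round_trip_ok(x: u64) -> bool {
    (x as f64) as u128 == x as u128
}
```

u64::MAX rounds up to 2^64 as an f64, the cast back saturates to u64::MAX, and so the naive check reports success even though the float is off by one. The widened check reports failure for that input.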

2 Likes

I don’t like #3 because it’s unpredictable. For example, a f64 can only store even numbers in the interval between 2^53 and 2^54. Accepting even numbers and failing with odd numbers sounds error-prone to me: It has the potential to cause bugs that are really hard to track down.

Except maybe in WASM, I always want the implicit rounding behavior. This is already the default in Rust: Writing

let x: f64 = 9007199254740993.;  // stored value: 9007199254740992
let y: f64 = x + 1.;             // stored value: 9007199254740992

implicitly rounds x and y. And that’s exactly what we want when we use a floating-point number. If not, we should use a different type that has a higher precision on this scale.

So, IMHO #1 should be the default, with an additional function for choice #2.

But the thing is, with version #3, you would have found this error easily.

let x: f64 = f64::try_from(9007199254740993_u64).unwrap();  // panics
let y: f64 = x + 1.;                                        // unreachable

Implicitly rounding seems even more error prone to me, because when I add 1, I expect that it should work. If it doesn’t then there is a bug in my code, so version 3 is more robust.

I think we should focus on generic code using TryFrom/TryInto rather than non-generic code, because that is where the difference actually matters. In non-generic code, if you need a different behaviour, it can be achieved with a different method, but not in generic code.

For generic code, correctness matters more than ease of use. And with TryFrom, I expect that the following code holds generically, otherwise there is a logic bug in the implementation of TryFrom/TryInto.

let t: T = T::try_from(x)?;
let u: U = U::try_from(t)?;
assert_eq!(u, x);

Version #1 does not preserve this property, #2 is too limiting, so I think version #3 is the best one.
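
Spelled out as a generic helper, the property could be checked with something like this sketch. The function name and the choice of return type are mine, and it is exercised below only with integer types that already implement TryFrom.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: does `x` survive the trip U -> T -> U?
/// Returns None when the first conversion fails, Some(true) when the
/// value comes back unchanged, and Some(false) otherwise.
fn round_trips<T, U>(x: U) -> Option<bool>
where
    T: TryFrom<U>,
    U: TryFrom<T> + PartialEq + Copy,
{
    let t = T::try_from(x).ok()?;
    Some(match U::try_from(t) {
        Ok(back) => back == x,
        Err(_) => false,
    })
}
```

For example, round_trips::<u8, u32>(200) is Some(true) and round_trips::<u8, u32>(300) is None. The expectation is that a correct TryFrom pair never yields Some(false).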

1 Like

It's not an error, it's a feature, because a f64 only has 15–16 significant decimals. If you use a f64, that means you don't care about the last digit. Otherwise, you would use a different type (e.g. f128 or u64).

The fact that addition of two floats includes rounding and doesn't always give the correct result is well known.

1 Like

Yes I am aware of that, but it is easy to forget and make a mistake. Especially when writing generic code, which is where TryFrom/TryInto are going to be used a lot.

So, I wrote a small prototype in my local build that is pretty much:

+// lossless {integer} <-> {float} conversions
+macro_rules! try_roundtrip {
+    ($source:ty, $($target:ty),*) => {$(
+        #[stable(feature = "try_from", since = "1.34.0")]
+        impl TryFrom<$source> for $target {
+            type Error = TryFromIntError;
+
+            /// Try to create the target number type from a source
+            /// number type. This returns an error if the source value
+            /// cannot have a lossless round trip.
+            #[inline]
+            fn try_from(u: $source) -> Result<$target, TryFromIntError> {
+                let x = u as $target;
+                let y = x as $source;
+                if y != u {
+                    Err(TryFromIntError(()))
+                } else {
+                    Ok(x)
+                }
+            }
+        }
+    )*}
+}
+

Would this be a reasonable behavior? (Only convert things that can be losslessly converted, otherwise fail)
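
One way to probe that question is to write the same round-trip logic out by hand for a single pair of types and try the edge cases. This is a hypothetical free function for f64 -> i64 that returns Option instead of TryFromIntError; it is not the macro itself.

```rust
/// The prototype's round-trip logic, written out for f64 -> i64.
fn round_trip_f64_to_i64(u: f64) -> Option<i64> {
    let x = u as i64; // saturating cast; NaN becomes 0
    let y = x as f64;
    if y != u {
        None
    } else {
        Some(x)
    }
}
```

Fractions, NaN and the infinities are rejected as hoped. The input 2^63 (9223372036854775808.0) is accepted and yields i64::MAX, though, because the cast saturates and i64::MAX rounds back up to 2^63, so the edges of the range would need an extra check.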

2 Likes