Scientific notation when formatting floating point numbers

These are two separate problems.

What you are saying is that as the author of the code, I might know, for example, that my result has no more than five digits of precision, and then it makes sense to use formatting options to trim the printout to five significant digits.

The other question is what the default should be if no formatting options are specified. I think when it comes to deciding the number of significant digits, the current state is best (i.e. use the smallest number that round-trips, which is limited by the fact that f64 has no more than 17 decimal digits of precision).

I see, thanks. Maybe Mathematica does not support e-notation, but it does support scientific notation using some alternative syntax.

Integers and floating point numbers are already quite different in the way they work and in the way they are used, so they need not be consistent in the way they are printed.

While an f64 might not be the intended real number due to rounding errors in prior calculations, it does represent exactly some other (rational) number.

To me it would make sense if by default the represented number was printed (even when it is not the original intended number), whereas currently printing it produces a third number which is neither the intended number nor the represented rational number.

I think this is a significant source of confusion that people have with floating point numbers.

It depends on the reason you're printing it. If you're printing a number for some further processing then it probably makes sense to limit to the 17 digits. If you're printing it because you're debugging rounding errors, it makes sense to see the exact value represented. If you're printing the results of some calculation for human consumption, you want 3 digits or whatever.
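For what it's worth, the existing formatting options already cover all three of these cases; a quick illustration (the value here is just an arbitrary example):

```rust
fn main() {
    let x = 2.0_f64 / 3.0;
    println!("{x:.3}");  // human consumption: 0.667
    println!("{x:.17}"); // enough digits to round-trip: 0.66666666666666663
    println!("{x}");     // default: shortest string that parses back to x
}
```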


Note that this is also true of integers in some cases; it's more common with floating point numbers, since they are explicitly designed as an approximation of the reals, but you get it with integers too when they're being used as a measure of reality.

Whether any number is an exact representation of the intended value is always a property of the program that the programmer tracks, and not a matter of how it's represented.


Once I have accepted that a floating point value is in general not equal to the intended value - and not even the representable value closest to the intended value - then what's the harm in replacing a value by a nearby one?

The larger concern with floating points is usually how errors accumulate and propagate, and potentially magnify, through a calculation.

Even if most people don't know fully how floats are serialized and deserialized, isn't that a fairly minor concern for most people?

You seem to agree that for further processing and human consumption, 17 digits or less are enough. So I'm assuming that's what Display::fmt should do, while the debugging part is the job of Debug::fmt.

Even for the debugging, I don't see how we would need three hundred digits. Eighteen should be enough to recover the original value. Except I'm not sure about special values - do these round trip anyway? How about different versions of NaN?
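To partially answer my own question: NaN is a real problem for round-tripping, because every NaN bit pattern prints as the same string, so any payload is lost. A quick check (the payload value here is arbitrary):

```rust
fn main() {
    // A quiet NaN with a nonzero payload in the low bits.
    let nan = f64::from_bits(0x7ff8_0000_0000_0001);
    let s = format!("{nan}");             // prints as "NaN"
    let parsed: f64 = s.parse().unwrap(); // parses back to the default NaN
    assert!(parsed.is_nan());
    // The payload did not survive the string round trip.
    assert_ne!(parsed.to_bits(), nan.to_bits());
    println!("{s} parsed back as bits {:#x}", parsed.to_bits());
}
```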

I'm not sure what we mean by "debugging rounding errors" - is this about debugging the processor? Or the compiler for platforms that don't have floating point arithmetic? Maybe in this case, you really want to see the binary representation?

If it was a random nearby value, the harm would be that it would be a misrepresentation of what is in the variable. The whole point of printing a variable is to see what is represented in the variable.

What saves this is that it is not a random nearby value but a reversible 1-1 algorithm, so it could be considered a kind of an accurate representation. So then the question is: is this representation understood by users? Definitely not by beginners at least (hence a lot of questions of the sort: I print x, y, z and get "0.1", "0.2", "0.3" exactly, and yet x + y != z).

Of course if you're explicitly asking the Display mechanism to round then that's a different story.

That's a reasonable compromise too.

The confusion is over examples like this:

fn main() {
  let x: f64 = 0.1;
  let y: f64 = 0.2;
  let z: f64 = 0.3;
  println!("x = {x}");
  println!("y = {y}");
  println!("z = {z}");
  println!("x + y == z: {}", x + y == z);
}

where one might expect the last line to print true, but it prints false:

x = 0.1
y = 0.2
z = 0.3
x + y == z: false

Increasing the number of digits to twenty makes it more clear what is going on:

x = 0.10000000000000000555
y = 0.20000000000000001110
z = 0.29999999999999998890
x + y == z: false

But the values printed are not exactly representing the actual values stored. To get the exact values, we need 54-55 digits:

x = 0.1000000000000000055511151231257827021181583404541015625
y = 0.200000000000000011102230246251565404236316680908203125
z = 0.299999999999999988897769753748434595763683319091796875
x + y == z: false

Considering that only the first 16 digits still represent the intended value, 54 digits seems like a lot.
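Note that you can already get those exact digits today by passing a large enough precision to the formatter, which rounds correctly at any requested length (55 here just matches the widest value above):

```rust
fn main() {
    let x: f64 = 0.1;
    // Exact decimal expansion of the double nearest to 0.1:
    println!("{x:.55}");
    // 0.1000000000000000055511151231257827021181583404541015625
}
```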

A user might be doing:

let pi: f64 = 3.1415926535897932384626433;

where the exact value will be:


A user may or may not notice that something has changed.

(at least, Clippy will warn of "excessive precision" and suggest removing everything after 793)

Making it easier to debug floats may help, but really, we want to warn users: anyone who wants to use floats should understand certain basics.

Using == or != on floats is almost always a mistake, I think there should be a Clippy rule about it.

(The only valid use case for ==/!= on floats I know of is writing a JavaScript engine, where floats are used as an integer substitute.)
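That use case works because f64 represents every integer with magnitude up to 2^53 exactly, so within that range == behaves like integer equality; a small illustration:

```rust
fn main() {
    let max_exact = 9007199254740992.0_f64; // 2^53
    // Within the exact range, float equality is integer equality.
    assert_eq!(4503599627370496.0 + 1.0, 4503599627370497.0); // 2^52 + 1
    // Past 2^53, adjacent integers collapse: 2^53 + 1 rounds back to 2^53.
    assert_eq!(max_exact + 1.0, max_exact);
    println!("integer arithmetic in f64 is exact up to 2^53");
}
```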

On the other hand, having rules like that can also lead to nonsense like (a - b).abs() < f32::EPSILON getting more use. You'd need to be very careful with the messaging.
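To illustrate why that check is nonsense: EPSILON is the gap between 1.0 and the next representable float, so for values far from 1.0 it is the wrong scale entirely (the magnitudes here are arbitrary):

```rust
fn main() {
    let a = 1_000_000.0_f32;
    let b = a + a * f32::EPSILON; // only a couple of ULPs away from a
    // a and b are about as close as two distinct f32s of this size can be,
    // yet the absolute difference dwarfs EPSILON, so the popular check
    // (a - b).abs() < f32::EPSILON reports them as "not equal":
    assert!(a != b);
    assert!((a - b).abs() > f32::EPSILON);
}
```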

In an ideal world, there would be something like interval maths support or some other model for type representation of error propagation, but realistically hardly anyone wants to deal with that, even where it is possible.
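For the curious, here is a minimal sketch of what interval arithmetic looks like. A real implementation must also round the lower bound down and the upper bound up at every operation (directed rounding), which this toy version skips, so the ±1e-16 padding below is doing that job by hand:

```rust
// Toy interval type: a result is bracketed by [lo, hi].
#[derive(Debug, Clone, Copy)]
struct Interval {
    lo: f64,
    hi: f64,
}

impl Interval {
    fn new(lo: f64, hi: f64) -> Self {
        assert!(lo <= hi);
        Interval { lo, hi }
    }
    // Interval addition: the result brackets every possible sum.
    fn add(self, other: Interval) -> Interval {
        Interval::new(self.lo + other.lo, self.hi + other.hi)
    }
    fn contains(self, x: f64) -> bool {
        self.lo <= x && x <= self.hi
    }
}

fn main() {
    // Bracket 0.1 and 0.2 with a little slack on each side, then add.
    let a = Interval::new(0.1 - 1e-16, 0.1 + 1e-16);
    let b = Interval::new(0.2 - 1e-16, 0.2 + 1e-16);
    let sum = a.add(b);
    // The f64 nearest to 0.3 falls inside the resulting interval,
    // even though 0.1 + 0.2 != 0.3 as plain floats.
    assert!(sum.contains(0.3));
    println!("{sum:?}");
}
```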


There is indeed a Clippy rule about it (float_cmp). But unfortunately

(a - b).abs() < f32::EPSILON

is exactly what Clippy recommends instead.



It's at least "an epsilon" rather than explicitly EPSILON in the message, but that's far too subtle if you're getting this message.

Worse, the example in the lint docs shows using f64::EPSILON, and links to a rather poor explanation of the issue.

This should absolutely be clarifying that it's on you to pick an actually meaningful error value, and at least guiding you to the relevant concepts on numerical stability and error analysis.

Or maybe I'm just a fussy little boy and nobody out there actually cares about comparing floats correctly.


If I needed to compare floats with some amount of error tolerance, I think I would prefer two functions that let me provide ULPs or absolute error as parameters. Something like:

// Untested.
fn compare_with_ulps(left: f32, right: f32, ulps: u32) -> Option<std::cmp::Ordering> {
  let cmp = left.partial_cmp(&right)?;
  // For finite floats of the same sign, the distance between the bit
  // patterns is the number of representable values between them.
  if !cmp.is_eq() && left.to_bits().abs_diff(right.to_bits()) <= ulps {
    return Some(std::cmp::Ordering::Equal);
  }
  Some(cmp)
}

fn compare_with_abs_error(left: f32, right: f32, abs_error: f32) -> Option<std::cmp::Ordering> {
  debug_assert!(abs_error >= 0.0);

  let cmp = left.partial_cmp(&right)?;
  if cmp.is_gt() && left - right <= abs_error {
    return Some(std::cmp::Ordering::Equal);
  } else if cmp.is_lt() && right - left <= abs_error {
    return Some(std::cmp::Ordering::Equal);
  }
  Some(cmp)
}
Something like this could easily be in the stdlib so users don't have to reinvent the wheel. Sorry for continuing the derailing.

On topic, I wish:

  • Rust used Ryu-based float formatting by default.
  • Rust printing had a precision mode that just means "exact", where it would print the exact float value with no trailing zeros.

FWIW, fixing this is an old issue. It's just hard to do so in an actually useful way.


Yeah, the "actual" answer is something like "so just take this college course, then when you get back, ..."

At least float_cmp isn't default?


I think it's not that hard and float_cmp should be included in the default. I added a comment to the issue.

What I meant by 'an "actual" answer' is one where you aren't just picking an error bound out of your hat and checking if it seems to work, but able to exactly determine what the error bound is, or at least get a reasonable upper bound.

Given that numerical analysis is, in fact, a college course, it literally is exactly "that hard", for however hard a 200-level college course is.

To be fair, in the only place I've actually had to look into this, rather than just replacing definitely exact comparisons (because they're using integers or checking for clamped values), it's just been to verify that the error is definitely below what would be visible to a user. I hardly consider myself an expert, but I at least know enough to know what I don't know.


And almost immediately after I say I've never had a use for a runtime float equality I implement a linear equation solver and need to write unit tests... ain't it always the way?