I tried without great success to post this both on the user forum and on IRC.
I post here since I don't think it is big enough for an RFC (maybe it is).
Context: formatting a float (f32 or f64) in scientific notation for human-facing output, especially for well-aligned columnar data (a common use case, at least in the sciences).
I would like to mimic something like e.g. the F(J) column here or on the screenshot at the end of the post if the link is broken.
Current behaviour:
let v = 1.34e+2_f32;
assert_eq!("+1.34e2", format!("{:+e}", v));
Expected behaviour:
let v = 1.34e+2_f32;
assert_eq!("+1.34e+02", format!("{:+e}", v));
Example: for a minimal table containing a single column and 2 rows (1.34e2 and -1.34e-2):
the current behavior using the format {:+e} (or {:+.2e}, ...) is
+1.34e2
-1.34e-2
while the expected behavior would be:
+1.34e+02
-1.34e-02
Remarks:
to keep the format expression simple, there is no way (to my knowledge) to modify the way the exponent part of the scientific notation is formatted
the current behavior is to use the smallest possible number of characters
it is a valid choice for non-tabular data or for ASCII serializations such as CSV, JSON, ...
(for ASCII serializations not intended to be read by a human, an equivalent of C's %g would probably be more compact)
the expected behavior follows the choice made in C:
always print the sign of the exponent and zero-pad it to at least two digits, so that it takes 3 (4) characters, knowing that the exponent range is [-38, +38] ([-308, +308]) for an f32 (f64).
this solution is not the most compact, but it may be the best compromise to keep the format syntax complexity low and to allow well aligned columnar data
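For illustration, the expected output can already be obtained today by post-processing what std produces. The following is only a minimal sketch: the names sci_c_style and render_column are hypothetical (nothing like them exists in std), and it assumes the current std output, in which the exponent follows a single 'e' as a plain decimal integer. An f32 can be passed after a cast to f64.

```rust
/// Hypothetical helper (not part of std): formats `x` roughly like C's
/// `%+.<prec>e`, with an explicit exponent sign and at least two exponent digits.
fn sci_c_style(x: f64, prec: usize) -> String {
    // Let std do the rounding; this yields e.g. "+1.34e2" or "-1.34e-2".
    let s = format!("{:+.*e}", prec, x);
    // Non-finite values (infinity, NaN) are printed without an exponent,
    // so they contain no 'e' and are returned unchanged.
    match s.split_once('e') {
        Some((mantissa, exp)) => {
            // Assumption: the exponent is a plain integer such as "2" or "-2".
            let exp: i32 = exp.parse().expect("exponent should be an integer");
            // `{:+03}` forces the sign and zero-pads to 3 characters in total.
            format!("{}e{:+03}", mantissa, exp)
        }
        None => s,
    }
}

/// Renders one value per line, like the single-column table above.
fn render_column(values: &[f64], prec: usize) -> String {
    values
        .iter()
        .map(|&v| sci_c_style(v, prec))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Calling render_column(&[1.34e2, -1.34e-2], 2) gives the two aligned lines shown under "expected behavior". Exponents with three digits (f64 only) still make the line one character longer, as they do in C.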
Whether you agree or not (I probably have an incomplete, biased view), I really would like someone to share their thoughts on the matter.
P.S: thank you for the great job you are doing, I really enjoy Rust
I think this is the kind of thing where it would be helpful to see an example of the kind of thing you're trying to do, as part of the motivation section.
(Scientific notation seems like it'd hit the same peeve I have with disk size reporting that shows 9K and 1G in the same column, making comparisons nontrivial.)
While I understand your qualm with different magnitude values being displayed this way, it’s extraordinarily common in the physics (and presumably other sciences) community - you just have to get used to reading the exponent before making any comparison.
The reasoning behind displaying data this way is that the number of significant figures measured is important (in the given example, 3).
With that in mind, I think it may be better to have a crate to do this, rather than extending the language - I don’t know that we need this to be a compiler-standardized numerical format. Unfortunately, while there are several crates that look like they may have some functionality like this, it doesn’t appear that any of them are well-documented and recently updated, or very complete.
Note I deliberately left concerns like this out of the {:g} proposal because the use case of user-facing output is easily serviced by third party libraries. The standard library is for features with high impact, or that can't go anywhere else.
I have many places where I output numbers in scientific notation and I don't even look at the mantissa. For numbers that vary wildly in magnitude, the exponent may be the only useful piece of information.
First of all, thank you @Zarenor and @ExpHP for your answers.
(Hereafter I do not use emphasis to shout, but to ease a quick reading.)
I wonder about the purpose of the format! syntax (used in println! and write!):
is it to be used for (possibly lossy) ASCII serializations?
is it to be used for user-facing outputs?
both?
In the case of lossless ASCII serializations, the precision must not be used (so as not to drop significant digits),
and I tend to think that flags, width, ... are useless. The {:g} format would probably be the best (most compact) option.
If the first answer is the right one, I agree with you.
But what is the point of using '+', width, '<', '>', ... in non-user-facing outputs?
So, unless I misunderstand, I think that one can reverse the argument:
a particular ASCII serialization is probably best serviced by the third party library implementing it and the main purpose of println! is to build user-facing outputs.
(Note e.g. that a JSON document containing integers or floats with a '+' sign is not valid).
In practice, I have the feeling that there is not always a clear separation between ASCII serializations and user-facing outputs.
So the right answer is probably the 3rd one: format! is general purpose.
In my opinion, the first aim of an ASCII serialization is not to be as compact as possible.
And the choice made by C's creators (as a good compromise between syntax complexity and output compactness) to
use %+03d (or {:+03}) to print the exponent of a float is not a random choice, and it is still valid today.
So, I would like to know if the current choice implemented in Rust comes from a long process or is just an implementation detail.
P.S.: My personal choice would probably be a more complex format! syntax that allows choosing the format of the exponent.
It feels to me that format! is very much there for programmer convenience. It’s known not to be necessarily the fastest and most efficient, but it’s easy and there are debug and pretty-print-debug formats and other options. That convenience is high impact. So I think there’s a good case for a format that does what you want as a common expectation.
At the same time, offering every possible stringly knob for tweaking output is a sure path to line noise, less convenience due to more confusion, and bugs. “Engineering notation”, with the exponent always a multiple of 3, is also common enough for someone to want a similar formatting trait or option, and the list goes on.
I don’t know if changing the default format would be considered a breaking change, but I can imagine it might well be, especially if there is no way to get the old format back. So really this means more options, either way.
There was another thread a while ago with a similar discussion (maybe it was even yours?) where another possibility came up: implement the Display trait for a custom wrapper type, formatting it just how you want. Either use that type throughout the code, or use it in a struct that represents your output table, with from/into the base type.
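A rough sketch of what such a wrapper could look like follows. The type name Sci is made up for this example, the default of 2 decimals is an arbitrary choice, and the width handling is simplified to always right-align. It relies on std printing the exponent after a single 'e', as it does today.

```rust
use std::fmt;

/// Hypothetical newtype; `Sci` is an illustrative name, not an existing API.
struct Sci(f64);

impl From<f64> for Sci {
    fn from(x: f64) -> Self {
        Sci(x)
    }
}

impl fmt::Display for Sci {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Honor `{:.N}` if given; otherwise fall back to 2 decimals (arbitrary).
        let prec = f.precision().unwrap_or(2);
        // std output, e.g. "+1.34e2"; the exponent is rewritten below.
        let s = format!("{:+.*e}", prec, self.0);
        let out = match s.split_once('e') {
            Some((mantissa, exp)) => {
                let exp: i32 = exp.parse().map_err(|_| fmt::Error)?;
                // Signed exponent, zero-padded to at least two digits.
                format!("{}e{:+03}", mantissa, exp)
            }
            // Infinity and NaN carry no exponent; keep them as printed by std.
            None => s,
        };
        // `Formatter::pad` is avoided on purpose: for strings it treats the
        // precision as a maximum length and would truncate the output.
        // Simplification: a requested width always right-aligns.
        match f.width() {
            Some(w) => write!(f, "{:>w$}", out, w = w),
            None => f.write_str(&out),
        }
    }
}
```

With this in place, println!("{:12}", Sci::from(1.34e2)) prints a right-aligned cell, and a table struct could hold Sci values or convert from f64 at output time. The cost is that every call site has to opt in to the wrapper, which is the trade-off against changing format! itself.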