Deprecate f{32,64}::from_str_radix


#1

I’m currently putting the finishing touches on a complete rewrite of the float parsing. While figuring out where the impl FromStr for fN is, I was once again reminded that fN::from_str_radix is a thing that exists. This makes me very, very sad. Here are a few things that are wrong with from_str_radix for floats (note that integers are a completely different matter!):

  • It is completely useless. There are a few cases where you’d use hexadecimal or binary literals because they provide an easy way to precisely control the bits of the representation, but otherwise everything ever is base 10. I googled it. I asked on IRC. I found exactly two users: rust-bencode used to serialized in base 16 for some reason, but that changed a while ago. Second, the tests in std::num::f64 use from_str_radix with base 16 for lack of built-in hexadecimal float literals.
  • Other languages don’t support it either: I surveyed Python, Ruby, Go, JavaScript, Java, Nim, and C. No one has complained yet.
  • There are no tests for it. At all. The tests only exercise base ten. I guess if base 16 broke, the aforementioned unrelated tests would break but that’s not ideal. Other bases could be completely wrong without anyone noticing, and I wouldn’t be surprised if they already are.
  • The dual operation, to_str_radix, has already been removed a while ago.
  • If by pure chance someone has a terrible use case for which they need this, they can damn well write their own. It’s rather easy if you’re only interested in one particular radix, especially since exponents aren’t legal for bases other than 10 and 16. Besides, it can’t possibly be more inaccurate than the current implementation.
  • Speaking of which, our implementation is terrible w.r.t. rounding errors, and making it not terrible is simply not worth it. Making the conversion even remotely efficient requires several kilobytes of precomputed tables per base. Generalizing the base 10 code also makes it even more tricky than it already is and invalidates some very nice optimizations (based on the fact that 10 = 2 * 5). Duplicating the code for base 10 to keep these optimizations is an even greater maintenance burden. I mean, it’s possible, but it’s a lot of tricky code for something entirely useless and untested.
  • If the current, wrong, lightweight implementation is kept around (it isn’t exactly a big maintenance burden), it’s a footgun. There is no valid reason to write floating point numbers, or indeed any non-integer numbers, in base 7. We shouldn’t be encouraging such depraved behavior, double so when parsing in base 7 is massively inaccurate and base 10 parsing is perfectly accurate.

The API is unstable, so this is not even a backwards compatibility hazard. Let’s just deprecate it, collect exactly zero complaints because nobody uses it anyway, remove it for good three months later, and never look back.

Unless of course I’m completely mistaken and there is a good reason to have this. Actually, I’d be happy with a bad reason as well, if only to satisfy my morbid curiosity about about the origins of this, IMHO terrible, idea. Hit me up.

PS: If necessary, it is quite easy to add separate functions that support base 16 and base 2. You literally just need to parse the digits that are explicitly given into a sufficiently large integer, round it if it needs more bits than available (this step is not trivial, but doable and already implemented in my base 10 code), and add the exponent mandated by the rounding to the exponent given in the input string.


#2

For any power-of-two base it should be trivial since every floating-point number can then be represented exactly in a small number of digits: you can use integer binary-to-{octal, hex, whatever} conversion on the parts of the mantissa each side of the decimal point. Rounding can be done either before or after conversion.

The only reason I can think of for supporting other bases would be if the algorithm itself was commonplace yet fiddly enough that it would be beneficial to avoid having multiple potentially broken implementations on crates.io (if only because they would not be under as much scrutiny as the standard library).


#3

This made me want a list of all unstable functionality more. We need a clear cut list of all unstable items so that way we can audit them more effectively.


#4

It sounds like you’re thinking of float → string conversion. That direction has already been removed, I’m talking about string → float conversion. But yes, that one is also quite easy for power of two bases. Though I do wonder if it’s worth supporting base 4, 8, and 32. I can only imagine two APIs: Taking the base as parameter and failing if it’s not a power of two, or having separate functions for each base. The second is more appealing when it’s just about decimal, hexadecimal and binary, but the proliferation from other bases strikes me as silly.

That’s the thing, I don’t see any evidence that this conversion is even remotely commonplace. It’s fiddly alright (if you want to do it accurately & efficiently) but that alone is not a reason to go in the standard library. Plenty of community members manage to maintain implementations of extremely complicated algorithms out in the wild.