(Pre-?)Pre-RFC: Range-restricting wrappers for floating-point types

I’ve been drafting this for a while, and I mentioned it under the type limits RFC, but didn’t receive a response. I would like some feedback.

Floating-point wrappers RFC draft (version as of this writing)

Very nicely written! Almost every question and observation that came to my mind while reading turned out to be addressed later in the document.

While the noisy_float crate exists as you note, I agree that something like this does belong in std. Currently, we provide the f32 and f64 types, which don’t implement Eq/Ord, and also various useful algorithms and data structures, which do require Eq/Ord, and (as far as I’m aware) no way to bridge the gap between them nor any indication of where to look for one, which seems slightly user-hostile and like a recipe for frustration.

(The optimization opportunities are gravy; I hadn’t even thought of that.)

One implementation question: could it conceivably be possible to implement the arithmetic operations for these types by relying on hardware exceptions to detect the out-of-range values (e.g. with signalling NaNs?), to avoid (as much of a?) performance impact on the happy path? Presumably that would have to involve catching the exception in a signal handler and then somehow converting it into a panic? I have no idea whether or how any of this could work, which is why I’m asking.

I still expect Rust to get some form of subset integral types “soon”, and later “static predicates” (refinement typing) too, for slightly more complex situations:

type Nat = u64[1 ..];
let x1: u8[1 ... 5] = 2;
let x2: u8[1 .. 3, 7, 9] = 7;
let x3: u8[x | x < 100 && x % 2 == 0] = 50;
fn foo(i: u8[i < 187]) -> u8 { i + 1 }

Range-restricted floating point values should work well (future-proof) with both subset types and static predicates.

I'm on-board with the premise that this is a problem worth solving in std, and worth solving independently of const generics (it would be rather neat if const generics became powerful enough to express these restrictions, but I don't know of any reason to wait for that).

The only part of the RFC I'm skeptical about is the interaction with user-defined floating point types. If I read it correctly, this is all it had to say:

If the compiler is unable to recognise how to handle the floating-point type contained in these wrappers, they should behave like ordinary #[repr(transparent)] structs. This will make it easier for user code to mix and match user-defined and compiler-provided floating-point types.

This sounds like if I wrote a BigFloat crate with a BigFloat::INFINITY value, nothing would stop users from doing let x = Finite<BigFloat>::new(BigFloat::INFINITY); not even at runtime. While this does technically allow for greater interoperability, this specific sort of interoperability seems...very weird. I can't quite convince myself it leads to problems, but it doesn't seem beneficial either.

Perhaps more importantly, it's not obvious to me that std::Finite<ixrec::BigFloat> would be valuable in practice. The choice of Finite/NaNaN/UniqNaN as the only three first-class range-restricted float types seems primarily motivated by optimization potential (especially UniqNaN), so it's not obvious to me that a user-defined float type would want the same set of restrictions rather than providing its own. I also can't imagine any code taking a std::Finite<> parameter that would be perfectly fine with getting a BigFloat::INFINITY value as its argument, so it seems like the only way this could be valuable is if we found a way to make Finite "genuinely typesafe", which I have no idea how to do.

My gut feeling is that we should simply punt on the question of "first-class support" for user-defined float types, and the RFC should just avoid precluding such a thing in the future. I think the only requirement that imposes is that the Ieee754 trait must be impossible to implement outside std, for now.
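One way to make a trait impossible to implement elsewhere on today's stable Rust is the "sealed trait" pattern. The sketch below is only an illustration: the `sealed` module, the `is_nan_value` method and the `count_nans` helper are made-up names, and the real `Ieee754` trait in the draft may look quite different.

```rust
mod sealed {
    // The trait is `pub`, but the module is private: other crates cannot
    // name `Sealed`, so they cannot implement it (or anything requiring it).
    pub trait Sealed {}
    impl Sealed for f32 {}
    impl Sealed for f64 {}
}

/// Hypothetical shape of the `Ieee754` trait; only `f32` and `f64` qualify.
pub trait Ieee754: sealed::Sealed + Copy {
    /// Illustrative method, not taken from the RFC draft.
    fn is_nan_value(self) -> bool;
}

impl Ieee754 for f32 {
    fn is_nan_value(self) -> bool {
        self.is_nan()
    }
}

impl Ieee754 for f64 {
    fn is_nan_value(self) -> bool {
        self.is_nan()
    }
}

/// Generic code can still use the trait as a bound.
pub fn count_nans<T: Ieee754>(values: &[T]) -> usize {
    values.iter().filter(|v| v.is_nan_value()).count()
}
```

Downstream crates could still write functions bounded on `Ieee754`, but an `impl Ieee754 for BigFloat` outside the defining crate would be rejected, which leaves room to open the trait up later.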

P.S. Why NaNaN instead of, say, NotNaN?

Speaking of these, I've also got a draft for a #[compact] attribute, to address the bitfields use case, but let's do one thing at a time.

(Terminology note: what is usually called an 'exception', i.e. jumping to some dedicated error handler, the IEEE 754 standard calls a 'trap'; while what the standard calls 'exceptions' are the actual events that cause it, which may as well be exposed in a status flag register. I'll be using this nomenclature here.)

I know that the x87 FPU (since the earliest 8087s) provides a 'status word' register, which contains flags that are set whenever overflow, underflow or other exception happens. So, testing for some of these outputs can be as cheap as checking a flag; I'm not very familiar with SSE or other architectures, though. I'm also a bit worried that relying on traps or reprogramming the FPU (like changing the 8087 control word) might run into problems at the FFI boundary, where foreign code may expect FPU to maintain its configuration bits, or where Rust code may fail to be given ownership of the trap handlers.

I don't expect that to work even with built-in float types: Finite isn't meant to provide new like that; that should have been a try_from. What's the difference between Finite<BigFloat>::new(BigFloat::INFINITY) and Finite<f64>::new(f64::INFINITY)?
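To make the `try_from` point concrete, here is a rough sketch of a fallible constructor. It uses a non-generic `Finite` over `f64` for brevity, and the `NotFiniteError` type and `finite` helper are invented for the example; none of this is the draft's actual API.

```rust
use std::convert::TryFrom;

/// Simplified, non-generic stand-in for the proposed `Finite<f64>`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Finite(f64);

/// Hypothetical error type returned when the input is infinite or NaN.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NotFiniteError;

impl TryFrom<f64> for Finite {
    type Error = NotFiniteError;

    fn try_from(value: f64) -> Result<Self, Self::Error> {
        // The range check happens once, at the boundary of the type.
        if value.is_finite() {
            Ok(Finite(value))
        } else {
            Err(NotFiniteError)
        }
    }
}

impl Finite {
    /// Unwrap back to the underlying float.
    pub fn get(self) -> f64 {
        self.0
    }
}

/// Small helper so callers do not need the trait in scope.
pub fn finite(value: f64) -> Result<Finite, NotFiniteError> {
    Finite::try_from(value)
}
```

With this shape there is no infallible `new`, so an infinity cannot get inside the wrapper without going through the check.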

The design I had in mind here was that eventually the Ieee754 trait could be made to contain an associated constant (say, Ieee754::SIGNIFICAND_BITS), from which the compiler would be able to compute the memory layout of the type and determine which bit patterns are valid, based on the assumption that the type indeed represents an IEEE-type binary float (sign bit, biased exponent with the highest value reserved for infinities/NaNs, significand). I was thinking of user code that at one point uses f16 and f80 types provided by a crate, but then the compiler starts providing such types on their own; user code could then switch to the built-in ones with no hassle. Perhaps I should have written all this down. (And perhaps I should have named the trait Ieee754Binary; who knows, maybe Rust will have decimal floats at some point as well.)
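A rough sketch of how such an associated constant could be used to derive the layout. The trait name `Ieee754Binary`, the extra `TOTAL_BITS` constant and the `is_nan_pattern` helper are assumptions made for this example, and it assumes a format with an implicit leading significand bit (as in `f32` and `f64`).

```rust
/// Hypothetical layout trait for IEEE 754 binary interchange formats.
pub trait Ieee754Binary {
    /// Total width of the encoding in bits.
    const TOTAL_BITS: u32;
    /// Number of stored (trailing) significand bits.
    const SIGNIFICAND_BITS: u32;
    /// Derived: whatever is left after the sign bit and the significand.
    const EXPONENT_BITS: u32 = Self::TOTAL_BITS - 1 - Self::SIGNIFICAND_BITS;
}

impl Ieee754Binary for f32 {
    const TOTAL_BITS: u32 = 32;
    const SIGNIFICAND_BITS: u32 = 23;
}

impl Ieee754Binary for f64 {
    const TOTAL_BITS: u32 = 64;
    const SIGNIFICAND_BITS: u32 = 52;
}

/// A bit pattern encodes a NaN when the exponent field is all ones
/// and the significand field is non-zero.
pub fn is_nan_pattern<T: Ieee754Binary>(bits: u64) -> bool {
    let significand_mask = (1u64 << T::SIGNIFICAND_BITS) - 1;
    let exponent_mask = ((1u64 << T::EXPONENT_BITS) - 1) << T::SIGNIFICAND_BITS;
    (bits & exponent_mask) == exponent_mask && (bits & significand_mask) != 0
}
```

The same masks would tell a compiler which bit patterns a not-NaN wrapper leaves unused.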

I'm not sure how valuable this functionality would be; maybe we can do without it. I'll be fine with making the trait unstable indefinitely, maybe #[fundamental] as well.

NaN stands for 'not-a-number', so NaNaN would be 'not-a-not-a-number'. All this musing about NaNs made me think of Gary Bernhardt's 'wat' talk, and I was also wondering how much sense it would make to use these wrappers on one another, and whether it would inevitably lead to type Batman = NaNaN<NaNaN<NaNaN<NaNaN<NaNaN<NaNaN<NaNaN<f64>>>>>>>;... just felt like being silly. (Not sure if you noticed the branch name...)

I'm not terribly attached to these names, they are mostly placeholders. In fact, UniqNaN is slightly misleading (it reserves two NaN payloads and the sign bit, so there are four valid NaN bit patterns for this type). If you can come up with better ones, be my guest.

As mentioned in pre-RFC `const` function arguments - #25 by gnzlbg :

This is important because RFC2000 makes it very clear floating point values will never be usable in const generics because they are not reflexive

That also adds a bit of motivation for potentially adding "not a NaN" floating point types which the compiler has special knowledge of at some point.

Or a way to implement structural equality for user-defined types, so that these wrappers can just be written by libraries.

Is NaN the only problem? I would expect -0 to be trouble too.

Hmm possibly, IANAE. You could declare that -0 < +0 and that’s that, but then, you could also declare that NaN == NaN for the purposes of typechecking, so…
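For reference, this is how the two zeros behave in today's Rust. The function names are made up for the example; `f64::total_cmp` is the standard library's IEEE 754 total-order comparison, which does place -0 below +0.

```rust
use std::cmp::Ordering;

/// Returns (equal under `==`, identical bit patterns, total-order result).
pub fn zero_comparisons() -> (bool, bool, Ordering) {
    let pos = 0.0_f64;
    let neg = -0.0_f64;
    (
        pos == neg,                     // IEEE equality treats them as equal
        pos.to_bits() == neg.to_bits(), // but the sign bit differs
        neg.total_cmp(&pos),            // total order: -0 sorts before +0
    )
}

/// The two zeros are observably different through division.
pub fn reciprocals() -> (f64, f64) {
    let pos = 0.0_f64;
    let neg = -0.0_f64;
    (1.0 / pos, 1.0 / neg)
}
```

So `==` remains reflexive for zeros, but two values that compare equal are not interchangeable, which is the kind of thing structural equality would care about.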

:+1: :+1: :+1: I definitely want an obvious response to "waah, I can't sort my Vec" in the standard library.
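As a minimal sketch of what that response could look like, here is a hand-rolled not-NaN wrapper that gets a total order and can therefore be sorted with plain `sort()`. The names `NotNan` and `sort_floats` are placeholders and not the RFC's types.

```rust
use std::cmp::Ordering;

/// Placeholder not-NaN wrapper; the invariant is enforced in `new`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NotNan(f64);

impl NotNan {
    pub fn new(value: f64) -> Option<NotNan> {
        if value.is_nan() {
            None
        } else {
            Some(NotNan(value))
        }
    }

    pub fn get(self) -> f64 {
        self.0
    }
}

// Equality is reflexive once NaN is excluded, so `Eq` is justified.
impl Eq for NotNan {}

impl PartialOrd for NotNan {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for NotNan {
    fn cmp(&self, other: &Self) -> Ordering {
        // `partial_cmp` on floats only returns `None` when a NaN is involved.
        self.0
            .partial_cmp(&other.0)
            .expect("NotNan never holds NaN")
    }
}

/// Sorts the input, or returns `None` if any element is NaN.
pub fn sort_floats(values: &[f64]) -> Option<Vec<f64>> {
    let mut wrapped = values
        .iter()
        .map(|&v| NotNan::new(v))
        .collect::<Option<Vec<_>>>()?;
    wrapped.sort();
    Some(wrapped.into_iter().map(NotNan::get).collect())
}
```

Note that this ordering still treats -0.0 and 0.0 as equal, in line with `==` on floats.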

However, the NaN case is the only such exception: eliminating it restores the reflexivity of equality, totality of ordering, +∞ and −∞ actually being the greatest and smallest floating-point values (as opposed to merely maximal and minimal ones, alongside NaN) and other desirable mathematical properties.

NaN being the only problem, however, makes the cases for the others seem less obvious. I don't really follow what code is totally fine with f64::MAX (about 1.8e308) but can't handle +∞, so I don't see the value in Finite the same way. Without the infinities, it can't be closed under exp, for example. And restricting to finite doesn't fix + being non-total either.
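A few concrete cases of the closure problem, using plain `f64` (the function names are just for the example):

```rust
/// Finite + finite can overflow to +∞, so `+` is not total on finite floats.
pub fn add_overflows() -> f64 {
    f64::MAX + f64::MAX
}

/// `exp` of a modest finite input already overflows to +∞.
pub fn exp_overflows() -> f64 {
    1000.0_f64.exp()
}

/// Keeping the infinities does not make `+` total on not-NaN floats either:
/// +∞ + (−∞) is NaN.
pub fn inf_minus_inf() -> f64 {
    f64::INFINITY + f64::NEG_INFINITY
}
```

Either way, arithmetic on a range-restricted type needs some story for results that fall outside the range.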

you can put the floating-point type in a wrapper

The unstable NonZero<> wrapper has been moved to more specific types, like NonNull<T> instead of NonZero<*mut T>, and proposed NonZeroU32 and friends. I think that argues for concrete types here instead of the generic wrapper. Strawman: ExtendedReal32 for f32-but-not-NaN. (Though I want a much shorter one so that people can naturally use it instead of f32 in most places.)

UniqNaN

This feels very similar to Option<ExtendedReal32>, so I'd be tempted to leave it out originally too.

Domain optimisations

Unfortunately, LLVM's nnan means that both the arguments and the result are not NaN, so things like Add probably can't use them. Though the type will probably end up with an unsafe add_unchecked that skips indeterminism-checking that could use it.

I can totally believe that some assumes in the right places will help codegen when the inputs can't be NaN, though.

Each wrapper type should implement [...]

I assume they should also get Add/Sub/Mul/etc, checked_* versions (because -INF+INF, INF-INF, INF/INF, etc), and pretty much all the inherent methods from fN? They seem fairly straight-forward to define, and essential for ergonomics. Or would you propose waiting on that until later?
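A rough sketch of how the panicking operator and its `checked_*` counterpart could relate, using a placeholder `NotNan` over `f64`. The method set and the panic message are assumptions, not something the draft specifies.

```rust
use std::ops::Add;

/// Placeholder not-NaN wrapper over `f64`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NotNan(f64);

impl NotNan {
    pub fn new(value: f64) -> Option<NotNan> {
        if value.is_nan() {
            None
        } else {
            Some(NotNan(value))
        }
    }

    pub fn get(self) -> f64 {
        self.0
    }

    /// Returns `None` for indeterminate forms such as +∞ + (−∞).
    pub fn checked_add(self, rhs: NotNan) -> Option<NotNan> {
        NotNan::new(self.0 + rhs.0)
    }

    /// Returns `None` for 0/0 and ∞/∞.
    pub fn checked_div(self, rhs: NotNan) -> Option<NotNan> {
        NotNan::new(self.0 / rhs.0)
    }
}

impl Add for NotNan {
    type Output = NotNan;

    /// The operator form panics where `checked_add` returns `None`.
    fn add(self, rhs: NotNan) -> NotNan {
        self.checked_add(rhs)
            .expect("NotNan addition produced NaN")
    }
}
```

This mirrors how the integer types pair overflow-checking operators with `checked_*` methods.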

Also, I wish we had customizable literals...

Well, fair enough. To be honest, I think the potential for domain optimisations actually provides the weakest support for this feature; I doubt it's going to be useful until paired with some sort of attribute for code blocks (maybe unsafe blocks only?) like #[float_optim(assoc,distrib,cancel)], which would control what floating-point optimisations are allowed within that block.

And I was indeed the least sure about how useful Finite would be. I think I'm going to remove it.

The matter's hardly settled; the discussion may still turn out in favour of making NonZero unstable indefinitely, or until some form of 'closed traits' (not implementable by external crates) arrives. I think I'll propose something similar for my Ieee754 trait.

On the other hand, I also thought about creating projective floating-point types (which would identify +∞ with −∞). Doing this with generics would raise the question of whether a non-NaN projective float should be...

  • Projective<NaNaN<f64>>
  • NaNaN<Projective<f64>>
  • both, with From converting between the two forms
  • neither, a separate wrapper: ProjNaNaN<f64>

It kind of does, but UniqNaN has some semantics for which it is unclear if they will fit for Option<NaNaN> in every case; compare the table in your own post in the discussion about lifting operations for Option<_>, which you could basically frame as raising the question whether None represents a lack of a value, or a value that is unknown. Plus, it's probably going to be easier for the optimiser to take advantage of a 'ready-made' type when eliding and shuffling NaN canonicalisation operations. (I talk about this in the 'Alternatives' section.)

I mention this in the 'Unresolved questions' section; I think this is an issue that could be addressed during the course of the RFC proper. I assume you're in favour of the 'panicking' option?

I agree about checked_; these methods should definitely be provided.

That gave me the idea that perhaps type inference could be tweaked to allow fraction literals (1.234) to be of these types...

Agreed. I meant to say that it can be implemented outside of std just fine as a newtype over Option<ExReal>, and thus doesn't need to be in the proposal--given that the compiler knows where the notch is, it'll make None be some NaN. (And eventually we might have you-can-pick-the-discriminant even for enums that carry data, if you need a particular NaN.)
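A sketch of that newtype, with placeholder names (`NotNan` standing in for ExReal, `MaybeNan` for the UniqNaN-like type). Since a user-defined wrapper has no compiler-known niche today, the `Option` here costs extra space; the point is only the shape of the API.

```rust
/// Placeholder for the not-NaN type ("ExReal" in the discussion).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NotNan(f64);

impl NotNan {
    pub fn new(value: f64) -> Option<NotNan> {
        if value.is_nan() {
            None
        } else {
            Some(NotNan(value))
        }
    }

    pub fn get(self) -> f64 {
        self.0
    }
}

/// Library-level "single NaN" type: `None` plays the role of NaN.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MaybeNan(Option<NotNan>);

impl From<f64> for MaybeNan {
    fn from(value: f64) -> MaybeNan {
        // Every NaN payload collapses into the single `None` state.
        MaybeNan(NotNan::new(value))
    }
}

impl MaybeNan {
    pub fn is_nan(self) -> bool {
        self.0.is_none()
    }

    /// Converts back; `None` becomes the standard library's `f64::NAN`.
    pub fn to_f64(self) -> f64 {
        self.0.map_or(f64::NAN, NotNan::get)
    }
}
```

With a compiler-provided not-NaN type, `Option<NotNan>` could be the same size as `f64`; with this hand-written version it is larger.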

Yes, since a noisy_float-style approach is always available if the overhead of panic checking is unacceptable in a particular scenario.

I would certainly suggest just having the NaNaN32/NaNaN64 as the original proposal, since it has the strongest motivation (allowing Ord) and the others can be reasonably built atop it.

The point is not to choose just any NaN bit pattern, but the same one as what is produced by operations like 0.0 / 0.0. Given this, and the IEEE 754 NaN propagation semantics I mention in the draft, the optimiser can elide NaN canonicalisations after arithmetic operations when it can prove no input can be a NaN in non-canonical form. You probably do not want user code to pick this canonical NaN bit pattern, because it is going to be architecture-dependent (see this link or this one).
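To illustrate what a canonicalisation step is, here is a small sketch. It uses `f64::NAN` as a stand-in for the canonical pattern, which is an assumption made for portability of the example: the pattern actually produced by `0.0 / 0.0` depends on the hardware and may differ from it. The function names and the payload value are invented.

```rust
/// Replaces any NaN with one fixed NaN; other values pass through.
/// `f64::NAN` is only a stand-in for the hardware's canonical NaN.
pub fn canonicalize(x: f64) -> f64 {
    if x.is_nan() {
        f64::NAN
    } else {
        x
    }
}

/// A quiet NaN carrying an arbitrary, made-up payload.
pub fn payload_nan() -> f64 {
    f64::from_bits(0x7ff8_0000_0000_1234)
}

/// True when two floats have the same bit pattern.
pub fn same_bits(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}
```

An optimiser that knows its inputs are already canonical (or not NaN at all) could drop the branch in `canonicalize` after an arithmetic operation, which is the elision described above.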

And sure, you can teach the compiler to opportunistically pick a NaN payload for the None-like variant in a way that simplifies arithmetic defined on Option-like types in an IEEE-compliant manner, but I think achieving this will be more straightforward with a dedicated type the compiler knows about. Thus, UniqNaN.
