[Pre-RFC?] NonNaN type

Yes, which is why I’m so excited about the possibility of having a non-idiosyncratic “dumb” float-derived type.

I’m calculating pixels, not rocket orbits, so “fast math” optimizations are fine for me. I only use real numbers in the 0…1 range, so everything else that IEEE float supports is baggage for me.

The problem of /0 introducing Inf is tough. I’m afraid that, given the constraints of IEEE 754 hardware and maximal performance, it can’t be entirely prevented at compile time. However, I’d be fine with it being treated like integer overflow in Rust: it’s the programmer’s responsibility not to do it, it’s checked in debug builds, and if it happens in release mode, it produces well-defined garbage.
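A minimal sketch of what that could look like, assuming a hypothetical wrapper type (`Real01` is an illustrative name, not a proposal):

```rust
#[derive(Clone, Copy)]
struct Real01(f64); // hypothetical "dumb" float restricted to 0...1

impl Real01 {
    fn div(self, other: Real01) -> Real01 {
        // Checked in debug builds, like integer overflow...
        debug_assert!(other.0 != 0.0, "division by zero");
        // ...while in release builds a zero divisor produces
        // well-defined garbage (an Inf bit pattern) instead.
        Real01(self.0 / other.0)
    }
}
```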


I think we should also include fixed point on the short list of alternatives in this discussion, since it’s even more performant than floating point, and for certain use cases it could arguably be considered more accurate (more consistent precision and no “weird values”), which is a very different trade-off from BigWhatever.

In fact, @kornel’s “pixel data” strictly within 0..1 sounds like the sort of thing I’d expect fixed point to be a good fit for; is it, or am I missing something? How important is the “don’t leave the FPU idle” thing?
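For concreteness, a minimal fixed-point sketch for values in 0..1 (`Q16` is an illustrative name for a 0.16 format stored in a u16, where u16::MAX represents 1.0):

```rust
/// Hypothetical 0.16 fixed-point value in [0, 1].
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Q16(u16);

impl Q16 {
    /// Multiplication stays within [0, 1] by construction:
    /// no NaN, no Inf, no "weird values", and Ord comes for free.
    fn mul(self, other: Q16) -> Q16 {
        Q16(((self.0 as u32 * other.0 as u32) / u16::MAX as u32) as u16)
    }
}
```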


Good Point(tm) (see what I did there?) :slight_smile:

In general it's great, but using both integers and floats is faster than using only integers. If you use only integers, you're missing out on extra float operations that the float half of the CPU can do for "free".

There are also some operations, like sqrt, that are surprisingly fast as floats, and even roughest integer approximations are ridiculously slow in comparison.

We could make the float types special:

float types can have enum-like subtypes: NaN, PInf, NInf, Subnormal, Normal, NegativeZero.

these aren’t real enum variants, but they can be matched on as if they were.
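For comparison, std already exposes something close to this via classify(), which returns a real enum that can be matched on (though it doesn’t distinguish sign, so PInf/NInf and NegativeZero would additionally need is_sign_negative()):

```rust
use std::num::FpCategory;

fn describe(x: f64) -> &'static str {
    match x.classify() {
        FpCategory::Nan => "NaN",
        FpCategory::Infinite => "infinity", // sign via x.is_sign_negative()
        FpCategory::Zero => "zero",         // likewise for -0.0
        FpCategory::Subnormal => "subnormal",
        FpCategory::Normal => "normal",
    }
}
```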


That would be solving a very different set of problems than actual subtypes would. But regardless, we can already do if x.is_nan() { ... } else if x.is_infinite() { ... } and so on, which seems hard to do significantly better even with custom pattern matching syntax.

I was thinking, since NonNaN would not be closed under operations, the most suitable math operations for it would probably be:

fn add(self: NonNaN<T>, other: NonNaN<T>) -> T

This still shifts the problem to the caller, but now the calls are at least chainable. It still gives the primary benefit of having size_of::<Option<NonNaN<T>>>() == size_of::<NonNaN<T>>().
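A minimal sketch of that shape, assuming a hypothetical non-generic NonNaN over f64 (names are illustrative, not from any RFC):

```rust
#[derive(Clone, Copy)]
struct NonNaN(f64);

impl NonNaN {
    fn new(x: f64) -> Option<NonNaN> {
        if x.is_nan() { None } else { Some(NonNaN(x)) }
    }

    /// Not closed under addition (e.g. INFINITY + NEG_INFINITY == NaN),
    /// so the result comes back as a plain f64, to be re-wrapped by the
    /// caller whenever they actually need the NonNaN guarantee again.
    fn add(self, other: NonNaN) -> f64 {
        self.0 + other.0
    }
}
```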

If the caller has to choose when to do NonNaN wrapping after every operation, then how is this better than just using assertions with a regular float type?

Because of the space-saving optimization the compiler is able to perform on Option types. That is the same impetus behind NonNull and NonZero.

Also: NonNaN would be able to implement Ord.
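That optimization is observable today with the existing Non* types; a NonNaN would presumably use NaN bit patterns as its niche the same way NonZero uses zero:

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // Option adds no space overhead when the inner type has a niche.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<NonZeroU32>());
}
```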

Right, I get the idea for Ord. But usually, you just need some kind of sort_by functionality to get floats sorted, and if you use a newtype wrapper, you don’t really need to define operators on them to do sorting. The question is, if you’re providing arithmetic operations on NonNaN that unwrap it every time, then I’ll have to re-wrap with NonNaN every so often, and presumably that would do nothing more than assert !x.is_nan(). There’s no space saving here: the alternative is not Option, but a normal float.
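For reference, the wrapper-free status quo being alluded to, which panics on NaN instead of silently misordering:

```rust
fn main() {
    let mut v = vec![3.0_f64, 1.0, 2.0];
    // sort_by with partial_cmp: no newtype, no operator impls needed.
    v.sort_by(|a, b| a.partial_cmp(b).expect("NaN in sort"));
    assert_eq!(v, [1.0, 2.0, 3.0]);
}
```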

You had discussed this in Avoiding PartialOrd problems by introducing fast finite floating-point types - #45 by johannesvollmer, and I'm still wondering whether floats with a limited range might meet your expectations (although unfortunately not @LateNiteMartyParty's). @Ixrec reminded us that:

But there are interesting subsets that could be made closed under lacunary basic arithmetic. Floats inside [+0.0, +Inf] are closed under addition and can be made closed under multiplication and division (by adapting some rules, like +0.0 / +0.0 == +0.0 * +Inf == +Inf / +Inf == +Inf). Floats inside [+0.0, +1.0] are closed under multiplication and essentially form a boolean algebra.

Edit: typo [+0.0, +Inf] => [+0.0, +1.0]
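The closure claims are easy to spot-check on ordinary IEEE floats:

```rust
fn main() {
    // [+0.0, +Inf] is closed under addition:
    assert_eq!(0.0_f64 + f64::INFINITY, f64::INFINITY);
    // ...but division needs the adapted rule, since IEEE yields NaN here:
    assert!((0.0_f64 / 0.0).is_nan());
    // [+0.0, +1.0] is closed under multiplication:
    assert!((0.5_f64 * 1.0) <= 1.0);
}
```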


This is the kind of thing I had in mind with my caveat:

which is why I assumed alternate arithmetics weren't very relevant to this thread.

But I hadn't thought about positive only. Do you know if it's feasible in practice to efficiently implement positive-only lacunary floating point atop regular floating point hardware by "just" ignoring the sign bit and "just" making all NaNs compare equal to Inf?

Unfortunately, no I don't. But I would be very interested in getting the opinion of hardware experts.

I disagree, for exactly the same reason that Option<T> is better than null: the programmer won't forget to check for NaN when the language reminds them that the value could be NaN. The main ergonomic argument against nullptr (and likewise NaN) is that it would have to be checked before each and every use of the possibly absent value, and the programmer would have to remember to do so every single time.

Rust is all about safe and predictable defaults, without requiring the programmer to "better not forget doing stuff".


The point is to get the same space with better ergonomics.


I think NonNaN math panicking is a fine and sane choice.

It “just works” for the normal cases, and matches the panicking that integers do (in debug mode). Then, as an optimization, one can always drop down to using floating point directly for intermediate values in a computation, to have only one NaN check at the end, in places where you’re doing enough operations for the performance to matter.
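Reusing the hypothetical NonNaN sketch from earlier in the thread, the “one check at the end” pattern might look like:

```rust
fn dot(a: &[f64], b: &[f64]) -> NonNaN {
    // Hot loop on raw floats: no per-operation NaN checks.
    let sum: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Pay for a single check when re-entering NonNaN land.
    NonNaN::new(sum).expect("dot product produced NaN")
}
```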


I’d like to note here the possibility of using the floating point status word to decide when to panic. This could be way more efficient than checking for NaNs, since you need not check each number computed.

Another option would be to trap floating point exceptions. It might be annoying to do the signal handling in a way that doesn’t mess with manual handling of signals, and it might get ugly and/or slow if a user rapidly alternates between NonNaN and f64 arithmetic.

But it’s worth remembering that the IEEE folks did think about error checking, and didn’t leave us with only NaNs as an option.
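There is no stable std API for the status word, but as a rough illustration via FFI to C99 <fenv.h> (glibc/x86 assumptions; FE_INVALID's value is platform-specific, and the compiler is free to reorder float ops around these calls, so treat this strictly as a sketch):

```rust
extern "C" {
    // C99 <fenv.h>, linked in via the system libm/libc.
    fn feclearexcept(excepts: i32) -> i32;
    fn fetestexcept(excepts: i32) -> i32;
}

const FE_INVALID: i32 = 1; // glibc/x86 value; not portable

/// Run a block of float math, then test the sticky "invalid" flag once,
/// instead of checking every intermediate value for NaN.
fn checked<R>(f: impl FnOnce() -> R) -> Result<R, ()> {
    unsafe { feclearexcept(FE_INVALID) };
    let r = f();
    if unsafe { fetestexcept(FE_INVALID) } != 0 { Err(()) } else { Ok(r) }
}
```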


I think there’s enough support for this type to warrant a real RFC. I can’t say when I’ll be able to make and submit one (I’m unfamiliar with the process and have work on the weekdays) but hopefully it’ll come in the following weeks. Stay tuned.

@hanna-kruppe would it be possible to allow the compiler to do layout optimizations under the assumption that NonNaNs are never NaN, but with optional run-time checks like what we do with -C overflow-checks?

It would be possible to have a safe float without the need for extensive runtime checks, but that would require rethinking floats.

We could define the float division operator with non-NaN, finite, non-zero operand types, for example fn div(a: FiniteNumber, b: FiniteNonZeroNumber), which would guarantee never-failing division at compile time. Apart from being ergonomic and bug-preventing, that would also allow for massive compiler optimizations. Runtime checks would only be necessary where a FiniteNumber is converted into a FiniteNonZeroNumber, basically checking whether the divisor is zero.

powf(base: FiniteNumber, exp: FinitePositiveNumber), would also be possible.

Division might look like this: average_age = (a.age + b.age) / 2, and you’d know that this can’t be NaN and can’t be infinity, because a.age and b.age are finite numbers.

We could do the conversion like iter().sum() / n.expect_non_zero(), where the divisor is expected to be non-zero but the type system cannot guarantee it. expect_non_zero would panic if the number is zero.
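Pulling the pieces of that proposal together into a minimal sketch (all type and method names are the hypothetical ones from this post, not an existing API):

```rust
#[derive(Clone, Copy)]
struct FiniteNumber(f64);

#[derive(Clone, Copy)]
struct FiniteNonZeroNumber(f64);

impl FiniteNumber {
    /// The one place a runtime check happens: converting a finite
    /// number into a finite non-zero one. Panics on zero.
    fn expect_non_zero(self) -> FiniteNonZeroNumber {
        assert!(self.0 != 0.0, "expected a non-zero number");
        FiniteNonZeroNumber(self.0)
    }
}

/// finite / (finite, non-zero) can never produce NaN, so the division
/// itself needs no check (overflow to Inf is the separate case handled
/// like integer overflow, as discussed below).
fn div(a: FiniteNumber, b: FiniteNonZeroNumber) -> FiniteNumber {
    FiniteNumber(a.0 / b.0)
}
```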

Of course, this approach would be rather explicit. Also, it does not handle the addition of two numbers whose sum is too large to be stored in a float. This could be handled like integer overflow: panicking in debug mode and returning NaN in release mode.