Elevate `NonZero*` types to true primitive integer types?

Having them as first-class primitives rather than them being stashed away in a std/core module would make those types much easier to use. And personally, I'd use them a lot more if they weren't so difficult to construct.

Currently, you have to do one of the following to create a NonZeroU32 of 123:

  • NonZeroU32::new(123).unwrap()
  • unsafe { NonZeroU32::new_unchecked(123) }

This quite frankly isn't very usable.

This would be a lot more usable if I could somehow pass a 123 where a NonZeroU32 is accepted, and it'd also be much more compelling for libraries to use as well.

Also, I'd like to see the operators better fleshed out. Currently, only these arithmetic operations are implemented for unsigned non-zero types:

  • Bitwise OR
  • Equality
  • Order checking
  • Division/remainder (as the divisor)
  • Checked add/multiply/exponentiate
  • Saturating add/multiply/exponentiate
  • Unchecked add/multiply/exponentiate
  • Checked round to next power of 2
  • Is power of 2
  • Integer logarithm, base 2 and base 10
  • Count leading/trailing zeroes

In reality, all of the integer operations could technically be implemented for the non-zero unsigned types as follows:

  • Addition, at least one non-zero: all of them, returning the non-zero type for non-wrapping operations and the possibly-zero type for wrapping ones
  • Subtraction: all of them, returning the possibly-zero type
  • Multiplication, at least one non-zero: all of them, returning the non-zero type for non-wrapping operations and the possibly-zero type for wrapping ones
  • Division: all of them, returning the possibly-zero type
  • Remainder: all of them, returning the possibly-zero type
  • Bitwise AND: all of them, returning the possibly-zero type
  • Bitwise OR, at least one non-zero: all of them, returning the non-zero type
  • Bitwise XOR: all of them, returning the possibly-zero type
  • Bitwise NOT: returning the possibly-zero type
  • Logical shifts: returning the possibly-zero type
  • Circular shifts: returning the non-zero type

(Of course, signed types will obviously differ here.)
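
To make the return-type rule in the list above concrete, here is a rough sketch for `NonZeroU32`. The trait `NonZeroU32Ext` and its method names are hypothetical and exist only for illustration; they are not part of std.

```rust
use std::num::NonZeroU32;

/// Hypothetical extension trait; none of these methods exist in std
/// under these names.
pub trait NonZeroU32Ext: Sized {
    /// Rotation only permutes bits, so a non-zero input stays non-zero.
    fn rotate_left_nz(self, n: u32) -> Self;
    /// Wrapping subtraction can land on zero, so it returns a plain `u32`.
    fn wrapping_sub_any(self, rhs: Self) -> u32;
    /// Bitwise AND can clear every bit, so it also returns a plain `u32`.
    fn bitand_any(self, rhs: Self) -> u32;
}

impl NonZeroU32Ext for NonZeroU32 {
    fn rotate_left_nz(self, n: u32) -> Self {
        // The rotated value is never zero, so `new` cannot return `None` here.
        NonZeroU32::new(self.get().rotate_left(n))
            .expect("rotation preserves non-zero")
    }

    fn wrapping_sub_any(self, rhs: Self) -> u32 {
        self.get().wrapping_sub(rhs.get())
    }

    fn bitand_any(self, rhs: Self) -> u32 {
        self.get() & rhs.get()
    }
}
```

Bitwise OR already follows this rule in std: `NonZeroU32 | NonZeroU32` yields a `NonZeroU32`.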

For disambiguating literals, I could imagine a nzu32 primitive type (corresponding to u32) and similar, though naming discussion is admittedly a bit early. For some other nuances:

  • Conversions from non-zero to possibly-zero types would be allowed, but not vice versa.
  • NonNull pointers could be safely converted to and from nzusizes similarly to how nullable pointers could be to and from usizes.
  • All the checked methods could just return the non-zero type inside their option, doing the obvious thing there of preserving the type in question.
  • Probably other stuff that hasn't come immediately to mind yet.

My meta-thought here is that it all sounds ok-ish, but that I wish every one of the things would go just a tiny bit further.

For example, rather than having some traits return the non-zero type and some return the could-be-zero type, just have a ranged type so it always has the correct size (like I describe in Non-negative integer types - #20 by scottmcm).

And rather than adding a suffix for non-zero, extend {integer} to be something usable by more types -- so NonZeroU32, but also num::BigUint and such. That would use some sort of "const trait" support so that the compiler can run the literal parsing as a const, which turns a panic into a compilation error.

While it's even longer, I'll toss in const { NonZeroU32::new(123).unwrap() } -- that way you'll get a compilation error if you passed in 0, so it's guaranteed as fast as the unsafe version while still being safe.

This sounds like a partial version of what I described

Yeah, the problem with NonZero* is that, unless you're using them purely for memory-conserving niche optimizations, you very quickly start needing NonOne* or NonZeroOrOne* etc. The API is also too heavy-handed, with all NonZero* being their own distinct unrelated types. I'd be very wary of extending it further.

That feels like a slippery slope argument.

I would also argue that NonZero is useful in and of itself. One important application is using type safety to ensure panic-free division, inversions, modulus/remainder, etc by using the type system to ensure the divisor is non-zero.
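
As a small illustration of the panic-free division point for unsigned types: std implements `Div` and `Rem` for `u32` with a `NonZeroU32` divisor, so the zero check moves to wherever the divisor is constructed. The function names `average` and `parse_count` are hypothetical.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: the divisor's type rules out division by zero,
/// so this division cannot panic.
pub fn average(total: u32, count: NonZeroU32) -> u32 {
    total / count
}

/// Hypothetical helper: the only fallible step is validating the raw input.
pub fn parse_count(raw: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(raw)
}
```

Callers handle the `None` from `parse_count` once, and everything downstream of it divides without a runtime zero check.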

I strongly agree with the OP that the ergonomics are the main thing holding this back. It would be very, very awesome if the compiler could infallibly infer the type of NonZero* literals, e.g.

pub const X: NonZeroU32 = 42; 

To write something like that today on stable Rust, I think the best you can do is:

pub const X: NonZeroU32 = match NonZeroU32::new(42) {
    Some(n) => n,
    None => unreachable!()
};
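
The `match` above can be wrapped in a small macro so that each use site gets the zero check at compile time. This is only a sketch: the macro name `nz_u32` and the constant `PORT_COUNT` are made up for illustration and are not a std API.

```rust
/// Hypothetical macro: evaluates the check inside a `const` item, so a zero
/// argument is rejected while compiling rather than at runtime.
macro_rules! nz_u32 {
    ($value:expr) => {{
        const VALUE: ::std::num::NonZeroU32 = match ::std::num::NonZeroU32::new($value) {
            Some(n) => n,
            None => panic!("value must be non-zero"),
        };
        VALUE
    }};
}

/// Hypothetical constant built with the macro.
pub const PORT_COUNT: ::std::num::NonZeroU32 = nz_u32!(8);

// `nz_u32!(0)` would fail to compile, because the panic is hit during
// constant evaluation.
```

This keeps the call site short without `unsafe`, though it is still a workaround rather than the literal inference asked for above.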

It is, and you seem to have some issue with that. Slippery slope is an entirely valid objection to greasing the slope. There are already what, two dozen different NonZero* types in the std? EDIT: 12 for native Rust integers, and 11 for various C integers.

As designed, there is no sensible API for those types which wouldn't feel lacking. If you think otherwise, the burden of proof lies on you.

Except that division also panics on int::MIN / -1, which shows nicely why the NonZero* types are useless for practical correctness purposes. Also none of the current integer division APIs work with NonZero* types, not even talking about transitive usages, which is a nice illustration of the API creep those types require.

I don't see anyone suggesting adding more NonZero* types, only making the existing ones more useful/first-class.

That's a valid concern for signed integers, but not for NonZeroU* where they would still provide panic-free division operations. Perhaps tone down the hyperbole?

...because of the ergonomic issues, which could be solved by improving the ergonomics of NonZero*.

The fact something doesn't work (well) today is not a counterargument against fixing it.

Do you have some example pseudocode that is clearer with these proposed functions?

The type system in Rust isn't rich enough to express "the result of this arithmetic operation satisfies an inequality" for most nontrivial operations. (That's probably a good thing!) If x is nonzero then so are 2*x - 1 and x ^ (x >> 1). But we can't express these using the above API without going out of and back into nonzero integers unless I'm missing something.
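
A sketch of what "going out of and back into nonzero integers" looks like for the two expressions mentioned. The function names are hypothetical; the point is that the final `NonZeroU32::new` call re-checks a fact the programmer already knows.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper computing `x ^ (x >> 1)`.
pub fn gray_code(x: NonZeroU32) -> NonZeroU32 {
    let raw = x.get();
    // The highest set bit of `raw` survives the XOR, so the result is
    // non-zero, but the type system cannot see that and we re-check.
    NonZeroU32::new(raw ^ (raw >> 1)).expect("gray code of non-zero is non-zero")
}

/// Hypothetical helper computing `2*x - 1` with overflow checks.
pub fn double_minus_one(x: NonZeroU32) -> Option<NonZeroU32> {
    // `None` here means the multiplication overflowed.
    let raw = x.get().checked_mul(2)?.checked_sub(1)?;
    // The value is odd, hence non-zero, yet we still have to ask.
    NonZeroU32::new(raw)
}
```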

We already have checked arithmetic functions, those could be overloaded for non-zero family numerics to check additional invariants.

NonZero is useful, but in practice it is very limited. If I have a function that receives a x and does 100 / (x - 1), then in order to avoid division by zero, the type of x must be NonOne, not NonZero.

In current Rust we can build NonOne in library code by wrapping NonZero, like what the nonmax crate does. But this has overhead (unless LLVM neatly sees through the whole thing).
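
A minimal sketch of such a wrapper, assuming an XOR-based encoding so that the forbidden value 1 maps onto the zero niche. The type `NonOneI32` is hypothetical and is not taken from any crate.

```rust
use std::num::NonZeroI32;

/// Hypothetical wrapper: stores `value ^ 1`, which is zero exactly when
/// `value == 1`, so the `NonZeroI32` niche excludes the value 1.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NonOneI32(NonZeroI32);

impl NonOneI32 {
    /// Returns `None` only for the value 1.
    pub fn new(value: i32) -> Option<Self> {
        NonZeroI32::new(value ^ 1).map(Self)
    }

    /// Undoes the encoding; this XOR on every access is the overhead
    /// mentioned above.
    pub fn get(self) -> i32 {
        self.0.get() ^ 1
    }
}
```

`Option<NonOneI32>` keeps the size of an `i32`, so the niche optimization carries over to the wrapper.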

What Rust really needs is integer types bounded by a range, like in Ada. There was a proposal here (pattern types) that suggested something like type NonZero = i32 in 1.. and that would be excellent. (But just a range is still not enough to build NonOne; one would also need a way to combine multiple disjoint ranges, like, type NonOne = i32 in 0..=0 | 2..)

This is once again an ergonomics issue with the NonZero* types.

For the type of code I'm writing (i.e. cryptography) I would never write an expression like x - 1 in new code when using the core integer intrinsics: we aim for deny(clippy::integer_arithmetic)-safe code. Instead I would write x.checked_sub(1).

Unfortunately the NonZero* types do not support checked_sub, but if they did it could enforce that subtracting from a NonZero* always returns another NonZero*, or else None.

Alternatively, Checked<NonZero*> is possible, which would allow you to simply use Sub, and permit multiple combinations of operands.
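
One way the `Checked<NonZero*>` idea could look is sketched below. The `Checked` type here is a hypothetical stand-in defined just for this example, not an existing std type, and treating a zero result as a failure is an assumption.

```rust
use std::num::NonZeroU32;
use std::ops::Sub;

/// Hypothetical wrapper: `None` records that an earlier step failed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Checked<T>(pub Option<T>);

impl Sub for Checked<NonZeroU32> {
    type Output = Self;

    fn sub(self, rhs: Self) -> Self {
        let result = match (self.0, rhs.0) {
            // Underflow and a zero result both collapse to `None`.
            (Some(a), Some(b)) => a.get().checked_sub(b.get()).and_then(NonZeroU32::new),
            // A failure in either operand propagates.
            _ => None,
        };
        Checked(result)
    }
}
```

With this, `a - b - c` can be written with plain operators and the final `.0` is inspected once at the end.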

None of that requires adding NonOne for the purposes of ensuring a divisor is non-zero. That's not the invariant that actually needs to be upheld here.

Instead it requires better building out and making more consistent the features Rust already has for solving these problems.

For context, @oli-obk actually has a (mostly) working prototype implementation in the compiler. No current intent to make it user-facing yet; the maximum current goal of the prototype is AIUI just as a more principled implementation behind #[rustc_layout_scalar_valid_range_start] and friends that the compiler can track more precisely.

That's certainly a valid, if restricted, approach. But if that's all you want, you can easily implement NonZero* as a library type. The API won't even be too big. But I'm absolutely not keen on uplifting that approach into the standard library and deciding for each of the 100+ methods on integers whether they should be mirrored on NonZero* types and in which form.

As it were, I've written crypto-bigint which uses NonZero divisors exclusively which enables things like infallible, panic-free inversions.

I can report based on empirical usage that it works great.

The inability to do checked arithmetic with the NonZero* types seems like a pretty big gap.

Sure, but where do you stop? For instance, x >> x.trailing_zeros() always returns a nonzero value if x is nonzero. But if this isn't implemented as a fundamental operation we'd need to do a checked shift and unwrap. I know it can't fail, but I can't assert that in the type system (and if I could it would be very hard to write a well-typed Rust program).

This feels like another slippery slope argument. Feature parity with the core integer types seems like a good stopping point.

You're pointing out a special case with interesting properties. That's an interesting case, but probably not one that belongs in core, and one someone can write for themselves in ~3 LOC.
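
For reference, the "write it yourself" version of that special case might look like this. The name `odd_part` is hypothetical; the `expect` is the runtime re-check that the type system cannot eliminate.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: strips trailing zero bits from a non-zero value.
pub fn odd_part(x: NonZeroU32) -> NonZeroU32 {
    // Shifting out only zero bits leaves the lowest set bit in place,
    // so the result is odd and therefore non-zero.
    let shifted = x.get() >> x.trailing_zeros();
    NonZeroU32::new(shifted).expect("shifting out trailing zeros keeps a set bit")
}
```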

I acknowledge it's not a great argument, but the core integer functions are composable in a way that these proposed ones don't seem to be.

As it were, there's checked_add, but not checked_sub.

Note that there is no similar function on NonZeroI32. The reason, I believe, is that otherwise it's not clear whether the function returned None due to overflow, or due to a zero result.

Unwrapping the inner i32, adding them with overflow checks, and trying to convert to a NonZeroI32 is functionally equivalent to the potential checked_add method. It's slightly more verbose, but also more granular (overflow vs zero result). On the other hand, with NonZeroU32 the sum is never 0, so the final NonZeroU32::new call would be infallible. There's nothing the consumer could do with it other than unwrap, which is poor practice. So a separate checked_add method is more ergonomic, more correct, doesn't lose any information, and is semantically compatible with u32::checked_add (None means overflow).
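
A sketch of the manual signed version described above, keeping the two failure modes apart. The `AddError` enum and the function name are hypothetical.

```rust
use std::num::NonZeroI32;

/// Hypothetical error type separating the two ways the sum can fail.
#[derive(Debug, PartialEq, Eq)]
pub enum AddError {
    Overflow,
    Zero,
}

/// Hypothetical helper: unwrap, add with an overflow check, re-wrap.
pub fn add_nonzero_i32(a: NonZeroI32, b: NonZeroI32) -> Result<NonZeroI32, AddError> {
    let sum = a.get().checked_add(b.get()).ok_or(AddError::Overflow)?;
    NonZeroI32::new(sum).ok_or(AddError::Zero)
}
```

For `NonZeroU32` the `Zero` arm could never be reached, which is why a single `Option`-returning `checked_add` loses nothing there.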

It certainly makes sense that NZIN would omit checked_add/_sub, since it has a hole at 0. There's no hole involved with NZUN; it's a continuous range 1..=MAX.

checked_sub is likely still missing because it is not immediately clear whether the "correct" signature would return Option<NZUN> or something like Result<NZUN, (0_uN | None)>, and because either semantic is relatively simple to write by hand while being explicit about what's desired: lhs.get().checked_sub(rhs.get()).and_then(NZUN::new) for the former, or the same chain with .map(NZUN::new) to keep the zero case distinct.

Personally, because of the domain of NZUN, I think it's easy to justify checked_sub existing and returning just Option<NZUN>. There's only one overflow condition from the continuous domain, even though you could differentiate between the zero and further overflow cases; I don't think there's much risk of anyone assuming that None means exactly zero the way it does from ::new. Combined with the mirroring of uN's API making a differing return type highly unlikely, I think it's clear that NZUN::checked_sub(…) -> Option<Self> is both desirable and the correct API.
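
The two candidate semantics can be sketched as free functions for `NonZeroU32`. Both names are hypothetical; neither is an existing std method.

```rust
use std::num::NonZeroU32;

/// Hypothetical: `None` means the result left the domain `1..=MAX`,
/// whether it hit zero or went below it.
pub fn checked_sub_nz(lhs: NonZeroU32, rhs: NonZeroU32) -> Option<NonZeroU32> {
    lhs.get().checked_sub(rhs.get()).and_then(NonZeroU32::new)
}

/// Hypothetical: the outer `None` is underflow below zero, while
/// `Some(None)` is an exact zero result.
pub fn checked_sub_nested(lhs: NonZeroU32, rhs: NonZeroU32) -> Option<Option<NonZeroU32>> {
    lhs.get().checked_sub(rhs.get()).map(NonZeroU32::new)
}
```

The first form matches the shape of `u32::checked_sub`; the second keeps the extra information at the cost of a nested `Option`.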

It's just that nobody's put in the effort to argue for adding it. The _add and _mul only got stabilized in 1.64, and only got added in the first place because iago-lito made an unprompted PR to add them. Notably, in that PR it's noted that _sub on NZUN could make sense but was being deferred to a later PR since it's less unambiguously just about the bit limits of the numerics.

IMHO, checked_sub would get accepted after a bit of discussion; it's just on someone to care enough to make the PR and drive that discussion.

Sure, you can argue that overflow x < 0 isn't that different from overflow x < 1. But it doesn't feel right to add methods with the same name, same semantics, but different signatures (which are very likely to be different on NonZeroI32 and NonZeroU32). This makes it harder to learn, harder to use for generic programming, including macros, and just generally more confusing.

This is a problem which requires a new approach and an entirely different proper generic solution. In the long run, adding a patchwork of different methods plugging opportunity holes in the API of those types would just lead to an incomprehensible mess.
