Pre-RFC: ergonomics around NonZeroU* and literals


#1

I’ve been excited about the introduction of NonZeroUsize and friends in Rust 1.28 - they’re efficient and prevent safety concerns around e.g. divisions by zero, AND prevent APIs from being used the wrong way. Great! However, I feel somewhat strongly that Rust is missing a piece here to make non-zero types be really really useful / ergonomic. Hope this discussion can get us closer to that missing piece.

Background: Over the past few months I’ve designed and used some APIs based on non-zero types (e.g. the tests on fibonacci_codec and this warning in ratelimit_meter). Unfortunately, they’re pretty hard to use correctly: they incur either an unnecessary runtime check in Debug mode (see the test case above; thankfully, Release builds optimize the check away, but if you accidentally use 0, they optimize to a panic) or an unwieldy unsafe block.

I came up with a work-around in nonzero_ext, and while that feels better, it is still a bit unwieldy, and the error messages are not very good. I think it would make sense both to make the compiler smarter and to allow users to specify non-zero literals directly, so I’ve written up a draft RFC here:

https://gist.github.com/antifuchs/7530075de2c2e894b97300dfb3ac9920 - my very first rust (draft) RFC!

It’s a writeup of what I think is the closest step toward a much better state from here: a compile-time-checked way to specify literals that’s much less verbose than what we have now. Many alternative approaches exist! Please comment! (-:

(Edit: A kind reader pointed out that it would be good to have here a copy of the Summary & what it’s about! I’ll paste the Summary & guide-level explanation sections so you don’t have to read through all the above)

Summary

Add an extension to the INTEGER_LITERAL syntax that allows users to specify literals as non-zero unsigned integers. We introduce a new INTEGER_SUFFIX that starts with n to indicate non-zero literals. These literals get checked at compile-time to ensure they are not zero.

Guide-level explanation

When using APIs that specify non-zero unsigned integer types, code passing integer literals (like 20) to these APIs needs to assert that those literals are not zero.

This can be achieved in safe code by converting a u32 integer to a NonZeroU32 integer. Given the definition for a division function for unsigned integers that can never divide by zero:

fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 { dividend / divisor.get() }

We’d call the function like so:

divide_by(100, NonZeroU32::new(20).expect("inconceivable!"))

However, this long-form conversion performs the check that 20 is not zero at run time, so if somebody should accidentally delete the 2 while editing, the program will still compile and panic at run time.
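A minimal sketch of the run-time behaviour described above, using only the standard library. The helper name divide_by_raw is an assumption made for illustration and is not part of the draft RFC; it surfaces the zero case as None instead of panicking.

```rust
use std::num::NonZeroU32;

/// Division that can never divide by zero, as defined in the text above.
fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}

/// Hypothetical helper (not from the RFC): performs the zero check at
/// run time and returns `None` instead of panicking via `expect`.
fn divide_by_raw(dividend: u32, raw_divisor: u32) -> Option<u32> {
    // `NonZeroU32::new` returns `None` when given 0.
    NonZeroU32::new(raw_divisor).map(|divisor| divide_by(dividend, divisor))
}
```

Note that NonZeroU32::new(0) compiles without complaint; the zero is only detected when the code runs, which is the problem the n32 suffix is meant to address.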

The easier (and shorter) way to make this assertion is to use the n32 suffix on the integer literal:

divide_by(100, 20n32)

It is a compile-time error to use the literal zero with an n suffix, so no edits or slips of the finger will result in an accidentally compiling program that errors at run time.


#2

Most definitely an interesting idea.

Would the n* suffix denote signed or unsigned non-zero values? Have you disregarded the option of extending the native types u*/i* to new non-zero counterparts?


#3

Great questions!

Since the current implementation of NonZero* types applies only to unsigned integers, and because I’m unsure how a signed non-zero int would be represented (probably similarly, but I’m not sure!), I decided to punt on this and focus on the one that is easy to explain to people: “positive non-zero numbers” (“natural” numbers if you are germanly inclined, “positive” if englishly).

Making nonzeroness a marker for other type suffixes would be interesting too. However, I fear it would feel like a bit of a weird corner case: an n marker for u32 as u32n seems cool, but I’m not sure what to make of usizen, especially if anyone introduces another letter-only type that ends in “n” (:


#4

This is a very specific solution. I’d like a much more generic solution based on static preconditions like in Ada:

Given that, nonzero literals become just one specific case. Rust should try to implement generic solutions.


#5

Division is the natural use case for non-zero numbers, and since it is legal to divide by negative numbers, n* must be able to be negative as well as positive.


#6

I strongly oppose the idea of wiring library types directly into the core language by means of literal suffixes.

We should instead focus more on const fns to allow writing a safe const constructor which can check the value of the argument at compilation time whenever possible. Pseudo-Rust (what I would like to be able to write):

impl NonZeroUsize {
    const fn new_const(const x: usize) -> Self { // name is up to debate…
        static_assert!(x != 0);
        NonZeroUsize(x)
    }
}

This wouldn’t require adding runtime expect() calls, just a normal function call which directly produces a NonZero*, and fails to compile if its argument is zero.

In the meantime, on nightly you can write a macro like this:

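#![feature(const_let)] // for the static_assertions crate

use std::num::NonZeroUsize;

macro_rules! nz {
    ($e:expr) => ({
        // Rejects a zero argument at compile time.
        const_assert!($e != 0);
        // Sound only because the assertion above rules out zero.
        unsafe { NonZeroUsize::new_unchecked($e) }
    })
}

const X: NonZeroUsize = nz!(1);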
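
For comparison, here is a hedged sketch of a similar compile-time check without unsafe or an external crate. It assumes a compiler recent enough to allow panic! inside a const fn (to my knowledge, stable since Rust 1.57). The names nz_const and TWENTY are made up for illustration.

```rust
use std::num::NonZeroUsize;

/// Hypothetical helper: converts a `usize` to `NonZeroUsize`, panicking on zero.
/// When called in a const context, the panic becomes a compile-time error.
const fn nz_const(x: usize) -> NonZeroUsize {
    match NonZeroUsize::new(x) {
        Some(n) => n,
        None => panic!("value must be non-zero"),
    }
}

// Evaluated at compile time; changing 20 to 0 here fails the build.
const TWENTY: NonZeroUsize = nz_const(20);
```

The check is only guaranteed to happen at compile time when the call sits in a const context, as with TWENTY above. Called with a run-time value, nz_const(0) would still panic at run time.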

#7

I would like some improvement here, but don’t think that suffixes in the language are the way to go. That’s mostly because there are a huge number of possible restrictions here: signed without the min (so they’re symmetric), unsigned without the max, unsigned without the high bit set (what allocation sizes are actually allowed to be), etc. That’s too many things to make nice suffixes for.

Instead, I’d rather something like coercion from {integer} literals. With type ascription hopefully showing up soon, I think 0:u8 is way nicer than 0u8 anyway. And ascription extends easily to other cases: 1:NonZeroUsize, "Hello":String, 1.1:NonNanF32, and just divide_by(100, 20) – that last one with no ascription necessary since it’s already a coercion point.

(I don’t know a good design for how to expose this, however.)


#8

A similar problem comes up when you want to wrap floats to be non-NaN. I’d very much like to see some generalized literal-to-newtype feature.
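
To illustrate the parallel, here is a minimal sketch of such a float wrapper. NonNanF32 is not a standard library type; the name is borrowed from the post above and the API is assumed for illustration. As with NonZeroU32::new, the check on a literal happens at run time.

```rust
/// Hypothetical newtype: an `f32` that is guaranteed not to be NaN.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct NonNanF32(f32);

impl NonNanF32 {
    /// Returns `None` if `x` is NaN, mirroring `NonZeroU32::new`.
    pub fn new(x: f32) -> Option<Self> {
        if x.is_nan() {
            None
        } else {
            Some(NonNanF32(x))
        }
    }

    /// Returns the wrapped value.
    pub fn get(self) -> f32 {
        self.0
    }
}
```

A literal such as 1.1 still has to go through NonNanF32::new(1.1).expect(...) at the call site, which is the same ergonomic gap as in the integer case.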


#9

I don’t think there are any plans to allow static assertions to depend on generic arguments, because that would result in monomorphization-time errors (i.e. errors that could be delayed all the way to link time through multiple layers of crates’ generic functions). A better way to add this would be to support generic bounds on values, e.g. (using the same const-in-runtime-position extension):

impl NonZeroUsize {
    const fn new_const(const x: usize) -> Self where x != 0 {
        NonZeroUsize(x)
    }
}

That would require all generic users of the function to propagate the bounds out so that you will get an appropriate error message at the point where you try to call it without proof that x != 0.
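
The where x != 0 bound above is not valid Rust. For contrast, here is a sketch of the monomorphization-time variant that this post argues against, written with const generics and an associated const. The names AssertNonZero and nz_generic are assumptions for illustration. Nothing in the signature of nz_generic tells callers that N must be non-zero; the error only appears once a concrete N is instantiated.

```rust
use std::num::NonZeroUsize;

/// Hypothetical helper type whose associated const fails to evaluate when N == 0.
struct AssertNonZero<const N: usize>;

impl<const N: usize> AssertNonZero<N> {
    // Evaluated once per concrete N, i.e. at monomorphization time.
    const OK: () = assert!(N != 0, "N must be non-zero");
}

/// Returns N as a `NonZeroUsize`; instantiating this with N = 0 fails to compile.
fn nz_generic<const N: usize>() -> NonZeroUsize {
    // Referencing the const forces the assertion to be evaluated for this N.
    let () = AssertNonZero::<N>::OK;
    match NonZeroUsize::new(N) {
        Some(n) => n,
        None => unreachable!(),
    }
}
```

A generic caller that forwards its own N to nz_generic compiles fine in isolation, and the failure surfaces only at whichever downstream call site picks N = 0. That is the delayed-error behaviour that propagated bounds would avoid.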


#10

That sounds absolutely lovely, especially because it avoids having to look at the body of the function in order to compile it, which I’m not a fan of but I couldn’t think of anything better in this case. It’s also naturally consistent with the notion of const generics being about reasoning with compile-time values (rather than types). It’s a much cleaner and more general solution.


#11

I imagine that most of the pain you have encountered constructing these values is particular to the fact that you are writing unit tests and documentation examples. These are types of code that have an unusually high concentration of literals. I doubt that most user code will see the same benefits.

I definitely don’t think it is worth making this a part of the language.