Ranged integers for performance

Hi, here I show an example usage of ranged integers to speed up code keeping full safety. This is a basic permutations iterator similar to the Python itertools.permutations, but it yields arrays instead of tuples:

#![feature(generators, iter_from_generator)]
use std::{iter::from_generator, array::from_fn};

fn permutations_array<T, const N: usize>
                     (data: &[T; N]) -> impl Iterator<Item=[&T; N]> {
    from_generator(move || {
        if N == 0 { return; }
        let mut perm = from_fn(|i| i);
        yield perm.map(|i| &data[i]); //1

        loop {
            let mut i = N - 1;
            while perm[i - 1] >= perm[i] {
                i -= 1;
                if i < 1 { return; }
            }
            let mut j = N;
            while perm[j - 1] <= perm[i - 1] { j -= 1; }
            perm.swap(i - 1, j - 1);
            i += 1;
            j = N;
            perm[i - 1 .. j].reverse();
            yield perm.map(|i| &data[i]); //2
        }
    })
}

fn main() {
    let items: [_; 12] = from_fn(|i| i as u8 * 10);
    println!("{}", permutations_array(&items).count());
}

With N=12 it prints 479_001_600, that is 12!. If I replace the second line:

yield perm.map(|i| &data[i]); //2

With a similar but unsafe version:

unsafe { yield perm.map(|i| data.get_unchecked(i)); }

The run-time on my PC with N=12 goes from 2.35 seconds to just 1.24 seconds.

I believe Ada language allows a similar optimization, but with safe code. Because you can put ranged integers inside the 'perm' array, avoiding all bound tests during those mappings. Rust could gain a very similar feature of ranged integers, that allows to replace this line:

let mut perm = from_fn(|i| i);

With something like:

type Id = usize[.. N];
let mut perm = from_fn(|i| Id::try_from(i).unwrap());

(Currently you can create a similar const generics wrapper for array indexes, but giving it a very good ergonomy is lot of work, and I hit some problems).

2 Likes

This makes me think of refinement types, which are another potential way of solving this, though doing so would likely require some compiler support.

Given a type T, refinement types are types that are used to restrict the value space of T. And that sounds a lot like a more generic version of ranged integers.

How they work is basically you define your type T, and then you have 3 additional pieces: a Refinement type and a predicate. The idea is to leverage a correct-by-construction pattern, using predicate when constructing values of type T, and erroring out if the predicate fails. That way, any value of Refinement<T, P> is guaranteed to adhere to the predicate.

Using refinement types can be done today with refinement. The only part that wouldn't work is the compiler optimization, which is why I say it would likely require some special support from rustc.

AFAIK this has already been discussed a couple of times, but no definite design has been produced yet. See for example:

The problems start to appear when you want to do operations on that type. If you add 1 to a usize[.. N] should you get a usize[1 .. N+1]? Or an Option<usize[.. N]>? Maybe just an usize? Keeping track of these invariants also quickly becomes quite a lot of work for the compiler and for who needs to implement the feature.

ps: Regarding your specific piece of code, it can be optimized with this weird trick:

fn permutations_array<T, const N: usize>
                     (data: &[T; N]) -> impl Iterator<Item=[&T; N]> {
    from_generator(move || {
        if N == 0 { return; }
-       let mut perm = from_fn(|i| i);
-       yield perm.map(|i| &data[i]); //1
+       let mut perm = from_fn(|i| &data[i]);
+       yield perm; //1

        loop {
            let mut i = N - 1;
-           while perm[i - 1] >= perm[i] {
+           while perm[i - 1] as *const _ >= perm[i] as *const _ {
                i -= 1;
                if i < 1 { return; }
            }
            let mut j = N;
-           while perm[j - 1] <= perm[i - 1] { j -= 1; }
+           while perm[j - 1] as *const _ <= perm[i - 1] as *const _ { j -= 1; }
            perm.swap(i - 1, j - 1);
            i += 1;
            j = N;
            perm[i - 1 .. j].reverse();
-           yield perm.map(|i| &data[i]); //2
+           yield perm; //2
        }
    })
}

Or, alternatively, you could define something like this, limiting the unsafe to a very small interface (which is the very point of unsafe!):

mod bounded {
    #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    pub struct Bounded<const N: usize>(usize);
    
    impl<const N: usize> Bounded<N> {
        pub fn new(n: usize) -> Self {
            assert!(n < N);
            Self(n)
        }
        pub fn into_usize(&self) -> usize {
            let n = self.0;
            if !(n < N) {
                unsafe { std::hint::unreachable_unchecked() };
            }
            n
        }
    }
}
3 Likes

I think this is the thing that really needs to be addressed. Because num::Wrapping and a hypothetical num::NonNanF32 need the same kinds of ergonomic improvements.

1 Like

jjpe:

refinement types could be rather useful for Rust, but they are something much more complex compared to the ranged integers. ranged integers are upward compatible with refinement typing. Arbitrary bit-width integers are something sufficiently different from ranged integers (and I don't need them).

SkiFire13:

The problems start to appear when you want to do operations on that type. If you add 1 to a usize[.. N] should you get a usize[1 .. N+1]?<

You get the same type usize[.. N] or a run-time error, just like when you index arrays (safer functions that return an Option<usize[.. N]> could be added, but they are for later. My index wrapper has a succ() function that returns an Option).

ps: Regarding your specific piece of code, it can be optimized with this weird trick:

Your version of the code has an intermediate run time, 1.54 seconds (the other two timings were 2.35 and 1.24). In Rust there are often ways to solve performance/safety problems for tiny examples. But strong typing shines when you have large(r) amounts of code.

Or, alternatively, you could define something like this,

I have created a rather more refined index, implemented generically for arrays, but it has various downsides compared to a type system feature.

Can you elaborate on how this thread is different from your previous ones, like these?

Refinement types wouldn't alter the size of the original type. Instead a refinement type makes it illegal to construct some values of the original type, eg Refinement<usize, IsOdd> is still the same size as usize, it's just that in this example even numbers would fail to construct.

I'm inclined to say that yes refinement types are more general than ranged integers, which is why I thought of them. Optimizations related to refinement types might be useful for more than just range checks elimination. And as stated before, the rest of refinement types (i.e. the functional, non-optimization part) already works on stable today, which means that if this were used, less specific infrastructure specific to the compiler would have to be written.

scottmcm:

Can you elaborate on how this thread is different from your previous ones, like these?

It's an important topic, worth repeating from various angles, safety, performance, clarity of the code, to replace some code comments with types, to help future Rust proof systems, so let's help give it a bit of visibility :slight_smile: Some people seem interested in just performance, others in mostly correctness, etc.

jjpe:

Refinement types wouldn't alter the size of the original type.

Right.

I'm inclined to say that yes refinement types are more general than ranged integers,

Right. But they are far more complex to implement (see below).

And as stated before, the rest of refinement types (i.e. the functional, non-optimization part) already works on stable today.

I strongly prefer them to be handled at compile-time (by STM solvers, etc).

They really aren't. Have a look at the refinement crate, you can read through it in perhaps 15 minutes. The bulk of it is just a trait and a type.

Nothing in my answer suggests doing otherwise. I but before there is support, using it directly might be helpful.

jjpe:

They really aren't. Have a look at the refinement crate, you can read through it in perhaps 15 minutes. The bulk of it is just a trait and a type.

For reasonable refinement types have a look at Liquid Haskell: https://ucsd-progsys.github.io/liquidhaskell-blog/

But this thread is about having built-in ranged types, their relative simplicity in both syntax and semantics should allow them to be used more often in normal code by normal Rust programmers, unlike Liquid Haskell that still has rather niche usage.

Liquid Types go way further than refinement types. They're not the same thing. Liquid Types are like refinement types where the predicate is on steroids and needs a sat solver to evaluate. They're also complete overkill for the topic of this thread.

1 Like

I definitely want ranged integers with compiler support at some point. The niche values would be wonderful, not to mention the obvious benefits from an API perspective.

1 Like

I've been using no-niche ranged integer and arrays in my crate. See tex-rs/pascal.rs at master · crlf0710/tex-rs · GitHub

Heh, might want to check out the deranged crate!

Currently still tying to typenum version and const generics-based version doesn't work for me yet, because i need plusing and minusing sometimes lol. Will migrate when that's ready.

Fair enough! It's a limitation of const generics for sure.

But anyways, let's not derail the thread :slight_smile: (yes, it was my doing)

1 Like