Ranged integers for performance

leonardo · August 4, 2022, 12:53pm

Hi, here I show an example usage of ranged integers to speed up code keeping full safety. This is a basic permutations iterator similar to the Python itertools.permutations, but it yields arrays instead of tuples:

#![feature(generators, iter_from_generator)]
use std::{iter::from_generator, array::from_fn};

fn permutations_array<T, const N: usize>
                     (data: &[T; N]) -> impl Iterator<Item=[&T; N]> {
    from_generator(move || {
        if N == 0 { return; }
        let mut perm = from_fn(|i| i);
        yield perm.map(|i| &data[i]); //1

        loop {
            let mut i = N - 1;
            while perm[i - 1] >= perm[i] {
                i -= 1;
                if i < 1 { return; }
            }
            let mut j = N;
            while perm[j - 1] <= perm[i - 1] { j -= 1; }
            perm.swap(i - 1, j - 1);
            i += 1;
            j = N;
            perm[i - 1 .. j].reverse();
            yield perm.map(|i| &data[i]); //2
        }
    })
}

fn main() {
    let items: [_; 12] = from_fn(|i| i as u8 * 10);
    println!("{}", permutations_array(&items).count());
}

With N=12 it prints 479_001_600, that is 12!. If I replace the second line:

yield perm.map(|i| &data[i]); //2

With a similar but unsafe version:

unsafe { yield perm.map(|i| data.get_unchecked(i)); }

The run-time on my PC with N=12 goes from 2.35 seconds to just 1.24 seconds.

I believe Ada language allows a similar optimization, but with safe code. Because you can put ranged integers inside the 'perm' array, avoiding all bound tests during those mappings. Rust could gain a very similar feature of ranged integers, that allows to replace this line:

let mut perm = from_fn(|i| i);

With something like:

type Id = usize[.. N];
let mut perm = from_fn(|i| Id::try_from(i).unwrap());

(Currently you can create a similar const generics wrapper for array indexes, but giving it a very good ergonomy is lot of work, and I hit some problems).

jjpe · August 4, 2022, 1:41pm

This makes me think of refinement types, which are another potential way of solving this, though doing so would likely require some compiler support.

Given a type T, refinement types are types that are used to restrict the value space of T. And that sounds a lot like a more generic version of ranged integers.

How they work is basically you define your type T, and then you have 3 additional pieces: a Refinement type and a predicate. The idea is to leverage a correct-by-construction pattern, using predicate when constructing values of type T, and erroring out if the predicate fails. That way, any value of Refinement<T, P> is guaranteed to adhere to the predicate.

Using refinement types can be done today with refinement. The only part that wouldn't work is the compiler optimization, which is why I say it would likely require some special support from rustc.

SkiFire13 · August 4, 2022, 1:56pm

AFAIK this has already been discussed a couple of times, but no definite design has been produced yet. See for example:

github.com/rust-lang/rfcs

Range types for integers (or refinement types?)

opened 07:42PM - 21 Jan 15 UTC

steveklabnik

T-lang

<a href="https://github.com/BryanQuigley"><img src="https://avatars.githubuserco…ntent.com/u/702754?v=3" align="left" width="96" height="96" hspace="10"></img></a> **Issue by [BryanQuigley](https://github.com/BryanQuigley)** _Saturday Dec 13, 2014 at 03:46 GMT_ _For earlier discussion, see https://github.com/rust-lang/rust/issues/19801_ _This issue was labelled with: A-an-interesting-project, E-hard in the Rust repository_ --- It seems like a natural extension of how variables (immutable by default, mutable if specified) are defined to allow the programmer to dictate a specific range of allowed values for an integer. If I know a value is only valid between 0-1000 the sooner I declare that the better it is for catching bugs, off by one errors, and more... I'm not sure what exact syntax would work, maybe: ``` rust let mut(0,1000) x = 0i; ``` x is only valid from 0-1000 inclusive. (Apologies if this is already possible, I've been parsing the docs trying to learn Rust.)

The problems start to appear when you want to do operations on that type. If you add 1 to a usize[.. N] should you get a usize[1 .. N+1]? Or an Option<usize[.. N]>? Maybe just an usize? Keeping track of these invariants also quickly becomes quite a lot of work for the compiler and for who needs to implement the feature.

ps: Regarding your specific piece of code, it can be optimized with this weird trick:

fn permutations_array<T, const N: usize>
                     (data: &[T; N]) -> impl Iterator<Item=[&T; N]> {
    from_generator(move || {
        if N == 0 { return; }
-       let mut perm = from_fn(|i| i);
-       yield perm.map(|i| &data[i]); //1
+       let mut perm = from_fn(|i| &data[i]);
+       yield perm; //1

        loop {
            let mut i = N - 1;
-           while perm[i - 1] >= perm[i] {
+           while perm[i - 1] as *const _ >= perm[i] as *const _ {
                i -= 1;
                if i < 1 { return; }
            }
            let mut j = N;
-           while perm[j - 1] <= perm[i - 1] { j -= 1; }
+           while perm[j - 1] as *const _ <= perm[i - 1] as *const _ { j -= 1; }
            perm.swap(i - 1, j - 1);
            i += 1;
            j = N;
            perm[i - 1 .. j].reverse();
-           yield perm.map(|i| &data[i]); //2
+           yield perm; //2
        }
    })
}

Or, alternatively, you could define something like this, limiting the unsafe to a very small interface (which is the very point of unsafe!):

mod bounded {
    #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    pub struct Bounded<const N: usize>(usize);
    
    impl<const N: usize> Bounded<N> {
        pub fn new(n: usize) -> Self {
            assert!(n < N);
            Self(n)
        }
        pub fn into_usize(&self) -> usize {
            let n = self.0;
            if !(n < N) {
                unsafe { std::hint::unreachable_unchecked() };
            }
            n
        }
    }
}

scottmcm · August 4, 2022, 6:50pm

I think this is the thing that really needs to be addressed. Because num::Wrapping and a hypothetical num::NonNanF32 need the same kinds of ergonomic improvements.

leonardo · August 4, 2022, 7:42pm

jjpe:

refinement types could be rather useful for Rust, but they are something much more complex compared to the ranged integers. ranged integers are upward compatible with refinement typing. Arbitrary bit-width integers are something sufficiently different from ranged integers (and I don't need them).

SkiFire13:

The problems start to appear when you want to do operations on that type. If you add 1 to a usize[.. N] should you get a usize[1 .. N+1]?<

You get the same type usize[.. N] or a run-time error, just like when you index arrays (safer functions that return an Option<usize[.. N]> could be added, but they are for later. My index wrapper has a succ() function that returns an Option).

ps: Regarding your specific piece of code, it can be optimized with this weird trick:

Your version of the code has an intermediate run time, 1.54 seconds (the other two timings were 2.35 and 1.24). In Rust there are often ways to solve performance/safety problems for tiny examples. But strong typing shines when you have large(r) amounts of code.

Or, alternatively, you could define something like this,

I have created a rather more refined index, implemented generically for arrays, but it has various downsides compared to a type system feature.

scottmcm · August 4, 2022, 8:14pm

Can you elaborate on how this thread is different from your previous ones, like these?

jjpe · August 4, 2022, 8:18pm

Refinement types wouldn't alter the size of the original type. Instead a refinement type makes it illegal to construct some values of the original type, eg Refinement<usize, IsOdd> is still the same size as usize, it's just that in this example even numbers would fail to construct.

I'm inclined to say that yes refinement types are more general than ranged integers, which is why I thought of them. Optimizations related to refinement types might be useful for more than just range checks elimination. And as stated before, the rest of refinement types (i.e. the functional, non-optimization part) already works on stable today, which means that if this were used, less specific infrastructure specific to the compiler would have to be written.

leonardo · August 4, 2022, 8:27pm

scottmcm:

Can you elaborate on how this thread is different from your previous ones, like these?

It's an important topic, worth repeating from various angles, safety, performance, clarity of the code, to replace some code comments with types, to help future Rust proof systems, so let's help give it a bit of visibility Some people seem interested in just performance, others in mostly correctness, etc.

jjpe:

Refinement types wouldn't alter the size of the original type.

Right.

I'm inclined to say that yes refinement types are more general than ranged integers,

Right. But they are far more complex to implement (see below).

And as stated before, the rest of refinement types (i.e. the functional, non-optimization part) already works on stable today.

I strongly prefer them to be handled at compile-time (by STM solvers, etc).

jjpe · August 4, 2022, 8:53pm

They really aren't. Have a look at the refinement crate, you can read through it in perhaps 15 minutes. The bulk of it is just a trait and a type.

Nothing in my answer suggests doing otherwise. I but before there is support, using it directly might be helpful.

leonardo · August 4, 2022, 9:12pm

jjpe:

They really aren't. Have a look at the refinement crate, you can read through it in perhaps 15 minutes. The bulk of it is just a trait and a type.

For reasonable refinement types have a look at Liquid Haskell: https://ucsd-progsys.github.io/liquidhaskell-blog/

But this thread is about having built-in ranged types, their relative simplicity in both syntax and semantics should allow them to be used more often in normal code by normal Rust programmers, unlike Liquid Haskell that still has rather niche usage.

jjpe · August 4, 2022, 9:44pm

Liquid Types go way further than refinement types. They're not the same thing. Liquid Types are like refinement types where the predicate is on steroids and needs a sat solver to evaluate. They're also complete overkill for the topic of this thread.

jhpratt · August 5, 2022, 4:21am

I definitely want ranged integers with compiler support at some point. The niche values would be wonderful, not to mention the obvious benefits from an API perspective.

crlf0710 · August 5, 2022, 4:40am

I've been using no-niche ranged integer and arrays in my crate. See tex-rs/pascal.rs at master · crlf0710/tex-rs · GitHub

jhpratt · August 5, 2022, 4:41am

Heh, might want to check out the deranged crate!

crlf0710 · August 5, 2022, 4:54am

Currently still tying to typenum version and const generics-based version doesn't work for me yet, because i need plusing and minusing sometimes lol. Will migrate when that's ready.

jhpratt · August 5, 2022, 4:55am

Fair enough! It's a limitation of const generics for sure.

But anyways, let's not derail the thread (yes, it was my doing)

system · November 3, 2022, 4:56am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
No sane way to generate [1., 1.5, 2., 2.5].iter().cloned()	10	4099	January 24, 2022
Pi types v2	2	1922	March 25, 2019
Some examples of simple stronger static typing language design	8	1620	March 25, 2019
More on ranged integers language design	4	3065	March 25, 2019
Feature idea: impl Iterator for tuple (1,2..n) as range with step	10	1794	April 15, 2021

Ranged integers for performance

Related topics