Array bound tests 2

leonardo · June 8, 2020, 11:10am

I've decided to write this post after this Reddit thread:

A little Rust program that computes a Hamming Number:

use std::num::NonZeroUsize;

fn hamming(n: NonZeroUsize) -> u64 {
    let n = n.get();
    let mut h = vec![0; n];
    h[0] = 1;

    let mut x2 = 2;
    let mut x3 = 3;
    let mut x5 = 5;
    let mut i = 0;
    let mut j = 0;
    let mut k = 0;

    for idx in 1 .. n {
        h[idx] = x2.min(x3).min(x5);
        if h[idx] == x2 {
            i += 1;
            x2 = 2 * h[i];
        }
        if h[idx] == x3 {
            j += 1;
            x3 = 3 * h[j];
        }
        if h[idx] == x5 {
            k += 1;
            x5 = 5 * h[k];
        }
    }
    h[n - 1]
}

fn main() {
    let nz = |x| NonZeroUsize::new(x).unwrap();
    // See https://github.com/rust-lang/rust/issues/73121
    println!("{:?}", (1 .. 21).map(|i| hamming(nz(i))).collect::<Vec<_>>());
    println!("{:?}", hamming(nz(1_691)));
}

LLVM of the the last Nightly rustc with "best" compilation flags leaves about five panic_bounds_check:

.LBB3_9:
    lea     rdx, [rip + .L__unnamed_1]
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
.LBB3_15:
    lea     rdx, [rip + .L__unnamed_2]
    mov     rdi, rax
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
.LBB3_19:
    lea     rdx, [rip + .L__unnamed_3]
    mov     rdi, rax
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
.LBB3_23:
    lea     rdx, [rip + .L__unnamed_4]
    mov     rdi, rax
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
.LBB3_12:
    ud2
.LBB3_32:
    call    alloc::raw_vec::RawVec<T,A>::allocate_in::{{closure}}
    ud2
.LBB3_33:
    mov     rdi, rsp
    call    alloc::raw_vec::RawVec<T,A>::allocate_in::{{closure}}
    ud2
.LBB3_11:
    lea     rdx, [rip + .L__unnamed_5]
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
    mov     rbx, rax
    mov     rdi, rsp
    call    core::ptr::drop_in_place
    mov     rdi, rbx
    call    _Unwind_Resume@PLT
    ud2

I could rewrite the code in WhyML ( http://why3.lri.fr/try/ ), and even the single and limited online solver is able to discard all array accesses as safe (so a Rust compiler can replace them with get_unchecked/get_unchecked_mut). This code has extra invariants because I've added an ensures{} to also verify a small amount of functional correctness (if the result isn't zero then the dynamic programming computation isn't too much botched, considering the array contains only the result of chains of multiplications):

module Hamming
    use int.Int
    use ref.Ref
    use array.Array
    use int.MinMax
    
    let hamming (n: int) : int =
        requires { n > 0 }
        ensures { result > 0 }
        let h = make n 0 in
        h[0] <- 1;
    
        let x2 = ref 2 in
        let x3 = ref 3 in
        let x5 = ref 5 in
    
        let i = ref 0 in
        let j = ref 0 in
        let k = ref 0 in
    
        for idx = 1 to n - 1 do
            invariant { 0 <= !i < n /\ !i < idx }
            invariant { 0 <= !j < n /\ !j < idx }
            invariant { 0 <= !k < n /\ !k < idx }
            invariant { !x2 > 0 /\ !x3 > 0 /\ !x5 > 0 }
            invariant { forall i: int. 0 <= i < idx -> h[i] > 0 }
    
            let aux = min !x2 (min !x3 !x5) in
            h[idx] <- aux;
            if aux = !x2 then begin
                i := !i + 1;
                x2 := 2 * h[!i];
            end;
            if aux = !x3 then begin
                j := !j + 1;
                x3 := 3 * h[!j];
            end;
            if aux = !x5 then begin
                k := !k + 1;
                x5 := 5 * h[!k];
            end;
        done;
        h[n - 1]
    end

In WhyML integers aren't fixed-size, this makes proofs simpler. Adding a similar proof system to Rust is harder because most numbers are fixed size. Allowing to prove everything that's allowed in Rust is probably of little use and/or too much hard. As ADA-SPARK shows it's OK to allow only a subset of Rust features for the functions that you want to prove some invariants for (and as years pass the subset of Ada allowed in SPARK code grows).

CAD97 · June 8, 2020, 6:03pm

Playing around with inserting some assume (&& was not actually used, instead using multiple assume; && is used here to save space):

On the playground, I started with six (6) callqs to core::panicking::panic_bounds_check@GOTPCREL(%rip). One for each index.
Adding an assume(n>0) for the statically guaranteed nonzero n reduced this to five (5). (#49572)
Adding assume(i<n && j<n && k<n) at the end of the function is not sufficient to eliminate any further bounds checks.
Adding assume(idx<n && i<idx && j<idx && k<idx) to the start of the loop is not sufficient to eliminate any further bounds checks.
Adding assume(idx<n && i<n-1 && j<n-1 && k<n-1) to the start of the loop is not sufficient to eliminate any further bounds checks.
From this we can assume (no pun intended) that LLVM is (currently) unable/unwilling to prove that any of these invariants are sufficient to prevent out-of-bounds access.
The first optimization barrier seems to be that LLVM does not know that h does not change length during execution of the function. Adding let h = &mut *h; reduces this to three (3) calls to panic_bounds_check. (.into_boxed_slice() also reduces the bounds checking equivalently but for some reason adds a panic check for trying to shrink the vector into a larger slice.)
At this point, the three insufficient assumptions above are still insufficient.
From this, we can assume (no pun intended) that LLVM is (currently) unable/unwilling to prove that any of those assumptions are sufficient to show that no out-of-bounds access occurs. (As a corallary to this, the invariants provided in the WhyML are likely also insufficient for LLVM's bounds check elimination.)
An assume(i<n) (and for j, k) directly after the increment is sufficient to eliminate the remaining three bounds checks and unroll/vectorize the loop.
At this point, removing assume(n>0) seems to have no effect, likely because fn hamming is entirely inlined into fn main.

[playground]

(I did this on the playground rather than godbolt because godbolt is still currently unusable on mobile.)

leonardo · June 8, 2020, 7:09pm

Thank you for your interesting experiment (even if I don't fully understand what your purpose was).

Is this worth writing a bug report for?

I was partially aware of this problem (the Why3 documentation has a note about this) but I will have to remember this and use the idea in Rust, perhaps it will improve my code.

CAD97 · June 8, 2020, 10:04pm

It was interesting to do, nothing much more than that, really.

I did a little bit of checking on godbolt and... yes, definitely, this is a stable-to-stable regression between 1.43 and 1.44. Filed as #73152.

system · September 6, 2020, 10:04pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Bounds checking at initialization ideas (deprecated)	10	3720	March 25, 2019
Array bound tests	2	611	August 25, 2020
Replace array bounds checking with modulo? ideas (deprecated)	20	5355	March 25, 2019
Compile-time array indexing bounds checks for constants language design	6	1451	February 14, 2022
Indexing into a slice with a known length	9	874	April 11, 2023

Array bound tests 2

Related topics