Array bound tests 2

I've decided to write this post after this Reddit thread:

A little Rust program that computes a Hamming Number:

use std::num::NonZeroUsize;

fn hamming(n: NonZeroUsize) -> u64 {
    let n = n.get();
    let mut h = vec![0; n];
    h[0] = 1;

    let mut x2 = 2;
    let mut x3 = 3;
    let mut x5 = 5;
    let mut i = 0;
    let mut j = 0;
    let mut k = 0;

    for idx in 1 .. n {
        h[idx] = x2.min(x3).min(x5);
        if h[idx] == x2 {
            i += 1;
            x2 = 2 * h[i];
        }
        if h[idx] == x3 {
            j += 1;
            x3 = 3 * h[j];
        }
        if h[idx] == x5 {
            k += 1;
            x5 = 5 * h[k];
        }
    }
    h[n - 1]
}

fn main() {
    let nz = |x| NonZeroUsize::new(x).unwrap();
    // See https://github.com/rust-lang/rust/issues/73121
    println!("{:?}", (1 .. 21).map(|i| hamming(nz(i))).collect::<Vec<_>>());
    println!("{:?}", hamming(nz(1_691)));
}

LLVM of the the last Nightly rustc with "best" compilation flags leaves about five panic_bounds_check:

.LBB3_9:
    lea     rdx, [rip + .L__unnamed_1]
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
.LBB3_15:
    lea     rdx, [rip + .L__unnamed_2]
    mov     rdi, rax
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
.LBB3_19:
    lea     rdx, [rip + .L__unnamed_3]
    mov     rdi, rax
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
.LBB3_23:
    lea     rdx, [rip + .L__unnamed_4]
    mov     rdi, rax
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
.LBB3_12:
    ud2
.LBB3_32:
    call    alloc::raw_vec::RawVec<T,A>::allocate_in::{{closure}}
    ud2
.LBB3_33:
    mov     rdi, rsp
    call    alloc::raw_vec::RawVec<T,A>::allocate_in::{{closure}}
    ud2
.LBB3_11:
    lea     rdx, [rip + .L__unnamed_5]
    call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
    jmp     .LBB3_12
    mov     rbx, rax
    mov     rdi, rsp
    call    core::ptr::drop_in_place
    mov     rdi, rbx
    call    _Unwind_Resume@PLT
    ud2

I could rewrite the code in WhyML ( http://why3.lri.fr/try/ ), and even the single and limited online solver is able to discard all array accesses as safe (so a Rust compiler can replace them with get_unchecked/get_unchecked_mut). This code has extra invariants because I've added an ensures{} to also verify a small amount of functional correctness (if the result isn't zero then the dynamic programming computation isn't too much botched, considering the array contains only the result of chains of multiplications):

module Hamming
    use int.Int
    use ref.Ref
    use array.Array
    use int.MinMax
    
    let hamming (n: int) : int =
        requires { n > 0 }
        ensures { result > 0 }
        let h = make n 0 in
        h[0] <- 1;
    
        let x2 = ref 2 in
        let x3 = ref 3 in
        let x5 = ref 5 in
    
        let i = ref 0 in
        let j = ref 0 in
        let k = ref 0 in
    
        for idx = 1 to n - 1 do
            invariant { 0 <= !i < n /\ !i < idx }
            invariant { 0 <= !j < n /\ !j < idx }
            invariant { 0 <= !k < n /\ !k < idx }
            invariant { !x2 > 0 /\ !x3 > 0 /\ !x5 > 0 }
            invariant { forall i: int. 0 <= i < idx -> h[i] > 0 }
    
            let aux = min !x2 (min !x3 !x5) in
            h[idx] <- aux;
            if aux = !x2 then begin
                i := !i + 1;
                x2 := 2 * h[!i];
            end;
            if aux = !x3 then begin
                j := !j + 1;
                x3 := 3 * h[!j];
            end;
            if aux = !x5 then begin
                k := !k + 1;
                x5 := 5 * h[!k];
            end;
        done;
        h[n - 1]
    end

In WhyML integers aren't fixed-size, this makes proofs simpler. Adding a similar proof system to Rust is harder because most numbers are fixed size. Allowing to prove everything that's allowed in Rust is probably of little use and/or too much hard. As ADA-SPARK shows it's OK to allow only a subset of Rust features for the functions that you want to prove some invariants for (and as years pass the subset of Ada allowed in SPARK code grows).

4 Likes

Playing around with inserting some assume (&& was not actually used, instead using multiple assume; && is used here to save space):

  • On the playground, I started with six (6) callqs to core::panicking::panic_bounds_check@GOTPCREL(%rip). One for each index.
  • Adding an assume(n>0) for the statically guaranteed nonzero n reduced this to five (5). (#49572)
  • Adding assume(i<n && j<n && k<n) at the end of the function is not sufficient to eliminate any further bounds checks.
  • Adding assume(idx<n && i<idx && j<idx && k<idx) to the start of the loop is not sufficient to eliminate any further bounds checks.
  • Adding assume(idx<n && i<n-1 && j<n-1 && k<n-1) to the start of the loop is not sufficient to eliminate any further bounds checks.
  • From this we can assume (no pun intended) that LLVM is (currently) unable/unwilling to prove that any of these invariants are sufficient to prevent out-of-bounds access.
  • The first optimization barrier seems to be that LLVM does not know that h does not change length during execution of the function. Adding let h = &mut *h; reduces this to three (3) calls to panic_bounds_check. (.into_boxed_slice() also reduces the bounds checking equivalently but for some reason adds a panic check for trying to shrink the vector into a larger slice.)
  • At this point, the three insufficient assumptions above are still insufficient.
  • From this, we can assume (no pun intended) that LLVM is (currently) unable/unwilling to prove that any of those assumptions are sufficient to show that no out-of-bounds access occurs. (As a corallary to this, the invariants provided in the WhyML are likely also insufficient for LLVM's bounds check elimination.)
  • An assume(i<n) (and for j, k) directly after the increment is sufficient to eliminate the remaining three bounds checks and unroll/vectorize the loop.
  • At this point, removing assume(n>0) seems to have no effect, likely because fn hamming is entirely inlined into fn main.

[playground]

(I did this on the playground rather than godbolt because godbolt is still currently unusable on mobile.)

9 Likes

Thank you for your interesting experiment (even if I don't fully understand what your purpose was).

Is this worth writing a bug report for?

I was partially aware of this problem (the Why3 documentation has a note about this) but I will have to remember this and use the idea in Rust, perhaps it will improve my code.

It was interesting to do, nothing much more than that, really.

I did a little bit of checking on godbolt and... yes, definitely, this is a stable-to-stable regression between 1.43 and 1.44. Filed as #73152.

3 Likes