Possible optimization for boundary check when accessing without side effects in-between?

For example, the following code:

pub fn to_u32(buf: &[u8]) -> u32 {
    ((buf[0] as u32) << 0) +
    ((buf[1] as u32) << 8) +
    ((buf[2] as u32) << 16) +
    ((buf[3] as u32) << 24)
}

would currently be compiled to something similar to

if buf.len() == 0 { panic!("the len is 0 but index is 0"); }
if buf.len() == 1 { panic!("the len is 1 but index is 1"); }
if buf.len() <  3 { panic!("the len is 2 but index is 2"); }
if buf.len() == 3 { panic!("the len is 3 but index is 3"); }
*transmute(buf)

which is four boundary checks on the happy path.

If we add an assertion at the beginning of the original code to assert that buf.len() > 3, then we would only have a single boundary check. So this is not really zero cost.

I think the compiler is actually able to optimize it to only a single boundary check on happy path. One problem is the panic message, but that's still doable.

If the compiler can generate something like

if buf.len() > 3 { return *transmute(buf); }
if buf.len() == 0 { panic!("the len is 0 but index is 0"); }
if buf.len() == 1 { panic!("the len is 1 but index is 1"); }
if buf.len() == 2 { panic!("the len is 2 but index is 2"); }
panic!("the len is 3 but index is 3");

then we have a single boundary check but still preserve all the panic message.

Similarly, for loops:

pub fn sum_n(n: usize, arr: &[i32]) -> i32 {
    let mut sum = 0;
    for i in 0..n {
        sum += arr[i];
    }
    sum
}

rather than compiling it to

let mut sum = 0;
for i in 0..n {
    if arr.len() == i { panic!("the len is {}, but index is {}", i, i); }
    sum += arr.get_unchecked(i);
}
sum

it would be faster to be

if arr.len() > n {
    let mut sum = 0;
    for i in 0..n {
        sum += arr.get_unchecked(i);
    }
    return sum;
}
panic!("the len is {}, but index is {}", arr.len(), arr.len());

I know that we have assertions and iterator pattern, but making it optimize boundary checking better may still be beneficial for certain computation load, and it would also help people from other languages to write a faster Rust code in their first attempt :slight_smile:

WDYT?

2 Likes