Missed optimization opportunity for bounds checks

This code

fn sum(a: &[i32]) -> i32 {
    a[0] + a[1] + a[2]
}

is compiled to something with 3 bounds checks, like this:

fn sum(a: &[i32]) -> i32 {
    if a.len() == 0 {
        panic!("0 out of range");
    } else if a.len() == 1 {
        panic!("1 out of range");
    } else if a.len() == 2 {
        panic!("2 out of range");
    } else {
        unsafe { a.get_unchecked(0) + a.get_unchecked(1) + a.get_unchecked(2) }
    }
}

However, if LLVM can see that the first 3 cases are unlikely because they go to panic code, why doesn't it optimize it to something with only 1 bounds check on the hot path, such as:

fn sum(a: &[i32]) -> i32 {
    if a.len() < 3 {
        if a.len() == 0 {
            panic!("0 out of range");
        } else if a.len() == 1 {
            panic!("1 out of range");
        } else {
            panic!("2 out of range");
        }
    } else {
        unsafe { a.get_unchecked(0) + a.get_unchecked(1) + a.get_unchecked(2) }
    }
}
4 Likes

that would do the thing you want to do

1 Like

I understand it's possible for programmers to optimize the code manually, but that's not the point. The point is that the compiler could optimize the code as written, by itself, and is not doing it for some reason. Question is, why is it not doing it, and whether it can be fixed.

6 Likes

The "as-if" rule. The compiler sees this:

(a[0] + a[1]) + a[2]

If a[0] + a[1] overflows, there's a panic in there before it gets to the [2] indexing. Your proposed optimization changes the panic that arises in that case (for non-release builds at least). A release build could probably "see" more.

In my original post I used the release mode, see the linked compiler output. There is no addition overflow check in the assembly.

AFAIK this is only a problem if "touching" the a[2] momory address result in a invalid access or has side effects (and this might be why LLVM doesn't completly remove all but one bound checks), but, correct me if I'm wrong, this should not be the case in (safe) Rust. Nevermind, I misunderstood the problem.

Previous discussion on this topic: More musings about slices and arrays

I think the existing compiler rules for side effects don't cover this case properly.

panic is still a side effect, so it can't be simply removed. It can't be freely reordered either, because you can't end up with executing the read before the panic. It can't freely moved/hoisted earlier either, because you don't want a panic from a[x] escape something like if x < len { a[x] }.

So it's a tricky case where the compiler would probably have to explicitly understand the concept of bounds checks, and track relationships between them to properly reorder them to remove redundancy.

4 Likes

I don't think this is it. What the code actually compiles to, and what I proposed it should compile to, actually do all the same reads and panics in the same order, and neither ever attemps to read anything out of bounds.

I think it's certainly important to keep in mind that the "standalone" codegen for functions is nearly always entirely unlike the codegen seen in contexts where they're actually being called.

See here, for example. rustc optimizes both the valid and invalid calls to sum out entirely, instead just immediately printing the literal number 3 and then immediately panicking.

Obviously when a.len() is a compile time constant then all the ifs will get constant-folded. The issue is when the length is not known at compile time.

1 Like

It's still likely to not look exactly as it does in your original link. You'd have to examine various realistic uses of it to get an idea of the "average" contexual codegen quality.

I would suggest opening an issue on LLVM -- their issues are on github, now, so it's much easier than it was before.

Rust's panicking functions are marked cold, so this transformation should probably happen in general in LLVM for all things that lead to cold paths with only side-effect-free things between them -- not just indexing and panics.

(The panicking things being marked cold is why LLVM puts the panicking paths at the end of the assembly for the function, for example.)

2 Likes

Opened an issue on LLVM .

1 Like