"Reslicing" like this is basically always the right thing to do before a bunch of indexing. See this recent post for another example of it: Mir optimization pass that implements auto-vectorization - #6 by scottmcm
And I don't know if it's still true, but at least 1½ years ago it was the case that indexing through Vec
optimized worse than through a slice (We all know `iter` is faster than `loop`, but why? - #3 by scottmcm - The Rust Programming Language Forum), which just gives another reason that you should reslice it.
Note that you can opt into more focused optimization for them by changing your for p in STUFF {
to STUFF.for_each(|p| {
.
This loop is kinda weird, BTW.
Anything | 1
will be > 0
, and you break at the bottom before i
can get down to 0
.
So I think it should be a loop
?
But more importantly, I think you can reduce this report down to something more specific, by asking whether LLVM more directly whether it knows certain things.
For example, I'm extra confident about the while
condition because if you ask LLVM to compile
pub fn demo_lower(n: usize, p: usize) -> bool {
assert!(p >= 3);
let i = (n / p - 1) | 1;
i > 0
}
the output is clearly "oh yeah, that either asserts or returns true".
Whereas if you ask it about
pub fn demo_upper(n: usize, p: usize) -> bool {
assert!(p >= 3);
let i = (n / p - 1) | 1;
i < (n + 1)
}
Then it obviously has no idea
define noundef zeroext i1 @_ZN7example10demo_upper17hed0063195a0d922bE(i64 %n, i64 %p) unnamed_addr #0 !dbg !13 {
%_4 = icmp ult i64 %p, 3, !dbg !14
br i1 %_4, label %bb1, label %bb3, !dbg !15
bb1: ; preds = %start
tail call void @_ZN4core9panicking5panic17hef60c19188bfa7ddE([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [24 x i8] }>* @alloc14 to [0 x i8]*), i64 24, %"core::panic::location::Location"* noalias noundef readonly align 8 dereferenceable(24) bitcast (<{ i8*, [16 x i8] }>* @alloc16 to %"core::panic::location::Location"*)) #2, !dbg !15
unreachable, !dbg !15
bb3: ; preds = %start
%_8 = udiv i64 %n, %p, !dbg !16
%_7 = add nsw i64 %_8, -1, !dbg !17
%i = or i64 %_7, 1, !dbg !17
%_13 = add i64 %n, 1, !dbg !18
%0 = icmp ult i64 %i, %_13, !dbg !20
ret i1 %0, !dbg !21
}
And thus the problem isn't that it doesn't know the bounds, it's that regardless of whether it knows the vec/slice bound or not, the indexing logic you're using is too complicated for it to figure out.
So you might consider other ways to write that logic that it might understand better, or consider filing an LLVM issue about it if you think that LLVM should understand what you're doing.
(Repro for those two snippits: https://rust.godbolt.org/z/qYcnM1E5G)