@rkruppe I had to think about this some more, and came up with this litmus test: https://godbolt.org/g/AkZYVy.
We have a function that compares an integer against all possible integers using comparison at pointer-type:
static void compare_to_all(uintptr_t i) {
uintptr_t cur = 0;
while(true) {
if ((int*)i == (int*)cur) return;
if (cur == UINTPTR_MAX) break;
++cur;
};
unreachable();
}
and then we call that on the address of a local:
void test() {
int i = 1;
compare_to_all((uintptr_t)&i);
}
This optimizes to
test(): # @test()
jmp unreachable() # TAILCALL
Even I can read this piece of assembly.
(gcc, btw, produces the same final assembly.)
First I thought this is definitely a miscompilation. However, then I was reminded that casting out-of-range pointers to integers is UB in C. So gcc is actually perfectly in its right to do whatever; for LLVM, we have to look at the IR of the comparison:
%5 = load i64, i64* %2, align 8, !dbg !29
%6 = inttoptr i64 %5 to i32*, !dbg !32
%7 = load i64, i64* %3, align 8, !dbg !33
%8 = inttoptr i64 %7 to i32*, !dbg !34
%9 = icmp eq i32* %6, %8, !dbg !35
From what I can tell, inttoptr is never UB. However, all of this made me realize something I had missed before: This alloca-optimization we talked about only kicks in when icmp works at pointer type! That’s great, we can have a rule saying something like comparisons at pointer type require both pointers to be allocated or 0; otherwise the result is non-deterministic.
Does this make sense?
There are other troublesome examples that do not involve comparisons at pointer type, but instead involve noalias. I am not surprised by noalias making trouble.
However, this does show the problems of pure lock-based approaches: [Playpen]
#[inline(never)]
fn id(x: usize) -> usize { x }
fn main() {
let mut x = Box::new(0u32);
let xp = &mut *x as *mut _ as usize;
drop(x);
let mut y = Box::new(0u32);
let yp = &mut *y as *mut _ as usize;
if id(xp) == yp {
unsafe { *(yp as *mut u32) = 2u32; }
unsafe { *(xp as *mut u32) = 1u32; }
println!("{}", unsafe { *(yp as *mut u32) });
}
}
When compiled with optimizations, this prints 2. IOW, the pointers compare equal (at integer type!), but the write to xp does not affect the following read from yp. I cannot come up with any other explanation for this other than somehow making this program UB. (If this was a comparison at pointer type, since xp is deallocated, the comparison would be non-deterministic and could evaluate to true even if the pointers are different, explaining the behavior. But when comparing at integer type, I don’t think such a rule can be made to work. After all, 3 == 2 will probably compare two un-allocated pointers, and we certainly want the result to be deterministic.)
However, with a lock-based approach satisfying the principle “if you can access an address, it doesn’t matter which alias you use”, the above program cannot be UB – it is certainly fine to access yp, and hence because xp is an alias in the conditional, accessing xp cannot be UB. (Unless, again, we find some justification for the conditional to be non-deterministic.)
Interesting enough, I was not able to reproduce this kind of optimization in C.
Another point that surprised me here is that LLVM has special optimizations for alloca that seem mostly orthogonal to noalias-based optimizations. I would have expected alloca to just have a noalias annotation and then get all of its optimizations from there.