According to the documentation of the x86 instruction bsr:
If the content of the source operand is 0, the content of the destination operand is undefined.
Let's try the following Rust code:
fn foo1(a: u32) -> u32 {
    a.leading_zeros()
}
fn foo2(a: u32) -> u32 {
    if a == 0 {
        return 32;
    }
    a.leading_zeros()
}
Both functions compile to the following assembly (in release mode, playground):
mov eax, 63
bsr eax, edi
xor eax, 31
ret
Obviously, when edi is zero, the value of eax is undefined after the bsr instruction, and thus the return value is undefined.
I also tried this C code:
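To make the arithmetic in that sequence explicit, here is a minimal sketch that models the two steps in software. The helpers `bsr_nonzero` and `xor31` are hypothetical names introduced only for illustration, and the zero-input case is deliberately not modeled, since the documentation quoted above leaves it undefined. The last assertion only shows what the xor would yield if eax still held the preloaded 63; whether bsr actually leaves the destination untouched is an assumption here, not something the quoted documentation promises.

```rust
/// Software model of `bsr` for a non-zero input: index of the highest set bit.
/// Hypothetical helper; zero input is rejected because the result is undefined.
fn bsr_nonzero(a: u32) -> u32 {
    assert!(a != 0, "bsr result is documented as undefined for 0");
    let mut idx = 31;
    // Walk down from bit 31 until a set bit is found (guaranteed since a != 0).
    while (a >> idx) & 1 == 0 {
        idx -= 1;
    }
    idx
}

/// Model of the `xor eax, 31` step.
fn xor31(x: u32) -> u32 {
    x ^ 31
}

fn main() {
    // For every bit index 0..=31, idx ^ 31 equals 31 - idx.
    for idx in 0..32u32 {
        assert_eq!(xor31(idx), 31 - idx);
    }
    // So bsr followed by xor 31 matches leading_zeros for non-zero inputs.
    for a in [1u32, 2, 3, 0x8000_0000, u32::MAX] {
        assert_eq!(xor31(bsr_nonzero(a)), a.leading_zeros());
    }
    // ASSUMPTION: if eax kept its preloaded value 63, the xor would give 32.
    assert_eq!(xor31(63), 32);
}
```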
unsigned int foo(unsigned int a) {
    if (a == 0) {
        return 32;
    }
    return __builtin_clz(a);
}
GCC compiles it to the following assembly (with -O3, Compiler Explorer):
foo:
        bsr     eax, edi
        mov     edx, 32
        xor     eax, 31
        test    edi, edi
        cmove   eax, edx
        ret
So GCC handles the corner case without any UB.
Unfortunately, clang shows the same problem, so this seems to be related to LLVM. I'm not very familiar with LLVM, though, so could anyone look into this and find out what's wrong?
P.S. I found a similar post from 2017, and what was said there implies that there may have been no UB at that time.
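For anyone who wants to check what the compiled functions return for zero on their own machine, a small harness along these lines might help. It is only a sketch: it reuses foo1 and foo2 from above and passes the input through `std::hint::black_box`, which is a best-effort way to keep the call from being folded at compile time. A result observed on one CPU does not by itself show that the behavior is defined.

```rust
use std::hint::black_box;

fn foo1(a: u32) -> u32 {
    a.leading_zeros()
}

fn foo2(a: u32) -> u32 {
    if a == 0 {
        return 32;
    }
    a.leading_zeros()
}

fn main() {
    // Hide the constant from the optimizer so the functions run on a
    // value that is not known at compile time (best effort only).
    let zero = black_box(0u32);
    println!("foo1(0) = {}", foo1(zero));
    println!("foo2(0) = {}", foo2(zero));
}
```

Build it in release mode so that the optimized code path is the one being exercised.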