These do have intrinsics, for example https://doc.rust-lang.org/std/intrinsics/fn.ctlz_nonzero.html , introduced in
rust-lang:master
← scottmcm:ctz-nz
opened 05:26AM - 09 Jun 17 UTC
It turns out that LLVM can turn `@llvm.ctlz.i64(_, true)` into `@llvm.ctlz.i64(_… , false)` ([`ctlz`](http://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic)) where valuable, but never does the opposite. That leads to some silly assembly getting generated in certain cases.
A contrived-but-clear example https://is.gd/VAIKuC:
```rust
fn foo(x:u64) -> u32 {
if x == 0 { return !0; }
x.leading_zeros()
}
```
Generates
```asm
testq %rdi, %rdi
je .LBB0_1
je .LBB0_3 ; <-- wha?
bsrq %rdi, %rax
xorq $63, %rax
retq
.LBB0_1:
movl $-1, %eax
retq
.LBB0_3:
movl $64, %eax ; <-- dead
retq
```
I noticed this in `next_power_of_two`, which without this PR generates the following:
```asm
cmpq $2, %rcx
jae .LBB1_2
movl $1, %eax
retq
.LBB1_2:
decq %rcx
je .LBB1_3
bsrq %rcx, %rcx
xorq $63, %rcx
jmp .LBB1_5
.LBB1_3:
movl $64, %ecx ; <-- dead
.LBB1_5:
movq $-1, %rax
shrq %cl, %rax
incq %rax
retq
```
And with this PR becomes
```asm
cmpq $2, %rcx
jae .LBB0_2
movl $1, %eax
retq
.LBB0_2:
decq %rcx
bsrq %rcx, %rcx
xorl $63, %ecx
movq $-1, %rax
shrq %cl, %rax
incq %rax
retq
```
I'd love to find a good way to expose them, since last I checked LLVM is unable to simplify ctlz(x, false)
to ctlz(x, true)
when it knows x != 0
.
They could always be exposed as an _unchecked
method, but it feels like there's something better to find. I was pondering putting them on NonZero*
, for example, but those have no such methods yet, so I'm not sure that's right...
2 Likes