Why such a complex way to calculate i32::MAX?

I read this topic on Reddit.

It made me wonder why a.saturating_add(1) is not implemented as a.checked_add(1).unwrap_or(i32::MAX).

I ask because

pub fn s_inc1(a: i32) -> i32 {
    a.checked_add(1).unwrap_or(i32::MAX)
}

is compiled (by rustc nightly with -O) to:

incl    %edi
movl    $2147483647, %eax
cmovnol %edi, %eax
retq

while

pub fn s_inc2(a: i32) -> i32 {
    a.saturating_add(1)
}

looks like this:

leal    1(%rdi), %eax
sarl    $31, %eax
addl    $-2147483648, %eax
incl    %edi
cmovnol %edi, %eax
retq

So it looks like a.checked_add(1).unwrap_or(i32::MAX) is a much better way to implement a.saturating_add(1) than the saturating_add intrinsic.

If I understand the assembly correctly:

leal    1(%rdi), %eax
sarl    $31, %eax
addl    $-2147483648, %eax

is a fancy way to calculate i32::MAX, or did I miss something?

1 Like

saturating_add has to account for negative overflow too, i.e. i32::MIN.saturating_add(-1) should be i32::MIN, while your implementation would return i32::MAX. That said, the general case could be optimized when one operand is a constant.
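
To make the difference concrete, here is a minimal sketch. The names naive_sat_add and manual_sat_add are made up for illustration and are not how the standard library implements saturating_add; the first generalizes the shortcut from the question to an arbitrary second operand, the second picks the bound from the sign of that operand.

```rust
/// Hypothetical helper: the shortcut from the question, applied to any `b`.
/// It always saturates to i32::MAX, which is wrong for negative overflow.
pub fn naive_sat_add(a: i32, b: i32) -> i32 {
    a.checked_add(b).unwrap_or(i32::MAX)
}

/// Hypothetical helper: a hand-written saturating add.
/// Signed overflow only happens when both operands have the same sign,
/// so the sign of `b` tells us which bound was crossed.
pub fn manual_sat_add(a: i32, b: i32) -> i32 {
    match a.checked_add(b) {
        Some(sum) => sum,
        None if b < 0 => i32::MIN,
        None => i32::MAX,
    }
}
```

With b fixed to 1 the negative branch can never be taken, which is why the shortcut is valid for that specific constant but not in general.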

16 Likes

I suppose that is impossible with LLVM: rustc generates @llvm.sadd.sat.i32(arg, 1) for a.saturating_add(1), so LLVM cannot run any optimization passes on it except in the target backend, and by then it is too late for constant folding, expression elimination, and the other things that happen before the target backend.

So maybe using an intrinsic for a.saturating_add is not such a good idea?

2 Likes

Not exactly. This snippet computes i32::MAX or i32::MIN depending on whether a.wrapping_add(1) is negative or not.
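
A rough Rust model of those instructions, as I read the assembly quoted above. The function names are made up for illustration, and this is only a sketch of what the generated code computes, not of how rustc or LLVM represent it internally.

```rust
/// Hypothetical model of the three-instruction prefix.
pub fn saturation_bound_for_inc(a: i32) -> i32 {
    // leal 1(%rdi), %eax : a + 1 with wraparound, flags untouched
    let sum = a.wrapping_add(1);
    // sarl $31, %eax : arithmetic shift, -1 if sum is negative, else 0
    let mask = sum >> 31;
    // addl $-2147483648, %eax : -1 wraps to i32::MAX, 0 becomes i32::MIN
    mask.wrapping_add(i32::MIN)
}

/// Hypothetical model of the whole s_inc2 body.
pub fn s_inc2_model(a: i32) -> i32 {
    let bound = saturation_bound_for_inc(a);
    // incl %edi ; cmovnol %edi, %eax : keep the sum unless it overflowed
    match a.checked_add(1) {
        Some(sum) => sum,
        None => bound,
    }
}
```

The bound is only used when the add overflows. With a constant operand of 1 that can only happen at a == i32::MAX, where the wrapped sum is negative and the bound is i32::MAX, so the i32::MIN case is dead code here.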

2 Likes

While it is "more difficult" of an optimization (due to being target specific), LLVM is absolutely capable of lowering sadd.sat.i32(arg0, const 1) differently than sadd.sat.i32(arg0, arg1). So it is "just" a missed optimization on their part that could be added.

9 Likes

LLVM can absolutely run optimizations on its intrinsics. Trivial example of it optimizing saturating_add: https://rust.godbolt.org/z/carqPnava
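
For readers who cannot open the link, here is a sketch of the kind of case meant. This is not necessarily the linked example, and the function names are made up. As far as I know, LLVM's target-independent passes should be able to simplify both of these, but check the output for your own compiler version.

```rust
/// Hypothetical example: adding zero can never saturate,
/// so this should simplify to just returning `x`.
pub fn add_zero(x: i32) -> i32 {
    x.saturating_add(0)
}

/// Hypothetical example: both operands are constants,
/// so the whole call should be foldable at compile time.
pub fn add_constants() -> i32 {
    i32::MAX.saturating_add(5)
}
```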

If you want LLVM to optimize some case, file a bug: https://github.com/llvm/llvm-project/issues/new

9 Likes

Also worth noting that if vectorization kicks in, LLVM knows to turn saturating_add directly into a saturating addition. See Compiler Explorer for an example, where LLVM uses PADDUSW to handle the addition, which is a packed saturating add.
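
A minimal sketch of a loop of that shape, in case the link is unavailable. The function name is made up and this is not necessarily the code from the linked example. PADDUSW operates on unsigned 16-bit lanes, so the sketch uses u16; whether the loop actually vectorizes depends on the target, optimization level, and compiler version.

```rust
/// Hypothetical example: element-wise saturating add over u16 slices.
/// Processes as many elements as the shorter slice provides.
pub fn add_saturating_u16(dst: &mut [u16], src: &[u16]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = d.saturating_add(*s);
    }
}
```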

2 Likes

I've filed llvm/llvm-project issue #63386, "[X86] Inefficient legalization of sadd.sat with constant operand", for this.

4 Likes
