Why such a complex way to calculate i32::MAX?

davemilter · June 11, 2023, 10:18am

I wonder why a.saturating_add(1) is not implemented as a.checked_add(1).unwrap_or(i32::MAX).

Because

pub fn s_inc1(a: i32) -> i32 {
    a.checked_add(1).unwrap_or(i32::MAX)
}

is compiled (by rustc nightly with -O) to:

incl    %edi
movl    $2147483647, %eax
cmovnol %edi, %eax
retq

while

pub fn s_inc2(a: i32) -> i32 {
    a.saturating_add(1)
}

looks like this:

leal    1(%rdi), %eax
sarl    $31, %eax
addl    $-2147483648, %eax
incl    %edi
cmovnol %edi, %eax
retq

So it looks like a.checked_add(1).unwrap_or(i32::MAX) is a much better way to implement a.saturating_add(1) than the saturating_add intrinsic.

If I understand assembly correctly:

leal    1(%rdi), %eax
sarl    $31, %eax
addl    $-2147483648, %eax

is a fancy way to calculate i32::MAX, or did I miss something?

SkiFire13 · June 11, 2023, 10:33am

saturating_add has to account for negative overflow too, i.e. i32::MIN.saturating_add(-1) should be i32::MIN, while your implementation would return i32::MAX. Though this could be optimized when one of the operands is a constant.
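
A minimal sketch of the difference, assuming the proposed replacement is generalised to an arbitrary right-hand side. The helper name checked_or_max is made up for illustration and is not part of the standard library:

```rust
/// The proposed replacement, generalised from `+ 1` to any `b`.
/// Hypothetical helper, for illustration only.
pub fn checked_or_max(a: i32, b: i32) -> i32 {
    // On any overflow, positive or negative, this falls back to i32::MAX.
    a.checked_add(b).unwrap_or(i32::MAX)
}

/// The standard-library behaviour for comparison: clamps to i32::MAX on
/// positive overflow and to i32::MIN on negative overflow.
pub fn std_saturating(a: i32, b: i32) -> i32 {
    a.saturating_add(b)
}
```

The two agree for a + 1, but for i32::MIN and -1 the first returns i32::MAX while the second returns i32::MIN.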

davemilter · June 11, 2023, 1:07pm

I suppose that is impossible with LLVM: rustc emits @llvm.sadd.sat.i32(arg, 1) for a.saturating_add(1), so LLVM cannot run any optimization passes on it except in the target backend, and by then it is too late for constant folding, expression elimination, and the other things that happen before the target backend.

So maybe using an intrinsic for a.saturating_add is not such a good idea?

tczajka · June 11, 2023, 1:46pm

Not exactly. This snippet computes i32::MAX or i32::MIN depending on whether a.wrapping_add(1) is negative or not.
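
As a rough Rust-level model of those three instructions plus the final cmovno (the function names here are invented for illustration, this is not compiler output):

```rust
/// Models leal / sarl / addl: picks the value to saturate to.
pub fn saturation_value(a: i32) -> i32 {
    let sum = a.wrapping_add(1); // leal 1(%rdi), %eax
    let sign = sum >> 31; // sarl $31, %eax: -1 if sum is negative, else 0
    sign.wrapping_add(i32::MIN) // addl $-2147483648, %eax
}

/// Models the whole s_inc2 sequence: use the incremented value unless
/// the increment overflowed (incl + cmovnol).
pub fn s_inc2_model(a: i32) -> i32 {
    let (sum, overflowed) = a.overflowing_add(1);
    if overflowed { saturation_value(a) } else { sum }
}
```

When the wrapped sum is negative, -1 + i32::MIN wraps around to i32::MAX; otherwise 0 + i32::MIN stays i32::MIN. For a + 1 only the first case can coincide with an overflow, which is why the general sequence looks redundant here.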

CAD97 · June 11, 2023, 2:06pm

While it is a "more difficult" optimization (because it is target-specific), LLVM is absolutely capable of lowering sadd.sat.i32(arg0, const 1) differently than sadd.sat.i32(arg0, arg1). So it is "just" a missed optimization on their part that could be added.
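
To illustrate why a constant operand helps, here is a source-level sketch, not how LLVM actually implements the lowering. The function sat_add_const is hypothetical. With a known sign for the constant, only one direction of overflow is possible, so the saturation bound is itself a constant:

```rust
/// Saturating add of a compile-time constant B.
/// Hypothetical helper, for illustration only.
pub fn sat_add_const<const B: i32>(a: i32) -> i32 {
    // A non-negative B can only overflow upwards, a negative B only downwards,
    // so the bound is known at compile time.
    let bound = if B >= 0 { i32::MAX } else { i32::MIN };
    a.checked_add(B).unwrap_or(bound)
}
```

For example, sat_add_const::<1>(i32::MAX) gives i32::MAX and sat_add_const::<{ -1 }>(i32::MIN) gives i32::MIN, matching saturating_add in both directions.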

scottmcm · June 11, 2023, 7:48pm

LLVM can absolutely run optimizations on its intrinsics. Trivial example of it optimizing saturating_add: https://rust.godbolt.org/z/carqPnava

If you want LLVM to optimize some case, file a bug: https://github.com/llvm/llvm-project/issues/new

farnz · June 12, 2023, 9:42am

Also worth noting that if vectorization kicks in, LLVM knows to turn saturating_add directly into a saturating addition. See Compiler Explorer for an example, where LLVM uses PADDUSW to handle the addition, which is a packed saturating add.
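
The linked example is not reproduced here. A loop of roughly the following shape is the kind of code this applies to. This is only a sketch with a made-up function name, and whether it actually compiles to PADDUSW depends on the target and the compiler version:

```rust
/// Element-wise saturating add of `src` into `dst` (unsigned 16-bit lanes).
/// Hypothetical example function; stops at the shorter of the two slices.
pub fn add_saturating(dst: &mut [u16], src: &[u16]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = d.saturating_add(*s);
    }
}
```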

nikic · June 19, 2023, 10:07am

I've filed [X86] Inefficient legalization of sadd.sat with constant operand · Issue #63386 · llvm/llvm-project · GitHub for this issue.

system · September 17, 2023, 10:07am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
