Here is the code:
use std::time::Instant;

fn main() {
    let start = Instant::now();
    (0..=1000)
        .flat_map(|a| {
            (0..=1000).flat_map(move |b| {
                (0..=1000)
                    .filter(move |&c| a * a + b * b == c * c && a + b + c == 1000)
                    .map(move |c| (a, b, c))
            })
        })
        .for_each(|(a, b, c)| {
            println!("a: {} b: {} c: {}", a, b, c);
        });
    let duration = start.elapsed();
    println!("{} seconds", duration.as_secs_f64());
}
After updating from Rust 1.81 to 1.85, the same code runs approximately 80% slower with:
cargo run --release
on my laptop (Windows 11 26100.2894 + Intel Core i7-10750H). I then tried versions 1.82 through 1.84 and found that Rust 1.82 already produces roughly the same output as Rust 1.85.
The differences between the ASM outputs of Rust 1.81 and 1.82 are:
1.81:
    cmp r14d, eax
    jne .LBB5_6
    cmp esi, r12d
    jne .LBB5_6
1.82:
    xor eax, r14d
    mov ecx, esi
    xor ecx, r12d
    or ecx, eax
    jne .LBB5_5
And in a second place, 1.81:
    cmp dword ptr [rsp + 116], 0
    jne .LBB5_10
    cmp r14d, 1000000
    jne .LBB5_10
1.82:
    xor r14d, 1000000
    or dword ptr [rsp + 116], r14d
    jne .LBB5_8
In short, starting with Rust 1.82, the compiler sometimes merges two equality checks into XOR/OR bitwise operations followed by a single jump, instead of emitting a separate compare & jump for each check, so those comparisons are no longer short-circuited.
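To make the two shapes concrete, here is a minimal sketch of what each ASM pattern computes, written back in Rust. The function names are hypothetical and only for illustration; this is not taken from rustc or LLVM, and the optimizer is free to compile both functions to the same instructions, so it shows the logic rather than a guaranteed way to reproduce either output.

```rust
// Hypothetical helper: the short-circuit shape (two cmp/jne pairs in 1.81).
// The second comparison is evaluated only when the first one succeeds.
fn both_equal_branchy(a: i32, b: i32, c: i32, d: i32) -> bool {
    a == b && c == d
}

// Hypothetical helper: the merged shape (xor/xor/or/jne in 1.82).
// `x ^ y` is zero exactly when x == y, so OR-ing the two XOR results
// yields zero only when both pairs are equal. Both XORs always execute.
fn both_equal_merged(a: i32, b: i32, c: i32, d: i32) -> bool {
    ((a ^ b) | (c ^ d)) == 0
}

fn main() {
    // A few sample inputs covering: both equal, first differs,
    // second differs, both differ.
    let cases = [(1, 1, 2, 2), (1, 2, 2, 2), (1, 1, 2, 3), (0, 5, 7, 9)];
    for &(a, b, c, d) in &cases {
        assert_eq!(
            both_equal_branchy(a, b, c, d),
            both_equal_merged(a, b, c, d)
        );
    }
    println!("both forms agree on all sample inputs");
}
```

The results are identical; the difference is only in how much work is done. In the filter above, the first check fails for almost every (a, b, c), so the short-circuit shape usually skips the second check, while the merged shape always pays for both.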
I'm not familiar with how rustc and LLVM work, so I don't really know how we should handle this kind of regression.