I’ve recently been writing a bunch of performance-sensitive float code, and I was surprised to see that rustc does not emit SSE4.1 instructions by default on macOS. As a result, very common operations like floor, ceil, and round become library calls, which has an obvious performance cost. The minimum supported microarchitecture can be changed with RUSTFLAGS, but this is a library, and I’m concerned that downstream users will not get optimum performance by default.
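For context, here is a minimal sketch of the kind of workaround a library can use today without touching RUSTFLAGS: detect SSE4.1 at run time and dispatch to a function compiled with that feature enabled. The names floor_all, floor_all_sse41, and floor_all_baseline are hypothetical and exist only for this illustration. Inside the SSE4.1 variant the compiler is allowed to use SSE4.1 instructions for floor, but whether it actually does is worth confirming in the disassembly.

```rust
/// Hypothetical helper: floor every element of a slice in place,
/// using an SSE4.1-enabled code path when the running CPU supports it.
pub fn floor_all(values: &mut [f32]) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Run-time check, so the binary still works on CPUs without SSE4.1.
        if is_x86_feature_detected!("sse4.1") {
            // SAFETY: SSE4.1 support was verified on the line above.
            unsafe { floor_all_sse41(values) };
            return;
        }
    }
    floor_all_baseline(values);
}

/// Same loop, but compiled with SSE4.1 enabled for this function only.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn floor_all_sse41(values: &mut [f32]) {
    for v in values.iter_mut() {
        *v = v.floor();
    }
}

/// Fallback compiled for the default target CPU; here floor may end up
/// as a library call, which is the situation described above.
fn floor_all_baseline(values: &mut [f32]) {
    for v in values.iter_mut() {
        *v = v.floor();
    }
}
```

This works, but it adds a branch and an unsafe boundary to every entry point, and it has to be repeated for each hot function, which is why a better default would be preferable.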
The relevant code seems to target a Core 2 by default. This means the compiler ensures that the generated code will run on Conroe or Merom, both of which are 65 nm CPUs manufactured in 2006. This CPU spec basically dates back to the start of the Rust project.
I’m aware that the decision has been made not to use target-cpu=native, even with cargo install. I’m OK with that decision, but do we have a policy for bumping the minimum default supported CPU over time? Never moving the default seems overly conservative to me.
If there aren’t any such policies yet, then as a straw man proposal I suggest setting the default to the oldest CPU that still covers the overwhelming majority of the Rust userbase, for example 99%. Thoughts?