Hey there folks.
I'd like to draw the attention of those above my pay grade who can shed some light on something I can't understand as of now.
I was benchmarking different crates and noticed that the crate https://crates.io/crates/matchit
has no SIMD instructions when compiled for stable-x86_64-apple-darwin, but when compiled for x86_64-unknown-linux-gnu it does have SIMD instructions and is much faster as a result.
That's surprising, because at the compiler level, macOS should have the advantage in target features. The x86_64-unknown-linux-gnu target is very generic with only fxsr, sse, and sse2, while x86_64-apple-darwin uses a "core2" base CPU adding cmpxchg16b, sse3, and ssse3.
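One way to check what each target really enables at compile time is to print the features from a tiny program built for both targets. This is only a minimal sketch: the `compiled_features!` macro and the `compiled_in` function are names I made up for illustration, and the feature list is just a sample. It relies on nothing beyond the standard `cfg!(target_feature = "...")` check.

```rust
// Hypothetical helper macro: collects the names of the listed features
// that the compiler had enabled for this build.
macro_rules! compiled_features {
    ($($name:tt),* $(,)?) => {{
        let mut enabled: Vec<&'static str> = Vec::new();
        $(
            // cfg! is resolved at compile time for the current target.
            if cfg!(target_feature = $name) {
                enabled.push($name);
            }
        )*
        enabled
    }};
}

// Hypothetical function: the feature list is a sample, not exhaustive.
fn compiled_in() -> Vec<&'static str> {
    compiled_features!("sse", "sse2", "sse3", "ssse3", "sse4.1", "sse4.2", "avx", "avx2")
}

fn main() {
    println!("target os:   {}", std::env::consts::OS);
    println!("target arch: {}", std::env::consts::ARCH);
    println!("compile-time features: {:?}", compiled_in());
}
```

Building and running this once per target (with the same flags as the benchmark) should show whether the macOS build really starts out with the larger feature set.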
Maybe something in the crate itself (or its dependencies) is using explicit SIMD gated for Linux only?
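To make that guess concrete, this is roughly what Linux-only gating would look like if it existed somewhere. It is a purely hypothetical sketch: `count_byte` is an invented function, not code from matchit or any of its dependencies. It only shows the pattern one would search the sources for, a `cfg(target_os = "linux")` attribute next to `std::arch` intrinsics.

```rust
// Hypothetical SIMD path, compiled only on x86_64 Linux.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn count_byte(haystack: &[u8], needle: u8) -> usize {
    use std::arch::x86_64::{
        __m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8, _mm_set1_epi8,
    };

    let mut count = 0;
    let mut chunks = haystack.chunks_exact(16);
    for chunk in &mut chunks {
        // SAFETY: SSE2 is part of the x86_64 baseline, and `chunk` is exactly
        // 16 bytes long, so the unaligned 128-bit load stays in bounds.
        let mask = unsafe {
            let block = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
            let eq = _mm_cmpeq_epi8(block, _mm_set1_epi8(needle as i8));
            _mm_movemask_epi8(eq)
        };
        // One bit is set in the mask for every matching byte.
        count += mask.count_ones() as usize;
    }
    // Handle the tail that did not fill a whole 16-byte block.
    count + chunks.remainder().iter().filter(|&&b| b == needle).count()
}

// Scalar fallback used on every other target, including macOS.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn count_byte(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

fn main() {
    let path = b"/users/:id/posts/:post_id/comments";
    println!("slashes: {}", count_byte(path, b'/'));
}
```

Both versions return the same result; only the generated instructions differ, which is the kind of split that would explain a Linux-only speedup.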
I see the same results when I use my Linux machine as well. The crate uses SIMD UTF-8 validation, but I have removed that and it's not using any other crates. The Linux build still has SIMD instructions that are not present in the Mac build. Several people, including me, the author of the crate, and others from the Rust Discord server, have tried to understand what's happening here but couldn't.