For things like AVX and AES-NI in x86-only crates, I oppose this proposal and would strongly prefer that library crates do runtime feature detection using CPUID. That means no extra work for Cargo or the end application, and it gives you wide binary compatibility as a perk. AFAIK, no recent user-mode x86 extension has provided fundamentally new capabilities, so you can always provide a fallback using older instructions and/or pure Rust code.
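To make the fallback idea concrete, here is a minimal sketch of the dispatch pattern using std's is_x86_feature_detected! macro, which performs the CPUID-based probe at runtime; the choice of AVX2 and the function names are purely illustrative:

```rust
// Runtime dispatch with a portable fallback (sketch; names and the AVX2
// feature are just an example). On non-x86 targets only the portable
// path is compiled at all.
fn sum(data: &[u8]) -> u64 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // Safety: we just verified at runtime that AVX2 is available.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_portable(data)
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u8]) -> u64 {
    // Within this function the compiler is allowed to use AVX2 freely,
    // e.g. to auto-vectorize the loop.
    data.iter().map(|&b| u64::from(b)).sum()
}

fn sum_portable(data: &[u8]) -> u64 {
    data.iter().map(|&b| u64::from(b)).sum()
}
```

On non-x86 targets only the portable path gets compiled, which ties in with the point about other architectures below.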
This doesn’t work for SSE2, because SSE2 affects the ABI, but at this point I think it’s reasonable to assume SSE2 by default (as Rust currently does).
This also doesn’t work as well for the supervisor-mode extensions that do provide new capabilities, but you can still have a runtime check and return an error.
If you plan on exporting very short functions that do this, it would be a good idea to cache the relevant feature flag in a static AtomicUsize (no mut needed; atomics have interior mutability), since CPUID is a serializing instruction. This also runs into trouble on a multi-socket system with CPUs of different families/generations, because CPUID could run on the newer CPU and the process then get rescheduled onto the older one; but runtime feature testing is common enough in non-Rust programs that that ship sailed long ago.
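For illustration, a hand-rolled version of that caching idea might look something like this; the tri-state encoding and the AES-NI bit (CPUID leaf 1, ECX bit 25) are just one possible choice:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// 0 = not yet probed, 1 = feature absent, 2 = feature present.
#[cfg(target_arch = "x86_64")]
static AESNI: AtomicUsize = AtomicUsize::new(0);

#[cfg(target_arch = "x86_64")]
fn has_aesni() -> bool {
    match AESNI.load(Ordering::Relaxed) {
        0 => {
            // First call: run CPUID once and remember the answer.
            // CPUID leaf 1 reports AES-NI support in ECX bit 25.
            let leaf1 = unsafe { std::arch::x86_64::__cpuid(1) };
            let present = (leaf1.ecx & (1 << 25)) != 0;
            // Relaxed ordering is fine; racing threads would just
            // re-run CPUID, which is harmless.
            AESNI.store(if present { 2 } else { 1 }, Ordering::Relaxed);
            present
        }
        cached => cached == 2,
    }
}
```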
So, what can we do to Rust to facilitate and encourage runtime feature testing?
Finally, remember that not all the world is x86, and we should design Rust to facilitate using the pure-Rust option on non-x86 CPUs (and I don’t just mean ARM – I’m also talking about the architecture neither of us has heard of that will be widely deployed 15 years from now).