Raise a compile warning if you use an x86 intrinsic that's not supported by the target_feature

I was playing with SIMD intrinsics today and noticed the following behavior, which was hard for me to debug.

If I have a function which uses x86 SIMD intrinsics (say, avx512bw) but isn't inside a call chain that declares the appropriate target_feature, those intrinsics are still emitted, but as out-of-line calls. So the code still works if I'm running on an avx512bw machine; it's just very slow.

#![feature(stdarch_x86_avx512,avx512_target_feature)]
use std::arch::x86_64::*;

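// With the attribute below commented out, the intrinsic is emitted as an out-of-line call.
// (_mm256_cmpgt_epu16_mask needs avx512bw and avx512vl, not just avx2.)
//#[cfg(target_arch = "x86_64")]
//#[target_feature(enable = "avx512bw,avx512vl")]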
pub unsafe fn test(x: __m256i) -> u16 {
    _mm256_cmpgt_epu16_mask(x, x)
}

I think I understand why you're doing it this way (the idea is that ultimately the whole function gets called by something which does have this target_feature, and then everything gets inlined). But even as a compiler hacker myself, I was very confused, and I initially thought this was a bug in Rust/LLVM and that I needed to force it to inline harder.

It would be nice if Rust warned me that I'm screwing this up.
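
For comparison, a minimal sketch of the form that avoids the slow path, as far as I understand it: the function calling the intrinsics carries `#[target_feature]` itself, and a safe wrapper checks the CPU at runtime before calling it. AVX2 is swapped in for avx512bw here so that it builds on stable without feature gates, and the compare is signed because AVX2 has no unsigned 16-bit compare. `count_gt`, `count_gt_avx2` and `count_gt_scalar` are hypothetical names, not part of the snippet above.

```rust
/// Scalar fallback: counts positions where a[i] > b[i].
fn count_gt_scalar(a: &[i16; 16], b: &[i16; 16]) -> u32 {
    a.iter().zip(b.iter()).filter(|(x, y)| x > y).count() as u32
}

/// AVX2 version (hypothetical name). Because this function declares the
/// feature itself, the intrinsics below can be inlined into it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn count_gt_avx2(a: &[i16; 16], b: &[i16; 16]) -> u32 {
    use std::arch::x86_64::*;
    unsafe {
        // Unaligned loads of sixteen 16-bit lanes each.
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        // Each 16-bit lane becomes all ones where a > b (signed compare).
        let gt = _mm256_cmpgt_epi16(va, vb);
        // movemask yields one bit per byte, so two bits per 16-bit lane.
        let mask = _mm256_movemask_epi8(gt) as u32;
        mask.count_ones() / 2
    }
}

/// Safe entry point: picks the AVX2 path only if the CPU reports support.
pub fn count_gt(a: &[i16; 16], b: &[i16; 16]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            return unsafe { count_gt_avx2(a, b) };
        }
    }
    count_gt_scalar(a, b)
}
```

Dropping the `#[target_feature]` line from `count_gt_avx2` would still compile, which is exactly the silent slow case described above.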

Ideally, I don't think you should have to declare a target feature on the whole function in order to use an intrinsic in one branch of it. It seems reasonable for a function to, for instance, detect a CPU feature and then loop over calls to an intrinsic available with that CPU feature, with the intrinsic being inlined.

I don't know how feasible that is in the compiler. But it seems preferable to do that if possible.
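
Until something like that exists, the workaround I'm aware of is to keep the detection in a plain function and move the whole loop into a `#[target_feature]` function, so that the intrinsic is inlined into the loop body. A rough sketch, using AVX2 rather than avx512bw so that it builds on stable; `add_assign` and `add_assign_avx2` are made-up names for illustration.

```rust
/// The loop lives inside the feature-annotated function (hypothetical name),
/// so the AVX2 intrinsics can be inlined into the loop body.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add_assign_avx2(dst: &mut [i32], src: &[i32]) {
    use std::arch::x86_64::*;
    let n = dst.len().min(src.len());
    let mut i = 0;
    // Process eight 32-bit lanes per iteration.
    while i + 8 <= n {
        // SAFETY: i + 8 <= n, so both 32-byte accesses stay in bounds.
        unsafe {
            let d = _mm256_loadu_si256(dst.as_ptr().add(i) as *const __m256i);
            let s = _mm256_loadu_si256(src.as_ptr().add(i) as *const __m256i);
            let sum = _mm256_add_epi32(d, s); // wrapping add per lane
            _mm256_storeu_si256(dst.as_mut_ptr().add(i) as *mut __m256i, sum);
        }
        i += 8;
    }
    // Scalar tail for the remaining elements.
    for j in i..n {
        dst[j] = dst[j].wrapping_add(src[j]);
    }
}

/// Plain function: detects the feature once, then dispatches.
pub fn add_assign(dst: &mut [i32], src: &[i32]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            unsafe { add_assign_avx2(dst, src) };
            return;
        }
    }
    // Fallback with the same wrapping semantics as the vector path.
    for (d, s) in dst.iter_mut().zip(src) {
        *d = d.wrapping_add(*s);
    }
}
```

The cost is the extra function boundary: the branch that uses the feature cannot stay inline in the detecting function, which is the limitation being discussed here.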

Yeah, the current handling of target features is disappointingly "not Rusty" (i.e. instead of proper compile-time checks we have unsafe everywhere). A long time ago I wrote the following proposal, but unfortunately the amount of development activity in this area is very small:
