Packed_simd: `cfg(target_feature)` does not play well with `#[target_feature]`


The packed_simd crate does not play well with #[target_feature]. Given:

extern crate packed_simd;
use packed_simd::m32x8;

#[target_feature(enable = "sse4.2")]
unsafe fn foo(m: m32x8) -> bool { m.all() }

#[target_feature(enable = "avx2")]
unsafe fn bar(m: m32x8) -> bool { m.all() }

the functions foo and bar are potentially compiled with different target features. Assuming the binary is compiled for one of the x86_64 targets, SSE2 is enabled globally, while foo and bar extend that baseline with SSE4.2 and AVX2, respectively.

The method all on masks, like all methods in packed_simd, is #[inline]. The packed_simd crate is compiled with only SSE2 enabled, but because foo and bar extend the feature set that packed_simd was compiled with, the functions defined in packed_simd can be inlined into foo and bar. Once inlined, LLVM generates code for them using SSE4.2 and AVX2, respectively.

Now comes the problem. The packed_simd crate has many workarounds for LLVM bugs.

All these workarounds are currently implemented using cfg(target_feature), like this:

impl m32x8 {
    cfg_if! {
        if #[cfg(target_feature = "sse4.2")] {
            #[inline] fn all(self) -> bool { ... }
        } else if #[cfg(target_feature = "avx2")] {
            #[inline] fn all(self) -> bool { ... }
        } else {
            #[inline] fn all(self) -> bool { ... }
        }
    }
}

and that shows the problem. Because the packed_simd crate was compiled with the global feature set, and neither SSE4.2 nor AVX2 was enabled globally, the worst possible implementation of every workaround is selected, regardless of the features that the calling contexts, foo and bar, support.
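
To make the selection mechanism concrete without depending on packed_simd itself, here is a minimal sketch in which a module stands in for the library crate. The names simd_lib, which_all, and selected_from_avx2_context are hypothetical and chosen only for illustration; they are not part of packed_simd. The sketch assumes the caller is compiled for x86 or x86_64 and only enters the AVX2 function after a run-time check.

```rust
// Hypothetical stand-in for a library crate such as packed_simd.
// `which_all` reports which implementation `cfg(target_feature)` picked.
pub mod simd_lib {
    // Selected only if AVX2 is enabled for the whole compilation,
    // e.g. via `-C target-feature=+avx2`.
    #[cfg(target_feature = "avx2")]
    #[inline]
    pub fn which_all() -> &'static str {
        "avx2"
    }

    // Selected in every other case, including on non-x86 targets.
    #[cfg(not(target_feature = "avx2"))]
    #[inline]
    pub fn which_all() -> &'static str {
        "fallback"
    }
}

// A caller that locally enables AVX2, like `bar` in the issue.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
pub unsafe fn caller_with_avx2() -> &'static str {
    // The choice was already made when `simd_lib` was compiled;
    // the attribute on this function cannot change it.
    simd_lib::which_all()
}

// Safe wrapper: returns `None` when the AVX2 caller cannot be run
// (CPU without AVX2, or a non-x86 target).
pub fn selected_from_avx2_context() -> Option<&'static str> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at run time just above.
            return Some(unsafe { caller_with_avx2() });
        }
    }
    None
}
```

Whenever selected_from_avx2_context() returns a value, it is the same string that simd_lib::which_all() returns when called from ordinary code, because both resolve to the single implementation chosen from the global feature set.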

That is the problem in practice. Here is a reduced version of it:

// interaction within a single function
// (macros won't solve packed_simd's problem)
#[target_feature(enable = "avx2")]
unsafe fn foo() -> bool {
    // often returns false:
    if cfg!(target_feature = "avx2") { true } else { false }
}

// interaction across functions
#[target_feature(enable = "avx2")]
unsafe fn bar() -> bool { foo() }
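
The reduced case can also be written so that it compiles and runs on its own. The following is only a sketch: GLOBAL_AVX2, avx2_seen_by_cfg, and inside_view are hypothetical names introduced here, and the AVX2 function is only called after a run-time check so that the call is sound.

```rust
/// What `cfg!` reports at crate level. This is fixed when the crate is
/// compiled and depends only on the global feature set.
pub const GLOBAL_AVX2: bool = cfg!(target_feature = "avx2");

/// What `cfg!` reports inside a function that locally enables AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
pub unsafe fn avx2_seen_by_cfg() -> bool {
    // `cfg!` is expanded before code generation and does not look at
    // the `#[target_feature]` attribute on the enclosing function.
    cfg!(target_feature = "avx2")
}

/// Runs the AVX2 function if that is possible on this machine.
/// Returns `None` on CPUs without AVX2 and on non-x86 targets.
pub fn inside_view() -> Option<bool> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at run time just above.
            return Some(unsafe { avx2_seen_by_cfg() });
        }
    }
    None
}
```

In a default x86_64 build, GLOBAL_AVX2 is false, and inside_view() reports the same value even on a CPU where the run-time check succeeds. The two values agree in every build, which is exactly the mismatch between cfg(target_feature) and #[target_feature] described above.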

How do we improve / fix this? I have no good answer.