Pre-RFC: stabilization of target_feature

Hmm. Just out of interest: do you have some examples of this? Are they stable?

I imagine those other types of intrinsics would just not be annotated with #[requires_target_feature]. Portable vector intrinsics are indeed odd. I feel like they should be roughly #[requires_target_feature(any)], in the sense that they do depend on target features, but any are fine, and the effects on ABI are similar to other #[requires_target_feature] things.

Sure, for example, all of the bit manipulation intrinsics in "platform-intrinsic", e.g. x86_bmi_....

Are they stable?

No, AFAIK no intrinsic is stable and probably they will never be. (note: standard has safe wrappers over some intrinsics that provide fallback implementations when the CPU doesn't support the intrinsic, but the raw intrinsics themselves are never exposed).

I feel like they should be roughly #[requires_target_feature(any)], in the sense that they do depend on target features

But it does not depend on target features, it behaves just like completely normal rust code. If the target supports some features, LLVM might generate code that uses them, just like for normal Rust code. Wrapping them in a function annotated with #[target_feature] just tells LLVM that it can use these features, but LLVM doesn't need to, and sometimes it doesn't (e.g. because it cannot, e.g., if arguments are not zero-extended or what not).

@burntsushi do we need to stabilize #[target_feature] to stabilize stdsimd ?

I donā€™t think so but maybe I forgot the reason. OTOH stabilizing cfg_target_feature does seem necessary to at least be able to use stdsimd in common cases.

Strictly speaking, no, because std can use the unstable #[target_feature]. But if we don't stabilize it, then it's going to hamstring the ability to produce a single binary that can detect CPU features at runtime and dynamically dispatch to the appropriate routines.

std itself does need #[target_feature] though. Otherwise, you would need to compile std for each cfg_target_feature value you want, which is a non-starter. And if std is exposing functions that are tagged with #[target_feature], then callers need to deal with the semantics of said functions, which in turn, implies to me, that #[target_feature] probably needs to be stabilized in order for the vendor intrinsics to be usable on stable Rust.

IIUC the functions in stdsimd will be unsafe, and whether the stdsimd functions are annotated with #[target_feature] or not is irrelevant for their users as long as we specify their guarantees (e.g. calling an avx function in a target without avx is UB). We can always relax these guarantees later.

Users can then use a third-party crate to do run-time feature detection just fine and dispatch to the appropriate standard features.

What users won't be able to do is mark their own functions with #[target_feature], but that's about it. They can still write:

// This function is only safe to call in a target with AVX
unsafe fn foo() {
   unsafe { 
     std::some_avx_intrinsic();  // these are annotated with #[target_feature] in std
     std::another_avx_intrinsic();
   }
}

fn bar() {
  if has_feature("avx") {
    unsafe { foo() }
  } else { panic!(""); }
}

They will also be able to use cfg!(target_feature) to wrap intrinsics in functions that provide a fallback and choose the best implementation at compile time, and they still can use --target-feature to enable features for the whole binary. To me it looks like we still get a lot of bang for buck if we stabilize cfg_target_feature ASAP and the stdsimd crate and something like the bitintr crate.

Not saying that we should fall back and remove #[target_feature] from the RFC, but that once the RFC is accepted we can actually push for stabilizing cfg_target_feature as is ASAP, and let the rest be stabilized later. There aren't any unresolved questions for cfg_target_feature.

The vendor intrinsics must be defined with #[target_feature]. Could you please clarify the alternative?

The intent is to put the vendor intrinsics into std since they rely on permanently unstable APIs of the compiler. If they're in std (well, really, they'd be in core), then conditional compilation is no longer an option. Therefore, the std that is distributed with Rust must contain all definitions of vendor intrinsics for a particular architecture.

That seems significant to me. If I'm passing around SIMD vector types to multiple distinct vendor intrinsics in a single function, then it seems like that function needs to have an appropriate #[target_feature] annotation?

If #[target_feature] is really unsafe and we can't do anything about it, then I feel like that makes stabilization much easier. We can throw our hands up and say "that's UB."

I wrote that whether the unsafe intrinsics are annotated with #[target_feature] is irrelevant for the caller, not that the intrinsics do not need to be annotated. This is however wrong because the caller must care due to the vector types.

If I'm passing around SIMD vector types to multiple distinct vendor intrinsics in a single function, then it seems like that function needs to have an appropriate #[target_feature] annotation?

This is what I was missing, it would make stdsimd pretty useless otherwise. For the bmi instructions this is not necessary but for SIMD we really need to stabilize #[target_feature] due to the vector types.

We can throw our hands up and say "that's UB."

That's the TL;DR: of the RFC.

Wait a second, the following should work just fine:

unsafe fn foo() { // default target
  let a: f32x4;  // generated for the default target
  unsafe { std::bar_avx(a);  } // probably requires an implict conversion due to ABI differences
} 

It worries me a bit that anything that deals with the portable vector types like f32x4 might need to be annotated with #[target_feature]. I don't have a better idea, but it worries me a bit.

1 Like

Code that uses vector types doesn't need to be annotated with #[target_feature], you can still use --target-feature= to select a feature globally, and use #![cfg!(target_feature)] to provide different implementations.

So even without stable #[target_feature], run-time dispatch would be possible, one would just need to split the code for each platform in different crates, compile them with --target-feature=, link them, and then dispatch to the appropriate crate using run-time feature detection. Extremely cumbersome, but possible.

I'm a bit afraid of having to mark all the functions with:

#[ifunc("default", "sse2", "sse3", "sse4", "avx", "avx2")]  // or #[ifunc("all_x86")]
fn foo() { /* .. */ }

to be able to write portable vector code that works as fast as possible on each architecture. I am also a bit afraid of compiling one crate enabling some feature, but that requires some other crate which was compiled with a different feature, and having those link. That is creating potential hard-to-debug ABI performance issues on interfaces.

Random high-level comments from an outsider: (Sorry, I couldnā€™t resist. I hope this is useful feedback.)

  • I find it fairly confusing that target_feature sometimes tests whether the feature is available at compile-time, and sometimes changes the way a function is compiled. I.e., #[cfg(target_feature)] vs #[target_feature]. Is it really necessary for both to have the same name? I see there has been some discussion about this (unconditional_target_feature), but the RFC should mention why using the same name is still considered the best way forward. (As far as bikeshedding is concerned, Iā€™d call them has_target_feature and use_target_feature, but whatever :wink: )
  • I also agree with the comments saying that there should be a way to test for features at run-time that uses the same syntax as the compile-time tests. The sample code further up that used Cpuid relies on being able to figure out which exact Cpuid flags correspond to which target features. Using the same ā€œspecification languageā€ is just much better API design. #[target_feature] is fairly useless without run-time detection, so I think this is totally in scope for this RFC.
2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.