- Feature Name: (fill me in with a unique ident, zst-scoped-safe-fncast)
- Start Date: (fill me in with today's date, 2026-05-22)
- RFC PR: rust-lang/rfcs#0000
- Rust Issue: rust-lang/rust#0000
Summary
Casting a target-feature-enabled function into a safe, callable Rust-ABI
function pointer currently requires an intermediate unsafe cast step. Make this step safe in scopes where a token value certifies that the appropriate CPU feature is available in the running program. This is an alternative to, or an augmentation of, struct_target_feature.
Motivation
In image-related crates we write SIMD implementations of many core routines
to guarantee a level of performance on matching targets. This happens via
runtime dispatch: we query the feature set once, and from that point onwards an
appropriate function is used. Crucially, detection does not happen at the call
sites of the function. Those sit in loops tight enough that redoing feature detection on every call would be a problematic overhead. Instead, a custom fn-pointer table is built at startup and
calls go indirectly through its function pointers.
While building the table it is necessary to make functions with required target
feature sets (tagged with #[target_feature(enable = …)]) compatible with
other functions. We do not use a trampoline (for performance reasons) but instead restrict
ourselves to signatures that do not mention any SIMD-specific types and
so are ABI compatible. This allows function pointers of different feature sets
to be stored into the same fn-pointer table field. The only requirement this
imposes on us is that each function is scoped such that there is no expensive
spill of any SIMD value from another block to pass them through a more
primitive ABI.
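As a rough illustration of the status quo described above, the following sketch builds such a table once and stores a feature-gated function next to a scalar fallback. The names `Kernels`, `SumFn`, `sum_scalar`, `sum_sse41` and `kernels` are hypothetical, the SSE4.1 body is only a placeholder for real SIMD code, and the function is declared `unsafe fn` so that the sketch also compiles on toolchains that predate safe target-feature functions.

```rust
use std::sync::OnceLock;

/// Shared signature of every implementation. No SIMD types appear in it,
/// so all variants fit into the same table field.
type SumFn = fn(&[u8]) -> u32;

/// Hypothetical dispatch table with a single entry.
struct Kernels {
    sum: SumFn,
}

/// Portable fallback.
fn sum_scalar(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Placeholder for a real SIMD kernel; only the attribute matters here.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn sum_sse41(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

fn build_kernels() -> Kernels {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("sse4.1") {
            let raw = sum_sse41 as unsafe fn(&[u8]) -> u32;
            // SAFETY: SSE4.1 was detected just above, which is the only
            // precondition of `sum_sse41`. Both pointer types use the Rust
            // ABI and have identical parameter and return types.
            let sum = unsafe { std::mem::transmute::<unsafe fn(&[u8]) -> u32, SumFn>(raw) };
            return Kernels { sum };
        }
    }
    Kernels { sum: sum_scalar }
}

static KERNELS: OnceLock<Kernels> = OnceLock::new();

/// Feature detection runs once; later calls only pay for an indirect call.
fn kernels() -> &'static Kernels {
    KERNELS.get_or_init(build_kernels)
}
```

A call site then reads `(kernels().sum)(data)`. The `transmute` in `build_kernels` is the unsafe step that this RFC aims to make unnecessary.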
With safe arch intrinsics as well as the use of helper crates constraining
unsafe intrinsics to a safe interface, some of these can be entirely safe
implementations. However, the function cast is very hard to properly abstract
away. There is no way to query and represent the feature-enabled set from a
type or to represent the gained information in a const way that would allow
verifying and discharging the obligations at the conversion site. With macros
it is possible to wrap those or create a conversion-utility at the definition
site of such functions but this still suffers defects in ABI-compatibility
detection and does not allow more advanced selection patterns (e.g. choosing
the implementation among the feature-available function set based on runtime
performance information).
Guide-level explanation
The instruction-set-specific modules in core::arch define a set of zero-sized
types. Each one is a token verifying the availability of a runtime feature of
the currently running program. These have a non-const fn constructor that
returns an Option<Self>, returning Some(_) if the feature was enabled at
compile time or if it is detected at runtime, as by is_X_feature_detected.
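To make the shape of such a token concrete, here is a minimal user-space sketch. It is only a stand-in: the real types would live in core::arch and be known to the compiler, and the private field `_private` is an assumption of this sketch used to keep construction confined to `new`.

```rust
/// Hypothetical stand-in for the proposed `core::arch::HaveSse41`.
/// The private field prevents construction outside of `new`.
#[derive(Clone, Copy, Debug)]
pub struct HaveSse41 {
    _private: (),
}

impl HaveSse41 {
    /// Returns `Some` only if SSE4.1 is available in the running program.
    /// `is_x86_feature_detected!` also reports features that were enabled
    /// statically at compile time.
    pub fn new() -> Option<Self> {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        {
            if std::is_x86_feature_detected!("sse4.1") {
                return Some(HaveSse41 { _private: () });
            }
        }
        None
    }
}
```

Because the type is zero-sized, passing the token around costs nothing at runtime; only the `Option` returned by the constructor carries a discriminant.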
In scopes where a local of such a zero-sized type is in scope and definitely initialized, the cast of a function item into a function pointer is safe for a subset of simple function signatures:
    #[target_feature(enable = "sse4.1")]
    fn crazy() {
        …
    }

    fn boring_scalar_implementation() {
        …
    }

    // The chosen implementation is returned as a plain, safe fn pointer.
    fn choose_implementation(has_sse: Option<core::arch::HaveSse41>) -> fn() {
        if let Some(_value) = has_sse {
            // Safe! The token `_value` is in scope and initialized here.
            crazy as fn()
        } else {
            boring_scalar_implementation as fn()
        }
    }
Allowed signatures are checked against the target fn-pointer type, which is
required to be concrete enough that it can be matched against the zero-sized
function item type. The target function pointer is required to use the Rust ABI.
Allowed arguments and return type are (a subset of those documented in fn's
ABI section):
- Each primitive type and reference with itself and any super-type of itself.
- References and Box<T> are compatible with NonNull and raw pointers of the same metadata but not the other way around.
- NonZero* in an Option matched with the relevant simple type and the converse.
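The pairings in this list can be tried out today, with the cast spelled as an explicit `transmute`. The sketch below is an illustration under the existing fn-pointer ABI-compatibility rules, not the proposed mechanism; the helper names `count_ones`, `is_null`, `as_option_nonzero`, `as_reference` and `layouts_match` are hypothetical. In both casts every value the caller can pass is also a valid value of the callee's parameter type.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU32;
use std::ptr::NonNull;

fn count_ones(x: u32) -> u32 {
    x.count_ones()
}

fn is_null(p: *const u32) -> bool {
    p.is_null()
}

/// `Option<NonZeroU32>` paired with `u32`: `None` arrives as 0.
fn as_option_nonzero() -> fn(Option<NonZeroU32>) -> u32 {
    // SAFETY: `Option<NonZeroU32>` is documented to be compatible with `u32`,
    // and every `Option<NonZeroU32>` value is a valid `u32`.
    unsafe {
        transmute::<fn(u32) -> u32, fn(Option<NonZeroU32>) -> u32>(count_ones as fn(u32) -> u32)
    }
}

/// A reference passed where the function expects a raw pointer.
/// The opposite direction would let callers pass null as a reference.
fn as_reference() -> fn(&u32) -> bool {
    // SAFETY: `&u32` and `*const u32` are ABI-compatible, and every
    // reference is a valid raw pointer.
    unsafe {
        transmute::<fn(*const u32) -> bool, fn(&u32) -> bool>(is_null as fn(*const u32) -> bool)
    }
}

/// The layout facts that the pairings above rely on.
fn layouts_match() -> bool {
    size_of::<Option<NonZeroU32>>() == size_of::<u32>()
        && size_of::<&u32>() == size_of::<*const u32>()
        && size_of::<Box<u32>>() == size_of::<NonNull<u32>>()
}
```

Under the proposal the compiler would perform this check itself, so neither the `transmute` nor the SAFETY comments would be needed at the cast site.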
Reference-level explanation
When a cast from a function item into a function pointer is attempted, query the local
typing context before determining whether an unsafe block is required. When
a local (including a function parameter) in the scope is unified with the
corresponding language marker type, all dominated basic blocks are
augmented with additional information in their typing context. Different
features of the same architecture unify.
The important difference from plain ABI compatibility is that we assert soundness for the values passed, not only compatibility of the calling convention. We additionally ensure that every argument value the caller can pass is also correct for the callee's parameter type (validity and safety invariants). Consequently, distinct zero-sized, align-1 types are not compatible with one another, other than via a subtyping relationship.
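The zero-sized case shows why ABI compatibility alone is not enough. In the sketch below, two hypothetical token types `HaveSse41` and `HaveAvx2` are both zero-sized with alignment 1 and therefore ABI-compatible, yet a cast between signatures that differ only in the token would turn a proof of one feature into a proof of another. The function names are illustrative only.

```rust
use std::mem::{align_of, size_of, transmute};

/// Hypothetical token types. Each one carries a different safety invariant.
#[derive(Clone, Copy)]
pub struct HaveSse41(());
#[derive(Clone, Copy)]
pub struct HaveAvx2(());

/// Relies on the caller having proven AVX2 availability.
fn requires_avx2(_proof: HaveAvx2) -> &'static str {
    "avx2 path"
}

/// Compiles today and is fine at the ABI level, but it breaks the safety
/// invariant: an SSE4.1 token is now accepted in place of an AVX2 token.
fn forged() -> fn(HaveSse41) -> &'static str {
    unsafe {
        transmute::<fn(HaveAvx2) -> &'static str, fn(HaveSse41) -> &'static str>(
            requires_avx2 as fn(HaveAvx2) -> &'static str,
        )
    }
}

/// Calls the AVX2-only function while holding only an SSE4.1 token.
fn demo() -> &'static str {
    forged()(HaveSse41(()))
}

/// Both tokens have size 0 and alignment 1.
fn tokens_are_zst_align_1() -> bool {
    size_of::<HaveSse41>() == 0
        && size_of::<HaveAvx2>() == 0
        && align_of::<HaveSse41>() == 1
        && align_of::<HaveAvx2>() == 1
}
```

A safe cast therefore has to reject this pairing even though the calling convention would accept it.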
Drawbacks
Code verification passes become more complicated and this check necessarily moves after type unification. Some behavior may be surprising to users if the pass's notion of dominated basic blocks or its unification behavior disagrees with the surface-level Rust code. We cannot express this behavior behind a generic type despite the mechanism using the type system.
Rationale and alternatives
The inability for generics has precedent. In a union, all fields are required
to statically denote their lack of Drop. For concrete types this can be
determined but for generics there are restrictive rules: ManuallyDrop
specifically can be used as the top-level type wrapper for fields that would
otherwise not be provable.
Alternatively, do not do this: function pointer casts remain unsafe, SIMD
libraries require a little bit of unsafe and cannot annotate themselves with
forbid(unsafe_code).
Or, only do this on call sites. To avoid the biggest performance problems, provide a dynamic representation of the CPU feature set which represents the valid feature set and switch on this. To avoid all unsafety we still need some amount of language integration: this type must dispatch into a set of given functions based on its value. There is no type information on fn-ptr to do this, so we also still need concrete ZSTs and magic to query them despite the lack of a trait bound.
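A library-level approximation of this call-site alternative might look as follows. It is a sketch under assumptions: `FeatureSet`, its private `sse41` field, and the two `sum_*` functions are hypothetical, and the SSE4.1 body is a placeholder. Note that the dispatch still contains an unsafe block, which is the part that would need language integration to remove.

```rust
/// Hypothetical dynamic representation of the detected feature set.
/// The field is private so that values only come from `detect`.
#[derive(Clone, Copy, Debug)]
pub struct FeatureSet {
    sse41: bool,
}

impl FeatureSet {
    pub fn detect() -> Self {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        let sse41 = std::is_x86_feature_detected!("sse4.1");
        #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
        let sse41 = false;
        FeatureSet { sse41 }
    }
}

fn sum_scalar(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Placeholder for a real SIMD kernel.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn sum_sse41(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Switches on the feature set at the call site instead of using a table.
pub fn sum(features: FeatureSet, data: &[u8]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if features.sse41 {
            // SAFETY: `sse41` is only ever set by `detect` after a
            // successful runtime check.
            return unsafe { sum_sse41(data) };
        }
    }
    sum_scalar(data)
}
```

The soundness of `sum` rests on the privacy of the `sse41` field, an invariant that the compiler cannot connect to the target feature of `sum_sse41`.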
Prior art
Apart from the union precedent mentioned above:
struct_target_feature (RFC #3525) contains the ZST types but not a builtin dispatch mechanism. Instead, the function signatures are explicitly incompatible due to differing struct / ZST parameters.
Instead of overloading the existing function pointer cast, we could have a macro which only allows the safe pointer casts. This macro would consume the token and probably be implemented as a compiler internal. (A method on the type would require magic bounds and instantiations; I'm not sure that would be a good idea.)
Unresolved questions
Which ZST marker types should be introduced, and where should they live?
We could also write the types as FeatureAssertion<T> for various instantiations of T. This might simplify queries and be robust against future directions. Looking at the history and internals of NonZero, this may not be crucial to decide right away.
Should there be more constructors on the ZST marker types?
Should there be an unsafe constructor with documented invariants? Currently, there are a number of alternative CPU-feature-detection implementations in the crates.io ecosystem. It seems reasonable to expect that these would otherwise transmute the ZST values out of thin air anyway so as to match the standard library integration. This is just as soundness-critical but not documented from std's side.
Future possibilities
When we gain the ability to represent target features on function pointers, not only zero-sized function types, we should extend the cast ability to these. Also we might be able to make the token types work under generic code if there were function-pointer metadata that matches some trait-bound on a feature detection ZST.