[Pre-RFC] Make some feature-detected function-to-fn-pointer casts safe through ZST token types

197g · May 22, 2026, 12:53pm

Feature Name: (fill me in with a unique ident, zst-scoped-safe-fncast)
Start Date: (fill me in with today's date, 2026-05-22)
RFC PR: rust-lang/rfcs#0000
Rust Issue: rust-lang/rust#0000

Summary

Casting a target-feature-enabled function into a safe, callable Rust-ABI function pointer currently requires an intermediate unsafe cast. Make this step safe in scopes where a token value certifies that the appropriate CPU feature is available in the running program. This is an alternative to, or an augmentation of, struct_target_feature.

Motivation

In image-related crates we write SIMD implementations of many core routines to guarantee a level of performance on matching targets. This happens via runtime dispatch: we query the feature set once, and from that point onwards an appropriate function is used. Crucially, detection does not happen at the call sites of the function. Those sit in loops tight enough that redoing feature detection on every call would be a problematic overhead. Instead, a custom fn-pointer table is built at startup and calls go indirectly through its entries.

While building the table it is necessary to make functions with required target feature sets (tagged with #[target_feature(enable = …)]) compatible with other functions. We do not use a trampoline (for performance reasons); instead we constrain ourselves to signatures that do not mention any SIMD-specific types and so are ABI compatible. This allows function pointers of different feature sets to be stored in the same fn-pointer table field. The only requirement this imposes on us is that each function is scoped such that no SIMD value has to be expensively spilled from another block in order to pass it through a more primitive ABI.
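For illustration, a minimal sketch of the status quo described above: a table built once and filled with pointers of one plain signature. All names here (Kernels, SumFn, sum_sse41, kernels) are hypothetical, the kernel body is a stand-in for real SIMD code, and the feature-enabled function is declared unsafe fn only so the sketch also compiles on toolchains that predate safe #[target_feature] functions. The transmute is the unsafe step this pre-RFC wants to remove.

```rust
use std::sync::OnceLock;

/// One plain signature for every implementation: no SIMD types involved.
pub type SumFn = fn(&[u32]) -> u32;

/// Hypothetical dispatch table, filled once at startup.
pub struct Kernels {
    pub sum: SumFn,
}

fn sum_scalar(xs: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

// Stand-in for a real SIMD kernel; the attribute lets the compiler use SSE4.1 here.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn sum_sse41(xs: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

fn detect_kernels() -> Kernels {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("sse4.1") {
            let raw: unsafe fn(&[u32]) -> u32 = sum_sse41;
            // SAFETY: SSE4.1 was just detected, and both pointer types share
            // the same Rust-ABI signature without any SIMD types.
            let sum = unsafe { std::mem::transmute::<unsafe fn(&[u32]) -> u32, SumFn>(raw) };
            return Kernels { sum };
        }
    }
    Kernels { sum: sum_scalar }
}

/// Detection runs once; hot loops only pay for the indirect call.
pub fn kernels() -> &'static Kernels {
    static TABLE: OnceLock<Kernels> = OnceLock::new();
    TABLE.get_or_init(detect_kernels)
}
```

A hot loop would then call (kernels().sum)(chunk) without touching feature detection again.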

With safe arch intrinsics as well as the use of helper crates constraining unsafe intrinsics to a safe interface, some of these can be entirely safe implementations. However, the function cast is very hard to properly abstract away. There is no way to query and represent the feature-enabled set from a type, or to represent the gained information in a const way that would allow verifying and discharging the obligations at the conversion site. With macros it is possible to wrap those or create a conversion utility at the definition site of such functions, but this still suffers defects in ABI-compatibility detection and does not allow more advanced selection patterns (e.g. choosing the implementation among the feature-available function set based on runtime performance information).

Guide-level explanation

The instruction-set-specific modules in core::arch define a set of zero-sized types. Each one is a token verifying the availability of a runtime feature of the currently running program. These have a non-const fn constructor that returns an Option<Self>: it returns Some(_) if the feature was enabled for compilation or if it can be detected at runtime, as with the is_X_feature_detected! macros.
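The shape of such a token can be approximated in user code today. The sketch below is only that: the type name HaveSse41 follows the example in this section, the constructor name detect is an assumption, and a user-defined type like this would of course not unlock any cast in the compiler.

```rust
/// Hypothetical user-land stand-in for the proposed core::arch token.
/// The private unit field keeps construction behind `detect`.
#[derive(Clone, Copy, Debug)]
pub struct HaveSse41(());

impl HaveSse41 {
    /// Non-const constructor: `Some` only if SSE4.1 is usable right now.
    /// `is_x86_feature_detected!` already answers `true` without a runtime
    /// check when the feature was enabled at compile time.
    pub fn detect() -> Option<Self> {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        {
            if std::arch::is_x86_feature_detected!("sse4.1") {
                return Some(HaveSse41(()));
            }
        }
        // Other architectures, or x86 without SSE4.1.
        None
    }
}
```

Because the token is zero-sized, passing it around costs nothing; an Option of it carries just the one bit of detection result.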

In scopes where a local of the zero-sized type is in scope and it is surely initialized, the cast of a function item into a function pointer is safe for a subset of simple function signatures:

#[target_feature(enable = "sse4.1")]
fn crazy() {
    …
}

fn boring_scalar_implementation() {
    …
}

fn choose_implementation(has_sse: Option<core::arch::HaveSse41>) -> fn() {
    if let Some(_value) = has_sse {
        // Safe!
        crazy as fn()
    } else {
        boring_scalar_implementation as fn()
    }
}

Allowed signatures are checked against the target fn-pointer type, which must be sufficiently concrete; it is then matched against the ZST type of the function item. The target function pointer must use the Rust ABI. Allowed argument and return types are (a subset of those documented in fn's ABI section):

Each primitive type and reference with itself and any super-type of itself.
References and Box<T> are compatible with NonNull and raw pointers of the same metadata but not the other way around.
NonZero* in an Option matched with the relevant simple type and the converse.

Reference-level explanation

When a cast from a function item into a function pointer is attempted, query the local typing context before determining whether an unsafe block is required. When a local (including a function parameter) in the scope is unified into the corresponding language marker type, all dominated basic blocks are augmented with additional information in their typing context. Different features of the same architecture unify.

The important difference from plain ABI compatibility is that we assert soundness for the values passed, not only compatibility of the ABI. So we additionally have to ensure that all passed argument values are also correct for the parameter (validity and safety invariants). Consequently, zero-sized align-1 types are not compatible with one another (other than via a subtyping relationship).
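To make the value-level requirement concrete, here is a hedged sketch of two conversions that the rules above would accept, written with today's unsafe transmute. The function names are hypothetical, and the direction shown for pointers is one reading of the rule: a function that accepts a raw pointer may be exposed as taking a reference, since every reference is a valid raw pointer, while the reverse would let invalid values reach a reference parameter.

```rust
use std::num::NonZeroU32;

fn get_or_zero(x: Option<NonZeroU32>) -> u32 {
    x.map_or(0, NonZeroU32::get)
}

fn is_null(p: *const u32) -> bool {
    p.is_null()
}

/// Option<NonZeroU32> and u32: ABI-compatible, and every value of one type
/// is a valid value of the other, so both directions are fine.
pub fn get_or_zero_plain() -> fn(u32) -> u32 {
    let f: fn(Option<NonZeroU32>) -> u32 = get_or_zero;
    // SAFETY: same ABI, and any u32 bit pattern is a valid Option<NonZeroU32>.
    unsafe { std::mem::transmute::<fn(Option<NonZeroU32>) -> u32, fn(u32) -> u32>(f) }
}

/// A raw-pointer parameter exposed as a reference parameter: callers can
/// only pass valid references, which are always valid raw pointers.
pub fn is_null_by_ref() -> fn(&u32) -> bool {
    let f: fn(*const u32) -> bool = is_null;
    // SAFETY: same ABI; the callee requires less of its argument than callers provide.
    unsafe { std::mem::transmute::<fn(*const u32) -> bool, fn(&u32) -> bool>(f) }
}
```

Turning fn(&u32) -> bool into fn(*const u32) -> bool would be ABI-compatible as well, but a caller could then pass a null or dangling pointer where the callee expects a reference, which is why that direction is excluded.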

Drawbacks

Code verification passes become more complicated, and this check necessarily moves after type unification. Some behavior may be surprising to users if the pass's notion of dominated basic blocks or its unification behavior disagrees with the surface-level Rust code. We cannot express this behavior behind a generic type despite the mechanism using the type system.

Rationale and alternatives

The inability for generics has precedent. In a union, all fields are required to statically denote their lack of Drop. For concrete types this can be determined but for generics there are restrictive rules: ManuallyDrop specifically can be used as the top-level type wrapper for fields that would otherwise not be provable.

Alternative, do not do this: function pointer casts remain unsafe, SIMD libraries require a little bit of unsafe and can not annotate themselves with forbid(unsafe_code).

Or, only do this on call sites. To avoid the biggest performance problems, provide a dynamic representation of the CPU feature set which represents the valid feature set and switch on this. To avoid all unsafety we still need some amount of language integration: this type must dispatch into a set of given functions based on its value. There is no type information on fn-ptr to do this so we also still need concrete ZSTs and magic to query it despite lack of trait bound.

Prior art

Apart from the union precedent above:

Struct target feature (RFC #3525) contains the ZST types but not a built-in dispatch mechanism. Instead, the function signatures are explicitly incompatible due to differing struct / ZST parameters.

Instead of overloading the existing function pointer cast, we could have a macro which only allows the safe pointer casts. This macro would consume the token and probably be implemented as a compiler internal. (A method on the type would require magic bounds and instantiations, I'm not sure that would be a good idea).

Unresolved questions

Which ZST marker types should be introduced, and where should they live?

We could also write the types as FeatureAssertion<T> for various instantiations of T. This might simplify query and be proof against future directions. Looking at the history and internals of NonZero this may not be crucial to decide right away.

Should there be more constructors on the ZST marker types?

Should there be an unsafe constructor with documented invariants? Currently, there are a number of alternative CPU-feature-detection implementations in the crates.io ecosystem. It seems obvious to expect that these would otherwise transmute the ZST values out of thin air anyway so as to match the standard library integration. This is just as soundness-critical but not documented from std's side.

Future possibilities

When we gain the ability to represent target features on function pointers, not only zero-sized function types, we should extend the cast ability to these. Also we might be able to make the token types work under generic code if there were function-pointer metadata that matches some trait-bound on a feature detection ZST.

ais523 · May 23, 2026, 7:31pm

A possible alternative: give the ZST tokens themselves a method that casts from function item to function pointer. I suspect this alternative doesn't help improve the number of special cases compared to the original proposal (because you need some way to be generic over function item types that require a specific set of target features to be used safely, which is the sort of thing that is hard to represent using just traits), but it would be less magical than "this value has to be in scope" and thus might be easier to understand upon seeing it in code.

The least magical version would be "functions that require target features must be given the ZST values that represent those target features as arguments", which would be simple both in terms of using it and in understanding how it works, but unfortunately I don't think it would be backwards-compatible (and it also wouldn't help with the problem of wanting to store pointers to functions with different target feature requirements in a variable of a single concrete type, which should in theory be sound if all the features in question are actually available at runtime and the calling convention doesn't change as a consequence).

197g · May 23, 2026, 7:57pm

I briefly touch on this in alternatives, so thank you for considering it further. I believe a function is going to be a considerable headache of its own. At least it does not seem reasonable to provide any useful signature that could be rendered via rustdoc, for lack of a way to denote the ABI constraints. Even if we just permit casts between the same function args and return types we still have two sets of type generics there, where one argument must be F: impl Fn(A…) -> B + Zst and the other fn(A…) -> B. That just cannot be expressed concisely by any syntax (HKTs maybe, but that seems absurdly unreadable). Plus I would not mind not having type unification happen here, just in case we do later want to permit some ABI casts after all.

The best alternative after a night's sleep, acknowledging everything against using the local's existence directly, may be the macro solution where the token type can be explicit and there is no surprise about type oddities. Plus, tokens are Copy. Syntax should be no issue (apart from painting the bikeshed):

std::fn_feature_cast!(has_sse, crazy as fn())

RalfJung · May 24, 2026, 12:16pm

I would strongly prefer that, or some other variant that explicitly requires the token to be named. Having the mere presence of an unmentioned local variable in scope unlock further operations seems way too implicit to me.

Also note that we already have something kind of similar:

#[target_feature(enable = "avx2")]
fn fun_with_avx2() {}

#[target_feature(enable = "avx2")]
fn make_fn_ptr() -> fn() {
    fun_with_avx2
}

If the current function has a target feature enabled, it can coerce other function items with that target feature to safe function pointers.

Is there a way to use this to achieve what you are looking for?

197g · May 24, 2026, 3:09pm

At least some context outside the target feature attributes needs to have the ability to create that first entry point. I was not aware you could do that cast inside the context already (so soundness of this is already promised). This does change the perspective on struct-target-feature/3525 a bit: an entry point might have a target token type in its signature but it can still be used to construct many other safe function pointers with a simple signature—so those tokens already provide a useful base case. I suppose I would still prefer another (macro) way that avoids the whole function layer for ergonomics reasons.
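A rough sketch of that entry-point pattern, assuming a toolchain where safe #[target_feature] functions are available (Rust 1.86 or later). Kernels, sum_avx2, kernels_avx2 and select_kernels are made-up names, and the kernel body is a placeholder for real SIMD code. The only unsafe left is the single call into the feature-enabled entry point; the coercions inside it are safe.

```rust
/// Hypothetical table of kernels sharing one plain, SIMD-free signature.
#[derive(Clone, Copy)]
pub struct Kernels {
    pub sum: fn(&[u32]) -> u32,
}

fn sum_scalar(xs: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

// Placeholder body; the attribute allows the compiler to use AVX2 here.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
fn sum_avx2(xs: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

/// Entry point: because this function itself enables avx2, coercing
/// `sum_avx2` to a safe fn pointer needs no unsafe block.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
fn kernels_avx2() -> Kernels {
    Kernels { sum: sum_avx2 }
}

pub fn select_kernels() -> Kernels {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 was detected just above.
            return unsafe { kernels_avx2() };
        }
    }
    Kernels { sum: sum_scalar }
}
```

With a token type in the entry point's signature, as in struct-target-feature, that last unsafe call is the part that could become safe as well.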

Maybe that should be discussed at the RFC directly though instead of as a competing one.

fintelia · May 25, 2026, 7:36pm

Doing a function pointer cast seems to me like a very direct way to do multi-version dispatch.

Would it be any easier to implement if the cast didn't use ZSTs but was just a macro? There's already is_x86_feature_detected, so the main thing would be checking the target_feature on the passed function (which I suppose probably isn't known until after macro expansion, but maybe that's solvable?)

#[target_feature(enable = "sse4.1")]
fn crazy() {
    …
}

fn boring_scalar_implementation() {
    …
}

fn choose_implementation() -> ... {
    let sse_impl: Option<_> = try_target_feature_cast!("sse4.1", crazy);

    sse_impl.unwrap_or(boring_scalar_implementation)
}

197g · May 25, 2026, 9:59pm

That way of casting is an instruction I'd really like to see in wasm so that the environment has a choice in feature support instead of any mention of simd128 requiring it for the whole module.

As a language feature, however, it would be less flexible. It works for setup function calls but is inappropriate for most runtime call dispatch. It would exhibit the same problem as the is_*_feature_detected macros themselves: each evaluation loads from an atomic (to avoid the repeated cpuid), and this causes LLVM to treat the cast as a side effect and sequence point. If you write N casts you get N loads, with no way to reuse the feature bitmask. With struct_target_feature you get to avoid that part.
