Pre-RFC: Struct target feature

sarah · November 29, 2022, 10:55am

sorry, i'm not sure i fully understand all your questions. so let me know if there's something i forgot to address.

you're right, i meant Copy instead of Clone. i'm not sure if that specific ABI is desirable, and that discussion may have to be brought up with the compiler team. the idea i had in mind for using this safely is a utility similar to std::mem::transmute (which in this case is the as cast) that checks at compile time that the source and target function pointers have the same ABI, and that they differ only by ZST+Copy parameters.

the cast can be misused, for example by casting a function pointer that takes a token to a function pointer that doesn't take that token, and then calling it without making sure that we already have an instance of that token, as this will have the effect of creating a token out of thin air without first checking that it's possible. that's the part that makes it unsafe.

as long as we have at least one instance of every token that is removed, the cast is safe, which is what makes my_fn_ptr in this example sound:

fn my_fn<S: Copy>(simd: S, v: &mut [f64]) {}
fn my_fn_ptr<S: Copy>(simd: S) -> fn(&mut [f64]) {
    assert_eq!(core::mem::size_of::<S>(), 0);
    // SAFETY:
    // S is ZST + Copy, and we have an instance of S, so the cast is sound
    unsafe { my_fn::<S> as _ }
}
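
for reference, a minimal sketch of the kind of token this relies on. the name Avx2 and the detect constructor are hypothetical, not from any existing crate. the point is only that the type is zero-sized and Copy, and that the sole way to get an instance is a successful runtime check:

```rust
/// Hypothetical token type: zero-sized, Copy, and meant to be created
/// only through `detect`. In a real crate the private field would keep
/// other modules from constructing it directly.
#[derive(Clone, Copy, Debug)]
pub struct Avx2(());

impl Avx2 {
    /// Returns a token only if the running CPU reports AVX2.
    pub fn detect() -> Option<Self> {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        {
            if std::arch::is_x86_feature_detected!("avx2") {
                return Some(Avx2(()));
            }
        }
        // Other architectures, or x86 CPUs without AVX2.
        None
    }
}
```

holding an Avx2 value is then the proof that the check succeeded, which is what the cast above leans on.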

CAD97 · November 29, 2022, 3:00pm

It's possible to write the cast today with

fn my_fn_ptr<S: 'static + Copy>(simd: S) -> fn(&mut [f64]) {
    assert_eq!(core::mem::size_of::<S>(), 0);
    |v| {
        // SAFETY: this is logically a copy of the token
        //         provided to my_fn_ptr
        let s: S = unsafe { core::mem::zeroed() };
        my_fn(s, v)
    }
}
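
A self-contained version of the same idea, with a usage example. The Token type and the doubling body of my_fn are placeholders invented for illustration; only the shape of my_fn_ptr is taken from the snippet above.

```rust
use core::mem::size_of;

/// Placeholder zero-sized token (hypothetical; stands in for a SIMD token).
#[derive(Clone, Copy)]
struct Token;

/// Placeholder kernel: doubles every element. A real kernel would use `S`
/// to select a target-feature-specific code path.
fn my_fn<S: Copy>(_simd: S, v: &mut [f64]) {
    for x in v.iter_mut() {
        *x *= 2.0;
    }
}

fn my_fn_ptr<S: 'static + Copy>(_simd: S) -> fn(&mut [f64]) {
    assert_eq!(size_of::<S>(), 0);
    // The closure captures nothing, so it coerces to a plain fn pointer.
    |v| {
        // SAFETY: S is zero-sized, so there are no bytes to initialize,
        // and the caller handed us an instance, so this is logically a copy.
        let s: S = unsafe { core::mem::zeroed() };
        my_fn(s, v)
    }
}
```

Calling `my_fn_ptr(Token)` returns a pointer that can be stored and invoked later without the token, for example `let f = my_fn_ptr(Token); f(&mut data);`.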

sarah · November 29, 2022, 3:13pm

this technically adds the cost of a function call indirection, since my_fn often can't be inlined into the closure because it has a #[target_feature] attribute in this kind of scenario.

CAD97 · November 29, 2022, 3:22pm

If the ABIs are the same, then the lack of inlining is "just" a quality of implementation issue, and ideally we should be able to teach backends to allow inlining for unconditional calls to different target feature sets.

sarah · November 29, 2022, 3:23pm

that's a good point

197g · November 29, 2022, 8:36pm

So with most mechanisms in user space, the one thing still needed is a better bound and the compiler deducing the feature set from that generic parameter during monomorphisation? I'm pretty sure this could be experimented with in a crate to validate the API (and its benefits) against multiversion.

Sketch:

mod sealed {
    pub type FeatureSet = crate::arch::TargetFeatureSet;

    pub trait ArchDispatch {
        const Features: FeatureSet;
    }
}

pub trait PlatformFeatures: sealed::ArchDispatch + Sized + 'static {}

#[target_feature]
fn dispatched<S: PlatformFeatures>(_: S) {…}

// Expands to:

fn dispatched<S: PlatformFeatures>(_: S) {
    #[target_feature(enable = "sse3")]
    unsafe fn _dispatched_sse3() { /* body */ }
    // etc.

    match S::Features {
        SSE3 =>  unsafe { _dispatched_sse3() },
        SSE2 =>  /* etc */,
        _ =>  { /* body */ }
    }
}

Eventual inclusion in std would presumably profit from replacing the proc-macro with something that duplicates less of the function body upfront?
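
For experimenting, the expansion can also be written by hand on stable Rust. The following is a rough sketch under stated assumptions: the FeatureSet enum, the Baseline and Sse3 token types, the detect constructor and the sum kernel are all hypothetical names chosen for the example, and the trampoline through sum_sse3 is kept as-is.

```rust
/// Hypothetical feature-set description; a real crate would cover more sets.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum FeatureSet {
    Baseline,
    Sse3,
}

pub trait PlatformFeatures: Copy + 'static {
    const FEATURES: FeatureSet;
}

/// Token that is always available.
#[derive(Clone, Copy)]
pub struct Baseline;

/// Token that is only handed out after a successful runtime check.
#[derive(Clone, Copy)]
pub struct Sse3(());

impl PlatformFeatures for Baseline {
    const FEATURES: FeatureSet = FeatureSet::Baseline;
}

impl PlatformFeatures for Sse3 {
    const FEATURES: FeatureSet = FeatureSet::Sse3;
}

impl Sse3 {
    pub fn detect() -> Option<Self> {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        {
            if std::arch::is_x86_feature_detected!("sse3") {
                return Some(Sse3(()));
            }
        }
        None
    }
}

/// Shared body, inlined into each feature-specific copy.
#[inline(always)]
fn sum_body(v: &[f64]) -> f64 {
    v.iter().sum()
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse3")]
unsafe fn sum_sse3(v: &[f64]) -> f64 {
    sum_body(v)
}

pub fn sum<S: PlatformFeatures>(_token: S, v: &[f64]) -> f64 {
    // S::FEATURES is a constant, so this match is decided per monomorphization.
    match S::FEATURES {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        // SAFETY: an Sse3 token only exists if detection succeeded.
        FeatureSet::Sse3 => unsafe { sum_sse3(v) },
        _ => sum_body(v),
    }
}
```

A caller would write something like `Sse3::detect().map(|t| sum(t, &data))` and fall back to `sum(Baseline, &data)` otherwise.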

CAD97 · November 29, 2022, 9:37pm

The main benefit AIUI is

being able to pass along S to get multiversion's dispatch! implicitly, aiding inlining (which passing through the trampolines in the sketch will unfortunately inhibit); and
only needing to monomorphize the multiversions actually used, implicitly collecting those in monomorphization.

197g · November 30, 2022, 2:38pm

That's all true. Given that we have now seen that most user-visible and observable features can be written with standard constructs instead of language features, I was wondering if there's value in validating the interface first. And if we're validating the interface, then we can't (and don't need to) worry about the internal gains yet. That should be implementable as a crate even with the unfortunate trampoline overhead.

I'm also not generally sold on inlining as a major required feature, yet. A good deal of SIMDified code will do large chunks of work and shouldn't suffer too much from some constant overhead in the call paths. Some other code might be inlined anyways due to the match path actually being a constexpr choice post-monomorphization. (Hm, hasn't rustc begun to perform some code flow analysis for such situations already? Seems limited to non-generic-const for now.) Still an unfortunate strain on the compiler though.

