Temporary intrinsics::trait_vtable

There's currently no way in stable Rust to access a trait implementation's vtable for a generic type starting from the base type (i.e. without already having a fat pointer).

Specialization allows this with default implementations. But I wonder whether it will be stabilized in the upcoming 5 years, if ever, given the number of requirements and edge cases a stable implementation has to satisfy.

In the meantime, a stable intrinsic function, one that could be deprecated once specialization lands, would help a lot in cases where those vtables need to be accessed.

The following code works for storing drop glue in stable:

type DropFn = fn(*mut ());
const fn drop_fn<T>() -> DropFn {
    if core::mem::needs_drop::<T>() {
        |ptr: *mut ()| unsafe { core::ptr::drop_in_place(ptr as *mut T) }
    } else {
        |_: *mut ()| {}
    }
}
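A quick self-contained check that the erased drop glue really fires (the `Flag` type and `DROPPED` static are illustration-only, not part of the proposal):

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::{AtomicBool, Ordering};

type DropFn = fn(*mut ());

const fn drop_fn<T>() -> DropFn {
    if core::mem::needs_drop::<T>() {
        |ptr: *mut ()| unsafe { core::ptr::drop_in_place(ptr as *mut T) }
    } else {
        |_: *mut ()| {}
    }
}

static DROPPED: AtomicBool = AtomicBool::new(false);

struct Flag;
impl Drop for Flag {
    fn drop(&mut self) {
        DROPPED.store(true, Ordering::SeqCst);
    }
}

fn main() {
    // Erase the value behind a raw pointer, then run drop glue through it.
    let raw = Box::into_raw(Box::new(Flag)) as *mut ();
    drop_fn::<Flag>()(raw); // runs Flag::drop through the erased pointer
    assert!(DROPPED.load(Ordering::SeqCst));
    // drop_in_place only dropped the value; free the allocation without
    // dropping it a second time:
    unsafe { drop(Box::from_raw(raw as *mut ManuallyDrop<Flag>)) };
}
```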

I'd like to be able to write a similar function for Display, but it's not possible: I can't implement a default Display without specialization, nor obtain an Option<DisplayFn> to match on at runtime.
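A sketch of what such a function would have to look like (the `DisplayFn`/`display_fn` names are hypothetical; the rejected line is kept as a comment because it does not compile):

```rust
use std::fmt;

type DisplayFn = fn(*const (), &mut fmt::Formatter<'_>) -> fmt::Result;

const fn display_fn<T>() -> Option<DisplayFn> {
    // There is no `needs_display::<T>()` analogue to branch on, and without
    // a `T: Display` bound the following line is rejected:
    //
    //     Some(|ptr: *const (), f: &mut fmt::Formatter<'_>| unsafe {
    //         <T as fmt::Display>::fmt(&*(ptr as *const T), f)
    //     })
    //     // error[E0277]: `T` doesn't implement `std::fmt::Display`
    //
    None
}

fn main() {
    // All we can produce on stable is the "not known to implement it" case.
    assert!(display_fn::<i32>().is_none());
}
```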

So, while it's perfectly clear how specialization will help, it currently doesn't and I'd rather avoid making a library require nightly just to use unsound/incomplete features.

What are your opinions on introducing a function like:

/// Returns pointer to trait implementation vtable in the local compilation unit, if one exists.
const fn trait_vtable<T: Sized + 'static, Trait: ?Sized + 'static>() -> Option<*const ()>;

The 'static lifetimes can be parameterized at a later point to allow for more uses.

That would allow code like:

impl Display for MaybeDisplayWrapper {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        if let Some(vtable) = intrinsics::trait_vtable::<Self, dyn Display>() {
            let d: &dyn Display = unsafe {
                let inner_ptr: *const () = self.data;
                std::mem::transmute((inner_ptr, vtable))
            };
            d.fmt(f)
        } else {
            f.write_str("Opaque")
        }
    }
}

Such an intrinsic would have the same expressive power as specialization, I think, and thus the same soundness holes.


For what it’s worth, there is a trick to do this with macros instead of functions, relying on Rust’s method resolution rules: the `impls` crate.

I think it wouldn't, because it doesn't deal with lifetimes, selecting the correct impl, etc.; it only returns a location in the binary. I'm not sure how that location would be affected by LTO and such, but there are ways of preventing it from being optimized out once trait_vtable uses it.

That crate only provides the checking part of the problem (i.e. core::mem::needs_drop::<T>()). There's no (unsafe) way to tell the compiler "I know T implements Display, create a fat pointer to dyn Display" because there's no way to get the vtable part of the pointer.

If you tried <T as Display>::fmt right after checking, the type system would complain that T doesn't have a Display bound.

Note, I updated the OP to include some examples.

It’s a macro based on the static type, so

if impls!(T: Display) {
  Some(&value as &dyn Display)
} else {
  None // or whatever
}

is doable. I’ll agree it’s not convenient though.

It's not; the compiler returns an error for code like:

let debug_result = if impls!(T: Debug) {
    format!("{:?}", (&value as &dyn Debug))
} else {
    "Opaque".to_string()
};

As I've said, impls! lowers knowledge of a type's characteristics from the type system into const context; there's no way to move it in the other direction at the moment.

Also, switching the implementation with a where clause based on that information, via a Check<{impls!(T: Display)}>: True trick, doesn't work, because the compiler thinks both True and False can be satisfied under certain conditions or in the presence of other crates.

During codegen the intrinsic needs to know if the trait is implemented. At that point all lifetimes have been erased, so for any trait conditionally implemented for certain lifetimes codegen can't know if the trait is implemented or not.
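A small example of a lifetime-conditional impl (illustrative names): whether the bound holds can depend on a lifetime relationship that no longer exists at codegen time:

```rust
trait Tricky {}

struct S<'a, 'b>(&'a u8, &'b u8);

// Implemented only when the two lifetimes coincide.
impl<'a> Tricky for S<'a, 'a> {}

fn is_tricky<T: Tricky>(_: &T) -> bool {
    true
}

fn main() {
    let x = 0u8;
    // Accepted: both fields borrow `x`, so 'a == 'b can be chosen.
    assert!(is_tricky(&S(&x, &x)));
    // A hypothetical trait_vtable::<S<'a, 'b>, dyn Tricky>() could not be
    // answered during codegen: whether `S<'a, 'b>: Tricky` holds depends
    // on the already-erased relationship between 'a and 'b.
}
```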


That's why I added the example which uses dyn Display without any lifetimes. Having this only work for 'static types and implementations would cover 90% of use cases. I only need it for traits like std::ops, Debug, Display and Hash, where I know/expect T to be 'static.

To clarify, I'd like to use something like this to provide better Debug/Display information if possible from a crate like mlua (this time). So the code that's going to rely on these implementations is never going to outlive the parent scope that's using it (i.e. its types).

I'm fine with the intrinsic panicking when called on traits like:

intrinsics::trait_vtable::<Self, dyn MyTrait<'a>>()

because that's a very niche use case.

Dereferencing a vtable pointer is already unsafe, so the consumer has to know whether the vtable is still loaded when they try to use it. The type using the vtable is erased in my code, so I know my drop_fn example will fail if the impl Drop is unloaded (the same problem). But that's not going to happen.

I'll add a 'static bound to T in the example.

A function which works for generic T which lacks any trait bound and gets a trait vtable if present has all of the exact same issues as specialization does. Restricting to T: 'static helps, but still amounts to being specialization, and is thus unlikely to be made available separately from actual specialization work.

However, I do think that something along the lines of fn try_specialize<Dyn: ?Sized + 'static, T: 'static>(val: &T) -> Option<&Dyn> is a reasonable first step into providing basic specialization capabilities.

Just to note, in this specific case, it's proper to just always call drop_in_place::<T> irrespective of needs_drop::<T>. From the needs_drop docs:

Note that drop_in_place already performs this [needs_drop] check, so if your workload can be reduced to some small number of drop_in_place calls, using this is unnecessary. In particular note that you can drop_in_place a slice, and that will do a single needs_drop check for all the values.

Generally, you should only use needs_drop to skip work which is not drop_in_place calls.
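Under that advice, the earlier drop_fn can lose its branch entirely (a sketch; the box allocations are deliberately leaked for brevity):

```rust
// Variant of drop_fn without the needs_drop branch: drop_in_place already
// compiles down to a no-op for types without drop glue.
type DropFn = fn(*mut ());

const fn drop_fn<T>() -> DropFn {
    |ptr: *mut ()| unsafe { core::ptr::drop_in_place(ptr as *mut T) }
}

fn main() {
    // A type with drop glue:
    let s = Box::into_raw(Box::new(String::from("hello")));
    drop_fn::<String>()(s as *mut ());
    // A type without drop glue; the same call is a no-op:
    let n = Box::into_raw(Box::new(7u8));
    drop_fn::<u8>()(n as *mut ());
}
```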

It's certainly far from convenient, but it's possible to do using autoderef specialization. In its full (but nonconst) generality, dispatching on type name, it'd be roughly:

use core::any::type_name_of_val;
use core::fmt::{self, Display};
use core::mem::transmute;
pub use core::marker::PhantomData; // re-exported so the macro's $crate:: paths resolve

pub type FmtFn<T: ?Sized> = fn(&T, &mut fmt::Formatter) -> fmt::Result;
pub type PolyFmtFn = unsafe fn(*const (), &mut fmt::Formatter) -> fmt::Result;

// NB: CFI doesn't like reinterpret-cast-by-ABI
#[cfg(not(sanitize = "cfi"))]
pub fn poly_fmt<T: Sized>(f: FmtFn<T>) -> PolyFmtFn {
    unsafe { transmute(f) }
}

#[macro_export]
macro_rules! fmt_fn {
    ($T:ty) => {{
        use $crate::{__FmtFn_Display as _, __FmtFn_TypeName as _};
        // two refs for two specialization levels
        (&&$crate::__Dispatch::<$T>($crate::PhantomData)).__fmt_fn()
    }};
}

pub struct __Dispatch<T: ?Sized>(PhantomData<T>);

pub trait __FmtFn_Display {
    type This: ?Sized;
    fn __fmt_fn(&self) -> FmtFn<Self::This>;
}

// level 1, tried first
impl<T: ?Sized> __FmtFn_Display for &__Dispatch<T>
where
    T: Display,
{
    type This = T;
    fn __fmt_fn(&self) -> FmtFn<T> {
        <T as Display>::fmt
    }
}

pub trait __FmtFn_TypeName {
    type This: ?Sized;
    fn __fmt_fn(&self) -> FmtFn<Self::This>;
}

// level 0, tried last
impl<T: ?Sized> __FmtFn_TypeName for __Dispatch<T> {
    type This = T;
    fn __fmt_fn(&self) -> FmtFn<T> {
        |this, f| {
            f.debug_struct(type_name_of_val(this))
                .finish_non_exhaustive()
        }
    }
}

The limitation is of course that this is still macro specialization; it only tells you what the given type is known to implement in the invoking scope and as such (by design) cannot see "through" generic type binders.
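Condensed into a single runnable file (the macro and `$crate` plumbing dropped; `Dispatch`/`describe` are illustration-only names):

```rust
use std::any::type_name;
use std::fmt::Display;
use std::marker::PhantomData;

pub struct Dispatch<T: ?Sized>(pub PhantomData<T>);

// Level 1 (tried first): applies only when `T: Display`.
pub trait ViaDisplay {
    fn describe(&self) -> String;
}
impl<T: Display> ViaDisplay for &Dispatch<T> {
    fn describe(&self) -> String {
        format!("Display impl found for {}", type_name::<T>())
    }
}

// Level 0 (fallback): applies to every `T`.
pub trait ViaTypeName {
    fn describe(&self) -> String;
}
impl<T: ?Sized> ViaTypeName for Dispatch<T> {
    fn describe(&self) -> String {
        format!("opaque {}", type_name::<T>())
    }
}

struct NotDisplay;

fn main() {
    // Two `&`s give method resolution two specialization levels to try:
    // it probes `&Dispatch<T>` (ViaDisplay) before `Dispatch<T>` (ViaTypeName).
    assert!((&&Dispatch::<i32>(PhantomData)).describe().starts_with("Display"));
    assert!((&&Dispatch::<NotDisplay>(PhantomData)).describe().starts_with("opaque"));
}
```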


How so?

The issues arise when trying to select the correct implementation out of multiple overlapping implementations, specifically when they have different lifetimes, which isn't possible without specialization. The proposed intrinsic only queries zero-or-one implementation which the compiler already does for discrete types.

An unbounded type parameter can implement any trait, but when the compiler generates different instantiations based on generic arguments, it could infer the value of the intrinsic from the requested signature. Statements can't affect the generic types inferred for a fn's signature (calls do), so the intrinsic should be able to rely on them (AFAIK).

So when the compiler decides it needs to generate a new instance of a function for a call, it can inline None or Some(vtable) in place of the intrinsic, based on the argument types in the generated signature.

With specialization the 'static bound could be lifted and then intrinsic would be able to select the correct impl out of many overlapping ones.

Thank you for that input.

Right, but I have something like:

struct ErasedType {
    data: *mut (),
    drop_fn: DropFn,
    display_fn: Option<*const ()>,
}

impl ErasedType {
    fn new<T>(value: T) -> Self {
        let data = Box::leak(Box::new(value)) as *mut T;
        Self {
            data: data as *mut (),
            drop_fn: drop_fn::<T>(),
            display_fn: intrinsics::trait_vtable::<T, dyn Display>(),
        }
    }
}

So by the time I need to use display_fn, type information from T is long gone.

I mean, in this case I could produce the string in advance, since the data and its display output won't change. But as soon as I assume they can change, I'm prevented from showing useful debug/display information.

No, the fundamental issue is that in order to determine which specialized impl applies the compiler has to check whether some bounds hold, at codegen time. At that point however lifetimes are already erased, and that makes this impossible to do reliably.

This blogpost has more details on this.

The situation here is the same (if you omit the 'static requirement), because it also requires determining whether some bounds hold at codegen time.


The "solution" is still straightforward: instead of ErasedType::new(value), create instances with a macro, e.g. make_erased!(value). You don't need to name the type for autoderef specialization either — just do the dispatch on __Dispatch(&value) instead of __Dispatch(PhantomData::<Type>). That's the basic way of doing it, even; the indirection through the __Dispatch type is just for handling some (common) edge cases better.

This is still far from pretty, as the requirement to use macros goes the whole way up to wherever the type isn't generic, but the point is that it's possible.

There is no functional difference between using specialization, e.g.

impl<T> Display for Type<T> {
    default fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct(type_name_of_val(&self.0))
            .finish_non_exhaustive()
    }
}

override impl<T: Display> Display for Type<T> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.0.fmt(f)
    }
}

and using procedural impl querying, e.g.

impl<T> Display for Type<T> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        if let Some(this) = spec_as_dyn::<_, dyn Display>(&self.0) {
            this.fmt(f)
        } else {
            f.debug_struct(type_name_of_val(&self.0))
                .finish_non_exhaustive()
        }
    }
}

because they accomplish the exact same thing — selecting what codepath to take based on whether a bound is satisfied.

The issue with specialization is that impl selection (of which "the intrinsic does something different based on revealed type" is a version) fundamentally cannot differ based on lifetimes; Rust's generics get instantiated polymorphically quantified over all possibly parametrizing lifetimes, and this is a necessary feature of the lifetime model.

And since impls can be conditional on relationships between lifetimes, generic code must not be able to discover arbitrary bounds that can potentially have their impls be lifetime generic.

Querying zero-or-one is no different from querying general-or-specific. I suspect you do have a basic understanding of the issue in "direct" specialization on lifetimes, but the sneaky part being missed is that lifetime specialization can arise from any and every trait bound, even if no lifetimes are visibly involved.

General-or-specific is a zero-or-one query, FWIW; it's “if (has specific) { use specific } else { use general }”.


That the proposed intrinsic functions using dynamic dispatch and not further monomorphization makes the implementation more straightforward, but it does not do anything to address the real issues with Rust's specialization story.

To repeat, I do think that the 'static-bound function that casts &T -> Option<&Dyn> is a good API for MVP specialization. But it's still specialization, and trying to argue otherwise only serves to obfuscate things.

Also, note:

The (inner_ptr, vtable) transmute isn't allowed, formally speaking; it relies on unstable layout implementation details which are not guaranteed to stay as they currently are. You want to be using DynMetadata and ptr::from_raw_parts if you're using unstable functionality anyway, and not just committing code crimes to work on stable.
