Make PhantomData dispatchable to write static methods on dyn-traits

There's an open issue for arbitrary receiver types. The overarching goal is allowing more general dyn-objects. One prominent extension is static methods, for instance as conceptualized here. On a language level, one huge problem is that there's no way to refer to the vtable, the dyn-object part, without also referring to a valid object through a reference, whereas a raw pointer (*const dyn Tr) need not be valid in any way. We can't fully invent the 'static dyn object' from thin air either, at least not without seemingly complex rules around the places where it should be allowed. So let's not; let's reuse as many existing semantics as we can.

Can we use PhantomData<T> as a quasi workaround? This would allow using plain references, and it is also a simple replacement for the need for a valid instance of the underlying T that's being dynified.

use core::marker::PhantomData;

trait StaticName {
    fn get_constant(self: PhantomData<Self>) -> &'static str;
}

struct Foobar;

impl StaticName for Foobar {
    fn get_constant(self: PhantomData<Self>) -> &'static str {
        "foobar"
    }
}

fn main() {
    let v: PhantomData<Foobar> = PhantomData;
    assert_eq!(v.get_constant(), "foobar");
    let by_ref: &PhantomData<dyn StaticName> = &v;
    assert_eq!(by_ref.get_constant(), "foobar");
}

This should be a somewhat straightforward change, with the only obstacle being to stabilize PhantomData<Self> as an allowed receiver type. It would be much simpler than arbitrary-self-types but would still enable quite a few of its possibilities.
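For comparison, here is a rough sketch of how close stable Rust gets today without the proposed receiver type. It assumes a second, dyn-compatible trait implemented for the marker type; the names StaticNameDyn and get_constant_dyn are made up for this illustration and are not part of the proposal.

```rust
use core::marker::PhantomData;

// The static method has no receiver, so it is kept out of the vtable.
trait StaticName {
    fn get_constant() -> &'static str
    where
        Self: Sized;
}

// Hypothetical companion trait: dyn-compatible, implemented for the marker.
trait StaticNameDyn {
    fn get_constant_dyn(&self) -> &'static str;
}

impl<T: StaticName> StaticNameDyn for PhantomData<T> {
    fn get_constant_dyn(&self) -> &'static str {
        T::get_constant()
    }
}

struct Foobar;

impl StaticName for Foobar {
    fn get_constant() -> &'static str {
        "foobar"
    }
}

fn main() {
    // No value of `Foobar` is ever constructed.
    let v: PhantomData<Foobar> = PhantomData;
    assert_eq!(v.get_constant_dyn(), "foobar");

    // The vtable here belongs to `PhantomData<Foobar>`, not to `Foobar`.
    let by_ref: &dyn StaticNameDyn = &v;
    assert_eq!(by_ref.get_constant_dyn(), "foobar");
}
```

The cost is the extra trait and the fact that the erased type is dyn StaticNameDyn rather than PhantomData<dyn StaticName>, which is exactly the ergonomic gap the proposal is about.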

Does anyone have helpful tips for implementing this in the compiler?


No, &PhantomData<dyn Trait> carries no metadata. (Nor does PhantomData<dyn Trait>.)

use core::any::Any;
use core::marker::PhantomData;
use core::mem::size_of;

fn main() {
    dbg!(
        size_of::<PhantomData<()>>(),
        size_of::<PhantomData<dyn Any>>(),
        size_of::<&PhantomData<()>>(),
        size_of::<&PhantomData<dyn Any>>(),
    );
}


[src/main.rs:6] size_of::<PhantomData<()>>() = 0
[src/main.rs:6] size_of::<PhantomData<dyn Any>>() = 0
[src/main.rs:6] size_of::<&PhantomData<()>>() = 8
[src/main.rs:6] size_of::<&PhantomData<dyn Any>>() = 8
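For contrast, a small sketch of what a pointer that does carry metadata looks like. It assumes the layout used by current compilers, where a fat pointer is two machine words; that layout is not a documented guarantee for every case.

```rust
use core::any::Any;
use core::marker::PhantomData;
use core::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // Thin pointer: only an address, no room for a vtable pointer.
    assert_eq!(size_of::<&PhantomData<dyn Any>>(), word);

    // Fat pointer to a trait object: address plus vtable pointer.
    assert_eq!(size_of::<&dyn Any>(), 2 * word);

    // Fat pointer to a slice: address plus length.
    assert_eq!(size_of::<&[u8]>(), 2 * word);
}
```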

A TypedMetadata<T>-style struct (see link below for what I mean by that) however can technically be made to support this (which is something I've mentioned in that thread too):

As long as it has a trivial constructor, there's technically no need to conflate it with PhantomData. One can apparently not even unsize that structure, which I should have tried out before. Thanks for the pointer, this rules out this particular construct. (Although it is definitely unexpected to find a sized structure for an unsized type parameter.)

That said, the TypedMetadata<T> would not be right either, at least not in the sense that I had initially imagined. Part of the reused semantics would be the pointer unsizing, which hoists the metadata into the fat-pointer part. But unsizing the 🚲<T> behind a reference does not work, as that transformation changes the representation. One could consider coercing by value here, but that's also a new operation that would then have to be added.

On that note, it would also be interesting to investigate whether it'd be legal to call those methods from a &dyn _ directly if the self type used references in the first place. (Since this resolves the motivation part, I'll invent a name for bikeshedding.)

// A zero-sized type, not containing a value of the indicated type.
#[lang_item]
struct Static<T>(());

// Compiler generated, in contrast to PhantomData.
// impl core::marker::Unsize<Static<_>> for Static<T> { … }

trait StaticName {
    fn get_constant(self: &Static<Self>) -> &'static str;
}

struct Foobar;

impl StaticName for Foobar {
    fn get_constant(self: &Static<Self>) -> &'static str {
        "foobar"
    }
}

With usage as above. Now, would it even be allowable to call get_constant on a &dyn StaticName? Just invent a zero-sized allocation of Static<_> in front of the real object?

I have no idea what you really want, just wrote something that met your examples:

use core::any::TypeId;

// Zero-sized marker: the zero-length array holds no value of `T`.
struct Flag<T: ?Sized + 'static>([&'static T; 0]);

impl<T: 'static> From<T> for Flag<T> {
    fn from(_: T) -> Self {
        Flag([])
    }
}

impl<T: 'static> StaticName for T {}

fn main() {
    let v = Flag::<i32>([]);
    println!("{}", v.get_constant());
    let by_ref = &v;
    println!("{}", by_ref.get_constant());
}

// Actually you may want this: convert a reference to a Flag.
trait GetFlag: 'static {
    fn get_flag(&self) -> Flag<Self> {
        Flag([])
    }
}
impl<T: 'static> GetFlag for T {}

trait StaticName: 'static {
    fn get_constant(&self) -> String {
        format!("{:?}", TypeId::of::<Self>())
    }
}

It's quite similar. Though it's a shame we can't have it without the reference-indirection (and resulting lifetime bound). I'm assuming your idea actually originated from looking at, if it were allowed:

struct Static<T: ?Sized>([T; 0]);

I like how it would "agree" with the required semantics for justifying the ability to derive it from a reference of &T: at least for sized T there's core::array::from_ref with the proper layout already etc.
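As a rough illustration for sized T only: a zero-length array view can already be derived from a plain reference on stable via core::array::from_ref. The helper name empty_view is invented for this sketch.

```rust
use core::array;
use core::convert::TryFrom;
use core::mem::size_of_val;

// Hypothetical helper: reinterpret `&T` as a view of zero `T`s at the same address.
fn empty_view<T>(r: &T) -> &[T; 0] {
    // `from_ref` gives a one-element array view with the same layout as `T`.
    let one: &[T; 1] = array::from_ref(r);
    // Narrow that view to its empty prefix.
    <&[T; 0]>::try_from(&one[..0]).expect("an empty slice always converts")
}

fn main() {
    let value = 42u32;
    let none = empty_view(&value);

    // The view is zero-sized but still points at the original object.
    assert_eq!(size_of_val(none), 0);
    assert_eq!(none.as_ptr(), &value as *const u32);
}
```

This only covers the layout side of the argument; it says nothing about unsized T, where [T; 0] is not accepted at all.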

Unfortunately, of course, the indirection variant in your comment doesn't unsize. The by_ref can't have the type &Flag<dyn StaticName> but the ability to not only call the method on the dyn-trait-object but also to ergonomically create it is crucial. Neither does this other covariant type:

struct Static<T: ?Sized>([fn() -> T; 0]);

Aside: why doesn't the language permit 0-length arrays of unsized types?

What exactly is the use case for this? As in what real world code patterns would it allow (and how would you write them today)?

I'm wondering if the language really needs this feature. I would find being able to return Self much more useful (e.g. for cloning, but it would require capabilities similar to what Swift has, and I don't know that Rust wants to go down that route, because alloca)
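For reference, the usual way to approximate "return Self" on a trait object today is to return a boxed trait object instead. A minimal sketch, with Shape, Square and clone_box as made-up names:

```rust
trait Shape {
    fn area(&self) -> f64;
    // Returning `Self` would require `Self: Sized`, so the clone is type-erased.
    fn clone_box(&self) -> Box<dyn Shape>;
}

#[derive(Clone)]
struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }

    fn clone_box(&self) -> Box<dyn Shape> {
        Box::new(self.clone())
    }
}

fn main() {
    let original: Box<dyn Shape> = Box::new(Square(3.0));
    // The concrete type is unknown here, yet a copy can still be made.
    let copy = original.clone_box();
    assert_eq!(copy.area(), 9.0);
}
```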

I want to write a type-erased Serialize/Deserialize pair. It must be possible to store as a dyn-trait object. Basically:

use serde_json::Value; // or similar.

trait SerDe: Any {
    fn name(self: &Static<Self>) -> &'static str;
    fn serialize(&self) -> Value;
    fn deserialize(self: &Static<Self>, value: Value) -> Arc<dyn SerDe>;
}

fn register(&mut self, de: &'static Static<dyn SerDe>) {}
fn register_by_value(&mut self, de: Arc<dyn SerDe>) {}

It's fine for me that the return type of deserialize must already be type-erased. A Self return type wouldn't provide significantly stronger guarantees, but the ability to ergonomically write this as a trait seems necessary.

I'd happily take a builtin way of creating Static<T: ?Sized> from Arc<T: ?Sized>, but will also go any unsafe route if need be. Reading out the ZST shouldn't have soundness problems as long as the primary function of it having the same pointer metadata is guaranteed.

This would benefit from the ability to refer to a valid vtable for a type-erased, hidden, concrete type without having a value of said type around.

use std::any::Any;
use std::sync::Arc;

use serde_json::Value; // or similar.

// supertrait construction for convenient default
// implementation for sized types
trait SerDeStatic {
    fn get_static(&self) -> StaticDynSerDe;
}
impl<T: SerDe> SerDeStatic for T {
    fn get_static(&self) -> StaticDynSerDe {
        StaticDynSerDe(&StaticDynSerDeInner {
            name: Self::name,
            deserialize: |v| Self::deserialize(v),
        })
    }
}
trait SerDe: SerDeStatic + Any {
    fn name() -> &'static str
    where
        Self: Sized;
    fn serialize(&self) -> Value;
    fn deserialize(_: Value) -> Arc<Self>
    where
        Self: Sized;
}

// feel free to come up with a better name for this “&'static Static<dyn SerDe>” replacement type:
#[derive(Copy, Clone)]
struct StaticDynSerDe(&'static StaticDynSerDeInner);
struct StaticDynSerDeInner {
    name: fn() -> &'static str,
    deserialize: fn(Value) -> Arc<dyn SerDe>,
}
impl StaticDynSerDe {
    fn name(&self) -> &'static str {
        (self.0.name)()
    }
    fn deserialize(&self, v: Value) -> Arc<dyn SerDe> {
        (self.0.deserialize)(v)
    }
}

struct Registry {}
impl Registry {
    fn register(&mut self, de: StaticDynSerDe) {
        // save `de` somewhere
        // …

        // at later point, you can call the methods easily
        de.name();
    }
    fn register_by_value(&mut self, de: Arc<dyn SerDe>) {
        self.register(de.get_static());
    }
}
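A dependency-free variant of the same idea, in case someone wants to run it without serde_json. This is only a sketch: Value is replaced by a String alias, the table is a hand-written static per type, and the names SerDeTable, POINT_TABLE, Point and Registry::deserialize are invented here.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for `serde_json::Value`, to keep the sketch self-contained.
type Value = String;

// Hand-written "vtable" for the methods that have no receiver.
struct SerDeTable {
    name: fn() -> &'static str,
    deserialize: fn(Value) -> Arc<dyn SerDe>,
}

trait SerDe: Any {
    fn serialize(&self) -> Value;
    // Gives access to the static table from a live value.
    fn table(&self) -> &'static SerDeTable;
}

struct Point(i32);

impl Point {
    fn name() -> &'static str {
        "point"
    }
    fn deserialize(v: Value) -> Arc<dyn SerDe> {
        // Malformed input falls back to 0 in this sketch.
        Arc::new(Point(v.parse().unwrap_or(0)))
    }
}

static POINT_TABLE: SerDeTable = SerDeTable {
    name: Point::name,
    deserialize: Point::deserialize,
};

impl SerDe for Point {
    fn serialize(&self) -> Value {
        self.0.to_string()
    }
    fn table(&self) -> &'static SerDeTable {
        &POINT_TABLE
    }
}

#[derive(Default)]
struct Registry {
    by_name: HashMap<&'static str, &'static SerDeTable>,
}

impl Registry {
    fn register(&mut self, table: &'static SerDeTable) {
        self.by_name.insert((table.name)(), table);
    }
    fn register_by_value(&mut self, value: Arc<dyn SerDe>) {
        self.register(value.table());
    }
    // Looks up the table by name; no value of the concrete type is needed.
    fn deserialize(&self, name: &str, v: Value) -> Option<Arc<dyn SerDe>> {
        self.by_name.get(name).map(|t| (t.deserialize)(v))
    }
}

fn main() {
    let mut registry = Registry::default();
    registry.register_by_value(Arc::new(Point(1)));
    let restored = registry.deserialize("point", "42".to_string()).unwrap();
    assert_eq!(restored.serialize(), "42");
}
```

Compared to the blanket-impl version above, this trades the automatic table for an explicit static per type, which is the part a macro would presumably generate.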

Sure, a customized vtable object works. It's realistically what will end up in the code base in the short term. The idea hinges on the ergonomic advantage of integrating it with the standard vtable / dyn-trait object. It's a good argument that true integration, i.e. the ability to call static methods on a &T, would be necessary to make the feature convincing for inclusion in Rust. Thank you.

Well, maybe it isn't so important after all. Maybe a proc-macro is sufficiently nice.