Casting families for fast dynamic trait object casts and multi-trait objects

In I briefly described a mechanism for casts between dyn Trait types that resembles vorner's suggestions in Where's the catch with Box<Read + Write>? as mentioned in

We define "casting families" with ZSTs that impl DynCastFamily which structure the compilation unit problem.

pub trait DynCastFamily { const NUM_TRAITS: usize; }

Assume some fixed CF: DynCastFamily which we take as synonymous with the crate defining CF.

We first impl DynCastTo<CF> for dyn Trait { .. } for any dyn Trait types to which trait object casts occur, where

// Self is implicitly ?Sized in a trait definition, so dyn Trait can implement this.
pub unsafe trait DynCastTo<CF: DynCastFamily> {
    const dyncast_index: usize;
}

In essence, these provide vorner's trait_id but compacted into lookup indices since any::type_id(dyn Trait)s are way too large. We define these indexes in the compilation unit CF so impl DynCastTo<CF> for dyn Trait becomes hard when Trait lies downstream of CF (see below).

We next impl DynCastFrom<CF> for T for any type T from which we permit dynamic casts within CF, which morally resembles vorner's MultiTraitObject. We never use this when we know T, but the object safe trait Trait: DynCastFrom<CF> makes this available from dyn Trait objects.

pub unsafe trait DynCastFrom<CF: DynCastFamily> {
    const DYNCAST_VTABLES: &'static [VTablePointer; CF::NUM_TRAITS] where Self: Sized;
    const fn dyncast_vtables(&self) -> &'static [VTablePointer; CF::NUM_TRAITS] { Self::DYNCAST_VTABLES }
}

There are no polymorphic methods on trait objects, so we implement the dynamic cast on the trait object itself, like

impl<CF: DynCastFamily> dyn DynCastFrom<CF> + ?Sync + ?Send + 'static {
    fn dyncast<T, P>(mut self: P::Pointer<Self>) -> Option<P::Pointer<T>>
    where T: DynCastTo<CF> + ?Sized, P: PointerFamily + PointerVTable,
    {
        let i = <T as DynCastTo<CF>>::dyncast_index;
        let new_vtable = self.dyncast_vtables()[i];
        // null entry: the concrete type does not implement the target trait
        if new_vtable.is_null() { return None; }
        let old_vtable = PointerVTable::vtable_offset(&mut self);
        Some(unsafe { old_vtable.write(new_vtable); mem::transmute(self) })
    }
}

where PointerVTable::vtable_offset permits altering vtable pointers manually via the PointerFamily ATCs (see

We'd expect high performance from this solution because it requires only two pointer dereferences from potentially well traversed tables. We need one if check too in the full dynamic cast setting, but not for supertrait casts.
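A rough stable-Rust sketch of the table lookup described above, under several assumptions. Stable Rust cannot rewrite vtable pointers, so each table entry here is a thunk returning a hypothetical `Target` enum rather than a `VTablePointer`. There is a single family, so the `CF` parameter is dropped, and `dyncast_index` is spelled `DYNCAST_INDEX`. The names `DynCastTable`, `Entry`, `Target`, `unwrap_target`, `dyncast_entry`, `Circle`, and `Label` are illustrative only and not part of the proposal. A `None` entry plays the role of the null vtable pointer.

```rust
pub trait TraitA { fn a(&self) -> &'static str; }
pub trait TraitB { fn b(&self) -> u32; }

/// Number of cast targets this (single, hypothetical) family reserves.
pub const NUM_TRAITS: usize = 2;

/// Stand-in for "the object with its vtable pointer swapped".
pub enum Target<'a> {
    A(&'a (dyn TraitA + 'static)),
    B(&'a (dyn TraitB + 'static)),
}

/// One table entry; `None` corresponds to a null vtable pointer.
pub type Entry<T> = Option<for<'a> fn(&'a T) -> Target<'a>>;

/// Cast targets: each one owns a slot index in every per-type table.
pub trait DynCastTo: 'static {
    const DYNCAST_INDEX: usize;
    fn unwrap_target<'a>(t: Target<'a>) -> Option<&'a Self>;
}
impl DynCastTo for dyn TraitA {
    const DYNCAST_INDEX: usize = 0;
    fn unwrap_target<'a>(t: Target<'a>) -> Option<&'a Self> {
        match t { Target::A(a) => Some(a), _ => None }
    }
}
impl DynCastTo for dyn TraitB {
    const DYNCAST_INDEX: usize = 1;
    fn unwrap_target<'a>(t: Target<'a>) -> Option<&'a Self> {
        match t { Target::B(b) => Some(b), _ => None }
    }
}

/// Per-type table, only nameable when the concrete type is known.
pub trait DynCastTable: Sized + 'static {
    const DYNCAST_VTABLES: [Entry<Self>; NUM_TRAITS];
}

/// Object-safe accessor, so the table is reachable from a trait object.
pub trait DynCastFrom {
    fn dyncast_entry(&self, index: usize) -> Option<Target<'_>>;
}
impl<T: DynCastTable> DynCastFrom for T {
    fn dyncast_entry(&self, index: usize) -> Option<Target<'_>> {
        let entry = T::DYNCAST_VTABLES[index];
        entry.map(|thunk| thunk(self))
    }
}

impl<'o> dyn DynCastFrom + 'o {
    /// One table lookup plus one check for the empty entry.
    pub fn dyncast<T: DynCastTo + ?Sized>(&self) -> Option<&T> {
        self.dyncast_entry(T::DYNCAST_INDEX)
            .and_then(|t| T::unwrap_target(t))
    }
}

/// Implements both traits, so both entries are filled.
pub struct Circle;
impl TraitA for Circle { fn a(&self) -> &'static str { "circle" } }
impl TraitB for Circle { fn b(&self) -> u32 { 0 } }
fn circle_a(c: &Circle) -> Target<'_> { Target::A(c) }
fn circle_b(c: &Circle) -> Target<'_> { Target::B(c) }
impl DynCastTable for Circle {
    const DYNCAST_VTABLES: [Entry<Self>; NUM_TRAITS] = [Some(circle_a), Some(circle_b)];
}

/// Implements only TraitA, so the TraitB entry stays empty.
pub struct Label;
impl TraitA for Label { fn a(&self) -> &'static str { "label" } }
fn label_a(l: &Label) -> Target<'_> { Target::A(l) }
impl DynCastTable for Label {
    const DYNCAST_VTABLES: [Entry<Self>; NUM_TRAITS] = [Some(label_a), None];
}
```

With these definitions, calling `dyncast::<dyn TraitA>()` on a `&dyn DynCastFrom` built from a `Circle` yields `Some`, while casting a `Label` to `dyn TraitB` takes the `None` path. The thunk call costs more than the raw vtable swap the post has in mind, so this only illustrates the indexing scheme, not its performance.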

We can freely impl DynCastFrom<CF> for T for types T downstream of CF or use Trait: DynCastFrom<CF> for traits downstream of CF, so this suffices for upcasting among trait objects.

We represent the trait object dyn TraitA+TraitB by dyn DynCastFrom<CF>, provided both dyn TraitA: DynCastTo<CF> and dyn TraitB: DynCastTo<CF>. If TraitAB: TraitA+TraitB then we need TraitAB: DynCastFrom<CF> too, probably including the trait alias case pub trait TraitAB = TraitA+TraitB.

We support multiple CF being declared in one crate if one prefers shorter &'static [VTablePointer], which enables some executable size optimizations, and makes casting more useful in memory constrained environments that avoid monomorphisation.

We can only impl DynCastTo<CF> for dyn Trait if (a) CF reserves space for Trait and (b) all impl DynCastFrom<CF> tell the linker what goes where, so doing this downstream from CF requires build tools that couple CF extremely tightly with the downstream crate. It's likely this creates headaches for extremely large object oriented programs, but the performance should outweigh such costs.

We cannot currently make associated constants object safe with where Self: Sized bounds, so one either removes DYNCAST_VTABLES or else fixes this limitation. I included DynCastFrom::DYNCAST_VTABLES here because doing so helps indicate that rustc should inline the slice directly into the DynCastFrom vtable, avoiding one indirection for the dyn TraitA+TraitB case.

As written, we could implement the underlying casting machinery using only proc macros, lazy_static, and planned extensions like arbitrary self types and PointerFamily ATCs, but rustc could build the &'static [VTablePointer] more cleanly than lazy_static. In other words, anyone interested could work towards this completely outside the rust tree without consuming any lang team resources!

We'd need rustc support for truly implicit DynCastFrom<CF> supertraits of course, and the notation TraitA+TraitB equaling some DynCastFrom<CF>, but really crates could enable this selectively, perhaps via a #[derive(DynCastFrom)] proc macro that deduces CF somehow.

In principle, rust could expose dyn TraitA+TraitB and upcasting, while another unstable crate exposed the casting families machinery for projects that'd benefit, much like std::future exposes stable futures, while the futures crate exposes unstable functionality. We could thus reject pressure to ever bring the casting families machinery into std, like from OO proponents, while still permitting its usage in more complex casting scenarios.


I find it hard to follow this description, possibly because the earlier discussions that this builds on are out of cache for me. I'm especially unclear on this CF ZST: there are multiple such types, right? How are they introduced, implicitly by the compiler (according to which rules?) or explicitly by users (how?)? How does any particular (multi-)trait object get associated with one (or multiple?) of the CF types?

More fundamentally, I have the same sinking feeling I already had in relation to vorner's approach before: there seem to be "free parameters" in this implementation scheme that can be tweaked to make trade-offs about code size and performance, and such knobs are not a good thing. If good values for these parameters cannot be determined automatically and users have to tweak them manually on a case-by-case basis, then most prospective users of multi-trait objects and upcasting might be better off waiting for the simpler feature of upcasting single-trait objects and just defining aggregate traits (trait FooBar: Foo + Bar {} with blanket impl) for the combinations they need.
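For concreteness, the aggregate-trait alternative mentioned here might look like the following minimal sketch. The `as_foo` and `as_bar` helpers are an assumption of this example, standing in for built-in single-trait upcasting; `Unit` and `sum` are hypothetical names used only for illustration.

```rust
pub trait Foo { fn foo(&self) -> i32; }
pub trait Bar { fn bar(&self) -> i32; }

/// Aggregate trait standing in for `dyn Foo + Bar`.
pub trait FooBar: Foo + Bar {
    // Explicit helpers that hand out the single-trait views.
    fn as_foo(&self) -> &dyn Foo;
    fn as_bar(&self) -> &dyn Bar;
}

/// Blanket impl: every sized type with both traits gets `FooBar` for free.
impl<T: Foo + Bar> FooBar for T {
    fn as_foo(&self) -> &dyn Foo { self }
    fn as_bar(&self) -> &dyn Bar { self }
}

/// Hypothetical concrete type used only for the demonstration.
pub struct Unit(pub i32);
impl Foo for Unit { fn foo(&self) -> i32 { self.0 } }
impl Bar for Unit { fn bar(&self) -> i32 { self.0 * 2 } }

pub fn sum(obj: &dyn FooBar) -> i32 {
    // Supertrait methods are callable directly on the aggregate object;
    // the helper returns a plain `&dyn Bar` when one is needed.
    obj.foo() + obj.as_bar().bar()
}
```

This covers a fixed combination with no tables or indices, at the cost of one aggregate trait per combination and no sidecasting between unrelated combinations.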

I am also unclear about many other details and concerned by the off-hand gesturing at needing linker cooperation for some scenarios (I don't really understand it so I might be reading too much into it). If an implementation as a library is possible (modulo not getting to insert constant data directly into vtables), doing that first seems valuable, not just to avoid being bottle-necked on rust-lang processes but also to get a more concrete idea of how this approach would look in practice and what kind of code size and performance it can obtain.

Something simpler than implementing the full general machinery as a library, which would still be very useful for understanding, would be to spell out all the traits and impls for a concrete example, say, a program that deals with six traits A...F and multi-trait objects dyn A+B+C and dyn D+E+F. IIUC there are multiple possibilities for how to support this with different trade-offs, it would be helpful to spell out these options (or some interesting points, if the full space of possibilities is too large).

Any compilation unit could define its own CF: DynCastFamily, which yes creates one big headache: dyn TraitA+TraitB = dyn DynCastFrom<CF> differs for different CF. We'd hopefully address this by always selecting the most upstream applicable CF.

I'd naively imagined rustc could select CF automagically in such a way that dyn TraitA+TraitB "just works", but actually this does not work:

We've crates color and shape that define traits Color and Shape, as well as crates cat and dog that handle dyn Color+Shape as dyn DynCastFrom<CAT> and dyn DynCastFrom<DOG>. Another crate owner could handle both separately, but its own dyn Color+Shape must become dyn DynCastFrom<OWNER> so that it can upcast to both dyn DynCastFrom<CAT> and dyn DynCastFrom<DOG>. It sadly cannot downcast from those back to dyn DynCastFrom<OWNER> however. :frowning:

We'd need some unification process by which rustc inserted some virtual compilation unit above color and shape and below cat and dog. I suspect complex crate hierarchies could make doing this depend upon the full crate hierarchy though.

Also, we're more constrained anyways if we've no rustc support, and thus not doing dyn TraitA+TraitB anyways:

I suppose users would define one cf crate that contained their CF and define any Trait from which they cast, so that Trait: DynCastFrom<CF>. In cf, they can impl DynCastTo<CF> for dyn Trait for upstream Traits, although they cannot cast from those traits.

If one developed some complex GUI system with many crate layers, then one could probably do custom infrastructure that defined DYNCAST_VTABLES large enough for traits from several other crates. I'll make an edit that maybe supports this. Thanks!

Assuming you only need one compilation unit then it'd look like:

pub struct CF;
impl DynCastFamily for CF { const NUM_TRAITS: usize = 6; }

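// DynCastTo is an unsafe trait, so its impls must be unsafe too.
unsafe impl DynCastTo<CF> for dyn A { const dyncast_index: usize = 0; }
// ... dyn B through dyn E take the indices in between
unsafe impl DynCastTo<CF> for dyn F { const dyncast_index: usize = 5; }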
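As a hedged sketch of how the six index assignments could be generated rather than written by hand, here is a declarative macro that numbers the traits in order. The macro name `assign_indices`, the helper `index_of`, the empty traits `A` through `F`, and the upper-case `DYNCAST_INDEX` spelling are assumptions of this example. The safety contract stated in the comment (unique indices below `NUM_TRAITS`) is also my reading of the scheme rather than something the post spells out.

```rust
pub trait DynCastFamily { const NUM_TRAITS: usize; }

/// Assumed safety contract: indices are unique within the family
/// and strictly below `CF::NUM_TRAITS`.
pub unsafe trait DynCastTo<CF: DynCastFamily> {
    const DYNCAST_INDEX: usize;
}

// Six placeholder traits for the example.
pub trait A {}
pub trait B {}
pub trait C {}
pub trait D {}
pub trait E {}
pub trait F {}

pub struct CF;
impl DynCastFamily for CF { const NUM_TRAITS: usize = 6; }

/// Assigns consecutive indices, starting at `$idx`, to the listed traits.
macro_rules! assign_indices {
    ($cf:ty, $idx:expr;) => {};
    ($cf:ty, $idx:expr; $head:ident $(, $tail:ident)*) => {
        unsafe impl DynCastTo<$cf> for dyn $head {
            const DYNCAST_INDEX: usize = $idx;
        }
        assign_indices!($cf, $idx + 1; $($tail),*);
    };
}

assign_indices!(CF, 0; A, B, C, D, E, F);

// Compile-time check that the last index still fits in the table.
const _: () = assert!(
    <dyn F as DynCastTo<CF>>::DYNCAST_INDEX < <CF as DynCastFamily>::NUM_TRAITS
);

/// Looks up the slot index reserved for a cast target.
pub fn index_of<T: DynCastTo<CF> + ?Sized>() -> usize {
    T::DYNCAST_INDEX
}
```

Listing every target in one macro call keeps the numbering in a single place, which matches the constraint that the indices are defined in the compilation unit that owns `CF`.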

We impl DynCastFrom<CF> for T for all T: 'static satisfying any A,..,F too, meaning one extra trait overhead. We'd populate their tables with lazy_static if we've no rustc support.

We now represent all multi-trait objects upwards from dyn A+..+F with DynCastFrom<CF>. We make rustc optimize upcasting without any if check. We can however sidecast between dyn A+B and dyn E+F with an if check and error path. We can similarly sidecast from dyn A to dyn B+C too provided A: DynCastFrom<CF> + 'static.

All this falls apart without 'static everywhere of course, although maybe you could avoid 'static for upcasting somehow, not sure.

We could drop the 'static bound on DynCastTo and DynCastFrom, and rustc could enforce all the internal lifetimes itself, but dyn DynCastFrom<CF>::dyncast, side casting, etc. require a 'static bound.

I removed the 'static trait bounds since they appear unnecessary, but the dyn DynCastFrom<CF>::dyncast requires a 'static bound.

I think upcasting non-'static trait objects can only be done by the compiler providing some Upcastable trait that enforces all internal lifetimes correctly.

I'm worried that ATC-style associated lifetimes break everything if made object safe, i.e. can this trait be object safe?

pub trait Foo {
    type 'a where Self: 'a;
}

We've both where Self: Sized methods and object safe methods in ... but the object safe methods call the where Self: Sized methods so can 'a somehow surface inside the object safe methods?