Trait upcasting


I’ve been looking into different approaches for this and it seems to me that in the general case, especially if you want to support Box<A+B> type pointers, there are two extremes which I’d like to eventually look for a reasonable middle ground. These are:

  1. Duplicate every super trait’s vtable everywhere a cast might be needed. The size of this can easily become absolutely unreasonable if we implement A+B pointers this way (imagine listing vtables for A+B, A+C, B+C, etc all with known offsets between them. ouch.) On the other hand, casting is always just an add/sub.
  2. Pointers to pointers to vtables. This approach is more complex to generate - the order of trait bounds can be rearranged to overlap pointer tables and that algorithm needs to consider every type and trait combination with potential casting paths between them, but there’s minimal size overhead if done correctly, even for multi-trait pointers. Every cast is two dereferences, which is better than dynamic_cast, but not ideal.

I’ve been looking into some hybrid approaches which I think could be promising. If we take the pointer table approach as the general case, we have vtables, tables of vtable pointers (these are the core casting mechanism), and pointers to vtable pointer tables (to avoid repeating the cast table pointers everywhere). I think in a lot of cases though, we can eliminate some of these pointers. For traits with few requirements, it doesn’t make sense to add an extra level of indirection when we’d just be duplicating a couple pointers everywhere its vtable appears to support casting. And similarly, for traits with few methods, it doesn’t make sense to add any indirection at all when its whole vtable is only a couple of pointers.

So the question for me then becomes, casting can have variable overhead. Should this be decided just based on a size optimization? Should we try to optimize the most common casts in a program? Look for casts inside a loop? Let the user hint at which casts should be optimized for speed?

I’d also like to collect some more data on this during the impl period so that the eventual RFC for this will be well-informed. Looking for the most common and most extreme patterns on should help clarify a good approach to take, but at the same time I can’t entirely know how adding this will change the way people write code. In the meantime, are there thoughts on different approaches, comments, something I’ve overlooked?



The combinatorial explosion is only an issue if Box<A+B> has a single pointer to a vtable that combines the vtables for A and B. The other option is to make it have two vtable pointers (one for each trait). Has there been a decision to definitely do or not do that? Obviously the “bigger trait object pointer” approach has its own downsides, and the alternatives need to be explored, I’m just asking because I don’t see this option mentioned at all in this post.



Yeah this seems like the most obvious solution to me.



That approach definitely gives you the fastest casts, but it’s not great for size. Box<A + B + C> would take up four machine words, and a collection of these would be four times as big as most people expect a collection of pointers to be. I will suggest if we go this route though, that we can use one vtable pointer and packed 16-bit offsets, which would at least save one word on 64-bit platforms.

My one other issue with that is that now there’s a difference between the representation of these two types:

trait T: A + B { /* empty */ }
impl<S: A + B + ?Sized> T for S {}
size_of::<Box<A + B>>() == 3 * size_of::<usize>()
size_of::<Box<T>>() == 2 * size_of::<usize>()

To me it seems that if we’re going to support casting from T to either of those traits, we should treat the nameless version the same. Or maybe there’s a difference between these that I’m missing?



There’s no denying that putting multiple vtable pointers into trait object pointers has disadvantages. I also see some advantages, but I don’t even want to go into the pros and cons now. I just wanted to check if I’d missed a consensus to not pursue that option.

I don’t see why? T is not merely an alias for A + B (trait aliases are actually a proposed feature, but separate from this), it is its own trait that happens to have some supertraits and a very widely-applicable blanket impl. Upcasts should be supported, of course, but I don’t see why the representation should be the same.



As far as I’m aware that hasn’t been decided, and if community preference is for fatter fat pointers I’d be glad to work on that design. One advantage of that approach is that you get a choice between A + B where the cast is essentially free but the pointer is bigger, and a new trait with a smaller pointer but possible extra layers of indirection.

My main motivation for starting this thread is that I have a system that would work, but there are going to be tradeoffs either way, and I want to get some more input on which tradeoffs we’re willing to make and in which cases.

I think what I was going for with the representations being the same is that if we’re already going to have a casting system in place, we could easily use it for A + B too, but we don’t have to if there’s a preference the other way.



Do we need to commit to any particular representation? I have no idea whether multiple indirections or fatter fat pointers or some heuristic hybrid thereof would be best in practice (I wouldn’t be surprised if we eventually end up providing #[repr(nested_vtables)] and friends), but as long as none of these options preclude committing to a specific ABI someday, it seems like we could just leave it unspecified for now, right?



I’m not really familiar with the whole RFC process, although I’m aware they’re on hold right now. Would it be better to leave the implementation unspecified? I didn’t really intend for this to be final, I just thought it’d be good to come up with a reasonable starting point and evolve it based on usage.

That was one of the questions I asked in my first post, and I definitely support it given that being able to choose your own tradeoffs is a big part of Rust.


closed #9

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.