First, let me step back and state what my motivation here is. The current situation without any multi-trait-object support is unsatisfactory. Even support without „narrowing“ would be a great help in most scenarios. I’d like this to start moving somewhere; if I can help to push it a bit, that’s even better.
My impression of the current situation is that it’s stuck because we don’t have an obviously good solution, only several that are each clearly sub-optimal in some way. But from my point of view:
- A sub-optimal solution is still better than none.
- I don’t think that introducing some solution means the compiler has to guarantee it’ll always work the same way, so if a better solution is found later on, it could be replaced.
The rest of this discussion is an attempt to brain-storm possible solutions: just throwing incomplete ideas around to see if something looks promising. Sure, it has problems that need to be solved; for that, replies in the sense „Hey, this has this concrete problem“ are very useful. On the other hand, I’d really like this not to get stuck again on the „we don’t have the optimal solution, so it can’t be done“ mind-set.
Of the proposals that I’ve seen so far, mine looks to have the least severe drawbacks, but that could very well be a personal opinion (both because I have an inclination to like my own proposal, and because I might have a different notion of the costs).
Different copies of the same vtable can already happen today
Don’t these get deduplicated during link time? I hoped that if two crates in the dep graph both use HashMap<u32, u32>, only one monomorphisation gets put into the result. A vtable generated on the spot when converting to a trait object looks like a very similar concept to this.
but because in your strategy a Read+Write vtable might include information for a dozen other traits, trait objects would exhibit different overhead on virtual calls depending on how they were created. Not necessarily a deal breaker, but not exactly desirable either.
Agreed. In that sense, it’d probably make sense to create only the slimmest vtables needed (eg. don’t include the traits that are not used as part of the multi-trait-object), so the needless overhead is as small as possible. Then we could get worse performance than other multi-trait-objects with the same set of traits only if the object got created by the narrowing operation, which I think would be quite rare. Furthermore, I expect the usual set of traits to be rather small in practice and I’d still expect the virtual call itself (opaque to both CPU and compiler optimisations) to dominate the cost. I’ll think about a way to measure it without having to implement the support in the compiler.
This still creates a new vtable for each different subset of the traits actually used, but we avoid creating all the 2ⁿ combinations within that subset that we would otherwise have to make just because they could be reached by narrowing down. The fact that the user must explicitly write each such subset at the conversion site means the number is reasonably limited.
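For comparison, here is roughly how „one vtable per explicitly written subset“ looks with the workaround available today. This is only a sketch of the existing helper-trait trick, not of the proposal itself: `ReadWrite`, `erase` and `round_trip` are names made up for the illustration, and it uses the current `dyn` syntax.

```rust
use std::collections::VecDeque;
use std::io::{self, Read, Write};

// Hypothetical helper trait: it names exactly one subset of traits.
trait ReadWrite: Read + Write {}

// Any sized type that has both traits gets the helper trait for free.
impl<T: Read + Write> ReadWrite for T {}

// The conversion site spells out the subset. A vtable for the pair
// (T, ReadWrite) is only needed where this coercion actually happens.
fn erase<'a, T: Read + Write + 'a>(value: T) -> Box<dyn ReadWrite + 'a> {
    Box::new(value)
}

// Write bytes through the erased object and read them back out.
// VecDeque<u8> is used because it implements both Read and Write.
fn round_trip(payload: &[u8]) -> io::Result<Vec<u8>> {
    let mut stream = erase(VecDeque::<u8>::new());
    stream.write_all(payload)?;
    let mut out = Vec::new();
    stream.read_to_end(&mut out)?;
    Ok(out)
}
```

Nothing is generated for subsets nobody asked for. The price is that every subset needs its own named trait, and there is no way to narrow a `dyn ReadWrite` to some other named subset.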
Something along those lines has been suggested by several people. It’s a natural impulse, but it means that compiling a Rust program requires whole-program analysis
Not necessarily to that level. I could imagine (read: I have no idea if this is possible in current rustc) that the type erasure would link to an extern symbol for the vtable, so most of the compilation could happen without knowing the whole set of traits the type implements. Only the generation of the vtable would have to happen at the end, when all the traits for the type are known, and the vtables are likely to be reasonably small bits of code, so the work left once all this is gathered together wouldn’t be that big.
Something like this was actually a step before what I was thinking. I wanted to avoid the heap allocation (what if we are in a no_std situation? We may not have the heap. And, who frees it?) and the double-virtual-indirection.
The other side where it may happen is in the fat pointer itself (eg. before the indirection, not after), but that makes it even more beefy.
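To put a number on „more beefy“: a minimal sketch, assuming a layout of one data pointer plus one vtable pointer per trait. `WideRef` and `words` are hypothetical names for the illustration; nothing like this exists in the compiler.

```rust
use std::mem::size_of;

// Hypothetical layout of a multi-trait reference that keeps the
// indirection in the pointer itself: one data pointer, N vtable pointers.
#[allow(dead_code)]
#[repr(C)]
struct WideRef<const N: usize> {
    data: *const (),
    vtables: [*const (); N],
}

// Size of a type measured in machine words.
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}
```

With this layout, `words::<WideRef<2>>()` is 3 and `words::<WideRef<3>>()` is 4, compared to the 2 words of a single-trait reference such as `&dyn std::io::Read` today. Every additional trait costs one more word in each pointer that gets passed around.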
Uh. How would that work? A trait object is a type in its own right, so things like &(Read + Write) and Rc<Read + Write> should still work. Would you introduce special names for all these? That doesn’t sound right and makes the language irregular (eg. different than Read + Send).
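To illustrate the regularity I mean, this is how Read + Send already composes with every pointer type, written in the current `dyn` syntax. It is only a sketch, and the function names are made up for the example.

```rust
use std::io::{self, Read};
use std::rc::Rc;

// The same trait-object type behind a Box...
fn read_boxed(mut src: Box<dyn Read + Send>) -> io::Result<String> {
    let mut text = String::new();
    src.read_to_string(&mut text)?;
    Ok(text)
}

// ...behind a plain mutable reference...
fn read_borrowed(src: &mut (dyn Read + Send)) -> io::Result<String> {
    let mut text = String::new();
    src.read_to_string(&mut text)?;
    Ok(text)
}

// ...and behind an Rc, all without any special names.
fn shared_source() -> Rc<dyn Read + Send> {
    Rc::new(io::empty())
}
```

I’d expect a multi-trait object such as Read + Write to slot into the same positions in the same way.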