Where's the catch with Box<Read + Write>?

Upcasting is not (Rust) compiler jargon, it's OOP jargon :slight_smile:

I don't believe multi-trait objects that are never upcast can be insulated from the implications of supporting upcasts. This is because, again, there might be code anywhere in the application that performs an upcast, so all multi-trait objects need to be represented in whatever way upcasts require.

With that in mind, supporting upcasts potentially has a huge cost. It's possible that we'll pick the "N vtable pointers" approach for other reasons; in that case upcasting is cheap to add. But these decisions are intertwined.

Hm, good point about proxy impls on trait objects. However, in my experience trait objects in interfaces (including interfaces within a crate) are somewhat common, despite the popularity of generics, especially in "business logic" crates as opposed to foundational crates that provide data structures, algorithms, serialization, etc.

Are you suggesting we restrict which upcasts people can do? That, too, would be a valid language design decision, informed by what is and isn't implementable efficiently.

Trait objects can be passed through FFI and used in dynamic libraries, so you can't assume that one is never upcast, even if you delay the analysis as much as possible.

2 Likes

...that's as much upcasting as C++ supports, isn't it?

Matching C++ is probably not that shabby. Surely C++ people have weighed the pros and cons of different approaches, and how much upcasting could be supported without incurring unreasonable implementation costs?

C++'s entire approach to ad hoc polymorphism (both in the design philosophy and the implementation) is so different from Rust's that I don't think such a comparison is useful.

But let's run with the analogy. I suppose you're referring to multiple inheritance:

class Foo {};
class Bar {};
class Quux {};
class Multi : public Foo, public Bar, public Quux {};

Now Multi* can be upcast to Foo*, Bar*, or Quux*, but there's no way to get something that inherits the methods of two of the base classes but not the third. If one wants that, additional classes have to be defined and arranged into an inheritance hierarchy.

The moral equivalent of that is this Rust code, which already works today:

trait Foo {}
trait Bar {}
trait Quux {}
trait Multi: Foo+Bar+Quux {}

impl<T: Foo+Bar+Quux> Multi for T {}

Here, too, we explicitly called out the combination of Foo, Bar, and Quux and reified it into a separate entity, Multi, and it has similar limitations to the C++ variant.

Since we can already do that, I'd expect multi-trait objects to be somewhat more flexible -- a dynamically dispatched analogue to T: Foo+Bar+Quux bounds in generics, rather than to T: Multi. Something that is freely composable, rather than forcing the programmer to create and maintain an artificial and inflexible hierarchy.

4 Likes

Exciting, thx a bunch!

One missing feature is upcasting &Multi to &Foo, isn't it?

Wait, how is that blocked on const generics?

FWIW, if vtable generation were delayed until the main crate is compiled, generic methods could be object-safe. Which would be really cool and powerful. But it seems like the costs are too high.

See the example in this post; N there is the number of traits in the trait object's signature, e.g. N will be equal to 2 for a Read+Write object.

Thank you for providing that context. As far as I understand, the only use for raw::TraitObject is to be able to split trait objects apart and put them back together. Is that not true? If so, then functions can be provided to do just that, generically, for all ?Sized types and not just trait objects, which would anyway be a big improvement over the current situation. So this isn't blocked on const generics, although that is going to be an awesome feature that I'm really looking forward to.

Oh yeah, sorry, I forgot to mention that. You can sometimes add methods as_foo(&self) -> &Foo etc., or even as_foo(self: Box<Self>) -> Box<Foo> (on stable, this only works for Box right now, not for e.g. Rc), to emulate it.

Only as a last resort… just saying it would be possible, not something to aim for.

Can we somehow summarise the costs of the various approaches? I don't think some of the costs mentioned here would be that huge, and measuring them might make sense. I definitely want to have a look at measuring the overhead of my approach (but I'm afraid it might take some time before I get to it).

To expand a bit more on the “last resort” solution, one could spell out manually what upcasts are allowed:

trait Foo {}
trait Bar {}
trait Quux {}

// r can be cast to &Foo, &Bar, &Quux, &<Foo + Bar> but not to &<Foo + Quux>
let r : &<Foo + Bar + Quux + (Foo + Bar)> = ...

Maybe some shorthand like type FooBar = Foo + Bar could be used to make &<Foo + Bar + Quux + FooBar> equivalent to the above.

Very broadly, all approaches that support arbitrary upcasts seem to have either:

  • binary size overhead from vtables exponential in the number of traits used per trait object type
  • run-time space overhead per trait object pointer (not trait objects) that is linear in the number of traits used in the trait object

Some approaches also add double indirection to virtual calls to improve the constant factor on either of the above.

1 Like

Either I'm missing something important, or I explained my approach the wrong way, because I believe it has neither of these two costs.

A multi-trait object has just one vtable pointer.

There are a few ways to tune the binary-size vs. speed trade-off:

  • When doing type erasure to a multi-trait object, create one big vtable with all the locally known traits. This is AFAIK at most one vtable (with autogenerated code of size linear in the total number of methods of the type) per compiled crate (I hope rustc can do at least crate-global analysis, if not whole-program analysis).
  • The other option is to create a vtable with only the traits needed for this particular type erasure. In theory this can produce an exponential number of different vtables. But as one is created only when doing the type erasure (not when upcasting from trait object to trait object), that happens only in exponentially large programs, so the size of the binary is still polynomial in the size of the source code.
  • A hybrid is to create one vtable per crate, but include only union of traits needed across all the type erasures happening in that crate. This is basically the first option, but optimised not to include things that are never used.

Yes, it does add a double indirection, but only one of them is dynamic.

The fact that some trait objects may have faster or slower dynamic dispatch (because they carry lookup tables of different sizes for different sets of traits) is unfortunate, but since different trait objects can have different implementations of the methods being called through that dispatch anyway, this doesn't sound like a very big deal.

1 Like

Sorry, you’re right, I somehow forgot about that approach.

It’s not just a question of indirection, but the cost of executing lookup_vtable itself - which, as far as I can tell, would have to occur on each method call to a multi-trait object (albeit with some optimizations possible).

What would the representation of TraitId be? You’d expect it could be similar or identical to the existing type_id, which returns a statically known u64 based on hashing the contents of a type (this is how Any works). But this system has a long-known flaw: 64 bits are not enough to reliably prevent hash collisions.

One proposed solution was to embed the actual trait names and add a strcmp-like operation. An alternative would be to make the IDs much larger, large enough to use a collision-resistant cryptographic hash. I think a better approach would be to use globally unique symbol names and let the linker sort out uniqueness (just once, at link or load time for static and dynamic linking, respectively); but I think there are concerns that that isn't possible on all platforms. If it is possible, IDs would become pointers. Doing a single comparison would be as cheap as with the current 64-bit hash, but for multiple comparisons like you envision, since the compiler doesn't know the exact value of the pointer in advance, it wouldn't be able to pre-generate a hash table or even a binary search; it would have to use a linear search. Then again, a linear search of, say, <= 4 values (probably the common case for multi-trait objects) may be faster than any more complicated scheme.

Whatever the case, the cost of figuring out the right vtable might become nontrivial, more like, e.g., an Objective-C method call than a C++ virtual call.

This approach is definitely possible, and it might even be the best one, but I wouldn’t be sure before seeing an implementation and benchmarks.

Personally I suspect the best approach might be the “exponential, but low constant factor” design, where a multi-vtable pointer points to an array of single-vtable pointers. But I’m just guessing.

In any case, has anyone tried to prototype any of these approaches in a library? Shouldn’t be that hard…

2 Likes

TypeId collisions can already cause unsoundness (by permitting wrong downcasts via Any). But more importantly, unlike with Any, the compiler could actually check for TraitId collisions when constructing the "vtable".

Hmm, I didn't know about this one. That's… nasty :-|. I expected it to be some kind of pointer to something in the binary, known at compile time.

I intend to do just that. But haven't gotten around to it yet (a day job and an inability to concentrate just on one thing at a time).

1 Like

Well, I'm not sure about that. "Two traits — two pointers" is not a bad tradeoff, especially given the significantly lower implementation complexity compared to the "generate every possible combination of trait subsets as a separate vtable" approach. That approach, by the way, would also increase the size of the binary itself, so it wouldn't necessarily lower memory footprints either. Lower implementation complexity in the compiler is also desirable for a first take, for the sake of fewer opportunities for compiler bugs to sneak in. This is especially true with a feature like traits, which interacts with almost every other part of the language.

Furthermore, if the size of trait objects is a real bottleneck regarding the memory usage of some code, I suspect there are a great deal of much more serious problems with that piece of code.

10 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.