Where's the catch with Box<Read + Write>?

Hm, you are right: even in the best case we'd get a quadratic number of vtables, which is an undesirable cost.

Can't we improve this "double indirection" approach in the following way? Let "combinator" trait objects be represented by the following heap-allocated structure:

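pub struct CombTraitObject {
    // pointer to the underlying value
    pub data: *mut (),
    // one vtable pointer per listed trait, in the order the traits are listed
    pub vtables: [*mut ()],
}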

Here the length of vtables is equal to the number of listed traits (perhaps with an optimization for marker traits), e.g. for Read+Write it will be just 2. So if we want to call a Read method we'll use the pointer vtables[0], and for a Write method vtables[1].

Now if we have Box<A+B+C+D> and we want to convert it into Box<B+D>, we dynamically generate the appropriate CombTraitObject with new_vtables = [old_vtables[1], old_vtables[3]]. And if we are converting into Box<A>, we create the standard single-indirection TraitObject struct with the appropriate pointer.
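To make the narrowing step concrete, here is a rough sketch of how I imagine it. Everything in it is hypothetical: Vtable is just a stand-in that records a trait name instead of real function pointers, narrow, trait_names and abcd_example are made-up helpers, and I use Box<[...]> for the vtable list instead of the inline unsized slice above only to keep the example in safe Rust.

```rust
// Hypothetical stand-in for a compiler-generated vtable. A real vtable holds
// function pointers; here only the trait name is recorded so the selection
// logic can be observed.
struct Vtable {
    trait_name: &'static str,
}

static VT_A: Vtable = Vtable { trait_name: "A" };
static VT_B: Vtable = Vtable { trait_name: "B" };
static VT_C: Vtable = Vtable { trait_name: "C" };
static VT_D: Vtable = Vtable { trait_name: "D" };

// Simplified model of the proposed layout (assumption: the vtable list is a
// boxed slice rather than an inline unsized field).
struct CombTraitObject {
    data: *mut (),
    vtables: Box<[&'static Vtable]>,
}

impl CombTraitObject {
    // Convert to a narrower combination by copying the selected vtable
    // pointers; no new vtable is generated, only a new pointer list.
    fn narrow(&self, keep: &[usize]) -> CombTraitObject {
        CombTraitObject {
            data: self.data,
            vtables: keep.iter().map(|&i| self.vtables[i]).collect(),
        }
    }

    // Helper for inspecting which traits the object currently exposes.
    fn trait_names(&self) -> Vec<&'static str> {
        self.vtables.iter().map(|vt| vt.trait_name).collect()
    }
}

// Builds a model of Box<A+B+C+D>; the data pointer is null because only the
// vtable bookkeeping matters for this sketch.
fn abcd_example() -> CombTraitObject {
    CombTraitObject {
        data: std::ptr::null_mut(),
        vtables: Box::new([&VT_A, &VT_B, &VT_C, &VT_D]),
    }
}
```

With this model, abcd_example().narrow(&[1, 3]) corresponds to the Box<A+B+C+D> to Box<B+D> conversion: the data pointer is unchanged and the new list reuses the existing B and D vtables.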

To better distinguish these from "single" indirection trait objects, it would probably be better to use a different name rather than Box.

If I am not missing something here, we'll generate the same number of vtables as we do today.