Representing closed trait objects as enums


#1

Dynamically dispatched trait objects have some attributes which can make them a better solution than enums:

  • They separate the implementation for each variant into separate blocks, rather than having a bunch of methods with match statements.
  • They suggest a stronger degree of parametricity by providing an interface that each variant is required to implement. This makes it obvious when, for example, one parameter only applies to one variant, which can pressure you to think about whether the abstraction is well fit.

But because they have to be virtually dispatched and behind a pointer, they carry both a performance and an ergonomics penalty.

What if they didn’t have to? If I have a private trait implemented by a half dozen concrete types, this is logically equivalent to an enum, but easier to read. What if there were a #[repr(closed)] attribute that would represent a trait object as an enum? I think this is a good statement of the rules:

A trait tagged #[repr(closed)] must:

  1. Be local (to tag it, of course)
  2. Be object safe (to dynamically dispatch it).
  3. Be implemented only by concrete types or types in which all type variables are bound by at least one closed trait (so that all concrete implementations can be determined locally).

This would then give the trait these properties:

  1. All impls of this trait outside of this crate are orphan impls (this is what makes it closed).
  2. This trait’s trait object is Sized, having the same representation as an enum which has a variant for each implementing type, and methods perform a switch over the discriminant rather than going through virtual dispatch.

Is this potentially viable? It’s stricter and more optimized than the proposals about thin traits.


#2

I had a comment with similar thoughts way back on one of Niko’s virtual structs blog posts, for what it’s worth.


#3

This seems similar/related to the sealed traits pre-rfc.


#4

It also occurs to me that if these traits aren’t really being virtualized in their trait object, none of the object safety rules are necessary anymore.


#5

If I’m correct, the memory footprint of such traits may be rather unexpected. As the size of an enum is at least the size of its largest variant, someone using a trait object with a small implementing type might be surprised to realise that it in fact takes up much more space. Wouldn’t this be a clear divergence from the principle of least astonishment?
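To make the size concern concrete, here is a minimal sketch. The types Small, Large and Closed are hypothetical, and Closed only stands in for the layout a closed trait object would presumably get under this proposal.

```rust
use std::mem::size_of;

// Hypothetical implementing types: one tiny, one large.
#[allow(dead_code)]
struct Small(u8);
#[allow(dead_code)]
struct Large([u64; 16]);

// Stand-in for the enum a closed trait object would presumably be laid out as.
#[allow(dead_code)]
enum Closed {
    Small(Small),
    Large(Large),
}

/// Returns (size of Small, size of Large, size of the enum) in bytes.
fn sizes() -> (usize, usize, usize) {
    (size_of::<Small>(), size_of::<Large>(), size_of::<Closed>())
}
```

Every value of Closed occupies at least as much space as Large, even when it only holds a Small.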


#6

The user specifically opts into an alternative representation, and has to be the one who defines all implementing types. I certainly don’t think “the principle of least astonishment” applies here.


#7

Between the size thing and the fact that dynamic dispatch is not any worse than a big match statement (in fact, a large match statement is usually turned into a jump table just like the vtable trait objects use), is this really an improvement? The only advantage a large enum has over trait objects that I can think of is that it is Sized.


#8

Would this proposal allow you to use match to discriminate between variants of the enum?

BTW: Also have a look at EnumInnerAsTrait from the custom_derive crate. It solves a somewhat similar problem. https://danielkeep.github.io/rust-custom-derive/doc/enum_derive/index.html


#9

@withoutboats Sure, but he certainly does not have to be the only one who uses them (or am I misunderstanding the visibility restrictions?). Any external client would need to have a clear idea of what a closed trait is and whether the traits he’s actually importing from your crate are in this category or not.


#10

Definitely not. This would annul many of the advantages of using a trait over an enum. The only difference is that the trait object is Sized.

This doesn’t seem any more true than with any other external type. If I care about the memory layout of someone else’s type, I nearly always need to read the source code. I would agree that this fact about representation should appear prominently in the rustdoc output though.


#11

I definitely care about the ergonomic advantages of a Sized trait object more than the performance advantages, but a) there’s no reason to assume a closed trait has many variants, so this could easily be a small match statement, and b) this also avoids a heap allocation for each object.


#12

So basically this is just syntactic sugar for

trait T {
    fn foo(&self) { ... }
}

enum E {
    A(A), B(B)
}

struct A { ... }
struct B { ... }

impl T for E {
    fn foo(&self) {
        match *self {
            E::A(ref a) => a.foo(),
            E::B(ref b) => b.foo(),
        }
    }
}

impl T for A { ... }
impl T for B { ... }

impl Into<E> for A { ... }
impl Into<E> for B { ... }

Perhaps something similar can be achieved without much boilerplate with some macro hackery? I’ve done something similar here: https://github.com/matklad/miniml/blob/master/ast/src/exprs.rs. Note that the explicit solution lets you choose between A(A) and A(Box<A>) (on a per-variant basis), which should address @burakumin’s concern.
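For reference, a compilable version of that desugaring might look like the sketch below. The names Shape, Square, Rect, AnyShape and total_area are hypothetical stand-ins for T, A, B and E, and From is implemented instead of Into, since From gives you Into for free. One variant is stored inline and one boxed, to show the per-variant choice.

```rust
// Hypothetical trait standing in for `T` in the sketch above.
trait Shape {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

struct Rect {
    w: f64,
    h: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.w * self.h
    }
}

// Hand-written "closed trait object": each variant picks inline or boxed storage.
enum AnyShape {
    Square(Square),
    Rect(Box<Rect>),
}

impl Shape for AnyShape {
    // Dispatch is a match on the discriminant rather than a vtable call.
    fn area(&self) -> f64 {
        match self {
            AnyShape::Square(s) => s.area(),
            AnyShape::Rect(r) => r.area(),
        }
    }
}

impl From<Square> for AnyShape {
    fn from(s: Square) -> Self {
        AnyShape::Square(s)
    }
}

impl From<Rect> for AnyShape {
    fn from(r: Rect) -> Self {
        AnyShape::Rect(Box::new(r))
    }
}

// Works on a plain slice of Sized values; no `Box<dyn Shape>` needed.
fn total_area(shapes: &[AnyShape]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

Callers convert with `.into()` and can keep the values in an ordinary Vec<AnyShape>.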


#13

I think that EnumInnerAsTrait and EnumFromInner from the custom_derive crate can solve the problem proposed by matklad. https://danielkeep.github.io/rust-custom-derive/doc/enum_derive/index.html

#[derive(EnumFromInner, EnumInnerAsTrait(pub as_t -> &T))]
enum E {
    A(A), B(B)
}

let a = A { ... };
let e: E = a.into();
e.as_t().foo();

EDIT: But I am not sure if it also solves the problem proposed by withoutboats, and AFAICS it doesn’t provide boxes.


#14

The big difference is that having a real enum in the source encourages downcasting it. A major advantage of traits is that they encourage dealing with types abstractly.


#15

This can be dealt with by privacy though.
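One way to do this, as a rough sketch: keep the enum private and expose only a wrapper struct, so code outside the module cannot match on the variants and has to go through the trait. The module and type names here (shapes, Repr, AnyShape, and so on) are made up for illustration.

```rust
mod shapes {
    pub trait Shape {
        fn area(&self) -> f64;
    }

    pub struct Square(pub f64);
    pub struct Rect(pub f64, pub f64);

    impl Shape for Square {
        fn area(&self) -> f64 {
            self.0 * self.0
        }
    }

    impl Shape for Rect {
        fn area(&self) -> f64 {
            self.0 * self.1
        }
    }

    // Private: code outside this module cannot name or match on it.
    enum Repr {
        Square(Square),
        Rect(Rect),
    }

    // Public wrapper with a private field; only the trait methods are exposed.
    pub struct AnyShape(Repr);

    impl From<Square> for AnyShape {
        fn from(s: Square) -> Self {
            AnyShape(Repr::Square(s))
        }
    }

    impl From<Rect> for AnyShape {
        fn from(r: Rect) -> Self {
            AnyShape(Repr::Rect(r))
        }
    }

    impl Shape for AnyShape {
        fn area(&self) -> f64 {
            match &self.0 {
                Repr::Square(s) => s.area(),
                Repr::Rect(r) => r.area(),
            }
        }
    }
}
```

The wrapper is needed because the variants of a public enum are always public themselves.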


#16

To elaborate, I’m not trying to make a value judgement between “native” sealed traits and a manual implementation via enum :)

I want to understand:

  1. Is desugaring a possible implementation strategy for sealed traits, or do they introduce something genuinely new?

  2. If the answer to 1 is “yes”, then would it be possible to implement this as a procedural macro?

  3. If the answer to 2 is “yes”, then would it be sufficient to use macro by example?


#17

I don’t think this would introduce any kind of representation that could not be produced manually by constructing the enum. I think a macro could be produced, but I think it would not be elegant - you would need to provide all of the implementing types to the macro.
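To illustrate what such a macro might look like, here is a rough macro-by-example sketch. The trait Speak, the macro name closed_enum and the types Dog and Cat are all invented for this example. Note that the trait and its single method are hard-coded into the macro and every implementing type has to be listed by hand, which is exactly the inelegant part.

```rust
// Hypothetical trait; the macro below is hard-wired to it.
trait Speak {
    fn speak(&self) -> String;
}

// Generates the enum, the delegating trait impl, and one `From` impl per type.
macro_rules! closed_enum {
    ($name:ident : $($ty:ident),+ $(,)?) => {
        enum $name {
            $($ty($ty)),+
        }

        impl Speak for $name {
            fn speak(&self) -> String {
                match self {
                    $($name::$ty(inner) => inner.speak()),+
                }
            }
        }

        $(
            impl From<$ty> for $name {
                fn from(value: $ty) -> Self {
                    $name::$ty(value)
                }
            }
        )+
    };
}

struct Dog;
struct Cat;

impl Speak for Dog {
    fn speak(&self) -> String {
        "Woof".to_string()
    }
}

impl Speak for Cat {
    fn speak(&self) -> String {
        "Meow".to_string()
    }
}

// Every implementing type must be spelled out here.
closed_enum!(Animal: Dog, Cat);
```

Supporting arbitrary traits and method signatures would presumably need a procedural macro that can inspect the trait definition.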

I think the advantage of using a native representation is that it makes it ergonomic and easy to use dynamic dispatch without worrying about object safety & DSTs when you don’t have to.


#18

I was thinking about this, or something like this, in light of impl Trait.

Consider:

fn one() -> impl Iterator<Item=u8> {
    std::iter::once(123)
}

This of course works fine. But now consider:

fn one_or_two(two: bool) -> impl Iterator<Item=u8> {
    let one = std::iter::once(123);
    if two {
        one.chain(std::iter::once(234))
    } else {
        one
    } // error[E0308]: if and else have incompatible types
}

Now I can of course solve this by using boxed trait objects. But this kind of defeats one of the purposes of impl Trait, which is reducing heap allocation. The return type of one_or_two is trivially defined as an enum. In fact, a lot of functions that return a boxed trait object could return an enum that implements that same trait instead.
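As a sketch of what that enum could look like when written by hand (the name OneOrTwo is made up here), the function can keep its impl Iterator signature and avoid the Box:

```rust
use std::iter::{once, Chain, Once};

// Hand-written enum with one variant per branch's concrete iterator type.
enum OneOrTwo {
    One(Once<u8>),
    Two(Chain<Once<u8>, Once<u8>>),
}

impl Iterator for OneOrTwo {
    type Item = u8;

    // Delegate to whichever iterator this value holds.
    fn next(&mut self) -> Option<u8> {
        match self {
            OneOrTwo::One(it) => it.next(),
            OneOrTwo::Two(it) => it.next(),
        }
    }
}

fn one_or_two(two: bool) -> impl Iterator<Item = u8> {
    let one = once(123);
    if two {
        // Both branches now have the same type, so this compiles.
        OneOrTwo::Two(one.chain(once(234)))
    } else {
        OneOrTwo::One(one)
    }
}
```

Only next is forwarded here; a fuller version would probably forward size_hint and other methods as well.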

  1. For which traits is it possible to programmatically generate such enums and trait impls, given a list of types? (All?)
  2. Can we make this more automated by using a macro/compiler integration?

If you want to support multiple levels of impl Trait return values, it seems impossible to do without compiler magic.


#19

I think a lot of us have been independently coming up with this “enum impl Trait” idea from various angles. “Unified Errors, a non-proliferation treaty, and extensible types”, “Allow return more then one error type from function?” and “pre-RFC: anonymous enums” are some existing threads that all ended up discussing it.


#20

I’d like to update my attitude (which was stated in “Allow return more then one error type from function?”).

Although I encourage all language improvements like this one, I need to say that I’m now fine with the concept of create-own-error-and-implement-trait-From-for-each-error-which-can-occur in libraries (which I wasn’t at the time of writing the above post), especially with the failure crate. From my point of view error management is good enough, and I feel comfortable using it. Be aware that I only occasionally write toy Rust code.

BTW: I recently replaced my custom type with failure and it looks great (https://github.com/xliiv/numsys/pull/1/files#diff-b4aea3e418ccdb71239b96952d9cddb6L33-L61). So simple. Only the code I expected.

In applications, maintaining your own custom type with manual implementations of Display & Error feels too expensive. The failure crate makes it acceptable.