Blog post: Extended Enums and Thin Traits


#1

I wrote a blog post that compares the “Extended Enums” stuff I was describing with the “Thin Traits” (my term) stuff that @aturon was describing:

http://smallcultfollowing.com/babysteps/blog/2015/10/08/virtual-structs-part-4-extended-enums-and-thin-traits/


#2

I’m confused, I thought Servo wanted closed inheritance? Have the devs indicated otherwise?


#3

I think they can work with either. Certainly @aturon found that the uses of downcasting were relatively few. Perhaps @pcwalton or @jdm can chime in, though.


#4

That’s not quite right – the use of virtual dispatch was smaller than I expected. They definitely use downcasting, but given their setup they could easily generate the needed methods.
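One possible reading of "generate the needed methods" is a set of per-kind downcast helpers on the base trait, each defaulting to None. This is only a sketch in current Rust syntax (`dyn Trait`); `Node`, `Element`, `Text`, `as_element`, `as_text`, and `tag_of` are hypothetical names, not Servo's actual API.

```rust
// Hypothetical node kinds, for illustration only.
struct Element { tag: &'static str }
struct Text { data: &'static str }

trait Node {
    // Downcast helpers: default to None, overridden by the matching type.
    fn as_element(&self) -> Option<&Element> { None }
    fn as_text(&self) -> Option<&Text> { None }
}

impl Node for Element {
    fn as_element(&self) -> Option<&Element> { Some(self) }
}

impl Node for Text {
    fn as_text(&self) -> Option<&Text> { Some(self) }
}

// Downcasts through a trait object using only ordinary method calls.
fn tag_of(node: &dyn Node) -> Option<&'static str> {
    node.as_element().map(|e| e.tag)
}
```

Helpers like these could be produced mechanically (for example by a macro or build script) from the list of node kinds.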

From my point of view, this isn’t so much about prioritizing open vs closed as the other risks around the closed proposal that you didn’t mention much in the post – specifically, the impact to inference/subtyping. Going with the thin traits proposal is definitely sufficient for Servo (and many other cases), and is almost certainly faster and less risky on our end.


#5

This potentially is a very silly question, but with respect to the thin trait design, how would this interact with classic fat pointers, as in:

struct NodeFields {
    id: u32
}

trait Node: NodeFields {
    fn something(&self);
    fn something_else(&self);
}

fn foo(node: &Node) {
    println!("{} {}", node.id, node.something());
}

Would this be an error, or would the node.id be just dispatched through the vtable?


#6

OK, thanks for the correction.

Yes, I didn’t emphasize the “risks” aspect in the post, but I agree with this. I actually had a paragraph about it but cut it, since I didn’t seem to be adding anything to what I had said before. :)


#7

Neither, I think. The field would just be loaded from the data pointer at offset zero.
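A rough way to picture "loaded from the data pointer at offset zero" in today's Rust is to emulate the shared fields by hand. This is a sketch under assumptions: `Element`, its `base` field, and `id_via_base_pointer` are made up for illustration, and the proposal itself would have the compiler arrange this layout.

```rust
// Shared fields, as in the question above.
#[repr(C)]
struct NodeFields {
    id: u32,
}

// A hypothetical implementor; repr(C) keeps `base` at offset 0.
#[repr(C)]
struct Element {
    base: NodeFields,
    tag: u8,
}

// Reads `id` by treating the pointer to the whole struct as a pointer
// to its leading NodeFields. No vtable is consulted.
fn id_via_base_pointer(e: &Element) -> u32 {
    let p = e as *const Element as *const NodeFields;
    // Sound here because Element is repr(C) and starts with NodeFields.
    unsafe { (*p).id }
}
```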


#8

edit: incorrect. See below.

#[repr(thin)] is going to be a performance/backcompat footgun (easy to leave off) and is going to confuse people. Also, repr should only affect representation but this one also affects semantics (can’t implement trait from other crates). What about:

closed trait Test {
    // ....
}

where all closed traits are implicitly optimized.


#9

I agree. I was thinking as I was reading the blog post that local or closed might make sense as a name for these traits.


#10

I feel like it is legitimate for representation choices to cause some operations to become unavailable. What would be bad is if the same operation has two very distinct meanings, depending on the representation. It seems like having a thin representation prohibit other crates from implementing the trait falls into the former category; this seems analogous to how using a packed representation will make taking references to fields unsafe. Another example might be allowing enums (at some future point) to use low-order bits to distinguish variants, thus achieving tagged pointers: this would have to forbid taking references into those enums.

EDIT: As @aturon later pointed out, thin traits can be implemented by other crates – that’s what makes the “open extensibility” I talked about in my post – but they cannot be implemented for types of other crates. Anyway, the broader point stands.
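The packed analogy can be tried in current Rust. In this minimal sketch, `Packed` and both helper functions are invented for illustration. Copying a packed field out by value works, while a plain reference to it is not allowed because the field may be misaligned; a raw pointer plus `read_unaligned` is the explicit alternative.

```rust
// No padding between fields, so `b` sits at offset 1 and may be misaligned.
#[repr(packed)]
struct Packed {
    a: u8,
    b: u32,
}

// Copying the field out by value is allowed.
fn read_b(p: &Packed) -> u32 {
    p.b
}

// Going through an address requires a raw pointer and an unaligned read.
fn read_b_via_ptr(p: &Packed) -> u32 {
    let ptr = std::ptr::addr_of!(p.b);
    // Safety: `ptr` points at an initialized u32 inside `*p`.
    unsafe { ptr.read_unaligned() }
}
```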


#11

If #[repr(thin)] prevents implementation of a trait from other crates, isn’t it the same thing as sealed traits (https://github.com/rust-lang/rfcs/search?q=sealed)? I.e., it would also allow lifting some coherence restrictions.


#12

repr(packed) already has some strong semantic impact (internal references are unsafe). Arguably repr(C) and the rest also have a semantic impact if alignment and size are important aspects to a program’s correctness (e.g. binding to hardware interfaces).
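To make the size and alignment point concrete, here is a small sketch comparing the same two fields under `repr(C)` and `repr(C, packed)`. The struct names and the `layout` helper are hypothetical; the exact numbers for the unpacked struct depend on the target's alignment of `u32`.

```rust
use std::mem::{align_of, size_of};

// Padding is inserted after `flag` so that `value` is aligned.
#[repr(C)]
struct Register {
    flag: u8,
    value: u32,
}

// Same fields with no padding and alignment 1.
#[repr(C, packed)]
struct PackedRegister {
    flag: u8,
    value: u32,
}

// Returns (size, alignment) for the unpacked and packed variants.
fn layout() -> [(usize, usize); 2] {
    [
        (size_of::<Register>(), align_of::<Register>()),
        (size_of::<PackedRegister>(), align_of::<PackedRegister>()),
    ]
}
```

Code that maps such a struct onto a fixed byte format would observe different offsets for `value` depending on which attribute is used.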


#13

I don’t like the situation with packed structs either, but at least that case is very rare. IMO, no one should be exporting packed structs across crates except for, maybe, C bindings. I don’t think packed enums should use repr either, but I’ll argue that case when it comes up.

My main problem with #[repr(thin)] is that it will significantly affect public APIs. Maybe we need an RFC outlining what attributes in general should and should not be used for (I see them as compiler directives but this is going well beyond compiler directives).


#14

That’s not the limitation being imposed here. Rather, you can only impl a thin trait for types that you define (where usually, if you define a new trait, you can apply it to existing types). That’s because a thin trait influences the struct layout by inserting a vtable.

To be concrete, you can define a thin trait in crate A, and in crate B, you can impl that thin trait for a struct defined in crate B.
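The reason an existing type cannot gain a thin trait is easier to see when the vtable is written out by hand. The following is only an emulation in current Rust, not what the proposal would generate; `ShapeVtable`, `ShapeHeader`, `Square`, `thin_ptr`, and `area` are all hypothetical names.

```rust
// Hand-written table of methods for a hypothetical "Shape" thin trait.
struct ShapeVtable {
    area: unsafe fn(*const ShapeHeader) -> f64,
}

// Every implementing struct must begin with this header.
#[repr(C)]
struct ShapeHeader {
    vtable: &'static ShapeVtable,
}

// The struct's own layout changes: the header occupies the first word.
#[repr(C)]
struct Square {
    header: ShapeHeader,
    side: f64,
}

// Safety: `h` must point at the header of a live Square.
unsafe fn square_area(h: *const ShapeHeader) -> f64 {
    let sq = h as *const Square;
    unsafe { (*sq).side * (*sq).side }
}

static SQUARE_VTABLE: ShapeVtable = ShapeVtable { area: square_area };

impl Square {
    fn new(side: f64) -> Square {
        Square { header: ShapeHeader { vtable: &SQUARE_VTABLE }, side }
    }
}

// A thin pointer: a single word pointing at the struct itself.
fn thin_ptr(sq: &Square) -> *const ShapeHeader {
    sq as *const Square as *const ShapeHeader
}

// Dispatches through the vtable stored inside the pointee.
// Safety: `shape` must point at a live struct that starts with ShapeHeader.
unsafe fn area(shape: *const ShapeHeader) -> f64 {
    unsafe { ((*shape).vtable.area)(shape) }
}
```

A type such as `u32`, whose layout is already fixed elsewhere, has no room for the `header` field, which is why the impl has to live with the struct definition.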


#15

I used to feel exactly the same way, and had long pushed for this to be a keyword when we were working through the design. But I’ve since come around to @nikomatsakis’s position: the behavior of traits does not vary at all, and the choice here is entirely about optimization of representation, which happens to also limit (but not change) the cases where the trait can be applied.

It’s worth keeping in mind that, due to the orphan rules, the primary place where a trait is applied to types defined elsewhere is in the crate defining the trait – since that’s the only one that can do so arbitrarily. So I suspect this representation choice will mostly affect the trait definer rather than downstream crates in practice.
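As a small illustration of the orphan-rule point, the crate that defines a trait is free to implement it for types it does not own. `Describe` is a made-up trait for this sketch; under the thin representation, an impl like the one for `u32` is the kind that would no longer be possible.

```rust
// A trait defined in this crate.
trait Describe {
    fn describe(&self) -> String;
}

// Allowed today: the trait is local, even though u32 comes from elsewhere.
impl Describe for u32 {
    fn describe(&self) -> String {
        format!("u32 {}", self)
    }
}

// A downstream crate could not write `impl Describe for u32`,
// because neither the trait nor the type would be local to it.
```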


#16

The difference is that repr(C) and the like (unlike repr(packed)) only modify compiler-level semantics. They don’t affect Rust as a high-level language, just how the compiled binary interacts with other programs and the hardware. In general, I’m fine having representation affect unsafe operations.

Good point. I withdraw my objection (to repr(thin), I still object to repr(packed) but that ship has sailed).

Unless I’m mistaken, the compiler could theoretically apply repr(thin) automatically to all traits not implemented on foreign types. If that’s the case, this really is just a compiler hint saying “compiler, don’t allow me to do something that will disallow this optimization” and doesn’t affect API.


#17

#[repr(thin)] only affects the local crate’s ability to implement the trait, because the orphan rules already preclude other crates from implementing that trait for alien types, correct? Is there another reason it wouldn’t be backward compatible to remove a thin repr tag from a trait? Is the performance characteristic of thin pointers always a win whenever you don’t need the flexibility of fat traits?

If the answers to these questions are yes, no, yes, wouldn’t it be optimal for the compiler to just use thin pointers for trait objects unless it can’t?


#18

Implementing a thin trait adds an implicit vtable at the head of the struct. This will affect its size, naturally, and could break unsafe code, as well as other assumptions. It might also affect performance of plain safe code, depending on the ratio of struct-to-object instances (i.e., if you don’t use objects, as is common for traits, you’re just wasting memory). I think it should definitely be something you opt into.
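The memory cost can be estimated with a hand-built stand-in. This is a sketch only: `Plain`, `WithVtable`, and `overhead_bytes` are hypothetical, and the explicit `vtable` field merely imitates the word the proposal would add implicitly.

```rust
use std::mem::size_of;

// The struct as written by the user.
struct Plain {
    x: u32,
    y: u32,
}

// Stand-in for the same struct after implementing a thin trait:
// one pointer-sized word is added at the head.
#[repr(C)]
struct WithVtable {
    vtable: *const (),
    x: u32,
    y: u32,
}

// Extra bytes paid by every instance, whether or not it is ever
// used as a trait object.
fn overhead_bytes() -> usize {
    size_of::<WithVtable>() - size_of::<Plain>()
}
```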


#19

This doesn’t make sense to me. IIRC, there are two possible use cases here:

  • Either:

struct S {...}
#[repr(thin)]
trait T : S {...}

  • Or:

#[repr(thin)]
trait T {…}

In the former case, the same crate provides both the struct and the trait, so coherence doesn’t matter. In the latter, the struct could be defined in a client crate, as in your example, but then the upstream crate forces an unnecessary layout decision on its clients without knowing the client struct’s size, and it could actually make performance worse (e.g., when the struct fits a cache line exactly and adding an inline vtable makes it bigger than a single cache line). This decision is better made at the same location (by the same person) that defines the data layout: as part of the struct definition in the client crate, not on the trait.

As I said on the reddit thread, I really like all the separate pieces of these suggestions, but the way those pieces are put together is wrong, IMO. It is backwards, breaks Rust’s current very clear and orthogonal design, and violates the separation-of-concerns principle.
Traits in Rust define interfaces, and concrete types (enums and structs…) define data layout. Clearly we want to affect the data layout, and therefore this does not belong in the trait definition.


#20

It is odd that #[repr(thin)], which is a property of the struct (the fact that it has a vtable in it), is marked on the trait. In my mind, the trait is where you declare the interface that the data implements, without actually imposing any data (except with the hypothetical struct inheritance stuff, but that is more like adding an explicit contract that some data is present than the implicit vtable pointer here). I assume that marking the trait thin rather than the struct is because of the &-syntax (and the transparency of fat pointers in general), where &Foo could be a fat or thin pointer depending on whether the vtable is in the pointer or in the structure. I get that we don’t want to add a specific syntax for the pointer itself, but on the other hand it is odd, because the pointer is actually where it matters, so it would make sense to be explicit there instead of on the definition of the interface. I can’t think of a convenient way to express it. A thin pointer to something that implements Foo would look like

&thin Foo

and a fat pointer would remain

&Foo

The declaration of the interface and the way the virtual dispatch is made could be orthogonal, and I think they should be in an ideal world. Perhaps it is too late for something backward-compatible.
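For reference, the fat versus thin distinction at the pointer level is observable in current Rust. In this sketch, `Foo`, `Bar`, and `pointer_sizes` are hypothetical; `&dyn Foo` is today's spelling of the `&Foo` trait object above, and the `&thin Foo` syntax does not exist.

```rust
use std::mem::size_of;

trait Foo {
    fn get(&self) -> u32;
}

struct Bar(u32);

impl Foo for Bar {
    fn get(&self) -> u32 {
        self.0
    }
}

// Returns (size of a trait-object reference, size of a plain reference).
// The trait-object reference carries a data pointer plus a vtable pointer.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&dyn Foo>(), size_of::<&Bar>())
}
```

A thin trait object would have the size of the second value, with the vtable pointer stored in the pointee instead.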