I think the discussion happening in the thread about the "Extended enums and thin traits" blog post should get its own thread, so that it does not hijack the attention of a blog post which also covers several other interesting topics. This discussion may or may not end up as an RFC (I am new to this process), so let me know.
Thin traits are a useful optimization tool. Pointers to a trait in Rust today (&Trait, Box<Trait>, etc.) are fat pointers, which means they are actually two pointers (one to the object and one to its vtable). In some cases, the memory overhead of storing fat pointers can be problematic.
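To make the size difference concrete, here is a minimal sketch in current Rust, where a trait object pointer is spelled &dyn Trait rather than &Trait. The trait Foo, the struct Bar and the helper pointer_sizes are hypothetical names used only for illustration.

```rust
use std::mem::size_of;

// Hypothetical trait and implementor, used only for illustration.
trait Foo {
    fn name(&self) -> String;
}

struct Bar {
    name: String,
}

impl Foo for Bar {
    fn name(&self) -> String {
        self.name.clone()
    }
}

/// Returns (size of a pointer to the concrete type, size of a pointer to the trait object).
fn pointer_sizes() -> (usize, usize) {
    // &Bar is a single machine word; &dyn Foo carries a data pointer plus a vtable pointer.
    (size_of::<&Bar>(), size_of::<&dyn Foo>())
}
```

On a 64-bit target, pointer_sizes() should report one word (8 bytes) for &Bar and two words (16 bytes) for &dyn Foo. Box<dyn Foo> has the same two-word size.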
In order to have thin pointers to a trait object, we must have the guarantee that the actual structure that implements the trait starts with a vtable pointer (à la C++). This has already been explained very well by aturon, nmatsakis and others, so I won't go through it again here, but I want to draw attention to two things:
- Thin pointers are actually a property of the data (struct and pointer) and not the trait.
- Thin pointers are motivated by optimizations focused on memory overhead/layout (so again, data and not traits).
I think that Rust needs thin pointers; however, I have a few issues with the direction the current design is taking, namely:
- Thinness is expressed on the trait instead of the data (struct and pointer)
- The memory layout of a structure is implicitly modified by whether or not the structure implements a trait that is marked as thin.
- The decision of the thinness of a pointer to a trait is made by the person who declares the trait rather than the user of the trait, even though thin pointers are an optimization that is useful to the user of the trait.
- Somewhat less important to me, one can't have thin and fat pointers to the same trait, which is an artificial limitation of making thinness a property of the trait.
- If I make a thin version of an existing "fat" trait, then there is no automatic coercion between them (please correct me if I am wrong).
Here is my proposal:
Thinness is a property of the structures and pointers. I don't want to focus too much on the syntax because I don't want it to distract people's attention, but let's say that thin pointers are written as:
thin_foo: &thin Foo
or
thin_foo: &Thin<Foo>
... or some other notation. I will use the first one in the discussion but the important idea here is not the syntax, but the fact that thinness is expressed on the pointer just like mutability.
Thinness also has to be opted into by the structures that thin pointers point to, because they need to store a vtable pointer. Again, the syntax can be debated as a separate matter:
struct Bar {
virtual Foo,
name: String,
}
or
struct Bar impl Foo { name: String }
...or whatever reads best. The idea here is that a struct that bundles a virtual pointer declares it explicitly.
The same applies to Box<T>, Rc<T>, etc., where thinness is written on the pointer just like the mut keyword.
Thin pointers can be coerced into fat pointers. A fat pointer is a pointer to the object and a pointer to its vtable, and luckily we always know where to find the vtable in an object pointed to by a thin pointer.
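The mechanism can be emulated by hand in current Rust, which may help picture what the proposal would automate. This is only a sketch under stated assumptions: FooVtable, Header, ThinFoo, FatFoo and to_fat are hypothetical names, the vtable is written manually, and none of this is the proposed syntax or an existing language feature.

```rust
use std::marker::PhantomData;

// Hand-written vtable: one function pointer per trait method (hypothetical).
struct FooVtable {
    describe: unsafe fn(*const Header) -> String,
}

// The "virtual pointer" a struct opts into; it must sit at offset 0.
#[repr(C)]
struct Header {
    vtable: &'static FooVtable,
}

#[repr(C)]
struct Bar {
    header: Header, // explicit and first, so the cost is visible in the definition
    name: String,
}

unsafe fn bar_describe(this: *const Header) -> String {
    // Safety: BAR_VTABLE is only ever stored in the header of a Bar,
    // and #[repr(C)] places that header at offset 0 of the Bar.
    let bar = unsafe { &*(this as *const Bar) };
    format!("Bar({})", bar.name)
}

static BAR_VTABLE: FooVtable = FooVtable { describe: bar_describe };

impl Bar {
    fn new(name: &str) -> Bar {
        Bar { header: Header { vtable: &BAR_VTABLE }, name: name.to_string() }
    }
}

// Thin pointer: a single word pointing at the object, which starts with its vtable pointer.
#[derive(Clone, Copy)]
struct ThinFoo<'a> {
    ptr: *const Header,
    _life: PhantomData<&'a Header>,
}

// Fat pointer: data pointer plus vtable pointer, two words.
#[derive(Clone, Copy)]
struct FatFoo<'a> {
    data: *const Header,
    vtable: &'static FooVtable,
    _life: PhantomData<&'a Header>,
}

impl<'a> ThinFoo<'a> {
    fn from_bar(bar: &'a Bar) -> Self {
        // Derive the pointer from the whole Bar so it may be cast back later.
        ThinFoo { ptr: bar as *const Bar as *const Header, _life: PhantomData }
    }

    fn describe(self) -> String {
        // Dynamic dispatch: load the vtable from the object, then call through it.
        unsafe { ((*self.ptr).vtable.describe)(self.ptr) }
    }

    // Thin-to-fat coercion: the vtable is always found at the start of the object.
    fn to_fat(self) -> FatFoo<'a> {
        let vtable = unsafe { (*self.ptr).vtable };
        FatFoo { data: self.ptr, vtable, _life: PhantomData }
    }
}

impl<'a> FatFoo<'a> {
    fn describe(self) -> String {
        // The vtable travels with the pointer, so the object is not read to find it.
        unsafe { (self.vtable.describe)(self.data) }
    }
}
```

With these definitions, ThinFoo::from_bar(&bar).to_fat().describe() dispatches to the same function as the thin call. The reverse direction has no such guarantee, because an arbitrary fat pointer may point at data that carries no embedded vtable pointer.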
What do we gain from expressing thinness this way? In short, no surprises related to memory layout, and:
- I can understand the memory layout of a struct by reading its definition. I cannot overstate how important this is to me and to people who need to rely on the memory layout of their data (which should be anyone who really cares about performance and has to think in terms of cache lines).
- The memory layout of a struct will not change without touching the struct definition or the definition of its members. It is easy to add impl Foo for Bar without checking whether the trait Foo is marked thin or not. In a large project with a lot of contributors like Gecko (or Servo, but I can't relate as much), this kind of oversight can happen very easily and it is problematic.
- If a struct implements a trait from a 3rd party library, the memory layout of the struct will not change without me noticing if the thinness of the third party trait changes.
- Similarly, the memory overhead of a pointer is evident by looking at the pointer itself, without having to look for the trait's definition (which ties into being able to know the memory layout of a struct containing pointers by looking at the struct).
- Declaring a trait (the collection of methods that something has to implement to interact with something else) is independent of how we refer to the trait. Orthogonality is nice, and even though I personally care about this less than I care about knowing the memory layout, it is worth noting.
- Thin pointers are an optimization for the user of the trait, who can be a different person than the one who defined the Trait. There is no reason to force the decision of thinness to be made by the person who made the trait.
- I can have thin and fat pointers to the same trait.
- I can implement a standard trait with thinness that I think is right for my use case.
Memory layout is important for performance. Rust is one of the rare languages that give you full control over memory layout. In fact, thin pointers themselves are motivated by memory-layout optimizations. I believe that any language feature that makes it harder to understand the layout of your data makes Rust lose something important. Even in C++ you can't add a virtual pointer by accident without changing something in the class definition or in the definition of one of its parents. It is not only about memory overhead, but also about knowing that a structure can be placed in shared memory and read by another process. In Rust, data (struct blocks) and logic (trait and impl blocks) are nicely decoupled, and it would be good for that to remain the case.
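As a small illustration of that decoupling in today's Rust, the sketch below compares three hypothetical #[repr(C)] structs: one without any impl, one that implements a trait, and one that carries an explicit pointer-sized field standing in for a vtable pointer. The names Plain, WithImpl, WithVptr and layout_sizes are assumptions made for this example.

```rust
use std::mem::size_of;

// Hypothetical trait, used only for illustration.
trait Foo {
    fn id(&self) -> u32;
}

#[repr(C)]
struct Plain {
    id: u32,
    weight: f32,
}

// Same fields as Plain, plus a trait implementation.
#[repr(C)]
struct WithImpl {
    id: u32,
    weight: f32,
}

impl Foo for WithImpl {
    fn id(&self) -> u32 {
        self.id
    }
}

// Same fields again, with an explicit stand-in for an embedded vtable pointer.
#[repr(C)]
struct WithVptr {
    vtable: *const (),
    id: u32,
    weight: f32,
}

/// Returns the sizes of (Plain, WithImpl, WithVptr) in bytes.
fn layout_sizes() -> (usize, usize, usize) {
    (size_of::<Plain>(), size_of::<WithImpl>(), size_of::<WithVptr>())
}
```

Adding the impl block leaves the size unchanged, so Plain and WithImpl are both 8 bytes. Only the explicit field makes the struct grow, by one pointer, and that growth is visible in the struct definition itself.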
Some potential objections (and my answers):
- Isn't having &Trait and &thin Trait going to split the trait system and create incompatible pointers? Mut already has this kind of issue.
- My answer to this question is: how much does having &Trait and &ThinVersionOfTrait split up the trait system? They are in fact the same thing, but &thin Trait offers the possibility to coerce a &thin Trait into a &Trait, so I believe my proposal actually splits the trait system less than its alternative.
- This means I need to specify thinness in more places, and like every programmer I am lazy and just want to type thin once in the trait definition.
- Yes, precisely! And this kind of laziness causes unexpected regressions. My proposal forces you to know what you are doing, and that is a good thing. It doesn't force you to write thin if you don't want to use thin pointers, so if you don't want to type thin all over the place, then don't, and at least you will know that you are using fat pointers. Since thin is an opt-in optimization, saving a few keystrokes is a terrible argument.
- This adds syntax and I don't want to deal with the concept of thin pointers because my program is fast enough already.
- Since thin pointers are opt-in, you don't have to deal with them unless you need them. If you don't want to write &thin Trait, then always writing &Trait will work with any trait and any incoming data, because thin pointers can be coerced into fat ones. The exception is if you need to pass a thin pointer to a 3rd party library. In this case the library is imposing this constraint on you, and for a reason, since it made the choice to opt into it; the library's choice should not change what your code does without you noticing.
What do you think? I am mostly interested in the idea that thinness is expressed as a property of the data, instead of being expressed as a property of the trait. The question of the syntax only matters if some people agree about the above.