Blog post: Extended Enums and Thin Traits

What I just wrote would also mean some sort of syntax on the struct, of course. But yeah the more I think about it the more I would be sad that I have to look for the existence of a trait somewhere to know the memory layout of my data.

Let me get this straight, because my first reaction was, I think, based on a misunderstanding of what you wrote. Is this your proposal?

  • Let each struct opt into or out of the vtable for a specific trait.
  • The trait doesn’t care either way and in fact can be implemented both by types with vtable and types without vtable.
  • When producing a &Trait, whether this pointer is thin or not depends on the concrete type from which it is produced, and thus this is tracked in the reference type.

Assuming the above is correct, I have several questions/issues:

  • This doubles the number of pointer types: you’d need &thin Trait and &mut thin Trait, and exponentially more as further dimensions pop up. As satisfying as orthogonality is, pointer type proliferation is a real problem that Rust has fought in the past. A really good motivation is needed to add new pointer types.
  • How does this interact with trait objects other than those behind &? What about Box<Trait> and Rc<Trait>?
  • Won’t this split the trait’s ecosystem in two, resulting in either lots of duplication or some functionality only being available for fat or thin pointers respectively? This issue already exists for &mut to some degree. Presumably a &thin Trait could be converted to a fat &Trait, but a &mut can likewise often be reborrowed as &, and that doesn’t completely solve the problem.

Your understanding of what I wrote is correct.

Whether this doubles the number of pointer types is really a matter of whether you think mut and const double the number of pointer types, and whether the implicit possibility of fat pointers itself doubles the number of pointer types today (when reading &Thing in the code, it can already be either a thin or a fat pointer, even though the two look the same syntactically). I assume you mean that the proliferation of pointer syntax is the problem. A thin annotation would give you a guarantee about data layout and the dispatch strategy, but it doesn’t change ownership or any other kind of “logical” semantics. (The issue back when sigils were removed was that we had syntax for things like ref-counting and owned values.)

Trait objects would naturally follow the same rule: Box<thin Foo>, Rc<thin Foo>, etc.

As you said, we would have rules such that a thin Foo could be coerced into a fat Foo: thin is a constraint on the data type, but it does not preclude also having a fat pointer to the struct, which is built by copying the struct’s vptr into the fat pointer. I think (but perhaps I am misunderstanding you) that my proposal actually prevents us from splitting the trait system in two! With my proposal you can have both types that implement a trait with the thin pointer approach and types that don’t. Without my proposal, however, when you create a trait you have to decide whether people will use thin or fat pointers, but how do you make this decision? Thin pointers are an optimization for data layout; they are not related to exposing interfaces. Today we have libraries like the standard library that define common traits that can be used by everyone. This means that we can never use these standard traits with a thin pointer approach. If you find yourself in a situation where you need the thin pointer optimization, you will have to duplicate the trait and roll your own ThinWriter, etc. I am certain that everywhere &thin would have divided the trait system (that is, you need both thin and fat pointers to Foo in your code and run into incompatibilities between thin and fat references), you would otherwise have had to create a ThinFoo trait, which would cause an even deeper division, since you declare the same interface twice and can’t coerce ThinFoo into Foo.
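To make the "copy the vptr into the fat pointer" step concrete, here is a rough emulation in today's Rust. It is only a sketch under stated assumptions: there is no thin keyword, so the vtable is hand-written, and every name in it (FooVtable, Header, Circle, ThinRef, FatRef) is hypothetical rather than part of any proposal.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hand-written vtable for a hypothetical `Foo` interface with one method.
struct FooVtable {
    area: unsafe fn(*const Header) -> f64,
}

/// Stored at offset 0 of every "thin-capable" struct (the embedded vptr).
#[repr(C)]
struct Header {
    vtable: &'static FooVtable,
}

#[repr(C)]
struct Circle {
    header: Header, // must stay the first field so the casts below are valid
    radius: f64,
}

/// Safety: `p` must point at the header of a live `Circle`.
unsafe fn circle_area(p: *const Header) -> f64 {
    let circle = unsafe { &*(p as *const Circle) };
    std::f64::consts::PI * circle.radius * circle.radius
}

static CIRCLE_VTABLE: FooVtable = FooVtable { area: circle_area };

impl Circle {
    fn new(radius: f64) -> Circle {
        Circle { header: Header { vtable: &CIRCLE_VTABLE }, radius }
    }
}

/// One-word reference: the vtable is found through the pointee.
#[derive(Clone, Copy)]
struct ThinRef<'a> {
    ptr: *const Header,
    _life: PhantomData<&'a Header>,
}

/// Two-word reference: data pointer plus vtable pointer.
#[derive(Clone, Copy)]
struct FatRef<'a> {
    ptr: *const Header,
    vtable: &'static FooVtable,
    _life: PhantomData<&'a Header>,
}

impl<'a> ThinRef<'a> {
    fn from_circle(c: &'a Circle) -> ThinRef<'a> {
        ThinRef { ptr: c as *const Circle as *const Header, _life: PhantomData }
    }

    fn area(self) -> f64 {
        // Safety: `ptr` was derived from a `Circle` borrowed for 'a.
        unsafe { ((*self.ptr).vtable.area)(self.ptr) }
    }

    /// Thin-to-fat conversion: read the embedded vptr and carry it alongside.
    fn to_fat(self) -> FatRef<'a> {
        let vtable = unsafe { (*self.ptr).vtable };
        FatRef { ptr: self.ptr, vtable, _life: PhantomData }
    }
}

impl<'a> FatRef<'a> {
    fn area(self) -> f64 {
        // Safety: `ptr` and `vtable` were taken from the same live object.
        unsafe { (self.vtable.area)(self.ptr) }
    }
}

fn main() {
    let c = Circle::new(2.0);
    let thin = ThinRef::from_circle(&c);
    let fat = thin.to_fat();
    println!("thin: {} bytes, fat: {} bytes",
             size_of::<ThinRef<'static>>(), size_of::<FatRef<'static>>());
    println!("same result: {}", thin.area() == fat.area());
}
```

The reverse direction has no such helper: a FatRef built for a struct that does not embed a header has nowhere to read the vptr from, which is why only thin-to-fat is expected to work.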

In fact, being "thin" is a property of both the trait and the struct. That is, we have to know whether a trait is thin so that we know what size &Trait has -- and of course we don't know the type of the underlying struct there, that's the whole point of objects. On the other hand, as you point out, you have to know whether a struct implements a thin trait because it affects the layout of the struct itself.

I would say that being thin is a property of the pointer and the struct rather than the trait (which I think of as a collection of methods that constitute an interface, so really just a vtable in the case of dynamic dispatch). As a result, from my point of view the size of &Trait should be 2 words and the size of &thin Trait should be 1 word (fat being the default for back-compat).
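For reference, the sizes on the fat side can be checked in current Rust, where a trait object reference is written &dyn Trait rather than &Trait. This is just a small sketch; Shape and Square are made-up names, and nothing here models the proposed &thin Trait, which does not exist in the language.

```rust
use std::mem::size_of;

trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Size of a reference to a trait object, in machine words.
fn fat_ref_words() -> usize {
    size_of::<&dyn Shape>() / size_of::<usize>()
}

/// Size of a reference to a concrete sized type, in machine words.
fn thin_ref_words() -> usize {
    size_of::<&Square>() / size_of::<usize>()
}

fn main() {
    let sq = Square(2.0);
    let obj: &dyn Shape = &sq; // data pointer + vtable pointer
    println!("area via trait object = {}", obj.area());
    println!("&dyn Shape is {} words", fat_ref_words());
    println!("&Square is {} words", thin_ref_words());
}
```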

@nikomatsakis: Sorry if this is blindingly obvious, but could you please comment on why this idea was abandoned in favor of thin traits?

I think that proposal has a lot going for it from a technical perspective, but the ergonomics are quite lacking, and it seems likely to lead to interoperability problems. Details to follow.

From an ergonomics point of view, pushing information onto the reference is also pushing complexity to the consumers of a library. I find it generally makes things feel more complex, since you are confronted with fat-ness and thin-ness at every use of a trait. The names in that proposal were also somewhat confusing: for example, &Fat<Trait> actually gives you a thin pointer. Finally, there is no precedent in Rust for a distinction like Fat<T,U> and Fat<U>, where omitting the type argument corresponds to something quite different (i.e., not a default), nor for using a “pseudo-type” like Fat that seems to really be a kind of keyword.

It also seems likely to me that having some references to traits be thin and some be fat will lead to composability concerns, where one library is requesting a &Trait and another has a &Fat<Trait> (or vice versa). It is certainly not possible to convert from fat to thin, though the other direction does work.

It’s not clear to me that mixing fat and thin is a big use case to begin with. For the use cases I am aware of, one usually wants all objects from a trait to be thin or not thin, not some ad-hoc mixture. In general, I consider thin pointers to be somewhat niche, particularly across crates like this – typically in the downstream crate, you know the actual type concretely, and you only employ the trait to kind of thread information upstream into a more generic context. Alternatively, as might be the case for Servo, you are spreading information across crates as much for convenience as anything else, but you still expect all of the trait objects (in this case, dom nodes) to be interoperable within the same data structures.

Finally, the thin trait proposal still allows for one to take an existing fat trait and make thin references from it. You can do it by making a thin trait that extends the fat trait, but adds no methods:

trait Foo { ... }

trait ThinFoo: Foo { }

Now one can have &ThinFoo for thin pointers and &Foo for fat pointers. I believe there is no technical obstacle to this working. (Of course, it assumes we get upcasting working for trait objects, but there are many reasons that we would like that to work.)
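As a sketch of how the two traits would sit side by side, the following compiles in current Rust. Note the assumptions: today both &dyn ThinFoo and &dyn Foo are fat, so this shows only the interface relationship and not the proposed layout; Node, the name method, and describe are hypothetical. It sidesteps upcasting by being generic over any T: Foo + ?Sized, which works because dyn ThinFoo implements its supertrait Foo.

```rust
trait Foo {
    fn name(&self) -> String;
}

// Extends Foo but adds no methods, as in the snippet above.
trait ThinFoo: Foo {}

struct Node;

impl Foo for Node {
    fn name(&self) -> String {
        "node".to_string()
    }
}

impl ThinFoo for Node {}

/// Accepts anything exposing the Foo interface, including both trait objects.
fn describe<T: Foo + ?Sized>(x: &T) -> String {
    format!("<{}>", x.name())
}

fn main() {
    let n = Node;
    let thin: &dyn ThinFoo = &n; // would be one word under the thin trait proposal
    let fat: &dyn Foo = &n;
    println!("{} {}", describe(thin), describe(fat));
}
```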


Thin pointers, in whichever form they are implemented, push the same amount of complexity onto the users of a library, because if you want to write good and/or efficient code, you need to know the layout of your data. So the complexity should not be measured by whether users have to add annotations to reference types, but by whether they have to figure out for themselves that the compiler is changing their data layout because of something that is not explicitly stated in the struct definition.

If thin pointers are a niche, then adding a thin annotation only adds complexity for the people who opted in to it. And in fact it does not add complexity even for them, because they made the conscious decision to opt into a thin pointer, so they know what they are doing and why. The thin annotation removes the burden of reading the definition of the pointee to verify something that is important to the pointer. Implicitness, however, does add complexity, in the sense that you need to read a lot more code spread across different places to understand what the layout of your data is.

In C++ I can understand the layout of a class/struct by looking at what is in its definition (and the definitions of its parents, which are explicitly enumerated in the struct’s definition). I believe this to be a very important thing. If I write a struct that needs to be exactly the size of one cache line and that implements or references a trait from some third-party library, the impact of adding or removing #[repr(thin)] on the third-party trait can be disastrous. Memory layout is something people rely on, especially in Rust, because it hugely impacts performance in some cases. I therefore believe it is very important to be able to tell the size of the pointers in my structure, and whether it bundles a vptr, just by looking at the structure; more importantly, I want to be sure this can’t change underneath me.
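To illustrate how much pointer width matters for a size-sensitive struct, here is a small check in current Rust. It is a sketch only: Widget, Button, FatSlots and ThinSlots are made-up names, and Box<Button> merely stands in for what a one-word trait pointer would cost.

```rust
use std::mem::size_of;

trait Widget {
    fn id(&self) -> u32;
}

struct Button(u32);

impl Widget for Button {
    fn id(&self) -> u32 {
        self.0
    }
}

/// Four trait-object pointers: each is a data pointer plus a vtable pointer.
struct FatSlots {
    slots: [Box<dyn Widget>; 4],
}

/// Four pointers to a concrete type: one word each.
struct ThinSlots {
    slots: [Box<Button>; 4],
}

fn main() {
    let fat = FatSlots {
        slots: [Box::new(Button(0)), Box::new(Button(1)), Box::new(Button(2)), Box::new(Button(3))],
    };
    let thin = ThinSlots {
        slots: [Box::new(Button(0)), Box::new(Button(1)), Box::new(Button(2)), Box::new(Button(3))],
    };
    let fat_ids: u32 = fat.slots.iter().map(|w| w.id()).sum();
    let thin_ids: u32 = thin.slots.iter().map(|w| w.id()).sum();
    println!("id sums: {} {}", fat_ids, thin_ids);
    // On a 64-bit target this is 64 bytes versus 32 bytes.
    println!("FatSlots: {} bytes", size_of::<FatSlots>());
    println!("ThinSlots: {} bytes", size_of::<ThinSlots>());
}
```

If a trait silently switched between the two representations, a struct like FatSlots would go from filling a typical 64-byte cache line to half of one (or the reverse) without any change to its own definition.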

One might answer that I am talking about very specific performance-sensitive use cases, and that most people don’t need to deal with that. And that is spot on: thin pointers are only useful in these performance-sensitive use cases, so as long as it is an opt-in thing, those who don’t need it won’t be affected. Thin pointers should be considered with the mindset of the people who think in terms of cache lines, because it is their problem that is being fixed, and not in terms of the people who won’t be affected anyway.

Apologies, I tend to write long comments.

I created a new topic with a (way too long) summary of my thoughts on the subject to not hijack the broader discussions of the blog post:

I don't disagree with this. But introducing Fat<T> is not the only way to achieve those objectives. For example, at some point (can't dredge up the link right now), I was discussing the idea of inherent traits (something people have requested for other reasons), along with the restriction that thin traits must be inherent:

I’ve read the RFCs and blog posts floating around about virtual structs and thin pointers, but I haven’t actually seen the discussion from the Servo team delineating what their use case is, what it looks like now, and how such a feature would make their job much more tractable, besides the one in Niko’s series. Can someone point me to where such a discussion is located?

@kristof the background section of a previous post on here goes into more detail (I don’t recall any indication that it has changed).
