Pre-eRFC: Let's fix DSTs


#2

For reference, I’ve got a custom RFC draft at https://github.com/kennytm/rfcs/tree/dyn-type/text/0000-dyn-type, but far from complete even for Pre-RFC. I believe there should a dedicated syntax for declaring custom DST instead of simply impl Referent for T.

Some opinions on this Pre-RFC from what I’ve learned there:

  1. DynAligned doesn’t work. The minimal promise a type can give about alignment, if any, is AlignFromMeta. This is because in a DST struct

    struct X<T: DynAligned> {
        a: u8,
        b: T,
    }
    

    if you need to read &x.b, you’ll first need to find the offset of b, which in turn requires knowing the alignment of b without knowing its pointer value.

    Even if DynAligned is more flexible, I don’t think any types need such flexibility, or can do anything useful with it. So I’d recommend reducing the hierarchy to 5 traits (remove DynAligned and collapse AlignFromMeta into DynSized):

    Referent                                   (extern type)
    DynSized:      Referent      (just a super trait for SizeFromMeta and Aligned)
    SizeFromMeta:  DynSized                    (dyn Trait)
    Aligned:       DynSized                    (CStr, Thin)
    Sized:         Aligned + SizeFromMeta      (u8, u16)
    
  2. Deprecating ?Sized in favor of a white-list of special traits is interesting. As you mentioned, I have some worry about the interaction with Move though. Would this mean T: Sized becomes T: Sized + ?Move? Wouldn’t it break backward compatibility?

    An alternative idea is to make ? means ?SizedT: DynSized + ?. The + ? looks very cryptic, but is still better than the Sized + ?Sized nonsense (this means ?Move).


Another [Pre-RFC] Custom DSTs
#3

Sadly this doesn’t work well for enums, because the type information necessary to understand enum optimizations would be lost after unsizing - so if you allow T: ?Sized types in Option<T>, you have to disable optimizations for all Options.
But I suppose we could allow it, if anyone needs unsizing more than enum optimizations.

EDIT: I’ve been convinced you can have your cake and eat it too, at the cost of slightly fatter pointers - see comment: Pre-eRFC: Let’s fix DSTs.

Everything else seems really good! I wonder if we can pick better names for Referent and Meta, but they’re decent even if we can’t.


#4

FYI I’ll be pretty busy with school work in the next few days, so probably won’t be able to reply to anything until Thursday or Friday


#5

Excited to see where this thread goes. Thinking more thoroughly about DSTs is one of my personal goals for 2018.

Haven’t thought about the posts so far enough to comment on them, but I just wanted to raise this as an example of the kind of things this would hopefully support:

Error is essentially a wrapper around a custom version of dyn Fail. This custom DST should ideally have these properties:

  1. It should be a single pointer; the vtable should be stored inline in the heap allocation with the data (this is to make Result<?, Error> smaller).
  2. It should store a Backtrace inside the heap allocation as well.

Not sure what the best way to do this is, just raising it as an example of what we should support.


#6

Can’t the necessary information become part of the fat pointer?


#7

That… would work (e.g. a mem::discriminant::<Enum<T>> function pointer for unsizing Enum<T>).
It would make the fat pointer larger, but maybe that’s an alright trade-off?


#8

Once upon a time we also had ideas about nifty things like “nested DSTs”, e.g. &[[T]] would contain two pieces of metadata in the fat pointer - the size of the outer [] and of the inner - resulting in a regular (rectangular) 2D array. Likewise, &[Trait] would be a homogeneous slice of an unknown type that implements Trait. And even just (this is no longer nested, only nifty) unsizing Vec<Type> into Vec<Trait>, with the Vec itself being a kind of smart pointer (just, to multiple values), and fattening up to store the additional vtable pointer. (Vec<Trait> itself may not be that useful, because you won’t be able to push any new items onto it, so maybe it’s not the best possible example; but the point is that anything which doesn’t store T unboxed, rather only through pointers, could/should/would be able to behave “like a smart pointer” from the perspective of DSTs.)

Can these kind of things be accomodated by the plan (even if just to the extent of not ruling them out)?


#9

Yes! <[T]>::Meta can be defined as (T::Meta, usize) (in a library no less).
I think only traits need special compiler support (unnameable vtable types) but they should compose such that [Trait] does the right thing.

EDIT: To expand a bit more: implementing this RFC would cover all the implementation difficulties of [[T]] / [Trait] (right now parts of the compiler assume metadata is either zero-sized or a primitive integer/pointer) and so adding [[T]] / [Trait] support would be a no-brainer (assuming no backwards-compat concerns from removing WF([T]) => T: Sized).


#10

This assumes in [[T]] all inner slices have the same length, or in [dyn Trait] it is always homogeneous. [CStr] would be impossible. I suppose this is intended as this keeps unsizing O(1)?

Also, a [[T]] cannot be used to represent a 2D matrix slice since we can’t encode the “stride”.


#11

Correct. Those are the interpretations of exists N:usize,M:usize.[[T; N]; M] and exists T:Trait,N:usize.[T; N], respectively - consistent compositions of the individual existential forms Rust allows.


#12

This is just a thought, but if you say “Let’s fix DSTs”, the first thing I want is that of the code that we can write today with generic items and types — much more of it should work by itself with trait object types substituted in.


#13

I’m still reading the rest, but I just want to say I love this introduction =)


#14

I sort of like the idea of using extern type for this; that seems to avoid the unsized type syntax that I was floating before.

One thing that is part of these traits is what type of non-meta data is expected, right? At least for things like DynSized, the Meta presumably imposes some conditions on what it can be combined with, but those are kind of implicit. That’s maybe OK, just interesting.

I like the Thin<T> setup, that seems elegant, though clearly for any particular dynamically sized type, one could make it thin as well. e.g., it seems relatively straighforward to satisfy @withoutboats’s use case from failure, right?

I definitely agree that the cognitive costs of ?Traits are problematic. I think that the idea of having a “family” of Sized-related traits helps somewhat here, where picking one from that family kind of disables the others, helps somewhat here; it’s not a complete solution (but there may not exist a complete one).

More specifically, one problem I experienced when we were discussing ?DynSized was that it was very hard to talk in “negative” terms – it feels like your proposal helps here somewhat. As an example, just look at this list:

  • T – size known statically
  • T: ?Sized – size known dynamically
  • T: ?DynSized – size not known at all

It’s logical, but it feels somehow weird to say “in order to have a type T whose type is known dynamically, I say that it is not known statically.” The idea of making T: DynSized helps here, for sure.

(By the way, how under this proposal does one do the equivalent of T: ?DynSized? Oh, I guess T: Referent?)

Another thing I’ve found quite annoying in practice with ?Sized bounds is that they must be repeated so commonly, at every point. For example:

struct Foo<T: ?Sized> { }

impl<T: ?Sized> Foo<T> {
    fn compare<U>(&self, other: &Foo<U>) { ... }
}

See the problem? Oh, right, I need U: ?Sized in compare … =) This part is not helped by your proposal, of course. e.g., I would presumably have to write T: DynSized at each spot there.

(And of course there is the weirdness that adding bounds makes the type match more types — but even though that doesn’t make sense when you think hard about it, I do think it could feel more natural overall. Typically, adding bounds gives you more capabilities – and here it’s giving you the “extra” ability to work with types like [T]. Of course, it’s only when you think hard that you realize this “extra” ability is something the caller gets, not the callee as usual. But it’ll matter to stuff like semver guidelines.)

I am still interested in seeing if we can “rephrase” these concepts to connect them to ownership and borrowing. In particular, it feels very “significant” to me that T: Referent / T: ?DynSized basically corresponds to a non-owned T.


#15

Do you have any further thoughts on this? Just as a contrary data point, to me it feels like an accident - though I don’t have any specific, concrete thoughts about it (either).


#16

Well, I don’t think it’s an accident, but the idea may indeed be a sort of dead-end. On the one hand, the main reasons I can think of to know the size of something all correspond to ownership:

  • You need to free it (i.e., you own it), and hence want to call dealloc
  • You want to copy it to your stack frame (i.e., you want to take ownership of it) and need to know how much stack space to allocate

However, there is one other place where we really need to know the size: if you have a type like &[A], you are borrowing those A values, but you still need to know their size to compute their offset. Similar things could apply for other types – the main reason this isn’t true for structs is that we restrict DST types to being in the final position layout-wise, so we only need to know their alignment to compute their offset.

I’ve been thinking more about @mikeyhew’s proposal to have a “family of sized traits” and to have you pick the most appropriate one (with a default of Sized, which we might even consider deprecating and renaming to something like StaticSized). It’s growing on me. =)

It seems like the learning progression would be something like:

  • Initially, you don’t think about it, you just write T.
  • Later, you learn that types have size categories – essentially – and that each type parameter defaults to Sized, but you can manually select a different one (e.g., T: DynSized or T: Referent)
    • careful attention to names seems important here to make this feel natural

This seems ungreat but somehow better than ?Sized to me. Obviously the behavior of these traits is kinda special (in that typing T: Referent disables a default bound) but ?Referent is also kinda special (more syntax).

Obviously there is still the “repeat all the time” problem, where you have to always repeat T: Referent or T: DynSized, but I suppose we might be able to address that too via a lint. I am imagining something fairly targeted:

If you author a type that “opts out” of Sized, like struct Rc<T: DynSized>, then we will lint within that crate (or within impls on that type? something like that) if you have Rc<U> and U uses the implicit sized declaration. i.e., you’d be encouraged to write U: Sized or U: DynSized etc, so as to make explicit what you want. (I am trying to make this targeted so that it affects the author of the Rc abstraction, but not others who are just using it.)


#17

I see what you mean. T: DynSized as currently written still works with Box<T>, and can be copied around with no problems, but it doesn’t work with Rc<T>.

In particular, that means that Rc<Thin<dyn Trait>> doesn’t work. I think it’s really important that we get that working, so let’s think about it for a second. Here’s what the monomorphized type RcBox<Thin<dyn Trait>> looks like:

struct RcBox<Thin<dyn Trait> {
    strong: ...,
    weak: ...,
    value: Thin<dyn Trait> {
        meta: <dyn Trait>::Meta, // stored at alignment of <Thin<dyn Trait>>
        data: dyn Trait
    }
}

The problem is, the value field of RcBox, is stored at an offset determined by Thin<dyn Trait>'s alignment, which is only determined by reading value.meta. but value.meta is stored at that alignment, meaning we have a chicken-or-egg problem and have to read value.meta to read value.meta.

The solution, I guess, is to “lift” the meta field out into the containing RcBox struct, and store it at the proper alignment of a <dyn Trait>::Meta. That would look like this:

struct RcBox<Thin<dyn Trait>> {
    strong: ...,
    weak: ...,
    value.meta: <dyn Trait>::Meta, // stored at alignment of <dyn Trait::Meta>
    value: Thin<dyn Trait> {
        data: dyn Trait
    }
}

What we need for Thin to work is a way to tell the compiler that a Thin<T>, when stored as a field in a containing struct, should have the meta field separated out and stored at its own static alignment. A Thin<T> by itself would always be stored at the alignment of the T stored within (assuming it’s larger than the alignment of the T::Meta also stored within), and an RcBox<Thin<T>> would also be stored at the alignment of the T stored within, however the padding between the meta and data field of the Thin<T> could vary depending on whether the Thin<T> is stored by itself or as a field in a struct.

I’m not sure what traits we would need for that. Something to experiment with though.

Unless someone comes up with a better solution. There’s also the option of just creating ThinRc<T>, ThinBox<T> and other Thin*<T> pointer types, but then you have to create a “thin” version of each pointer type.


#18

I have a question that why Thin is designed to be a struct. In my mental model, it is better to be a trait. Assume the “field in trait” feature is ready, we could define the Thin trait to be:

trait Thin { 
    type Meta;
    field metadata : Self::Meta;
}

Any concrete type which implemented this trait is guaranteed to contain a metadata. So if T: Thin, &T Box<T> and Rc<T> could make use of this field and themselves could be thin pointers. And if there is a trait Widget: Thin {} , we could make the trait object Box<Widget> a thin pointer too, since the instance of a concrete type T contains a metadata if T: Widget. (Of course, Thin trait must be specially treated by compiler, may be a lang item).

Disclaimer: I don’t know whether it is feasible to design this trait. I just feel the Thin struct is kind of hard to understand for me.


#19

I’m not really tied to any particular syntax, I was just thinking that, since the extern type syntax is already there, we may as well use it while we’re testing things out

I’m sorry, I really have no idea what you’re asking here :sweat_smile:

I like it too, and hopefully we can make it work, although there are some complications with DynAligned as @kennytm pointed out. And yes, the idea is that Thin<T> works for any T: SizeFromMeta (I initially wrote T: DynSized, but I’m gonna change that right now).


#20

I’m glad to see that you’re all enthusiastically discussing how we can make all these traits as ergonomic as possible. However, since discussion is already taking place on basically that topic in https://github.com/rust-lang/rfcs/issues/2255, it’s kind of awkward to be discussing the same thing in two places at once.

How about this: we keep the discussion about ergonomics confined to the GitHub issue, and for now, in this topic, focus on which traits we actually need, and assume we’ll figure out an ergonomic way to deal with them. (I know, this is my fault, since I’m the one that started talking about it here.)


#21

(I wasn’t talking about ergonomics personally, but the relationship (if any) between the sizedness traits and ownership, which seems like a more fundamental matter)