Pre-eRFC: Let's fix DSTs

(I wasn’t talking about ergonomics personally, but the relationship (if any) between the sizedness traits and ownership, which seems like a more fundamental matter)

The reason that Thin<T> is a generic struct is so that it can work for any type T: SizeFromMeta, and in particular, any trait object type. Thin as a trait, if it works, would only work for traits with Thin as a supertrait. That would be fine for object-oriented style hierarchies like Widget, but isn’t general enough to support most traits in Rust.

How might "sizeless types" defined in Arm C language extension for SVE fit in here?

Informally, sizeless types can be used in the following situations:

  • as the type of an object with automatic storage duration;
  • as a function parameter or return type;
  • ...
  • as the target of a pointer or reference type; and
  • as a template type argument.

Sizeless types may not be used in the following situations:

  • as the type of a variable with static or thread-local storage duration (regardless of whether the variable is being defined or just declared);
  • as the type of an array element;
  • as the operand to a new expression; and
  • as the type of object being deleted by a delete expression.

In all other respects, sizeless types have the same restrictions as the standard-defined incomplete types. This specifically includes (but is not limited to) the following:

  • The argument to sizeof and _Alignof cannot be a sizeless type, or an object of sizeless type.
  • It is not possible to perform arithmetic on pointers to sizeless types. (This affects the +, -, ++ and -- operators.)
  • Members of unions, structures and classes cannot have sizeless type.
  • _Atomic variables cannot have sizeless type.
  • It is not possible to throw or catch objects of sizeless type.
  • Lambda expressions cannot capture sizeless types by value, although they can capture them by reference. (This is a corollary of not allowing member variables to have sizeless type.)
  • Standard library containers like std::vector cannot have a sizeless value_type.

When is it useful to have alignment determined at run time?

I might be lacking imagination, but I need DSTs for things like multidimensional slices and custom fat pointers, and for all of them I’d be OK with just hardcoded usize alignment.


@kornel The obvious case is trait objects – they have the alignment of the erased type.

std::mem::align_of_val(&5i32 as &dyn std::any::Any) // => 4
std::mem::align_of_val(&5i64 as &dyn std::any::Any) // => 8

In that case, would this work?

Thin<dyn Trait> {
     meta: <dyn Trait>::Meta, // stored at alignment of Meta
     variable_padding: [u8; ?],
     data: Trait,  // stored at alignment of trait's implementation
}

The Thin struct would have compile-time constant alignment of its Meta and a variable length. The data alignment would be stored in Meta and you’d have to read it to compute the data pointer.
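To make the "read the alignment from Meta to compute the data pointer" step concrete, here is a minimal sketch. The names `round_up`, `data_offset` and `thin_size` are hypothetical, and the header size and data alignment are passed in as plain numbers instead of being read from a real vtable. It assumes the Thin value itself starts at an address that is already aligned for the data.

```rust
/// Round `offset` up to the next multiple of `align`.
/// `align` must be a power of two, as Rust alignments always are.
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Offset of `data` from the start of the Thin value (hypothetical helper).
/// `meta_size` is the size of the stored metadata in bytes and `data_align`
/// is the alignment that would be read out of that metadata at run time.
fn data_offset(meta_size: usize, data_align: usize) -> usize {
    round_up(meta_size, data_align)
}

/// Total size of the Thin value: metadata, variable padding, then the data.
fn thin_size(meta_size: usize, data_align: usize, data_size: usize) -> usize {
    data_offset(meta_size, data_align) + data_size
}
```

With an 8-byte metadata field, `data_offset(8, 4)` and `data_offset(8, 8)` are both 8 (no padding), while `data_offset(8, 16)` is 16 (8 bytes of padding).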

Consider the type Thin<u64x2> where u64x2 has 128-bit (16-byte) alignment. The Thin type according to your scheme would be arranged as

struct Thin {
    meta: &'static Vtable,   // size = 8 bytes
    padding: [u8; 8],
    data: u64x2,             // offset = 16
}

However, the alignment of Thin would only be 8,

struct Foo {
   a: u8,
   b: Thin<u64x2>,   // offset = 8
}

here foo.b's offset can only be 8, which means the offset of foo.b.data would be 24, making access to foo.b.data misaligned.

We could fix it by making that “variable padding” depend on the run-time pointer value of self in addition to the alignment derived from meta. The drawback is that ptr::copy will require two separate calls to memcpy.
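A rough sketch of what address-dependent padding would mean, using plain integers for addresses. `padding_at` and `data_addr` are hypothetical helpers for illustration, not part of any proposal in this thread.

```rust
/// Round `addr` up to the next multiple of `align` (a power of two).
fn round_up(addr: usize, align: usize) -> usize {
    (addr + align - 1) & !(align - 1)
}

/// Padding between `meta` and `data` when it is derived from the run-time
/// address of the Thin value instead of from a fixed offset.
fn padding_at(self_addr: usize, meta_size: usize, data_align: usize) -> usize {
    let meta_end = self_addr + meta_size;
    round_up(meta_end, data_align) - meta_end
}

/// Address at which `data` ends up for a Thin value stored at `self_addr`.
fn data_addr(self_addr: usize, meta_size: usize, data_align: usize) -> usize {
    self_addr + meta_size + padding_at(self_addr, meta_size, data_align)
}
```

With an 8-byte meta and 16-byte data alignment, a Thin stored at address 8 needs no padding (data at 16), while the same value stored at address 16 needs 8 bytes of padding (data at 32). Because the gap changes with the destination address, a copy has to move meta and data separately. For comparison, the fixed-offset layout puts the data at 8 + 16 = 24 in the Foo example, and 24 is not a multiple of 16.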


@kennytm Your use of a non-trait-object type with Thin reminded me of something that I’ve been meaning to point out: the definition of Thin that I wrote earlier needs another type parameter in order to support unsizing:

struct Thin<T: SizeFromMeta, U: Unsize<T> + SizeFromMeta = T> {
    meta: <T as Referent>::Meta,
    data: U
}

Are you sure? I think this may be a counter-example:

let foo: RefCell<Option<&i32>> = RefCell::new(Some(&34));

// bar has pointer metadata for an `Option<Any>`, including discriminant
// EDIT: updated to replace `&Any` with `Any`
let bar: &RefCell<Option<Any>> = &foo as &RefCell<Option<Any>>;

// allowed because `bar` is only a shared borrow of the `RefCell`,
// and not the contained `Option`
foo.replace(None);
println!("{}", bar.borrow().is_some());
// prints "true" because the pointer metadata still says that
// `Some` is the active variant
// It's easy to cause UB here

Not the discriminant, but mem::discriminant::<Enum<T>> the function (as a fn pointer, to be exact).

As in, how to obtain the discriminant. You don’t need to know how to change it, since you can’t do that after unsizing through the unsized type - as you show, by using foo instead of bar (I think you don’t need RefCell, btw, Cell should work fine).

Also, did you mean to write Any instead of &Any?

Oh, I see. And yes, Any should be there in place of &Any, good point. EDIT: fixed the error in the original comment


Won't you also need to be able to get the field offsets? e.g. for Option<T>, the T might be in both offsets 0 and 8.

Okay, yeah, you probably need something closer to a vtable for “virtual fields”, I was oversimplifying when I said “a function pointer”.

Can’t you just disable layout optimisations for enums containing potentially dynamically sized variants?

I mentioned this as the original trade-off. It would mean we can’t ever allow T: ?Sized for Option<T> (because we guarantee layout optimizations for it to unsafe code etc.).
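For reference, the layout optimization in question can be checked on stable Rust today. The sketch below only asserts what the standard library documents (an `Option` of a reference or `Box` has the size of the pointer itself); `payload_offset` is a hypothetical helper, and the offset it reports for types without a niche is an implementation detail of the compiler, not a guarantee.

```rust
use std::mem::size_of;

// Guaranteed: `None` is encoded as the null pointer, so no separate tag is stored.
const _: () = assert!(size_of::<Option<&i32>>() == size_of::<&i32>());
const _: () = assert!(size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>());

// Every bit pattern of `u64` is a valid value, so `Option<u64>` needs extra
// room for a discriminant.
const _: () = assert!(size_of::<Option<u64>>() > size_of::<u64>());

/// Byte offset of the payload inside a `Some(..)` value, or `None` if the
/// option is empty (hypothetical helper for illustration).
fn payload_offset<T>(opt: &Option<T>) -> Option<usize> {
    let base = opt as *const Option<T> as usize;
    opt.as_ref().map(|inner| inner as *const T as usize - base)
}
```

For `Option<&i32>` the payload has to sit at offset 0, since the whole value is the size of one pointer. For `Option<u64>` the payload is pushed past the discriminant, which is the "T might be at different offsets" problem raised above.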


Hopefully the Custom DST implementation that we come up with would allow experimenting with stuff like that in a library, without needing to add special support and syntax to the compiler. I can definitely see something like [Trait] being doable – I would call it ErasedSlice<T> where T: SizeFromMeta, with its metadata being a (usize, T::Meta) tuple. I'm not so sure how useful a nested slice [[T]] would be, since it would have to be contiguous, but it should be doable in a third-party library.
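A very rough sketch of how the size of such an ErasedSlice could fall out of its metadata. Everything here is hypothetical: this `SizeFromMeta` is a simplified stand-in for the trait discussed in this thread (the real one may look quite different), and `Erased` / `ElemMeta` are made-up placeholders for an erased element type and its vtable-like metadata.

```rust
/// Simplified, hypothetical stand-in for `SizeFromMeta`: a type whose size
/// can be computed from its pointer metadata alone.
trait SizeFromMeta {
    type Meta: Copy;
    fn size_from_meta(meta: Self::Meta) -> usize;
}

/// Hypothetical vtable-like metadata shared by every element of the slice.
#[derive(Clone, Copy)]
struct ElemMeta {
    size: usize,
}

/// Placeholder for an erased element type such as a trait object.
struct Erased;

impl SizeFromMeta for Erased {
    type Meta = ElemMeta;
    fn size_from_meta(meta: ElemMeta) -> usize {
        meta.size
    }
}

/// Size in bytes of the sketched `ErasedSlice<T>`, given its
/// `(usize, T::Meta)` metadata: element count times per-element size.
fn erased_slice_size<T: SizeFromMeta>(meta: (usize, T::Meta)) -> usize {
    let (len, elem_meta) = meta;
    len * T::size_from_meta(elem_meta)
}
```

For example, `erased_slice_size::<Erased>((3, ElemMeta { size: 16 }))` is 48. The key design point is that one element metadata is stored for the whole slice, so all elements must share the same erased type.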

Having now skimmed the unsized rvalues RFC, it seems these sizeless types will be very close to DynSized. The difference is that the size is a runtime constant (it is unknown at compile time but is constant at runtime) so the following rules for DynSized can be relaxed.

These types weren't in upstream LLVM yet the last time I checked, so maybe nothing needs to be done now. Nevertheless, it should be kept in mind that another trait between Sized and DynSized, say RuntimeSized, will probably need to be added at some stage.


A couple of questions:

  1. What do you mean by “runtime constant”? Would the Pixels example from above be an instance of this or not, and why?

  2. What would the difference between DynSized and RuntimeSized be?

Basically it means std::mem::size_of exists for the type but it isn’t const.

So I know I said that I wanted to talk about this in the other issue, but the ownership stuff is at least somewhat on-topic, and it's a lot easier to reply here, so that's what I'm gonna do :)

My opinion is that any correspondence between ownership (whether you can pass something by value or just by reference) and sizedness (whether something's size is known at compile time, run time or not at all) is an accident of the way things are in Rust today. OK, that's not entirely true – you will never be able to pass something by value without knowing its size. However, the fact that you can't pass ownership of something to a function without knowing its size is, in my opinion, an unnecessary restriction, and one of my DST-related goals is to have this restriction lifted.

Currently, there are two main ways of passing ownership of a value to a function:

  1. By value. This works for Sized + Move types, and with the unsized rvalues RFC would work for DynSized + Move types as well, by using a &move-reference under the hood.
  2. By boxed value. This works for any DynSized type, but would not work for every Referent type because there is no way to deallocate the boxed value without knowing its size.

There is a potential third way of passing ownership that works for any Referent type: explicit &move-references. With &move T, we can pass ownership of a value, without needing to allocate it on the heap, and without needing to know its size. Whereas the unsized rvalue sugar would presumably only support types that are Move + DynSized, &move T works for any T.
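The two existing ways can be sketched in today's Rust as follows. The function names are made up for illustration. The third way is not shown because `&move` is not valid Rust syntax today, and the Move, DynSized and Referent traits exist only in the proposals discussed here.

```rust
use std::fmt::Debug;

/// 1. By value: the callee takes ownership directly. The implicit
/// `T: Sized` bound means the size must be known at compile time.
fn consume_by_value<T: Debug>(value: T) -> String {
    format!("{:?}", value)
}

/// 2. By boxed value: ownership of an unsized value is passed through a
/// heap allocation. Dropping the `Box` deallocates it, which is why the
/// size has to be recoverable at run time.
fn consume_boxed(value: Box<dyn Debug>) -> String {
    format!("{:?}", value)
}

/// The same pattern for a slice, whose length lives in the fat pointer.
fn consume_boxed_slice(value: Box<[u8]>) -> usize {
    value.len()
}
```

For example, `consume_by_value(5i32)` and `consume_boxed(Box::new(7u8))` both take ownership of their argument, but only the second works when the concrete type has been erased.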