For a while, I've been thinking about how we can make Dynamically-Sized Types (DSTs) more usable. DSTs are, in my opinion, a really cool part of Rust that I've known about since I started learning how to program in it, mostly because of getting really confusing error messages about "the trait bound core::marker::Sized is not satisfied".
Problems
In today's Rust, there are a number of pain points that you encounter when working with DSTs. Here is a list of issues I've come across:
1. There is no way to create your own, custom DSTs
2. No way to write code that splits a fat pointer to a DST into its (data_pointer, metadata) parts and puts them back together, generically for all DSTs
3. No way to pass ownership of a DST into a function (except with Box<T>)
4. No way to create a DST on the stack, except through unsizing
5. No safe way to create a Box<DST>, Rc<DST>, or other heap-allocated data structure, except through unsizing
6. You can have DST structs where the last field is a DST, but the only way to safely create one is by making it generic and unsizing the last field
7. Writing ?Sized everywhere sucks
What I am going to propose here aims to solve #1, #2, and #7. #3 can be solved either by &move-references or the Unsized Rvalues RFC, and #4 can be solved with allocas or something similar. #5 should hopefully be doable one day with box DST expressions, and #6 with Unsized Rvalues and with box DST. The traits that are discussed here would be used in implementing all of those.
Proposal
First, we add the following trait:
// trait for types that can be behind a pointer
trait Referent {
    type Meta: Copy;
}
The Referent trait is implemented by all types in Rust [1]. Referent::Meta is the type of the fat pointer metadata, i.e. usize for [T] and str, and the vtable(s) for trait objects. For many types this is just ().
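To make the "fat pointer metadata" idea concrete, here is a small sketch that runs on today's Rust. It only measures how many machine words a reference occupies. The helper name pointer_words is made up for this illustration, and the two-word layout of fat pointers is how current compilers behave rather than something the language formally guarantees.

```rust
use std::mem::size_of;

/// Number of `usize`-sized words that a shared reference to `T` occupies.
/// (Hypothetical helper, not part of the proposal.)
fn pointer_words<T: ?Sized>() -> usize {
    size_of::<&T>() / size_of::<usize>()
}
```

With current compilers, pointer_words::<u8>() is 1 (the metadata is ()), while pointer_words::<[u8]>(), pointer_words::<str>() and pointer_words::<dyn std::fmt::Debug>() are all 2: one word for the data pointer and one for the length or vtable pointer.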
The point of the Referent trait is so that we can have the following compiler intrinsic methods, which address problem #2. Note that I'm assuming there is no default Sized bound here.
fn assemble<T: Referent>(data: *const (), meta: <T as Referent>::Meta) -> *const T;
fn assemble_mut<T: Referent>(data: *mut (), meta: <T as Referent>::Meta) -> *mut T;
fn disassemble<T: Referent>(ptr: *const T) -> (*const (), <T as Referent>::Meta);
fn disassemble_mut<T: Referent>(ptr: *mut T) -> (*mut (), <T as Referent>::Meta);
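For one specific DST, slices, you can already do this split-and-rebuild on stable Rust, which may help show what the generic intrinsics are meant to generalize. This is only a sketch for [T]: the names disassemble_slice and assemble_slice are invented here, and they work on references rather than raw pointers to keep the example short.

```rust
use std::slice;

/// Split a slice reference into (data pointer, metadata).
/// For `[T]` the metadata is the element count.
fn disassemble_slice<T>(s: &[T]) -> (*const (), usize) {
    (s.as_ptr() as *const (), s.len())
}

/// Rebuild a slice reference from its two parts.
///
/// # Safety
/// `data` must point to `len` initialized, properly aligned `T`s that stay
/// valid and unmodified for the whole lifetime `'a`.
unsafe fn assemble_slice<'a, T>(data: *const (), len: usize) -> &'a [T] {
    unsafe { slice::from_raw_parts(data as *const T, len) }
}
```

Round-tripping a slice through disassemble_slice and assemble_slice gives back a slice with the same address, length and contents. The proposed assemble/disassemble would do the same thing for any T: Referent, not just slices.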
OK, next, we add or change the following traits. Note that in the following code, DynSized is not a default supertrait/Self bound as has been discussed elsewhere.
trait DynAligned: Referent {
    fn align_of_val(r: &Self) -> usize;
}
// implemented by all types except `extern` types
trait DynSized: DynAligned {
    fn size_of_val(r: &Self) -> usize;
}
// for all of the below traits, there are also `impl`s of one of their supertraits
// so that you only ever have to implement 3 traits:
// Referent, one alignment trait, and one sized-ness trait
// I haven't added them here because they take up a lot of space.
// you can see the full code here:
// https://gist.github.com/mikeyhew/36c75640f6b47fed150f1c9aeb0bceff
trait AlignFromMeta: DynAligned {
    fn align_from_meta(meta: <Self as Referent>::Meta) -> usize;
}
// implemented by all types in the language right now, including
// trait objects, `[T]` and `str` + all `Sized` types
trait SizeFromMeta: DynSized + AlignFromMeta {
    fn size_from_meta(meta: <Self as Referent>::Meta) -> usize;
}
trait Aligned: AlignFromMeta {
    const ALIGN: usize;
}
trait Sized: SizeFromMeta + Aligned {
    const SIZE: usize;
}
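For comparison, today's standard library exposes the "dynamic" end of this hierarchy as the free functions std::mem::size_of_val and std::mem::align_of_val, which work on any ?Sized value. The sketch below uses them on a trait object; layout_of is a name I made up for the example, and dyn Debug is just a convenient trait object to demonstrate with.

```rust
use std::fmt::Debug;
use std::mem::{align_of_val, size_of_val};

/// Size and alignment of the concrete value behind a trait object,
/// read at run time through the vtable.
/// (Hypothetical helper; roughly what `DynSized::size_of_val` and
/// `DynAligned::align_of_val` would report for `dyn Debug`.)
fn layout_of(value: &dyn Debug) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}
```

Calling layout_of(&0u8) reports a size and alignment of 1, and layout_of(&[0u32; 4]) reports 16 bytes with the alignment of u32, even though the function only ever sees a &dyn Debug.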
Why so many traits?
There are a lot of traits here: 7, by my count. Why not just have Referent, DynSized, and Sized? Well, the idea is that it lets code be as generic as possible. For example, [T] and str are SizeFromMeta + Aligned, and there may be some cases where those are exactly the traits required by some generic data structure or function.
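Here is a rough sketch of what "SizeFromMeta + Aligned" buys you, written with stand-in traits so that it compiles on today's Rust. The traits below are simplified local copies (no supertrait chain, no compiler support), and layout_from_meta is a hypothetical function: the point is only that for [T] and str the size can be computed from the metadata alone and the alignment is a compile-time constant, so no value is needed.

```rust
use std::mem::{align_of, size_of};

// Simplified stand-ins for the proposed traits (assumptions for this sketch).
trait Referent {
    type Meta: Copy;
}
trait Aligned {
    const ALIGN: usize;
}
trait SizeFromMeta: Referent {
    fn size_from_meta(meta: Self::Meta) -> usize;
}

impl<T> Referent for [T] {
    type Meta = usize; // element count
}
impl<T> Aligned for [T] {
    const ALIGN: usize = align_of::<T>();
}
impl<T> SizeFromMeta for [T] {
    fn size_from_meta(len: usize) -> usize {
        len * size_of::<T>()
    }
}

impl Referent for str {
    type Meta = usize; // byte length
}
impl Aligned for str {
    const ALIGN: usize = 1;
}
impl SizeFromMeta for str {
    fn size_from_meta(len: usize) -> usize {
        len
    }
}

/// (size, align) computed from pointer metadata only, without a value.
fn layout_from_meta<U: ?Sized + SizeFromMeta + Aligned>(meta: U::Meta) -> (usize, usize) {
    (U::size_from_meta(meta), U::ALIGN)
}
```

Something like an allocator wrapper could use a bound like this to reserve space for a [u32] of length 3 before any elements exist, which a plain DynSized bound would not allow.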
What's with Sized::SIZE
In the code shown above, you'll notice I've added the alignment and size as associated consts to the Aligned and Sized traits. This is partly for consistency, but the main reason is so that they can be implemented by user-defined types. This could potentially be a hugely useful feature for some crates, or maybe I'm over-designing things.
Custom DSTs
An often-discussed potential language feature is to let people define their own dynamically sized types, and give them full control over what goes in the pointer metadata, and how the size and alignment are calculated.
The way that they would do that is by implementing the above traits. Here is a sketch of how it could work:
- For structs, enums and unions, there are default implementations of the above traits
  - For structs, the default Referent::Meta type is the last field's Meta type. This is how DST structs implicitly work today.
  - For enums and unions, the Meta type is a tuple with the Meta type of each unsized enum variant/union field. This would be a completely new feature, as unsized enums and unions are currently unsupported. @eddyb and @kennytm your thoughts here are welcome
- If there is an explicit impl for any of the above traits, it overrides the default implementation, and cancels the default implementation for any sub-traits.
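As a reminder of how struct DSTs behave today, here is a small sketch that compiles on stable Rust. Packet and describe are names invented for the example. The struct is generic over its last field, and the only safe way to get a Packet<[u8]> is to build a sized Packet<[u8; N]> and let it unsize; the resulting reference carries the slice length as its metadata, which is the "last field's Meta" rule described above.

```rust
/// A struct whose last field may be unsized (hypothetical example type).
struct Packet<T: ?Sized> {
    id: u32,
    payload: T,
}

/// Takes the dynamically sized form; the payload length comes from the
/// fat pointer's metadata, not from the type.
fn describe(p: &Packet<[u8]>) -> (u32, usize) {
    (p.id, p.payload.len())
}
```

Passing &Packet { id: 7, payload: [1u8, 2, 3, 4] } to describe works because &Packet<[u8; 4]> coerces to &Packet<[u8]> at the call site.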
Here's the Pixels example, taken from Nicole Mazzuca's (@ubsan) Custom DST RFC. (If you're reading this, Nicole, thank you for all the effort you put into that RFC, and I look forward to hearing your thoughts below.)
#[derive(Clone, Copy)]
struct PixelMeta {
    width: usize,
    stride: usize,
    height: usize,
}
extern { type Pixels; }
impl Referent for Pixels {
    type Meta = PixelMeta;
}
impl Aligned for Pixels {
    const ALIGN: usize = <f32 as Aligned>::ALIGN;
}
impl SizeFromMeta for Pixels {
    fn size_from_meta(meta: PixelMeta) -> usize {
        // just copied this from the RFC, no idea how it works
        <f32 as Sized>::SIZE * meta.stride * meta.height
    }
}
For another example of Custom DST, take a look at the section titled "Thin Pointers to DSTs".
Initially, we can just disallow Custom DST on structs, enums and unions, and let people experiment by implementing the traits on extern types.
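If it helps, the size calculation from the Pixels example can be tried out as a plain function on today's Rust. This is my own reading of the formula, not something the RFC spells out here: I'm assuming stride is the number of f32 elements per row (including any padding) and height is the number of rows. pixels_size_from_meta is a hypothetical name, and the overflow check is an addition that the original one-liner doesn't have.

```rust
use std::mem::size_of;

#[derive(Clone, Copy)]
struct PixelMeta {
    width: usize,  // visible elements per row; does not affect the size
    stride: usize, // assumed: allocated f32 elements per row
    height: usize, // number of rows
}

/// Byte size of the pixel buffer described by `meta`,
/// or `None` if the multiplication overflows `usize`.
fn pixels_size_from_meta(meta: PixelMeta) -> Option<usize> {
    size_of::<f32>()
        .checked_mul(meta.stride)?
        .checked_mul(meta.height)
}
```

Under those assumptions, a 3-wide image stored with a stride of 4 and a height of 2 needs 4 * 4 * 2 = 32 bytes.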
Thin Pointers to DSTs
Another often-requested feature is to be able to create a thin pointer to a DST, such as a trait object, in order to save memory or pass it to a C API function.
What I propose here is that we add the following type to the standard library:
// NOTE: updated to replace `T: DynSized` with `T: SizeFromMeta`
struct Thin<T: SizeFromMeta> {
    meta: <T as Referent>::Meta,
    data: T
}
and implement Referent, DynAligned and DynSized for Thin<T> like so:
impl<T: SizeFromMeta> Referent for Thin<T> {
    type Meta = ();
}
impl<T: SizeFromMeta> DynAligned for Thin<T> {
    fn align_of_val(r: &Self) -> usize {
        let fat: &T = unsafe {&*assemble(&r.data, r.meta)};
        T::align_of_val(fat)
    }
}
impl<T: SizeFromMeta> DynSized for Thin<T> {
    fn size_of_val(r: &Self) -> usize {
        let fat: &T = unsafe {&*assemble(&r.data, r.meta)};
        T::size_of_val(fat)
    }
}
This way, you can take any T: SizeFromMeta type, and create a Box<Thin<T>> or Rc<Thin<T>> as desired.
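To show the layout idea behind Thin<T> (metadata stored next to the data, so the pointer itself is one word), here is a hand-rolled special case that works on stable Rust. It is not the proposed Thin<T>: ThinWords is a made-up type that only handles [usize], and it cheats by storing the length as the first element of an ordinary boxed slice.

```rust
use std::{ptr, slice};

/// One-word owning pointer to a `[usize]`. The length lives in the first
/// word of the heap allocation instead of in the pointer.
struct ThinWords {
    ptr: *mut usize, // points at the header word (the length)
}

impl ThinWords {
    fn new(items: &[usize]) -> Self {
        let mut buf = Vec::with_capacity(items.len() + 1);
        buf.push(items.len()); // header: the "metadata"
        buf.extend_from_slice(items);
        let boxed: Box<[usize]> = buf.into_boxed_slice();
        // Drop the fat part of the pointer; the length is recoverable from the header.
        ThinWords { ptr: Box::into_raw(boxed) as *mut usize }
    }

    fn as_slice(&self) -> &[usize] {
        // SAFETY: `ptr` comes from a live boxed slice of `len + 1` words
        // whose first word is `len`.
        unsafe {
            let len = *self.ptr;
            slice::from_raw_parts(self.ptr.add(1), len)
        }
    }
}

impl Drop for ThinWords {
    fn drop(&mut self) {
        // SAFETY: rebuild the original fat `Box<[usize]>` (header + items)
        // so the allocation is freed with the layout it was created with.
        unsafe {
            let len = *self.ptr;
            drop(Box::from_raw(ptr::slice_from_raw_parts_mut(self.ptr, len + 1)));
        }
    }
}
```

A ThinWords is the size of a single usize, yet as_slice can still hand out a normal fat &[usize] by reading the header, which is essentially what the assemble(&r.data, r.meta) calls above do for a general T.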
Deprecating ?Sized
EDIT: discussion for this section should take place in More implicit bounds (?Sized, ?DynSized, ?Move) · Issue #2255 · rust-lang/rfcs · GitHub
So, one thing that I haven't mentioned yet is how we can deal with problem #7:
Writing ?Sized everywhere sucks
You may have noticed that I didn't write ?Sized anywhere in the above code. How does that work? Well, here's the rule that makes it work:
If any of the above builtin traits (Sized, Aligned, SizeFromMeta, AlignFromMeta, DynAligned, DynSized, or Referent) are present in the list of trait bounds for a generic type parameter or associated type, the default Sized bound is removed.
The intuition here is that writing T: DynSized, or T: Referent is much less confusing than T: ?Sized or T: ?DynSized, and that should help deal with the cognitive costs associated with DynSized that some team members have pointed out.
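For reference, this is what the opt-out looks like in today's Rust; byte_size is just a name I picked for the example. Under the rule above, the same function would presumably be written with a T: DynSized bound instead, since all it needs is size_of_val.

```rust
use std::mem::size_of_val;

/// Works for sized and unsized values alike, but only because of the
/// explicit `?Sized` opt-out on the type parameter.
fn byte_size<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}
```

Without the ?Sized, calls like byte_size("hello") would fail with the "Sized is not satisfied" error, because str does not meet the implicit Sized bound.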
With the above rules, we can just deprecate ?Sized and write DynSized in its place. I can imagine this idea could be pretty controversial, so if you don't like it, rather than derail the entire thread, just "like" the comment that someone's going to write 5 minutes from now saying that they love ?-traits and that what I am proposing is heresy.
I'm not sure how this would work with the Move trait that has been proposed to allow immovable types. Would Sized require Move, and Move require DynSized? Theoretically, in that case, DynSized means DynSized + ?Move, and Move means DynSized + Move. Ideas are welcome.
@nikomatsakis, @withoutboats, and @arielb1, you all expressed concerns about adding DynSized as a ?-trait. What do you think of this idea?
EDIT: As mentioned above, let's keep the discussion of this section confined to More implicit bounds (?Sized, ?DynSized, ?Move) · Issue #2255 · rust-lang/rfcs · GitHub
This is all experimental
Despite being rather long, this proposal is incomplete. There are lots of edge-cases that won't come up until we start implementing things in the compiler, and there's certainly going to be some non-edge case that people will hopefully point out in the discussion. I propose that, after discussing things in this thread, we create an eRFC with a general consensus on what we're trying to do, and then start adding the traits and intrinsics to the compiler on an experimental basis.
Let's get the discussion rolling!
[1] Unless someone wants to add "marker" types: types that can only be used at the type level, such as byteorder::BigEndian