[Pre-RFC] Improved Unsizing

Summary

There is currently a lot of pain around unsizing and custom unsized types in Rust. This proposal does not attempt to address every possible use-case or problem around the current system, but does attempt to address several current limitations in terms of improving existing tools. It does this by loosening restrictions on several existing unstable traits.

Motivation

Currently, writing custom unsized types in Rust is very painful. To initialize one tends to require large amounts of unsafe code, and to write items handling them is similarly painful, due to restrictions on CoerceUnsized. This proposal does not fix all use cases, but does make initializing custom unsized types and writing wrappers for them significantly less painful in some common cases.

Guide-Level Explanation

Unsize Changes

Unsize can now be implemented (following normal trait impl rules) by users on types that fulfill the following conditions, given a pair of types T: Sized and U: !Sized:

  • T has the same repr as U
  • T has the same fields as U, excepting the last field
  • The scope within which the Unsize impl resides can 'see' all of both T and U's fields. Same module for private fields, same crate for pub(crate) fields, dependent crates for public fields
    • For items from other crates, U must be exhaustive
  • for T's last field TLast and U's last field ULast, TLast: Unsize<ULast>

Examples

The following examples result in a type that can be unsized into another succesfully:

pub struct Object {
    name: String,
    fields: [*const Field],
}

pub struct SizedObject<const N: usize> {
    name: String,
    fields: [*const Field; N],
}

impl<const N: usize> Unsize<Object> for SizedObject<N> {}
#[repr(C)]
pub struct DebugableItem {
    count: i32,
    debug: dyn Debug,
}

#[repr(C)]
pub struct Field1 {
    count: i32,
    name: String,
}

impl Unsize<DebugableItem> for Field1 {}

#[repr(C)]
pub struct Field2 {
    count: i32,
    idx: i32,
}

impl Unsize<DebugableItem> for Field2 {}

The following examples do not result in a type that can be unsized into another:

#[repr(C)]
pub struct CSlice([u8]);

pub struct RustSlice<const N: usize>([u8; N]);

// Errors: repr mismatch
impl<const N: usize> Unsize<CSlice> for RustSlice<N> {}
mod private {
    pub struct Foo {
        a: [bool],
    }
}

pub struct Bar {
    a: [bool; 10],
}

// Error: Foo fields no visible
impl Unsize<Foo> for Bar {}

CoerceUnsized Changes

CoerceUnsized will no longer be limited to only being implemented on structs. Enums will be allowed to implement it, given that they follow the following rules:

  • The type to coerce (T) exists at least once in a non-phantomdata field in the enum
  • T exists at most once in a non-phantomdata field in each variant
  • Given each variant containing a type using T, and the type to coerce to (U), CoerceUnsized<Foo<U>> for CoerceUnsized<Foo<T>> must be implemented for the containing type

Examples

The following examples compile successfully

pub enum MaybeMut<T> {
    Foo(*const T),
    Bar(*mut T),
}

impl CoerceUnsized<MaybeMut<U>> for MaybeMut<T>
where
    T: ?Sized + Unsize<U>,
    U: ?Sized,
{}
pub enum CoercableOption<T> {
    Some(&mut T),
    None,
}

impl CoerceUnsized<CoercableOption<U>> for CoercableOption<T>
where
    T: ?Sized + Unsize<U>,
    U: ?Sized,
{}

The following examples do not

pub enum MissingT<T> {
    Foo(i32),
    Bar(bool),
}

impl CoerceUnsized<MissingT<U>> for MissingT<T>
where
    T: ?Sized + Unsize<U>,
    U: ?Sized,
{}
pub enum TooManyT<T> {
    Field(*const T, *mut T),
    Missing,
}

impl CoerceUnsized<TooManyT<U>> for TooManyT<T>
where
    T: ?Sized + Unsize<U>,
    U: ?Sized,
{}

Reference-level explanation

Unsize Changes

The compiler already automatically generates all Unsize impls, this proposal will simply expand the cases it works on. The code generated should look similar to existing generated code.

CoerceUnsized Changes

The code generated by the compiler for coerce-unsizing enums will follow a fairly simple format:

match self {
    Var1 { field_0, .., field_n } => Var1 { field_0, .., field_n as &mut U },
    ..,
    VarN { field_0, .., field_n } => VarN { field_0 as &U, .., field_n },
}

Drawbacks

TODO

Alternatives

  • User-controlled unsizing
    • Limits ergonomics, no Box::new(foo) as Box<Bar>
    • Requires unsafe to implement, manual allocation

Unresolved questions

  • Should any more situations be restricted/allowed?
    • Should downstream crates never implement Unsize<Upstream>?
  • Will this lock us out of any future alternatives?

Future Possibilities

  • Allowing unsizing based on fields (len field in a struct)

I believe the issue with enums implementing CoerceUnsized is that it will make calculating the offset of fields a lot harder due to potential alignment differences between variants.

Is this 'difficult to implement' or 'impractical to implement'?

I think that allowing this kind of coercion would be worth difficulty, as I've had at least 2 projects it would be useful for, but on the other hand, I understand hesitancy if it looks like it would slow down common cases, or just take someone time they'd rather not spend. I'd also be willing to put in some work to implement this - I'm not super familiar with the Rust code base, but it wouldn't be my first contribution.

This seems like a back-compat hazard if this coercion is allowed outside the crate. I would prefer if there was some annotation to allow this coercion outside the given crate/module.

The restrictions that all fields must be pub and the structs must be non-exhaustive is designed to require that changing the type or layout of the fields would already be a back-compat issue. Are there any cases where all fields are public and the struct is exhaustive that would be allowed before but not after this change?

To be clear: I'm not fully against an annotation, but I was hoping to design the rules such that it wasn't necessary.

Just to be clear, the above snippet would automatically create the following impl right?

impl Unsize<Object> for SizedObject {} 

so a downstream crate could do the following:

fn unsize_me(obj: &SizedObject) -> &Object { obj }

Because of that automatic implementation and the fact that we can't restrict impls to be crate local. So this isn't as straightforward as it seems.

Perhaps there's another solution, where instead of relying on Unsize it's just some builtin, but that seems to duplicate much of the work that Unsize does, so it may not be desirable.

I wanted to use Unsize because it's the standard way to write generic functions that accept unsizable types. I don't want to force duplication of functions, with unsize_builtin vs unsize_user. The impl is correct, and the exact purpose of this change - to allow user-defined types to be unsized. (Also why this is an RFC - The change would be immediately visible to stable users)

It may be possible to require an attribute on user types to generate impls for them. On the other hand, how popular are user-defined unsized types with matching definitions in a visible scope? In other words, how likely is it that people write code that looks like that today? Requiring an #[allow_unsize] attribute (your first suggestion) would fix both your mentioned issues at once - to unsize, the definer must allow it, and with what you've just said I'm starting to come around to the idea.

As opposed to #[allow_unsize], I'd propose "just" allowing to write impl Unsize<Object> for SizedObject {}. (Then I'm sure a proc-macro would pop up to package that into an attribute/derive would pop up fairly quickly.)

At an impl level, the unsizing conversion would be implemented based on field visibility. However, actually using it is gated behind a CoerceUnsized implementation, which are all (required[1] by the compiler to be) gated on T: Unsize<U>. This would require checking that no coersions bypass a CoerceUnsized check (e.g. for primitive types e.g. references, pointers) but would AIUI be a minimal change to the compiler.

(For the duplicated functions, we currently have https://lib.rs/crates/unsize.)

Also just for reference so I don't forget again when viewing this:

  • CoerceUnsize is for containers, e.g. &T: CoerceUnsize<&U> where T: Unsize<U>
  • Unsize is for objects, e.g. [T; N]: Unsize<[T]>

  1. Caveat: I'm working on adding impl<T: ?Sized + Unsize<U>, U: ?Sized> CoerceUnsized<JustMetadata<U>> for JustMetadata<T> {} to coerce <T as Pointee>::Metadata to <U as Pointee>::Metadata. I forget whether it's emitting the obligation that T: Unsize<U> (but it's still necessarily provided as part of the impl). ↩ī¸Ž

2 Likes

Ooh, I like that idea. Allow manual implementations of the Unsize marker, which the compiler checks for validity.

Updated for manual Unsize instead of automatic

Now that you're using an impl Unsize, I'd change this to be that all of the fields (in both T and U) are visible from the location of the impl.

Note that this is probably more restrictive than you intend: #[repr(Rust)] types never have the same representation as another #[repr(Rust)] type. Consider the following example where this is the case today:

struct T {
    head: u8,
    tail: [u16; 2],
}

struct U {
    head: u8,
    tail: [u16]
}

T will actually likely be laid out as

// this is an implementation detail of rustc
#[repr(C)]
struct T {
    tail: [u16: 2],
    head: u8,
}

reordering the higher alignment fields to the front of the struct to minimize padding. As unsized fields are (as a current restriction of rustc) required to be the last field of a struct, unsizing generic types only works because (as a current implementation detail of rustc) Unsize generic types are always placed at the end of the struct, and the other fields are layout optimized ignoring it.

(In practice I think this never results in layout pessimisation, and it's just a difference of where the padding goes, since we requiring padding to alignment? I don't have a proof, just a gut feeling from trying to think of a counterexample for a minute.)

If the intent is to allow unsizing #[repr(Rust)] types, then the RFC should at least mention the new layout constraint on them (if the last field is Unsize it can't be reordered) and potentially an assertion that this does not negatively impact layout optimizations.

Note that unsizing implementations are not copies; rather, they're instructions on how to create the correct pointee metadata to turn dataptr into (dataptr, metadata).

Note that I have an open PR to implement CoerceUnsized for JustMetadata<T> (which is a newtype around <T as Pointee>::Metadata). I don't know exactly what you're envisioning here, but it'll probably need to use JustMetadata<T> for the same reason CoerceUnsized directly on metadata types is problematic (what does (): CoerceUnsized<usize> even mean?).


Also I scared myself temporarily about Unsize on enums, which would be quite problematic in the face of enum discriminant optimizations/niching.

Updated all but the repr commentary. I'm not 100% on the best way to resolve that, as I'm not super familiar with Rust's current layout algorithm. My instinct for it not causing issues is that the field to unsize must be the last one in the struct. That means that there will never be a field after it that would want to reorder to before it. Thus any that it would re-order before will either fill the padding exactly either way, or, would require the end be padded to length anyways. So this is a safe restriction as long as it only applies to the final field.

// This has size 4 either way
struct Foo {
    a: u8,
    b: u8,
    c: [u16; 2],
}

// The tail has no size - no advantage to moving it
struct Bar {
    a: u8,
    c: [u64; 0],
}

There could be a field before it that wants to reorder after it to minimize padding.

The theorem to prove is that given any layout, you can construct a layout at least as good with the desired field in the unsizing location.

A failed partial proof

In the document, we use the definition of a mod b for integer a, positive integer b as the smallest nonnegative integer x where (a - x) / b = 0.

For convenience, we define a amod b as -a mod b.

For convenience, we define a ~ b as the smallest integer x â‰Ĩ a where x mod b = 0.

Lemma 1: a ~ b = a + (a amod b).

Proof left to the reader.

Lemma 2: For any nonnegative a:
Lemma 2.1: a mod b ≤ a.
Lemma 2.2: a amod b ≤ a.
Lemma 2.3: a mod b ≤ b.
Lemma 2.4: a amod b ≤ b.

Proof left to the reader. Trivial.

Theorem: It is always possible to reorder a critical type in alignment padded structure layout such that all non-padding fields come before it without increasing structure size.

We split the problem into three cases. In all cases, our critical type is Tail(size A, align B), and our structure has alignment Q, which is assumed to be sufficiently aligned for all fields.

Case 1:

All other fields are before the tail.

The layout can be broken into four parts:

  • Before(size X, align Y)
  • Padding to B
  • Tail(size A, align B)
  • Padding to Q
struct(size N1, align Q) {
    // align Q
    Before(size X, align Y)
    Padding(size X amod B)
    // align B
    Tail(size A, align B)
    Padding(size ((X ~ B) + A) amod Q)
    // align Q
}

where
N1 = ((X ~ B) + A) ~ Q)

All non-padding fields are before the critical type, thus this layout already satisfies our desired condition.

Case 2:

All other fields are after the tail.

The layout can be broken into four parts:

  • Tail(size A, align B)
  • Padding to Y
  • After(size X, align Y)
  • Padding to Q
struct(size N2, align Q) {
    // align Q
    Tail(size A, align B)
    Padding(size A amod Y)
    // align Y
    After(size X, align Y)
    Padding(size ((A ~ Y) + X) amod Q)
    // align Q
}

where
N2 = ((A ~ Y) + X) ~ Q

By definition:
B ≤ Q.
Q mod B = 0.
A mod B = 0. Y ≤ Q.
Q mod Y = 0.
X mod Y = 0.
N1 mod Q = 0.
N2 mod Q = 0.

failure

We can convert this to the layout in case 1 by setting Before(size X, align Y) = After(size X, align Y). We prove this does not increase the size.

Proof:

N1 ≤ N2
((X ~ B) + A) ~ Q) ≤ ((A ~ Y) + X) ~ Q)
let N'1 = (X ~ B) + A
        = X + (X amod B) + A
let N'2 = (A ~ Y) + X
        = A + (A amod Y) + X
N'1 ~ Q ≤ N'2 ~ Q
N'1 + (N'1 amod Q) ≤ N'2 + (N'2 amod Q)
(N'1 amod Q) - (N'2 amod Q) ≤ N'2 - N'1
(N'1 amod Q) - (N'2 amod Q) ≤ A + (A amod Y) + X - (X + (X amod B) + A)
(N'1 amod Q) - (N'2 amod Q) ≤ (A amod Y) - (X amod B)
???
failure

Reorder to

  • After(size X, align Y)
  • Padding(size ((A ~ Y) + X) amod Q)
  • Tail(size A, align B)
  • Padding(size A amod Y)

The overall size/align trivially stays the same, and we only need to prove that Tail is sufficiently aligned to B.

(X + (((A ~ Y) + X) amod Q)) mod B = 0
???

Case 3:

There are fields before and after the tail.

The layout can be broken into six parts:

  • Prior(size I, align J)
  • Padding to B
  • Tail(size A, align B)
  • Padding to V
  • Latter(size U, align V)
  • Padding to Q

Strategy: switch Prior/Tail along with padding to maybe reduce to case 2?

Fridge thought even though it's way too late to be doing formal proofs: I was assuming that the noncritical blobs also had a size multiple of alignment, and that's not necessarily the case. Additionally, a following field blob is not necessarily aligned to the maximum of its internal field allotments.

Even forgetting these things I got horribly stuck to not remembering how to do this kind of modular reasoning over multiple rings simultaneously — if I ever knew at that

The main thing that needs to be provoked for a counterexample is an unsizable tail with padding after it (this occurs to pad to the whole structure alignment). The hard part is doing so with a structure where enough fields can be niched into that space to save an alignment size. The spare unused capacity after the unsizable tail is by definition capped at align_of(Whole) - align_of(Tail), so to save an alignment size, some padding before the tail must exist to be able to combine with the trailing padding to make enough space to shrink the type.

The unsizable field is always last. For structs the padding before the field can be easily calculated using a field offset and the alignment stored in the metadata. For enums it would require reading which variant is the current variant and matching on this. This is a lot more complex and means that taking the address of a field requires a read of the pointee, which may be invalid for raw pointers.

Assuming I understood the problem correctly, here's an algorithmic proof:

Suppose the type's alignment requirement is B.

for A in [1,2,4,...,B/4,B/2]:

  1. Gather the fields with alignment A together.
  2. If their total size is not a multiple of the next alignment (2A), add one A-byte padding element to fix that
  3. Reorder these fields arbitrarily (they have the same alignment)
  4. Replace these fields (including the possible padding element) with a single tuple with 2A-alignment (this will be included in step 1 on the next iteration)

When done you only have fields of the maximum alignment B, so no more padding is required. Total padding added is at most 1+2+4+...+B/4+B/2 = B-1 bytes, so it is minimal as the padded size must be a multiple of B. Any specific field can be made the last in the struct by doing so in every step 3. (You can get much more freedom in step 4 by pairing the fields into multiple chunks, each a multiple of 2A in size.)

I'm not 100%, but I think that works? If the last field is unsize, we can always align it into last, and if not, we just don't care. And depending on when layout is decided, we may even be able to only do it when Unsize is implemented.

I realized I didn't respond to the last thing, about 'future ideas' - I have vague ideas about some #[metadata] attribute or something, but I think your DynSized proposal would fulfill that requirement, and combined with this, may even entirely fulfill my usability desires.

I'd note that in this case, it's entirely possible that Object and SizedObject use different layouts, and in particular: fields be at a different offset in SizedObject and Object. As a practical example, if *const Field was something with more alignment (say, a simd type), then then using alignment-sorting, fields would come before name in SizedObject, but not in Object. This change would effectively enjoin the compiler from ever reordering the last field of a structure relative to any other field, which can have significant impact on its ability to lay types out the most efficiently.