[Pre-RFC] Flexible Unsize and CoerceUnsize traits

veykril · May 5, 2023, 8:57am

This is an attempt at changing the current CoerceUnsized and Unsize traits in a way that makes them more flexible and leaves room for future additions to their API surface (to support even more dynamic unsizing) without incurring breakage. The aim is a minimal design that could be stabilized in the nearer future. The pre-RFC phase here is meant to collect general feedback about obvious flaws and drawbacks that may have been missed while writing and designing this.

RFC

Summary

Move the unsizing logic out of the compiler into library source, allowing for a more flexible design of the features and allowing user code to implement Unsize for their own types.

Motivation

Currently unsizing in Rust is very rigid: it is only allowed in very specific scenarios permitted by the rules surrounding the current Unsize and CoerceUnsized traits and their automatic implementations by the compiler.

This has the downside of being very magical, as the majority of the logic happens inside the compiler as opposed to the source code. It also prevents certain unsizing implementations from being doable today.

This RFC attempts to make these rules more flexible by also allowing user implementations of the traits that define how the metadata is derived and composed back into unsized objects.

Guide-level explanation

Unsize

Unsizing relationships between two types can be defined by implementing the unsafe Unsize trait for a type and its target unsized type. These implementations describe how the metadata is derived from a source type and its metadata for its target unsized type.

An example implementation of Unsize for [T; N] to [T] unsizing looks like the following:

// SAFETY:
// - `Unsize::target_metadata` returns length metadata that spans the entire array exactly.
// - `[T; N]` is a contiguous slice of `T`'s, so a pointer pointing to its data is valid
//   to be interpreted as a pointer to a slice `[T]`.
unsafe impl<T, const N: usize> Unsize<[T]> for [T; N] {
    fn target_metadata((): <Self as Pointee>::Metadata) -> <[T] as Pointee>::Metadata {
        N
    }
}

The metadata for the source type [T; N] is the unit type (), as there is no metadata for sized types. The implementation then just returns the length N from the array type, as this is the appropriate metadata for a slice produced from such an array.
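The same derivation can be sketched on stable Rust, where slices are the one unsized type whose metadata can be supplied by hand. The helper name `unsize_array` is hypothetical and only illustrates that the target metadata is exactly `N`; it is not part of the proposed API.

```rust
use core::ptr;

/// Hypothetical helper: performs the `&[T; N]` to `&[T]` unsizing by hand.
pub fn unsize_array<T, const N: usize>(array: &[T; N]) -> &[T] {
    // Data pointer of the source; a sized type carries no metadata of its own.
    let data: *const T = (array as *const [T; N]).cast::<T>();
    // The target metadata is the array length, known at compile time.
    let fat: *const [T] = ptr::slice_from_raw_parts(data, N);
    // SAFETY: `data` points to `N` initialized, contiguous `T`s borrowed for the
    // lifetime of `array`, so viewing them as a `[T]` of length `N` is valid.
    unsafe { &*fat }
}
```

Calling `unsize_array(&[10u8, 20, 30])` yields a slice of length 3 that occupies the same number of bytes as the source array.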

An example that does an unsized to unsized coercion is the following implementation (for trait upcasting provided by the compiler):

trait Super {}

trait Sub: Super {}

// SAFETY:
// - `Unsize::target_metadata` returns a vtable provided by the vtable of the `dyn Sub` object.
// - `Super` is a supertrait of `Sub`, so a pointer pointing to data for a `dyn Sub`
//   is valid to be used as a data pointer to a `dyn Super`
unsafe impl Unsize<dyn Super> for dyn Sub {
    fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <dyn Super as Pointee>::Metadata {
        metadata.upcast()
    }
}

This is an unsizing impl required for trait upcasting, where the metadata (the vtable) of the dyn Super type has to be extracted from the metadata of the dyn Sub type.
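For comparison, without relying on a built-in upcasting coercion, the same conversion can be written by hand as a trait method. The names `as_super`, `Thing` and `upcast` are hypothetical; the sketch only shows that the `dyn Super` vtable has to be obtained through the `dyn Sub` vtable, which is roughly the job `metadata.upcast()` would take over.

```rust
pub trait Super {
    fn name(&self) -> &'static str;
}

pub trait Sub: Super {
    /// Hand-written upcast; a stand-in for the `Unsize<dyn Super>` impl.
    fn as_super(&self) -> &dyn Super;
}

pub struct Thing;

impl Super for Thing {
    fn name(&self) -> &'static str {
        "thing"
    }
}

impl Sub for Thing {
    fn as_super(&self) -> &dyn Super {
        // Sized-to-unsized coercion: attaches the `Super` vtable for `Thing`.
        self
    }
}

pub fn upcast(sub: &dyn Sub) -> &dyn Super {
    // Dynamic dispatch through the `dyn Sub` vtable yields the `dyn Super` pointer.
    sub.as_super()
}
```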

CoerceUnsized

To actually enable the unsizing coercion of objects, the CoerceUnsized trait has to be implemented. It defines how the unsizing of the inner type occurs for a given pointer or wrapper type.

A CoerceUnsized implementation has specific requirements to be valid which boil down to 2 kinds:

A non-delegating CoerceUnsized impl
A delegating CoerceUnsized impl

1. A non-delegating `CoerceUnsized` impl

Such an impl is used for actual pointer like types, such as &'a T or Arc<T>. The implementing type and the CoerceUnsized target type must differ in a single generic parameter only. Say, the parameters are T and U. Then,

T is the generic parameter of the implementing type; it is bound as T: Unsize<U>
U is the generic parameter of the CoerceUnsized target type

Example impl for the `& 'a T` type

impl<'a, 'b, T, U> CoerceUnsized<&'a U> for &'b T
where
    'b: 'a,
    T: Unsize<U> + ?Sized,
    U: ?Sized
{
    fn coerce_unsized(self) -> &'a U {
        // Name the impl explicitly; `Self` cannot be inferred from the metadata argument.
        let metadata = <T as Unsize<U>>::target_metadata(core::ptr::metadata(self));
        let untyped_data_ptr = (self as *const T).cast::<()>();
        // SAFETY: [`Unsize`] demands that the return value of
        // `Unsize::target_metadata` is valid to be used together
        // with the data pointer to be re-interpreted as the unsized type
        unsafe { &*core::ptr::from_raw_parts(untyped_data_ptr, metadata) }
    }
}

Example impl for the `Arc<T>` type

impl<T, U> CoerceUnsized<Arc<U>> for Arc<T>
where
    T: ?Sized + Unsize<U>,
    U: ?Sized
{
    fn coerce_unsized(self) -> Arc<U> {
        let ptr = Arc::into_raw(self);
        let metadata = <T as Unsize<U>>::target_metadata(core::ptr::metadata(ptr));
        let untyped_data_ptr = ptr.cast::<()>();
        // SAFETY: [`Unsize`] demands that the return value of
        // `Unsize::target_metadata` is valid to be used together
        // with the data pointer to be re-interpreted as the unsized type
        // and that `std::mem::size_of_val` on the `U` will report the same size as on the `T`.
        unsafe { Arc::from_raw(core::ptr::from_raw_parts(untyped_data_ptr, metadata)) }
    }
}

Important to note is that Unsize impls are required to return metadata that makes the unsized object report the same size as the source type. If that were not the case, the Arc impl above would be unsound, as its destructor would try to deallocate using a size that differs from the one the allocation was created with.
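The size requirement can be observed with today's built-in coercions. The following is a small sketch (the function name is made up for illustration) that compares `size_of_val` of the pointee before and after unsizing an `Arc`.

```rust
use std::mem::size_of_val;
use std::sync::Arc;

/// Returns the pointee size before and after the built-in unsizing coercion.
pub fn sizes_before_and_after() -> (usize, usize) {
    let sized: Arc<[u16; 4]> = Arc::new([1, 2, 3, 4]);
    let before = size_of_val(&*sized);
    // Built-in coercion `Arc<[u16; 4]>` to `Arc<[u16]>`; the metadata becomes 4.
    let unsized_arc: Arc<[u16]> = sized;
    let after = size_of_val(&*unsized_arc);
    (before, after)
}
```

Both values are 8 bytes here (four `u16` elements), which is the invariant a user-written `Unsize` impl would have to uphold as well.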

2. A delegating `CoerceUnsized` impl

Such an impl is used for wrapper like types, such as Cell<T> or Pin<T> where the impl is required to list a CoerceUnsized bound on the generic parameters of the wrapping type.

Example impl for the `Cell<T>` type

impl<T, U> CoerceUnsized<Cell<U>> for Cell<T>
where
    T: CoerceUnsized<U>
{
    fn coerce_unsized(self) -> Cell<U> {
        Cell::new(self.into_inner().coerce_unsized())
    }
}

Example implementation for `Option<T>`

A delegating impl is not limited to struct types.

impl<T, U> CoerceUnsized<Option<U>> for Option<T>
where
    T: CoerceUnsized<U>,
{
    fn coerce_unsized(self) -> Option<U> {
        match self {
            Option::Some(t) => Option::Some(t.coerce_unsized()),
            Option::None => Option::None,
        }
    }
}

With such an impl, Option<&[T; N]> could coerce to Option<&[T]>.
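Today this coercion does not happen implicitly and has to be spelled out. A minimal sketch of the manual equivalent, with a hypothetical helper name:

```rust
/// What the delegating `Option` impl would do, written out by hand.
pub fn unsize_option<T, const N: usize>(opt: Option<&[T; N]>) -> Option<&[T]> {
    // The closure body is an ordinary `&[T; N]` to `&[T]` unsizing coercion.
    opt.map(|array| array as &[T])
}
```

The `map` call contains the same branch on `Some`/`None` that the proposed impl has in its `match`.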

Reference-level explanation

`Unsize`

The new Unsize trait definition looks like the following:

/// # Safety
///
/// The implementation of [`Unsize::target_metadata`] must return metadata that
/// - is valid for interpreting the `Self` type as `Target`, and
/// - where using `core::mem::size_of_val` on the unsized object will report the
///   same size as on the source object.
pub unsafe trait Unsize<Target>
where
    Target: ?Sized,
{
    fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <Target as Pointee>::Metadata;
}

This trait allows specifying how to derive the metadata required for unsizing from the metadata of the source type Self or the compile time type information.

`CoerceUnsized`

The new CoerceUnsized trait definition looks like the following:

pub trait CoerceUnsized<Target> {
    fn coerce_unsized(self) -> Target;
}

Implementations of this trait now specify how the coercion is done. This also drops the ?Sized bound on Target, as returning unsized values is not possible currently. This can be relaxed without breakage in the future.

In order to prevent misuse of the trait as means of implicit conversions, implementations for this trait require specific conditions to hold which the compiler will enforce. This can be relaxed without breakage in the future.

For an implementation to be valid, one of the following must hold:

Self and Target
- must be references or raw pointers to different generic parameters
- type parameter T of Self has a T: Unsize<U> bound where U is the type parameter of Target
Self and Target
- must have the same type constructor, varying in a single type parameter
- type parameter T of Self must have a T: CoerceUnsized<U> bound where U is the differing type parameter of Target
- Example:
```
impl<T: CoerceUnsized<U>, U> CoerceUnsized<Cell<U>>
 for Cell<T>
```
Self and Target
- must have the same type constructor, varying in a single type parameter
- type parameter T of Self must have a T: Unsize<U> bound where U is the differing type parameter of Target
- Example:
```
impl<T: ?Sized + Unsize<U>, U: ?Sized, A: Allocator> CoerceUnsized<Box<U, A>>
 for Box<T, A>
```

Implementations provided by the standard library

`Unsize`

Today, all Unsize implementations are provided by the compiler. Most of them will continue to be provided by the compiler as they involve trait objects which depend on all traits defined. The only one that will no longer be emitted by the compiler is the [T; N]: Unsize<[T]> implementation as we can now fully implement it in library source. The implementation will be as follows (and live in core):

// SAFETY:
// - `Unsize::target_metadata` returns length metadata that spans the entire array exactly.
// - `[T; N]` is a contiguous slice of `T`'s, so a pointer pointing to its data is valid to
//   be interpreted as a pointer to a slice `[T]`.
unsafe impl<T, const N: usize> Unsize<[T]> for [T; N] {
    fn target_metadata((): <Self as Pointee>::Metadata) -> <[T] as Pointee>::Metadata {
        N
    }
}

`CoerceUnsized`

The non-delegating implementations of CoerceUnsized provided by the standard library will have the implementation of their fn coerce_unsized function written to disassemble the source into pointer and source metadata, make use of the Unsize trait for extracting the target metadata from the source metadata, and then reassembling the pointer and target metadata into the target.

For the delegating implementations, the implementation of the fn coerce_unsized function will merely delegate to the inner value and then wrap that result again.
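The delegation pattern can be modelled on stable Rust with a local stand-in trait. Everything named `CoerceUnsizedSketch` below is hypothetical: it mirrors the shape of the proposed trait but is an ordinary trait with no compiler support, and the base case is restricted to arrays because `Unsize` cannot be named in bounds on stable.

```rust
use std::cell::Cell;

/// Stand-in for the proposed trait; the real one would drive implicit coercions.
pub trait CoerceUnsizedSketch<Target> {
    fn coerce_unsized(self) -> Target;
}

// "Non-delegating" base case, limited to array references for this sketch.
impl<'a, T, const N: usize> CoerceUnsizedSketch<&'a [T]> for &'a [T; N] {
    fn coerce_unsized(self) -> &'a [T] {
        self
    }
}

// Delegating impl: unwrap, coerce the inner value, wrap again.
impl<T, U> CoerceUnsizedSketch<Cell<U>> for Cell<T>
where
    T: CoerceUnsizedSketch<U>,
{
    fn coerce_unsized(self) -> Cell<U> {
        Cell::new(self.into_inner().coerce_unsized())
    }
}
```

With these impls, a `Cell<&[u8; 3]>` can be turned into a `Cell<&[u8]>` by calling `coerce_unsized()` explicitly; under the proposal the call would be inserted by the compiler at coercion sites.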

Implementations provided by the compiler

Note: This section uses fictional rust syntax

For coercions from a type to a trait object of a trait it implements, the compiler will generate Unsize implementations:

unsafe impl<trait Trait, T: Trait> Unsize<dyn Trait> for T {
    fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <dyn Trait as Pointee>::Metadata {
        // magic
    }
}

For the unstable trait upcasting feature, the compiler will generate the following Unsize implementations:

unsafe impl<trait Trait, trait Super> Unsize<dyn Super> for dyn Trait
where
    dyn Trait: Super
{
    fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <dyn Super as Pointee>::Metadata {
        // compiler magic
    }
}

This is safe to do, as the metadata of the source can be safely extracted from a raw pointer without touching the data, which is what a CoerceUnsized implementation on *const T/*mut T requires.

To keep backwards compatibility (as these are already observable in today's stable rust), the compiler also generates Unsize<Foo<..., U, ...>> implementations for structs Foo<..., T, ...> if all of these conditions are met:

T: Unsize<U>.
Only the last field of Foo has a type involving T.
Bar<T>: Unsize<Bar<U>>, where Bar<T> stands for the actual type of that last field.
Foo: Pointee<Metadata = <Bar as Pointee>::Metadata>
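This struct case is observable on stable Rust today. A small sketch with a hypothetical `Tagged` type whose last field is the only one mentioning `T`:

```rust
/// A struct whose last field is the only one involving `T`.
pub struct Tagged<T: ?Sized> {
    pub tag: u32,
    pub tail: T,
}

/// Relies on the compiler-generated `Tagged<[u8; 3]>: Unsize<Tagged<[u8]>>`.
pub fn unsize_tail(value: &Tagged<[u8; 3]>) -> &Tagged<[u8]> {
    // Implicit unsizing coercion at the return site.
    value
}
```

The resulting `&Tagged<[u8]>` carries the slice length of its last field as metadata.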

Unsize un-lowering for known impls

The compiler may "un-lower" some known unsize coercions back into builtin operations in the MIR so as not to degrade performance too much, as lowering this new definition will introduce a lot of new operations that don't exist in the current unsizing logic. This would be similar to how builtin operators for primitives work currently, where they are typechecked with the trait impls but then lowered back to builtin operators in the MIR.

`TypedMetadata<T>` and Unsizing

See the following PR for context: Implement pointee metadata unsizing via a TypedMetadata<T> container #97052.

With this new definition, we can implement CoerceUnsized for TypedMetadata without having to special case it in the compiler as follows:

struct TypedMetadata<T: ?Sized>(pub <T as core::ptr::Pointee>::Metadata);


impl<T, U> CoerceUnsized<TypedMetadata<U>> for TypedMetadata<T>
where
    T: ?Sized + Unsize<U>,
    U: ?Sized,
{
    fn coerce_unsized(self) -> TypedMetadata<U> {
        TypedMetadata(<T as Unsize<U>>::target_metadata(self.0))
    }
}

Pin Unsoundness

See the following issue for context: Pin is unsound due to transitive effects of CoerceUnsized #68015

The design of the new traits here does not address the underlying issue in regards to Pin. The author of this RFC feels that addressing the Pin soundness in the definitions of the traits is wrong, as in almost all cases where a user implements one of these traits Pin will be irrelevant to them. And at its core, unsizing is not coupled with Pin whatsoever.

It would make much more sense to fix the unsound CoerceUnsized implementation that Pin provides.

Consider the current implementation (written against the new definition of the trait):


// Copied from core library docs:
// Note: this means that any impl of `CoerceUnsized` that allows coercing from
// a type that impls `Deref<Target=impl !Unpin>` to a type that impls
// `Deref<Target=Unpin>` is unsound. Any such impl would probably be unsound
// for other reasons, though, so we just need to take care not to allow such
// impls to land in std.
impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
    P: CoerceUnsized<U>,
    // `U: core::ops::Deref`, this bound is accidentally missing upstream,
    // hence we can't use `Pin::new_unchecked` in the implementation
{
    fn coerce_unsized(self) -> Pin<U> {
        Pin {
            pointer: self.pointer.coerce_unsized(),
        }
    }
}

Instead, we should rather strive to have the following 2 implementations:

// Permit going from `Pin<impl Unpin>` to `Pin<impl Unpin>`
impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
    P: CoerceUnsized<U>,
    P: Deref<Target: Unpin>,
    U: Deref<Target: Unpin>,
{
    fn coerce_unsized(self) -> Pin<U> {
        Pin::new(self.pointer.coerce_unsized())
    }
}

// Permit going from `Pin<impl !Unpin>` to `Pin<impl !Unpin>`
impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
    P: CoerceUnsized<U>,
    P: core::ops::Deref<Target: !Unpin>,
    U: core::ops::Deref<Target: !Unpin>,
{
    fn coerce_unsized(self) -> Pin<U> {
        // SAFETY: The new `!Unpin` Pin is derived from another `!Unpin` Pin,
        // so the pinning contract is upheld
        unsafe { Pin::new_unchecked(self.pointer.coerce_unsized()) }
    }
}

While this is a breaking change, it should be in line with being a soundness fix. Unfortunately, this kind of impl requires negative bounds and negative reasoning, which is its own can of worms and therefore unlikely to happen, see GH issue: Need negative trait bound #42721. Though maybe allowing them for auto traits alone could work out fine, given those types of traits are already rather special.

Assuming this path would be blessed as the future fix for the issue, this RFC itself will not change the status quo of the unsoundness and therefore would not need to be blocked on negative bounds/reasoning.
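For reference, this is the kind of coercion the existing delegating `Pin` impl enables on stable today. The function name is hypothetical; the sketch only demonstrates the status quo that this RFC would leave unchanged.

```rust
use std::fmt::Display;
use std::pin::Pin;

/// Uses the existing delegating impl: `Pin<Box<u32>>` to `Pin<Box<dyn Display>>`.
pub fn pin_unsize(pinned: Pin<Box<u32>>) -> Pin<Box<dyn Display>> {
    // Implicit coercion; the inner `Box<u32>` is unsized to `Box<dyn Display>`.
    pinned
}
```

Note that `u32` implements `Unpin` while `dyn Display` does not, so this particular coercion would fall between the two impls sketched above.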

Custom Reborrows

See the following issue for context: Some way to simulate &mut reborrows in user code #1403

In the linked issue the idea was to generalize CoerceUnsized as a general Coerce trait.

The design proposed here would still enable this, although it raises some questions.

For one, reborrows should be guaranteed to be no-ops, as all that should change in "reborrow coercions" are the corresponding lifetimes, yet this RFC exposes a function that will be run on coercion.
Generalizing this to a general Coerce trait would require specialization and/or negative trait bound reasoning, such that &'a mut T: CoerceUnsized<&'b T> (for reborrows) but also &'a mut T: CoerceUnsized<&'b U>, T: Unsize<U> for unsizing coercions can both be done as impls.

The first issue is only of concern with the proposed design here, while the second one is a more general issue relevant to impl overlap.
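To illustrate what the linked issue is about, here is a sketch contrasting the implicit reborrow of `&mut T` with a user-defined wrapper that has to reborrow explicitly. `MyMut`, `reborrow` and the helper functions are hypothetical names chosen for this example.

```rust
/// Hypothetical wrapper around a unique reference.
pub struct MyMut<'a, T>(pub &'a mut T);

impl<'a, T> MyMut<'a, T> {
    /// Explicit reborrow; `&mut T` gets the equivalent implicitly.
    pub fn reborrow(&mut self) -> MyMut<'_, T> {
        MyMut(&mut *self.0)
    }
}

fn push_ref(v: &mut Vec<u32>, x: u32) {
    v.push(x);
}

fn push_wrapped(v: MyMut<'_, Vec<u32>>, x: u32) {
    let MyMut(inner) = v;
    inner.push(x);
}

pub fn reborrow_demo() -> Vec<u32> {
    let mut data = Vec::new();
    let r = &mut data;
    push_ref(r, 1); // implicitly reborrowed, `r` stays usable
    push_ref(r, 2);
    let mut w = MyMut(r);
    push_wrapped(w.reborrow(), 3); // must be spelled out for the wrapper
    push_wrapped(w, 4); // moves `w`
    data
}
```

A user-controlled coercion trait could in principle make the `w.reborrow()` call implicit, which is where the no-op concern mentioned above comes in.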

Drawbacks

This proposal allows for some nonsensical CoerceUnsized implementations resulting in odd unsizing coercions (think implicit casts where no actual "unsizing" happens), though the restrictions on the CoerceUnsized trait try to limit them (for the time being).
- This includes implementations that may allocate.
Unsizing coercions are now able to run arbitrary user code, placing them into a similar category to Deref in that regard, effectively adding yet more user facing *magic* to the language.
This proposal relies on the ptr_metadata feature (the Pointee trait to be specific), which is unlikely to stabilize soon and as such would push stabilization of this feature back as well.

Rationale and alternatives

As was discussed in the custom reborrows issue, we could make CoerceUnsized represent a more general user controlled Coerce mechanism.
This proposal is forwards compatible with exposing more dynamic unsizing behavior in the future, where for example the metadata is read from a field of the source type. To support that, a new trait DynamicUnsize could be introduced as the supertrait of Unsize, exposing the needed functions to extract the metadata. Then a blanket impl can be provided that implements DynamicUnsize for anything implementing Unsize with delegating the metadata extraction functions to the Unsize impl. The reason for why such a split would be necessary is that not all coercions can read from the source object (raw pointer unsizing for example), so there needs to be a way to differentiate on the trait bounds for the corresponding CoerceUnsized implementations.
It would be possible to partially stabilize things in this RFC without being blocked on the ptr_metadata features. CoerceUnsized could be stabilized on its own, allowing delegating impls that only depend on existing CoerceUnsized implementations. As such, most custom smart pointers could delegate to *mut T: CoerceUnsized<*mut U> impls which would cover a lot of cases already as Unsize itself isn't really too useful currently due to custom dynamically sized types not being a thing yet.
Introduce a Pointer trait that allows disassembling into and assembling from raw parts. This would allow simplifying most implementations of CoerceUnsized, as the usual impl is doing just that, disassembling, coercing the metadata and then re-assembling.

Prior art

There is another Pre-RFC that tries to improve Unsizing. It does so by just allowing more impls of the current traits, while restricting them by taking visibilities of fields into account which complicates the traits in a (subjectively to the author) confusing way. And while the RFC makes the traits more flexible, it does not offer the same flexibility that this proposal offers.

Prior art with librarification (but relying on compiler impl details to shim <*T>::with_metadata_of): https://crates.io/crates/unsize

Unresolved questions

Given the Pin unsoundness proposal, assuming negative reasoning was a thing, would an impl permitting to go from Pin<impl Unpin> to Pin<impl !Unpin> be sound?
The compiler emitted implementations for the Unsize trait, in particular the Foo<..., T, ...> case, may collide with user implementations.
1. Is this problematic?
2. Should they be overridable?
Will this design prevent any scenarios from being ever supported?
This proposal allows multiple CoerceUnsized impls for a given type (as long as they don't overlap as usual), which might allow for multiple applicable unsizing coercions for certain scenarios?
As usual, naming. Given we might want to introduce multiple unsize traits for certain requirements, should the proposed trait stick to Unsize or something more specific like FromMetadataUnsize?

Future possibilities

Expand the compiler emitted implementations of CoerceUnsized to enums, such as Option<T>: CoerceUnsized<Option<U>> where T: CoerceUnsized<U>.
Add a DynamicUnsize trait as outlined in the rationale to support more unsizing use cases.

chrefr · May 5, 2023, 2:31pm

I wouldn't want Option to coerce unsizing automagically since there is a branch there, but a method like unsize() will be good.

Other than that, this proposal sounds great.

CAD97 · May 6, 2023, 1:43am

TL;DR: the current shape works for impl details, but I'm unconvinced that it's good for stabilization.

Prior art with librarification (but relying on compiler impl details to shim <*T>::with_metadata_of):

(Using Box as an example isn't great since Box is a kinda primitive type. On the other hand we're slowly trying to make Box less necessarily special^[1], so perhaps it's a good example.) Changing from

impl<T: ?Sized, U: ?Sized> CoerceUnsized<*mut U> for *mut T
where
    T: Unsize<U>,
{}

impl<T: ?Sized, U: ?Sized, A> CoerceUnsized<Box<U, A>> for Box<T, A>
where
    T: Unsize<U>,
    A: Allocator,
{}

to

impl<T: ?Sized, U: ?Sized> CoerceUnsized<*mut U> for *mut T
where
    T: Unsize<U>,
{
    fn coerce_unsized(self) -> *mut U {
        /* compiler builtin */
        self as *mut U
    }
}

impl<T: ?Sized, U: ?Sized, A> CoerceUnsized<Box<U, A>> for Box<T, A>
where
    T: Unsize<U>,
    A: Allocator,
{
    fn coerce_unsized(self) -> Box<U, A> {
        let (ptr, alloc) = Box::into_raw_with_allocator(self);
        unsafe { Box::from_raw_in(ptr as *mut U, alloc) }
    }
}

doesn't really do all that much to reduce the magic of Unsize/CoerceUnsized.

Removed magic:

Identification of the relevant field for coercing.
Updating the record with that field coerced^[2].

Remaining magic:

Automatic implementation of Unsize.
Verification that the CoerceUnsized implementation is valid.

In fact, the implementation validity check gets more magic/difficult! Currently, impl validation, ignoring raw pointers, takes roughly the shape of

Self and Target are the same struct and differ in only a single type parameter, T => U; and
The struct has a single field which mentions the varied type parameter; then
Add an obligation that FieldTy<T>: CoerceUnsized<FieldTy<U>> to the impl.

With this proposal, the validation instead becomes

Self and Target are the same struct and differ in only a single type parameter, T => U; then
Add an obligation that T: Unsize<U> OR T: CoerceUnsized<U> to the impl.

Some form of validation probably needs to be done to keep type inference in the face of coercions under control, but this feels like a step in the wrong direction if the goal is simplification.

I do find an approach more like the unsize crate somewhat appealing. It's possible to actually separate the "select pointer for coercion" and "unsize the pointee" steps.

Roughly

either

unsafe trait Pointer: Sized {
    type Pointee: ?Sized;
    type Marker;

    unsafe fn into_ptr_parts(this: Self) -> (*mut Self::Pointee, Self::Marker);
    unsafe fn from_ptr_parts(ptr: *mut Self::Pointee, marker: Self::Marker) -> Self;
}

// example
unsafe impl<T: ?Sized, A> Pointer for Box<T, A> {
    type Pointee = T;
    type Marker = Box<!, A>;

    unsafe fn into_ptr_parts(this: Box<T, A>) -> (*mut T, Box<!, A>) {
        let (ptr, alloc) = Box::into_raw_with_allocator(this);
        let marker = Box::from_raw_in(ptr as *mut !, alloc);
        (ptr, marker)
    }

    unsafe fn from_ptr_parts(ptr: *mut T, parts: Self::Marker) -> Box<T, A> {
        let (_, alloc) = Box::into_raw_with_allocator(parts);
        Box::from_raw_in(ptr, alloc)
    }
}

fn coerce_unsize<TPtr, UPtr>(this: TPtr) -> UPtr
where
    TPtr: Pointer,
    UPtr: Pointer<Marker = TPtr::Marker>,
    TPtr::Pointee: Unsize<UPtr::Pointee>,
{
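    // SAFETY: both pointer types share the same `Marker`, and the raw pointer
    // is only unsized between being taken apart and being put back together.
    unsafe {
        let (ptr, parts) = TPtr::into_ptr_parts(this);
        UPtr::from_ptr_parts(ptr as *mut UPtr::Pointee, parts)
    }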
}

or

unsafe trait PointerAs<U: ?Sized>: Sized {
    type Pointee: ?Sized;
    type Output;

    fn as_raw_ptr(this: &Self) -> *const Self::Pointee;
    unsafe fn with_replacement_ptr(this: Self, ptr: *const U) -> Self::Output;
}

// example
unsafe impl<T: ?Sized, U: ?Sized, A> PointerAs<U> for Box<T, A> {
    type Pointee = T;
    type Output = Box<U, A>;

    fn as_raw_ptr(this: &Box<T, A>) -> *const T {
        ptr::from_ref(&**this)
    }

    unsafe fn with_replacement_ptr(this: Box<T, A>, re: *const U) -> Box<U, A> {
        let (ptr, alloc) = Box::into_raw_with_allocator(this);
        let ptr = ptr
            .with_addr(re.addr())
            .with_metadata_of(re);
        Box::from_raw_in(ptr, alloc)
    }
}

fn coerce_unsized<TPtr, UPte: ?Sized>(this: TPtr) -> TPtr::Output
where
    TPtr: PointerAs<UPte>,
    TPtr::Pointee: Unsize<UPte>,
{
    let ptr = TPtr::as_raw_ptr(&this);
    unsafe {
        TPtr::with_replacement_ptr(this, ptr as *const UPte)
    }
}

The latter (actually used by the unsize crate) is maybe a bit questionable on provenance, since it's relying on <*T>::with_metadata_of to maintain provenance, since passing the smart pointer to the second function may invalidate the unsized raw pointer's provenance (e.g. with &mut T), and maybe even taking the address as well, if coercions can change the pointer address. The former has clearer provenance path, but relies on an extra type to hold the smart pointer's additional state in the in-between^[3].

Go bananas, take bad parts of both and throw GATs at the problem:

trait CoercePtr {
    type Pointee: ?Sized;
    type Output<U: ?Sized>;
    unsafe fn coerce_with<U: ?Sized>(
        this: Self,
        coercion: impl FnOnce(*mut Self::Pointee) -> *mut U,
    ) -> Self::Output<U>;
}

// example
fn coerce_unsized<TPtr, UPte: ?Sized>(this: TPtr) -> TPtr::Output<UPte>
where
    TPtr: CoercePtr,
    TPtr::Pointee: Unsize<UPte>,
{
    unsafe {
        TPtr::coerce_with(this, |ptr| ptr as *mut UPte)
    }
}
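As a rough illustration of how the GAT-based shape could be instantiated, here is a stable sketch for `Box`. The trait is reproduced so the block is self-contained; the `Box` impl, the safety wording and the `box_array_to_slice` helper are assumptions made for this example, and the cast is written at a concrete type because `Unsize` cannot be named in bounds on stable.

```rust
pub trait CoercePtr: Sized {
    type Pointee: ?Sized;
    type Output<U: ?Sized>;

    /// # Safety
    /// `coercion` must return a pointer to the same allocation that is valid
    /// to hand back to the smart pointer type.
    unsafe fn coerce_with<U: ?Sized>(
        this: Self,
        coercion: impl FnOnce(*mut Self::Pointee) -> *mut U,
    ) -> Self::Output<U>;
}

impl<T: ?Sized> CoercePtr for Box<T> {
    type Pointee = T;
    type Output<U: ?Sized> = Box<U>;

    unsafe fn coerce_with<U: ?Sized>(
        this: Self,
        coercion: impl FnOnce(*mut T) -> *mut U,
    ) -> Box<U> {
        let raw = Box::into_raw(this);
        // SAFETY: the caller guarantees that `coercion` only changes the
        // pointer's type and metadata, so ownership can be reassembled.
        unsafe { Box::from_raw(coercion(raw)) }
    }
}

/// Stable stand-in for the generic `coerce_unsized` above.
pub fn box_array_to_slice(array: Box<[u8; 3]>) -> Box<[u8]> {
    // SAFETY: an array-to-slice pointer cast keeps address and layout intact.
    unsafe { CoercePtr::coerce_with(array, |ptr| ptr as *mut [u8]) }
}
```

The smart pointer only decides how to take itself apart and put itself back together, while the closure carries the actual unsizing step.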

By necessity these designs are meant to slot in with a type representing that a coercion can be done, e.g.

pub struct Coercion<T: ?Sized, U: ?Sized>(fn(*mut T) -> *mut U);

impl<T: ?Sized, U: ?Sized> Coercion {
    pub 
}

pub macro Coercion($T:ty as $U:ty) {
    Coercion::<$T, $U>()
}

I agree that transforming the pointee metadata and not even having access to the data pointer is probably^[4] the more principled way to represent custom unsizing, though. However, for the short term, note the below.

Additionally, this relies on feature(ptr_metadata) as written. For one, it doesn't have to: a less involved stabilization could just stabilize CoerceUnsized and leave Unsize fully unstable; stable implementations would just use *mut T: CoerceUnsized<*mut U> or whatever base pointer type(s) they're wrapping. Second, pointee metadata still seems a ways from stabilization. I still feel rather strongly that even if strongly typed metadata doesn't have a coercion, having ptr::Metadata<T>(<T as MetadataKind>::Metadata) (like mem::Discriminant<T>) is probably the better API.

It's also worth noting that if CoerceUnsized stabilizes with the Ptr<T>: CoerceUnsized<Ptr<U>> shape, then I could potentially do something funny like

struct Box3<T, U, V>(Box<T>, Box<U>, Box<V>);
// omitting bounds for clutter
impl<T, U, V, X> CoerceUnsized<Box3<X, U, V>> for Box3<T, U, V> {}
impl<T, U, V, X> CoerceUnsized<Box3<T, X, V>> for Box3<T, U, V> {}
impl<T, U, V, X> CoerceUnsized<Box3<T, U, X>> for Box3<T, U, V> {}

(though not exactly this since impl overlap) and potentially get multiple applicable CoerceUnsized impls on the same type, making the coercion space larger. Because of this I feel a very slight preference towards the form that parameterizes over the unsized type rather than the pointer type.

[1] Obviously many things will remain specially handled for perf reasons, but generally speaking, the more things that could be plain library code, and are only built-in for perf, the better.
[2] To note, this is an extremely simple transformation which the compiler is inherently capable of doing. With type changing FRU, it's literally as simple as
```
struct Box<T: ?Sized, A> { ptr: *mut T, alloc: A }
fn coerce_unsized(self: Box<T, A>) -> Box<U, A> {
    Box { ptr: self.ptr, ..self }
}
```
modulo FRU not being allowed when the type has drop glue.
[3] The into side is also unsafe with the idea that ! can be substituted in to place a type hole there and having a pointer to ! temporarily won't cause too much of an issue, perhaps plus MaybeDangling.
[4] Counterargument: a &Thin<dyn Trait> where Thin<T> is (T::Metadata, Extern<T>) would probably like to coerce/unsize to &dyn Trait by loading the metadata and adjusting the pointer to the payload (&**self), rather than use a shim vtable doing that every time.

quinedot · May 6, 2023, 4:18am

The direction on this front seems to vacillate. (En passant reversal of RFC 130.)

veykril · May 8, 2023, 9:14am

Remaining magic: Automatic implementation of Unsize.

With this, do you refer to the compiler provided implementations? I don't see how we could ever get rid of them personally (especially given the trait object ones can only be provided by the compiler as we can't be generic over traits).

In fact, the implementation validity check gets more magic/difficult!

I'm not sure I see how it gets more difficult? As you pointed out, the only part that got dropped is the single field requirement, but I can't tell how that makes it more magic or difficult.

The coercion part otoh is a very good point that I completely forgot about.

but this feels like a step in the wrong direction if the goal is simplification.

Fwiw the goal here wasn't simplification, it was making things more flexible and backwards compatible for future improvements (while trying to stay as simple as possible with this added flexibility).

Regarding the remainder of your comment I'll have to ponder a bit longer about it (I'm a bit busy this week), thanks for the valuable feedback so far though

veykril · May 15, 2023, 1:23pm

Finally got time to get back to this now, so let me dig into the rest of your comment

I do find an approach more like the unsize crate somewhat appealing. It's possible to actually separate the "select pointer for coercion" and "unsize the pointee" steps.

I do like the Pointer trait idea, though I find the use of marker rather confusing. The use of the never type also doesn't seem to help in terms of the short term stability comment in regards to ptr_metadata given that the never type isn't close to being stabilized either.

Though in general the way the unsize crate models this (by necessity of not relying on unstable features I suppose) seems to be quite complicated, at least to my eyes. Likewise your ideas of modeling this in a similar fashion.

Additionally, this relies on feature(ptr_metadata) as written. For one, it doesn't have to: a less involved stabilization could just stabilize CoerceUnsized and leave Unsize fully unstable; stable implementations would just use *mut T: CoerceUnsized<*mut U> or whatever base pointer type(s) they're wrapping.

Oh this is a great observation! Its dependence on ptr_metadata was something that worried me as well, since it would effectively push any means of stabilizing into the non-near future.

It's also worth noting that if CoerceUnsized stabilizes with the Ptr<T>: CoerceUnsized<Ptr<U>> shape, then I could potentially do something funny like

struct Box3<T, U, V>(Box<T>, Box<U>, Box<V>);
// omitting bounds for clutter
impl<T, U, V, X> CoerceUnsized<Box3<X, U, V>> for Box3<T, U, V> {}
impl<T, U, V, X> CoerceUnsized<Box3<T, X, V>> for Box3<T, U, V> {}
impl<T, U, V, X> CoerceUnsized<Box3<T, U, X>> for Box3<T, U, V> {}

(though not exactly this since impl overlap) and potentially get multiple applicable CoerceUnsized impls on the same type, making the coercion space larger. Because of this I feel a very slight preference towards the form that parameterizes over the unsized type rather than the pointer type.

I don't think I follow what you mean here. Can you elaborate on what you mean by the form that parameterizes over the unsized type? Also, assuming those impls weren't overlapping (given some bounds), how could multiple of them be applicable in a given scenario?

veykril · May 17, 2023, 8:56am

I agree that transforming the pointee metadata and not even having access to the data pointer is probably the more principled way to represent custom unsizing, though. However, for the short term, note the below.

Counterargument: a &Thin<dyn Trait> where Thin<T> is (T::Metadata, Extern<T>) would probably like to coerce/unsize to &dyn Trait by loading the metadata and adjusting the pointer to the payload (&**self), rather than use a shim vtable doing that every time.

Also regarding this, the RFC makes an effort not to rule that out for the future. As outlined in the rationale-and-alternatives section, to support unsizing from the data (as opposed to the metadata), a new trait could be introduced as a supertrait of the Unsize trait. The section also outlines why the split would be necessary.

system · August 15, 2023, 8:57am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
