This is an attempt at changing the current CoerceUnsize
and Unsize
in a way that makes them more flexible and also allowing for future additions to their API surface without incurring breakage to support even more dynamic unsizing in an attempt to get a minimal design that could be stabilized in the nearer future. The pre-rfc phase here is meant to collect general feedback about possible obvious flaws and drawbacks that may have been missed in the writing and designing of this.
RFC
Summary
Move the unsizing logic out of the compiler into library source, allowing for a more flexible design of the features and allowing user code to implement Unsize
for their own types.
Motivation
Currently unsizing in Rust is very rigid, only allowed in very specific scenarios permitted by the rules surrounding the current Unsize
and CoerceUnsized
traits and its automatic implementations by the compiler.
This has the downside of being very magical, as the majority of the logic happens inside the compiler opposed to the source code. It also prevents certain unsizing implementations from being doable today.
This RFC attempts to make these rules more flexible by also allowing user implementations of the traits that define how the metadata is derived and composed back into unsized objects.
Guide-level explanation
Unsize
Unsizing relationships between two types can be defined by implementing the unsafe Unsize
trait for a type and its target unsized type.
These implementations describe how the metadata is derived from a source type and its metadata for its target unsized type.
An example implementation of Unsize
for [T; N]
to [T]
unsizing looks like the following:
// SAFETY:
// - `Unsize::target_metadata` returns length metadata that spans the entire array exactly.
// - `[T; N]` is a contiguous slice of `T`'s, so a pointer pointing to its data is valid
// to be interpreted as a pointer to a slice `[T]`.
unsafe impl<T, const N: usize> Unsize<[T]> for [T; N] {
fn target_metadata((): <Self as Pointee>::Metadata) -> <[T] as Pointee>::Metadata {
N
}
}
The metadata for the source type [T; N]
is the unit type ()
, as there is no metadata for sized types.
The implementation then just returns the length N
from the array type, as this is the appropriate metadata for a slice produced from such an array.
An example that does an unsized to unsized coercion is the following implementation (for trait upcasting provided by the compiler):
trait Super {}
trait Sub: Super {}
// SAFETY:
// - `Unsize::target_metadata` returns a vtable provided by the vtable of the `dyn Sub` object.
// - `dyn Super` is a super trait of `dyn Sub`, so a pointer pointing to data for a `dyn Sub`
// is valid to be used as a data pointer to a `dyn Super`
unsafe impl Unsize<dyn Super> for dyn Sub {
unsafe fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <dyn super as Pointee>::Metadata {
metadata.upcast()
}
}
This is an unsizing impl required for trait upcasting, where the metadata (the vtable of the trait) of the dyn Super
type has to be extracted from the the metadata of the dyn Sub
trait.
CoerceUnsized
To actually enable the unsizing coercion of objects, the CoerceUnsized
trait has to be implemented.
It defines how the unsizing of the inner type occurs for a given pointer or wrapper type.
A CoerceUnsized
implementation has specific requirements to be valid which boil down to 2 kinds:
- A non-delegating
CoerceUnsized
impl - A delegating
CoerceUnsized
impl
1. A non-delegating CoerceUnsized
impl
Such an impl is used for actual pointer like types, such as &'a T
or Arc<T>
.
The implementing type and the CoerceUnsized
target type must differ in a single generic parameter only. Say, the parameters are T
and U
. Then,
T
is the generic parameter of the implementing type; is bound asT: Unsize<U>
U
is the generic parameter of theCoerceUnsized
target type
Example impl for the & 'a T
type
impl<'a, 'b, T, U> CoerceUnsized<&'a U> for &'b T
where
'b: 'a,
T: Unsize<U> + ?Sized,
U: ?Sized
{
fn coerce_unsized(self) -> &'a U {
let metadata = Unsize::target_metadata(core::ptr::metadata(self));
let untyped_data_ptr = (self as *const T).cast::<()>();
// SAFETY: [`Unsize`] demands that the return value of
// `Unsize::target_metadata` is valid to be used together
// with the data pointer to be re-interpreted as the unsized type
unsafe { &*core::ptr::from_raw_parts(untyped_data_ptr, metadata) }
}
}
Example impl for the Arc<T>
type
impl<T, U> CoerceUnsized<Arc<U>> for Arc<T>
where
T: ?Sized + Unsize<U>,
U: ?Sized
{
fn coerce_unsized(self) -> Arc<U> {
let ptr = Arc::into_raw(self);
let metadata = Unsize::target_metadata(core::ptr::metadata(ptr));
let untyped_data_ptr = (ptr as *const T).cast::<()>();
// SAFETY: [`Unsize`] demands that the return value of
// `Unsize::target_metadata` is valid to be used together
// with the data pointer to be re-interpreted as the unsized type
// and that `std::mem::size_of` on `U` will report the same size as the `T`.
unsafe { Arc::from_raw(core::ptr::from_raw_parts(untyped_data_ptr, metadata)) }
}
}
Important to note is that Unsize
impls are required to return metadata that make the unsized object report the same size as the source type. If that was not the case, the Arc
impl above would be unsound, as its destructor would try to deallocate a smaller allocation than it initially owned.
2. A delegating CoerceUnsized
impl
Such an impl is used for wrapper like types, such as Cell<T>
or Pin<T>
where the impl is required to list a CoerceUnsized
bound on the generic parameters of the wrapping type.
Example impl for the Cell<T>
type
impl<T, U> CoerceUnsized<Cell<U>> for Cell<T>
where
T: CoerceUnsized<U>
{
fn coerce_unsized(self) -> Cell<U> {
Cell::new(self.into_inner().coerce_unsized())
}
}
Example implementation for Option<T>
A delegating impl is not limited to struct
types.
impl<T, U> CoerceUnsized<Option<U>> for Option<T>
where
T: CoerceUnsized<U>,
{
fn coerce_unsized(self) -> Option<U> {
match self {
Option::Some(t) => Option::Some(t.coerce_unsized()),
Option::None => Option::None,
}
}
}
With such an impl, Option<&[T; N]>
could coerce to Option<&[T]>
.
Reference-level explanation
Unsize
The new Unsize
trait definition looks like the following:
/// # Safety
///
/// The implementation of [`Unsize::target_metadata`] must return metadata that
/// - is valid for interpreting the `Self` type to `Target`, and
/// - where using `core::mem::size_of` on the unsized object will report the
/// same size as on the source object.
pub unsafe trait Unsize<Target>
where
Target: ?Sized,
{
fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <Target as Pointee>::Metadata;
}
This trait allows specifying how to derive the metadata required for unsizing from the metadata of the source type Self
or the compile time type information.
CoerceUnsized
The new CoerceUnsized
trait definition looks like the following:
pub trait CoerceUnsized<Target> {
fn coerce_unsized(self) -> Target;
}
Implementations of this trait now specify how the coercion is done.
This also drops the ?Sized
bound on Target
, as returning unsized values is not possible currently.
This can be relaxed without breakage in the future.
In order to prevent misuse of the trait as means of implicit conversions, implementations for this trait require specific conditions to hold which the compiler will enforce. This can be relaxed without breakage in the future.
For an implementation to be valid, one of the following must hold:
Self
andTarget
- must be references or raw pointers to different generic parameters
- type parameter
T
ofSelf
hasT: Unsize<U>
bound whereU
is the type parameter ofTarget
Self
andTarget
- must have the same type constructor, varying in a single type parameter
- type parameter
T
ofSelf
must have aT: CoerceUnsized<U>
bound whereU
is the type parameter ofTarget
- Example:
impl<T: CoerceUnsized<U>, U> CoerceUnsized<Cell<U>> for Cell<T>
Self
andTarget
- must have the same type constructor, varying in a single type parameter
- type parameter
T
ofSelf
must have aT: Unsize<U>
bound whereU
is the differing type parameter ofTarget
- Example:
impl<T: ?Sized + Unsize<U>, U: ?Sized, A: Allocator> CoerceUnsized<Box<U, A>> for Box<T, A>
Implementations provided by the standard library
Unsize
Today, all Unsize
implementations are provided by the compiler.
Most of them will continue to be provided by the compiler as they involve trait objects which depend on all traits defined.
The only one that will no longer be emitted by the compiler is the [T; N]: Unsize<[T]>
implementation as we can now fully implement it in library source.
The implementation will be as follows (and live in core
):
// SAFETY:
// - `Unsize::target_metadata` returns length metadata that spans the entire array exactly.
// - `[T; N]` is a contiguous slice of `T`'s, so a pointer pointing to its data is valid to
// be interpreted as a pointer to a slice `[T]`.
unsafe impl<T, const N: usize> Unsize<[T]> for [T; N] {
fn target_metadata((): <Self as Pointee>::Metadata) -> <[T] as Pointee>::Metadata {
N
}
}
CoerceUnsized
The non-delegating implementations of CoerceUnsized
provided by the standard library will have the implementation of their fn coerce_unsized
function written to disassemble the source into pointer and source metadata, make use of the Unsize
trait for extracting the target metadata from the source metadata, and then reassembling the pointer and target metadata into the target.
For the delegating implementations, the implementation of the fn coerce_unsized
function will merely delegate to the inner value and then wrap that result again.
Implementations provided by the compiler
Note: This section uses fictional rust syntax
For types to trait object for their implemented types, the compiler will generate Unsize
implmentations:
unsafe impl<trait Trait, T: Trait> Unsize<dyn Trait> for T {
fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <dyn Trait as Pointee>::Metadata {
// magic
}
}
For the unstable trait upcasting feature, the compiler will generate the following Unsize
implementations:
unsafe impl<trait Trait, trait Super> Unsize<dyn Super> for dyn Trait
where
dyn Trait: Super
{
unsafe fn target_metadata(metadata: <Self as Pointee>::Metadata) -> <dyn super as Pointee>::Metadata {
// compiler magic
}
}
This is safe to do, as the metadata of source can be safely extracted from a raw pointer without touching the data for a CoerceUnsized
implementation on *const T
/*mut T
.
To keep backwards compatibility (as these are already observable in today's stable rust), the compiler also generates Unsize<Foo<..., U, ...>>
implementations for structs Foo<..., T, ...>
if all of these conditions are met:
T: Unsize<U>
.- Only the last field of
Foo
has a type involvingT
. Bar<T>: Unsize<Bar<U>>
, whereBar<T>
stands for the actual type of that last field.Foo<U>: Pointee<Metadata = <Bar<U> as Pointee>::Metadata>
Unsize un-lowering for known impls
The compiler may "un-lower" some known unsize coercions back into builtin operations in the MIR as to not degrade performance too much, as lowering this new definition will introduce a lot of new operations that don't exist in the current unsizing logic. This would be similar to how builtin operators for primitives work currently, where they are typechecked with the trait impls but then lowered back to builtin operators in the mir.
TypeMetadata<T>
and Unsizing
See the following PR for context: Implement pointee metadata unsizing via a TypedMetadata<T>
container #97052.
With this new definition, we can implement CoerceUnsized
for TypeMetadata
without having to special case it in the compiler as follows:
struct TypedMetadata<T: ?Sized>(pub <T as core::ptr::Pointee>::Metadata);
impl<T, U> CoerceUnsized<TypedMetadata<U>> for TypedMetadata<T>
where
T: ?Sized + Unsize<U>,
U: ?Sized,
{
fn coerce_unsized(self) -> TypedMetadata<U> {
TypedMetadata(Unsize::target_metadata(self.0))
}
}
Pin Unsoundness
See the following issue for context: Pin
is unsound due to transitive effects of CoerceUnsized
#68015
The design of the new traits here do not address the underlying issue in regards to Pin
.
The author of this RFC feels like addressing the Pin
soundness in the definitions of the traits is wrong, as in almost all cases where are a user implements one of these traits Pin
will be irrelevant to them.
And at its core, unsizing is not coupled with Pin
whatsoever.
It would make much more sense to fix the unsound CoerceUnsized
implementation that Pin
provides.
That is given the current implementation of (with the new definition of the trait):
// Copied from core library docs:
// Note: this means that any impl of `CoerceUnsized` that allows coercing from
// a type that impls `Deref<Target=impl !Unpin>` to a type that impls
// `Deref<Target=Unpin>` is unsound. Any such impl would probably be unsound
// for other reasons, though, so we just need to take care not to allow such
// impls to land in std.
impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
P: CoerceUnsized<U>,
// `U: core::ops::Deref`, this bound is accidentally missing upstream,
// hence we can't use `Pin::new_unchecked` in the implementation
{
fn coerce_unsized(self) -> Pin<U> {
Pin {
pointer: self.pointer.coerce_unsized(),
}
}
}
Instead, we should rather strive to have the following 2 implementations:
// Permit going from `Pin<impl Unpin>` to` Pin<impl Unpin>`
impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
P: CoerceUnsized<U>,
P: Deref<Target: Unpin>,
U: Deref<Target: Unpin>,
{
fn coerce_unsized(self) -> Pin<U> {
Pin::new(self.pointer.coerce_unsized())
}
}
// Permit going from `Pin<impl Pin>` to `Pin<impl Pin>`
impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
P: CoerceUnsized<U>,
P: core::ops::Deref<Target: !Unpin>,
U: core::ops::Deref<Target: !Unpin>,
{
fn coerce_unsized(self) -> Pin<U> {
// SAFETY: The new unpin Pin is derived from another unpin Pin,
// so we the pinned contract is kept up
unsafe { Pin::new_unchecked(self.pointer.coerce_unsized()) }
}
}
While this is a breaking change, it should be in line with being a soundness fix. Unfortunately, these kind of impl requires negative bounds and negative reasoning which is its own can of worms and therefore likely not to happen, see GH issue: Need negative trait bound #42721. Though maybe allowing them for auto traits alone could work out fine, given those types of traits are already rather special.
Assuming this path would be blessed as the future fix for the issue, this RFC itself will not change the status quo of the unsoundness and therefore would not need to be blocked on negative bounds/reasoning.
Custom Reborrows
See the following issue for context: Some way to simulate &mut
reborrows in user code #1403
In the linked issue the idea was to generalize CoerceUnsized
as a general Coerce
trait.
The design proposed here would still enable this, although it raises some questions.
- For one, reborrows should be guaranteed to be no-ops, as all that should change in "reborrow coercions" are the corresponding lifetimes, yet this RFC exposes a function that will be run on coercion.
- Generalizing this to a general
Coerce
trait would require specialization and/or negative trait bound reasoning, such that&'a mut T: CoerceUnsized<&'b T>
(for reborrows) but also&'a mut T: CoerceUnsized<&'b U>, T: Unsize<U>
for unsizing coercions can both be done as impls.
The first issue is only of concern with the proposed design here, while the second one is a more general issue relevant to impl overlap.
Drawbacks
- This proposal allows for some non-sensical
CoerceUnsized
implementations resulting in odd unsizing coercions (think implicit casts where no actual "unsizing" happens), though the restrictions on theCoerceUnsized
trait try to limit them (for the time being).- This includes implementations that may allocate
- Unsizing coercions are now able to run arbitrary user code, placing it into a similar category to
Deref
in that regard, effectively adding yet more user facing *magic* to the language. - This proposal relies on the
ptr_metadata
feature (thePointee
trait to be specific), which is unlikely to stabilize soon and as such would push stabilization of this feature back as well
Rationale and alternatives
- As was discussed in the custom reborrows issue, we could make
CoerceUnsized
represent a more general user controlledCoerce
mechanism. - This proposal is forwards compatible with exposing more dynamic unsizing behavior in the future, where for example the metadata is read from a field of the source type. To support that, a new trait
DynamicUnsize
could be introduced as the supertrait ofUnsize
, exposing the needed functions to extract the metadata. Then a blanket impl can be provided that implementsDynamicUnsize
for anything implementingUnsize
with delegating the metadata extraction functions to theUnsize
impl. The reason for why such a split would be necessary is that not all coercions can read from the source object (raw pointer unsizing for example), so there needs to be a way to differentiate on the trait bounds for the correspondingCoerceUnsized
implementations. - It would be possible to partially stabilize things in this RFC without being blocked on the
ptr_metadata
features.CoerceUnsized
could be stabilized on its own, allowing delegating impls that only depend on existingCoerceUnsized
implementations. As such, most custom smart pointers could delegate to*mut T: CoerceUnsized<*mut U>
impls which would cover a lot of cases already asUnsize
itself isn't really too useful currently due to custom dynamically sized types not being a thing yet. - Introduce a
Pointer
trait that allows disassembling into and assembling from raw parts. This would allow simplifying most implementations ofCoerceUnsized
, as the usual impl is doing just that, disassembling, coercing the metadata and then re-assembling.
Prior art
There is another Pre-RFC that tries to improve Unsizing. It does so by just allowing more impls of the current traits, while restricting them by taking visibilities of fields into account which complicates the traits in a (subjectively to the author) confusing way. And while the RFC makes the traits more flexible, it does not offer the same flexibility that this proposal offers.
Prior art with librarification (but relying on compiler impl details to shim <*T>::with_metadata_of
): https://crates.io/crates/unsize
Unresolved questions
- Given the
Pin
unsoundness proposal, assuming negative reason was a thing, would an impl permitting to go fromPin<impl Unpin>
toPin<impl !Unpin>
be sound? - The compiler emitted implementations for the unsize trait, in particular the
Foo<..., T, ...>
case may collide with user implementations.- Is this problematic?
- Should they be overridable?
- Will this design prevent any scenarios from being ever supported?
- This proposal allows multiple
CoerceUnsized
impls for a given type (as long as they don't overlap as usual), which might allow for multiple applicable unsizing coercions for certain scenarios? - As usual, naming. Given we might want to introduce multiple unsize traits for certain requirements, should the proposed trait stick to
Unsize
or something more specific likeFromMetadataUnsize
?
Future possibilities
- Expand the compiler emitted implementations of
CoerceUnsized
to enums, such asOption<T>: CoerceUnsized<Option<U>>
whereT: CoerceUnsized<U>
. - Add a
DynamicUnsize
trait as outlined in the rationale to support more unsizing use cases.