Should we unsize custom handles?

With RFC 2580 now implemented (on nightly), it is possible to create custom handles:

struct Handle<T: ?Sized>(u32, <T as Pointee>::Metadata);

And using those handles, one can implement Storages (as per storage-poc), which are a generalization of allocators unlocking:

  • Inline storage: storing the elements of a collection within the memory blob of the collection.
  • Special memory storages: such as placing a HashMap<K, V> in shared memory.
  • ...

However, as I noted in this comment on the tracking issue, the current combination of CoerceUnsized and Pointee does not make it possible for Handle itself to be CoerceUnsized.

And I think it should be.

The goal of storage-poc is to demonstrate that it should be possible to have the collections in collections NOT depend on the alloc crate. And for the purpose of writing collections, having a "manual" coerce method is not too problematic -- it's all internal, after all.
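To make the "manual" coerce concrete, here is a minimal sketch that compiles on stable. It is not the storage-poc code: the `Meta` trait is a hypothetical stand-in for nightly's `Pointee`, implemented only for arrays and slices, and the `Handle` fields are named for illustration.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for nightly's `Pointee`: maps a type to its pointer metadata.
pub trait Meta {
    type Metadata: Copy;
}

impl<T, const N: usize> Meta for [T; N] {
    type Metadata = (); // sized types carry no metadata
}

impl<T> Meta for [T] {
    type Metadata = usize; // slices carry their length
}

// A handle: an index into some storage plus the metadata of the pointee.
pub struct Handle<T: ?Sized + Meta> {
    pub index: u32,
    pub metadata: T::Metadata,
    _marker: PhantomData<*const T>,
}

impl<T, const N: usize> Handle<[T; N]> {
    pub fn new(index: u32) -> Self {
        Handle { index, metadata: (), _marker: PhantomData }
    }

    // The "manual" coercion: the array length moves from the type into the metadata.
    pub fn coerce(self) -> Handle<[T]> {
        Handle { index: self.index, metadata: N, _marker: PhantomData }
    }
}
```

Calling `Handle::<[u8; 3]>::new(7).coerce()` yields a `Handle<[u8]>` with the same index and a metadata of 3. The explicit method call is exactly the step a compiler-provided coercion would make implicit.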

For the purpose of porting Box to storages, though, it's a stringent limitation. Look at this testcase:

#[test]
fn slice_storage() {
    let storage = SingleElement::<[u8; 4]>::new();
    let mut boxed: RawBox<[u8], _> = RawBox::new([1u8, 2, 3], storage)
        .unwrap()
        .coerce();

    assert_eq!([1u8, 2, 3], &*boxed);

    boxed[2] = 4;

    assert_eq!([1u8, 2, 4], &*boxed);
}

Let's check the types here:

  • I'll alias SingleElement<[u8; 4]> as Storage for simplicity; it's a type of 4 bytes, aligned on a 1-byte boundary.
  • RawBox<[u8; N], Storage> is a type of 4 bytes, aligned on a 1-byte boundary. No heap allocation whatsoever.
  • RawBox<[u8], Storage> is a type of 16 bytes (on a 64-bit target), aligned on an 8-byte boundary, as it contains both a Storage and a usize (the metadata for the slice).
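The sizes above can be checked with plain structs on stable. This is only a layout sketch under assumptions: `Inline`, `SizedBox` and `SliceBox` are hypothetical stand-ins for `SingleElement<[u8; 4]>`, `RawBox<[u8; N], Storage>` and `RawBox<[u8], Storage>`, not the real types.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for `SingleElement<[u8; 4]>`: four inline bytes.
pub struct Inline(pub [u8; 4]);

// Models `RawBox<[u8; N], Storage>`: the `()` metadata is zero-sized.
pub struct SizedBox {
    pub storage: Inline,
    pub metadata: (),
}

// Models `RawBox<[u8], Storage>`: the slice length is stored next to the bytes.
pub struct SliceBox {
    pub storage: Inline,
    pub len: usize,
}

// Returns (size, alignment) for the sized and the type-erased variant.
pub fn layouts() -> [(usize, usize); 2] {
    [
        (size_of::<SizedBox>(), align_of::<SizedBox>()),
        (size_of::<SliceBox>(), align_of::<SliceBox>()),
    ]
}
```

The first entry is 4 bytes with 1-byte alignment. The second takes the size of two `usize` with `usize` alignment on 32-bit and 64-bit targets, because the 4 bytes of storage get padded up to the alignment of the length field.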

It's quite amazing that the extra .coerce() works, allowing type-erasure, but let's be honest: it'd be sweeter if it were unnecessary, just like with a regular Box.

Thus, I think we should have a mechanism for compiler-provided unsizing of custom handles, and this is my motivating example: making RawBox first-class.

The canny reader will note that RawBox with an inline storage provides in-place type-erasure, how cool is that? Cool enough to be a first-class citizen, I'll argue.


I feel obligated to note that I am NOT proposing a user-defined coercion logic.

A possible implementation, today, would be to have a strongly-typed Metadata for Sized types, and have the compiler automatically provide a CoerceUnsized implementation to the slice/trait metadata as appropriate, as is done with pointer types. Ralf pointed out that this may not make sense; I'll respectfully disagree and reserve the right to revise my opinion.


Maybe the handle could be written:

#[derive(CoerceUnsized)]
pub struct Handle<T> {
    handle: u32,
    #[metadata]
    metadata: T::Metadata,
}

This would invoke the appropriate compiler magic to implement unsized coercion by replacing the annotated field with the new metadata. This way, we don't have to expose a user-implementable function that does coercion.


I fully agree with your sentiment that Metadata should be a strongly typed thing. I had basically the same idea (of a strongly-typed metadata-only type in std) independently before (in the context of experimenting with #![feature(arbitrary_self_types)]; without those you currently have to abuse e.g. null pointers), and only just read your comment in the tracking issue.

Downgrading this later to () or usize is not an option, but it's also not necessary. If you want to conveniently create *const [T]s, keep using ptr::slice_from_raw_parts[_mut].

A strongly typed Metadata<T> would replace DynMetadata<T> entirely (in terms of public API). There's nothing special you can do with DynMetadata anyway. All its current API would be useful on a more general Metadata<T> as well.

Internally, as an implementation detail, structs like DynMetadata, or an internal/hidden Pointee trait could keep existing; but all that needs to be stabilized is a struct Metadata<T: ?Sized>.

Metadata<T> for T: Sized would be constructible via const fn with Metadata::new().

Metadata<[T]> would be convertible from and to usize via associated methods and perhaps also From implementations.


Metadata<T> could implement all the neat things that e.g. *const T also implements, i.e.

  • CoerceUnsized<Metadata<U>> for Metadata<T> where T: Unsize<U> + ?Sized, U: ?Sized
  • DispatchFromDyn<Metadata<U>> for Metadata<T> where T: Unsize<U> + ?Sized, U: ?Sized

In my view, Metadata<T> would make API such as ptr::from_raw_parts only cleaner, which would just look like

pub fn from_raw_parts<T>(
    data_address: *const u8, 
    metadata: Metadata<T>,
) -> *const T 
where
    T: ?Sized, 

(*const u8 makes way more sense than *const ())

No more special traits, just a mostly straightforward struct.

The CoerceUnsized implementation would trivially mean that custom handles can be created that implement CoerceUnsized, too.

The DispatchFromDyn implementation means that you can, with arbitrary_self_types, write object-safe traits where the receiver is just the metadata:

trait Foo {
    fn bar(self: Metadata<Self>) {}
}

fn baz(vtable: Metadata<dyn Foo>) {
    vtable.bar();
}

and, perhaps more importantly, it would also allow custom handles to implement DispatchFromDyn and become receiver types for object-safe traits.


Actually, you can get surprisingly far in modeling the API if you use null pointers to model Metadata on the current nightly:

#![feature(slice_ptr_len)]
#![feature(dispatch_from_dyn)]
#![feature(unsize)]
#![feature(coerce_unsized)]
#![feature(set_ptr_value)]
#![feature(arbitrary_self_types)]
#![feature(layout_for_ptr)]
#![feature(ptr_metadata)]

use std::{
    alloc::Layout,
    hash::Hash,
    marker::Unsize,
    mem,
    ops::{CoerceUnsized, DispatchFromDyn},
    ptr,
};

// should probably also get a fancy `Debug` impl; similar to the one that `DynMetadata` currently has
// but that seems impossible to implement in the context of this mockup
pub struct Metadata<T: ?Sized>(*const T);
impl<T: ?Sized> Clone for Metadata<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T: ?Sized> Copy for Metadata<T> {}
unsafe impl<T: ?Sized> Sync for Metadata<T> {}
unsafe impl<T: ?Sized> Send for Metadata<T> {}
// implementation detail
fn legacy_metadata<T: ?Sized>(meta: Metadata<T>) -> <T as ptr::Pointee>::Metadata {
    meta.0.to_raw_parts().1
}
impl<T: ?Sized> PartialEq for Metadata<T> {
    fn eq(&self, other: &Self) -> bool {
        PartialEq::eq(&legacy_metadata(*self), &legacy_metadata(*other))
    }
    fn ne(&self, other: &Self) -> bool {
        PartialEq::ne(&legacy_metadata(*self), &legacy_metadata(*other))
    }
}
impl<T: ?Sized> PartialOrd for Metadata<T> {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        PartialOrd::partial_cmp(&legacy_metadata(*self), &legacy_metadata(*other))
    }
    fn lt(&self, other: &Self) -> bool {
        PartialOrd::lt(&legacy_metadata(*self), &legacy_metadata(*other))
    }
    fn le(&self, other: &Self) -> bool {
        PartialOrd::le(&legacy_metadata(*self), &legacy_metadata(*other))
    }
    fn gt(&self, other: &Self) -> bool {
        PartialOrd::gt(&legacy_metadata(*self), &legacy_metadata(*other))
    }
    fn ge(&self, other: &Self) -> bool {
        PartialOrd::ge(&legacy_metadata(*self), &legacy_metadata(*other))
    }
}
impl<T: ?Sized> Eq for Metadata<T> {}
impl<T: ?Sized> Ord for Metadata<T> {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        Ord::cmp(&legacy_metadata(*self), &legacy_metadata(*other))
    }
}
impl<T: ?Sized> Hash for Metadata<T> {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        legacy_metadata(*self).hash(state)
    }
}

impl<T> Metadata<T> {
    pub const fn new() -> Self {
        Metadata(ptr::null())
    }
}

impl<T> Metadata<[T]> {
    pub fn len(self) -> usize {
        self.0.len()
    }
}

impl<T> From<usize> for Metadata<[T]> {
    fn from(len: usize) -> Self {
        #[allow(clippy::invalid_null_ptr_usage)] // annoying false positive clippy lint
        Metadata(ptr::slice_from_raw_parts(ptr::null(), len))
    }
}

impl<T> From<Metadata<[T]>> for usize {
    fn from(meta: Metadata<[T]>) -> Self {
        meta.len()
    }
}

impl<T: ?Sized> Metadata<T> {
    // IMO, this should not be unsafe
    pub fn size_of(self) -> usize {
        unsafe { mem::size_of_val_raw(self.0) }
    }
    // IMO, this should not be unsafe
    pub fn align_of(self) -> usize {
        unsafe { mem::align_of_val_raw(self.0) }
    }
    pub fn layout(self) -> Layout {
        unsafe { Layout::from_size_align_unchecked(self.size_of(), self.align_of()) }
    }
}

impl<T, U> DispatchFromDyn<Metadata<U>> for Metadata<T>
where
    T: Unsize<U> + ?Sized,
    U: ?Sized,
{
}

impl<T, U> CoerceUnsized<Metadata<U>> for Metadata<T>
where
    T: Unsize<U> + ?Sized,
    U: ?Sized,
{
}

pub mod my_ptr {
    use super::*;
    pub fn from_raw_parts<T: ?Sized>(data_address: *const u8, metadata: Metadata<T>) -> *const T {
        metadata.0.set_ptr_value(data_address)
    }
    pub fn from_raw_parts_mut<T: ?Sized>(data_address: *mut u8, metadata: Metadata<T>) -> *mut T {
        (metadata.0 as *mut T).set_ptr_value(data_address)
    }
    // morally associated functions though:
    pub fn to_raw_parts<T: ?Sized>(self_: *const T) -> (*const u8, Metadata<T>) {
        (
            self_ as *const u8,
            Metadata(self_.set_ptr_value(ptr::null())),
        )
    }
    pub fn to_raw_parts_mut<T: ?Sized>(self_: *mut T) -> (*mut u8, Metadata<T>) {
        (
            self_ as *mut u8,
            Metadata(self_.set_ptr_value(ptr::null_mut())),
        )
    }
}

///////////////////////////////////////////////////////////////////////////////////////////////////

// temporary implementation detail to make the compiler happy about dispatching from `Metadata<T>`
// this impl should not exist
impl<T: ?Sized> std::ops::Deref for Metadata<T> {
    type Target = *const T;

    fn deref(&self) -> &Self::Target {
        panic!("temporary implementation detail")
    }
}

trait Foo {
    fn bar(self: Metadata<Self>) {}
}

fn baz(vtable: Metadata<dyn Foo>) {
    vtable.bar();
}


(Of course, implemented this way, Metadata<T> is one size_of::<usize>() too large.)


To get a better feeling of the API, I tried it for a use case I had a while back.

In the depths of rustc_query_system, there were several functions that took a generic closure and passed it along. Since there were about 300 call sites, all those functions were monomorphized 300 times. To reduce bootstrap time, I tried to make them take a dyn FnOnce instead.

That didn't work out though, because allocating a Box<dyn FnOnce> had too much overhead on these hot functions, and a workaround with &mut dyn FnMut and Option::take() didn't work because of some complex control flow.
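For readers who haven't seen the &mut dyn FnMut plus Option::take() trick, a minimal sketch follows. The names `run_erased` and `run_once` and the `&str -> usize` signature are made up for illustration; the real rustc_query_system functions look different.

```rust
// Type-erased callee: compiled once, no matter how many closure types reach it.
fn run_erased(f: &mut dyn FnMut(&str) -> usize, input: &str) -> usize {
    f(input)
}

// Thin generic shim: parks the `FnOnce` in an `Option` and hands out an `FnMut`.
pub fn run_once<F: FnOnce(&str) -> usize>(f: F, input: &str) -> usize {
    let mut slot = Some(f);
    let mut shim = |s: &str| -> usize {
        // `take()` moves the closure out; a second call would panic here.
        let f = slot.take().expect("closure already called");
        f(s)
    };
    run_erased(&mut shim, input)
}
```

Only the small `run_once` shim is monomorphized per call site. The catch is that the "called at most once" guarantee becomes a runtime check instead of a type-level one, which is the kind of thing that gets awkward once the control flow around the call is complicated.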


The RawBox with an inline storage would have made it (almost) work though:

fn foo() {
    let some_object = vec![42];
    let closure = |context: &Context| context.process(some_object);
    
    let boxed = RawBox::new(closure, SingleElement::new())
        .map_err(|_|())
        .unwrap()
        .coerce();
    
    takes_fn_once(boxed);
}

// This is called on a hot path from many places.
fn takes_fn_once(fn_once: RawBox<dyn FnOnce(&Context), SingleElement<[usize; 4]>>) {
    // Imagine lots of entangled code here.
    let context = Context::get();
    
    fn_once(&context);
    //^ error: cannot move out of dereference of `RawBox<dyn for<'r> FnOnce(&'r Context), SingleElement<[usize; 4]>>`
}

The one thing that doesn't work is actually calling the FnOnce. The real Box<dyn FnOnce> has a special case to handle this, so this is just a limitation with the proof of concept. Also, there is some Result type juggling needed with .map_err(|_|()).unwrap() in the PoC.
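To show that calling an inline, type-erased FnOnce is possible in principle, here is a hand-rolled sketch on stable. It is not how RawBox or Box is implemented: `InlineFnOnce`, `call_as` and `drop_as` are hypothetical names, the signature is fixed to `FnOnce(&str) -> usize` as a stand-in for `FnOnce(&Context)`, and the buffer mirrors the `[usize; 4]` storage used above.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of, ManuallyDrop, MaybeUninit};

// Hypothetical inline, type-erased `FnOnce(&str) -> usize` with no heap allocation.
pub struct InlineFnOnce {
    buf: MaybeUninit<[usize; 4]>,
    call_fn: unsafe fn(*mut u8, &str) -> usize,
    drop_fn: unsafe fn(*mut u8),
    // The erased closure may not be Send/Sync, so opt out of both auto traits.
    _not_send: PhantomData<*mut ()>,
}

// Moves the closure out of the buffer and calls it.
unsafe fn call_as<F: FnOnce(&str) -> usize>(slot: *mut u8, arg: &str) -> usize {
    let f = unsafe { slot.cast::<F>().read() };
    f(arg)
}

// Drops a closure that was never called.
unsafe fn drop_as<F>(slot: *mut u8) {
    unsafe { std::ptr::drop_in_place(slot.cast::<F>()) }
}

impl InlineFnOnce {
    // Hands the closure back if it does not fit the inline buffer.
    pub fn new<F: FnOnce(&str) -> usize + 'static>(f: F) -> Result<Self, F> {
        if size_of::<F>() > size_of::<[usize; 4]>() || align_of::<F>() > align_of::<usize>() {
            return Err(f);
        }
        let mut buf = MaybeUninit::<[usize; 4]>::uninit();
        // SAFETY: size and alignment of `F` were checked against the buffer above.
        unsafe { buf.as_mut_ptr().cast::<F>().write(f) };
        Ok(Self {
            buf,
            call_fn: call_as::<F>,
            drop_fn: drop_as::<F>,
            _not_send: PhantomData,
        })
    }

    // Consumes `self`, so the closure cannot be called twice.
    pub fn call(self, arg: &str) -> usize {
        let mut this = ManuallyDrop::new(self); // skip `Drop`: `call_as` takes ownership
        let call_fn = this.call_fn;
        // SAFETY: the buffer holds a live `F` matching `call_fn`.
        unsafe { call_fn(this.buf.as_mut_ptr().cast(), arg) }
    }
}

impl Drop for InlineFnOnce {
    fn drop(&mut self) {
        let drop_fn = self.drop_fn;
        // SAFETY: only reached if `call` was never invoked, so `F` is still live.
        unsafe { drop_fn(self.buf.as_mut_ptr().cast()) }
    }
}
```

A closure capturing a `Vec` (three words) fits, so `InlineFnOnce::new(move |s: &str| s.len() + v.len())` succeeds, while one capturing a `[u64; 16]` is handed back as `Err`. The two function pointers play the role of a one-off vtable, which is the part the compiler would generate if the unsizing coercion and the by-value call were built in.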


Now to the original question about .coerce():

From a user's perspective, it definitely would be nice if the extra .coerce() call wouldn't be needed. Since it's not needed with a normal Box, it's inconsistent and an additional thing to learn and remember. I'd even argue that it makes the difference between "Make dyn FnOnce easy to use" and "Needs lots of boilerplate and type magic".

When using custom allocators/storages, the type bounds get big pretty quickly. That's why it took me an hour to come up with the example above. The error you get without the .coerce() says

error[E0308]: mismatched types
  --> examples/dyn_fn_once.rs:52:19
   |
45 |     let closure = |context: &Context| context.process(some_object);
   |                   ------------------------------------------------ the found closure
...
52 |     takes_fn_once(boxed);
   |                   ^^^^^ expected trait object `dyn FnOnce`, found closure
   |
   = note: expected struct `RawBox<(dyn for<'r> FnOnce(&'r Context) + 'static), storage_poc::inline::SingleElement<[usize; 4]>>`
              found struct `RawBox<[closure@examples/dyn_fn_once.rs:45:19: 45:67], storage_poc::inline::SingleElement<_>>`

(and without the .map_err(|_|()) it's even worse.) The error isn't really actionable. Of course, some of that can be mitigated with better error messages, but long types have always been a pain point.

It would be best if it would just work, without a .coerce() call.


Here are some API ideas:

// Current PoC:
RawBox::new(closure, SingleElement::new()).map_err(|_|()).unwrap().coerce();
// Cleaned up:
Box::new_in(closure, SingleElement::new()).coerce();
// New idea:
Box::new_in(closure, [0; 4]);
// Compare that to heap allocation:
Box::new(closure);

// Type signatures for the above:
RawBox<dyn FnOnce(), SingleElement<[usize; 4]>>
Box<dyn FnOnce(), SingleElement<[usize; 4]>>
Box<dyn FnOnce(), [usize; 4]>
Box<dyn FnOnce()>

It would be awesome if the code can be made as concise as the third one. I feel like every bit of verbosity really makes a difference in how difficult it is to use, because it leans so much on type inference to be practical.

It's "just" a question of how to actually implement it :)