Storing the size in the box?

We use Rc<Box<dyn T>> and Box<Box<dyn T>> a lot, because the outer box has a size suitable for FFI but the inner one does not. It would be nice if either Rust had a highly-optimized allocator for (ptr,size) pairs (to reduce the cost) or if one could somehow shove the size in the box itself (perhaps with a wrapper type? maybe as a lang item?).
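For anyone following along, here is a small sketch of why the double box shows up at FFI boundaries. The trait `Shape`, the type `Square`, and both helper functions are made-up names for illustration; the sizes are expressed in machine words rather than hard-coded bytes.

```rust
use std::ffi::c_void;
use std::mem::size_of;

// Hypothetical trait and implementor, only here to have a `dyn` type to box.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Returns (size of `Box<dyn Shape>`, size of `Box<Box<dyn Shape>>`) in bytes.
fn pointer_sizes() -> (usize, usize) {
    (
        size_of::<Box<dyn Shape>>(),      // data pointer + vtable pointer
        size_of::<Box<Box<dyn Shape>>>(), // a single pointer to the fat pointer
    )
}

/// Round-trips the outer box through a `void*`-sized pointer, as FFI would.
fn double_boxed_area() -> f64 {
    let thin: Box<Box<dyn Shape>> = Box::new(Box::new(Square(3.0)));
    let erased: *mut c_void = Box::into_raw(thin) as *mut c_void;
    // SAFETY: `erased` came from `Box::into_raw` on exactly this type.
    let back: Box<Box<dyn Shape>> =
        unsafe { Box::from_raw(erased as *mut Box<dyn Shape>) };
    back.area()
}
```

The cost being complained about is the second allocation that exists only to hold the (ptr, vtable) pair.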

With custom allocator generics, you can provide this while still using the standard types. You can already do this with custom owning types.

For the specific case of array DSTs, this is already possible using custom types and ptr::slice_from_raw_parts.
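A minimal sketch of what that can look like on stable, assuming a custom owning type specialized to `[u32]` for brevity. `ThinSlice` and its methods are hypothetical names, not an existing crate API: the length lives in a header at the front of the allocation, and the fat slice pointer is rebuilt on demand with ptr::slice_from_raw_parts.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::{self, NonNull};

/// Hypothetical thin owning pointer to a `[u32]`.
/// Allocation layout: a `usize` length header, then the elements.
struct ThinSlice {
    head: NonNull<usize>,
}

impl ThinSlice {
    /// Layout of the whole allocation and the byte offset of the elements.
    fn layout(len: usize) -> (Layout, usize) {
        let (layout, offset) = Layout::new::<usize>()
            .extend(Layout::array::<u32>(len).expect("length overflow"))
            .expect("layout overflow");
        (layout.pad_to_align(), offset)
    }

    fn new(items: &[u32]) -> Self {
        let (layout, offset) = Self::layout(items.len());
        // SAFETY: the layout is never zero-sized because of the header.
        unsafe {
            let raw = alloc(layout);
            if raw.is_null() {
                handle_alloc_error(layout);
            }
            (raw as *mut usize).write(items.len());
            let data = raw.add(offset) as *mut u32;
            ptr::copy_nonoverlapping(items.as_ptr(), data, items.len());
            ThinSlice { head: NonNull::new_unchecked(raw as *mut usize) }
        }
    }

    fn as_slice(&self) -> &[u32] {
        // SAFETY: the header and elements were initialized in `new`.
        unsafe {
            let len = self.head.as_ptr().read();
            let (_, offset) = Self::layout(len);
            let data = (self.head.as_ptr() as *const u8).add(offset) as *const u32;
            // Re-attach the length to get an ordinary fat slice pointer.
            &*ptr::slice_from_raw_parts(data, len)
        }
    }
}

impl Drop for ThinSlice {
    fn drop(&mut self) {
        // SAFETY: same layout as the one used to allocate in `new`.
        unsafe {
            let len = self.head.as_ptr().read();
            let (layout, _) = Self::layout(len);
            dealloc(self.head.as_ptr() as *mut u8, layout);
        }
    }
}
```

`ThinSlice` itself is one word wide; the price is an extra read of the header every time the slice is materialized.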

Once ptr::metadata stabilizes, it will be possible to do the same for arbitrary DST types (and I intend to add library support for such into erasable).
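Until then, the general idea can be roughly approximated on stable for one fixed trait. To be clear, this is a swapped-in technique and not what erasable does: instead of storing the pointer metadata inline, the sketch stores function pointers that know how to rebuild the fat pointer and how to free the allocation. `Header`, `Inline`, and `ThinBox` are hypothetical names, and the trait is fixed to `Debug` just to keep it short.

```rust
use std::fmt::Debug;
use std::ptr::{self, NonNull};

/// Per-type functions stored at the front of the allocation. They stand in
/// for the inline pointer metadata of the nightly design.
#[repr(C)]
struct Header {
    as_debug: unsafe fn(*mut ()) -> *mut dyn Debug,
    drop_box: unsafe fn(*mut ()),
}

/// `repr(C)` guarantees that `header` sits at offset 0.
#[repr(C)]
struct Inline<T> {
    header: Header,
    value: T,
}

/// Rebuilds a fat `dyn Debug` pointer from the erased allocation pointer.
unsafe fn as_debug<T: Debug + 'static>(p: *mut ()) -> *mut dyn Debug {
    let typed = p as *mut Inline<T>;
    unsafe { ptr::addr_of_mut!((*typed).value) as *mut dyn Debug }
}

/// Frees the allocation with its real type restored.
unsafe fn drop_box<T>(p: *mut ()) {
    drop(unsafe { Box::from_raw(p as *mut Inline<T>) });
}

/// One word wide: the only field is a thin pointer.
struct ThinBox {
    ptr: NonNull<()>,
}

impl ThinBox {
    fn new<T: Debug + 'static>(value: T) -> Self {
        let header = Header { as_debug: as_debug::<T>, drop_box: drop_box::<T> };
        let raw = Box::into_raw(Box::new(Inline { header, value }));
        ThinBox { ptr: NonNull::new(raw as *mut ()).expect("Box is never null") }
    }

    fn get(&self) -> &dyn Debug {
        // SAFETY: `ptr` points at a live `Inline<T>` whose first field is `Header`.
        unsafe {
            let header = &*(self.ptr.as_ptr() as *const Header);
            &*(header.as_debug)(self.ptr.as_ptr())
        }
    }
}

impl Drop for ThinBox {
    fn drop(&mut self) {
        // SAFETY: `drop_box` was instantiated for the `T` stored in `new`.
        unsafe {
            let drop_box = (*(self.ptr.as_ptr() as *const Header)).drop_box;
            drop_box(self.ptr.as_ptr());
        }
    }
}
```

With ptr::metadata the two function pointers would collapse into the trait object's own metadata, and the same wrapper could be generic over the `dyn` type instead of being hard-wired to one trait.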

I still need to do a bit of thinking as to how to acquire a Thin<P<InlinePtrMetadata<dyn Tr>>> (name obviously subject to bikeshedding), but the scheme otherwise works in library code without new language functionality (beyond #![feature(ptr_metadata)]).

It appears I may have backed myself into a corner, actually:

#![feature(ptr_metadata)]

use {
    erasable::Erasable,
    std::{marker::PhantomData, ptr},
};

#[repr(C)]
pub struct Indyn<Dyn: ?Sized, T: ?Sized = Dyn> {
    phantom: PhantomData<Dyn>,
    metadata: <Dyn as ptr::Pointee>::Metadata,
    inner: T,
}

unsafe impl<Dyn: ?Sized> Erasable for Indyn<Dyn> {
    unsafe fn unerase(this: erasable::ErasedPtr) -> ptr::NonNull<Self> {
        let metadata = ptr::read::<<Dyn as ptr::Pointee>::Metadata>(this.as_ptr() as *mut _);
        let this: *mut Dyn = ptr::from_raw_parts_mut(this.as_ptr() as *mut _, metadata);
        ptr::NonNull::new_unchecked(this as *mut Indyn<Dyn>)
    }

    const ACK_1_1_0: bool = true;
}

This works, and miri is happy to accept it.

Example
use std::any::Any;

macro_rules! indyn {
    ($t:expr; as $d:ty) => {{
        let t = $t;
        let p: &$d = &t;
        Indyn {
            phantom: PhantomData,
            metadata: ptr::metadata(p),
            inner: t,
        }
    }};
}

fn main() {
    let b: Box<Indyn<dyn Any>> = Box::new(indyn!(0usize; as dyn Any));
    println!("type_name: {}", std::any::type_name_of_val(&b));
    println!("size_of  : {}", std::mem::size_of_val(&b));

    let thin = erasable::erase(ptr::NonNull::new(Box::into_raw(b)).unwrap());
    println!("type_name: {}", std::any::type_name_of_val(&thin));
    println!("size_of  : {}", std::mem::size_of_val(&thin));

    let b: Box<Indyn<dyn Any>> = unsafe { Box::from_raw(Indyn::unerase(thin).as_ptr()) };
    println!("type_name: {}", std::any::type_name_of_val(&b));
    println!("size_of  : {}", std::mem::size_of_val(&b));

    dbg!(b.downcast_ref::<usize>());
}

Output
type_name: alloc::boxed::Box<indyn::Indyn<dyn core::any::Any>>
size_of  : 16
type_name: core::ptr::non_null::NonNull<erasable::priv_in_pub::Erased>
size_of  : 8
type_name: alloc::boxed::Box<indyn::Indyn<dyn core::any::Any>>
size_of  : 16
[src\main.rs:63] b.downcast_ref::<usize>() = Some(
    0,
)

Unfortunately...

error[E0119]: conflicting implementations of trait `erasable::Erasable` for type `Indyn<_>`
  --> src\main.rs:17:1
   |
17 | unsafe impl<Dyn: ?Sized> Erasable for Indyn<Dyn> {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: conflicting implementation in crate `erasable`:
           - impl<T> Erasable for T;

The blanket impl of Erasable for any sized T conflicts with the specific impl for Indyn, because Indyn<Dyn> is itself a sized T whenever Dyn is sized. I still stand by the blanket impl, so I guess arbitrary DST metadata support for erasable::Thin will need to wait for both ptr_metadata and min_specialization :cry:


(I'm sorry-not-sorry; Indyn is a pun on "inline")

Hmm. This feels more suitable as a lang item, just saying. Good attempt tho.

(Specifically, isn't the goal that Box<Indyn<dyn Foo>> would have a size of 8?)

That's not possible in general until custom DSTs. erasable::Thin is a wrapper around any erasable pointer that stores it in its erased (thin) form without losing type safety. I would've used it here if not for the coherence issue.

I think this is worth digging into: why?

Box is highly special, down to being a unique kind of type in the compiler (or at least it was at one point, I don't know if that's been unified?), and Box's specialness is usually seen as a historical accident (but a useful one) that the lang/compiler teams would like to decrease in the future.

Cell/RefCell/Mutex etc. are not language items, they're all library features built on top of one language feature, UnsafeCell.
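To make that concrete, here is a minimal sketch of a Cell-like type written purely as library code on top of UnsafeCell. `MyCell` is a hypothetical name and this is a simplification, not the actual std implementation.

```rust
use std::cell::UnsafeCell;

/// Hypothetical Cell-like wrapper. The only language-level ingredient is
/// `UnsafeCell`, which permits mutation through a shared reference.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: no reference to the contents ever escapes, and the type is
        // `!Sync` (inherited from `UnsafeCell`), so nothing can race with this.
        unsafe { *self.value.get() }
    }

    fn set(&self, value: T) {
        // SAFETY: same reasoning as in `get`.
        unsafe { *self.value.get() = value }
    }
}
```

Everything beyond the UnsafeCell field is ordinary library API design, which is the bar being argued for Indyn.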

What makes Indyn special that it needs to be implemented as a compiler item rather than a regular library type?

It's unfortunate that as of current rustc (plus #![feature(ptr_metadata)]) Indyn can't always be a thin DST, but that's solved in the future by custom DSTs. Any language feature for Indyn is going to look a lot like Indyn (though keep in mind, the proof-of-concept is just that, a proof that it works, not necessarily the best API); if the language feature can be implemented strictly in library code, why shouldn't it just be a library feature?

Making Indyn a language feature isn't going to magically make it stably work as a thin DST without stabilizing ptr_metadata and custom DSTs. In fact, I'd give you 90+% odds that the way the lang team would implement Indyn would be as a library feature using ptr_metadata and custom DSTs.

Plus, Thin<P<Indyn<dyn Trait>>> works on today's nightly. (I'll be adding nightly-only feature gated support to erasable and indyn this coming weekend.) Once custom DSTs are available, P<Indyn<dyn Trait>> will (hopefully) also be thin.

I fail to see the value-add of rejecting the library implementation and waiting even longer for a potential language implementation.

Our thought process was that making it a lang item would make it work sooner. At the very least it could be implemented as a custom DST (under the hood) before custom DSTs get a defined syntax and semantics, thus also helping shape those syntax and semantics. (Indeed, just like Box. See below.)

Additionally, the "problem" with Box is simply one of ?Uninit types, as has been discussed before. It would stop being a lang item if we had ?Uninit types. It doesn't look like we'll have those anytime soon tho, and this thread isn't about that issue.

Syntax, sure. But semantics, not really. As a new kind of DST, thin DSTs would require deciding on the semantics of new DST kinds throughout the language and compiler.

Sure, you could sidestep a little bit of complexity because it doesn't introduce a new pointer metadata type, but not enough to make the problem significantly easier, imo.

Also, a language thin DST Indyn would want to always be Indyn<T>, not Indyn<Interface, ActualT> like I've written. That would mean that it's always unsized, which would mean requiring unsized_locals to be usable. (My Indyn abuses the second parameter to be conditionally unsized to get behind an indirection at which point it can be unsized.) unsized_locals is hard blocked on custom DSTs being fully designed and workable, if not stable, such that custom DSTs can also be held as locals.

"Make it a lang item" isn't a magic bullet to push features through to stabilization faster. For one, the first question is "why can't this be a library item?" Plus, language extensions are under a much higher burden of proof for addition, for good reason.

Box digression

The ability to talk about (partially) uninitialized types in the type system isn't enough to demagic Box; you also need typestate. You need the type of existing bindings to change based on the initialization state of the value.

This is much more complicated than "just" supporting (partially) maybe uninitialized types.

Think of it this way: trying to make Indyn<T> work would lead to defining the semantics of custom DSTs, which would then lead to defining the syntax. That doesn't necessarily mean stabilizing it sooner, but it does make it easier to reason about with an actual implementation.

As for the Box digression, we consider those inseparable. We've already argued about it.

Sometimes you just need to let the implementation shape the syntax/features you wanna create. Box is special in that you can move in and out of it, despite it being a Drop type. So one should use Box to shape ?Uninit types and the stuff around it. Make an Indyn<T> and let it shape custom (thin) DSTs.
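For readers who haven't run into that bit of Box specialness, a small sketch of the move-out half. The function name is made up for the example.

```rust
/// Moves the `String` out of a `Box` with a plain dereference.
fn move_out_of_box() -> String {
    let boxed: Box<String> = Box::new(String::from("moved"));

    // Moving out through `*` consumes the box: the heap allocation is freed,
    // but the `String` itself is handed to `inner` rather than dropped.
    let inner: String = *boxed;

    // `boxed` is now fully moved-from; uncommenting this is a compile error.
    // println!("{}", boxed);

    // The same expression on an `Rc<String>` is rejected with E0507
    // ("cannot move out of an `Rc`"), since `Rc` only offers `Deref`.
    inner
}
```

No user-defined pointer type can currently opt into this, which is why Box keeps getting brought up as the precedent.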

Are we making any sense here? Are these good, valid points? Any feedback? .-.

My stance remains the same. An Indyn that is always thin requires solving all of the barriers between custom DSTs and stabilization. There is next to no way a std Indyn is stabilized before custom DSTs. As I said previously, a std Indyn would want to always be a trait object, which requires unsized locals, which is another huge far-future feature to block on. Always being thin may even require that restriction: my implementation allows you to e.g. create Indyn<dyn Tr1, dyn Tr2> via unsizing, which can't be thin, since it stores the wrong metadata inline. And you can't unsize from Indyn<T> to Indyn<dyn Tr>, because the whole point is storing the metadata inline, and that metadata necessarily changes when you unsize the type.

The syntax is not the hard part of a feature; the semantics are.

Alright. And wouldn't it make sense to design unsized locals, custom DSTs, etc around an Indyn rather than the other way around?

What do unsized locals have to do with this feature? AFAIK currently the main problems are with alignment and interactions with async/generators.

Who knows. @CAD97 keeps bringing up unsized locals.