Proposals to support DST smart pointers

For a long time, there's not really been a sane way to construct custom dynamically-sized types (DSTs). They can't be constructed on the stack, so they must be constructed on the heap, but none of the standard library's heap-allocated containers can handle custom DSTs. Box, Rc and Arc can be constructed by moving a Sized type into them, or by constructing a slice of MaybeUninit, but nothing beyond that. Box::from_raw can take a pointer to a memory allocation, which you can in theory use to construct a DST before giving Box ownership of the memory, but Rc and Arc have no such option, as the pointer needed for their from_raw constructors can only be obtained from a previously constructed container.
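To make the Box::from_raw route concrete, here is a minimal sketch of what it looks like on stable Rust today. The `Packet` type and the `boxed_packet` helper are made up for illustration, and the approach assumes a `#[repr(C)]` struct so that the layout can be computed by hand with `Layout::extend`.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr;

// Hypothetical custom DST: a sized header followed by an unsized tail.
// repr(C) makes the field layout predictable enough to compute manually.
#[repr(C)]
struct Packet {
    id: u32,
    payload: [u8],
}

fn boxed_packet(id: u32, payload: &[u8]) -> Box<Packet> {
    let len = payload.len();
    // Layout of `id` followed by `len` bytes, padded to the struct's alignment.
    let (layout, offset) = Layout::new::<u32>()
        .extend(Layout::array::<u8>(len).expect("payload too large"))
        .expect("layout overflow");
    let layout = layout.pad_to_align();

    unsafe {
        // The layout is never zero-sized because of the u32 header.
        let raw = alloc(layout);
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        // Build a fat pointer whose metadata is the payload length,
        // then reinterpret it as a pointer to the DST.
        let fat = ptr::slice_from_raw_parts_mut(raw, len) as *mut Packet;
        // Initialise every field through raw pointers before creating the Box.
        ptr::addr_of_mut!((*fat).id).write(id);
        ptr::copy_nonoverlapping(payload.as_ptr(), raw.add(offset), len);
        Box::from_raw(fat)
    }
}
```

Calling `boxed_packet(7, b"abc")` should yield a `Box<Packet>` that frees itself normally on drop. Nothing comparable works for Rc or Arc, because their allocations also hold the reference counts, and that layout is private.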

I've thought about possible ways out of this tangle, but I'm not sure which one is the most promising option that I should pursue with a followup RFC.

1. Unsized unions

The main reason that the above containers can't be constructed with MaybeUninit for arbitrary types is that MaybeUninit is a union, and unions must be sized. You can have a [MaybeUninit<T>] but not a MaybeUninit<[T]>. I found some discussions about allowing (some subset of) unsized unions.

The most conservative option would be to allow a union to contain exactly one unsized variant, just like structs. It could even be required to be the last field, though for a union that seems superfluous. Because this would be a change to the core language itself, it probably has the most soundness ramifications, and therefore seems like the most likely to run into roadblocks and stall. But it also seems like this is the option everyone ultimately "wants" to happen in the long term.
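For comparison, this is the slice-of-MaybeUninit form that already works, sketched here for Rc and assuming Rust 1.82 or later (where `Rc::new_uninit_slice` and `assume_init` are stable). The `squares` function is just an illustrative name. It only works because each element is a sized `MaybeUninit<u64>`; there is no equivalent for a custom DST.

```rust
use std::rc::Rc;

// Build an Rc<[u64]> in place, without an intermediate Vec.
fn squares(n: usize) -> Rc<[u64]> {
    // Rc<[MaybeUninit<u64>]>: the slice is unsized, but each element is sized.
    let mut uninit = Rc::<[u64]>::new_uninit_slice(n);
    let slots = Rc::get_mut(&mut uninit).expect("a freshly created Rc is unique");
    for (i, slot) in slots.iter_mut().enumerate() {
        slot.write((i * i) as u64);
    }
    // SAFETY: every element was written in the loop above.
    unsafe { uninit.assume_init() }
}
```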

2. UniqueRcUninit

UniqueRc is a relatively new and still unfinished API for constructing an Rc. It seems to have been added primarily for creating cyclic reference chains, but it could also be used to construct values in place. On Discord, @kpreid pointed me towards a private helper type for UniqueRc, called UniqueRcUninit, that was merged only a few days ago. Its purpose is to allocate memory for a UniqueRc and then provide some way to initialise it. If this were made public, and we added BoxUninit and UniqueArcUninit, then this could work around the limitations of unions. But as a private type it may not be suitable for a public API without some changes.

3. New allocation-only constructors

If neither of the above options is feasible, then there is a fallback option, which is to add a completely new API. A possible API was suggested on Discord, with some adjustments by me:

impl<T> Rc<T>
where
    T: ?Sized + Pointee,
{
    pub fn alloc(metadata: <T as Pointee>::Metadata) -> *mut T;
    pub unsafe fn dealloc(ptr: *mut T);
}

These functions would also be added to Box and Arc. Essentially, they would do with a raw pointer what UniqueRcUninit does with a dedicated type:

  • The alloc constructor would allocate memory, and then return a raw pointer to the T within the allocation. This would be the same pointer that's returned by the into_raw method of these three types, but can be obtained without first obtaining a Box/Rc/Arc with an initialised T. The user would then use this pointer to initialise the T, and then call from_raw to create the final container.
  • The dealloc function is needed in case construction of the T fails in some way, and the memory needs to be freed without ever initialising the memory and creating the container type.
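To illustrate the intended flow, here is a rough stand-in that can be written on stable Rust today, restricted to Box and slices (where the metadata is just a length). The free functions `box_alloc_slice` and `box_dealloc_slice` and the `parse_digits` caller are hypothetical names invented for this sketch, not the proposed API itself, and it assumes Rust 1.79 or later for `len` on raw slice pointers.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::{self, NonNull};

// Hypothetical stand-in for the proposed `Box::<[T]>::alloc(len)`.
fn box_alloc_slice<T>(len: usize) -> *mut [T] {
    let layout = Layout::array::<T>(len).expect("capacity overflow");
    let data = if layout.size() == 0 {
        // Box never allocates for zero-sized layouts.
        NonNull::<T>::dangling().as_ptr()
    } else {
        let p = unsafe { alloc(layout) } as *mut T;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        p
    };
    ptr::slice_from_raw_parts_mut(data, len)
}

// Hypothetical stand-in for the proposed `Box::<[T]>::dealloc(ptr)`.
// Frees the memory without dropping any (possibly uninitialised) elements.
unsafe fn box_dealloc_slice<T>(ptr: *mut [T]) {
    let layout = Layout::array::<T>(ptr.len()).expect("capacity overflow");
    if layout.size() != 0 {
        unsafe { dealloc(ptr as *mut u8, layout) };
    }
}

// Fallible in-place construction: alloc, initialise, then from_raw,
// with dealloc as the escape hatch when initialisation fails midway.
fn parse_digits(s: &str) -> Option<Box<[u8]>> {
    let ptr = box_alloc_slice::<u8>(s.len());
    let data = ptr as *mut u8;
    for (i, b) in s.bytes().enumerate() {
        if !b.is_ascii_digit() {
            unsafe { box_dealloc_slice(ptr) };
            return None;
        }
        unsafe { data.add(i).write(b - b'0') };
    }
    // SAFETY: all s.len() bytes were initialised above.
    Some(unsafe { Box::from_raw(ptr) })
}
```

The Rc and Arc versions can't be written outside the standard library, since the pointer has to point at the T inside an allocation that also holds the reference counts.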

Do you think any of these ideas is worth investigating further and turning into an RFC?

To me, it seems like unsized unions would be the best option here. However, you would still need some way to allocate a UniqueRc<MaybeUninit<[T]>>. This would essentially mean copying the basic MaybeUninit APIs (uninit, write, etc) to all of those containers.

Once you have the UniqueRc<MaybeUninit<[T]>>, you can deref to get a &mut MaybeUninit<[T]> for purposes of initializing it. So, the only operations needed per container would be the initial and final ones:

  • Allocate Container<MaybeUninit<T>>.
  • Transform Container<MaybeUninit<T>> to Container<T>.
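For sized T, those two operations already exist, so the shape can be sketched with Arc on Rust 1.82 or later. The `build_table` function is an illustrative name. With unsized unions, the same two steps would presumably extend to MaybeUninit<[T]> and custom DSTs.

```rust
use std::sync::Arc;

fn build_table() -> Arc<[u64; 4]> {
    // Step 1: allocate Container<MaybeUninit<T>>.
    let mut uninit = Arc::<[u64; 4]>::new_uninit();

    // In between: reach the MaybeUninit<T> through the container and initialise it.
    let slot = Arc::get_mut(&mut uninit).expect("a freshly created Arc is unique");
    slot.write([1, 2, 4, 8]);

    // Step 2: transform Container<MaybeUninit<T>> into Container<T>.
    // SAFETY: the value was fully written above.
    unsafe { uninit.assume_init() }
}
```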