Unsized `MaybeUninit`

(There was a previous discussion about this, but it was locked for inactivity without any comments.)

Is there a reason why MaybeUninit<T> needs to require T: Sized? It seems meaningful to be able to do it on unsized T, and doing so would also be useful. (My use case involves a wrapper type that asserts that a MaybeUninit<T> is in fact initialised and thus allows safe assume_init_ref and assume_init_mut operations – it's both meaningful and useful to do an unsized coercion on this wrapper type, e.g. from Wrapper<T> to Wrapper<dyn Trait>, but this would require coercing the underlying MaybeUninit<T> to MaybeUninit<dyn Trait>. The best available alternative would involve recreating &MaybeUninit and &mut MaybeUninit using raw pointers, but this would make the code much harder to read and write and much fuller of unsafe, so it would significantly raise the chance of a soundness error – it also makes type-generic code much harder to write because most existing traits think in terms of references rather than pointers.)
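To make the use case concrete, here is a minimal sketch of such a wrapper. The name `Init` and its methods are hypothetical (the post does not show the real type), and this version owns its slot for simplicity. It works for sized `T` today; the commented-out line is the coercion that cannot be written, because `MaybeUninit<T>` requires `T: Sized`.

```rust
use std::mem::MaybeUninit;

/// Hypothetical wrapper: owning an `Init<T>` asserts that the slot is initialized.
pub struct Init<T> {
    slot: MaybeUninit<T>,
}

impl<T> Init<T> {
    /// Safe constructor: the value is written immediately.
    pub fn new(value: T) -> Self {
        Init { slot: MaybeUninit::new(value) }
    }

    /// Safe shared access to the contained value.
    pub fn get(&self) -> &T {
        // SAFETY: every constructor of `Init` initializes `slot`.
        unsafe { self.slot.assume_init_ref() }
    }

    /// Safe exclusive access to the contained value.
    pub fn get_mut(&mut self) -> &mut T {
        // SAFETY: every constructor of `Init` initializes `slot`.
        unsafe { self.slot.assume_init_mut() }
    }
}

impl<T> Drop for Init<T> {
    fn drop(&mut self) {
        // SAFETY: `slot` is initialized and is never used after this point.
        unsafe { self.slot.assume_init_drop() }
    }
}

fn main() {
    let mut s = Init::new(String::from("hello"));
    s.get_mut().push_str(", world");
    println!("{}", s.get());

    // The coercion the post asks for. It would need `Init<T: ?Sized>`,
    // and therefore `MaybeUninit<T: ?Sized>`, so it cannot be written today:
    // let d: &Init<dyn std::fmt::Debug> = &s;
}
```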

As far as I can tell, this restriction occurs because unions don't support unsized fields, and that restriction occurs for unknown reasons: it isn't specified in the Rust reference, and the Rust compiler produces a confusing error message when you try (it states that T needs to be sized due to appearing in ManuallyDrop<T>, but ManuallyDrop can wrap unsized types – there's a separate help message stating that unions don't allow unsized fields, but that's in addition to the incorrect explanation rather than instead of it).
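A small sketch of the asymmetry described above. The names `Slot`, `UnsizedSlot` and `describe` are made up for illustration. The sized union compiles, the `?Sized` variant is left commented out because it is the case the compiler rejects, and `ManuallyDrop<dyn Debug>` on its own is a perfectly usable type behind a reference.

```rust
use std::fmt::Debug;
use std::mem::ManuallyDrop;

// Same shape as `MaybeUninit<T>`: accepted because `T` is implicitly `Sized`.
#[allow(dead_code)]
union Slot<T> {
    uninit: (),
    value: ManuallyDrop<T>,
}

// Relaxing the bound is the case that is rejected (unions do not allow
// unsized fields), so it stays commented out:
// union UnsizedSlot<T: ?Sized> {
//     uninit: (),
//     value: ManuallyDrop<T>,
// }

// `ManuallyDrop` by itself is happy to hold an unsized type.
fn describe(value: &ManuallyDrop<dyn Debug>) -> String {
    // `**value` goes through `Deref` to reach the inner `dyn Debug`.
    format!("{:?}", &**value)
}

fn main() {
    let sized = ManuallyDrop::new(String::from("hi"));
    // Ordinary unsizing coercion on a struct whose last field is `T`.
    let erased: &ManuallyDrop<dyn Debug> = &sized;
    println!("{}", describe(erased));
}
```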

As such, I'd like to suggest either a) allowing unsized fields in unions, or b) special-casing MaybeUninit so that it can have an unsized field despite being a union. The semantics would be that the appropriate metadata is always present and is used to determine the size/alignment of the union, but is otherwise used only when accessing the unsized union field and ignored when accessing the other fields. For example, you could coerce a MaybeUninit<String> to a MaybeUninit<dyn Debug>, and the resulting type would be an unsized type whose runtime size and alignment were the same as those of String, and would use String's vtable when accessing the type as though it were initialized. For simplicity, it is probably best to limit this to one unsized field per union, to avoid needing to juggle two different sets of metadata (Rust already has a rule like this for other unsized types, e.g. you can't coerce Box<[u32]> to Box<dyn Debug> because the resulting type would have two different sets of unsized metadata).
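The "metadata determines size and alignment" part already works this way for ordinary trait objects, which is roughly what the proposal would reuse. As a rough illustration (the helper `layout_of` is a made-up name, and this shows `&dyn Debug`, not the proposed `MaybeUninit<dyn Debug>`), the vtable carried by the fat pointer reports the layout of the erased `String` without the pointee being inspected:

```rust
use std::fmt::Debug;
use std::mem::{align_of, align_of_val, size_of, size_of_val};

/// Returns (size, alignment) of the value behind a trait object,
/// as read from the vtable in the fat pointer.
fn layout_of(value: &dyn Debug) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn main() {
    let s = String::from("hello");
    let erased: &dyn Debug = &s;

    // The erased value reports exactly `String`'s layout.
    assert_eq!(
        layout_of(erased),
        (size_of::<String>(), align_of::<String>())
    );
    println!("{:?}", layout_of(erased));
}
```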

Not all the methods of MaybeUninit work on unsized types (e.g. you couldn't use the by-value assume_init), but enough of them do to still be able to use it meaningfully.

Is there a reason why this wouldn't work? Or is it just the fact that it hasn't been implemented yet, or that it would require an exception to usual language design rules? (I can't think of any other use cases for unsized unions, but then there aren't many use cases for unions in Rust in general.)

I don't see the discussion in rust-lang linked: MaybeUninit requires T: Sized but it should not · Issue #80158 · rust-lang/rust · GitHub

Boils down to: for all types T we expect some behavior of pointers-to-T, but MaybeUninit cannot provide it for all of its possible type instantiations. Usually type constructors just defer to T, but obviously we cannot do that here: T is allowed to enforce any number of additional invariants that MaybeUninit<T> cannot possibly guarantee to uphold. No one brought forward a consistent way of dealing with this that can be evaluated for usefulness and ergonomics. We could either restrict the type by a special bound (so not all unsized types, just those that don't even look at the value), or, if we don't want to restrict the generic, come up with a way to get away with *const MaybeUninit<T> not being well-behaved for all T. For the latter, the crux is that one has to provide reasoning about it being a well-formed type and a list of behaviors for values of that type.

As far as I know, neither was fully explored. I think the lang-team would like to at least have a beginning of formal reasoning for any proposed idea.

Hmm, so one big difference between the discussion at the time, and current (nightly/unstable) Rust, is that we now have a sized hierarchy (tracking issue, RFC). The main thing that was blocking this back in 2020 was that there was no way to say "a type whose size can be determined using a pointer/reference to the type (without inspecting the pointer/reference target)" other than Sized.

RFC 3729 introduces a specific concept of MetaSized which seems to have exactly the desired behaviour (i.e. it specifies that a pointer or reference to a value of that type is enough to determine its size and alignment). This seems like it would be sufficient to allow MaybeUninit<T: MetaSized> to work correctly – everything you can do with a MaybeUninit either depends purely on its size and alignment, or requires it to hold an initialized value (in particular, there would be no need to know whether or not any contained T obeyed its invariants in order to manipulate the MaybeUninit itself via pointer).

Note that I wouldn't expect MaybeUninit::uninit() and the like to work on an unsized type (nor do I need the ability to do that) – I'd expect it to be created directly from a piece of uninitialized memory that's the right size/alignment to store a value of the correct type, or by unsizing a sized MaybeUninit.
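For comparison, this is roughly what the raw-pointer stand-in looks like on stable Rust today. It is only a sketch: the function name `init_and_format` is made up, and a `*mut dyn Debug` plays the role that `&mut MaybeUninit<dyn Debug>` would play under the proposal. The storage starts out as a sized `MaybeUninit<String>`, and the pointer to it is unsized before anything has been written.

```rust
use std::fmt::Debug;
use std::mem::MaybeUninit;

/// Initializes a slot through a typed pointer, then reads and drops it
/// through a type-erased pointer to the same storage.
fn init_and_format(text: &str) -> String {
    let mut slot: MaybeUninit<String> = MaybeUninit::uninit();

    // Typed pointer to the uninitialized storage.
    let typed: *mut String = slot.as_mut_ptr();
    // Unsizing the raw pointer is fine while the pointee is uninitialized:
    // only the vtable needs to be valid, not the pointee.
    let erased: *mut dyn Debug = typed;

    // SAFETY: `typed` points to valid, properly aligned storage for a `String`.
    unsafe { typed.write(text.to_owned()) };

    // SAFETY: the storage was initialized just above.
    let out = unsafe { format!("{:?}", &*erased) };

    // SAFETY: the value is initialized and is not used again afterwards.
    unsafe { std::ptr::drop_in_place(erased) };

    out
}

fn main() {
    println!("{}", init_and_format("hello"));
}
```

Every step that touches the value needs its own `unsafe` block and safety argument, which is the readability cost the original post is concerned about.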