Summary
Add a new repr: #[repr(collapse_uninhabited)] (name to be bikeshed) that can optimize out all fields of a struct or variant if any field of that struct or variant is uninhabited.
Motivation
- Allow optimizing Result<T, E> to just T if E is uninhabited, especially if E is large
- Unblock offset_of_enum
Guide-level explanation
A #[repr(collapse_uninhabited)] type represents that the author is opting-out of Rust's normal "all fields (per variant) are nonoverlapping and the type is aligned enough for each field" guarantees when the type (or variant) is uninhabited in the particular instantiation.
For example:
#[repr(collapse_uninhabited)]
pub struct Pair<T, U>(pub T, pub U);
assert!(size_of::<Pair<u32, u32>>() >= 2 * size_of::<u32>());
assert!(align_of::<Pair<u32, u32>>() >= align_of::<u32>());
assert!(size_of::<Pair<!, u32>>() == 0);
assert!(align_of::<Pair<!, u32>>() == 1);
Field projection and pattern-matching by-value or by-reference on such a type or variant is allowed in safe code as normal, as having a value of or reference to such a type/variant already requires that the type/variant be inhabited.
However, projecting behind a raw pointer, or computing offset_of! for a field of such a type, is a deny-by-default lint in all cases, and is a hard monomorphization-time error if the type is uninhabited in the particular generic instantiation (see "Alternatives > Alternatives to post-mono error ...").
The fallible try_offset_of! can be used on such types with no error or lint, and will return None if the type or variant in question is uninhabited (even if the particular field is zero-sized).
For example:
#[repr(collapse_uninhabited)]
pub struct Pair<T, U>(pub T, pub U);
impl<T, U> Pair<T, U> {
    pub fn into_tuple(self) -> (T, U) {
        (self.0, self.1) // Allowed with no lint: this code would be unreachable if `Self` was uninhabited
    }
    pub fn as_ref(&self) -> (&T, &U) {
        (&self.0, &self.1) // Allowed with no lint: this code would be unreachable if `Self` was uninhabited.
        // Even if we end up with `&V` being *valid* when `V` is uninhabited,
        // it is still not *safe* to expose to arbitrary code, so this should still be sound.
        // Field-projecting through `&Pair<T, U>` when `Pair<T, U>` is uninhabited is immediate language UB.
    }
    pub unsafe fn first_ptr(this: *const Self) -> *const T {
        &raw const (*this).0 // Deny-by-default lint here
        // If the lint is downgraded, then calling this function where
        // `T` *or* `U` is uninhabited will be a post-mono error.
    }
    pub fn first_offset() -> usize {
        offset_of!(Self, 0) // Deny-by-default lint here
        // If the lint is downgraded, then calling this function where
        // `T` *or* `U` is uninhabited will be a post-mono error.
    }
    pub fn try_first_offset() -> Option<usize> {
        try_offset_of!(Self, 0) // Allowed with no lint
    }
    pub unsafe fn first_ptr_good(this: *const Self) -> Option<*const T> {
        let offset = try_offset_of!(Self, 0)?;
        Some(this.byte_add(offset).cast::<T>())
    }
}
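Neither the repr nor try_offset_of! exists today, so the pattern above cannot be compiled as written. As a rough emulation in current stable Rust, the sketch below replaces the compiler's knowledge of inhabitedness with a hand-written marker trait (Inhabitedness, a hypothetical name introduced only for this illustration) and wraps the existing offset_of! in a fallible function. It only models the shape of the API: the struct here is laid out normally and nothing is actually collapsed.

```rust
use std::convert::Infallible;
use std::mem::offset_of;

/// Hypothetical stand-in for the compiler's knowledge of inhabitedness.
/// Under the RFC this would be known to the compiler, not written by hand.
pub trait Inhabitedness {
    const INHABITED: bool;
}
impl Inhabitedness for u32 {
    const INHABITED: bool = true;
}
impl Inhabitedness for Infallible {
    const INHABITED: bool = false;
}

// An ordinary struct: the proposed repr cannot be written yet.
pub struct Pair<T, U>(pub T, pub U);

impl<T: Inhabitedness, U: Inhabitedness> Pair<T, U> {
    /// Emulates `try_offset_of!(Self, 0)`: `None` if any field is uninhabited.
    pub fn try_first_offset() -> Option<usize> {
        if T::INHABITED && U::INHABITED {
            Some(offset_of!(Self, 0))
        } else {
            None
        }
    }

    /// Safety: `this` must point into an allocation large enough for `Self`.
    pub unsafe fn first_ptr_good(this: *const Self) -> Option<*const T> {
        let offset = Self::try_first_offset()?;
        // Only reached when the emulated offset exists.
        Some(unsafe { this.byte_add(offset) }.cast::<T>())
    }
}
```

Under these assumptions, Pair::<u32, u32>::try_first_offset() yields Some(..) and Pair::<Infallible, u32>::try_first_offset() yields None, so first_ptr_good never computes a field pointer for the uninhabited instantiation.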
When applied to enums, #[repr(collapse_uninhabited)] applies to the whole enum, as well as per-variant. That is, individual uninhabited variants can be optimized out, and if there is no inhabited variant, the whole enum can be optimized to a 1-aligned 0-sized type.
For example:
#[repr(collapse_uninhabited)]
pub enum MyOption<T> {
Some(T),
None,
}
impl<T> MyOption<T> {
    pub fn into_std_option(self) -> Option<T> {
        match self {
            MyOption::None => None,
            MyOption::Some(value) => Some(value), // Allowed with no lint
            // If we reach this arm, then `MyOption::<T>::Some` is inhabited
        }
    }
    pub fn as_ref(&self) -> Option<&T> {
        match self {
            MyOption::None => None,
            MyOption::Some(value) => Some(value), // Allowed with no lint
            // If we reach this arm, then `MyOption::<T>::Some` is inhabited
        }
    }
    pub fn value_offset() -> usize {
        offset_of!(Self, Some.0) // Deny-by-default lint here
        // If the lint is downgraded, then calling this function where
        // `T` is uninhabited will be a post-mono error.
    }
    pub fn try_value_offset() -> Option<usize> {
        try_offset_of!(Self, Some.0) // Allowed with no lint
    }
    pub unsafe fn value_ptr_good(this: *const Self) -> Option<*const T> {
        let offset = try_offset_of!(Self, Some.0)?;
        Some(this.byte_add(offset).cast::<T>())
    }
}
#[repr(collapse_uninhabited)]
pub enum MyResult<T, E> {
Ok(T),
Err(E),
}
// `assert_eq!` cannot be used in const contexts, so plain `assert!` is used here.
const _: () = assert!(size_of::<MyResult<(u32, !), (!, u32)>>() == 0);
const _: () = assert!(align_of::<MyResult<(u32, !), (!, u32)>>() == 1);
const _: () = assert!(size_of::<MyResult<u8, (MyLargeErrorType, !)>>() == 1);
const _: () = assert!(align_of::<MyResult<u8, (MyLargeErrorType, !)>>() == 1);
Reference-level explanation
Layout
When #[repr(collapse_uninhabited)] is applied to a struct, and (a particular generic instantiation of) that struct is uninhabited, the resulting layout is 1-aligned and zero-sized. No space is reserved for any field (even ones of inhabited types), and the alignment is not raised to that of the most-aligned field. If (a particular generic instantiation of) the struct is not uninhabited, it is laid out as though #[repr(collapse_uninhabited)] were not present (see "Interaction with other reprs").
When #[repr(collapse_uninhabited)] is applied to an enum, any variant which is uninhabited in a particular generic instantiation is not considered when computing the layout of the enum as a whole. If such an enum has no inhabited variants, the resulting layout is the same as though it had been declared with no variants (i.e. 1-aligned and zero-sized and uninhabited, for enums with no other repr hints (see Interaction with Other Reprs below)).
Examples:
#[repr(collapse_uninhabited)]
struct Pair<T, U>(pub T, pub U);
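To make the two rules above concrete, here is a small model of them over plain size/align/inhabited values. It is a sketch, not the compiler's algorithm: the Layout type and both function names are assumptions made for this illustration, struct fields are placed in declaration order (real repr(Rust) may reorder), and the enum tag is a naive leading byte with no niche optimization.

```rust
/// Size, alignment and inhabitedness of a type, as a plain value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Layout {
    pub size: usize,
    pub align: usize,
    pub inhabited: bool,
}

/// The layout the proposal assigns to anything that has collapsed.
pub const COLLAPSED: Layout = Layout { size: 0, align: 1, inhabited: false };

fn round_up(n: usize, align: usize) -> usize {
    n.div_ceil(align) * align
}

/// Struct rule: any uninhabited field collapses the whole struct.
/// Otherwise fields are placed in declaration order (a simplification).
pub fn collapsed_struct(fields: &[Layout]) -> Layout {
    if fields.iter().any(|f| !f.inhabited) {
        return COLLAPSED;
    }
    let mut size = 0;
    let mut align = 1;
    for f in fields {
        size = round_up(size, f.align) + f.size;
        align = align.max(f.align);
    }
    Layout { size: round_up(size, align), align, inhabited: true }
}

/// Enum rule: uninhabited variants are ignored for layout purposes.
/// With no surviving variant the enum itself collapses.
pub fn collapsed_enum(variants: &[Layout]) -> Layout {
    let live: Vec<Layout> = variants.iter().copied().filter(|v| v.inhabited).collect();
    match live.as_slice() {
        [] => COLLAPSED,
        // A single surviving variant needs no tag in this model.
        [only] => *only,
        _ => {
            let align = live.iter().map(|v| v.align).max().unwrap_or(1);
            let payload = live.iter().map(|v| v.size).max().unwrap_or(0);
            // Naive one-byte tag at offset 0, payload at the next aligned offset.
            let size = round_up(round_up(1, align) + payload, align);
            Layout { size, align, inhabited: true }
        }
    }
}
```

Feeding the model a (4, 4, inhabited) layout twice gives an 8-byte, 4-aligned struct, while replacing either field with an uninhabited layout gives COLLAPSED, matching the Pair assertions in the guide-level section.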
Partial Initialization
Partial Initialization in safe code
Rust does not currently have partial initialization in safe code. If such a feature were to be implemented, there are a few ways it could interact with this feature:
- Do not allow #[repr(collapse_uninhabited)] types to be partially initialized at all.
- Similar to offset_of!/ptr field projections: deny-by-default lint in all cases, post-mono error when used on a particular instantiation that is uninhabited (or panic, see "Alternatives > Alternatives to post-mono error ...").
- Allow #[repr(collapse_uninhabited)] to be partially initialized in safe code, but silently change the semantics to have each initialized field in its own local variable when the type overall is uninhabited.
  - This would probably be incompatible with &raw borrowing of partially-initialized variables, if such a feature were to be added.
I think in-place-initialization proposals would mostly interact with #[repr(collapse_uninhabited)] in the same way as safe-partial-initialization (i.e. probably one of the above three interactions, and they should have the same interaction if both are added).
Partial Initialization in unsafe code
This is IMO covered by the lints and post-mono errors on raw pointer field projection and offset_of!.
Semver
Adding #[repr(collapse_uninhabited)]
Adding #[repr(collapse_uninhabited)] to a type with any public fields on which offset_of! can be called is a major breaking change. As such, this repr cannot be added to any existing stable stdlib structs with public fields.
Also, we cannot add this repr to primitive tuples, since offset_of! on them is stable.
Enums
Note that since the offset_of_enum feature is currently unstable, and is the only way to observe whether an enum has #[repr(collapse_uninhabited)], it is not a breaking change to add #[repr(collapse_uninhabited)] to an enum before feature(offset_of_enum) is stabilized. As such, we should evaluate if stdlib enums should get this repr (see "Stdlib types" below), or if this repr should be the default for #[repr(Rust)] enums (see "Alternatives" below).
Removing #[repr(collapse_uninhabited)]
Ideally, removing #[repr(collapse_uninhabited)] from a type should not be a breaking change. However, we'd need to consider how this interacts with #[repr(transparent)], e.g.
// upstream crate v1.0
#[repr(Rust, collapse_uninhabited)]
pub struct Foo<T>(u32, T);
#[repr(Rust, collapse_uninhabited)]
pub enum Bar<T> { None, Some(u32, T) }
// downstream crate
#[repr(transparent)]
pub struct Wrapper(u32, upstream::Foo<!>, upstream::Bar<!>);
// Foo<!> is collapsed to an (uninhabited) 1-ZST,
// and Bar<!> is collapsed to an inhabited 1-ZST,
// so it seems like it should be allowed
// upstream crate v1.1
#[repr(Rust)]
pub struct Foo<T>(u32, T);
#[repr(Rust)] // no_collapse_uninhabited if we make it opt-out
pub enum Bar<T> { None, Some(u32, T) }
// Oops! Either of these changes individually breaks downstream,
// as `Foo<!>` and `Bar<!>` are now not 1-ZSTs
I think it would be reasonable to disallow #[repr(collapse_uninhabited)] types as the 1-ZST fields in #[repr(transparent)] types, especially cross-crate, though enforcing this transitively may be difficult.
Stdlib types
Option
I think it would be possible to add #[repr(collapse_uninhabited)] to Option in a backwards-compatible manner.
- Option::as_slice would need to be changed to use try_offset_of!, and return an arbitrary empty slice if T is uninhabited. This will be fully known at codegen time (because it will be monomorphic) so should not have a perf impact I think.
- All the current guaranteed-null-pointer-optimized types are inhabited (assuming &! does not start being considered type-level uninhabited), so that layout guarantee should not conflict with this.
Result
I think this would be fine, by the same reasoning as for Option, applied to the layout guarantee for Result<T, 1-ZST> where T is guaranteed null-pointer-optimized.
All other enums
I think it should be backwards-compatible to add #[repr(collapse_uninhabited)] to any enum with no specific layout guarantees, before feature(offset_of_enum) is stabilized.
Interaction with other reprs
Note: this is specifically for applying multiple reprs to the same type. See "Semver > Removing #[repr(collapse_uninhabited)]" for discussion of interaction with #[repr(transparent)] in a different type.
- #[repr(Rust)]: #[repr(Rust, collapse_uninhabited)] is equivalent to #[repr(collapse_uninhabited)]
- #[repr(align(N))]: Works as normal: aligns the whole type to at least align-N (even if it is uninhabited)
- #[repr(packed(N))]: Works as normal: packs the type and any fields to at most align-N and removes padding if possible.
- #[repr(transparent)]: Cannot be combined
- #[repr(C<=2024/ordered_fields/linear)] (current #[repr(C)], #[repr(ordered_fields)] from RFC 3845) for structs: We could allow this, and just say "if all fields are inhabited, behaves as repr(linear), else uninhabited 1-ZST"
- #[repr(C>=2025/system)] (#[repr(C)] in future edition from RFC 3845) for structs: Cannot be combined
- #[repr(C)]/#[repr(C, u8)]/#[repr(u8)] primitive enum representations: We might consider allowing this and just having any uninhabited variants not participate in layout calculation
  - This would require that we allow (or at least it would seem inconsistent if we didn't allow) empty #[repr(C/Int)] enums, which are uninhabited and have the size/align of their tag type
  - If we don't allow this, as a workaround users can make a #[repr(linear, collapse_uninhabited)] struct to hold the fields of a variant they want to be collapsible
    - This works perfectly fine for #[repr(C)]/#[repr(C, int)] enums which use a (tag, payload) representation anyway (the payload for the particular variant is then just the intermediate struct).
    - This isn't fully equivalent with #[repr(int)] enums, as the type will have a different offset in the case it's not uninhabited, and contains any field more aligned than the tag.
Interaction with unsized types
Possibilities:
- We require all fields be Sized.
- We allow an unsized type as the last field, but disallow unsizing (i.e. the current state for tuples).
  - need to make size/align_of_val_raw know about it
- We allow an unsized type as the last field, and allow unsizing "as normal".
  - I don't think this can work: what if we unsize Pair<u32, !> to Pair<u32, dyn Debug>?
  - need to make size/align_of_val_raw know about it
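The concern in the last option can be seen with types that exist today. In the sketch below, Infallible stands in for !, Pair is an ordinary struct with a ?Sized last field (no collapsing repr, since that does not exist yet), and erase is a hypothetical helper name. The unsizing coercion type-checks for every U: Debug, including an uninhabited one, and size_of_val on the result is derived from the unsized struct's field layout plus the vtable. That is the computation that would have to learn about a collapsed source layout.

```rust
use std::convert::Infallible;
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Ordinary struct whose last field may be unsized (no collapsing repr here).
pub struct Pair<T, U: ?Sized>(pub T, pub U);

/// Hypothetical helper: unsizes the last field to a trait object.
pub fn erase<U: Debug + 'static>(p: &Pair<u32, U>) -> &Pair<u32, dyn Debug> {
    p // implicit unsizing coercion
}

/// The same coercion is accepted when the last field is uninhabited.
/// Safe code can never call this, but the type checker allows it.
pub fn erase_uninhabited(p: &Pair<u32, Infallible>) -> &Pair<u32, dyn Debug> {
    erase(p)
}

/// For an inhabited instantiation, the dynamically computed size
/// agrees with the static size of the original type.
pub fn sizes_agree() -> bool {
    let p = Pair(1u32, 2u8);
    let erased = erase(&p);
    size_of_val(erased) == size_of::<Pair<u32, u8>>()
}
```

If Pair<u32, Infallible> were collapsed to a 1-aligned ZST, a reference to it would no longer describe memory with a u32 at the offset that Pair<u32, dyn Debug> expects, which is why allowing unsizing "as normal" looks unworkable.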
Drawbacks
This complicates the language, and breaks the current status quo that types have space for all of their fields in all layouts.
Alternative choices and future possibilities
Make enums #[repr(collapse_uninhabited)] by default
We could make #[repr(collapse_uninhabited)] the default for #[repr(Rust)] enums, and provide an opt-out for them (e.g. #[repr(no_collapse_uninhabited)] or #[repr(collapse_uninhabited = false)] or something).
Note that if we did this, we would have to do it before stabilizing feature(offset_of_enum). We wouldn't necessarily have to stabilize the repr itself, just the layout effects, lints, try_offset_of!, and (probably) the post-mono errors. We could allow opting-out enums and/or opting-in structs later by stabilizing repr(no_collapse_uninhabited) and/or repr(collapse_uninhabited).
Per-variant collapsing
We could allow applying the representation on individual variants in an enum, such that only the particular variants are subject to the "collapsing" semantics. (Or if we make it opt-out for enums, we could also allow opting-out per-variant, or opting-out for the enum as a whole, but opting-in for a particular variant).
This would be possible to add later (though semver would need to be considered when actually opting-in/out a variant).
Lint level
The deny-by-default lint for offset_of! or raw pointer projecting through a #[repr(collapse_uninhabited)] type could instead be warn-by-default. I don't think it should be allow-by-default.
Alternatives to post-mono error for offset_of! and &raw (*ptr).field
We could instead make offset_of! through an uninhabited #[repr(collapse_uninhabited)] type give a runtime panic if the code is reached.
We could make &raw const (*ptr).field through an uninhabited #[repr(collapse_uninhabited)] type panic at runtime or give UB. Either of these would allow checking if try_offset_of! returns Some, and then using normal field projection afterwards.
Prior Art
- Enum variants that are uninhabited and have only 1-aligned 0-sized type fields are considered "absent" and are mostly ignored for layout purposes.
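The existing behaviour can be observed on a current compiler with a quick size comparison. This is only a sketch: Infallible stands in for !, LargeError is a hypothetical payload type made up for the illustration, and repr(Rust) layout is unspecified, so the exact numbers are not guaranteed and no expected values are claimed here.

```rust
use std::convert::Infallible;
use std::mem::size_of;

/// Hypothetical large error payload, for illustration only.
#[allow(dead_code)]
pub struct LargeError([u64; 8]);

/// Returns the sizes of two `Result` instantiations whose `Err` variant
/// is uninhabited:
/// 1. `Err` holds only a 1-aligned ZST, so the variant can count as "absent";
/// 2. `Err` also holds a large inhabited field next to the uninhabited one.
pub fn observed_sizes() -> (usize, usize) {
    (
        size_of::<Result<u8, Infallible>>(),
        size_of::<Result<u8, (LargeError, Infallible)>>(),
    )
}
```

Printing both values shows whether the compiler in use still reserves space for the second Err variant. The first motivation bullet suggests that it does, and #[repr(collapse_uninhabited)] is meant to close that gap.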
Unresolved questions
Interaction with MIR optimizations that would conflict with this