Missed niche optimization on unions

I've noticed that Rust doesn't perform niche optimization on unions (i.e. reduced-size Option<union>). Is that an omission, or is there a technical reason for this?

I'd expect this to work, since the first field is the same in all union variants, and has a niche:

use std::num::NonZeroU8;

pub union Niche {
    a: NonZeroU8,
    b: (NonZeroU8, u8),
    c: (NonZeroU8, u16),
}

fn main() {
    assert_eq!(std::mem::size_of::<NonZeroU8>(), std::mem::size_of::<Option<NonZeroU8>>(), "bare");
    assert_eq!(std::mem::size_of::<Niche>(), std::mem::size_of::<Option<Niche>>(), "union");
}

I don't have the specific citation for this, but I distinctly recall it being decided that union should suppress validity requirements (thus any niching opportunities) and that any byte layout be valid in a union.

This is notable for e.g. MaybeUninit, which is currently just[1] defined as union { (), ManuallyDrop<T> }. This means that (as currently implemented in the compiler) union has to suppress niches, as I understand it.


  1. It is a lang item, but a "soft" lang item in that the compiler emits code using it, not that it is given special semantics. ↩︎

2 Likes

To me this feels solved by simply not considering the extra padding for the niche, or in the other words, consider the padding as if it can take any bit pattern. Thus the () variant has no niche, and MaybeUninit continues working like it currently does.

1 Like

The counterargument to that is that unions with fields that all have consistent niche's (like the Niche union in the OP) could still utilize those niches.

Whether or not that's worth the effort of implementing is a separate question, though... Unions complicate niches because a union like union U { a: (NonZeroU8, u8), b: (u8, NonZeroU8) } doesn't have a niche anymore, unlike a struct would. Union niches would require some careful, custom logic to implement correctly.

My recollection is that this fell out of deciding that Rust doesn't have the "at least one union variant must be valid" rule, and thus one cannot trust that any niche, even if consistent across all variants, would be there.

(I don't recall why this was decided, though. Since it seems to me like adding a () variant would always work to get essentially that behaviour if that's what was desired for a particular union.)

4 Likes

It could potentially, fields with niches could be sorted to the start of their alignment class so that (NonZeroU8, u8) and (u8, NonZeroU8) have the same layout (which seems like it could help enum niche opts too).

Also of note is that in the case of a union like union U { a: NonZeroU8, b: (u8, NonZeroU8) } (with tuples that lay their fields out in declaration order), the a variant could be offset to align the niche with that in the b variant since repr(Rust) unions don't guarantee their layout.

2 Likes

If unions were guaranteed to suppress validity requirements, then it could simply be union MaybeUninit<T> { init: ManuallyDrop<T> }, the () variant is only necessary as a way to forceably remove the validity requirements if union doesn't do that.

1 Like

One reason (of many) that you aren't guaranteed at least one union variant is valid: Consider a struct containing a type byte followed by a union, where the type byte determines the correct interpretation of the union, and not all types are defined yet. A producer of that struct might add a new variant, with a type you don't know. If you see an unknown type, no variant of the union is valid; it might contain arbitrary bytes from your perspective. You could still read those bytes and (say) checksum them, but you can't assume they correspond to any variant.

1 Like

How exactly would you checksum potentially uninitialized bytes? AFAIK a (repr Rust) union can always contain padding which can be uninitialized.

5 Likes

Wouldn't (0, 0) be a niche though? Neither variant can store 0 in both fields of the tuple.

1 Like

This is even possible to break without any unsafe:

use std::num::NonZeroU8;
union U {
    a: (NonZeroU8, u8),
    b: (u8, NonZeroU8),
}
fn main() {
    let mut x = U {
        a: (1_u8.try_into().unwrap(), 0),
    };
    x.b.0 = 0;
}
7 Likes

That's suprising, I was expecting an union to be only one variant at a time, partially one and partially another.

1 Like

Unions are all variants at the same time, effectively. It's just that if you want it to concretely be one, you need unsafe.

3 Likes

Doesn't affect the case of unions, but enums don't get similar optimizations either. For example:

enum OptionWithExtras {
    None(u8), // variant layout could be (x, 0)
    Some(u8, NonZeroU8), 
} // but ends up taking three bytes

(issue #46213)

Slightly different since we'd just like the union as a whole to have a niche, but the enum should also use a niche in its variants.

1 Like

But what about the case where all union variants have the same field in the same position? I don't think there's a way to break it.

Indeed, your example in the original post seems very reasonable to me and I don’t see a way to break it either. My answer was just about justifying that the concrete example that @mjbshaw brought up can’t/doesn’t have a niche.

Depending on whether or not we currently always allow unions to always contain uninitialized or invalid data, and if yes how clearly specified this is, worst-case such niches as you propose could be introduced over an edition where cargo fix --edition would add an additional __dummy: () fields to every existing union that isn’t supposed to change semantics when upgrading editions.

Even in that case, our defined semantics allow a union to contain a value that isn't valid for any of its fields, for multiple reasons.

I personally think the right solution to this would be to stabilize a mechanism to declare explicit niches. That way, rather than the compiler trying to guess at the union semantics (and risk doing so incorrectly), the code can declare the correct semantics.

11 Likes

I thought you couldn't even do that meaningfully in C, since any padding bytes from that unknown variant could have arbitrary vales?

1 Like

You can if the semantics of the structure are such that you know the checksum is over the entire structure (for instance, if padding is defined to be zeroed). This is true for some firmware data structures, for example.

1 Like

Tuples are not guaranteed to be laid out in memory in sequence. union U { a: (NonZeroU8, u8), b: (u8, NonZeroU8) } might as well be ABI-equivalent to (NonZeroU8, u8), with only field names differing between union variants.

1 Like