Enum with tag at end

Consider the following code (also on the playground):

#![allow(dead_code)]

use std::mem::size_of;

#[derive(Copy, Clone)]
// specifically ensure an alignment of at least 16 bytes, because of https://github.com/rust-lang/rust/issues/54341
#[repr(align(16))]
struct BigInt(i128);

fn print_size<T>(label: &str) {
    println!("{} has size of {}", label, size_of::<T>());
}

enum E {
    A(i32),
    B(i16),
    C(BigInt),
}

#[repr(u8)]
enum Small {
    A,
    B,
    C(BigInt),
}

fn main() {
    print_size::<Option<BigInt>>("Option");
    print_size::<E>("E");
    print_size::<Small>("Small");
}

In all three cases the result is 32. This makes sense if you assume that the tag is at the beginning of the representation: the tag plus its padding takes up 16 bytes, so that the BigInt lands on a properly aligned 16-byte boundary. But if the tag were placed at the end instead, the enum could have a size of 17 (16 for the content and 1 for the tag).

From my reading of Type layout - The Rust Reference, if there is no repr attribute on the definition of the enum, then this is an optimization the compiler is allowed to make. Is there any particular reason not to have an optimization like "if the alignment of the data is greater than the alignment of the tag, then put the tag at the end to minimize the size of the enum"?

And even if that optimization is added, what if you want to explicitly specify the type of the discriminant/tag (for example with #[repr(u16)]), but also want the discriminant tag at the end to reduce the size? (That's why I tagged this as "language design" instead of "compiler").

Granted, the reduction in size won't do much in a lot of cases. For example, if the enum is an element of an array, you will still need padding to align each element properly. But in some cases it could help, for example if the enum is a field of a larger data structure.

Types in Rust always have a size which is a multiple of their alignment. Since you are specifying an alignment of 16, the smallest the enum can be is 32 bytes. Most of it is padding which can be utilized by niche optimization, but the padding is necessary to keep the size a multiple of the alignment.
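To make the rounding concrete, here is a minimal sketch. It assumes a hand-written struct, TagAtEnd, that places a one-byte tag after the payload; that struct and the round_up helper are hypothetical names for illustration, not anything the compiler generates. Even with the tag at the end, the 17 bytes of content are rounded up to the next multiple of 16.

```rust
use std::mem::{align_of, size_of};

// Same 16-byte-aligned wrapper as in the original post.
#[allow(dead_code)]
#[derive(Copy, Clone)]
#[repr(align(16))]
struct BigInt(i128);

// Hypothetical hand-written layout with the tag placed after the payload.
// repr(C) keeps the fields in declaration order.
#[allow(dead_code)]
#[repr(C)]
struct TagAtEnd {
    payload: BigInt,
    tag: u8,
}

// Illustrative helper: rounds `size` up to the next multiple of `align`.
fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) / align * align
}

fn main() {
    // The struct inherits the 16-byte alignment of its payload.
    assert_eq!(align_of::<TagAtEnd>(), 16);
    // 16 bytes of payload plus 1 byte of tag is 17 bytes of content,
    // which rounds up to 32.
    assert_eq!(round_up(16 + 1, 16), 32);
    assert_eq!(size_of::<TagAtEnd>(), 32);
    println!("TagAtEnd has size of {}", size_of::<TagAtEnd>());
}
```

So moving the tag only changes where the 15 bytes of padding sit, not the total size.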


Also see Pre-RFC: Allow array stride != size


Minor note: padding typically cannot be used by niche optimization, since uninitialized memory can hold any bit pattern, so there are no never-used bit patterns left that a niche optimization could exploit. The unused values of the discriminant itself, however (which should by default be … usize, maybe? Or does it become a u8?), can be used by niche optimization.
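As a rough illustration of what a niche is, the sketch below compares Option around a type that uses every bit pattern (u8) with Option around a type that has a forbidden value (NonZeroU8). Only the NonZeroU8 case is a documented guarantee; the Padded struct is a hypothetical example, and its Option size is printed rather than asserted because it depends on the compiler.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

// Hypothetical struct with one byte of padding between `a` and `b`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,
    b: u16,
}

fn main() {
    // u8 uses all 256 bit patterns, so Option<u8> needs extra space for a tag.
    assert!(size_of::<Option<u8>>() > size_of::<u8>());

    // NonZeroU8 never holds 0, so None can be stored as 0 with no extra tag.
    assert_eq!(size_of::<Option<NonZeroU8>>(), size_of::<u8>());

    // The padding byte inside Padded may hold any value, so it offers no
    // spare bit pattern; compare the two sizes on your own toolchain.
    println!("Padded has size of {}", size_of::<Padded>());
    println!("Option<Padded> has size of {}", size_of::<Option<Padded>>());
}
```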


Has it been considered to expand the discriminant to take up all "padding," effectively removing padding entirely (for enums) and allowing all the extra bits to be used for niche optimization?


See UCG #174 and Rust #70230.
