Consider the following code (also on the playground):
#![allow(dead_code)]
use std::mem::size_of;
#[derive(Copy, Clone)]
// specifically ensure an alignment of at least 16 bytes, because of https://github.com/rust-lang/rust/issues/54341
#[repr(align(16))]
struct BigInt(i128);
fn print_size<T>(label: &str) {
    println!("{} has size of {}", label, size_of::<T>());
}
enum E {
    A(i32),
    B(i16),
    C(BigInt),
}
#[repr(u8)]
enum Small {
    A,
    B,
    C(BigInt),
}
fn main() {
    print_size::<Option<BigInt>>("Option");
    print_size::<E>("E");
    print_size::<Small>("Small");
}
In all three cases the result is 32. This makes sense if you assume that the tag is at the beginning of the representation, since the tag plus its padding then takes up 16 bytes to ensure the BigInt lands on a proper alignment boundary. But if the tag were instead placed at the end, the enum would only need 17 bytes (16 for the content and 1 for the tag), not counting any trailing padding.
From my reading of Type layout - The Rust Reference, if there is no repr attribute on the definition of the enum, then this is an optimization the compiler is allowed to make. Is there any particular reason not to have an optimization like "if the alignment of the data is greater than the alignment of the tag, then put the tag at the end to minimize the size of the enum"?
And even if that optimization is added, what if you want to explicitly specify the type of the discriminant/tag (for example with #[repr(u16)]), but also want the tag at the end to reduce the size? (That's why I tagged this as "language design" instead of "compiler".)
Granted, the reduction in size won't do much in a lot of cases, for example if the enum is an element of an array, because you will still need padding to align each element properly. But in some cases it could help, for example if the enum is part of a larger data structure.
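To make the padding point concrete, here is a minimal sketch that spells out the two tag placements by hand as #[repr(C)] structs. TagFirst and TagLast are hypothetical stand-ins made up for illustration, not what the compiler actually generates for an enum. Both report a size of 32, because size_of includes trailing padding up to the type's alignment; the 17 bytes are only what the tag-last version occupies before that padding.

```rust
use std::mem::{align_of, size_of};

#[derive(Copy, Clone)]
#[repr(align(16))]
struct BigInt(i128);

// Hypothetical hand-written layout: tag first, then 15 bytes of padding,
// then the 16-byte payload.
#[allow(dead_code)]
#[repr(C)]
struct TagFirst {
    tag: u8,
    payload: BigInt,
}

// Hypothetical hand-written layout: 16-byte payload first, tag at offset 16,
// then 15 bytes of trailing padding to round the size up to the alignment.
#[allow(dead_code)]
#[repr(C)]
struct TagLast {
    payload: BigInt,
    tag: u8,
}

fn main() {
    println!(
        "TagFirst: size {}, align {}",
        size_of::<TagFirst>(),
        align_of::<TagFirst>()
    );
    println!(
        "TagLast:  size {}, align {}",
        size_of::<TagLast>(),
        align_of::<TagLast>()
    );
    // An array of four elements takes 4 * 32 bytes with either placement.
    println!("[TagLast; 4]: size {}", size_of::<[TagLast; 4]>());
}
```

So with the layout rules as they stand, moving the tag to the end only pays off if something else is allowed to live in those 15 trailing bytes, which is the part of the question this sketch cannot answer.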