Enum with tag at end

tmccombs · January 2, 2023, 9:51am

Consider the following code (also on the playground):

#![allow(dead_code)]

use std::mem::size_of;

#[derive(Copy, Clone)]
// specifically ensure an alignment of at least 16 bytes, because of https://github.com/rust-lang/rust/issues/54341
#[repr(align(16))]
struct BigInt(i128);

fn print_size<T>(label: &str) {
    println!("{} has size of {}", label, size_of::<T>());
}

enum E {
    A(i32),
    B(i16),
    C(BigInt),
}

#[repr(u8)]
enum Small {
    A,
    B,
    C(BigInt),
}

fn main() {
    print_size::<Option<BigInt>>("Option");
    print_size::<E>("E");
    print_size::<Small>("Small");
}

In all three cases the result is 32. This makes sense if you assume that the tag is at the beginning of the representation, since you need 16 bytes for the tag and padding to ensure the BigInt is on a proper alignment boundary. But if the tag were instead placed at the end, then it could have a size of 17 instead (16 for the content, and 1 for the tag).

From my reading of the Type layout chapter of The Rust Reference, if there is no repr attribute on the definition of the enum, then this is an optimization the compiler is allowed to make. Is there any particular reason not to have an optimization like "if the alignment of the data is greater than the alignment of the tag, then put the tag at the end to minimize the size of the enum"?

And even if that optimization is added, what if you want to explicitly specify the type of the discriminant/tag (for example with #[repr(u16)]), but also want the tag at the end to reduce the size? (That's why I tagged this as "language design" instead of "compiler").

Granted, the reduction in size won't do much in a lot of cases, for example if the enum is an element of an array, because you will still need padding to align each element properly. But in some cases it could help, for example if the enum is a field of a larger data structure.
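To make the "larger data structure" case concrete, here is a minimal sketch. `Outer` and `Flat` are hypothetical types introduced only for illustration: `Outer` embeds the `Small` enum from above next to one extra byte, while `Flat` is a hand-flattened struct that stores the tag after the payload, roughly what a 17-byte tag-at-end enum might allow.

```rust
use std::mem::size_of;

#[derive(Copy, Clone)]
#[repr(align(16))]
struct BigInt(i128);

// Same shape as `Small` in the code above.
#[allow(dead_code)]
#[repr(u8)]
enum Small {
    A,
    B,
    C(BigInt),
}

// Hypothetical: the enum embedded in a larger struct with one more byte.
#[allow(dead_code)]
struct Outer {
    e: Small,
    extra: u8,
}

// Hypothetical: payload first, then the tag, then the extra byte,
// all flattened by hand into a single struct.
#[allow(dead_code)]
#[repr(C)]
struct Flat {
    payload: BigInt,
    tag: u8,
    extra: u8,
}

fn main() {
    println!("Small has size of {}", size_of::<Small>()); // 32
    println!("Outer has size of {}", size_of::<Outer>()); // 48
    println!("Flat has size of {}", size_of::<Flat>()); // 32
}
```

`Outer` needs 33 bytes of data, which rounds up to 48, while the flattened version fits the same information in 32. That difference is the saving I have in mind.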

CAD97 · January 2, 2023, 10:05am

Types in Rust always have a size which is a multiple of alignment. Since you are specifying an alignment of 16, the smallest the enum can be is 32 bytes. Most of it is padding which can be utilized by niche optimization, but the padding is necessary to keep size as a multiple of alignment.
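As a quick sketch of that rule, the hypothetical `TagAtEnd` struct below (not from the original code) places a one-byte tag after a 16-byte-aligned payload by hand. It still comes out at 32 bytes, because the size is rounded up to the next multiple of the alignment.

```rust
use std::mem::{align_of, size_of};

#[derive(Copy, Clone)]
#[repr(align(16))]
struct BigInt(i128);

// Hypothetical hand-written "tag at the end" layout.
// repr(C) keeps the fields in declaration order.
#[allow(dead_code)]
#[repr(C)]
struct TagAtEnd {
    payload: BigInt,
    tag: u8,
}

fn main() {
    // 17 bytes of data, but the size is rounded up to a multiple of 16.
    println!("align = {}", align_of::<TagAtEnd>()); // 16
    println!("size  = {}", size_of::<TagAtEnd>()); // 32
    assert_eq!(size_of::<TagAtEnd>() % align_of::<TagAtEnd>(), 0);

    // Array elements are laid out `size` bytes apart, so four elements take 4 * 32.
    println!("array = {}", size_of::<[TagAtEnd; 4]>()); // 128
}
```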

SkiFire13 · January 2, 2023, 11:09am

Also see Pre-RFC: Allow array stride != size

steffahn · January 2, 2023, 11:18am

Minor note: padding typically cannot be used by niche optimization. Uninitialized memory can hold any bit pattern, so there are no never-used bit patterns left for a niche optimization to exploit. The unused values of the discriminant itself, however (which should by default be … usize, maybe? Or does it become a u8?), can be used by niche optimization.
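A small sketch of the difference, using a hypothetical `Padded` struct that is not from the original code: a type with a forbidden bit pattern gives `Option` a free niche, while a type whose only spare room is a padding byte does not.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

// Hypothetical struct: 3 bytes of data plus 1 byte of padding (size 4, align 2).
#[allow(dead_code)]
struct Padded {
    a: u16,
    b: u8,
}

fn main() {
    // NonZeroU8 never holds 0, so `None` can be encoded as 0 without a separate tag.
    assert_eq!(size_of::<Option<NonZeroU8>>(), 1);

    // u8 uses all 256 bit patterns, so Option<u8> needs extra space for a tag.
    assert!(size_of::<Option<u8>>() > size_of::<u8>());

    // The padding byte in `Padded` may hold any bit pattern, so it offers no niche
    // and the Option needs extra space here as well.
    assert!(size_of::<Option<Padded>>() > size_of::<Padded>());

    println!("Padded: {}", size_of::<Padded>());
    println!("Option<Padded>: {}", size_of::<Option<Padded>>());
}
```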

mjbshaw · January 3, 2023, 4:50pm

Has it been considered to expand the discriminant to take up all "padding," effectively removing padding entirely (for enums) and allowing all the extra bits to be used for niche optimization?

quinedot · January 3, 2023, 8:03pm

See UCG #174 and Rust #70230.

