Pre-pre-pre-... RFC: Explicit padding bytes (`AlwaysUninit<const NBYTES: usize>`)

This is a weird one, so bear with me. It also has many "pre-"s, in that I don't even have a solid proposal for the solution.

This is more a description of an (admittedly obscure) problem, including how the commonly used approaches to solving it aren't ideal, and some discussion of the design space for a better solution. Note that I'm also not certain this needs a fix. I've wanted one several times, but I'm wary of complicating the language in significant ways for it.

Basically, there are times you want to add padding into a struct. This might be between two fields, before a field, after a field, both, etc. I'd like a better way of doing it, since the current approaches have disappointing downsides.

Common Approaches

Right now if you want this (let's assume that you do, for example to align fields in a struct for performance or to avoid false sharing), you have two options:

  1. Use #[repr(align(N))] to control padding, e.g. something like #[repr(align(N))] struct Aligned<T>(pub T); and using that in a field of your struct. This has a couple downsides:

    • It impacts the alignment of the structure as a whole.

      • Given how aligned allocation typically is implemented in allocators, this tends to mean that you double the amount of wasted memory usage, or you get punted to a "slow/complex" path in the allocator impl, which can be bad if you wanted this object to be relatively cheap to allocate.

      • Additionally, if you put this structure into a field of another structure (say, a repr(C) one, for the sake of argument), the alignment will similarly double the amount of wasted memory: you'll have the N - size_of::<PrevField>() bytes of padding to align the start of your structure, in addition to the padding bytes you actually wanted [1].

    • Currently the N in #[repr(align(N))] has to be a literal in the source.

      • It can't, for example, be the result of a size_of::<T>() or align_of::<T>() call, or pulled from a constant, or used from a (const) generic...

      • That said, hopefully this particular limitation will change or have some workaround someday, since there are many other cases where you'd want to compute the alignment at compile time instead of writing a literal.

  2. Use MaybeUninit<[u8; N]>. This has one main downside: most of the time, the compiler will assume it's initialized and copy these padding bytes whenever the struct is copied (in many cases it would take whole-program analysis for it to know that this data is definitely never assigned to).

    • If the type is inside a Box or similar, this is fine, since you probably won't need to be doing copies of it, but this isn't always the case.

    • You can also try a few variants of this where you try to address this cost, but they don't work in practice, for example:

      • Using a MaybeUninit<[(u8, Infallible); N]>, under the assumption that the (u8, Infallible) would behave like a size-1 uninhabited type

        • Sadly, I think rustc probably has to assume that the u8 could be piecewise initialized, which is allowed, and even intended, for MaybeUninit.

        • Yeah, you could make this specific example work by implementing Clone::clone to return Self(MaybeUninit::uninit()), but that only works so long as Copy can't/won't be invoked.

      • Using a MaybeUninit<[T; N]>, with a T that is mostly padding, in an effort to reduce the size of the copy, or something

        • Even if this did switch from memcpy to filling in only every 8th byte (or whatever), I doubt that would actually be any more efficient in practice. (That said, if it did work, there would probably be some related avenues to try out, so I gave it a shot regardless.)
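
To make the size and alignment trade-off between the two approaches concrete, here is a minimal sketch. The names Padded, OuterAligned, OuterPadded and layout_of are made up for illustration, the 64-byte figure is just an example cache-line size, and the numbers assume a target where u32 has 4-byte alignment.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Approach 1: a wrapper that forces 64-byte alignment (the value must be a literal).
#[repr(align(64))]
pub struct Aligned<T>(pub T);

/// Approach 2: explicit trailing padding that is never meant to be read.
/// (Hypothetical example type, not from any library.)
#[repr(C)]
pub struct Padded {
    pub value: u32,
    _pad: MaybeUninit<[u8; 60]>,
}

/// The same one-byte tag followed by each variant, to show the nesting cost.
#[repr(C)]
pub struct OuterAligned {
    pub tag: u8,
    pub inner: Aligned<u32>,
}

#[repr(C)]
pub struct OuterPadded {
    pub tag: u8,
    pub inner: Padded,
}

/// Returns (size, alignment) of a type.
pub fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

Under those assumptions, layout_of::<Aligned<u32>>() is (64, 64) and layout_of::<Padded>() is (64, 4), so both take 64 bytes but only the first raises the alignment. Nested after a one-byte tag, layout_of::<OuterAligned>() is (128, 64) while layout_of::<OuterPadded>() is (68, 4). The catch with Padded is the one described above: the 60 padding bytes still tend to get copied along with the value.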

Design Options

These are just the design options that were obvious to me. There are certainly more.

1. core::mem::AlwaysUninit<const NUM_BYTES: usize>

So, the most obvious design is probably to add this as a new magic type in libcore. I suspect in practice that it's... unlikely [2] that the downsides I described above are significant enough for it to be worth adding something weird and situational like this.

So, ideally, this could be defined by user code.

2. An attribute controlling field padding

Another option might be an attribute that could be applied to fields, which controls padding, something like:

struct Blah {
    #[pad(128)] foo: usize,
    #[pad(64)] bar: Cell<usize>,
}

This looks somewhat nice, but I'm not sure it's actually ideal. Here are some downsides:

  1. Many of the use cases would involve #[repr(C)], but this doesn't really cleanly map to any construct in C or C++ (including extensions). That's not a dealbreaker, but it's a bit... Odd.

  2. There's no great way to add end-of-struct trailing padding, besides requiring it be a #[repr(C)] struct, and having the last field be #[pad(N)] _end: () or similar.

    • You could imagine fixing this by allowing #[pad(leading = N, trailing = M)], but it's getting a bit complicated at this point.

  3. Attributes like this currently don't tend to integrate into the type system well, and so you'll likely have the same downside as #[repr(align(N))], where N must be a literal. Also, there are stability concerns around adding new attributes, since a new attribute name might clash with ones already in use.
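
For comparison, here is roughly what the Blah example would have to look like if written by hand today. This is only a sketch: it assumes #[pad(N)] means "insert N bytes of padding before the field" (the attribute's semantics aren't pinned down above), it assumes #[repr(C)] so that field order is fixed, and the _pad_foo/_pad_bar field names and the field_offsets helper are invented for the example.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr::addr_of;

/// Hand-written stand-in for the hypothetical `#[pad(..)]` version of `Blah`.
#[repr(C)]
pub struct Blah {
    _pad_foo: MaybeUninit<[u8; 128]>, // assumed meaning of #[pad(128)] on `foo`
    pub foo: usize,
    _pad_bar: MaybeUninit<[u8; 64]>, // assumed meaning of #[pad(64)] on `bar`
    pub bar: Cell<usize>,
}

impl Blah {
    pub fn new(foo: usize, bar: usize) -> Self {
        Blah {
            // The padding is left uninitialized and is never read.
            _pad_foo: MaybeUninit::uninit(),
            foo,
            _pad_bar: MaybeUninit::uninit(),
            bar: Cell::new(bar),
        }
    }
}

/// Byte offsets of `foo` and `bar` from the start of the struct.
pub fn field_offsets() -> (usize, usize) {
    let blah = Blah::new(1, 2);
    let base = addr_of!(blah) as usize;
    (
        addr_of!(blah.foo) as usize - base,
        addr_of!(blah.bar) as usize - base,
    )
}
```

On a 64-bit target field_offsets() returns (128, 200). Since this is built on MaybeUninit<[u8; N]>, it inherits that approach's copying cost, and N is still stuck being something the struct author writes out by hand.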

3. Size-1 Uninhabited types?

Perhaps a way this could work would be allowing #[repr(u8)] enum EmptyU8 {} (currently an error).

Then AlwaysUninit<N> could be a wrapper around MaybeUninit<[EmptyU8; N]>. This should be sufficiently uninhabited as to allow Rust to omit the copies (but perhaps I'm wrong about this as well...). This would allow this to be defined in an external crate, rather than needing to place it in the stdlib.

The largest downside here is that I don't know if it would avoid the problem I hit above with (u8, Infallible) [3]...

Thinking more, I suspect it actually wouldn't. For it to work, Rust would have to be allowed to assume that MaybeUninit<T> contents are either:

  • uninitialized (and thus semantically meaningless)
  • a valid instance of T

This seems unlikely to be true (to say the least), and making it true would probably open a big can of typed-memory-shaped worms that would disappoint me... far more.

So yeah, this one is probably not viable, unless I'm missing something.
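
As a side note on why a size-1 uninhabited type would be needed in the first place: the uninhabited types you can write today are zero-sized, so wrapping an array of them gives no padding at all. A small sketch, where FakeAlwaysUninit and fake_padding_size are names invented for this example:

```rust
use std::convert::Infallible;
use std::mem::{size_of, MaybeUninit};

/// Stand-in for the hypothetical `AlwaysUninit<N>`, built only from types
/// that exist today. `Infallible` is an empty enum, so it occupies no space.
pub struct FakeAlwaysUninit<const N: usize>(pub MaybeUninit<[Infallible; N]>);

/// Size in bytes of the stand-in for a given N.
pub fn fake_padding_size<const N: usize>() -> usize {
    size_of::<FakeAlwaysUninit<N>>()
}
```

fake_padding_size::<64>() is 0, so this reserves nothing. That is the gap the (currently rejected) #[repr(u8)] enum EmptyU8 {} was meant to fill.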

Conclusion?

I think option 1 makes sense, but is slightly unpalatable. That said, it seems likely to be the only option of these that would actually work...

I think option 2 is the kind of thing that a proc macro crate should provide, that would build on top of whatever solution is viable here.

Option 3 seems like the best... if it were viable, which I've pretty much convinced myself it isn't.

The implicit option 4 is to do nothing, of course, which in this case would disappoint me, but the downsides are plausibly minor enough that we don't have the stomach for any of these solutions.


Footnotes

  1. I'm aware that this is sometimes actually desirable (even if the author didn't realize it), but that's certainly not always the case.

  2. That said, if others think otherwise, I'd hardly argue about it.

  3. Alternatively, perhaps the fact that (u8, Infallible) didn't work the way I had hoped is just a missed optimization... I'd be surprised by this though, since the "what if piecemeal init" argument is pretty compelling in my mind.

I'm wondering what a repr(u8) enum with a single variant does. Ordinarily, single-variant enums are still zero sized and contain no data (like the type ()), but the repr would make it have size 1.

Edit: Looking at what it currently does in assembly, looks like copying such an enum always writes a 0 to the destination.

It has size 1.

(I don't think you're suggesting it would, but for clarity, it doesn't fix the issue here at all, since it would still be copied)

To clarify: the Rust reference says that repr(u8) on a fieldless enum influences its size and alignment. Furthermore, it can be converted into a u8 with an as cast (with value 0 by default). I don't see why such an enum couldn't still "contain no data" in the sense that every bit pattern (including uninitialized) is valid and copying it becomes a no-op. Note that copying already does not read the enum, but it still writes 0 to the destination.
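
A quick way to check the size and cast behavior described here (the enum and function names are just for illustration):

```rust
use std::mem::size_of;

/// Single-variant enum with an explicit one-byte representation.
#[repr(u8)]
#[derive(Clone, Copy)]
pub enum OneU8 {
    Only,
}

/// Single-variant enum with the default representation.
#[derive(Clone, Copy)]
pub enum OneDefault {
    Only,
}

/// Sizes of the repr(u8) enum and the default-repr enum, in that order.
pub fn sizes() -> (usize, usize) {
    (size_of::<OneU8>(), size_of::<OneDefault>())
}

/// The value produced by casting the only variant to u8.
pub fn discriminant_byte() -> u8 {
    OneU8::Only as u8
}
```

sizes() returns (1, 0) and discriminant_byte() returns 0. This only shows the layout; it says nothing about whether the compiler could treat that one byte as padding.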

Prior art: C++ specifies that empty structs have size 1, and the object representation of such a struct is effectively entirely padding. Explicit padding could be used to define efficient types that are ABI-compatible with empty standard-layout types defined in C++ (which I do frequently, especially in lccc, both in the stdlib and in bindings to its API). For example, a Rust type compatible with lccc::vector, which has a size of 4 pointers with the default allocator but of which only 3 aren't padding, currently only has 3 padding bytes, and could not be copied in 3 pointer-size copies (on x86_64: one SSE copy, one qword copy). Rust would have to waste time copying at least one other byte when the C++ side expects it can ignore those bytes, both for reading and writing. Having the ability to just define types with explicit padding would allow matching the layout of such C++ types without losing the efficiency of such padding.

enum EmptyU8 {} is uninhabited, so adding it to a struct would make that struct also uninhabited (like adding a ! does), so it's definitely not what you want.

(I'm a bit surprised that that's an error -- it might not be obviously useful, but since you can cast an uninhabited enum to u32 I'm not sure why you couldn't say repr(u32) on it to make it 4-bytes wide instead of a ZST. Though you still wouldn't be able to put anything in it.)

Honestly I think you're underestimating option 1. I'll tweak it a bit into an option 5, though: add a new PaddingByte type to libcore. Anyone who wants more padding can use an array of them. And if you want to pad up to a length, you can use a union between something and a [PaddingByte; N]. So I don't think this needs to be a user-accessible attribute.

Doesn't seem like it'd be that bad; it'd need to be special-cased in the layout code, but that's just "if it's PaddingByte then return align one, size one, zero fields", and hopefully everything else would work out.
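
To show the shape of the "union between something and a [PaddingByte; N]" idea, here is a sketch that compiles today. PaddingByte doesn't exist, so MaybeUninit<u8> stands in for it here (which means the compiler still treats those bytes as data worth copying), and PadTo is a made-up name.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

/// Stand-in for the proposed `PaddingByte`; the real type would need
/// compiler support, which this alias does not have.
pub type PaddingByte = MaybeUninit<u8>;

/// Pads `T` up to at least `N` bytes by overlapping it with a byte array.
#[repr(C)]
pub union PadTo<T, const N: usize> {
    value: ManuallyDrop<T>,
    _pad: [PaddingByte; N],
}

impl<T, const N: usize> PadTo<T, N> {
    pub fn new(value: T) -> Self {
        PadTo { value: ManuallyDrop::new(value) }
    }

    pub fn get(&self) -> &T {
        // SAFETY: `new` is the only constructor and always initializes `value`.
        unsafe { &*self.value }
    }
}

impl<T, const N: usize> Drop for PadTo<T, N> {
    fn drop(&mut self) {
        // SAFETY: `value` is always initialized and is dropped exactly once here.
        unsafe { ManuallyDrop::drop(&mut self.value) }
    }
}
```

With this, size_of::<PadTo<u32, 64>>() is 64, and a T larger than N just keeps its own size. The interesting part of the proposal is exactly what this sketch can't show: a real PaddingByte would let the compiler skip the padding when copying.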

It was MaybeUninit<[EmptyU8; N]>, not directly EmptyU8. The idea would be that the compiler would know that if the MaybeUninit were initialized, it would have to have been initialized as an [EmptyU8; N], which is uninhabited, and so that would be impossible; therefore, the MaybeUninit must be uninitialized.

But I think the compiler can't actually assume this — those bytes could be initialized to whatever (The type inside the MaybeUninit doesn't matter unless you extract it or take a reference to it).
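
A small illustration of that point, using bool instead of the hypothetical EmptyU8 (the function name is invented for the example): a MaybeUninit<bool> can hold a byte that is not a valid bool, as long as nobody reads it back as a bool.

```rust
use std::mem::MaybeUninit;

/// Stores an arbitrary byte in a `MaybeUninit<bool>` and reads it back as a byte.
pub fn stash_byte_in_bool_slot(byte: u8) -> u8 {
    let mut slot = MaybeUninit::<bool>::uninit();
    // Writing any byte is fine: nothing here asserts that the slot holds a valid `bool`.
    unsafe { slot.as_mut_ptr().cast::<u8>().write(byte) };
    // Read it back as a raw byte, never as a `bool`.
    unsafe { slot.as_ptr().cast::<u8>().read() }
}
```

stash_byte_in_bool_slot(2) returns 2 even though 2 is not a valid bool. By the same reasoning, the compiler can't conclude that a MaybeUninit<[EmptyU8; N]> is untouched just because EmptyU8 has no valid values.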

Oh, good idea. I think this would be very good, and not too weird.