Pre-RFC: mem::trailing_padding!

Problem statement

It is quite common in Rust to have an array of enums that is iterated over. It is also quite common for some variants of that enum to be much larger than others. Implemented naively, this wastes a lot of space on padding. This can be avoided by defining a full bytecode syntax, but that requires implementing a bunch of encoding and decoding logic, which costs CPU cycles and introduces opportunities for programmer error.

Guide level explanation

Add a trailing_padding! macro to core::mem. This macro would take the name of a type or enum variant and evaluate to an integer literal.

This literal would be the number of trailing "padding" bytes in that enum variant.

Padding bytes can have any value, so this would allow the creation of a crate that implements a data structure that carefully overlays the padding at the end of one enum with the start of another. Because padding can have any value, and because this data structure would be append-only, this would be sound. Combined with #[repr(C)], this would essentially allow easily creating an opcode-prefixed bytecode representation through nothing more than an enum and a derive macro.
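To make the wasted space concrete, here is a minimal sketch (the Op enum and its variants are hypothetical names, not part of the proposal): with #[repr(C, u8)], a small variant occupies only its tag byte but still takes size_of::<Op>() bytes in an array.

```rust
use core::mem::size_of;

// Hypothetical bytecode-style enum; variant sizes differ wildly.
#[repr(C, u8)]
enum Op {
    Nop,                              // only the 1-byte tag is meaningful
    Call { target: u64, argc: u32 },  // forces the whole enum to be large
}

fn main() {
    // Every array element is sized for the largest variant, so a `Nop`
    // drags along size_of::<Op>() - 1 bytes of trailing padding.
    let nop_trailing_padding = size_of::<Op>() - 1;
    println!("size_of::<Op>() = {}", size_of::<Op>());
    println!("Nop trailing padding = {nop_trailing_padding} bytes");
    assert!(nop_trailing_padding > size_of::<Op>() / 2);
}
```

That trailing-padding count is exactly what the proposed trailing_padding!(Op::Nop) would evaluate to for the Nop variant.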

Alternatives

A const function that accepts a mem::Discriminant and returns the number of trailing padding bytes.


Reminds me of https://lang-team.rust-lang.org/frequently-requested-changes.html#size--stride, where I hypothesized something like this that would give a Range<usize> for a type or an instance saying which bytes are sufficient to copy, with the trivial 0..sizeof(T) implementation always being legal if a compiler wants.

Importantly, this change would not affect how Vec or any other pre-existing type functions.

Another alternative would be full layout reflection, or at least saying where all the padding bytes are.

You could almost implement this today in a derive macro by looking at the offsets and sizes of all fields and finding the last occupied byte. The catch is that you can't ask where the enum discriminant is stored (or any other non-field repr complications that might be introduced in the future), but that's well-defined for #[repr(C)] and #[repr(uN)] enums. That would be sufficient to write a prototype of this feature without modifying the compiler.
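A sketch of that prototype approach, assuming a #[repr(C, u8)] enum and the as-if struct layout from RFC 2195 (all names here are hypothetical): the trailing padding of a variant is the enum's size minus the end of the variant's last field.

```rust
use core::mem::{offset_of, size_of};

#[repr(C, u8)]
enum Op {
    Push(u64),
    Pop,
}

// As-if struct for the `Push` variant per RFC 2195: tag first, then fields.
#[repr(C)]
struct PushAsIf {
    tag: u8,
    value: u64,
}

// Trailing padding = enum size - (offset of last field + size of last field).
fn push_trailing_padding() -> usize {
    size_of::<Op>() - (offset_of!(PushAsIf, value) + size_of::<u64>())
}

fn pop_trailing_padding() -> usize {
    // `Pop` has no fields; only the tag byte is occupied.
    size_of::<Op>() - 1
}

fn main() {
    // `Push` is the largest variant, so it has no trailing padding here.
    assert_eq!(push_trailing_padding(), 0);
    println!(
        "pad(Push) = {}, pad(Pop) = {}",
        push_trailing_padding(),
        pop_trailing_padding()
    );
}
```

A derive macro could generate such as-if structs for every variant and compute these numbers without any compiler support.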


If you compact enums like that, won't this ruin alignment? You won't be able to repr(C)-read any except the first one?

Important thing to be careful of: if a tail field is not Freeze (i.e. it is shared-mutable, containing an UnsafeCell), then some of the trailing padding may be writable from safe code, and any calculations need to be careful to consider that.

Whether the padding is valid to write to with unsafe code doesn't matter, since it is still unsound to write through a &T other than via the UnsafeCell API, even when the write happens to be generally valid.[1]

Note that it's possible to use repr(C) and/or repr(uN) to create a bytecode which can load safe enums with ptr::read_unaligned, although it might be somewhat less optimal than a manual bytecode definition. The RFC text defining these layouts: https://rust-lang.github.io/rfcs/2195-really-tagged-unions.html
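A minimal sketch of that loading pattern (the Op enum here is hypothetical): a #[repr(C, u8)] enum round-trips through an arbitrary byte offset with ptr::write_unaligned/ptr::read_unaligned.

```rust
use core::ptr;

#[repr(C, u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Pop,
    Push(u64),
}

// Store `op` at an unaligned offset in a byte buffer, then load it back.
fn roundtrip_unaligned(op: Op) -> Op {
    let mut buf = [0u8; 64];
    unsafe {
        // Offset 1 is misaligned for `Op`; the *_unaligned APIs allow that.
        ptr::write_unaligned(buf.as_mut_ptr().add(1).cast::<Op>(), op);
        ptr::read_unaligned(buf.as_ptr().add(1).cast::<Op>())
    }
}

fn main() {
    assert_eq!(roundtrip_unaligned(Op::Push(42)), Op::Push(42));
    assert_eq!(roundtrip_unaligned(Op::Pop), Op::Pop);
    println!("unaligned round-trip ok");
}
```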

Ideally this would be a function like mem::size_of, but that would require “enum variants as types” in order to be usable as intended for the target problem space.

Or similarly, an “*_of_val” function. The limitation, of course, being that such a function needs an instance of the variant to get the trailing padding, just as you currently need an instance to get a Discriminant value.

There's no real requirement for this to be a macro like there is for offset_of!, so I'd personally prefer a solution which doesn't use a macro in the stdlib. A crate can of course provide a wrapping macro that sticks the fn in a const{} block[2].

Given the definition is as-if defined with struct and union, a proc-macro with access to the enum definition (e.g. a derive) could accomplish the goal with just a little bit of Layout math on said as-if types.

I didn't even think of that possibility, which gives a bit tighter of a bound than just using the size of the variant payload.

You can almost get this today with just offset_of! and Layout of the fields' types, if you know the field name/types (i.e. #[derive]). The macro auto(de)ref specialization tricks I'm conceptualizing to get a nice ergonomic behavior are honestly kinda frightening.

Technically, that is all entirely private unstable impl details, so it doesn't matter whether they change their impl strategy or not. They should do whatever has the most consistent/predictable “zero overhead” performance characteristics.

read_unaligned or repr(packed) solve that (although I don't think we actually define a repr(C, packed) for enums), or if the variants are sufficiently different in size (given the desire to overlap, they likely are), it's possible to align the next instance properly while still starting in the padding of the prior.
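The "start in the padding of the prior element" arithmetic can be sketched like this (the numbers are purely illustrative): round the end of the occupied bytes up to the next element's alignment, which may still land before the end of the previous element's padded size.

```rust
// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

fn main() {
    // Illustrative numbers: the previous element's occupied bytes end at 9,
    // but its padded size extends to 16. A next element with alignment 4
    // can start at 12 -- inside the prior element's trailing padding.
    let occupied_end = 9;
    let padded_end = 16;
    let next_start = align_up(occupied_end, 4);
    assert_eq!(next_start, 12);
    assert!(next_start < padded_end);
    println!("next element starts at {next_start}, inside padding {occupied_end}..{padded_end}");
}
```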


  1. This is the stated position of T-opsem, and the used justification for immutable static constant promotion of UnsafeCell-free enum variants even when the full type is not Freeze and potential opsem would make writes valid when not promoted to immutable memory. (And a trivial proof/example: I can retag memory as a Freeze array of bytes and then refcast it back to the type in question, and this should be considered sound if the reachable safe API cannot cause a write to the now immutable memory.) ↩︎

  2. And because of various impl details, doing so can actually be better for compile time, since it reduces the amount of MIR cost to the function. ↩︎


This isn't a problem unless you are

  1. using a “deep, not shallow” definition of trailing padding that looks into field types rather than treating them as opaque except for ordinary size_of, and
  2. looking through UnsafeCell<T> to the padding of T.

I'd say this operation shouldn’t do (2), in exactly the same way looking for niches doesn’t, and possibly shouldn't do (1) either.


And so Vec (and even more so, slices) could not do this, because it is expected to offer O(1) random read-write access which depends on uniformly-sized elements. Collections like BTreeMap could try, but they’d have to expand out to full-size nodes on any get_mut() or iter_mut() (because you can never hand out a &mut reference to these padding-overlapped enums without potentially smashing following elements). In general, std doesn’t deal in append-only or “immutable” collections, so there aren’t many opportunities to use this kind of layout in std.

the reason i prefer a macro solution is because of the aforementioned limitation of needing an instance of the enum. i want this to be usable in derive macros, and a function-based solution would struggle, especially if the enum variant contained runtime-only values.

of course, another alternative would be first-class curried enum variants. it's already possible to instantiate values that are a variant that is missing its value, but only with tuple enums:

enum En {
    Um(u8)
}

fn main() {
    let _x = En::Um;
}

That's a function item or so, not a variant value. You can let _y = |b| En::Um(b); too.

offset_of is a macro, so I don't see why this wouldn't be as well.

it's an opaque type that implements Fn. i don't know a ton about the compiler, but theoretically it could implement other traits too.

let _x: En = En::Um; doesn't compile, and the error says it's fn(u8) -> En.

The type error should be saying not fn(u8) -> En but fn(u8) -> En {En::Um}. This is the (not nameable in surface syntax) function item type of the variant constructor. Every fn foo(){} item, and every tuple-struct or tuple-variant constructor, has a unique function item type.[1]
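A small sketch of those properties, reusing the En/Um example from above: the constructor's function item type is zero-sized, and it coerces to a function pointer on demand.

```rust
enum En {
    Um(u8),
}

fn main() {
    // `En::Um` is a value of the unique, zero-sized function item type
    // `fn(u8) -> En {En::Um}`, not a value of `En` itself.
    assert_eq!(core::mem::size_of_val(&En::Um), 0);

    // It coerces to an ordinary (pointer-sized) function pointer on demand.
    let f: fn(u8) -> En = En::Um;
    let En::Um(x) = f(7);
    assert_eq!(x, 7);
}
```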

There is no type-system obstacle to deciding that constructor function item types should implement another trait that gives information about the variant (perhaps via returning mem::Discriminant?).


  1. One reason why such unique types exist is so that they can be zero-sized types, so that when functions are passed to generic code and stored in generic data structures, the function takes up no space and is easily inlinable after monomorphization. Function pointers can also sometimes be bypassed by LLVM optimization of code, but not removed from struct layout. ↩︎
