Towards even smaller structs

mcy · May 12, 2021, 4:05pm

C has bitfields. Bitfields are great, because writing shifts and other bit operations is a pain and error-prone. There's plenty of fancy macros for emulating bitfields, sure, but it's limited one specific kind of packing within a struct.

I'm interested in how far we can take cramming down fields in a struct on the assumption that you can't take references to them. For example, consider the classic, extremely wasteful struct:

struct Foo {
  a: Option<u32>,
  b: Option<u32>,
  c: Option<u32>,
}

A whole 93 bits of this struct are wasted, because every one of those options expects to be able to be the referent of a reference, and thus need to have identical layouts, so the presence bit gets rounded up to 1. However, if we aggregated all the presence bits into a byte, we save 64 bits! However, this means that you can't get a reference to those Options, because they're discontinuous!

You see this kind of optimization a lot, especially in code that needs to really pack its representations for efficiency, like a regular expressions engine or generated protobuf code. It would certainly be nice to have compiler support, similar to niche-finding, for saying "hey, I'm never going to need a reference to this field, can you please lay it out in some crazy way?

Here's a strawman to illustrate what I want. Suppose we add a #[flat] annotation that you can put on a field:

struct Foo(u8, u16);
struct Bar(#[flat] Foo, u8);

Under normal circumstances, Bar is going to take up three u16s, because Foo needs to be padded out, even if Rust decided to order the u8 after the u16. However, because we've requested that Foo be flattened, the definition is instead treated as if it were

struct ActualBar(u8, u16, u8);

We save two bytes, at the cost that &bar.0 is now a hard error. The field bar.0 instead has Cell semantics. However, &bar.0.1 is valid, because the u16 will be aligned, as would be the case in ActualBar.

The local struct version is easy, but it gets a bit messier when you involve pre-existing types. We want

struct Foo {
  #[flat]
  a: Option<u32>,
  #[flat]
  b: Option<u32>,
  #[flat]
  c: Option<u32>,
}

to be laid out as (u32, u32, u32, u8, padding) or similar. But suddenly, because you can't take references, you can't do simple things like foo.a.is_some(), because that requires a reference to a. For now, I think it's best to just assume you can only flatten Copy types, but it's probably possible to extend to non-Copy types in some cases; the Copy case is probably the most interesting one. You should also be able to write

match &foo.a {
  Some(a) => a, // types at &u32
  None => ...,
  x => /* not allowed! */,
}

but I'm not sure how to best specify this in a way that doesn't feel unnecessarily magical...

Open questions from my perspective:

Should "flatness" be a property of a field, or of a type? For fields, it means that you can make arbitrary existing types flat without incurring the no-references-allowed penalty. For a type, it means we can make it a trait, which could be useful in generic contexts, but it's not clear to me how.
How do we deal with matching on a flat enum field to extract references into it?

Soni · May 12, 2021, 4:37pm

This would tie in nicely with PITs, so you could do something like this:

struct Type(i32, i32, i32);
enum TaggedPartiallyInitializedType {
  None(Type()),
  __C(Type(2)),
  _B_(Type(1)),
  _BC(Type(1, 2)),
  A__(Type(0)),
  A_C(Type(0, 2)),
  AB_(Type(0, 1)),
  ABC(Type(0, 1, 2)),
}

and then carry the PIT around more efficiently in other structs/enums/whatever.

Alternatively, an DropFlagged<T> could be specified to provide the same functionality today (with no support for nesting ofc).

scottmcm · May 12, 2021, 5:11pm

One version of this that's been discussed before is "move-only" fields that would have this constaint.

Because the main constraint on doing layout optimization is reading through &s and particularly writing through &muts, that would give repr(rust) far more freedom to bit-pack (even arithmetic pack, if it wanted) enums together, unify discriminants across fields, misalign fields if helpful, etc, since it could run extra code in both cases.

Move-destructuring such types would also be fine even if the fields were non-Copy.

Flatness of a type seems like it would be a mess for generic code, since I can take references to a T that I put in a wrapper of some sort today. Making that work seems like it'd probably need another opt-out trait (like ?Sized), but historically we've strongly not wanted to have more of those.

mcy · May 12, 2021, 5:27pm

Yeah, I just didn't want to think about that one too hard. =P

This was my thought too. I don't think it's a good idea.

jhpratt · May 12, 2021, 10:00pm

This could effectively be simulated by using byte(s) for the Option<T> fields, with the fields themselves being MaybeUninit<T>. Getters and setters would be necessary, This would be a little tricky with Drop, but I'm pretty sure it's doable. Maybe a proc macro to start?

Edit: Not with a macro, but I've just successfully implemented this behavior. I'm 99% sure it's sound and miri doesn't complain. With regard to drop types, it actually wouldn't be that hard now that I've thought about it.

josh · May 13, 2021, 2:08am

We've talked about "fields you can't take a reference to" before, for exactly this reason. I'd love to see this.

"flat" feels like a surprisingly good name for this. My first reaction was to think it should be something like "noref", but I like the idea of "flatten this into the containing structure so it's preserved but not separated".

Nokel81 · May 13, 2021, 2:38am

If you couldn't take references to it, then how would you call methods on it? Or you you have to extract it out of the struct and then call methods on it?

CAD97 · May 13, 2021, 3:30am

Just like fields in #[repr(packed)] structures, the only legal operations on the flattened field would be to move out (a copy) or to move in, but you also wouldn't be allowed to ptr::addr_of! it.

comex · May 13, 2021, 3:51am

Another option is to say that accessing a flattened field (say, foo.bar) results in an rvalue. Thus, &foo.bar would be legal, but would simply copy the value to the stack and take a reference to that, same as e.g. &123. The same would happen if you called foo.bar.baz(), where baz is some method that requires &self. ptr::addr_of!(foo.bar), on the other hand, would be an error, just as ptr::addr_of!(123) is.

This way you avoid the problem with is_some() and similar.

Note that this mostly only makes sense if the type is Copy. If not, &foo.bar could still be treated as moving foo.bar and taking a reference to the result, but that's kind of surprising.

scottmcm · May 13, 2021, 5:27am

That could work, but probably shouldn't happen for &mut self, so it might be better to not work for &self either.

Then you could just {foo.bar}.baz() to acknowledge that you're "moving" out of it to make an rvalue (which in most cases would probably be a copy because bar's type would impl Copy).

felix.s · May 13, 2021, 10:53am

Yes, I pretty much had a ready-made pre-RFC written down. Thought my draft is different in that it also forbids taking references to all transitive fields. It never occurred to me that someone might want a non-transitive version of this, but optimising out alignment seems like a pretty good use case. Maybe they could be combined into a single RFC?

As for naming: maybe #[repr(inline)]? It’s analogous to inlining in that it incorporates the fields of the other type into your own, allowing the compiler to perform layout optimisations that don’t preserve the inter-type boundary; just like inlining procedure calls incorporates the callee’s body into the caller, allowing the compiler to perform code optimisation that don’t preserve the inter-procedural boundary.

(And yes, I’m kind of going back and forth on whether it should be a #[repr(...)] or an independent attribute.)

bluss · May 13, 2021, 11:34am

A per-field attribute seems like a good idea. One potential complication is the interaction with code generation - foremost the Clone (required for Copy) and Debug derives, since they assume they can take references to fields. I guess Clone does have some special cases where it doesn't care about this, though(!).

matklad · May 13, 2021, 11:49am

A use-case I specifically care about is storing a bunch of flags in a struct:

struct Fn {
  is_async: bool,
  is_const: bool,
  is_default: bool,
  is_unsafe: bool,
}

I'd love to write that as

struct Fn {
  #[pack] is_async: bool,
  #[pack] is_const: bool,
  #[pack] is_default: bool,
  #[pack] is_unsafe: bool,
}

rather that pull or hand-roll bitflags.

Soni · May 13, 2021, 12:56pm

Would something like

use std::dropper::DropFlags;
struct Fn {
  is_async: (),
  is_const: (),
  is_default: (),
  is_unsafe: (),
}
type FnBitfield = DropFlags<Fn>;

where DropFlags is a special type, solve your problem?

matklad · May 13, 2021, 1:09pm

I don't understand how that would work. Maybe I am mistaken, but I don't think a library based solution can be substantially more ergonomic than bitfalgs.

lordan · May 13, 2021, 2:54pm

In C, perhaps. In C++ they complicate the language by a stupefying degree. The amount of space required to detail all the exceptions they require in the C++ standard makes me think compiler implementers would rather prefer they don't exist.

Soni · May 13, 2021, 3:03pm

Similar to how you can have

let x: Fn;
x.is_async = ();

you'd be able to have

let x: DropFlags<Fn> = DropFlags::new();
putinto!(x, is_async, ());

with the main difference being that the latter can be moved around, even in such a partially initialized state. (with takefrom!(x, is_async) -> Option<()> to unset it.)

mcy · May 13, 2021, 5:30pm

I'm not sure that "well, WG21 got it spectacularly wrong" is a reason not to do it. =P We can do far, far better than the C++ committee here, because Rust does not have many of the... exciting features that make bitfields hard in C++.

(I have had to glare at people in the past for getting fresh with volatile uint8_t x : 4; and such.)

I think the key thing here is that you want to be able to get a reference into a flat Option<T>. I want to make it possible to perform all the compression tricks you could think of on enums, while making them mostly behave like regular enums and avoiding any unsafe union tricks.

CAD97 · May 14, 2021, 2:06am

(oh fun I'm about to suggest using ref patterns)

For this post I'm using #[flatten], but it could also be #[repr(flatten)], #[repr(inline)], or whatever else.

Let's consider the following:

struct Option8<T>(
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
    #[flatten] pub Option<T>,
);

We want to be able to optimize it into:

struct Option8<T> {
    init_flags: u8,
    0: MaybeUninit<T>,
    1: MaybeUninit<T>,
    2: MaybeUninit<T>,
    3: MaybeUninit<T>,
    4: MaybeUninit<T>,
    5: MaybeUninit<T>,
    6: MaybeUninit<T>,
    7: MaybeUninit<T>,
}

And ideally we'd be able to use it equivalently to the expanded version, but safely. I believe we can specify this succinctly enough:

Guide Level

A field of a #[repr(Rust)] struct may be marked as #[flatten]. If a field is marked as #[flatten], you are not allowed to take a reference to it, nor to take its address with ptr::addr_of!. The only legal operations, enforced by the compiler, are to 1) move into the field, potentially dropping the value that was already there; 2) move out of the field, potentially performing a copy, otherwise performing a destructive move; or finally 3) pattern match the field, so long as the pattern match does not require taking a reference to the field.

To illustrate:

let option8: Option8<_> = /* ... */; 

match &option8.0 {} // already illegal, takes reference to flattened field

match &option8 {
    Option8(a, ..) => {} // illegal, takes reference to flattened field (via default by-ref binding mode)
    Option8(Some(a), ..) => {} // legal, takes reference to contained value
    Option8(None, ..) => {} // legal, no references taken
    Option8(a @ Some(b), ..) => {} // illegal, takes reference to flattened field
}

match option8 {
    Option8(a, ..) => {} // legal, moves out the field
    Option8(ref a, ..) => {} // illegal, takes reference to flattened field
    Option8(Some(a), ..) => {} // legal, moves out the field
    Option8(Some(ref a), ..) => {} // legal, takes reference to contained value
}

match option8.0 {
    Some(a) => {} // legal, moves out contained value
    Some(ref a) => {} // legal, takes reference to contained value
    a => {} // legal, moves out the field
    ref a => {} // illegal, takes reference to flattened field
}

These restrictions allow the compiler to more aggressively niche values together to produce smaller structure layouts. For example, the compiler would be allowed to (but is not required to) store the eight Option discriminants in Option8 in a singular byte, as there is no requirement to be able to manifest a reference to a full Option<T>.

Implementation Concerns

There are two main concerns for implementation: 1) built-in derive support, and 2) drop glue generation. The former is simple enough: do the same as is done for #[repr(packed)]: make a temporary stack copy and call methods on that. As with #[repr(packed)], more complicated cases will just provide custom trait implementations. The problematic concern is the latter.

For types with a Drop implementation, we need to manifest &mut T in order to call Drop::drop(&mut self). This is identical (again) to #[repr(packed)], and again we take the same solution: move the value out onto the stack. This is somewhat unfortunate, as the we would prefer avoiding the copy of possible, but unavoidable.

For types with drop glue but no Drop implementation, a smarter implementation is possible. The drop glue could be enhanced to comply with the restrictions added by #[flatten] (in all cases, if it doesn't already), and thus avoid the extra copy required to manifest the &mut.

I particularly enjoy how this avoids any interaction with whether a type is Copy or not, beyond type checking of emitted #[derive]d trait implementations. Going into this I was expecting to suggest limiting #[flatten] to Copy types, but I think this specified behavior is clean enough to support all types equivalently.

I do not know if the existing pattern matching machinery is set up in such a way that it could support matching types with flattened fields as I've described here. If not, then it could make sense to implement flattening only within Copy types initially.

canndrew · May 14, 2021, 2:41am

If this were to be added how could it account for sub-byte-sized fields? eg. suppose I want to write a type to represent a TCP header. Using #[repr(C)] and #[flatten] fields I could almost do this. I could make all the flag fields flattened bools for instance, but there still wouldn't be a way to represent the data offset field as a 4-bit integer.

One possible suggestion: It might be nice if we had a uint<N> type for N: usize. These would have next-power-of-two size and alignment normally but would be treated specially by the compiler when used as a #[flatten] field. This probably has a bunch of complications though and might be stretching the weirdness budget .

Edit: it would also be nice to be able to represent the 3 bits of padding as a flattened [bool; 3] rather than as three bool fields. But if flattening is non-transitive then the array would take up three bytes rather than three bits.

Topic		Replies	Views
Bitfields: wanted but hated [Pre-pre-RFC discussion] language design	14	2676	March 25, 2019
Pre-RFC: Allow array stride != size language design	34	3355	January 16, 2023
Cell, references, and struct layout Unsafe Code Guidelines	35	2465	December 22, 2024
Pre-RFC: repr(bitpacked) for automatic bitflags language design	17	1412	April 1, 2024
Could we implement some kind of small struct optimization? compiler	16	2137	July 7, 2021

Towards even smaller structs

Guide Level

Implementation Concerns

Related topics