Towards even smaller structs

For first-pass #[flatten] support, I'm strongly in favor of only supporting it for fields of #[repr(Rust)] types. Extending it to #[repr(C)] and other controlled layouts, as well as nonstandard-width integers, can and should be left to a future extension: it's an additional source of nontrivial complexity, it isn't required for #[flatten] to be meaningfully useful, and it's cleanly sectioned off, so deferring it doesn't impact initial support.

12 Likes

repr(inline) (or repr(flat)) sounds promising, but as written in that RFC, I'm not sure what semantics you're proposing for applying it to a whole struct; does that mean that all the fields are flat, or does it also mean that you can't have references to the whole type either (and it may get flattened into its parent struct)? I'd expect the former.

Other than that ambiguity, and naming, that RFC looks 98% ready to me.

I think I'd also drop the suggestion that "bit shifting and masking" is all that might be required, because we might sometimes add/subtract an offset rather than just shifting bits.
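For example (an illustration of my own, not wording from the RFC), a field whose valid values are 3..=10 could be stored as value - 3 in three bits, with reads adding the offset back:

// Made-up helpers showing an offset-based transformation rather than a
// pure shift-and-mask one: values 3..=10 become 0..=7, which fit in 3 bits.
fn store_in_3_bits(value: u8) -> u8 {
    debug_assert!((3..=10).contains(&value));
    value - 3
}

fn load_from_3_bits(bits: u8) -> u8 {
    debug_assert!(bits <= 7);
    bits + 3
}

fn main() {
    assert_eq!(load_from_3_bits(store_in_3_bits(9)), 9);
}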

Would you be up for updating that RFC, giving some examples both of applying it to fields and to a whole struct, and submitting it?

Complete agreement there. The attribute gives the compiler control over the layout, so it's incompatible with something like repr(C).

2 Likes

I am opposed to any "flattening" that would apply to the struct as a whole and is not field-by-field. At the absolute minimum, flattening should take into account that some types have niche values, negating the need for an additional bit of storage.
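That niche behaviour is easy to check today; for example, Option<NonZeroU8> already stores None for free, while Option<u8> pays a whole extra byte:

use core::mem::size_of;
use core::num::NonZeroU8;

fn main() {
    // 0 is a niche of `NonZeroU8`, so `None` costs nothing extra.
    assert_eq!(size_of::<Option<NonZeroU8>>(), 1);
    // `u8` has no niche, so the discriminant needs its own byte.
    assert_eq!(size_of::<Option<u8>>(), 2);
}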

1 Like

This is kind of an odd restriction, but would it make sense to say that flattening is recursive, but only for #[repr(Rust)] types with exclusively pub fields?

So basically, this should be able to fit in two bytes:

#[compact]
struct S {
    pub pair: (bool, u8),
    pub bit: bool,
}

This would not be compressible without recursively tagging it as compact, because otherwise you could take a reference such as &mut s.pair.0.

Personally, I value the ability to write and use Option8<T> over a recursive compacting. At a minimum, recursive #[compact] makes it unusable for types with non-pub fields, as there's no way to get at them.

Now what is valuable is for a #[compact]ed type to be transparent to a #[compact]ed parent type, so that any remaining niche space can be filled in.

I just think that being able to retroactively #[compact] someone else's type is less valuable (and prone to problems) than a nonrecursive #[compact] that allows Option8.

Indeed the former. But as I was extending the draft to cover (as I named it) #[repr(inline)] as well, it occurred to me that the shorthand might cause this very confusion. For a moment I considered removing it, but then I remembered it’s probably better kept, as I intend this feature to also cover stuffing enum discriminants in pointer alignment bits; having to annotate each individual reference in the enum would be quite burdensome.

It’s supposed to be an implementation detail anyway. I might add what other transformations are also possible; but I won’t be removing it entirely, since I think it’s useful to explore the various optimisation possibilities in order to assess what this feature actually affords us.

struct Foo {
    #[repr(inline)]
    a: Option<bool>,
    #[repr(inline)]
    b: Option<bool>,
}

If it has to remain possible to borrow the interior of either a or b, then they cannot share a memory address. Transitive compactness allows you to squeeze that structure down to a single byte.
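For comparison, that single-byte layout can be written by hand today; a sketch with an encoding of my own choosing (each Option<bool> has three states, so two fields need only 3 × 3 = 9 of the 256 available values):

// Hand-rolled stand-in for the flattened `Foo` above.
struct PackedFoo(u8);

fn encode(v: Option<bool>) -> u8 {
    match v {
        None => 0,
        Some(false) => 1,
        Some(true) => 2,
    }
}

fn decode(bits: u8) -> Option<bool> {
    match bits {
        0 => None,
        1 => Some(false),
        _ => Some(true),
    }
}

impl PackedFoo {
    fn new(a: Option<bool>, b: Option<bool>) -> Self {
        PackedFoo(encode(a) * 3 + encode(b))
    }
    fn a(&self) -> Option<bool> {
        decode(self.0 / 3)
    }
    fn b(&self) -> Option<bool> {
        decode(self.0 % 3)
    }
}

Note that the accessors hand back copies rather than references, which is exactly the restriction under discussion.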

Another possible use case is when you have a data structure for which you need different layouts during actual use versus in transit/longish-term storage. The latter can be annotated #[repr(compact)] for tighter bit packing, while the former will have no ABI-packing attributes for faster access.

Yet another reason is probably best illustrated by an example. Suppose you have this structure:

struct A(u32);

struct B(A);

struct C(#[repr(compact)] B);

Now suppose the definition of A changes to:

#[repr(align(16))]
struct A(u32);

This will change the ABI of B (since B incorporates A), but not necessarily that of C. This might become advantageous if we ever develop a system which remembers the ABI of a crate when compiling in order to maintain compatibility with that ABI in future versions of the crate. (A sort of automated generalised symbol versioning, if you will.)

Instead of adding new attributes, how about creating a new (language-level) special wrapper Packable<T>? You would not be able to get &T out of it, but the compiler would be free to modify its layout depending on the structure of the types in which it is used. Since a packable type may require tracking of a parent type, you would not be able to have it in an owned form. To work with it, there would be methods like store(&mut self, val: T), replace(&mut self, val: T) -> T, clone(&self) -> T where T: Clone, etc. It would make such types a bit tedious to work with, but conversion points would be explicit.
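As a rough, non-magical approximation of that API surface (an ordinary wrapper, so unlike the real proposal it cannot actually change the layout or forbid ownership):

// Sketch only: the proposed `Packable<T>` would be compiler magic with a
// parent-dependent layout; this wrapper merely mirrors the method set.
pub struct Packable<T>(T);

impl<T> Packable<T> {
    pub fn store(&mut self, val: T) {
        self.0 = val;
    }

    pub fn replace(&mut self, val: T) -> T {
        core::mem::replace(&mut self.0, val)
    }

    pub fn clone(&self) -> T
    where
        T: Clone,
    {
        self.0.clone()
    }
}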

I guess it should be possible to pass around &Packable<T> and &mut Packable<T>; the compiler would replace those with a pointer to the parent value and generate code accordingly (although it would mean that functions which accept such types would behave as a sort of generic argument, so I am not sure if it's worth the trouble).

To make types with packable fields a bit more convenient, it could be worth allowing them to behave as T in destructuring and pattern-matching scenarios. One of the drawbacks is that this would create a bunch of implicit conversion points. For example:

struct Foo {
    a: Packable<Option<u32>>,
    b: Packable<Option<u32>>,
}

// initialization can be done using `T` with implicit conversion
let f: Foo = Foo { a: Some(1), b: None };
match f {
    // this match arm will perform an implicit conversion,
    // so `b` will have type `Option<u32>`
    Foo { a: Some(a), b } => { ... }
    _ => { ... }
}

let f: Foo = ...;
// `a` and `b` will be of type `Option<u32>`
let Foo { a, b } = f;

But matching references will be a bit more restricted:

let f: Foo = ...;
match &f {
    // works
    Foo { a: None, b: Some(1) } => { ... }
    // could work if are to allow passing around `&Packable<T>`,
    // otherwise will cause a compilation error
    Foo { a, b: None } => { ... }
    _ => { ... },
}

The main issue is the interaction between several &mut Packable<T> which track the same struct. They would effectively break the aliasing guarantees provided by &mut, since such references would be able to read and modify the same region of memory (well, of course they would operate on separate bits, but our computers do not work with bit granularity). So we would probably have to restrict creation of mutable references to packable fields to only one per parent value.
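To make the aliasing problem concrete with a hand-rolled example: once two logical fields share one byte, both setters need exclusive access to that byte, so the usual split-borrowing of distinct fields is no longer possible:

struct Packed {
    // both "a" and "b" live in this single byte
    bits: u8,
}

impl Packed {
    fn set_a(&mut self, v: bool) {
        self.bits = (self.bits & !0b01) | (v as u8);
    }
    fn set_b(&mut self, v: bool) {
        self.bits = (self.bits & !0b10) | ((v as u8) << 1);
    }
}

fn main() {
    let mut p = Packed { bits: 0 };
    p.set_a(true); // fine one at a time
    p.set_b(false);
    // But there is no way to hand out two simultaneous `&mut`s, one for
    // "a" and one for "b": both would alias the same `bits` byte.
}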

2 Likes

Actually what I was referring to was Option<NonZeroU8> alongside other fields. Said type is fully capable of storing the None value without increasing size, and as such should not have its discriminant extracted to the bitflag field.

Option<NonZeroUN> is already perfectly niche filled, so I would expect #[compact] for a field of a niche-optimal type wouldn't rearrange it, ever. Nominally the compiler would be allowed to, but it would have no reason to.

The problem is that compact isn't really a property of the type of the field; it's a property of the containing type. And the whole point is that you can't take a reference to it, and you can't compose over Packable<T>. In general I agree that things should be proper types, but compact clearly isn't a type. It's less magic to make it a property of the containing type than to have a special not-quite-a-normal-type type for the fields.

I see the problem statement here. It definitely is desirable to be able to pack this into a single byte. I think my answer would be that, for generic types, you can write #[compact] Option<bool>, which will monomorphize the type #[compact]ly rather than with the standard layout. This makes some amount of sense for generic types specifically, since you're monomorphizing them anyway.

I don't know, though. I'm at my logic's end for figuring this out. I'll just be over here on the side to bounce ideas off of and to remind people to keep Option8 useful.

2 Likes
enum OurBoolean {
    LolNoGenerics(bool),
    FileNotFound,
}

struct Foo {
    #[repr(inline)]
    a: OurBoolean,
    #[repr(inline)]
    b: OurBoolean,
}

An advantage of Packable<T> is that you will be able to take "references" to such fields, which in some cases may be useful (e.g. if you decompose your code into separate functions working with different fields). Plus explicit methods for changing such fields will be more visible and probably less surprising than various restrictions around #[repr(inline)] fields. But I agree that such wrapper feels quite magical.

Why should the Packable go on the fields vs the whole struct?

What's the advantage of

struct Foo {
  a: Packable<...>,
  b: Packable<...>,
}

vs

struct Foo {
  a: ...,
  b: ...,
}
// ...
let x: Packable<Foo> = ...;

?

(or OptionalFields<Foo> maybe)

Because we may want to leave some fields as-is, e.g.:

struct Foo {
    a: Packable<Option<u32>>,
    b: Packable<Option<u32>>,
    c: Option<u32>,
}
1 Like

Is the goal here size or speed? We might need a way for the user to say "don't allow references to these fields, so that the struct is as small as possible" versus "don't allow references to these fields, so that the compiler is free to pack fields if it thinks it would be faster".

Imo size. But when checking with a real-world benchmark on an existing struct I have with 13 niche-free optional values, performance was nearly identical (around a 1% performance gain). This was while shrinking the size of the struct from 52 bytes to 32. So while I believe that size is the reason for a change like this, I have empirical evidence that indicates performance will not suffer (and may actually improve).

Just split it like you would with enums:

enum FooAB {
  None,
  A(u32),
  B(u32),
  AB(u32, u32),
}
struct Foo {
  ab: FooAB,
  c: Option<u32>,
}

vs

struct FooAB {
  a: u32,
  b: u32,
}
struct Foo {
  ab: OptionalFields<FooAB>,
  c: Option<u32>,
}

This isn't gonna add any overhead or anything.

My thought here was to have something like #[flatten(recursive)] that recursively flattens the struct, as if all of its fields were also marked #[flatten(recursive)]. This is usually not what you want, which I think is pretty clear at this point.

I think the way we can make this work is by adding a PackedArray<T, N> type that behaves as if each array element is a flattened field, and then define

struct Bits<const N: usize>(#[flatten] PackedArray<bool, N>);

This struct would always take up ceil(N/8) bytes, unless flattened, in which case it only takes up N bits. You would not be able to get references into the array, and would need to copy out ranges of bits rounded up to bytes or whatever.

There are probably questions about ensuring that PackedArray remains contiguous...
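For what it's worth, the standalone ceil(N/8)-byte layout is already writable on stable Rust; here is a sketch for a fixed N = 12 (making the byte count generic over a const N currently needs generic_const_exprs). The flattened form, occupying exactly 12 bits inside a parent, is the part that needs language support.

// ceil(12 / 8) = 2 bytes when the bits cannot be merged into a parent.
struct Bits12([u8; 2]);

impl Bits12 {
    fn get(&self, i: usize) -> bool {
        assert!(i < 12);
        self.0[i / 8] & (1 << (i % 8)) != 0
    }

    fn set(&mut self, i: usize, v: bool) {
        assert!(i < 12);
        if v {
            self.0[i / 8] |= 1 << (i % 8);
        } else {
            self.0[i / 8] &= !(1 << (i % 8));
        }
    }
}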

1 Like

I think recursive packing only makes sense for #[packed] fields whose type itself contains #[packed] fields.

This also makes it easy to see and understand which fields can be referenced and which can't be.

struct Foo {
    // LEGAL: &self.n
    n: u8,
    // ILLEGAL: &self.a, &self.b
    #[packed] a: bool,
    #[packed] b: bool,
}

struct Bar {
    // LEGAL: &self.quux, &self.quux.n
    quux: Foo,
   
    // LEGAL: &self.foo.n
    // ILLEGAL: &self.foo.a, &self.foo.b, &self.c, &self.d
    #[packed] foo: Foo,
    #[packed] c: bool,
    #[packed] d: bool,
}

(equivalent to something like vvvv)

struct actual_Foo {
    n: u8,
    a_b: u8,
}

struct packed_Foo {
    n: u8,
}

struct packed_Bar {
    quux: actual_Foo,

    foo: packed_Foo,
    foo_a_foo_b_c_d: u8,
}

If you can shrink a structure while not making it any slower, you'll gain performance in other areas. Every page not used for your program's data is a page that could be used for caching or similar.

5 Likes

It wouldn't be that special. All you need is to say that the out-of-range values are a niche of the type. For example, uint<6> would be represented as u8, but values 64 through 255 are a niche. Then at least from a theoretical perspective, the compiler being allowed to optimize this into a bitfield is a natural result of the representation being unspecified (combined with the inability to take references). Even from a concrete implementation perspective, the compiler could look at niches when deciding whether to optimize into a bitfield, instead of literally special-casing uint<N>.

Still, the use of a bit width would be confusing, as users might expect that to always result in a bitfield. But that's easily solved: instead of defining the type in terms of bits, define a more generic "integer in this range" type. For example, uint<0, 42> would represent an integer from 0 to 42. It would have niches from 43 to 255, and when combined with #[flat] would typically be represented in 6 bits. This kind of type would be useful for high-level tasks, completely independently of its use for bitfields.
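The range-checking half of that is already expressible as a newtype on stable Rust; what's missing is any way to tell the compiler that 43..=255 is free for niches. A sketch (the name is mine):

// Checked "integer in MIN..=MAX". Unlike the proposed uint<MIN, MAX>,
// the compiler cannot see that the out-of-range values are unused.
#[derive(Clone, Copy, Debug)]
struct Ranged<const MIN: u8, const MAX: u8>(u8);

impl<const MIN: u8, const MAX: u8> Ranged<MIN, MAX> {
    fn new(value: u8) -> Option<Self> {
        (MIN..=MAX).contains(&value).then_some(Self(value))
    }

    fn get(self) -> u8 {
        self.0
    }
}

fn main() {
    let x: Ranged<0, 42> = Ranged::new(17).unwrap();
    assert_eq!(x.get(), 17);
    assert!(Ranged::<0, 42>::new(43).is_none());
}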

5 Likes

One way to create a u4 type today is with an enum:

#[repr(u8)]
enum U4 {
    N0 = 0,
    N1 = 1,
    N2 = 2,
    ...
    N15 = 15,
}

This is too cumbersome for larger types (e.g. u15) though.

@comex 's idea of creating numeric types with a lower and upper bound would be really nice!