[Idea] Explicit layout structure definitions


#1

I’m currently working on a device driver (written in C, targeting multiple OSs – wish I could use Rust :), but I digress). One recurring pattern is mapping memory layouts imposed by a device onto structures accessible from the programming language.

A common method for this is packed structs with padding elements added. However repr(packed) seems to be broken at the moment. More importantly, for interfacing with explicit, preexisting memory layouts it’s actually not the packing we’re after, but the ability to explicitly specify the offset of each structure element.

Also sometimes one may want to overlap certain elements (because there is weird hardware out there which overloads address mappings and sometimes does crazy things).

Furthermore, hardware often demands that certain fixed values / bit patterns appear at specific locations.

And last but not least, hardware registers often consist of several values packed into a single bitfield. It would be great if that could be represented as well.

I thus propose a way to explicitly specify the layout of structures. There are three aspects that should be controllable:

  • the size of the struct
  • the offset of each element down to the bit offset
  • and often also fixed bit patterns

So my idea on how to tackle this is to introduce syntax for explicitly specifying offsets, bit shifts and masks inside a struct. Something along the lines of this:

struct foo_registers @(FOO_SIZE) {
    a: i32 @(0x00); // a plain 32 bit register at offset 0
    b: u32 @(0x04), <<(5), &(0xfff); // a value with mask 0xfff (12 bits) located at offset 0x04 + 5 bits
    c: i32 @(0x06); // ←note the overlap of c with b here
    0xcafed0de: u32 @(0x20); // the fixed pattern of value `0xcafed0de` that must appear at offset 0x20
    m: u16 @(0x28), &(0xf00f); // a 16 bit value where certain bits are masked
    0x0a50: u16 @(0x28), &(0x0ff0); // fixed pattern that shall appear in the bits of that word not covered by m's mask
};
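For comparison, here is a rough sketch of how the byte-level part of such a layout can be approximated in current Rust: a #[repr(C)] struct with explicit reserved fields, plus compile-time assertions that pin every offset. The type name FooRegisters, the field names, and the total size of 0x2c are assumptions made up for illustration; the bit-level fields (b, m) and the overlapping c are not modelled here. It assumes Rust 1.77 or later for offset_of!.

```rust
use core::mem::{offset_of, size_of};

// Hypothetical register block; names, offsets and size are illustrative only.
#[repr(C)]
pub struct FooRegisters {
    pub a: i32,              // offset 0x00
    pub b: u32,              // offset 0x04
    _reserved0: [u8; 0x18],  // 0x08..0x20, unused by this hardware revision
    pub magic: u32,          // offset 0x20, expected to hold a fixed pattern
    _reserved1: [u8; 0x04],  // 0x24..0x28
    pub m: u16,              // offset 0x28
    _reserved2: [u8; 0x02],  // 0x2a..0x2c
}

// If a field is added or a padding array is miscounted, compilation fails
// here instead of the driver silently talking to the wrong address.
const _: () = assert!(offset_of!(FooRegisters, a) == 0x00);
const _: () = assert!(offset_of!(FooRegisters, b) == 0x04);
const _: () = assert!(offset_of!(FooRegisters, magic) == 0x20);
const _: () = assert!(offset_of!(FooRegisters, m) == 0x28);
const _: () = assert!(size_of::<FooRegisters>() == 0x2c);
```

The padding arrays still have to be recalculated by hand whenever a field is added, which is the maintenance burden explicit offsets would remove; the assertions only make a miscalculation a compile error.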

Of course the syntax should be chosen so that it is easy to parse but also programmer friendly. Also, element overlapping could perhaps require a “yes, I am aware of the overlap” syntax, so that it doesn’t happen accidentally (and the compiler can warn about it).

Any thoughts on that?


#2

Any references about this, did you try #[repr(C, packed)]?

This seems like too specialized a syntax for a specialized task. Why not write a macro that generates a bunch of inline functions to access the relevant registers, like:

define_hardware_registers!(Foo {
  a: i32 @ (0x0),
});
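A minimal sketch of what such a macro could look like with macro_rules!, assuming the invocation syntax above. The macro body, the base-pointer wrapper and the volatile reads are illustrative assumptions, not an existing crate. Only getters are generated, because building a name like set_a needs identifier concatenation, which plain macro_rules! does not provide.

```rust
// Hypothetical macro: generates one inline volatile getter per field.
// The field type is matched as an identifier (e.g. `i32`) so that it can be
// followed by `@` in the invocation.
macro_rules! define_hardware_registers {
    ($name:ident { $( $field:ident : $ty:ident @ ( $offset:expr ) ),* $(,)? }) => {
        pub struct $name {
            base: *mut u8,
        }

        impl $name {
            // Safety: `base` must point to a mapped register block that is
            // large enough and suitably aligned for every declared field.
            pub unsafe fn new(base: *mut u8) -> Self {
                Self { base }
            }

            $(
                #[inline]
                pub fn $field(&self) -> $ty {
                    // Volatile read so the compiler does not cache or elide it.
                    unsafe {
                        core::ptr::read_volatile(self.base.add($offset) as *const $ty)
                    }
                }
            )*
        }
    };
}

define_hardware_registers!(Foo {
    a: i32 @ (0x0),
    b: i32 @ (0x4),
});
```

With this, regs.a() reads the 32-bit value at offset 0x0 from the base pointer and regs.b() the one at offset 0x4.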

#3

IIRC, @japaric implemented something like this as a procedural macro to emulate hardware registers.

A macro can accept some DSL for layouts and define getter/setter methods for fields, using shifting and masking internally.
The DSL may need to support plenty of stuff that you don’t necessarily want in the core language - various bit position formats (start+end vs start+length), read/write only fields, reserved fields, unnamed padding, endianness, various static assertions to make sure fields cover all the storage, or dynamic assertions like “you can write to this field, but only the value that was previously read from it”, maybe truncation/extension policies.
And procedural macros should be able to do all of this perfectly!
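To make the shifting and masking concrete, here is a small hand-written sketch of the kind of accessor such a macro might expand to. The BitField type and the constant B are hypothetical names; the shift of 5 and width of 12 are borrowed from field b in the first post. It assumes 1 <= width and shift + width <= 32.

```rust
// Hypothetical descriptor for a field inside a 32-bit register.
// Assumes 1 <= width and shift + width <= 32.
#[derive(Clone, Copy)]
pub struct BitField {
    pub shift: u32,
    pub width: u32,
}

impl BitField {
    // Mask covering the field's bits at their position in the register.
    pub const fn mask(self) -> u32 {
        (u32::MAX >> (32 - self.width)) << self.shift
    }

    // Extract the field from a raw register value.
    pub const fn get(self, reg: u32) -> u32 {
        (reg & self.mask()) >> self.shift
    }

    // Return `reg` with the field replaced by `value`.
    // Bits of `value` that do not fit into the field are silently dropped.
    pub const fn set(self, reg: u32, value: u32) -> u32 {
        (reg & !self.mask()) | ((value << self.shift) & self.mask())
    }
}

// A 12-bit field starting at bit 5.
pub const B: BitField = BitField { shift: 5, width: 12 };
```

Policies such as read-only fields or a different truncation behaviour would then be decisions made in the macro, not in the language.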


#4

However repr(packed) seems to be broken at the moment.

Any references about this, did you try #[repr(C, packed)]?

See the link at the bottom of https://doc.rust-lang.org/nomicon/other-reprs.html#reprpacked.

(…) bunch of inline function (…)

Ha, that’s actually what I do in driver code that I implement in C. Packed structs are just too finicky to work with in my experience; also, adding a new element for a new hardware version, which may sit in previously “unused” reserved space, means recalculating all those paddings. Hence the explicit offsets.


#5

Explicit memory layout is something that I miss from a lot of languages. IMHO it’s something that should really be core. Most notably, all that masking and shifting may lead to undesired masking of values, and I think that in a safe language it’s crucial that the compiler is able to detect these things and warn about them.

Of course if this can be done with a DSL, that’s fine by me. But to me anything concerning explicit memory layout reeks of something that belongs with type infrastructure.
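As an illustration of the kind of check meant here, a sketch of a setter that refuses a value too wide for its field instead of masking it off silently. The names try_insert and ValueTooWide are hypothetical, and this is a run-time check; a compiler or macro that knows the layout could reject constant arguments at compile time. It assumes 1 <= width and shift + width <= 32.

```rust
// Hypothetical error type reporting a value that does not fit its field.
#[derive(Debug, PartialEq, Eq)]
pub struct ValueTooWide {
    pub value: u32,
    pub width: u32,
}

// Insert `value` into the `width`-bit field at bit `shift` of `reg`.
// Assumes 1 <= width and shift + width <= 32.
pub fn try_insert(reg: u32, shift: u32, width: u32, value: u32) -> Result<u32, ValueTooWide> {
    // Largest value representable in `width` bits.
    let field_max = u32::MAX >> (32 - width);
    if value > field_max {
        // Report the problem instead of dropping the upper bits.
        return Err(ValueTooWide { value, width });
    }
    let mask = field_max << shift;
    Ok((reg & !mask) | (value << shift))
}
```

For a 12-bit field, 0xfff is accepted while 0x1000 comes back as an error rather than being written as 0.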


#6

The linked issue isn’t a bug in the implementation, it’s a bug in the unsafe warning system. As far as I know it works exactly as intended. Packing structs is fundamentally unsafe because pointers can’t know they point into something packed and do the target platform’s dance to avoid alignment issues.
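A small sketch of the alignment problem, using a made-up two-field struct. Reading a packed field by value is fine because the compiler knows it is unaligned; taking an ordinary reference to it is the problematic part (current compilers reject &p.value outright), so the pointer route goes through a raw pointer and read_unaligned.

```rust
use core::ptr;

// Hypothetical packed struct: `value` sits at offset 1 and is therefore
// not aligned for a u32.
#[repr(C, packed)]
pub struct Packed {
    pub tag: u8,
    pub value: u32,
}

// Copying the field out by value: the compiler emits an unaligned load.
pub fn read_value(p: &Packed) -> u32 {
    p.value
}

// Going through a pointer: build a raw pointer without creating a
// reference, then read it with an explicitly unaligned load.
pub fn read_value_via_ptr(p: &Packed) -> u32 {
    let field_ptr = ptr::addr_of!(p.value);
    // Safety: `field_ptr` points into a live `Packed`, and `read_unaligned`
    // does not require the pointer to be aligned.
    unsafe { field_ptr.read_unaligned() }
}
```

Once such a pointer is handed to code that expects an aligned *const u32, nothing in the type records that it came from a packed struct.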


#7

Prior art: Ada’s representation clauses.