repr(C) AIX Struct Alignment

On PowerPC AIX, there are special alignment rules for struct layout that Rust's repr(C) attribute does not reproduce. Specifically, Rust's layout of the following struct does not match the one the platform's C compilers use:

#[repr(C)]
#[derive(Copy, Clone)]
pub struct Floats {
    a: f64,
    b: u8, // currently has 7 bytes of padding
    c: f64,
}

The same struct compiled as C by Clang and GCC,

struct Floats {
    double a;
    char b;
    double c;
};

has the following layout

Layout: <ASTRecordLayout
  Size:192
  DataSize:192
  Alignment:32
  PreferredAlignment:64
  FieldOffsets: [0, 64, 96]>
Layout: <CGRecordLayout
  LLVMType:%struct.Floats = type <{ double, i8, [3 x i8], double, [4 x i8] }>
  IsZeroInitializable:1
  BitFields:[
]>
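For comparison, here is a small sketch that measures what rustc currently produces for the repr(C) version. The helper name floats_layout is made up for illustration, and the numbers in the comment assume a host where f64 has 8-byte alignment (x86_64, for instance); this has not been run on AIX.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

#[repr(C)]
#[derive(Copy, Clone)]
pub struct Floats {
    a: f64,
    b: u8,
    c: f64,
}

/// Returns (offset of a, offset of b, offset of c, size, alignment), in bytes.
/// `floats_layout` is a hypothetical helper, not part of any library.
pub fn floats_layout() -> (usize, usize, usize, usize, usize) {
    let v = Floats { a: 0.0, b: 0, c: 0.0 };
    let base = addr_of!(v) as usize;
    (
        addr_of!(v.a) as usize - base,
        addr_of!(v.b) as usize - base,
        addr_of!(v.c) as usize - base,
        size_of::<Floats>(),
        align_of::<Floats>(),
    )
}

fn main() {
    // Assuming f64 is 8-byte aligned: offsets 0, 8, 16, size 24, align 8.
    let (a, b, c, size, align) = floats_layout();
    println!("offsets: a={a} b={b} c={c}, size={size}, align={align}");
}
```

In bits, that is FieldOffsets [0, 64, 128], against [0, 64, 96] in the Clang dump, so field c ends up 4 bytes later in Rust than in C.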

There is already existing infrastructure in rustc_target to customize the target ABI calling convention for extern "C". I'm wondering what would be the best course of action to implement this special layout rule for repr(C)?

How does that rule not cause UB when taking references to fields? A reference to an f64 has to be aligned to 8 bytes, but field c would only be aligned to 4 bytes.

In any case there are some corner cases with Windows MSVC too where repr(C) doesn't match C. Fixing this would break expectations of existing code. Because of this a better name for repr(C) would likely have been repr(linear).

I still think we should fix those, even if that takes an edition or similar.

On AIX, in C, what is alignof(double) (or _Alignof(double) or __alignof__(double) or however your compiler spells it)?

repr(C) serves two purposes:

The C representation is designed for dual purposes. One purpose is for creating types that are interoperable with the C Language. The second purpose is to create types that you can soundly perform operations on that rely on data layout such as reinterpreting values as a different type.

The reference then goes on to document the exact layout algorithm used. Therefore, making repr(C) do something other than that algorithm would be a breaking change. The name C is unfortunate since oddities like the one this thread is about exist, and could perhaps be split into two across an edition, but we cannot just make repr(C) platform-specific now.
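Roughly, the struct layout algorithm the reference describes can be sketched like this. It is a simplified model for illustration only: Field, round_up and repr_c_layout are names invented here, not rustc APIs, and the model ignores unions, enums and zero-sized corner cases.

```rust
/// Size and alignment of one field, in bytes. Hypothetical type for this sketch.
#[derive(Copy, Clone)]
pub struct Field {
    pub size: usize,
    pub align: usize,
}

/// Rounds `value` up to the next multiple of `align` (assumed non-zero).
fn round_up(value: usize, align: usize) -> usize {
    (value + align - 1) / align * align
}

/// Returns (field offsets, struct size, struct alignment), all in bytes.
pub fn repr_c_layout(fields: &[Field]) -> (Vec<usize>, usize, usize) {
    let mut offset = 0;
    let mut struct_align = 1;
    let mut offsets = Vec::with_capacity(fields.len());
    for f in fields {
        // Pad so the field starts at a multiple of its own alignment.
        offset = round_up(offset, f.align);
        offsets.push(offset);
        offset += f.size;
        // The struct is as aligned as its most-aligned field.
        struct_align = struct_align.max(f.align);
    }
    // Tail padding: the size is a multiple of the struct's alignment.
    (offsets, round_up(offset, struct_align), struct_align)
}

fn main() {
    // f64, u8, f64, with sizes and alignments as on a typical 64-bit target.
    let fields = [
        Field { size: 8, align: 8 },
        Field { size: 1, align: 1 },
        Field { size: 8, align: 8 },
    ];
    println!("{:?}", repr_c_layout(&fields));
}
```

Nothing in this loop depends on whether a field is the first member, which is exactly the input the AIX rule needs.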

Perhaps it would be useful to add a repr(platform) (or let's call it repr(bikeshed) for now to avoid that can of worms) that can match both MSVC and AIX on their respective platforms. Though looking at the provided link, AIX has three different options... And it isn't clear to me which one is the default.

IBM Documentation seems to have more info on this:

In aggregates, the first member of this data type is aligned according to its natural alignment value; subsequent members of the aggregate are aligned on 4-byte boundaries.

and

If you are working with aggregates containing double, long long, or long double data types, use the natural mode for highest performance, as each member of the aggregate is aligned according to its natural alignment value.

I don't know if the hardware is OK with unaligned accesses but slower (like x86) or straight up errors (like ARM). Either way, as I understand it, the Rust abstract machine (AM) isn't OK with this.

On AIX, __alignof__(double) is 8, so double has an 8-byte alignment.

How does that rule not cause UB when taking references to fields? A reference to an f64 has to be aligned to 8 bytes, but field c would only be aligned to 4 bytes.

In any case there are some corner cases with Windows MSVC too where repr(C) doesn't match C. Fixing this would break expectations of existing code. Because of this a better name for repr(C) would likely have been repr(linear).

I need to get back on this to get a full explanation why this is allowed.

But I'm more worried about this: supposing we adopt the platform-dependent layout, would this type be valid in Rust if the f64 is not aligned to an 8-byte boundary?

We could lower the alignment of f64 to 4 bytes to make a 4-byte-aligned &f64 valid, but then, for example, a struct containing an f64 as its first field would only be 4-byte aligned, while it needs to be 8-byte aligned. This also causes the size of the struct to be wrong, and thus the offsets for types containing this struct to be wrong.

It would also make bare f64s underaligned, e.g. on the stack.

With unnamed fields -- RFC 2102 and rust#49804 -- we could model it like:

#[repr(C)]
pub struct Floats {
    a: f64,
    #[repr(packed(4))]
    _: struct {
        b: u8,
        c: f64,
    },
}

In fact, there's a quite similar example in the Representation section of the RFC. I suppose the inner packed will also prevent Rust references, to avoid that alignment problem.
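Until unnamed fields exist, something similar can be approximated on stable Rust with a named inner struct. This is only a sketch: Tail, tail and nested_layout are names invented here, the expected numbers assume f64 has size and alignment 8 on the host, and it has not been checked against an AIX C compiler.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

// Inner struct: field alignment is capped at 4 bytes.
#[repr(C, packed(4))]
#[derive(Copy, Clone)]
pub struct Tail {
    b: u8,
    c: f64, // lands at offset 4 inside Tail instead of 8
}

// Outer struct: the first member keeps its natural alignment.
#[repr(C)]
#[derive(Copy, Clone)]
pub struct Floats {
    a: f64,
    tail: Tail,
}

/// Returns (offset of a, offset of b, offset of c, size, alignment), in bytes.
pub fn nested_layout() -> (usize, usize, usize, usize, usize) {
    let v = Floats { a: 1.0, tail: Tail { b: 2, c: 3.0 } };
    let base = addr_of!(v) as usize;
    (
        addr_of!(v.a) as usize - base,
        // addr_of! yields raw pointers, so no under-aligned reference is made.
        addr_of!(v.tail.b) as usize - base,
        addr_of!(v.tail.c) as usize - base,
        size_of::<Floats>(),
        align_of::<Floats>(),
    )
}

fn main() {
    println!("{:?}", nested_layout());
}
```

Under those assumptions the offsets come out as 0, 8 and 12 bytes with a size of 24, which lines up with FieldOffsets [0, 64, 96] and Size:192 in the Clang dump. The cost is that c has to be reached through the tail field, and &v.tail.c is rejected by the compiler.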

That is fascinating and horrifying. And yeah, we can't make Rust support that without a much larger change, because it requires under-aligned fields.

Very much so. Also look at the mac68k column for some more horror. I don't even know why AIX would have a repr compatible with early 90s Macs (I presume this is about the Motorola 68000 family).

That said, a possible workaround on this platform is to use repr(C, packed) and handle the alignment yourself. I think that should work. It is also quite awful, but it is a tier 3 target, so ymmv.
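A minimal sketch of that workaround, mirroring the LLVM type from the Clang dump field by field. The names FloatsPacked, _pad0, _pad1 and packed_layout are invented here, and this has not been tested against C code on AIX.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

#[repr(C, packed)]
#[derive(Copy, Clone)]
pub struct FloatsPacked {
    a: f64,
    b: u8,
    _pad0: [u8; 3], // manual padding so that `c` lands at byte 12
    c: f64,
    _pad1: [u8; 4], // manual tail padding so that the size is 24 bytes
}

impl FloatsPacked {
    /// Reads `c` by value; a reference to the field would not compile.
    pub fn c(&self) -> f64 {
        self.c
    }
}

/// Returns (offset of a, offset of b, offset of c, size, alignment), in bytes.
pub fn packed_layout() -> (usize, usize, usize, usize, usize) {
    let v = FloatsPacked { a: 1.0, b: 2, _pad0: [0; 3], c: 3.0, _pad1: [0; 4] };
    let base = addr_of!(v) as usize;
    (
        addr_of!(v.a) as usize - base,
        addr_of!(v.b) as usize - base,
        addr_of!(v.c) as usize - base,
        size_of::<FloatsPacked>(),
        align_of::<FloatsPacked>(),
    )
}

fn main() {
    println!("{:?}", packed_layout());
}
```

The field offsets and size match the dump, but the type's own alignment drops to 1, so wherever it is placed (inside another struct, in an array, on the stack) the alignment C expects has to be arranged by hand. That is the "handle the alignment yourself" part.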

While we now mostly remember the AIM Alliance for turning POWER into PowerPC, and getting Apple to move from 68k to PPC, it also had a goal of gradually unifying IBM and Apple operating systems, via Taligent and the "Pink" OS.

As part of this goal, IBM did their best to enable you to use their compilers on AIX for all your software development needs, assuming that you were either targeting Apple or IBM hardware.
