Summary
Introduce the semicolon as a group separator in #[repr(Rust)] struct definitions.
Groups are stored in the order they are declared, while fields within a group may still be reordered to meet alignment requirements.
Motivation
The default repr(Rust) says nothing about field order, whereas repr(C) guarantees that all fields are stored in the order they are declared.
C-style down-casting needs a guaranteed data layout, but may not need a guarantee as strong as the one repr(C) provides.
Group separators can partially specify the struct layout as described above, sparing Rustaceans from resorting to repr(C).
Guide-level explanation
If struct Base and part of struct Derived are identical in layout, pointers to Base and pointers to Derived may be cast to each other.
Example
The following is an example of a base/derived struct.
// a base struct.
struct Link {
    next: *mut Link,
    prev: *mut Link,
}
// a derived struct.
struct Node<T, W> {
    next: *mut Link,
    prev: *mut Link,
    flag: u8,
    elem: T,
    weight: W,
}
Casting between *mut Link and *mut Node requires the next and prev fields to be stored at the very beginning of both structs, which today forces the use of repr(C).
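For reference, the repr(C) approach that is available today can be sketched as follows. This is a minimal illustration rather than part of the proposal: the helper name as_link and the concrete type arguments u32 and u64 are assumptions made for the example.

```rust
use std::ptr;

// Base struct; repr(C) fixes the field order.
#[repr(C)]
struct Link {
    next: *mut Link,
    prev: *mut Link,
}

// Derived struct; its first two fields mirror `Link` exactly.
#[repr(C)]
struct Node<T, W> {
    next: *mut Link,
    prev: *mut Link,
    flag: u8,
    elem: T,
    weight: W,
}

// Hypothetical helper: view a node through its leading `Link` part.
fn as_link<T, W>(node: *mut Node<T, W>) -> *mut Link {
    node.cast::<Link>()
}

fn main() {
    let mut node = Node {
        next: ptr::null_mut(),
        prev: ptr::null_mut(),
        flag: 1,
        elem: 7u32,
        weight: 9u64,
    };
    let link = as_link(ptr::addr_of_mut!(node));
    // SAFETY: both structs are repr(C) and start with the same two fields,
    // so `link` points at a valid `Link` prefix inside `node`.
    unsafe {
        (*link).next = link;
    }
    // The write made through the `Link` view is visible in the node.
    assert_eq!(node.next, link);
    println!("flag={} elem={} weight={}", node.flag, node.elem, node.weight);
}
```

The cast is only sound because repr(C) pins next and prev to the start of both structs; with the default repr(Rust) the compiler gives no such guarantee.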
With group separators, the structs can be written as:
struct Link {
    next: *mut Link; // a group composed of one field
    prev: *mut Link; // another group composed of one field
}
struct Node<T, W> {
    next: *mut Link; // a group composed of one field
    prev: *mut Link; // another group composed of one field
    flag: u8,
    elem: T,
    weight: W,
}
If groups containing identical field declarations are guaranteed to have the same layout, next and prev can instead share a single group:
struct Link {
    next: *mut Link, // first field in the group
    prev: *mut Link; // second field in the group
}
struct Node<T, W> {
    next: *mut Link, // first field in the group
    prev: *mut Link; // second field in the group
    flag: u8,
    elem: T,
    weight: W,
}
Compiler Errors and Warnings
- Semicolons should not be used when instantiating a struct.
  error: expected one of `,`, `.`, `?`, `}`, or an operator, found `;`
    --> main.rs:13:21
     |
  13 | n = Node{ next:a, prev:b; /* omitted */ };
     |                         ^ expected one of `,`, `.`, `?`, `}`, or an operator
- Only the first group can have ZSTs.
  error: only the first group can have ZSTs.
    --> main.rs:13:21
     |
  13 | struct S { a: u8, b: bool; c: () }
     |                            ^ only the first group can have ZSTs
- Only the last group can have DSTs.
  error: only the last group can have DSTs.
    --> main.rs:13:21
     |
  13 | struct S { a: isize, b: [u8]; c: String }
     |                      ^ only the last group can have DSTs
- Semicolons as group separators are only applicable in #[repr(Rust)].
  warning: layout groups are not applicable to repr(C)
    --> main.rs:13:21
     |
  13 | #[repr(C)] struct S { a: u8, b: bool; c: usize }
     |                                     ^ help: consider using ',' instead
Reference-level explanation
The implementation is straightforward. Rather than sorting all the fields together to meet alignment requirements, the compiler groups fields by semicolons and sorts each group separately. Finally, the groups' layouts are concatenated, and padding may be appended to a group if needed.
The compiler should also guarantee that identical declarations result in identical layouts, so that groups containing more than one field are suitable for pointer casting.
As mentioned previously, ZSTs are allowed only in the first group and DSTs only in the last group; violating either rule produces a compiler error.
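The grouping step can be modelled outside the compiler. The sketch below is an illustration under stated assumptions, not the actual rustc algorithm: the Field type and the layout function are hypothetical, the in-group ordering is simply "largest alignment first" (rustc's real reordering heuristics are unspecified), a 64-bit target is assumed for the sizes, and the ZST/DST rules are not modelled.

```rust
/// Size and alignment of one field, in bytes (hypothetical model type).
#[derive(Clone, Copy)]
struct Field {
    name: &'static str,
    size: usize,
    align: usize,
}

/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Lays out groups in declaration order, reordering only within a group.
/// Returns each field's offset plus the total struct size.
fn layout(groups: &[Vec<Field>]) -> (Vec<(&'static str, usize)>, usize) {
    let mut offsets = Vec::new();
    let mut offset = 0;
    let mut struct_align = 1;
    for group in groups {
        let mut fields = group.clone();
        // Assumed heuristic: largest alignment first. The sort is stable,
        // so identical declarations always yield identical orders.
        fields.sort_by(|a, b| b.align.cmp(&a.align));
        for f in fields {
            offset = align_up(offset, f.align); // padding before the field
            offsets.push((f.name, offset));
            offset += f.size;
            struct_align = struct_align.max(f.align);
        }
    }
    (offsets, align_up(offset, struct_align))
}

fn main() {
    // Node<usize, u8> on an assumed 64-bit target.
    let next = Field { name: "next", size: 8, align: 8 };
    let prev = Field { name: "prev", size: 8, align: 8 };
    let flag = Field { name: "flag", size: 1, align: 1 };
    let elem = Field { name: "elem", size: 8, align: 8 };
    let weight = Field { name: "weight", size: 1, align: 1 };

    // `next; prev; flag, elem, weight` as proposed.
    let grouped = vec![vec![next], vec![prev], vec![flag, elem, weight]];
    // One group per field behaves like repr(C).
    let declared = vec![vec![next], vec![prev], vec![flag], vec![elem], vec![weight]];

    let (offsets, size) = layout(&grouped);
    println!("grouped: {:?}, size {}", offsets, size);
    println!("one group per field: size {}", layout(&declared).1);
}
```

Under these assumptions the grouped declaration comes to 32 bytes, while putting every field in its own group (equivalent to a fixed declaration order) comes to 40 bytes, because flag and weight can no longer share one alignment slot.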
Drawbacks
Programmers who are used to C/C++ may accidentally use semicolons as field separators, which restricts field reordering and may make the struct larger than necessary.
Rationale and alternatives
If resorting to repr(C), the Link/Node structs from the example section can be written in two alternative ways:
1. Simply adding #[repr(C)]:
   #[repr(C)]
   struct Link {
       next: *mut Link,
       prev: *mut Link,
   }
   #[repr(C)]
   struct Node<T, W> {
       next: *mut Link,
       prev: *mut Link,
       flag: u8,
       elem: T,
       weight: W,
   }
2. Using extra struct definition(s) to achieve a similar result:
   #[repr(C)]
   struct Link {
       next: *mut Link,
       prev: *mut Link,
   }
   // it's repr(Rust)
   struct Data<T, W> {
       flag: u8,
       elem: T,
       weight: W,
   }
   #[repr(C)]
   struct Node<T, W> {
       link: Link,
       data: Data<T, W>,
   }
The group separator proposed here has advantages over both:
- Potentially more compact layout than alternative #1. For example, the size of Node<usize,u8> under repr(C) is greater because field reordering is completely disabled.
- More concise than alternative #2. There is no need for attributes, since repr(Rust) is the default, and no need to define struct Data.
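The size claim for Node<usize,u8> can be checked with today's compiler. The sketch below is only an approximation of the comparison: the names NodeC and NodeRust are hypothetical, and a plain repr(Rust) struct stands in for the proposed grouped layout, which cannot be compiled yet. The repr(Rust) size is not guaranteed by the language and may differ between compiler versions.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Link {
    next: *mut Link,
    prev: *mut Link,
}

// Alternative #1: every field stays in declaration order.
#[allow(dead_code)]
#[repr(C)]
struct NodeC<T, W> {
    next: *mut Link,
    prev: *mut Link,
    flag: u8,
    elem: T,
    weight: W,
}

// Default layout: the compiler may reorder fields to reduce padding.
#[allow(dead_code)]
struct NodeRust<T, W> {
    next: *mut Link,
    prev: *mut Link,
    flag: u8,
    elem: T,
    weight: W,
}

fn main() {
    // With repr(C), `flag` and `weight` each occupy a full aligned slot
    // because `elem` sits between them.
    println!("repr(C):    {} bytes", size_of::<NodeC<usize, u8>>());
    println!("repr(Rust): {} bytes", size_of::<NodeRust<usize, u8>>());
}
```

On targets where usize and raw pointers are aligned to their own size, the repr(C) variant takes five pointer-sized slots, whereas a layout that may place flag and weight next to each other needs only four.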