PhantomData is a Zero-Sized Type. In Rust’s #[repr(C)], ZSTs remain zero-sized, so we could add PhantomData to a layout-sensitive struct without damaging the layout.
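As a quick check of the zero-size claim, here is a minimal sketch that compares a #[repr(C)] struct with and without a PhantomData member. The names Plain and Tagged are made up for illustration, and it uses std rather than core only so it runs on a host machine.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

// Hypothetical layout-sensitive struct with no marker field.
#[allow(dead_code)]
#[repr(C)]
struct Plain {
    a: u32,
    b: u16,
}

// Same fields, plus a zero-sized marker carrying an otherwise unused type parameter.
#[allow(dead_code)]
#[repr(C)]
struct Tagged<T> {
    a: u32,
    b: u16,
    _marker: PhantomData<T>,
}

fn main() {
    // PhantomData<T> has size 0 and alignment 1 whatever T is,
    // so it should not change the size or alignment of the struct.
    println!("Plain:       size {} align {}", size_of::<Plain>(), align_of::<Plain>());
    println!("Tagged<u64>: size {} align {}", size_of::<Tagged<u64>>(), align_of::<Tagged<u64>>());
    println!("PhantomData: size {}", size_of::<PhantomData<u64>>());
}
```

If the two structs report the same size and alignment, the marker is layout-neutral as described.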
But PhantomData cannot be used cleanly in types with #[repr(C)], because PhantomData itself is not #[repr(C)]: it triggers the improper_ctypes warning.
(Side note: the fact that including a non-C-repr member in a #[repr(C, packed)] struct is only a warning by default, and not an error, is disconcerting.)
#[repr(C, packed)] is overloaded to mean two things:
- Lay this out like C.
- Lay this out exactly as I have written it without any clever field reordering.
In embedded/driver development we use the second meaning. Under this meaning, it’s reasonable (if unusual) to want a laid-out-as-written type (such as a register set) to take an otherwise unbound type parameter, requiring the use of PhantomData. (I am attempting to use this to model the distinctions between the six different timer-counter flavors on the STM32F4, for example.)
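To make the use case concrete, here is a rough sketch of a register block that takes a phantom "flavor" parameter. Everything in it is hypothetical: the marker types Basic and GeneralPurpose, the three-register layout, and the method names are invented for illustration and are not the real STM32F4 register map. It also uses plain field reads where real driver code would use volatile accesses.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical marker types naming two timer flavors.
pub struct Basic;
pub struct GeneralPurpose;

// Hypothetical, simplified register block, laid out exactly as written.
// `Flavor` appears only in the zero-sized marker field.
#[repr(C)]
pub struct TimerRegisters<Flavor> {
    pub control: u32,
    pub status: u32,
    pub counter: u32,
    pub _flavor: PhantomData<Flavor>,
}

// Operations that every flavor supports.
impl<Flavor> TimerRegisters<Flavor> {
    pub fn counter(&self) -> u32 {
        self.counter
    }
}

// Operations that only one flavor supports; calling this on
// TimerRegisters<Basic> is a compile-time error.
impl TimerRegisters<GeneralPurpose> {
    pub fn status(&self) -> u32 {
        self.status
    }
}

fn main() {
    let basic = TimerRegisters::<Basic> {
        control: 0,
        status: 0,
        counter: 7,
        _flavor: PhantomData,
    };
    let gp = TimerRegisters::<GeneralPurpose> {
        control: 0,
        status: 1,
        counter: 9,
        _flavor: PhantomData,
    };
    println!("basic counter = {}", basic.counter());
    println!("gp status = {}, counter = {}", gp.status(), gp.counter());
    // Both flavors occupy the same three 32-bit words.
    println!("sizes: {} and {}", size_of::<TimerRegisters<Basic>>(), size_of::<TimerRegisters<GeneralPurpose>>());
}
```

The point of the parameter is that flavor-specific operations live in separate impl blocks, so misuse is rejected by the type checker while the in-memory layout stays identical across flavors.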
Because PhantomData is a lang item, I can’t define my own equivalent of it bearing #[repr(C)] (and still link with core).
This same set of problems applies to UnsafeCell, something that comes up in discussions of the right way to model “volatile” register accesses in Rust (e.g. here). UnsafeCell is not #[repr(C)], so it can’t be used to safely model parts of layout-sensitive aggregate types. What do I mean by this? Here is a lightly-contrived example derived from @huon’s suggestions from that thread:
#[repr(C, packed)]
struct Volatile<T> {
    val: UnsafeCell<T>,
}

impl<T> Volatile<T> {
    pub fn get(&self) -> T { ... }
    pub fn set(&self, v: T) { ... }
}

#[repr(C, packed)]
struct Registers {
    a: Volatile<u32>,
    b: Volatile<u32>,
    ...
}
UnsafeCell is more complicated than PhantomData because its ability to be #[repr(C)] depends on whether its type parameter is #[repr(C)].
Like PhantomData, UnsafeCell is a lang item, so I can’t replace it (at least if I want to link with core).
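For readers who want something they can compile, here is one possible way to fill in the elided get and set bodies from the Volatile example above. This is a sketch under assumptions, not the code from the thread: it uses std::ptr::read_volatile and std::ptr::write_volatile, adds a T: Copy bound and a new constructor that the original does not have, and swaps #[repr(C, packed)] for #[repr(transparent)] on the wrapper and plain #[repr(C)] on Registers, since taking a reference to a field of a packed generic struct is rejected by current compilers. It also leans on the layout that current standard library documentation describes for UnsafeCell, which is exactly the guarantee this post is asking about.

```rust
use std::cell::UnsafeCell;
use std::mem::size_of;
use std::ptr;

// Wrapper whose reads and writes always go through volatile operations.
// Assumption: #[repr(transparent)] instead of the original #[repr(C, packed)].
#[repr(transparent)]
pub struct Volatile<T: Copy> {
    val: UnsafeCell<T>,
}

impl<T: Copy> Volatile<T> {
    // Hypothetical constructor, added so the sketch can be exercised on a host.
    pub const fn new(v: T) -> Self {
        Volatile { val: UnsafeCell::new(v) }
    }

    pub fn get(&self) -> T {
        // SAFETY: the pointer comes from a live UnsafeCell, so it is valid,
        // aligned, and points at an initialized T.
        unsafe { ptr::read_volatile(self.val.get()) }
    }

    pub fn set(&self, v: T) {
        // SAFETY: same pointer validity as in `get`; UnsafeCell permits
        // mutation through a shared reference.
        unsafe { ptr::write_volatile(self.val.get(), v) }
    }
}

// Two-register block; the original's elided extra fields are left out.
#[repr(C)]
pub struct Registers {
    pub a: Volatile<u32>,
    pub b: Volatile<u32>,
}

fn main() {
    // On real hardware this would be a reference derived from a fixed address;
    // here it is an ordinary value so the sketch runs anywhere.
    let regs = Registers { a: Volatile::new(1), b: Volatile::new(2) };
    regs.a.set(5);
    println!("a = {}, b = {}", regs.a.get(), regs.b.get());
    println!("size of Registers = {}", size_of::<Registers>());
}
```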
So:
- Is there a way we could enable the use of phantom type parameters in #[repr(C)] types? For example, could we mark PhantomData as #[repr(C)] – would this hurt anything?
- Should there be a way to indicate conditional #[repr(C)] that can degrade, e.g. "UnsafeCell<T> is #[repr(C)] iff T is #[repr(C)]?" Sort of like noexcept in modern C++. (In practice, improper_ctypes being a warning rather than an error by default almost achieves this for all uses of #[repr(C)]… but that’s not the right answer.)