Problem
Bear with me. I've been banging my head against a wall for a while, trying to express the bound a type needs to satisfy for it to be truly POD: can be serialized as [u8] and read back with no memory unsafety. mmap'd, etc, all that goodness. I don't think it can be done in the existing type system.
This basically means types that are Copy, #[repr(packed)] and contain no references transitively (not even 'static). I'm going to call this trait Mappable for lack of a better term (Pod is overloaded in the Copy: Pod: Clone debate). This would open the door to a family of tremendously useful safe functions:
fn transmute_mappable<S, D>(src: S) -> D // compile error for size mismatch
    where S: Mappable, D: Mappable;
fn map_bytes<S>(src: &S) -> &[u8]
    where S: Mappable + ?Sized;
fn map_bytes_mut<S>(src: &mut S) -> &mut [u8]
    where S: Mappable + ?Sized;
Of course other helper functions like Read::read_mappable would also be trivially definable.
This would be hugely useful for talking to devices (where the need is to serialize structs and the like) as well as any other place where exact layout and fast serialization are desired, such as video games (this is currently one of the only inevitable uses of unsafe in my doom renderer).
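For comparison, here is a rough sketch of what this looks like today, assuming an unsafe marker trait stands in for the proposed compiler-checked Mappable. The Header type is hypothetical, and the ?Sized bound from the signatures above is dropped to keep the sketch short. The point is that every impl is an unchecked promise, which is exactly what the proposal would like to get rid of.

```rust
use std::mem::size_of;
use std::slice;

/// Hypothetical stand-in for the proposed `Mappable`. It has to be `unsafe`
/// because nothing checks the contract today: implementors promise that the
/// type has no padding, holds no references, and accepts every bit pattern.
pub unsafe trait Mappable: Copy {}

unsafe impl Mappable for u8 {}
unsafe impl Mappable for u16 {}
unsafe impl Mappable for u32 {}

/// View a value as its raw bytes.
pub fn map_bytes<S: Mappable>(src: &S) -> &[u8] {
    // SAFETY: `S: Mappable` promises there are no uninitialised padding bytes,
    // and `u8` has alignment 1, so the pointer is always suitably aligned.
    unsafe { slice::from_raw_parts(src as *const S as *const u8, size_of::<S>()) }
}

/// View a value as its raw bytes, mutably.
pub fn map_bytes_mut<S: Mappable>(src: &mut S) -> &mut [u8] {
    // SAFETY: as above, and every bit pattern is a valid `S`, so arbitrary
    // writes through the slice cannot produce an invalid value.
    unsafe { slice::from_raw_parts_mut(src as *mut S as *mut u8, size_of::<S>()) }
}

/// Hypothetical on-disk header: 4 + 2 bytes, no padding.
#[derive(Clone, Copy)]
#[repr(C, packed)]
pub struct Header {
    pub magic: u32,
    pub version: u16,
}

// The author of `Header` vouches for the contract by hand.
unsafe impl Mappable for Header {}
```

With this in place, map_bytes(&header) yields a 6-byte slice that can be handed straight to a writer, but nothing stops someone from writing a bogus unsafe impl for a type with padding or references.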
I think there are two things missing, essentially:
- Expressing #[repr(packed)] as a trait bound.
- And something which I'd call "explicit opt-in OIBITs". EOIOBIT, if you will. Although you probably don't.
EOIOBITs
The reason we can't make Mappable an OIBIT is that we don't want any type whose members are Mappable to also be implicitly mappable since it could be used to violate privacy and thus all sorts of invariants (imagine something like SmallAsciiString([u8; 16]) or whatever).
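To make the invariant problem concrete, here is a small sketch. It assumes the same kind of hypothetical unsafe Mappable stand-in and a made-up SmallAsciiString with a private field; the hand-written impl simulates what an implicit, OIBIT-style rule would grant automatically.

```rust
use std::mem::size_of;
use std::slice;

// Hypothetical stand-in for the proposed trait (unsafe, since unchecked today).
pub unsafe trait Mappable: Copy {}

pub fn map_bytes_mut<S: Mappable>(src: &mut S) -> &mut [u8] {
    // SAFETY: relies on the (unchecked) promise made by `S: Mappable`.
    unsafe { slice::from_raw_parts_mut(src as *mut S as *mut u8, size_of::<S>()) }
}

pub mod ascii {
    /// Invariant: every byte is ASCII (< 0x80). The field is private, so
    /// outside code is supposed to go through `new`.
    #[derive(Clone, Copy)]
    #[repr(transparent)]
    pub struct SmallAsciiString([u8; 16]);

    impl SmallAsciiString {
        pub fn new(bytes: [u8; 16]) -> Option<Self> {
            if bytes.is_ascii() { Some(SmallAsciiString(bytes)) } else { None }
        }

        /// Reports whether the invariant still holds.
        pub fn is_valid(&self) -> bool {
            self.0.is_ascii()
        }
    }
}

// Simulates an implicit rule: "all fields are Mappable, so the type is too".
unsafe impl Mappable for ascii::SmallAsciiString {}

/// Code outside `ascii` bypasses the constructor check through the byte view.
/// Returns whether the invariant survived.
pub fn break_invariant() -> bool {
    let mut s = ascii::SmallAsciiString::new([b'a'; 16]).unwrap();
    map_bytes_mut(&mut s)[0] = 0xFF; // not ASCII
    s.is_valid()
}
```

No memory unsafety happens here, but the type's author can no longer rely on the ASCII invariant, which is why the opt-in has to come from the type's own module.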
On the other hand for a type to satisfy an EOIOBIT it needs two things: (a) all its fields also satisfy the EOIOBIT and (b) an explicit impl block for the trait exists in the same module as the type. I'll get to the latter restriction in a bit. Some syntax:
pub trait Mappable: Copy + ReprPacked {}
impl Mappable for ?.. {} // notice the ?
impl Mappable for u8, u16, u32, u64, i8, i16, i32, i64, f32, f64 {}
// no explicit !Mappable impl is needed for &'a T, since the trait requires explicit **opt-in**.
// no usize/isize since their size is platform dependent (?)
// no bool since it must be 0 or 1.
// no char because not every u32 bit pattern is a valid char.
Then any type which only contains the above is eligible to implement Mappable:
mod x {
    struct A(String);
    impl Mappable for A {} // compile error, A is not Mappable (String not Mappable).

    struct B<'a>(&'a str);
    impl<'a> Mappable for B<'a> {} // compile error, B is not Mappable (has refs).

    #[repr(packed)]
    struct C(u32); // not Mappable; no explicit impl block in this module

    struct D(u32);
    impl Mappable for D {} // compile error, D not repr(packed).

    #[repr(packed)]
    struct E(u32, u64);
    impl Mappable for E {} // u32 and u64 are Mappable, have impl, so E is now mappable.
}

mod y {
    impl Mappable for x::C {} // compile error, impl block needs to be in same module
}
Discussion
Why not allow reference transmutes between Mappable types?
Initially I had suggested a
fn transmute_mappable_ref<S, D>(src: &S) -> &D
    where S: Mappable + ?Sized, D: Mappable + ?Sized;
This, however, falls foul of strict aliasing rules. I'm interpreting these based on C++'s restrictions of reinterpret_cast. In particular:
When a pointer or reference to object of type T1 is reinterpret_cast (or C-style cast) to a pointer or reference to object of a different type T2, the cast always succeeds, but the resulting pointer or reference may only be accessed if both T1 and T2 are standard-layout types and one of the following is true:
- T2 is the (possibly cv-qualified) dynamic type of the object
- T2 and T1 are both (possibly multi-level, possibly cv-qualified at each level) pointers to the same type T3 (since C++11)
- T2 is the (possibly cv-qualified) signed or unsigned variant of the dynamic type of the object
- T2 is an aggregate type or a union type which holds one of the aforementioned types as an element or non-static member (including, recursively, elements of subaggregates and non-static data members of the contained unions): this makes it safe to cast from the first member of a struct and from an element of a union to the struct/union that contains it.
- T2 is a (possibly cv-qualified) base class of the dynamic type of the object
- T2 is char or unsigned char
If T2 does not satisfy these requirements, accessing the object through the new pointer or reference invokes undefined behavior. This is known as the strict aliasing rule and applies to both C++ and C programming languages.
So we can cast to &[u8] or &[i8] (by the last rule in the list above: T2 is char or unsigned char), but not to anything else without violating strict aliasing. Maybe thanks to Rust's ability to rule out simultaneous mutability and aliasing, these rules could be relaxed (since they generally concern optimisations which allow skipping loads after stores of unrelated types), but let's be conservative for now.
This sadly rules out transmuting [f64] to [SimdVec4] or [u8] to [u32] for performance.
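What remains available without any reinterpretation is a copying conversion. A minimal sketch using only existing std calls, assuming little-endian input (the function name u32s_from_le_bytes is made up for illustration):

```rust
/// Decode little-endian `u32`s from a byte slice by copying.
/// Trailing bytes that do not fill a whole 4-byte word are ignored.
pub fn u32s_from_le_bytes(bytes: &[u8]) -> Vec<u32> {
    bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```

This is entirely safe, at the cost of an allocation and a pass over the data, which is the performance hit the zero-copy slice transmute would have avoided.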
Can't Mappable eligibility be implemented as an OIBIT?
Basically this idea boils down to:
trait MappableEligible {}
impl MappableEligible for .. {}
impl<'a, T> !MappableEligible for &'a T {}
impl<'a, T> !MappableEligible for &'a mut T {}
trait Mappable: MappableEligible {}
The issue here is something like:
#[repr(packed)]
struct A(u32, u32);
#[repr(packed)]
struct B(A);
impl Mappable for B {}
The last line succeeds because B is MappableEligible because A is MappableEligible. But A is not Mappable! So by making B Mappable we may be violating invariants of A. OIBITs are too viral to support this by themselves.
Why not #[repr(C)]?
Because reading padding bits would be reading uninitialised memory, which is unsafe in Rust. Just manage your own padding for Mappable types and you're fine.
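A small sketch of the difference, using hypothetical struct names and assuming a typical target where u32 has 4-byte alignment:

```rust
use std::mem::size_of;

/// `repr(C)`: the compiler inserts 3 hidden padding bytes after `tag`
/// (on targets where `u32` is 4-byte aligned). Those bytes are uninitialised.
#[repr(C)]
pub struct WithPadding {
    pub tag: u8,
    pub value: u32,
}

/// `repr(C, packed)`: no padding at all, `value` sits at offset 1.
#[repr(C, packed)]
pub struct Packed {
    pub tag: u8,
    pub value: u32,
}

/// Packed, with the padding spelled out as an ordinary field. Every byte of
/// the struct now belongs to a field, and `value` is back at offset 4.
#[repr(C, packed)]
pub struct ManualPadding {
    pub tag: u8,
    pub _pad: [u8; 3],
    pub value: u32,
}

/// Sizes of the three layouts, in declaration order.
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<WithPadding>(),
        size_of::<Packed>(),
        size_of::<ManualPadding>(),
    )
}
```

ManualPadding has the same field offsets as the repr(C) version, but its padding is a real, initialised field, so viewing the whole struct as bytes never touches uninitialised memory.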
Why same module?
We need to enforce privacy. That leaves us with same modules or types whose members are all public. The problem with the latter is that it becomes a backwards incompatible change to add a private field to a struct (it would still be limited to the same crate, so maybe not so terrible).
Alternatives
- Implement map_bytes and map_bytes_mut as trait methods.
- Don't provide the methods in std at all. If the Mappable guarantee is provided, then they can be implemented with unsafe code and transmutes anywhere.
- Mappable is a great name for the trait because everyone can get behind the fact that it's a terrible name. Some potentially better names: BitSafe, Bytes, MapBytes, StdLayout, Pod, SafeTransmute, Transmute, SuperReallyPod.
- Some kind of #[derive(...)] shenanigans? I can't think how to do that.
- Some very obvious solution which I'm too dumb to see.