Pulling this discussion out here to avoid derailing RFC #2756 any further.
This is largely stream-of-consciousness, describing a problem I would like to see solved and the issues I am aware of with solving it.
The problem
I want there to be a well-defined way to perform the following transmutes (especially the last one, which is impossible to achieve with safe code):
// Newtype index wrapper for a graph node.
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
struct Node(usize);
unsafe {
// It is highly desirable for all of these to be valid operations.
// Currently, however, they are UB.
transmute::<Vec<Vec<usize>>, Vec<Vec<Node>>>(vecs);
transmute::<Vec<HashSet<usize>>, Vec<HashSet<Node>>>(sets);
transmute::<&'a HashSet<usize>, &'a HashSet<Node>>(set);
}
These sorts of transmutes (especially those that occur behind references) significantly decrease the cost of adopting a newtype wrapper in a crate, because the wrapper can remain private to the crate rather than becoming part of its public API.
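For contrast, here is a minimal sketch of what safe code can do today for the owned case. The helper name `wrap_all` is made up for illustration. It rebuilds every inner vector element by element, which is exactly the cost the transmute would avoid, and nothing comparable exists for the `&HashSet<usize>` case because a reference cannot be rebuilt in place.

```rust
// Newtype index wrapper for a graph node (same definition as above).
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
struct Node(usize);

// Hypothetical helper: converts by value, visiting every element.
// The standard library may or may not reuse the allocations; nothing
// guarantees that this conversion is free.
fn wrap_all(vecs: Vec<Vec<usize>>) -> Vec<Vec<Node>> {
    vecs.into_iter()
        .map(|inner| inner.into_iter().map(Node).collect())
        .collect()
}

fn main() {
    let wrapped = wrap_all(vec![vec![0, 1], vec![2]]);
    println!("{:?}", wrapped);
}
```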
Unfortunately, to my understanding, all of these are currently UB because Rust provides no guarantees about the layout[^1] of a generic struct for different `T`. That is to say, `Type<T>` and `Type<U>` can have different layouts even if `T` and `U` have the same size and alignment (or are even ABI compatible); a prominent example is `Option<u32>` versus `Option<NonZeroU32>`. Beyond that, there is occasional talk of potentially making the compiler "randomize" struct layouts.
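The `Option` example can be checked directly. This is only a small sketch using `std::mem::size_of`; it shows the two parameter types agreeing in size while the two `Option` instantiations do not.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // The two type parameters have the same size...
    assert_eq!(size_of::<u32>(), size_of::<NonZeroU32>());

    // ...but Option<NonZeroU32> stores `None` in the forbidden zero value,
    // while Option<u32> needs extra room for a discriminant.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
    assert!(size_of::<Option<u32>>() > size_of::<u32>());

    println!("Option<u32>:        {} bytes", size_of::<Option<u32>>());
    println!("Option<NonZeroU32>: {} bytes", size_of::<Option<NonZeroU32>>());
}
```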
The precise set of requirements for well-defined transmutes is not yet clear. But what I can tell from the discussion thus far is that, if it even is possible to specify a subset of transmutes that are well-defined, then we need at least all of the following:
- Some unambiguous set of compatibility requirements for `T` and `U`.
- Some unambiguous set of parametricity requirements for `Type`.
- Cooperation from the compiler that, if `Type` meets the parametricity requirements, and `T`/`U` meet the compatibility requirements, then `Type<T>` and `Type<U>` must lay out their fields identically.
Let me go into each of these in more detail:
Compatibility requirements
Let's consider some of the types that we would like to be compatible:
- `#[repr(transparent)] struct Node(u32);` should definitely be compatible with `u32`.
- `Vec<T>` should be compatible with `Vec<U>` if `T` is compatible with `U`.
- `PhantomData<T>` should be compatible with `PhantomData<U>` if `T` is compatible with `U` (and maybe even if `T`/`U` aren't compatible!).
- `u32` should maaaaaybe be compatible with `Option<NonZeroU32>`.
- `Type<'a>` and `Type<'b>` are compatible, as lifetimes are erased in codegen.
- Compatibility should be symmetric and transitive.
  - Note: Symmetry might actually be undesirable, so that we can have `NonZeroU32 -> u32`. In that case, however, the term "compatible" needs to be replaced, and I am too lazy to rewrite this!
But let's also be mindful of counter-examples:
- `u32` is clearly not compatible with `[u8; 4]`, as they have different alignments.
- `u32` is not compatible with `NonZeroU32`, even though the latter is `#[repr(transparent)]`.
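Both counter-examples can be observed from safe code. The sketch below only prints the alignment of `u32`, since the exact value is target-dependent, and uses `NonZeroU32::new` to show that zero is a valid `u32` but not a valid `NonZeroU32`.

```rust
use std::mem::{align_of, size_of};
use std::num::NonZeroU32;

fn main() {
    // Same size, but [u8; 4] only ever needs byte alignment.
    assert_eq!(size_of::<u32>(), size_of::<[u8; 4]>());
    assert_eq!(align_of::<[u8; 4]>(), 1);
    println!("align_of::<u32>() on this target: {}", align_of::<u32>());

    // Zero is a perfectly good u32, but it is not a valid NonZeroU32,
    // so a u32 -> NonZeroU32 transmute could produce an invalid value.
    assert!(NonZeroU32::new(0).is_none());
    assert!(NonZeroU32::new(1).is_some());
}
```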
The last counterexample is tricky. `#[repr(transparent)]` alone isn't enough to guarantee compatibility! There must also be no niches! Some possibilities I see for the requirement:
Implicit: The requirement for compatibility could simply be that `T` is `#[repr(transparent)]` and has no niches. Because the presence of niches depends on private details, the author of a newtype would need to provide a documented guarantee that the type will never contain niches.
Explicit: We could add an attribute stronger than `#[repr(transparent)]` (say, `#[repr(field_transparent)]`). This way, it can be explicitly opted into by the author of the newtype, and the requirements could be verified by the compiler. An opt-in attribute such as this doesn't seem too demanding, as the code that wants to transmute a newtype is likely closely associated with that newtype.
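To give a feel for the explicit option, here is a rough approximation in today's Rust. It uses an `unsafe` marker trait in place of the attribute, so the promise is made by the newtype's author but is not checked by the compiler. The names `FieldTransparent` and `wrap_slice` are invented for this sketch. It only covers slices, whose element layout is already specified; it does nothing for `Vec<Vec<T>>` or `HashSet<T>`.

```rust
/// Hypothetical stand-in for `#[repr(field_transparent)]`.
///
/// Safety contract for implementors: `Self` is `#[repr(transparent)]`
/// over `Inner` and every valid `Inner` value is a valid `Self`
/// (in particular, `Self` adds no niches).
unsafe trait FieldTransparent {
    type Inner;
}

#[repr(transparent)]
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
struct Node(usize);

// SAFETY: `Node` is `#[repr(transparent)]` over `usize` and accepts
// every `usize` value.
unsafe impl FieldTransparent for Node {
    type Inner = usize;
}

/// Reinterprets a slice of raw values as a slice of the newtype.
fn wrap_slice<T: FieldTransparent>(raw: &[T::Inner]) -> &[T] {
    // SAFETY: by the trait's contract, `T` and `T::Inner` have the same
    // size, alignment and valid values, and slices store their elements
    // contiguously.
    unsafe { std::slice::from_raw_parts(raw.as_ptr().cast::<T>(), raw.len()) }
}

fn main() {
    let raw = [3usize, 1, 4];
    let nodes = wrap_slice::<Node>(&raw);
    assert_eq!(nodes.len(), 3);
    assert_eq!(nodes[0], Node(3));
    println!("{:?}", nodes);
}
```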
In addition to one of the above, there should also be a recursion rule: If `A` and `B` are compatible, and `Type<T>` is parametric in `T` (as discussed below), then `Type<A>` and `Type<B>` are also compatible.
Parametricity requirements
Compatibility of `A` and `B` is insufficient to guarantee that `Type<A>` and `Type<B>` have the same layout. This is because `Type` could use associated items:
// Peekable<A> and Peekable<B> can have different layouts
// even for compatible A and B
pub struct Peekable<I: Iterator> {
iter: I,
peeked: Option<Option<I::Item>>,
}
Suppose that some crate tries to transmute `Struct<A>` into `Struct<B>` when `Struct` uses an associated type like `Peekable` does. Who is to be held accountable? The code that defines `A`/`B`? The code that defines `Struct`? The code that performs the transmute?
We cannot blame this failing on the requirements of compatibility (i.e. we can't claim that the issue is that `A` and `B` are not truly compatible), because any downstream crate can implement a trait for `A` and `B`:
// crate a
#[repr(field_transparent)]
pub struct Node(u32);
// crate b
pub trait Trait { type Assoc; }
impl Trait for u32 { type Assoc = u32; }
impl Trait for Node { type Assoc = u8; }
pub struct Struct<T: Trait>(<T as Trait>::Assoc);
In the above, crate `b` clearly is the one responsible for the fact that `Struct<u32>` and `Struct<Node>` are not layout-equivalent.
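The mismatch is easy to observe. The following is a single-file version of the two-crate example, with `#[repr(transparent)]` standing in for the hypothetical `#[repr(field_transparent)]`. The sizes are printed rather than hard-coded, since `#[repr(Rust)]` makes no promise about exact struct sizes.

```rust
use std::mem::size_of;

// Stand-in for "crate a". `#[repr(field_transparent)]` does not exist,
// so `#[repr(transparent)]` is used instead.
#[repr(transparent)]
pub struct Node(u32);

// Stand-in for "crate b".
pub trait Trait {
    type Assoc;
}
impl Trait for u32 {
    type Assoc = u32;
}
impl Trait for Node {
    type Assoc = u8;
}
pub struct Struct<T: Trait>(<T as Trait>::Assoc);

fn main() {
    // The parameters themselves agree in size...
    assert_eq!(size_of::<u32>(), size_of::<Node>());

    // ...but the generic struct stores a different field type for each.
    println!("Struct<u32>:  {} bytes", size_of::<Struct<u32>>());
    println!("Struct<Node>: {} bytes", size_of::<Struct<Node>>());
    assert_ne!(size_of::<Struct<u32>>(), size_of::<Struct<Node>>());
}
```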
In the above examples, the trait bounds appearing on the struct clearly warn downstream users of the fact that these types may not be parametric in `T`. But in the future, specialization will allow this to occur even without any visible evidence in the public details of `Struct<T>`:
#![feature(specialization)]
pub trait Trait {
type Assoc;
}
impl<T: ?Sized> Trait for T {
default type Assoc = u32;
}
#[repr(transparent)]
struct Node(u32);
impl Trait for Node {
type Assoc = u8;
}
// violates parametricity despite having no visible trait bounds
struct Struct<T> {
foo: <T as Trait>::Assoc,
}
There could also conceivably be things like `private_field: [u8; type_name::<T>().len()]` in the future if `type_name` is stabilized.
Thus, we cannot blame the user for failing to verify that `Struct<T>` is parametric. It must be the responsibility of the author of `Struct` to declare that their type is parametric. As with compatibility, this requirement could be implicit or explicit:
Implicit: When a type uses nothing like `<T as Trait>::Assoc` (or `<Struct as Trait<T>>::Assoc`) or functions like `type_name::<T>` that violate parametricity, the compiler guarantees that `Type<T>` and `Type<U>` are compatible if `T` and `U` are compatible. Users of a type should not rely on this fact unless it is explicitly guaranteed in the type's documentation (or is otherwise dead obvious, e.g. due to a lack of private fields).
Explicit: We could introduce an attribute like `struct Struct<#[parametric] T>` so that this guarantee can be made explicit, and even be checked by the compiler. Unfortunately, this is a lot more invasive than the related idea of `#[repr(field_transparent)]` and would cause tons of churn, because code that needs to add `#[parametric]` is not necessarily anywhere near the code that wants to transmute newtypes. We would want it on all sorts of types like `&T`, `[T]`, `Vec<T>`, `HashMap<K, V>` (for both `K` and `V`), `Wrapping<T>`...
Cooperation from the compiler
Whatever requirements are chosen, it must become part of the language specification that `Type<T>` and `Type<U>` have identical layouts when these requirements are met.
If layout randomization is ever added to the compiler, each group of types that are required to have identical layouts must receive the same randomized layout. I believe this is always possible. (If you consider the DAG of monomorphized types whose edges are `#[repr(field_transparent)]` compatibility relationships, it should form a forest, where each tree is rooted in a type that is not `#[repr(field_transparent)]`. That root node lives in a crate that must be upstream of all other types in the tree, thus all types in the tree can agree to use the layout of the root type.)
Possible extension to `#[repr(C)]` types
There may also be use cases for declaring identically-shaped `#[repr(C)]` types to be compatible.
#[repr(C)] struct Foo { a: i32, b: u32 }
#[repr(C)] struct Bar { x: i32, y: u32 }
`#[repr(field_transparent)]` doesn't make sense for these types, but that annotation could still exist for newtypes. (`#[repr(C)]` types would simply constitute another set of "base cases" for compatible types.)
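As a sanity check on "identically-shaped", the sketch below compares the size, alignment, and field offsets of `Foo` and `Bar`. The helper `offset_of_field` is made up for this example and simply subtracts addresses.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct Foo { a: i32, b: u32 }
#[repr(C)]
struct Bar { x: i32, y: u32 }

// Hypothetical helper: byte offset of `field` within `base`.
// `field` must be a reference to a field of `base`.
fn offset_of_field<S, F>(base: &S, field: &F) -> usize {
    field as *const F as usize - base as *const S as usize
}

fn main() {
    let foo = Foo { a: -1, b: 2 };
    let bar = Bar { x: -1, y: 2 };

    // #[repr(C)] lays fields out in declaration order, so two structs
    // with the same field types in the same order agree everywhere.
    assert_eq!(size_of::<Foo>(), size_of::<Bar>());
    assert_eq!(align_of::<Foo>(), align_of::<Bar>());
    assert_eq!(offset_of_field(&foo, &foo.a), offset_of_field(&bar, &bar.x));
    assert_eq!(offset_of_field(&foo, &foo.b), offset_of_field(&bar, &bar.y));
}
```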
Randomization would be a bit more complicated when `#[repr(C)]` types appear as fields in `#[repr(Rust)]` types. In particular, given a generic `#[repr(Rust)]` type `Struct`, the types `Struct<Foo>` and `Struct<Bar>` could appear in separate crates with no single type to serve as a common ancestor. To solve this there would need to be a global random seed shared by all crates in the build tree so that each crate can independently generate the same randomized layout for these types.
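A toy model of the shared-seed idea, purely for illustration: it is not how rustc orders fields, and the function `randomized_order` and the `(size, align)` "shape" encoding are assumptions made up for this sketch. The point is only that the field order is a pure function of the seed and the field shapes, so two crates that never see each other's types still arrive at the same answer.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical: picks a field order from a build-wide seed and the
/// (size, align) of each field in declaration order. Field names and
/// the defining crate play no role.
fn randomized_order(seed: u64, shape: &[(usize, usize)]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..shape.len()).collect();
    order.sort_by_key(|&i| {
        let mut hasher = DefaultHasher::new();
        (seed, i, shape[i]).hash(&mut hasher);
        hasher.finish()
    });
    order
}

fn main() {
    let seed = 0x5eed_u64;

    // "Crate 1" lays out Struct<Foo>; "crate 2" lays out Struct<Bar>.
    // Both see a wrapper whose fields have the same sizes and alignments.
    let shape_in_crate_1 = [(8, 4), (1, 1), (2, 2)];
    let shape_in_crate_2 = [(8, 4), (1, 1), (2, 2)];

    let order_1 = randomized_order(seed, &shape_in_crate_1);
    let order_2 = randomized_order(seed, &shape_in_crate_2);
    assert_eq!(order_1, order_2);
    println!("field order: {:?}", order_1);
}
```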
That's what I have so far. Any thoughts?
[^1]: To clarify, when I use the term "layout" in this post, I am referring to a type's size, alignment, ABI, niches, and the offsets of all of its fields. This is different from `std::alloc::Layout`, which only considers size and alignment.