Add `repr(inherit($type))`

Before writing a pre-RFC, I want to see if there are any fundamental problems with this idea.

Introduce a new repr variant: #[repr(inherit($type))].

This representation ensures that the annotated type has the same layout as $type. This allows safe transmutations between the two types.

Example:

struct Foo {
    a: MaybeUninit<u8>,
    b: MaybeUninit<u32>,
}

#[repr(inherit(Foo))]
struct Bar {
    b: u32,
    a: u8,
}

Without repr(inherit), we would need to use #[repr(C)] (because otherwise Foo and Bar are not guaranteed to have compatible layout) and we would also need to ensure that the order of the fields is the same. When many structs with many fields (potentially with #[cfg(...)]) are involved this is error-prone. It also requires the programmer to think about the effects that the order has on the layout/size of the type.
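For comparison, here is a minimal sketch of what the `#[repr(C)]` workaround looks like today. The helper name `assume_init` and the const assertions are illustrative assumptions, not part of the proposal. Note that the assertions only catch size or alignment drift; they cannot detect two equally sized fields being listed in a different order.

```rust
use std::mem::{align_of, size_of, transmute, MaybeUninit};

// Both types opt into the C layout and list their fields in the same order.
#[repr(C)]
struct Foo {
    a: MaybeUninit<u8>,
    b: MaybeUninit<u32>,
}

#[repr(C)]
struct Bar {
    a: u8,
    b: u32,
}

// Compile-time sanity checks on size and alignment.
const _: () = assert!(size_of::<Foo>() == size_of::<Bar>());
const _: () = assert!(align_of::<Foo>() == align_of::<Bar>());

/// Hypothetical helper that turns a fully initialized `Foo` into a `Bar`.
///
/// # Safety
/// Both fields of `foo` must have been initialized.
unsafe fn assume_init(foo: Foo) -> Bar {
    // SAFETY: both structs are `repr(C)` with the same field order, and
    // `MaybeUninit<T>` has the same layout as `T`; the caller guarantees
    // that every field holds an initialized value.
    unsafe { transmute::<Foo, Bar>(foo) }
}
```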

Here are some more examples of how it interacts with some other features.

#[repr(C)]
struct Foo {
    a: u8,
    b: u32,
}

#[repr(C)] // you are allowed to mention other reprs, if they are compatible
#[repr(inherit(Foo))]
struct Foo1 {
    a: u8,
    b: u32,
}

#[repr(C)]
#[repr(inherit(Foo))]
struct Foo2 {
    b: u32,
    a: u8,
//  ^ error: fields are in the wrong order
// this error only occurs because we explicitly specified `#[repr(C)]`
}

#[repr(inherit(Foo))]
#[repr(inherit(Foo1))]
// only without error if they are compatible
// maybe allow `#[repr(inherit(Foo, Foo1))]`
struct Foo3 {
    b: u32,
    a: u8,
}

A more elaborate example, why this is needed:

I want to create a library that enables safe pinned initialization (see my older post). The core idea is to use the type system to prevent misuse of uninitialized data (you cannot use FooUninit in a place where you need Foo). But when you initialize pinned data, you need to transmute the pointer to the correct type. This is why the two types (Foo and FooUninit) need to have the same layout. At the moment I am using #[repr(C)] and the same order of fields. This works, but it is not ideal: layout optimizations are not in effect, and a user cannot specify #[repr(transparent)], because I need #[repr(C)]. Here is some code that would be generated by my macro library:

#[repr(C)]
struct FooUninit {
    ptr: MaybeUninit<*const i32>,
    data: [i32; 64],
    bar: BarUninit,
}

#[repr(C)]
struct Foo {
    ptr: *const i32,
    data: [i32; 64],
    bar: Bar,
}

(I am currently ignoring the problem of mutable aliasing)
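A rough sketch of the pointer cast this relies on is shown below. The definitions of `BarUninit` and `Bar`, and the helper `finish_init`, are made up for illustration (the post does not show them), and pinning is left out to keep the example short.

```rust
use std::mem::MaybeUninit;

// Hypothetical one-field definitions; the original post does not show them.
#[repr(C)]
struct BarUninit {
    len: MaybeUninit<usize>,
}

#[repr(C)]
struct Bar {
    len: usize,
}

#[repr(C)]
struct FooUninit {
    ptr: MaybeUninit<*const i32>,
    data: [i32; 64],
    bar: BarUninit,
}

#[repr(C)]
struct Foo {
    ptr: *const i32,
    data: [i32; 64],
    bar: Bar,
}

/// Hypothetical helper: reinterprets a fully initialized `FooUninit` as `Foo`
/// without moving it, so pointers into the allocation stay valid.
///
/// # Safety
/// Every field of `slot` (including nested ones) must be initialized.
unsafe fn finish_init(slot: Box<FooUninit>) -> Box<Foo> {
    let raw: *mut FooUninit = Box::into_raw(slot);
    // SAFETY: both types are `repr(C)` with fields in the same order and with
    // layout-compatible field types, so size and alignment match.
    unsafe { Box::from_raw(raw.cast::<Foo>()) }
}
```

Because the value is never moved, the address stored in `ptr` still refers to the same `data` array after the cast.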

This might also work well together with RFC 2835 (project-safe-transmute) in the rust-lang/rfcs repository.


I prefer to see a Repr trait or similar, allowing to express complex representations as trait bounds.


What kind of complex representations would you like to express? What would the trait look like? How would the compiler ensure that types do indeed have the correct layout?

I think keeping this feature simple is a benefit (allowing two types to have the same layout is going to be the most complex layout repr anyway). The compiler does not have to handle very complex representation relations. It only has to know whether layout(A) == layout(B) holds in a stable sense, so that changing the compiler version or enabling layout randomization does not change the layout equivalence. The value of layout(A) itself should still be free to vary. In particular, layout(A) == layout(B) is not guaranteed for:

struct A {
    _0: Field0,
    _1: Field1,
    _2: Field2,
    _3: Field3,
}

struct B {
    _0: Field0,
    _1: Field1,
    _2: Field2,
    _3: Field3,
}

If you have a good example of why you would need more complex repr then I would like to see it. But I think that you can already do a lot with this new feature.

If you would like to create code like

fn safe_transmute<A, B>(a: A) -> B
where
    A: SameLayout<B>,
{
    unsafe { core::mem::transmute(a) }
}

Then you should look at project-safe-transmute, as that is outside the scope of this proposal (although it could probably be nicely integrated).
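As a side note, `core::mem::transmute` is rejected between generic parameters, because their sizes are not known when the function is type-checked. A version of the idea that compiles today could look roughly like the following sketch. `SameLayout` is a hypothetical unsafe marker trait and `Meters` is a made-up example type; neither exists in the standard library.

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical marker trait.
///
/// # Safety
/// Implementors promise that `Self` and `B` have the same layout and that
/// every valid value of `Self` is also a valid value of `B`.
unsafe trait SameLayout<B>: Sized {}

fn safe_transmute<A, B>(a: A) -> B
where
    A: SameLayout<B>,
{
    // Defensive runtime check; `transmute_copy` does not verify sizes itself.
    assert_eq!(mem::size_of::<A>(), mem::size_of::<B>());
    // Prevent `a` from being dropped after its bytes have been copied out.
    let a = ManuallyDrop::new(a);
    // SAFETY: guaranteed by the `SameLayout` contract.
    unsafe { mem::transmute_copy::<A, B>(&*a) }
}

#[repr(transparent)]
struct Meters(u32);

// SAFETY: `Meters` is a `repr(transparent)` wrapper around `u32`.
unsafe impl SameLayout<u32> for Meters {}
```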

Assuming that the safe transmute project happens, that could also be done with just something like

assert_impl_all!(Foo: SafeTransmuteFrom<Bar>);

(using assert_impl_all from the static_assertions crate)

And thus it's not obvious to me that this should be a full language feature.

That would be great! I have not had the time to read all the details of the safe transmute project, so I do not know what it does exactly.

It also seems like I was looking in the wrong places: the repository hasn't had a commit for 2 years (so it seems like the discussion is on Zulip, but following it there probably takes a lot of reading).

I don't know that the precise form it will take has been decided, but here's some progress towards part of it:

I think this is not enough, because if Foo and Bar are #[repr(Rust)], then e.g. layout randomization would not give the two types the same layout even if their fields are identical/transmutable.

Well, but then the assert would fail. (Or maybe it always would, without repr(C).)

Then the safe transmute project might not be enough. That is why I want repr(inherit(...)). It is to guarantee that two types have the same layout.

I want to begin writing a pre-RFC and I want to add this in the alternatives section. Did you have a clear idea of how this would be implemented?

Not completely clear, but something like

pub trait HasLayout {
    type Layout: sealed::LayoutType;
    
    const ALIGN: Option<usize>;
    const PACKED: Option<usize>;
}

mod sealed {
    pub trait LayoutType {}
}

// #[repr(Rust)] or no repr generates an impl HasLayout<Layout = RustLayout>.
pub struct RustLayout;
impl sealed::LayoutType for RustLayout {}
pub struct CLayout;
// ...
pub struct TransparentLayout;
// ...
pub struct U8Layout; // Etc.
// ...

// There may be problems with layouts that are not applicable, for example primitive representations for non-enums or `#[repr(transparent)]` for more than one field.
// There may be a need for post-monomorphization errors to handle that, I'm not sure.
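To illustrate how such a trait could be used as a bound, here is a small self-contained sketch. The impl for `Header` is written by hand only for the sake of the example (in the idea above the compiler would emit it from the `#[repr(...)]` attribute), and `Header` and `c_layout_size` are hypothetical names.

```rust
use std::mem::size_of;

mod sealed {
    pub trait LayoutType {}
}

pub struct RustLayout;
impl sealed::LayoutType for RustLayout {}
pub struct CLayout;
impl sealed::LayoutType for CLayout {}

pub trait HasLayout {
    type Layout: sealed::LayoutType;

    const ALIGN: Option<usize>;
    const PACKED: Option<usize>;
}

#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

// Hand-written stand-in for what `#[repr(C)]` would generate.
impl HasLayout for Header {
    type Layout = CLayout;

    const ALIGN: Option<usize> = None;
    const PACKED: Option<usize> = None;
}

/// Only accepts types whose layout is declared to be the C layout.
fn c_layout_size<T: HasLayout<Layout = CLayout>>() -> usize {
    size_of::<T>()
}
```

A type implementing `HasLayout<Layout = RustLayout>` would be rejected by `c_layout_size` at compile time.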

But IIUC you still need to specify the layout using #[repr(...)], right? So it is essentially just a marker trait that you could use in some trait bounds. How would you specify my proposed #[repr(inherit(...))]?

If it is a marker trait, then this is probably not the right RFC for that, as I only want to add the additional repr.

No, #[repr()] becomes like a macro that emits an implementation of the trait. You can implement the trait to change the layout.

Ok, that makes sense. I do not know how the current mechanism of layout selection works in the compiler, so I do not know if this is a feasible architecture change. I will add it to the pre-RFC.

Here is my current draft for the RFC:

Summary

Introduce a new repr variant: #[repr(inherit($type))].

When a type A is marked with this #[repr(inherit(B))], the compiler ensures that A and B have the same layout.

Motivation

#[repr(inherit($type))] allows transmuting between two types with transmutable fields without requiring the use of #[repr(C)]. This allows the compiler to still choose the optimal layout, while providing a simple way to have predictable layouts.

Examples

struct FooUninit {
	pub a: MaybeUninit<u8>,
	pub b: MaybeUninit<u32>,
}

#[repr(inherit(FooUninit))]
struct FooInit {
	b: u32,
	a: u8,
}

Without repr(inherit), both structs would need #[repr(C)] and we would have to list the fields in the same order. When many fields and structs are involved (possibly with #[cfg(...)] enabling/disabling fields) this is error-prone. It also requires that the programmer chooses a good order of the fields to avoid padding bytes.
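The padding cost of a poorly chosen field order under `#[repr(C)]` can be seen in a small sketch like the one below. The struct names are hypothetical, and the sizes in the comments assume a target where `u32` is 4-byte aligned. No particular size is guaranteed for the default representation, which is exactly why it cannot be relied upon for transmutes today.

```rust
use std::mem::size_of;

// Declaration order is kept, so padding is inserted after `a` and after `c`.
#[repr(C)]
struct DeclaredOrder {
    a: u8,
    b: u32,
    c: u8,
}

// Fields sorted by hand so that the two `u8` fields share one padded slot.
#[repr(C)]
struct HandSorted {
    b: u32,
    a: u8,
    c: u8,
}

// Default representation: the compiler is free to reorder the fields.
struct CompilerChosen {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns the sizes of the three structs, in declaration order.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<DeclaredOrder>(),  // 12 with 4-byte aligned `u32`
        size_of::<HandSorted>(),     // 8 with 4-byte aligned `u32`
        size_of::<CompilerChosen>(), // unspecified
    )
}
```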

Guide-level explanation

repr(inherit($type))

This repr attribute ensures that the type the attribute is found on has the same in-memory representation as the type given in the attribute. This means that it is sound to mem::transmute the two types to and from each other (you still need to ensure custom invariants are upheld). The two types of course need to have the same size and specify the same name for each field. The order of the fields may be different but the types need to be layout compatible. For example:

struct FooUninit {
	pub a: MaybeUninit<u8>,
	pub b: MaybeUninit<u32>,
}

#[repr(inherit(FooUninit))]
struct FooInit {
	b: u32,
	a: u8,
}

The layout of Foo{Uninit, Init} is repr(Rust) (implicit). It will probably be laid out in a way that avoids unnecessary padding bytes and the u32 will come first.

This repr allows you to write layout dependent types that still can have memory layout optimizations handled by the compiler.

The fields of the two types need to be layout compatible. This means that they are

  • the same type
  • two types A and B where A has the transitive #[repr(inherit(B))] attribute
  • two types A and B where A has the transitive #[repr(transparent)] attribute and contains only one field of type B

"transitive" in this context means, that the attribute behaves transitively, so the following is legal:

struct A {
	// ..
}

#[repr(transparent)]
struct B {
	a: A,
}

#[repr(transparent)]
struct C {
	b: B,
}

#[repr(transparent)]
struct D {
	pub c: C,
}

#[repr(inherit(D))]
struct E {
	c: A,
}
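The `#[repr(transparent)]` part of this chain can already be exercised on stable Rust. The sketch below gives `A` a made-up `value` field (its fields are elided above) and uses a hypothetical helper `wrap`; the final `#[repr(inherit(D))]` step is left out because the attribute does not exist.

```rust
use std::mem::{size_of, transmute};

struct A {
    value: u64, // hypothetical field
}

#[repr(transparent)]
struct B {
    a: A,
}

#[repr(transparent)]
struct C {
    b: B,
}

#[repr(transparent)]
struct D {
    c: C,
}

// Each wrapper adds no bytes, so the sizes agree along the whole chain.
const _: () = assert!(size_of::<A>() == size_of::<D>());

/// Wraps an `A` in all three transparent layers at once.
fn wrap(a: A) -> D {
    // SAFETY: every `repr(transparent)` wrapper has the same layout as its
    // single field, so `D` has the same layout as `A`.
    unsafe { transmute::<A, D>(a) }
}
```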

Reference-level explanation

I have no knowledge of how the compiler computes the layout of a type, but I have seen some parts of the miri codebase allocating layout for types (I believe these are the same). I hope that the compiler has a bit more information than just the alignment and size of a type, otherwise we would need to add that.

Compiler changes

In order to provide the feature, the compiler needs to

  1. compute the layout of all types that are neither marked with #[repr(inherit(..))] nor (transitively) contain fields whose types are marked with it
  2. iteratively compute the layout of all types marked with #[repr(inherit(..))]
  • check that the fields with the same name have compatible layouts and that all fields are present
  • check that no conflicting #[repr(inherit(..))] was specified

There should be an iteration limit; to begin with, we could choose something low like 16. There should also be an option similar to #[recursion_limit()], say #[repr_inherit_iter_limit()], to select the limit.
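As a toy model of the iteration limit (not a description of how rustc computes layouts), the resolution of inherit chains could be sketched as follows. The map of declarations, the function `resolve_layout_source`, and the error type are all hypothetical; field checks and conflict detection are omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ResolveError {
    UnknownType(String),
    LimitExceeded(String),
}

/// Follows `inherit` links until a type that owns its layout is found.
///
/// `decls` maps a type name to `None` (the type has its own layout) or to
/// `Some(target)` (the type is marked `#[repr(inherit(target))]`).
/// At most `limit` links are followed, which also catches cycles.
fn resolve_layout_source<'a>(
    decls: &HashMap<&'a str, Option<&'a str>>,
    start: &'a str,
    limit: usize,
) -> Result<&'a str, ResolveError> {
    let mut current = start;
    for _ in 0..=limit {
        match decls.get(current) {
            None => return Err(ResolveError::UnknownType(current.to_string())),
            Some(None) => return Ok(current),
            Some(Some(next)) => current = *next,
        }
    }
    Err(ResolveError::LimitExceeded(start.to_string()))
}
```

With a limit of 16, a chain such as C inheriting from B inheriting from A resolves to A, while two types that inherit from each other run into the limit and produce an error.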

Semver

The attribute may only reference types whose fields are all visible in the current module, and it may not reference types marked with #[non_exhaustive]. This applies only to the type mentioned in the parentheses, so the struct you define can have private fields. The reason is semantic versioning: the attribute relies on the referenced type not renaming fields, changing field types, or adding fields.

Drawbacks

Why should we not do this?

  • it complicates (more code to maintain, longer compile times) the layout selection process of the compiler
  • adds more difficulty trying to learn rust layouting (should be marginal)

Rationale and alternatives

Repr trait instead of #[repr]

Alternatively we could add a trait to use instead of #[repr] attributes altogether:

pub trait Repr {
	type Layout: sealed::LayoutType;

	const ALIGN: Option<usize>;
	const PACKED: Option<usize>;
}

// all implement sealed::LayoutType
pub mod layout {
	pub struct Rust;
	pub struct C;
	pub struct Transparent;
	pub struct U8;// etc.
	pub struct Inherit<T>(PhantomData<T>);
}

#[repr(...)] would then become an implementation of the trait.

What if this is not implemented

Functionality-wise, this can already be done if one is careful. Using #[repr(C)] and ensuring the same order of fields and transmutability between them also results in transmute-compatible layouts. However, we are losing

  • automatic (and future) layout optimizations
  • layout randomizations of repr(Rust), helping with security
  • compiler assisted layout generation - we are essentially doing the layouting manually (which is error prone and a burden)

Prior art

I (y86-dev) have not been able to find any discussions on this topic on zulip or the internals forum. There are however related topics:

Unresolved questions

  • How does the compiler layout mechanism need to be adjusted?
  • Do we need more complex rules? I wanted to keep it simple.
  • what about the primitive reprs like #[repr(u32)]?

Future possibilities

Project safe transmute

Integration with project-safe-transmute seems to be a good idea: types with repr(inherit(...)) should be soundly transmutable.

Enums and Unions

Allow the repr to be used on other types as well. I do not know how useful this would be, but I think in theory it could be done.

I could definitely use some help, as this would be my first RFC and I am not familiar with the RFC process/phrasing.

One question to ask: we have #[repr(u32)] for primitives, so now we'd have both #[repr(u32)] and #[repr(inherit(u32))]. Are these different?

Is u32 the only type that can have #[repr(u32)]? If yes, then I think it would be of little benefit if we allow #[repr(inherit(u32))], because #[repr(transparent)] + u32 field is still transmute compatible with u32.
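For reference, in current Rust the primitive reprs such as `#[repr(u32)]` are accepted on enums, where they fix the discriminant type, while a `#[repr(transparent)]` wrapper is the existing way to get a struct with the layout of `u32`. A short sketch with made-up names (`Color`, `Id`, `id_to_raw`, `color_size`):

```rust
use std::mem::{size_of, transmute};

// `repr(u32)` on a fieldless enum: the discriminant is stored as a `u32`.
#[repr(u32)]
#[derive(Clone, Copy)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// A struct gets the layout of `u32` through `repr(transparent)` instead.
#[repr(transparent)]
struct Id(u32);

fn id_to_raw(id: Id) -> u32 {
    // SAFETY: `Id` is a `repr(transparent)` wrapper around `u32`.
    unsafe { transmute::<Id, u32>(id) }
}

fn color_size() -> usize {
    size_of::<Color>()
}
```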

We also could add different semantics to #[repr(inherit(u32))] (e.g. allow/deny niches), but I am not sure what to do here.

Some questions that crop up in my mind that any RFC would need to at least discuss:

Is this possible?

type Tuple = (u8, u32);

#[repr(inherit(Tuple))]
struct Struct {
    a: u8,
    b: u32,
}

How about:

type Tuple = ((u8, u32), u16);

#[repr(inherit(Tuple))]
struct Struct {
    a: u8,
    b: u32,
    c: u16,
}

How are fields lined up if the types are not unique and names don't match?

struct A {
    a: u8,
    b: u8,
}

#[repr(inherit(A))]
struct B {
    b: u8, // Would this overlap A::a or A::b? How to control or tell?
    a: u8,
}

What about:

#[repr(transparent)]
struct U8 {
    a: u8,
}

struct A {
    a: u8,
    b: U8,
}

#[repr(inherit(A))]
struct B {
    b: u8, // Would this overlap A::a or A::b? How to control or tell?
    a: U8,
}

Oh I totally forgot tuple types! I think that this will be rather difficult to support, because tuple types essentially have #[repr(Rust)], see the UCG section. And because I am proposing name based matching you cannot do this with structs...

I do not think this is a real problem, you can just declare the struct you need to inherit from.

The field-to-field identification is done via names; if they do not match, an error is emitted.
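A toy sketch of such name-based matching, with fields modeled as (name, type name) pairs, might look like this. The function `match_fields_by_name` is hypothetical, and it only pairs up names; checking that the paired field types are layout compatible would be a separate step.

```rust
/// Pairs each field of the annotated type with the field of the same name in
/// the inherited type. Returns `(index in annotated, index in target)` pairs.
fn match_fields_by_name(
    annotated: &[(&str, &str)],
    target: &[(&str, &str)],
) -> Result<Vec<(usize, usize)>, String> {
    if annotated.len() != target.len() {
        return Err(format!(
            "field count mismatch: {} vs {}",
            annotated.len(),
            target.len()
        ));
    }
    let mut pairs = Vec::with_capacity(annotated.len());
    for (i, (name, _ty)) in annotated.iter().enumerate() {
        // Look the name up in the inherited type; order does not matter.
        match target.iter().position(|(target_name, _)| target_name == name) {
            Some(j) => pairs.push((i, j)),
            None => return Err(format!("no field named `{name}` in the inherited type")),
        }
    }
    Ok(pairs)
}
```

For the `A`/`B` example above, `B::b` is paired with `A::b` and `B::a` with `A::a`, regardless of declaration order.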

This should work, as U8 is transparent (we would map a<->a and b<->b).

Thanks for the good questions, I am going to add these to the draft!
