Add `repr(inherit($type))`

y86-dev · May 13, 2022, 11:46am

Before writing a pre-RFC, I want to see if there are any fundamental problems with this idea.

Introduce a new repr variant: #[repr(inherit($type))].

This representation ensures that the annotated type has the same layout as $type. This allows safe transmutations between the two types.

Example:

struct Foo {
    a: MaybeUninit<u8>,
    b: MaybeUninit<u32>,
}

#[repr(inherit(Foo))]
struct Bar {
    b: u32,
    a: u8,
}

Without repr(inherit), we would need to use #[repr(C)] (because otherwise Foo and Bar are not guaranteed to have compatible layout) and we would also need to ensure that the order of the fields is the same. When many structs with many fields (potentially with #[cfg(...)]) are involved this is error-prone. It also requires the programmer to think about the effects that the order has on the layout/size of the type.
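For comparison, here is a minimal sketch of what the workaround looks like on stable Rust today. The helper name `init` and the compile-time assertions are illustrative additions, not part of the proposal; the struct names follow the example above.

```rust
use std::mem::{align_of, size_of, transmute, MaybeUninit};

// Both structs are #[repr(C)] and list their fields in the same order.
// That manual discipline is what makes the transmute below sound today.
#[repr(C)]
pub struct Foo {
    pub a: MaybeUninit<u8>,
    pub b: MaybeUninit<u32>,
}

#[repr(C)]
pub struct Bar {
    pub a: u8,
    pub b: u32,
}

// Compile-time sanity checks. They are necessary but not sufficient:
// equal size and alignment do not prove that the field offsets match.
const _: () = assert!(size_of::<Foo>() == size_of::<Bar>());
const _: () = assert!(align_of::<Foo>() == align_of::<Bar>());

/// Hypothetical helper: writes every field of a `Foo`, then reinterprets it as a `Bar`.
pub fn init(a: u8, b: u32) -> Bar {
    let mut foo = Foo {
        a: MaybeUninit::uninit(),
        b: MaybeUninit::uninit(),
    };
    foo.a.write(a);
    foo.b.write(b);
    // SAFETY: both types are #[repr(C)] with the same field order,
    // `MaybeUninit<T>` has the same layout as `T`, and both fields were written.
    unsafe { transmute::<Foo, Bar>(foo) }
}
```

Swapping the field order in only one of the two structs would still compile here, which is exactly the kind of silent mistake the proposed attribute is meant to rule out.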

Here are some more examples of how it interacts with some other features.

#[repr(C)]
struct Foo {
    a: u8,
    b: u32,
}

#[repr(C)] // you are allowed to mention other reprs, if they are compatible
#[repr(inherit(Foo))]
struct Foo1 {
    a: u8,
    b: u32,
}

#[repr(C)]
#[repr(inherit(Foo))]
struct Foo2 {
    b: u32,
    a: u8,
//  ^ error: fields in the wrong order
// this only occurs, because we explicitly specified `#[repr(C)]`
}

#[repr(inherit(Foo))]
#[repr(inherit(Foo1))]
// only without error if they are compatible
// maybe allow `#[repr(inherit(Foo, Foo1))]`
struct Foo3 {
    b: u32,
    a: u8,
}

A more elaborate example, why this is needed:

I want to create a library that enables safe pinned initialization (see my older post). The core idea is to prevent misuse of uninitialized data with the type system (you cannot use FooUninit in a place where you need Foo). But when you want to initialize pinned data, you need to transmute the pointer to the correct type. This is why the two types (Foo and FooUninit) need to have the same layout. At the moment I am using #[repr(C)] and the same order of fields. This works, but is not ideal: layout optimizations are not in effect, and a user cannot specify #[repr(transparent)], because I need #[repr(C)]. Here is some code that would be generated by my macro library:

#[repr(C)]
struct FooUninit {
    ptr: MaybeUninit<*const i32>,
    data: [i32; 64],
    bar: BarUninit,
}

#[repr(C)]
struct Foo {
    ptr: *const i32,
    data: [i32; 64],
    bar: Bar,
}

(I am currently ignoring the problem of mutable aliasing)
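A rough sketch of the pointer cast this relies on, using the current #[repr(C)] workaround. `BarUninit`/`Bar` are not shown in the post, so the single-field versions here are placeholders, and `finish_init` is a hypothetical name. To stay clear of the aliasing question mentioned above, `ptr` points at external data rather than into the struct itself.

```rust
use std::mem::MaybeUninit;

// Placeholder nested types; the post does not show their definitions.
#[repr(C)]
pub struct BarUninit {
    pub x: MaybeUninit<u16>,
}

#[repr(C)]
pub struct Bar {
    pub x: u16,
}

#[repr(C)]
pub struct FooUninit {
    pub ptr: MaybeUninit<*const i32>,
    pub data: [i32; 64],
    pub bar: BarUninit,
}

#[repr(C)]
pub struct Foo {
    pub ptr: *const i32,
    pub data: [i32; 64],
    pub bar: Bar,
}

/// Hypothetical helper: initializes the remaining fields in place, then
/// reinterprets the allocation as the fully initialized type.
pub fn finish_init(mut slot: Box<FooUninit>, target: &'static i32) -> Box<Foo> {
    slot.ptr.write(target as *const i32);
    slot.bar.x.write(1);
    let raw: *mut FooUninit = Box::into_raw(slot);
    // SAFETY: both types (and their nested types) are #[repr(C)] with the same
    // field order, so size, alignment and offsets match, and every
    // `MaybeUninit` field has been written above.
    unsafe { Box::from_raw(raw.cast::<Foo>()) }
}
```

The cast never moves the data, which is the property pinned initialization needs; the value keeps its address while its type changes.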

This might also work well together with RFC 2835, project-safe-transmute (rfcs/2835-project-safe-transmute.md at master in rust-lang/rfcs on GitHub).

chrefr · May 15, 2022, 12:39am

I would prefer to see a Repr trait or similar, which would allow expressing complex representations as trait bounds.

y86-dev · May 15, 2022, 4:45pm

What kind of complex representations would you like to express? What would the trait look like? How would the compiler be able to ensure that types do indeed have the correct layout?

I think keeping this feature simple is a benefit (allowing two types to have the same layout is going to be the most complex layout repr anyway). The compiler does not have to handle very complex representation relations; it only has to know whether layout(A) == layout(B) holds in a stable sense (so that changing the compiler version or enabling layout randomization does not change the layout equivalence). The value of layout(A) itself should still be free to vary. In particular, layout(A) == layout(B) is not guaranteed for:

struct A {
    _0: Field0,
    _1: Field1,
    _2: Field2,
    _3: Field3,
}

struct B {
    _0: Field0,
    _1: Field1,
    _2: Field2,
    _3: Field3,
}
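By contrast, the equivalence is guaranteed once both types opt into #[repr(C)]. A small sketch that makes the offsets observable; the concrete types behind `Field0`..`Field3` are arbitrary placeholders and the two helper functions are illustrative. It uses `std::mem::offset_of!`, which needs Rust 1.77 or later.

```rust
use std::mem::offset_of;

// Placeholder field types; the post leaves Field0..Field3 unspecified.
pub type Field0 = u8;
pub type Field1 = u64;
pub type Field2 = u16;
pub type Field3 = u32;

#[repr(C)]
pub struct A {
    pub _0: Field0,
    pub _1: Field1,
    pub _2: Field2,
    pub _3: Field3,
}

#[repr(C)]
pub struct B {
    pub _0: Field0,
    pub _1: Field1,
    pub _2: Field2,
    pub _3: Field3,
}

/// Byte offsets of A's fields, in declaration order.
pub fn offsets_a() -> [usize; 4] {
    [offset_of!(A, _0), offset_of!(A, _1), offset_of!(A, _2), offset_of!(A, _3)]
}

/// Byte offsets of B's fields, in declaration order.
pub fn offsets_b() -> [usize; 4] {
    [offset_of!(B, _0), offset_of!(B, _1), offset_of!(B, _2), offset_of!(B, _3)]
}
```

With the two #[repr(C)] attributes removed, `offsets_a() == offsets_b()` may well still hold in practice, but nothing in the language promises it.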

If you have a good example of why you would need more complex repr then I would like to see it. But I think that you can already do a lot with this new feature.

If you would like to create code like

fn safe_transmute<A, B>(a: A) -> B
where
    A: SameLayout<B>,
{
    unsafe { core::mem::transmute(a) }
}

Then you should look at project-safe-transmute, as that is outside the scope of this proposal (although it could probably be nicely integrated).
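As a side note, the snippet above would not compile as written: `core::mem::transmute` is rejected for generic `A` and `B` because their sizes cannot be checked before monomorphization. Below is a sketch that does compile, assuming a hand-written `unsafe` marker trait. `SameLayout` is hypothetical and not implemented by the compiler, and `Meters` is just an example type.

```rust
use std::mem::{size_of, transmute_copy, ManuallyDrop};

/// Hypothetical marker trait. Implementors promise that every valid `Self`
/// is also a valid `B` and that both types have the same size.
pub unsafe trait SameLayout<B>: Sized {}

pub fn safe_transmute<A, B>(a: A) -> B
where
    A: SameLayout<B>,
{
    // Runtime guard, since the generic sizes are not checked at compile time.
    assert_eq!(size_of::<A>(), size_of::<B>());
    // Prevent `a` from being dropped; ownership moves into the result.
    let a = ManuallyDrop::new(a);
    // SAFETY: guaranteed by the `SameLayout` contract and the size check above.
    unsafe { transmute_copy::<A, B>(&*a) }
}

/// Example type with a layout guarantee that exists today.
#[repr(transparent)]
pub struct Meters(pub u32);

// SAFETY: `Meters` is #[repr(transparent)] over `u32`.
unsafe impl SameLayout<u32> for Meters {}
```

The whole soundness burden sits on the `unsafe impl`; the point of the proposal and of project-safe-transmute is to move that check into the compiler.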

scottmcm · May 15, 2022, 6:53pm

Assuming that the safe transmute project happens, that could also be done with just something like

assert_impl_all!(Foo: SafeTransmuteFrom<Bar>);

(using assert_impl_all from the static_assertions crate)

And thus it's not obvious to me that this should be a full language feature.
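To illustrate the shape of such a compile-time check without any external crate, here is a sketch using a hypothetical hand-implemented trait `SameLayoutAs` in place of the not-yet-stable `SafeTransmuteFrom`. The helper `assert_same_layout` is likewise illustrative.

```rust
/// Hypothetical stand-in for a compiler-provided transmutability trait.
pub unsafe trait SameLayoutAs<T> {}

#[repr(C)]
pub struct Foo {
    pub a: u8,
    pub b: u32,
}

#[repr(C)]
pub struct Bar {
    pub a: u8,
    pub b: u32,
}

// SAFETY: both types are #[repr(C)] with identical fields in identical order.
unsafe impl SameLayoutAs<Bar> for Foo {}

/// Compiles only if `A: SameLayoutAs<B>`; does nothing at run time.
pub const fn assert_same_layout<A: SameLayoutAs<B>, B>() {}

// Evaluated at compile time: removing the `unsafe impl` above turns this
// line into a build error.
const _: () = assert_same_layout::<Foo, Bar>();
```

With a compiler-implemented trait, the `unsafe impl` would disappear and the assertion alone would carry the guarantee.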

y86-dev · May 15, 2022, 7:00pm

That would be great! I have not had the time to read all the details of the safe transmute project, so I do not know what it does exactly.

It also seems like I was looking in the wrong places: the repository hasn't had a commit for 2 years (so it seems the discussion is on Zulip, but following it there probably means a lot of reading).

scottmcm · May 15, 2022, 7:20pm

I don't know that the precise form it will take has been decided, but here's some progress towards part of it:

github.com/rust-lang/rust

Initial implementation of transmutability trait.

rust-lang:master ← jswrenn:transmute

opened 01:01AM - 25 Dec 21 UTC

jswrenn

+6026 -2

*T'was the night before Christmas and all through the codebase, not a miri was stirring — no hint of `unsafe`!*

This PR provides an initial, **incomplete** implementation of *[MCP 411: Lang Item for Transmutability](https://github.com/rust-lang/compiler-team/issues/411)*. The `core::mem::BikeshedIntrinsicFrom` trait provided by this PR is implemented on-the-fly by the compiler for types `Src` and `Dst` when the bits of all possible values of type `Src` are safely reinterpretable as a value of type `Dst`.

What this PR provides is:

- [x] [support for transmutations involving primitives](https://github.com/jswrenn/rust/tree/transmute/src/test/ui/transmutability/primitives)
- [x] [support for transmutations involving arrays](https://github.com/jswrenn/rust/tree/transmute/src/test/ui/transmutability/arrays)
- [x] [support for transmutations involving structs](https://github.com/jswrenn/rust/tree/transmute/src/test/ui/transmutability/structs)
- [x] [support for transmutations involving enums](https://github.com/jswrenn/rust/tree/transmute/src/test/ui/transmutability/enums)
- [x] [support for transmutations involving unions](https://github.com/jswrenn/rust/tree/transmute/src/test/ui/transmutability/unions)
- [x] [support for weaker validity checks](https://github.com/jswrenn/rust/blob/transmute/src/test/ui/transmutability/unions/should_permit_intersecting_if_validity_is_assumed.rs) (i.e., `Assume::VALIDITY`)
- [x] visibility checking

What isn't yet implemented:

- [ ] transmutability options passed using the `Assume` struct
- [ ] [support for references](https://github.com/jswrenn/rust/blob/transmute/src/test/ui/transmutability/references.rs)
- [ ] smarter error messages

These features will be implemented in future PRs.

y86-dev · May 15, 2022, 7:25pm

I think this is not enough, because if Foo and Bar are #[repr(Rust)], then e.g. layout randomization would not give the two types the same layout, even if their fields are identical/transmutable.

scottmcm · May 15, 2022, 9:58pm

Well, but then the assert would fail. (Or maybe it always would, without repr(C).)

y86-dev · May 15, 2022, 10:00pm

Then the safe transmute project might not be enough. That is why I want repr(inherit(...)). It is to guarantee that two types have the same layout.

y86-dev · May 21, 2022, 4:54pm

I want to begin writing a pre-RFC and I want to add this in the alternatives section. Did you have a clear idea of how this would be implemented?

chrefr · May 22, 2022, 5:32am

Not completely clear, but something like

pub trait HasLayout {
    type Layout: sealed::LayoutType;
    
    const ALIGN: Option<usize>;
    const PACKED: Option<usize>;
}

mod sealed {
    pub trait LayoutType {}
}

// #[repr(rust)] or no repr generates an impl HasLayout<Layout = RustLayout>.
pub struct RustLayout;
impl sealed::LayoutType for RustLayout {}
pub struct CLayout;
// ...
pub struct TransparentLayout;
// ...
pub struct U8Layout; // Etc.
// ...

// There may be problems with layouts that are not applicable, for example primitive representations for non-enums or `#[repr(transparent)]` for more than one field.
// There may be a need for post-monomorphization errors to handle that, I'm not sure.
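A compilable sketch of how such a trait could be used as a bound. The impl for `Header` is written by hand here to stand in for what `#[repr(C, align(8))]` would emit under this idea; `Header` and `requested_align` are made-up names, and nothing below changes how the compiler actually lays out the type.

```rust
mod sealed {
    pub trait LayoutType {}
}

pub struct RustLayout;
pub struct CLayout;
impl sealed::LayoutType for RustLayout {}
impl sealed::LayoutType for CLayout {}

pub trait HasLayout {
    type Layout: sealed::LayoutType;

    const ALIGN: Option<usize>;
    const PACKED: Option<usize>;
}

// Example type; the attribute is what really fixes the layout today.
#[repr(C, align(8))]
pub struct Header {
    pub tag: u8,
    pub len: u32,
}

// Hand-written stand-in for the impl that the attribute would generate.
impl HasLayout for Header {
    type Layout = CLayout;

    const ALIGN: Option<usize> = Some(8);
    const PACKED: Option<usize> = None;
}

/// Accepts only types whose layout kind is `CLayout`, expressed as a bound.
pub fn requested_align<T: HasLayout<Layout = CLayout>>() -> Option<usize> {
    T::ALIGN
}
```

Calling `requested_align` with a type whose `Layout` is `RustLayout` would be a type error, which is the "representation as trait bound" idea in miniature.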

y86-dev · May 22, 2022, 10:33am

But IIUC you still need to specify the layout using #[repr(...)], right? So it is essentially just a marker trait that you could use in some trait bounds. How would you specify my proposed #[repr(inherit(...))]?

If it is a marker trait, then this is probably not the right RFC for that, as I only want to add the additional repr.

chrefr · May 22, 2022, 11:03am

No, #[repr()] becomes like a macro that emits an implementation of the trait. You can implement the trait to change the layout.

y86-dev · May 22, 2022, 11:13am

OK, that makes sense. I do not know how the current mechanism of layout selection works in the compiler, so I do not know if this is a feasible architecture change. I will add it to the pre-RFC.

y86-dev · June 6, 2022, 6:02pm

Here is my current draft for the RFC:

Summary

Introduce a new repr variant: #[repr(inherit($type))].

When a type A is marked with this #[repr(inherit(B))], the compiler ensures that A and B have the same layout.

Motivation

#[repr(inherit($type))] allows transmuting between two types with transmutable fields without requiring the use of #[repr(C)]. This allows the compiler to still choose the optimal layout, while providing a simple way to have predictable layouts.

Examples

struct FooUninit {
	pub a: MaybeUninit<u8>,
	pub b: MaybeUninit<u32>,
}

#[repr(inherit(FooUninit))]
struct FooInit {
	b: u32,
	a: u8,
}

Without repr(inherit), both structs would need #[repr(C)] and we would have to list the fields in the same order. When many fields and structs are involved (possibly with #[cfg(...)] enabling/disabling fields) this is error-prone. It also requires that the programmer chooses a good order of the fields to avoid padding bytes.
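To make the padding cost concrete, here is a small sketch with two #[repr(C)] structs that differ only in field order. The struct names and the `sizes` helper are illustrative; the sizes in the comments assume a typical target where `u32` has an alignment of 4.

```rust
use std::mem::size_of;

// Fields in "natural" declaration order: 3 padding bytes after `a`
// and 3 more after `c` (12 bytes total on typical targets).
#[repr(C)]
pub struct DeclaredOrder {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

// Same fields, reordered by hand so the small fields share one word
// (8 bytes total on typical targets).
#[repr(C)]
pub struct HandOrdered {
    pub b: u32,
    pub a: u8,
    pub c: u8,
}

/// Returns the sizes of both variants, in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<DeclaredOrder>(), size_of::<HandOrdered>())
}
```

Under #[repr(C)] the programmer has to find the second ordering manually and keep it identical in both structs; with the default representation the compiler is free to pick it.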

Guide-level explanation

`repr(inherit($type))`

This repr attribute ensures that the type the attribute is found on has the same in-memory representation as the type given in the attribute. This means that it is sound to mem::transmute the two types to and from each other (you still need to ensure custom invariants are upheld). The two types of course need to have the same size and specify the same name for each field. The order of the fields may be different but the types need to be layout compatible. For example:

struct FooUninit {
	pub a: MaybeUninit<u8>,
	pub b: MaybeUninit<u32>,
}

#[repr(inherit(FooUninit))]
struct FooInit {
	b: u32,
	a: u8,
}

The layout of Foo{Uninit, Init} is repr(Rust) (implicit). It will probably be laid out in a way that avoids unnecessary padding bytes and the u32 will come first.

This repr allows you to write layout dependent types that still can have memory layout optimizations handled by the compiler.

The fields of the two types need to be layout compatible. This means that they are

the same type
two types A and B where A has the transitive #[repr(inherit(B))] attribute
two types A and B where A has the transitive #[repr(transparent)] attribute and contains only one field of type B

"Transitive" in this context means that the attribute behaves transitively, so the following is legal:

struct A {
	// ..
}

#[repr(transparent)]
struct B {
	a: A,
}

#[repr(transparent)]
struct C {
	b: B,
}

#[repr(transparent)]
struct D {
	pub c: C,
}

#[repr(inherit(D))]
struct E {
	c: A,
}
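The #[repr(transparent)] half of this chain can already be checked on stable Rust. A sketch, assuming a concrete `u64` payload for `A` (the draft leaves its fields open) and a hypothetical helper `unwrap_all`:

```rust
use std::mem::{align_of, size_of, transmute};

// Placeholder payload; the draft does not specify A's fields.
pub struct A {
    pub value: u64,
}

#[repr(transparent)]
pub struct B {
    pub a: A,
}

#[repr(transparent)]
pub struct C {
    pub b: B,
}

#[repr(transparent)]
pub struct D {
    pub c: C,
}

// The wrappers add nothing to the layout, at any nesting depth.
const _: () = assert!(size_of::<D>() == size_of::<A>());
const _: () = assert!(align_of::<D>() == align_of::<A>());

/// Hypothetical helper: peels off all three wrappers in one step.
pub fn unwrap_all(d: D) -> A {
    // SAFETY: B, C and D are each #[repr(transparent)] over a single field,
    // so D has the same layout as A.
    unsafe { transmute::<D, A>(d) }
}
```

The proposed rule would let `E` rely on this same chain when its field `c: A` is matched against `D`'s field `c: C`.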

Reference-level explanation

I have no knowledge of how the compiler computes the layout of a type, but I have seen some parts of the miri codebase allocating layout for types (I believe these are the same). I hope that the compiler has a bit more information than just the alignment and size of a type, otherwise we would need to add that.

Compiler changes

In order to provide the feature, the compiler needs to

compute the layout of all types that are neither marked with #[repr(inherit(..))] nor (transitively) contain fields of types marked with it
iteratively compute the layout of all types marked with #[repr(inherit(..))]

check that the fields with the same name have compatible layouts and that all fields are present
check that no conflicting #[repr(inherit(..))] was specified

There should be an iteration limit; to begin with, we could choose something low like 16. There should also be an option similar to #[recursion_limit()], say #[repr_inherit_iter_limit()], to select the limit.
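A toy model of the iteration, only to make the role of the limit concrete. It is not based on how rustc computes layouts: types are plain names, the map records which type (if any) each one inherits from, and `resolve_layout_source` and `ResolveError` are invented for this sketch.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// The chain mentions a type that was never declared.
    UnknownType(String),
    /// More than `limit` inherit edges were needed (this includes cycles).
    LimitExceeded,
}

/// Follows `inherit` edges from `start` until it reaches a type without
/// `repr(inherit)`, following at most `limit` edges.
pub fn resolve_layout_source<'a>(
    inherits: &HashMap<&'a str, Option<&'a str>>,
    start: &'a str,
    limit: usize,
) -> Result<&'a str, ResolveError> {
    let mut current = start;
    for _ in 0..=limit {
        match inherits.get(current) {
            None => return Err(ResolveError::UnknownType(current.to_string())),
            // No repr(inherit): this type's own layout is the source.
            Some(None) => return Ok(current),
            // repr(inherit(parent)): keep walking.
            Some(Some(parent)) => current = *parent,
        }
    }
    Err(ResolveError::LimitExceeded)
}
```

In this model a cycle such as `X` inheriting from `Y` and `Y` from `X` is reported through the same limit, so no separate cycle detection is needed.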

Semver

The attribute may only reference types whose fields are all visible to the current module, and it may not reference types marked #[non_exhaustive]. This restriction applies only to the type named in the parentheses, so the struct you define can have private fields. The reason is semantic versioning: the attribute relies on the referenced type not renaming fields, changing field types, or adding fields.

Drawbacks

Why should we not do this?

it complicates (more code to maintain, longer compile times) the layout selection process of the compiler
adds more difficulty when learning Rust's layout rules (should be marginal)

Rationale and alternatives

`Repr` trait instead of `#[repr]`

Alternatively we could add a trait to use instead of #[repr] attributes altogether:

pub trait Repr {
	type Layout: sealed::LayoutType;

	const ALIGN: Option<usize>;
	const PACKED: Option<usize>;
}

// all implement sealed::LayoutType
pub mod layout {
	pub struct Rust;
	pub struct C;
	pub struct Transparent;
	pub struct U8;// etc.
	pub struct Inherit<T>(PhantomData<T>);
}

#[repr(...)] would then become an implementation of the trait.
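A sketch of how `Inherit<T>` could surface in a bound under this alternative. The impls are written by hand to stand in for what the attributes would emit, the unit structs are placeholders for the real types, and `layout_source` is a made-up helper; none of this affects the actual layout.

```rust
mod sealed {
    pub trait LayoutType {}
}

pub mod layout {
    use super::sealed::LayoutType;
    use std::marker::PhantomData;

    pub struct Rust;
    pub struct Inherit<T>(PhantomData<T>);

    impl LayoutType for Rust {}
    impl<T> LayoutType for Inherit<T> {}
}

pub trait Repr {
    type Layout: sealed::LayoutType;

    const ALIGN: Option<usize>;
    const PACKED: Option<usize>;
}

// Placeholder types standing in for the structs from the examples.
pub struct FooUninit;
pub struct FooInit;

// Stand-in for "no repr attribute".
impl Repr for FooUninit {
    type Layout = layout::Rust;
    const ALIGN: Option<usize> = None;
    const PACKED: Option<usize> = None;
}

// Stand-in for `#[repr(inherit(FooUninit))]`.
impl Repr for FooInit {
    type Layout = layout::Inherit<FooUninit>;
    const ALIGN: Option<usize> = None;
    const PACKED: Option<usize> = None;
}

/// Only callable when `T` inherits its layout from `U`.
pub fn layout_source<T, U>() -> &'static str
where
    T: Repr<Layout = layout::Inherit<U>>,
{
    std::any::type_name::<U>()
}
```

Here `layout_source::<FooInit, FooUninit>()` type-checks, while `layout_source::<FooUninit, FooInit>()` does not.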

What if this is not implemented

Functionality-wise, this can already be done if one is careful. Using #[repr(C)] and ensuring the same order of fields and transmutability between them also results in transmute-compatible layouts. However, we are losing

automatic (and future) layout optimizations
layout randomizations of repr(Rust), helping with security
compiler assisted layout generation - we are essentially doing the layouting manually (which is error prone and a burden)

Prior art

I (y86-dev) have not been able to find any discussions on this topic on zulip or the internals forum. There are however related topics:

Unresolved questions

How does the compiler layout mechanism need to be adjusted?
Do we need more complex rules? I wanted to keep it simple.
What about the primitive reprs like #[repr(u32)]?

Future possibilities

Project safe transmute

Integration with project-safe-transmute seems to be a good idea: types with repr(inherit(...)) should be soundly transmutable.

Enums and Unions

Allow the repr to be used on other types as well. I do not know how useful this would be, but I think in theory it could be done.

y86-dev · June 6, 2022, 6:03pm

I could definitely use some help, as this would be my first RFC and I am not familiar with the RFC process/phrasing.

CAD97 · June 6, 2022, 6:06pm

One question to ask: we have #[repr(u32)] for primitives, so now we'd have both #[repr(u32)] and #[repr(inherit(u32))]. Are these different?

y86-dev · June 6, 2022, 6:15pm

Is u32 the only type that can have #[repr(u32)]? If yes, then I think it would be of little benefit if we allow #[repr(inherit(u32))], because #[repr(transparent)] + u32 field is still transmute compatible with u32.

We also could add different semantics to #[repr(inherit(u32))] (e.g. allow/deny niches), but I am not sure what to do here.
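For context, on current Rust the primitive reprs such as #[repr(u32)] are accepted on enums, where they fix the discriminant type, while a struct gets a u32-compatible layout through #[repr(transparent)]. A small sketch of both; `Color`, `Id` and `discriminant` are example names.

```rust
use std::mem::size_of;

// #[repr(u32)] on a fieldless enum: the discriminant is stored as a u32.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// A struct cannot use #[repr(u32)]; #[repr(transparent)] gives it
// the same layout as its single u32 field instead.
#[repr(transparent)]
pub struct Id(pub u32);

/// Reads the discriminant of a `Color` as a `u32`.
pub fn discriminant(c: Color) -> u32 {
    c as u32
}
```

Note that the enum is not transmutable from an arbitrary `u32`, since only three bit patterns are valid, so #[repr(u32)] and a hypothetical #[repr(inherit(u32))] would not mean the same thing.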

mathstuf · June 6, 2022, 6:37pm

Some questions that crop up in my mind that any RFC would need to at least discuss:

Is this possible?

type Tuple = (u8, u32);

#[repr(inherit(Tuple))]
struct Struct {
    a: u8,
    b: u32,
}

How about:

type Tuple = ((u8, u32), u16);

#[repr(inherit(Tuple))]
struct Struct {
    a: u8,
    b: u32,
    c: u16,
}

How are fields lined up if the types are not unique and names don't match?

struct A {
    a: u8,
    b: u8,
}

#[repr(inherit(A))]
struct B {
    b: u8, // Would this overlap A::a or A::b? How to control or tell?
    a: u8,
}

What about:

#[repr(transparent)]
struct U8 {
    a: u8,
}

struct A {
    a: u8,
    b: U8,
}

#[repr(inherit(A))]
struct B {
    b: u8, // Would this overlap A::a or A::b? How to control or tell?
    a: U8,
}

y86-dev · June 6, 2022, 6:59pm

Oh I totally forgot tuple types! I think that this will be rather difficult to support, because tuple types essentially have #[repr(Rust)], see the UCG section. And because I am proposing name based matching you cannot do this with structs...

I do not think this is a real problem, you can just declare the struct you need to inherit from.

The field to field identification is done via names, if they do not match, an error is emitted.

This should work, as U8 is transparent (we would map a<->a and b<->b).
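A toy sketch of the name-based matching described here, with fields reduced to their names. `match_fields` is a hypothetical helper; the layout-compatibility check between the matched field types is deliberately left out.

```rust
/// For each field of `derived` (in declaration order), returns the index of
/// the same-named field in `base`, or an error if the names do not line up.
pub fn match_fields(base: &[&str], derived: &[&str]) -> Result<Vec<usize>, String> {
    if base.len() != derived.len() {
        return Err(format!(
            "field count differs: {} vs {}",
            base.len(),
            derived.len()
        ));
    }
    derived
        .iter()
        .map(|name| {
            base.iter()
                .position(|candidate| candidate == name)
                .ok_or_else(|| format!("no field named `{}` in the inherited type", name))
        })
        .collect()
}
```

For the `A`/`B` example with swapped declaration order, `B::b` maps to `A::b` and `B::a` to `A::a`, regardless of where each field is declared. A tuple, whose fields are only known as `0` and `1`, can never match a struct with named fields under this rule.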

Thanks for the good questions, I am going to add these to the draft!
