Pre-RFC / discussion: Legal transmute between repr(Rust) with repr(transparent) members

Salabar · March 17, 2020, 10:02pm

Hello. I'm working on a graph processing crate for Rust. The documentation is quite cryptic at this point, so I hope examples are more or less self-explanatory.

It seems promising so far, but I've recently come up with a way to radically streamline its API, which is unfortunately illegal according to the Rustonomicon. Every graph is a collection of structures such as this one: https://docs.rs/dynamic_graph/0.0.3/src/dynamic_graph/lib.rs.html#75-78 The payload is arbitrary data stored in a graph node; refs is a collection of pointers to other nodes in the same graph.

Since I can't allow users to use raw pointers, a special type called GraphPtr is used in the public API. As you can see, this is simply a #[repr(transparent)] wrapper of an ordinary NonNull pointer, except it uses this crate to ensure that every GraphPtr is always legal to dereference. Ergo, it is exactly the same as a normal pointer at runtime.
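As a rough illustration of the "same as a normal pointer at runtime" claim, here is a minimal sketch. The `GraphPtr` below is a hypothetical stand-in, not the crate's actual definition, and the helpers `same_layout` and `wrap` are invented for this example.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

// Hypothetical stand-in for the crate's GraphPtr; the real type may differ.
#[repr(transparent)]
pub struct GraphPtr<'id, T> {
    ptr: NonNull<T>,
    // Zero-sized brand; it takes no space and does not change the layout.
    _brand: PhantomData<&'id ()>,
}

impl<'id, T> GraphPtr<'id, T> {
    /// Returns the wrapped raw pointer.
    pub fn as_non_null(&self) -> NonNull<T> {
        self.ptr
    }
}

/// True when the wrapper and the bare pointer agree in size and alignment.
pub fn same_layout<T>() -> bool {
    size_of::<GraphPtr<'static, T>>() == size_of::<NonNull<T>>()
        && align_of::<GraphPtr<'static, T>>() == align_of::<NonNull<T>>()
}

/// Wraps a single pointer; nothing happens at runtime beyond moving the pointer.
pub fn wrap<'id, T>(ptr: NonNull<T>) -> GraphPtr<'id, T> {
    GraphPtr { ptr, _brand: PhantomData }
}
```

`#[repr(transparent)]` guarantees this equivalence for the wrapper itself; it says nothing about containers that hold the wrapper, which is what the rest of the post is about.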

Intuitively, this means any collection that holds NonNull should have the same layout as one holding GraphPtr.

If this were the case, this no-op method and its friends would be totally safe.

This actually works on my machine (Win10x64 + Rust 1.40), but it might inexplicably break on any update should the language team decide to do something weird with layout optimization.
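For contrast, two conversions that do not depend on the unspecified `repr(Rust)` layout of a collection can be sketched as follows. This is only a sketch under assumptions: `GraphPtr` is again a hypothetical stand-in, and `as_graph_ptrs` and `into_graph_ptrs` are invented names. It covers slices and owned `Vec`s only, so it does not solve the `&Vec<NonNull<T>>` to `&Vec<GraphPtr<T>>` case the post is asking about.

```rust
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ptr::NonNull;

// Hypothetical stand-in for the crate's GraphPtr; the real type may differ.
#[repr(transparent)]
pub struct GraphPtr<'id, T> {
    ptr: NonNull<T>,
    _brand: PhantomData<&'id ()>,
}

impl<'id, T> GraphPtr<'id, T> {
    pub fn as_non_null(&self) -> NonNull<T> {
        self.ptr
    }
}

/// Views a slice of raw pointers as a slice of wrappers without copying.
/// Note: the brand 'id is chosen freely here; a real API would restrict it.
pub fn as_graph_ptrs<'a, 'id, T>(raw: &'a [NonNull<T>]) -> &'a [GraphPtr<'id, T>] {
    // SAFETY (layout only): a slice is a contiguous run of elements, and
    // repr(transparent) gives both element types the same size and alignment.
    unsafe { std::slice::from_raw_parts(raw.as_ptr().cast::<GraphPtr<'id, T>>(), raw.len()) }
}

/// Converts an owned Vec by rebuilding it from its raw parts,
/// instead of transmuting the Vec struct itself.
pub fn into_graph_ptrs<'id, T>(raw: Vec<NonNull<T>>) -> Vec<GraphPtr<'id, T>> {
    // Keep the original Vec from freeing the buffer we are about to reuse.
    let mut raw = ManuallyDrop::new(raw);
    let (ptr, len, cap) = (raw.as_mut_ptr(), raw.len(), raw.capacity());
    // SAFETY (layout only): the buffer came from a Vec whose element type has
    // the same size and alignment as GraphPtr, and len/cap are unchanged.
    unsafe { Vec::from_raw_parts(ptr.cast::<GraphPtr<'id, T>>(), len, cap) }
}
```

Both functions go through documented raw-parts APIs, so they never assume anything about how the fields of `Vec` itself are ordered.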

If this hack became a safe operation, I would be able to replace a ton of boilerplate methods with the Deref and Index traits, make it trivial to use user-defined graph structures, and do away with the whole concept of Cursors.

I would like to get the following guarantee:

Any generic struct, tuple or enum Type<T, C> must be layout-compatible with Type<Tr, C>, where T and Tr are Sized + Copy types and Tr is a #[repr(transparent)] wrapper of T.

It is a user's responsibility to ensure Tr behaves identically to T inside Type.
Non-generic types with the same fields and compiler-generated types like Futures and Closures don't need to adhere to this guarantee.

I'd like to know:

Is there actual work required to implement my request, besides not breaking stuff that already works?
Is there a realistic scenario when my suggestion blocks an important optimization?

Alternative solutions:

1. Fork std::collections into a separate crate and redefine every struct with #[repr(C)].
2. Write a wrapper for every collection type that casts a normal NonNull into my GraphPtr before giving it to a user.

These two solutions are a hassle to maintain, don't provide a 1-to-1 mapping of interfaces, and may turn out to be non-zero-cost in some scenarios.

3. Add a new #[repr()] which is the same as 'transparent' plus my thing.
4. Add a magical type for wrappers of this kind: SameLayout<T, Marker: !Sized>, which does the same thing, but differently.

I have no idea if 3 and 4 make any difference implementation-wise, but these are the options I'm perfectly fine with if they do.

bill_myers · March 18, 2020, 12:04am

Would it work to use GraphPtr<'static, T> instead of NonNull<T>?
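One way to read this suggestion, sketched under assumptions: if the node stores `GraphPtr<'static, T>`, the stored and exposed collection types differ only in a lifetime parameter, and lifetimes are erased before layout is computed. The `GraphPtr`, `Node`, `push_raw` and `refs` below are hypothetical names for illustration, not the crate's API, and whether handing out an arbitrary brand is sound is a separate question this sketch does not address.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

// Hypothetical stand-in for GraphPtr with an invariant brand lifetime.
#[repr(transparent)]
pub struct GraphPtr<'id, T> {
    ptr: NonNull<T>,
    _brand: PhantomData<fn(&'id ()) -> &'id ()>,
}

impl<'id, T> GraphPtr<'id, T> {
    pub fn as_non_null(&self) -> NonNull<T> {
        self.ptr
    }
}

// Hypothetical node: stored edges carry a 'static placeholder brand.
pub struct Node<N> {
    pub payload: N,
    refs: Vec<GraphPtr<'static, N>>,
}

impl<N> Node<N> {
    pub fn new(payload: N) -> Self {
        Node { payload, refs: Vec::new() }
    }

    /// Stores a raw edge under the placeholder brand.
    pub fn push_raw(&mut self, ptr: NonNull<N>) {
        self.refs.push(GraphPtr { ptr, _brand: PhantomData });
    }

    /// Exposes the stored edges under a caller-chosen brand.
    /// A real API would tie 'id to an anchor instead of leaving it free.
    pub fn refs<'a, 'id>(&'a self) -> &'a Vec<GraphPtr<'id, N>> {
        let stored: *const Vec<GraphPtr<'static, N>> = &self.refs;
        // SAFETY (layout only): the two Vec types differ solely in a lifetime
        // argument, which cannot influence layout.
        unsafe { &*stored.cast::<Vec<GraphPtr<'id, N>>>() }
    }
}
```

Because the element type is the same generic instantiation up to lifetimes, this avoids relying on any guarantee about `Vec<NonNull<T>>` versus `Vec<GraphPtr<T>>`.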

Also you could use an &LCell<'a, T> instead of inventing a GraphPtr<'a, T> pointer (where LCell is the type provided by the qcell crate or a variant that uses generativity).

toc · March 18, 2020, 3:18am

It's being worked on. Main thread:

CAD97 · March 18, 2020, 4:42am

This isn't possible in general as worded. Counterexample adapted from this thread [playground]:

trait Foo {
    type Bar;
}

struct S(u8);
#[repr(transparent)]
struct SS(S);

impl Foo for S {
    type Bar = u8;
}

impl Foo for SS {
    type Bar = u128;
}

#[repr(transparent)]
struct Baz<T: Foo>(T::Bar, std::marker::PhantomData<T>);

fn hole<T>() -> T {
    panic!()
}

fn test() {
    let s: Baz<S> = hole();
    let ss: Baz<SS> = unsafe { std::mem::transmute(s) };
    // ^ cannot transmute between types of different sizes
}

In this example, S and SS are repr(transparent)-equivalent. However, even though Baz<T> is itself #[repr(transparent)], Baz<S> and Baz<SS> are not repr(transparent)-equivalent, as the container can actually contain any associated data.
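The size mismatch can also be observed without the failing transmute. The following variant of the counterexample is a sketch that only measures sizes; the `sizes` helper is added here for illustration and is not part of the original post.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

pub trait Foo {
    type Bar;
}

pub struct S(pub u8);
#[repr(transparent)]
pub struct SS(pub S);

impl Foo for S {
    type Bar = u8;
}

impl Foo for SS {
    type Bar = u128;
}

// The container's only sized field depends on an associated type of T.
#[repr(transparent)]
pub struct Baz<T: Foo>(pub T::Bar, pub PhantomData<T>);

/// Sizes of S, SS, Baz<S> and Baz<SS>, in that order.
pub fn sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<S>(),
        size_of::<SS>(),
        size_of::<Baz<S>>(),
        size_of::<Baz<SS>>(),
    )
}
```

S and SS are each one byte, yet Baz<S> wraps a u8 while Baz<SS> wraps a u128, so the two instantiations cannot be layout-compatible.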

To make this even more startling, this probably won't even need a trait bound at some point in the future:

#[repr(transparent)]
struct Baz<T>([u8; {std::any::type_name::<T>().len()}], std::marker::PhantomData<T>);

What you want here would have to be an opt-in documented promise from containers to not do any of these tricks, and a language guarantee of roughly "two generic instantiations of the same #[repr(Rust)] structure are considered layout-compatible if the only difference is that they contain differently-named but layout-compatible structures." If a structure doesn't promise to uphold that requirement, breaking this property in the future is a non-breaking change.

Hey look, that's me! Your use case may actually be better served by closure-based generativity than by the macro-based kind, to be completely honest.

Salabar · March 18, 2020, 5:05am

Darn. I have thought about using the struct itself, but decided that adding a random lifetime just to appease the type checker would be confusing. I totally forgot 'static exists. Thank you for making me feel dumb. :o)

Salabar · March 18, 2020, 5:20am

Closures might get awkward when sharing data between two or more graphs in the same scope. And making an Anchor a simple object on the stack provides a nice trick of foregoing garbage collection through mem::forget which is a desirable property.

RalfJung · March 18, 2020, 12:41pm

Well, we'd first need a language-level guarantee before libraries can even make such a statement.

And I agree having such a guarantee would make sense. I think this corresponds roughly to this UCG issue:

github.com/rust-lang/unsafe-code-guidelines

Deterministic (but undefined) layout

Opened by nikomatsakis on 11 Oct 2018, 05:38PM UTC. Labels: A-layout, S-not-opsem, T-lang

From #31: Can we say that layout is some deterministic function of a certain, fixed set of inputs? This would allow you to be sure that if you do not alter those inputs, your struct layout would not change, even if it meant that you can't predict precisely what it will be. For example, we might say that struct layout is a function of the struct's generic types and its substitutions, full stop -- this would imply that any two structs with the same definition are laid out the same. This might interfere with our ability to do profile-guided layout or to analyze how a struct is used and optimize based on that. (Some would call that a feature.) Also, this presumably applies to enums as well as other types.

steffahn · December 22, 2024, 5:03pm

This topic was automatically closed 540 days after the last reply. New replies are no longer allowed.
