Documenting more layout guarantees

@joshlf and I were chatting about adding more standard types to zerocopy. There are a few that are contentious, and it might be worth documenting the answers somewhere:

  • Is it safe to transmute fn(), &T, &mut T, Box<T>, Arc<T>, Rc<T>, and Option<_> of each, into usize? This question applies for both T: Sized and T = [U].
  • Is it safe to transmute uX into Option<NonZeroX> and back?
  • Does ManuallyDrop<T> have the same niches as T, such that if an enum with a T in it has no padding due to niche-filling, replacing it with ManuallyDrop<T> doesn't, either? E.g. is Option<ManuallyDrop<&T>> a single word or two?
  • If a type is declared repr(transparent) in std, is that a stability promise? I assume so but I can't find a citation for that.

Thanks!

  • For the various pointer types, check out rust-lang/unsafe-code-guidelines#286. It's not yet a settled issue (except at compile time, which will be landing soon).
  • For transmuting uX into Option<NonZeroX> and vice versa, I believe that would be safe given the layout guarantees of Option<NonZeroX> — zero is always None.
  • Given ManuallyDrop is #[repr(transparent)], this is in fact the situation. This is stated in the documentation right at the top.
  • #[repr(transparent)] is absolutely a stability promise.
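
As a rough illustration of the uX / Option<NonZeroX> point, here is a minimal sketch for u32. The function names are made up for this example. The transmute versions assume, as stated above, that None is represented as zero; the safe versions do the same conversion without leaning on layout at all.

```rust
use std::num::NonZeroU32;

/// Safe conversion: 0 becomes None, everything else becomes Some.
fn to_option(x: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(x)
}

/// Safe conversion back: None becomes 0.
fn from_option(x: Option<NonZeroU32>) -> u32 {
    x.map_or(0, NonZeroU32::get)
}

/// The transmute under discussion (hypothetical helper).
fn to_option_transmute(x: u32) -> Option<NonZeroU32> {
    // SAFETY: assumes Option<NonZeroU32> has the same size as u32
    // and that the all-zero bit pattern means None.
    unsafe { std::mem::transmute::<u32, Option<NonZeroU32>>(x) }
}

/// The reverse transmute (hypothetical helper).
fn from_option_transmute(x: Option<NonZeroU32>) -> u32 {
    // SAFETY: same assumption as above, read in the other direction.
    unsafe { std::mem::transmute::<Option<NonZeroU32>, u32>(x) }
}
```

If the layout assumption holds, both pairs of functions agree on every input, so the safe pair is usually the better choice outside of zerocopy-style code.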

Thank you! I was not aware that we were considering doing this with ptr2int. I was under the impression that ptr -> int -> ptr produced a fresh pointer of unspecified provenance, similar to materializing a pointer into e.g. hardware registers out of the aether.

For now, I think we can assume that none of the pointer types can be directly transmuted (i.e., without going through ptr2int or int2ptr at the LLVM level).

For the others, thanks. I was pretty certain all those were correct but Josh was a bit nervous about it. =)

Even though it's true for ManuallyDrop, you can't rely on this behavior based on repr(transparent) alone. UnsafeCell and MaybeUninit are repr(transparent), too, and neither preserves niches.
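
One way to see the difference is to compare sizes with and without Option. This is only a sketch of what current rustc does, not a statement of what is guaranteed; option_is_free and niche_report are names invented for this example.

```rust
use std::cell::UnsafeCell;
use std::mem::{size_of, ManuallyDrop, MaybeUninit};

/// Returns true if wrapping `T` in `Option` costs no extra space,
/// i.e. the compiler found a niche in `T` in which to store `None`.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

/// Checks a reference and three repr(transparent) wrappers around it.
fn niche_report() -> [(&'static str, bool); 4] {
    [
        ("&u8", option_is_free::<&'static u8>()),
        ("ManuallyDrop<&u8>", option_is_free::<ManuallyDrop<&'static u8>>()),
        ("MaybeUninit<&u8>", option_is_free::<MaybeUninit<&'static u8>>()),
        ("UnsafeCell<&u8>", option_is_free::<UnsafeCell<&'static u8>>()),
    ]
}
```

On current compilers I would expect the first two entries to report true and the last two false, which matches the point above: repr(transparent) by itself says nothing about niches.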


The documentation of each NonZeroX already says they have the same layout.

TIL UnsafeCell doesn't preserve niches. It makes sense for MaybeUninit<T>, since it isn't a direct wrapper around T but a union. What is the reason for UnsafeCell, though?

Here's the writeup which links to some related rust and UCG issues that contain further discussion.


fn() is fun. It may or may not be a hard error depending on the platform. For T: Sized, this is well-formed, but has a few issues that could make it UB (see "What about: Pointer-to-integer transmutes?", rust-lang/unsafe-code-guidelines#286). For T = [U] this transmute will be a compile-time error since they aren't the same size.
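
For comparison, here is a small sketch that gets at the address with `as` casts instead of transmute, which sidesteps the size question entirely. The helper names are invented for this example, and none of this settles the provenance questions in the UCG issue.

```rust
/// Address of a function pointer via an `as` cast (hypothetical helper).
fn fn_addr(f: fn()) -> usize {
    f as usize
}

/// Address of a thin reference via raw-pointer and integer casts.
fn ref_addr<T>(r: &T) -> usize {
    r as *const T as usize
}

// By contrast, a transmute from a wide reference does not compile,
// because the source and target types have different sizes:
//
//     let s: &[u8] = &[1, 2, 3];
//     let n: usize = unsafe { std::mem::transmute::<&[u8], usize>(s) };
//     // error[E0512]: cannot transmute between types of different sizes
```

The casts only expose an address. Whether an integer obtained this way can be turned back into a usable pointer is exactly what is still under discussion.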

It's interesting that the intent discussed there doesn't match rustc's current behavior, which avoids niche optimization for UnsafeCell and RefCell, but still does it for Cell, RwLock, and Mutex.

Though I suppose for RwLock/Mutex it might be niching a different part of the structure, e.g. the Box for the system impl. So that could match the discussed plan + reënabling niching for Cell.

Look in the docs for std to see what's guaranteed. If it's not specifically mentioned in the docs, it's not guaranteed.

For example, this section for Option: https://doc.rust-lang.org/std/option/index.html#representation

(If you think there's something that should be guaranteed but isn't documented, then send a PR for the docs and the libs team will consider it.)

Mutex and RwLock both contain a Box on some platforms, which does have a niche

What about transmuting into two usizes side-by-side (we're not actually asking about std::mem::transmute, but rather asking about whether the layouts are well-defined and don't contain any padding/poison bytes)

For transmute, that would work, though it would implicate the same issue I mentioned for the thin pointers. However, if it's just layout, then I believe yes, &[T] has the size of 2 usizes, though it's unspecified which is what. All thin pointers, though (T: Sized), are guaranteed to have the same size as usize, and (based on this informal discussion) may not have padding bits or bytes.

I don't have an explanation but that does indeed seem to be the case, and as far as I can tell the current behavior has been the behavior since the PR landed in 1.43. An exception for Cell did not seem to be intended based on the comments in the PR to my reading. Perhaps some follow-up is in order.

Even more interesting, it looks like the current behaviour for a type that contains an UnsafeCell is to allow niche optimizations unless some non-ZST field comes before the UnsafeCell (Rust Playground). That's weird...

This comment seems to imply it was kinda intended.

I saw that comment too, but took it to mean it was a future possibility and not the current plan. Cell is still just a repr(transparent) UnsafeCell.

I'll post a comment in one of the issues asking for clarification when I get a minute to write it up, if no one else has.

I am pretty sure Cell exposing the niche is a bug... at least I don't remember that being part of the deal (though my memory for such things is bad), and now I wonder about the soundness of that.^^ Could you open an issue?


As was already mentioned, be careful with ptr-to-int transmutes. There are some sleeping daemons here, and it might well be that LLVM has to declare such transmutes to return poison or else risk losing some crucial optimizations (or else have serious soundness bugs).


What is that based on? Usually I'd think only things visible in rustdoc are promised, everything else is an implementation detail gleaned by inspecting the source code.


repr(transparent) does show up in the docs, if you expand out the type definition at the top. Just like all other #[attributes] :)


Hm, that's fair...

Still I was a bit surprised by how definitely that statement was made.^^ That means adding a repr to a pub type should essentially require FCP, which doesn't match my recollection of how we usually do this (e.g., MaybeUninit is repr(transparent) but on a union that actually means much less than it does otherwise and it was done just to not kill performance...).


While the repr shows up, the private field(s) are not shown, so the repr(transparent) in the docs doesn't really tell you anything.
