[Pre-RFC] Safer Transmutation

Independently I think there is some overlap between this RFC and ABI stability that could be discussed in the RFC:

  1. The RFC requires repr(C), but if there is another stable representation such as discussed in the stable modular ABI thread, it should also work.

  2. derive(PromiseTransmutable) seems to correspond to library ABI stability for this type. In the future, it could perhaps be required for public types of stable-ABI crates. With that use case in mind, maybe a more neutral name could be chosen, such as derive(StableLayoutGuarantee).

There are plenty of situations where you want a safer transmute, but not necessarily a safe transmute. This is where the options system of the RFC comes in. You might want to give up some static guarantees of safety when there is a more powerful runtime check you can perform; e.g., NeglectAlignment unsafely disables the static alignment check.
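To make the "runtime check instead of static check" idea concrete, here is a rough sketch in today's Rust. It does not use the RFC's API at all; cast_ref_checked and Aligned are hypothetical names made up for this example, and the alignment test at run time stands in for the static check that an option like NeglectAlignment would switch off.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical demo buffer with a known 4-byte alignment.
#[repr(align(4))]
struct Aligned([u8; 8]);

/// Hypothetical helper, not part of the RFC: reinterprets the first four
/// bytes of `bytes` as a `&u32`, but only if the runtime checks pass.
fn cast_ref_checked(bytes: &[u8]) -> Option<&u32> {
    let ptr = bytes.as_ptr();
    // Runtime stand-ins for the static size and alignment checks.
    if bytes.len() < size_of::<u32>() || (ptr as usize) % align_of::<u32>() != 0 {
        return None;
    }
    // SAFETY: `ptr` is non-null, aligned for `u32`, and points at at least
    // four initialized bytes that stay borrowed for the returned lifetime.
    // Every bit pattern is a valid `u32`.
    Some(unsafe { &*ptr.cast::<u32>() })
}
```

Calling cast_ref_checked(&buf.0) on an Aligned buffer should succeed, while the same buffer sliced from index 1 should come back as None because the pointer is no longer 4-byte aligned.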

Without the options system, the only way to perform a conditionally-safe transmutation would be to use the wildly unsafe mem::transmute and company.

My hope is that additional options will be added in the future until there is no conditionally-safe use-case that isn't solved by this RFC with the right set of options! A NeglectConstructability option is one such possible future option, but requires a lot of design work to get right.

Someday moving to the complete formulation of constructability should generally not cause backcompat issues.

The exception to this rule is the pub-in-priv trick. The documentation for the stability declaration traits should make clear that you should not implement these traits for your type if you are using the pub-in-priv trick to restrict its implicit constructability.


Agree completely! The RFC actually only requires that the transmuted types have a well-defined representation. This does not necessarily mean #[repr(C)]; e.g., the layout of certain option-like types is well-defined.
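As a small illustration of a well-defined layout that is not #[repr(C)], here is a sketch using Option<NonZeroU8>. It relies on what I understand the standard library docs to guarantee for option-like types (same size as u8, with 0 standing for None); byte_to_option is a hypothetical helper name, not something from the RFC.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU8;

// Compile-time check that the option-like type is exactly one byte.
const _: () = assert!(size_of::<Option<NonZeroU8>>() == size_of::<u8>());

/// Hypothetical helper: reinterprets a raw byte as `Option<NonZeroU8>`.
fn byte_to_option(byte: u8) -> Option<NonZeroU8> {
    // SAFETY: relies on the documented layout of `Option<NonZeroU8>`: it has
    // the same size as `u8`, every non-zero byte is a valid `Some`, and the
    // zero byte is `None`.
    unsafe { transmute::<u8, Option<NonZeroU8>>(byte) }
}
```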

I'm surprised the RFC doesn't mention integer byte order concerns. I think it should be explicit about whether e.g. a u32 <-> [u8; 4] conversion is or isn't considered a “safe transmute”.

There is a section on platform-dependent layouts, but it's not (yet) explicit that endianness is a kind of platform-dependent layout. Our RFC is expressly oblivious to platform-dependent layouts. Safety, in this RFC, is scoped to memory safety. A transmutation from u32 to [u8; 4] might be inconsistent between platforms, but it is not unsafe.
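To spell out the u32 to [u8; 4] case, here is a minimal sketch using plain mem::transmute (not the RFC's API; u32_to_bytes is a hypothetical name). The conversion is memory-safe everywhere, and the result matches to_ne_bytes, which is exactly the platform-dependent part.

```rust
/// Hypothetical helper: reinterprets a `u32` as its in-memory bytes.
fn u32_to_bytes(x: u32) -> [u8; 4] {
    // SAFETY: both types are four bytes with no padding, and every bit
    // pattern is a valid `[u8; 4]`.
    unsafe { std::mem::transmute::<u32, [u8; 4]>(x) }
}
```

On any target this should agree with x.to_ne_bytes(); whether that equals to_le_bytes() or to_be_bytes() depends on the platform, which is a portability question rather than a memory-safety one.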


This is a very worthy project. The first obvious thing that strikes me about it is that the reuse of the Rust-specific term "transmute" will be potentially confusing in the future. "mem::transmute" and this API are quite different. Talking about "transmute" when it means two different things will be complicated, and will pretty much guarantee that the term is always accompanied by a clarifying adjective, like "unsafe transmute" vs "safe transmute".

I'd suggest giving this a new name, maybe with "cast" in it. This could be an opportunity to use the C++ "reinterpret" terminology, like "ReinterpretInto", "ReinterpretFrom", but anything else will make it easier to discuss. If we want to keep it cute and magic themed, maybe come up with another magical word that also has a connotation of safety.


Could you elaborate? The lens through which I've always seen this proposal (and, before it, typic), has been: What's the trait bound we add to mem::transmute to make it safe? And this RFC's answer is:

pub fn transmute<Src, Dst>(src: Src) -> Dst
where
    Dst: TransmuteFrom<Src>
{
    ...
}

(Of course, we cannot literally just slap a trait bound on mem::transmute; that would be a breaking change. I propose two free functions, safe_transmute and unsafe_transmute, and that we perhaps someday deprecate mem::transmute.)
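For anyone who wants to poke at the shape of this, here is a rough, compilable sketch. The TransmuteFrom trait below is a hand-written unsafe marker standing in for the compiler-implemented trait the RFC proposes (the real one has more parameters, which I'm leaving out), and the single impl is written by hand purely for illustration.

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical stand-in for the RFC's trait. Implementors promise that the
/// bits of any `Src` form a valid `Self` that is no larger than `Src`.
unsafe trait TransmuteFrom<Src>: Sized {}

// SAFETY: same size, no padding, and every bit pattern is a valid `[u8; 4]`.
unsafe impl TransmuteFrom<u32> for [u8; 4] {}

/// Safe wrapper: the trait bound carries the proof obligation.
fn safe_transmute<Src, Dst>(src: Src) -> Dst
where
    Dst: TransmuteFrom<Src>,
{
    // Prevent `src` from being dropped after its bits are copied out.
    let src = ManuallyDrop::new(src);
    // SAFETY: guaranteed by the contract of `TransmuteFrom<Src>` above.
    unsafe { mem::transmute_copy::<Src, Dst>(&*src) }
}
```

With this in place, safe_transmute::<u32, [u8; 4]>(7) compiles and needs no unsafe at the call site, while a pair of types without an impl is rejected by the type checker.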


Would there ever be any scenario in which a user has to manually state every single tuple combination of these options?

Maybe? I suspect that would be very uncommon. I don't think we even have a complete sense of all the options that might eventually exist. For instance, a hypothetical NeglectConstructability option would have subtle interactions with !Send, !Sync, UnsafeCell and any abstraction whose safety depends on restricting visibility. Deciding if and how to partition that space of dangerous transmutations between different options is going to be a substantial undertaking.

What's important is that the API surface can handle that future work, and it can: those additional options can be freely added in the future if they're deemed necessary. I would love for this API to eventually get to a place where any use of mem::transmute necessitated by a lack of neglectable checks in the safe API was an indicator of unsoundness.

Is there a way for something like NonZeroU8 to be transmutable via ignoring validity?

Also how does this act in generic code? Will the fixed/fully completed constructibility check be dependent on which module the code is in? (Would transmuting a struct with a pub(crate) field inside that crate work?)

mem::transmute is unsafe and this api is safe

I agree that it can be slightly confusing to talk about when the APIs are not qualified as unsafe vs safe, though I think the RFC does a good job of making it clear when the safe variety is meant and when the unsafe one is. I do, however, like the idea of borrowing terminology from C++ rather than using the prefix safe_ to distinguish it from the existing API. I don't think that needs to be decided in this RFC, however.


Yes, this is precisely the sort of situation where the proposed NeglectValidity option is useful!
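Roughly, the pattern would be "skip the static validity check, do the check yourself at run time". A sketch of that in today's Rust, without the RFC's API (u8_to_nonzero is a hypothetical name):

```rust
use std::num::NonZeroU8;

/// Hypothetical helper: a `u8` to `NonZeroU8` conversion where the validity
/// check happens at run time instead of at compile time.
fn u8_to_nonzero(byte: u8) -> Option<NonZeroU8> {
    if byte == 0 {
        // Zero is the one bit pattern that is not a valid `NonZeroU8`.
        return None;
    }
    // SAFETY: `byte` was just checked to be non-zero, and `NonZeroU8` is
    // documented to have the same layout as `u8`.
    Some(unsafe { std::mem::transmute::<u8, NonZeroU8>(byte) })
}
```

This is of course just NonZeroU8::new written the long way; the point is only to show where the neglected static check moves to.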

Yes, the full formulation of constructability is intrinsically tied to scope. So, whether you can transmute a struct with a pub(crate) field inside it would depend on where you attempt to do it.

The complete details of this full formulation are not totally fleshed out. It won't be as simple as assessing constructability at the point where TransmuteFrom occurs. TransmuteInto, for instance, is just a blanket implementation over TransmuteFrom, but no end-user types are constructible from libcore! You really need to assess the provenance of the TransmuteFrom bound.

I'm least confident about the feasibility of this, hence the proposed initial simplification of constructability.


I don't have anything to add other than overwhelming positive feedback and appreciation for the work that has gone into this. Having this kind of functionality in the language and in std makes me downright giddy. It will be a dramatic improvement to many types of unsafe code, and will also open the doors to writing safe code that we can be sure is correct. Doing this in Rust today is of course possible, but there are so many invariants to check that I often avoid doing it because it isn't worth the risk. But this RFC appears to largely de-risk it, which I think will be a huge boon.

Thank you so much for putting together this RFC. :smiley:


Is there a reason pub(crate) is being considered here, instead of just defining constructibility as having all bare pub fields recursively and treating pub(crate) as private? Given this is purely for backwards compatibility, I agree that a complex definition might not be useful. (Crate authors wanting to use these impls can/should just use the PromiseTransmutable* traits instead).

This is exactly what the simplified formulation is. However, constructability is also impacted by the visibility of the field type definitions, and the visibility of the paths of those definitions. Usually, Rust forbids private types in public type signatures, but the pub-in-priv trick circumvents this check. Ignoring this is a safety hazard.
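For readers who haven't run into the pub-in-priv trick, here is a minimal sketch of its shape. All names (private, Even, Wrapper, wrap) are hypothetical, and the "evenness" rule is only there to show why an author might want construction funneled through a function.

```rust
mod private {
    // Declared `pub`, but the enclosing module is not exported, so code
    // outside this crate cannot name the type.
    #[derive(Debug, PartialEq)]
    pub struct Even(pub u32);

    impl Even {
        // Hypothetical checked constructor.
        pub fn new(n: u32) -> Option<Self> {
            if n % 2 == 0 { Some(Even(n)) } else { None }
        }
    }
}

/// Every field here is bare `pub`, recursively, so a check that only looks at
/// field visibility would treat `Wrapper` as constructible from raw bits.
/// Downstream code still cannot write `Wrapper(private::Even(3))`, because it
/// cannot name `private::Even`.
pub struct Wrapper(pub private::Even);

/// The construction path the author intends downstream code to use.
pub fn wrap(n: u32) -> Option<Wrapper> {
    private::Even::new(n).map(Wrapper)
}
```

The compiler accepts this because Even really is pub; the private-in-public check only looks at the item's own visibility, not at whether its path is reachable.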


I haven’t read through the whole RFC yet, but it seems like there are no comparisons to Haskell’s safe coercions yet. The situation is kind of similar. There was (and still is) a function called unsafeCoerce in Haskell that completely “disables” the type checker, like Rust’s transmute, but that can safely be used around e.g. newtypes (which are like #[repr(transparent)] structs). The safe coercions are exposed through a coerce method of a Coercible type class (type classes are what Rust’s traits are based on) that the compiler resolves, AFAICT, similarly to the proposed TransmuteFrom/-Into; that is, also ad hoc, based on the visibility of the constructors of the types involved (a distinction similar to visibility of fields in Rust). Unlike this proposal, the safe coercions in Haskell only deal with newtypes and generic types, not with safely transmutable C-style structs, and they are an important implementation detail of the GeneralizedNewtypeDeriving feature, something that Rust is still missing but should, in my opinion, eventually get too.

They introduce so-called “roles” for parameters of generic types, in particular those roles are called nominal, representational and phantom. Speaking in Rust examples, in a type like PhantomData<T>, the variable T would have a phantom role with the effect that a PhantomData<T> could be freely transmuted into a PhantomData<S> for any S. A type like Vec<T> would have a parameter T with representational role, meaning that Vec<T> could be transmuted into Vec<S> if and only if T can be transmuted into S. And a type like HashSet<T> has a parameter T with nominal role, which means that HashSet<T> can never be transmuted into HashSet<S> except for when T and S are the same type. The nominal role of HashSet<T> would be an API design decision since changing the type inside the HashSet would change the Hash implementation of the contained type which could put the whole HashSet into a totally broken state. A different situation, where nominal parameters are actually strictly necessary would be for something like

trait SomeTrait {
    type AssociatedType;
}
struct MyStruct<T: SomeTrait>(T, T::AssociatedType);

because there, changing T to some different S changes the associated type and hence the layout of MyStruct<T>, even if T is transmutable into S.
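A quick way to see the layout change is to compare sizes. In the sketch below, A and B are hypothetical one-byte types that are bit-compatible with each other, and I've added #[repr(C)] to MyStruct (an assumption, so that the sizes are predictable).

```rust
use std::mem::size_of;

trait SomeTrait {
    type AssociatedType;
}

// Two hypothetical types with identical one-byte layouts.
#[repr(transparent)]
struct A(u8);
#[repr(transparent)]
struct B(u8);

impl SomeTrait for A {
    type AssociatedType = u8;
}
impl SomeTrait for B {
    type AssociatedType = u64;
}

#[repr(C)]
struct MyStruct<T: SomeTrait>(T, T::AssociatedType);

/// Returns the sizes of `MyStruct<A>` and `MyStruct<B>`, in that order.
fn sizes() -> (usize, usize) {
    (size_of::<MyStruct<A>>(), size_of::<MyStruct<B>>())
}
```

Even though A and B have the same size, MyStruct<A> is two bytes while MyStruct<B> needs at least nine, so the parameter cannot be treated as representational.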

The safe coercions of Haskell are always invertible, unlike this Rust proposal, so a representational role in Rust would probably need to come with some kind of notion of “variance”.

I’ll continue reading up on how the current RFC perhaps already handles these kinds of issues but I wanted to make sure this prior art is also taken into consideration.


This would probably want to be another Neglect* option, since inconsistent Hash/Eq implementations are only a logic error and not a memory safety error - you normally want to be protected from logic errors, but it makes sense to allow it as an option.
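To illustrate the kind of logic error in question, here is a sketch with a hypothetical newtype Mod10 that is bit-compatible with u32 but hashes differently. It does not transmute any container; it only compares the hashes that a set would use to pick a bucket (hash_of is a helper made up for this example).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical newtype: same bits as `u32`, but equality and hashing only
/// look at the value modulo 10.
#[repr(transparent)]
#[derive(Debug, Clone, Copy)]
struct Mod10(u32);

impl PartialEq for Mod10 {
    fn eq(&self, other: &Self) -> bool {
        self.0 % 10 == other.0 % 10
    }
}
impl Eq for Mod10 {}

impl Hash for Mod10 {
    fn hash<H: Hasher>(&self, state: &mut H) {
        (self.0 % 10).hash(state)
    }
}

/// Hashes a value with a fixed-key hasher so that results are comparable.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

A set of u32 that contains 12 files it under hash_of(&12u32). Viewed as a set of Mod10, a lookup for Mod10(12) would probe under hash_of(&2u32) instead, so the element would most likely not be found: a broken container, but not by itself a memory-safety violation.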

I actually knew someone who was working (alone) on a Rust safe-transmute system based on Haskell's Coercible (unfortunately the project is pretty much lost now). One thing I remember is that it did have explicit variance of type parameters.

Mostly off-topic discussion about library UB

Hmm, but in an unsafe-using HashContainer implementation, such a "logic error" (elsewhere named "library UB") could get promoted to memory errors pretty quickly upon attempting to actually use the container, depending on what exactly is done with the bits of the hash (which has now changed) - e.g. if the set somehow cached that a bucket is present for a given element, but the element's hash has now changed, the bucket for that hash is no longer present.

In the case of HashMap and Hash the container has no way of relying on the results to be consistent anyways. For what it's worth, an impl Hash could use a random number generator or read values from the network. Nonetheless, with reasonable Hash impls around, you cannot break your hashmaps, and thus safe transmutations should not introduce a way to do this either without some extra hurdles. I agree that those hurdles don't need to include requiring unsafe in the case of HashMap.

Other data structures could be different, in particular they could rely on the correct implementation of sealed or unsafe traits and thus require a nominal type role to be safe because otherwise the transmutation would trigger some actual library UB.

Edit: On second thought, for a type with a correct Hash implementation, it is currently impossible to create a broken HashMap. Thus a function that receives a HashMap<i32, T> or so, uses unsafe, and relies on the map not being broken to avoid UB could become unsound with this change.

Edit2: Wait... perhaps it is possible to create a broken HashMap<i32, T> because of the HashMap API using the Borrow trait. I'm not sure. Maybe it's still impossible because Borrow is used in retrievals but not in (key-changing) updates.
