pre-RFC FromBits/IntoBits

I think both the Foo/Bar and SIMD cases can be addressed by allowing compatibility to be specified explicitly and then automatically closing the relation transitively (i.e. if A is manually declared as compatible with B, and B with C, then A becomes compatible with C).

The compatibility specification could be an explicit trait impl, an attribute on the type, or a macro-like syntax.

There could also be a way to tell the compiler to treat a type as if its fields were public, as a shorthand for declaring compatibility with an identical struct that has public fields.

bool -> int but no int -> bool seems orthogonal, and seems addressable by making Compatible not be reflexive.

Also note that structs with multiple fields without a layout-fixing repr should probably not be compatible with anything other than maybe structs with the exact same contents (and even this would mean giving up the option to do profile-guided struct layout differently for each type, so probably not a good idea).

So how would this system fit with your Compatible trait proposed above? Could you summarize your whole proposal?

@jmst I’ll try to summarize your proposal as well as I can. You propose to add an auto trait:

auto trait Compatible<T> {
    fn from_compatible(x: T) -> Self; // needed for the explicit impls
}

that:

  • is automatically derived for all permutations of two equally-sized, equally-aligned types whose fields are all public,
  • allows coercing T as Self, e.g., using Self::from_compatible(T) (or some other syntax)
  • is transitive: allows coercing T as V, if Compatible<T> for U and Compatible<U> for V

For types with private fields, users would need to implement this trait manually for the relations they care about, and due to transitivity the compiler would “fill in the blanks” from the relations the user specifies to the relations the target types support.

Is that it?

The idea would be that something like this is added to the core crate:

pub unsafe trait Compatible<T> {}

pub fn safe_transmute<T, U>(x: T) -> U where U: Compatible<T> {unsafe{transmute(x)}}

A first version could just be this, requiring manual implementation of the trait.
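As a rough sketch of what that first, manual-only version might look like in a form that compiles today: `mem::transmute` does not accept generic parameters of unknown size, so this variant checks the sizes at run time and copies the bits with `mem::transmute_copy`. The two `unsafe impl`s are illustrative assumptions, not part of the proposal.

```rust
use std::mem;

/// Marker trait: every valid value of `T` is also a valid value of `Self`.
/// Unsafe to implement because the compiler cannot check that claim here.
pub unsafe trait Compatible<T>: Sized {}

pub fn safe_transmute<T, U>(x: T) -> U
where
    U: Compatible<T>,
{
    // Sizes of generic parameters are not known to `mem::transmute`,
    // so check them here and copy the bits instead.
    assert_eq!(mem::size_of::<T>(), mem::size_of::<U>());
    let out = unsafe { mem::transmute_copy::<T, U>(&x) };
    // The bits now live in `out`; do not run `x`'s destructor.
    mem::forget(x);
    out
}

// Illustrative manual declarations: any four bytes form a valid u32,
// and any u32 is four valid bytes.
unsafe impl Compatible<[u8; 4]> for u32 {}
unsafe impl Compatible<u32> for [u8; 4] {}
```

With these impls, `safe_transmute::<u32, [u8; 4]>(n)` gives the same bytes as `n.to_ne_bytes()`, while a call between types that have no `Compatible` impl is rejected at compile time.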

A second version could add compiler support for automatically deriving the trait in addition to doing so manually and for performing transitive closure so that if A: Compatible<B> and B: Compatible<C> (whether manually or automatically), then A: Compatible<C>.

A further addition could be to provide a safe way to manually implement the trait for private types in the current crate.

The compiler would generally derive it when doing so would not result in safe_transmute (a) having undefined behavior, (b) changing its result depending on compiler optimization strategy or backwards-compatible crate changes, or (c) creating values of U that could not be created otherwise.

The exact rules to accomplish that would take quite some work and probably some trial and error to specify precisely: for example, they need to account for private field preservation, alignment preservation, the possibility of struct/enum layout optimization changes, non-exhaustive types like char and enum, and so on.


How does coherence interact with your approach?

For example, with the FromBits/IntoBits approach if you have two crates, A and B, exposing two types A::AT and B::BT, a third crate C cannot implement FromBits<A::AT> for B::BT.

Maybe your approach could “fill in this blank”, but I don’t know whether doing so is brittle, for example if crate B adds a new private field to BT, which used to have only public fields. That is already a breaking change, so the breakage might be OK.

Now that I think of it one could have a fallible extension, like:

pub unsafe trait MaybeCompatible<T> {
    // Checks whether this particular value of T is a valid Self.
    fn is_compatible(x: &T) -> bool {true}
}

pub unsafe trait Compatible<T> : MaybeCompatible<T> {}

pub fn safe_transmute<T, U>(x: T) -> U where U: Compatible<T> {unsafe{transmute(x)}}

pub fn try_safe_transmute<T, U>(x: T) -> Option<U> where U: MaybeCompatible<T> {if U::is_compatible(&x) {Some(unsafe{transmute(x)})} else {None}}

which would allow converting int -> char when the int happens to be a valid Unicode code point, or &[u8] -> &[u32] when the slice happens to be 4-byte-aligned.
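To make the int -> char case concrete, here is a minimal sketch. It assumes the check is an associated function that receives the source value (the value being converted is a `T`, not yet a `U`), and it uses `mem::transmute_copy` because `mem::transmute` rejects generic parameters. The impl for `char` is an illustration only.

```rust
use std::mem;

/// Hypothetical fallible variant: the check inspects the source value.
pub unsafe trait MaybeCompatible<T>: Sized {
    fn is_compatible(x: &T) -> bool;
}

pub fn try_safe_transmute<T, U>(x: T) -> Option<U>
where
    U: MaybeCompatible<T>,
{
    // Refuse the conversion if the sizes differ or the value is not valid as `U`.
    if mem::size_of::<T>() != mem::size_of::<U>() || !U::is_compatible(&x) {
        return None;
    }
    let out = unsafe { mem::transmute_copy::<T, U>(&x) };
    mem::forget(x);
    Some(out)
}

// A u32 is a valid char only if it is a Unicode scalar value,
// which excludes surrogates and anything above 0x10FFFF.
unsafe impl MaybeCompatible<u32> for char {
    fn is_compatible(x: &u32) -> bool {
        char::from_u32(*x).is_some()
    }
}
```

Under these assumptions, `try_safe_transmute::<u32, char>(0x41)` yields `Some('A')`, while a surrogate such as `0xD800` yields `None`.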


In the minimalist version, that’s impossible.

With automatic transitive closure crate C could define a type CT and (using “unsafe”) declare that AT is compatible with CT and CT with BT.

But in general this shouldn’t be an issue if the trait is derived by the compiler, since if AT and BT are not automatically compatible, crate C usually cannot safely declare them compatible (not sure if there can be any cases where that would be safe and compatible with changes in crates A/B).

So I like your approach a lot. Declaring all FromBits/IntoBits implementation in stdsimd is painful and error prone (as in, it is easy to miss one). With your approach we would still need to declare some of them manually, but due to transitivity we would get most of them for free.

For example, we would need to implement the traits only once for most portable vector types, e.g., from u8x16 to __m128 and vice versa, and because of transitivity we would get the u16x8, u32x4, i32x4, f32x4, … conversions for free.

For the vector mask types like b8x16, we can implement b8x16 -> __m128 but not the other way around, and get all the correct conversions for free.

With FromBits/IntoBits one needs N^2 implementations, with your approach 1 for types with public fields which the compiler fills in for you, and at worst O(N) for types that should only support uni-directional conversions.

I think it would be really nice if you could write an RFC, and if you need help with the motivation or examples we have more than enough in stdsimd.

AFAIK [T] to [u8] always works. [u8] to [T] is not guaranteed to work indeed, but that’s not a problem for me.

I don’t need a specific built-in function/method for conversion between just slices. I need a generic trait bound that lets me implement myself a family of such functions for various combinations of types.

This could be extremely useful for use with C-like enums if the compiler is smart enough to recognize valid ranges, much like it does when checking for exhaustiveness in match statements.

The simplest case is a #[repr(T)] enum where every possible value in T is defined in the enum. In practice this would be useful mainly for repr(u8) because of the current requirement to specify enum items for each individual value.

A slightly more advanced case is a #[repr(T)] enum where every value between 0 and N is defined. Currently it is only possible to convert between enums with the same ranges either through exhaustive match statements or using transmute. Compiler-generated implementations of FromBits / IntoBits or Compatible would eliminate a huge amount of unsafe boilerplate code and be the first step into turning C-like enums into a general-purpose ranged value type.
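For illustration, the two options available today look roughly like this. `Quad` and `Direction` are made-up enums covering the same range 0 to 3; they are not types from any existing crate.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Quad { Q0 = 0, Q1 = 1, Q2 = 2, Q3 = 3 }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Direction { North = 0, East = 1, South = 2, West = 3 }

// Option 1: safe but verbose, one match arm per value.
impl From<Direction> for Quad {
    fn from(d: Direction) -> Quad {
        match d {
            Direction::North => Quad::Q0,
            Direction::East => Quad::Q1,
            Direction::South => Quad::Q2,
            Direction::West => Quad::Q3,
        }
    }
}

// Option 2: compact but unsafe. Sound only because both enums are
// repr(u8) and define exactly the same set of discriminants, which
// nothing here asks the compiler to verify.
pub fn direction_to_quad(d: Direction) -> Quad {
    unsafe { std::mem::transmute::<Direction, Quad>(d) }
}
```

A compiler-derived Compatible impl would give the brevity of the second option with the checking of the first.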

For instance, bobbin-bits defines enums for U1 to U32 (covering 1-bit to 32-bit values) and ranges R1 to R32 (covering ranges for 1…N). These are used for modeling bit fields and for type-checked indexing of small fixed arrays, but using them with enums requires manually implementing From and To traits to do conversions, which is tedious and mistake-prone.

It would be far more useful if I could do something like this and have the compiler check that Compatible<U4> is valid:

fn takes_u4<T: Compatible<U4>>(v: T) {
   let value: U4 = v.safe_transmute();
   do_something(value)
}

enum ThingWithSixteenItems {
    Item0 = 0,
    Item1 = 1,
    ...
    Item15 = 15,
}

impl Compatible<U4> for ThingWithSixteenItems {}

or even better, have the Compatible<U4> trait automatically derived for any enum matching the valid range of U4.

@jmst One thing that came out during the preparation of the portable SIMD vectors RFC is that a safe Compatible<T> would probably need to produce the same results in both little endian and big endian platforms.

That is, currently unsafe { mem::transmute } produces this behavior (playground):

let x: [i8; 16] = [
    0, 1, 2, 3, 4, 5, 6, 7,
    8, 9, 10, 11, 12, 13, 14, 15
];
let t: [i16; 8] = unsafe { mem::transmute(x) };
if cfg!(target_endian = "little") {
    let t_el: [i16; 8] = [256, 770, 1284, 1798, 2312, 2826, 3340, 3854];
    assert_eq!(t, t_el);  // OK on LE
} else if cfg!(target_endian = "big") {
    let t_be: [i16; 8] = [1, 515, 1029, 1543, 2057, 2571, 3085, 3599];
    assert_eq!(t, t_be);  // OK on BE
}

It would be nice if a safe_transmute operation could produce the same result in both architectures:

let x: [i8; 16] = [
    0, 1, 2, 3, 4, 5, 6, 7,
    8, 9, 10, 11, 12, 13, 14, 15
];
let t: [i16; 8] = safe_transmute(x);
let el: [i16; 8] = [???];
assert_eq!(t, el);  // OK on LE and on BE

Maybe this might not only be nice, but actually a necessary condition to make safe_transmute safe.

AFAIK the only way to achieve this would be to swap bytes inside safe_transmute on either little endian or big endian architectures.
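A minimal sketch of such a byte-swapping conversion, assuming the little-endian result is chosen as the canonical one (the function name is hypothetical). `i16::from_le_bytes` is a plain reinterpretation on little-endian targets and a byte swap on big-endian ones.

```rust
/// Reinterprets 16 bytes as eight i16 lanes, always reading each lane
/// in little-endian byte order, so the result is target-independent.
pub fn bytes_to_i16x8_le(x: [i8; 16]) -> [i16; 8] {
    let mut out = [0i16; 8];
    for (lane, pair) in out.iter_mut().zip(x.chunks_exact(2)) {
        // `as u8` keeps the bit pattern of each signed byte.
        *lane = i16::from_le_bytes([pair[0] as u8, pair[1] as u8]);
    }
    out
}
```

For the input `[0, 1, ..., 15]` this returns `[256, 770, 1284, 1798, 2312, 2826, 3340, 3854]` on every target, i.e. the little-endian result of the transmute shown earlier.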

That would be impossible when converting slices, and in any case, endianness dependence is not unsafe.

All of the transmute_copy() calls in simd_funcs.rs of encoding_rs are actually safe and I'd like to write them so.

That's news to me, but happy news.

For ergonomic portable SIMD, it's essential that we have convenient safe syntax for the SIMD type conversions that are zero-cost reinterpretations in the little-endian case but produce different results in the big-endian case.

Since, thanks to WebGL, big endian is not coming back, I don't care much what Rust does for SIMD in the big endian case (compute different results, inject shuffles to match little-endian results or make the conversion unavailable when compiling for a big-endian target), but I really want to have safe wrappers for the little-endian SIMD transmutes.

In addition to just ignoring the problem, a way to handle endianness could be to add a trait variant of Compatible that is guaranteed to give the same results regardless of endianness.

To do so, one would introduce a #[repr(ecC)] for “endian-corrected C repr” (could maybe find a better name for this) which would be like repr(C) except that fields are laid out in reverse order on big-endian machines.

In addition, endian-corrected tuples, slices and arrays need to be introduced, where on big-endian machines items are laid out backwards and indexing is implemented by subtracting from the end (the syntax maybe being something like &'a #[repr(ecC)] [u8]).

Then cases where the conversion would give different results would be defined only for repr(ecC) when using the endian-corrected version of the trait.

Having to duplicate tuples, slices and arrays is annoying, but those types would probably only be used for small sections of code.

Alternatively one could lay out everything backwards by default, but this would be incompatible with C FFI and performance might be reduced since the CPU instruction sequence for backwards indexing is often less efficient and the backward direction might not trigger hardware prefetching if the CPU is not sophisticated enough.

As a further extension, one could add an “always-reversed repr” and a “reversed only on little-endian repr” and provide a ByteSwap trait that could convert between the versions (either implemented with custom derive or again by the compiler).

An even further extension would be allowing structs and code to be generic over the repr (without using macros), although I guess this would not be worth it.

[not sure about the WebGL argument, since it seems to me that only the GPU needs to be configurable to run in little-endian mode, which is true for all PCIe GPUs since they must work in PCs, and the JS/Wasm CPU code could just byteswap on all memory accesses - it’s not ideal, but CPUs are not normally designed to primarily run WebGL or WebAssembly code]


I recently proposed Pre-RFC: Trait for deserializing untrusted input, which is similar to some of the arbitrary bits stuff (like the pod crate) discussed here.

One important difference is that we propose having a custom derive. So instead of requiring your fields to be public or requiring you to use an unsafe impl, you can voluntarily do #[derive(ArbitraryBytesSafe)] (that’s the trait name we propose), and the custom derive will verify whether all of your fields, and their composition, are safe for deserializing from arbitrary byte patterns. It works for private fields too, but because you have to explicitly use the derive, it won’t just automatically apply to your type, invalidating your invariants.

@joshlf is the trait transitive? That is, is there a way to safely transmute between two unrelated types deriving ArbitraryBytesSafe?

Well you can copy the bytes of any T into something which is ArbitraryBytesSafe because those bytes are just bytes. You could even transmute a T into an ArbitraryBytesSafe as long as you were willing to mem::forget or drop the original T. So I guess it’s transitive in the trivial sense that anything can be converted into an ArbitraryBytesSafe.

It’s a bit hard for me to visualize it, could you show a code example?

Sure.

// NOTE: need to verify that size_of::<T>() == size_of::<U>().
// How to do that is an open question in the pre-RFC.
fn safe_transmute<T, U: ArbitraryBytesSafe>(t: T) -> U {
    unsafe {
        // First, convert t to its underlying bytes. This effectively
        // mem::forget's it. Could also drop first. Now we just have
        // a meaningless pile of bytes.
        let bytes = mem::transmute::<T, [u8; mem::size_of::<T>()]>(t);
        // Second, convert the pile of bytes to a U. We know this is
        // safe because U: ArbitraryBytesSafe. We could have done both
        // of these steps at once (transmuting T to U), but this is more
        // illustrative of why it's safe to do the conversion.
        mem::transmute::<[u8; mem::size_of::<T>()], U>(bytes)
    }
}

I see. So there are some pairs of types for which safe_transmute is not bidirectional. For example, m32x4 and m16x8 can be safely transmuted into i32x4, but i32x4 cannot be safely transmuted into either m32x4 or m16x8. I guess that would be handled by making i32x4 derive ArbitraryBytesSafe and leaving m32x4 and m16x8 without an implementation. Since all these types have the same size, m32x4 and m16x8 can be safely transmuted into i32x4, but since m32x4 and m16x8 do not derive ArbitraryBytesSafe, i32x4 cannot be safely transmuted into either of them. So far so good.

However, m32x4 can be safely transmuted into a m16x8 while m16x8 cannot be safely transmuted into m32x4. I wonder how that could be handled by ArbitraryBytesSafe without allowing any safe transmutes of m32x4 or m16x8 to i32x4.
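The asymmetry can be checked with plain arrays. This sketch assumes a mask lane is valid only when all of its bits are equal (0 or -1) and models m32x4 as `[i32; 4]` and m16x8 as `[i16; 8]`; the helper names are hypothetical.

```rust
/// Assumed invariant: every mask lane is all zeros or all ones.
pub fn valid_m32x4(lanes: [i32; 4]) -> bool {
    lanes.iter().all(|&l| l == 0 || l == -1)
}

pub fn valid_m16x8(lanes: [i16; 8]) -> bool {
    lanes.iter().all(|&l| l == 0 || l == -1)
}

/// Reinterprets the bits of four 32-bit lanes as eight 16-bit lanes.
pub fn m32x4_bits_as_16x8(v: [i32; 4]) -> [i16; 8] {
    let mut out = [0i16; 8];
    for (i, lane) in v.iter().enumerate() {
        let b = lane.to_ne_bytes();
        out[2 * i] = i16::from_ne_bytes([b[0], b[1]]);
        out[2 * i + 1] = i16::from_ne_bytes([b[2], b[3]]);
    }
    out
}

/// Reinterprets the bits of eight 16-bit lanes as four 32-bit lanes.
pub fn m16x8_bits_as_32x4(v: [i16; 8]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        let a = v[2 * i].to_ne_bytes();
        let b = v[2 * i + 1].to_ne_bytes();
        out[i] = i32::from_ne_bytes([a[0], a[1], b[0], b[1]]);
    }
    out
}
```

Splitting an all-ones or all-zeros 32-bit lane always gives two valid 16-bit lanes, but two adjacent 16-bit lanes may differ, so merging them can produce a 32-bit lane such as `0x0000FFFF` that is not a valid mask.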