[Pre-RFC v2] Safe Transmute

Sounds like an "implied implementation" since we're talking "implication constraints" (if Implements(A: B) then Implements(C: D)).

Is that true? Some benchmarks to confirm it could be helpful. Provided that it is true, then I don't see why not.

I have not had time to sit and focus on reading this whole RFC carefully, and I will try to do so soon. However, I'll say right now that I would love if bytemuck were included in the prior art.

It's a much, much simpler interface than all of this. That naturally means that there's some uncommon cases where it might fall down a bit, but the common case is kept very plain and simple for the end user. I think that having a good common case is essential here. I know many folks on the Community Discord who don't even want to touch zero-copy or safe-transmute simply because of the API complexity, and I've been able to get them to adopt bytemuck by having a clear and easy API.

  • uninit/padding bytes can at best have "unrepeatable reads" semantics
  • supporting that generally seems unrealistic due to LLVM behaviour
  • may declare it not-UB to pass such memory to syscalls and memcpy instead

So I'd say I was referring to a different alternative.

TIL about bytemuck. I like it a lot! It doesn’t cover some use cases that this proposal does, but it’s not clear to me how important those are. And the simplicity is compelling.


Looks great. Some comments:

Compile-time alignment checking

I'd like to have an API that can rule out alignment errors at compile time, e.g. when casting from &[u32] to &[u16], or from &[NonZeroU32] to &[u32]. But I believe it's best to wait on that until const generics are fully implemented, as then we should be able to directly express align_of::<Target>() <= align_of::<Source>() in the type system. Even when that becomes possible, there will still be cases where it's useful to check alignment at runtime, as the APIs proposed here do.

Therefore, the aforementioned API should be considered a future extension to this RFC rather than an alternative.
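Until that can be written as a trait bound, something close can be approximated with an associated constant that asserts the alignment relation. This is only a sketch: AlignCheck and as_u16_pairs are made-up names, not part of the RFC, and the error appears when the function is instantiated rather than at the signature.

```rust
use std::marker::PhantomData;
use std::mem::align_of;

/// Hypothetical helper: carries the (Src, Dst) pair so the check below can be
/// written as an associated constant.
struct AlignCheck<Src, Dst>(PhantomData<(Src, Dst)>);

impl<Src, Dst> AlignCheck<Src, Dst> {
    /// Evaluated at compile time for each concrete (Src, Dst) pair that is used.
    const OK: () = assert!(
        align_of::<Dst>() <= align_of::<Src>(),
        "target alignment is stricter than source alignment"
    );
}

/// Reinterpret each u32 as two u16 values (native byte order).
pub fn as_u16_pairs(src: &[u32]) -> &[u16] {
    // Referencing the constant forces its evaluation; the build fails if the
    // alignment relation does not hold.
    let () = AlignCheck::<u32, u16>::OK;
    // SAFETY: u16 has no invalid bit patterns, its alignment is no stricter
    // than u32's (checked above), and the new length covers the same bytes.
    unsafe { std::slice::from_raw_parts(src.as_ptr().cast::<u16>(), src.len() * 2) }
}
```

Swapping the type arguments to AlignCheck::<u16, u32> would be rejected at build time, which is the case that still needs a runtime check in the proposed API.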

Marker trait

It would be nice to be able to mark both FromAnyBytes and ToBytes as #[marker] traits once support for those is stabilized, to allow for overlapping generic impls.

This would require removing all methods from ToBytes. Instead, we could potentially use the "supertrait-with-blanket-impl" pattern:

trait FromAnyBytesExt { /* methods here */ }
trait FromAnyBytes : FromAnyBytesExt {}
impl<T: FromAnyBytes> FromAnyBytesExt for T { /* ... */ }

That would have the downside of requiring users to import FromAnyBytesExt to use the methods, though oddly enough, only in non-generic code...
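To make the pattern concrete, here is a minimal sketch of how I imagine it. The method name read_from and the Copy bound are my own assumptions for illustration, not what the RFC specifies, and the trait is written as an unsafe trait only so that the example is sound on its own.

```rust
use std::mem::size_of;

/// Extension trait that holds the methods (read_from is a hypothetical name).
pub trait FromAnyBytesExt: Sized {
    fn read_from(bytes: &[u8]) -> Option<Self>;
}

/// Method-free trait, so it could later become a #[marker] trait.
/// Implementors promise that every bit pattern is a valid value.
pub unsafe trait FromAnyBytes: FromAnyBytesExt + Copy {}

/// Blanket impl: every FromAnyBytes type gets the methods automatically.
impl<T: FromAnyBytes> FromAnyBytesExt for T {
    fn read_from(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != size_of::<T>() {
            return None;
        }
        // SAFETY: the length matches, read_unaligned has no alignment
        // requirement, and T accepts any bit pattern by contract.
        Some(unsafe { std::ptr::read_unaligned(bytes.as_ptr().cast::<T>()) })
    }
}

unsafe impl FromAnyBytes for u32 {}
```

With this in place, u32::read_from(&bytes) works wherever FromAnyBytesExt is in scope.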

Unsafe impl

This is pretty annoying to me. It's not the end of the world, but I'd like to have unsafe impl:

  • For generic impls, as a workaround for the initial lack of support for validating them.

  • For structs containing a field of a foreign type which satisfies the requirements for the traits but doesn't implement them. I know this is dangerous from a backwards compatibility perspective, but still.

  • For a special case of the last point:

    I'm fine with deferring that decision. But if we eventually decide we're not going to add those impls, then you'd definitely want to be able to unsafe impl ToBytes or FromAnyBytes for structs containing raw pointers etc.

Generalized safe transmute

I'm sure the following has been proposed before. I haven't followed all the bikeshedding on the many previous proposals along these lines, so please forgive me for duplicating discussion. However, I'd like the RFC to mention it as an alternative and explain why it instead chose to focus on bytes.

If I want to go from, say, &[u32] to &[u16], I can do it with the proposed API by first going through &[u8]. In practice this should satisfy the majority of "safe transmute" use cases. But suppose I have:

#[repr(transparent)]
struct BoolWrapper(bool);

or even

#[repr(transparent)]
struct StringWrapper(String);

It's safe to transmute between &mut [BoolWrapper] and &mut [bool], or between &mut [StringWrapper] and &mut [String], but this API has no path toward allowing that.

Sure, we could make a separate API. But what about the alternative of having a more generic trait?

trait CastSliceTo<T>: Sized {
    // `Sized` is needed because `[Self]` requires a sized element type.
    fn cast_slice(slice: &[Self]) -> Result<&[T], TransmuteError>;
}

The name isn't the best, but basically T: FromAnyBytes would be equivalent to u8: CastSliceTo<T>, and T: ToBytes would be equivalent to T: CastSliceTo<u8>.
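Here is a rough sketch of what such a trait could look like, under a few assumptions of mine: the trait is unsafe to implement, TransmuteError is a placeholder unit struct, and the size and alignment checks live in a default method. None of that is taken from the RFC.

```rust
use std::mem::{align_of, size_of, size_of_val};

/// Placeholder error type, standing in for whatever the RFC would define.
#[derive(Debug, PartialEq)]
pub struct TransmuteError;

/// Implementors promise that the bytes of any valid Self are valid for T.
pub unsafe trait CastSliceTo<T>: Sized {
    fn cast_slice(slice: &[Self]) -> Result<&[T], TransmuteError> {
        let bytes = size_of_val(slice);
        // Reject zero-sized targets and byte lengths that don't divide evenly.
        if size_of::<T>() == 0 || bytes % size_of::<T>() != 0 {
            return Err(TransmuteError);
        }
        // Runtime alignment check, needed when T is stricter than Self.
        if slice.as_ptr() as usize % align_of::<T>() != 0 {
            return Err(TransmuteError);
        }
        // SAFETY: size and alignment were checked above; bit validity is
        // guaranteed by the implementor of this unsafe trait.
        Ok(unsafe {
            std::slice::from_raw_parts(slice.as_ptr().cast::<T>(), bytes / size_of::<T>())
        })
    }
}

#[repr(transparent)]
pub struct BoolWrapper(pub bool);

unsafe impl CastSliceTo<bool> for BoolWrapper {}
unsafe impl CastSliceTo<u16> for u32 {}
```

Note this only covers shared slices. The &mut case needs validity in both directions, so it would want a separate trait or an additional bound.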

If the expectation is that these methods ought never be overridden anyway (since they're just a transmute), then we could also consider just allowing defaulted methods on #[marker] traits which are prohibited from being overridden, as I've proposed before.

(That's essentially the same as the extension-trait plan, in fact identical without specialization, but without forcing people to think in two different trait names.)

:+1:

I have a sketch in rust-lang/rust issue #49792 (Tracking issue for the to_bytes and from_bytes methods of integers); I could turn that into a more real proposal if it'd help discuss tradeoffs.


Note that FromBytes may not always succeed, because the alignment of the bytes you want to transmute is a runtime property. So providing only a panicking API would not be appropriate.

The zerocopy crate solves this by performing the layout checks once and, if they succeed, encoding the result in the type system so that any subsequent checks can be omitted: https://docs.rs/zerocopy/0.2.8/zerocopy/struct.LayoutVerified.html
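The general idea of "check once, then carry the proof in a type" can be sketched as below. To be clear, this is not zerocopy's actual API: Verified and AnyBits are hypothetical names I made up for the illustration.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

/// Hypothetical marker: every bit pattern is a valid value of the type.
pub unsafe trait AnyBits: Copy {}
unsafe impl AnyBits for u32 {}

/// A byte slice already known to be a correctly sized and aligned T.
pub struct Verified<'a, T> {
    bytes: &'a [u8],
    _ty: PhantomData<T>,
}

impl<'a, T: AnyBits> Verified<'a, T> {
    /// Performs the size and alignment checks exactly once.
    pub fn new(bytes: &'a [u8]) -> Option<Self> {
        let size_ok = bytes.len() == size_of::<T>();
        let align_ok = bytes.as_ptr() as usize % align_of::<T>() == 0;
        if size_ok && align_ok {
            Some(Self { bytes, _ty: PhantomData })
        } else {
            None
        }
    }

    /// Infallible: the checks were already done in `new`.
    pub fn get(&self) -> &'a T {
        // SAFETY: `new` verified size and alignment, and T: AnyBits means
        // any bit pattern is valid.
        unsafe { &*self.bytes.as_ptr().cast::<T>() }
    }
}
```

Only new can fail; every later access through get is check-free.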


Alignment checks are a single a & mask == 0.
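Spelled out, assuming the mask is the target alignment minus one (which works because alignments are always powers of two), the check looks like this. The function name is mine.

```rust
use std::mem::align_of;

/// Returns true if `ptr` is suitably aligned for a value of type T.
pub fn is_aligned_for<T>(ptr: *const u8) -> bool {
    // align_of is always a power of two, so align - 1 is a mask of the low bits.
    let mask = align_of::<T>() - 1;
    (ptr as usize) & mask == 0
}
```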

ATTENTION!

The conversation around this topic has been really great. We've received a lot of wonderful feedback and need some time to fully process it. If you would like to join in on this effort, we have started a new project under the guidance of the language team. You can learn more about it by reading the RFC and visiting the dedicated repo. If you'd like to collaborate, come join the effort on the Rust Zulip.


Hello, I just wanted to chime in and say that this proposal solves a real issue I have with the language as it is today. For example, I have wanted to cast the contents of a byte slice into an EXT2 superblock or a devicetree blob, and at present Rust offers no easy, safe way to accomplish this. My one issue with the proposal as it stands is that it punts on endianness, which comes up constantly in these sorts of serialization and validation problems. It would be nice if endianness were addressed either by automatic translation to a native-endian type, or by providing a library of integer types with explicit endianness which implement this trait, e.g. U32LittleEndian or u32le. Nonetheless, I don't want to hold up this proposal with bikeshedding. It addresses a real issue I have with Rust today, and if it gets accepted without addressing endianness, but in a form that lets me easily implement types such as U32LittleEndian myself, that's still better than the status quo.


Thanks! There are no plans to address endianness, since this is specifically for transmuting, which by definition uses the native endianness of the machine. That being said, I think it would be awesome to have a library on top of this that provides conveniences for endianness. I don't believe anything in the proposal will be incompatible with that.


I would like to add PhantomData to the core types that have these traits. This is useful for handling endianness. I use it like this:

#[repr(transparent)]
pub struct U32<E: Endian>(u32, PhantomData<E>);

impl<E: Endian> U32<E> {
    /// Construct a new value given a native endian value.
    pub fn new(e: E, n: u32) -> Self {
        Self(e.write_u32(n), PhantomData)
    }
    /// Return the value as a native endian value.
    pub fn get(self, e: E) -> u32 {
        e.read_u32(self.0)
    }
    /// Set the value given a native endian value.
    pub fn set(&mut self, e: E, n: u32) {
        self.0 = e.write_u32(n);
    }
}

The Endian methods are implemented using u32::to_le etc.
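For anyone who wants to try it, here is a self-contained version of the above with the missing pieces filled in. The Endian trait body, the LittleEndian and BigEndian marker types, and the to_bytes helper are my guesses at a reasonable shape for illustration; only U32 and its three methods come from the snippet above.

```rust
use std::marker::PhantomData;

/// Assumed shape of the Endian trait: converts between native and stored order.
pub trait Endian: Copy {
    fn read_u32(self, stored: u32) -> u32;
    fn write_u32(self, native: u32) -> u32;
}

#[derive(Clone, Copy)]
pub struct LittleEndian;
#[derive(Clone, Copy)]
pub struct BigEndian;

impl Endian for LittleEndian {
    fn read_u32(self, stored: u32) -> u32 { u32::from_le(stored) }
    fn write_u32(self, native: u32) -> u32 { native.to_le() }
}

impl Endian for BigEndian {
    fn read_u32(self, stored: u32) -> u32 { u32::from_be(stored) }
    fn write_u32(self, native: u32) -> u32 { native.to_be() }
}

#[repr(transparent)]
pub struct U32<E: Endian>(u32, PhantomData<E>);

impl<E: Endian> U32<E> {
    /// Construct a new value given a native endian value.
    pub fn new(e: E, n: u32) -> Self { Self(e.write_u32(n), PhantomData) }
    /// Return the value as a native endian value.
    pub fn get(self, e: E) -> u32 { e.read_u32(self.0) }
    /// Set the value given a native endian value.
    pub fn set(&mut self, e: E, n: u32) { self.0 = e.write_u32(n); }
    /// Hypothetical helper: the bytes exactly as they sit in memory.
    pub fn to_bytes(&self) -> [u8; 4] { self.0.to_ne_bytes() }
}
```

Because U32<E> is repr(transparent) over u32 and PhantomData is zero-sized, it has the same layout as u32, which is why giving PhantomData these traits would let such wrappers be transmuted from bytes directly.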
