[Pre-RFC]: Safe Transmute

Sure, in that case you can't prove a lack of padding before monomorphizing.

I think there's probably a lot of iteration to go on the design, but I want to strongly endorse the fundamental idea here: making unsafe code more type-safe. We've already seen some movement in this general direction with types like MaybeUninit and NonNull; I think this is an area of work which we could really elevate in focus now and in the future.

In Rust 2015, unsafe code was basically "here's C, have fun!" But we've discovered that we can actually apply a lot of type safety techniques to unsafe code to prevent errors even though the code is not type-checked for full memory safety. This is a big improvement for all authors of unsafe code!

One issue I'd highlight as an area of concern, though: it's not really enough to just be a safer API, it also has to be similarly or more ergonomic. We've seen with NonNull in particular that many people actually prefer not to use it over raw pointers, even though it provides type safety around null and even a representation optimization, because it's more painful than just writing if ptr != ptr::null(). As Gankra has put it: NonNull at rest, immediately converted to a raw pointer whenever you need to use it. Not great!

Some specific comments on the discussion:

Unfortunately not, unless you're excluding types that are repr(align), which can be used to overalign types.

Personally I would really consider leaving generics out of the 1.0 of this.

Something with LLVM freeze (which avoids UB, but still has downsides) or initializing the padding somehow.

4 Likes

Over-aligned types also get "over-sized": Rust Playground (This is necessary to make slices work, since we don’t track stride separately from size)

3 Likes

One way to do that would be, given union Foo { a: u8, b: u32 }, making a let x = Foo { a: 0 }; initialize 4 bytes instead of just 1 like it does now.

But doing this for all unions would break MaybeUninit (we don’t want MaybeUninit::<[u16; 100]>::uninit() to be writing 200 bytes), so it would have to be opt-in or opt-out.
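
To make the trade-off concrete, here is a small illustration of the behaviour being discussed (a sketch of today's semantics, not a proposal):

union Foo {
    a: u8,
    b: u32,
}

fn main() {
    // Today this writes only the single byte of a; the other three bytes
    // that the b view covers stay uninitialized. Under the opt-in idea
    // above, constructing Foo { a: 0 } would write all 4 bytes instead,
    // while MaybeUninit::uninit() would keep writing nothing.
    let x = Foo { a: 0 };
    let _ = x;
}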

using &[MaybeUninit<u8>] instead of &[u8] might avoid UB in LLVM..
..but is it safe to pass a MADV_FREE-cursed page to a syscall?

1 Like

I agree with this; I'd love to hear more about endianness as well. It might very well be out of scope, but having some clarification on that would be very useful.

I spent some time today thinking about extensions for async_std::io::{Read, Write} to read and write bytes with a given endianness. Current status is here: https://github.com/async-rs/async-std/issues/578.

What I'm wondering is: if we know how to convert from arrays to numbers (given endianness), and vice versa, could we enable something like this:

let mut reader = Cursor::new(vec![2, 5, 3, 0]);
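// big-endian: [2, 5] == 0x0205 == 517, [3, 0] == 0x0300 == 768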
assert_eq!(517_u16, reader.read_be_bytes().await?);
assert_eq!(768_u16, reader.read_be_bytes().await?);

note: I hope this is not off-topic. I was just thinking about this today, and figured it might be relevant enough to share. Apologies if it is!

The problem with BigEndian representations of integers is that, as the number of bytes of the integer representation increases, all such nominally BigEndian representations switch to a multi-precision LittleEndian-like encoding of chunks so that arithmetic carries can be propagated low-to-high in constant time.

Example: On a CPU with 32-bit BigEndian representation, a u128 typically consists of four u32s with the lowest-significance u32 first in memory. Even if that constituent memory ordering does not occur for u128, it certainly would for larger integers like u1024.

Thus on a 32-bit BigEndian architecture, a u128 might be a succession of BigEndian u32s starting with the least significant u32, with byte order in memory (3,2,1,0, 7,6,5,4, b,a,9,8, f,e,d,c). On a 64-bit BigEndian architecture that byte order probably would be (7,6,5,4,3,2,1,0, f,e,d,c,b,a,9,8), while on a hypothetical 128-bit BigEndian architecture it would simply be (f,e,d,c,b,a,9,8,7,6,5,4,3,2,1,0).

This calls into question the whole philosophy of _be_ methods: shouldn't they be _be4_, _be8_ and _be16_ for the prior-listed three architectures? (Fortunately we need only consider _be4_ and _be8_, and perhaps _be2_ if Rust is extended to 16-bit architectures such as the MSP430.)

2 Likes

Hello!

Summary

First of all, thanks for the RFC. This is by far the biggest issue I have with the Rust language as a developer of many parsers, operating systems, and hypervisors in Rust.

Background

This issue is often brushed off as "use a library for it" or "this is not a common problem", but for systems developers it is indeed a very common issue. In fact, as someone pushing Rust in my work environment, this is one of the biggest complaints I get about the language from my peers (often from a handful of people every quarter). Thus, here is some perspective.

I'm a security researcher who specializes in writing tools in Rust (as do many of my peers). A daily part of our work is implementing parsers for various protocols and file formats; I'd say I write 30+ parsers a year. This means I'm constantly reading bytes from a socket or file and converting them to Rust structures. This is an extremely painful part of Rust, especially before the from/to_xe_bytes() APIs, which are still a pain.

For example, if I want to read a file and parse it into a structure, it'll look something like this:

struct Header {
    magic: u32,
    foo:   u64,
    bar:   u128,
    baz:   i8
}

use std::convert::TryInto; // needed for try_into() on the 2018 edition

let file = std::fs::read(filename).unwrap();
let header = Header {
    magic: u32::from_le_bytes(file[0..4].try_into().unwrap()),
    foo:   u64::from_le_bytes(file[4..12].try_into().unwrap()),
    bar:   u128::from_le_bytes(file[12..28].try_into().unwrap()),
    baz:   i8::from_le_bytes(file[28..29].try_into().unwrap()),
};

Doing this a few dozen times per protocol (for each struct or nested structure), 20+ times a year, led me to implement a library for this: safecast. Often for small projects I don't like to bring in a dependency, so I often have to write macros to compute the offsets into the buffer (or implement an io::Read-style consumer so I don't have to know anything but the field size).
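
For context, a minimal version of the io::Read-style consumer I mean could look like this (hypothetical helper functions, not from any particular crate): read fixed-size fields in order so offsets never have to be computed by hand.

use std::io::{self, Read};

fn read_u32_le(r: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

fn read_u64_le(r: &mut impl Read) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

// Usage: let mut cursor = &file[..];
//        let magic = read_u32_le(&mut cursor)?;
//        let foo   = read_u64_le(&mut cursor)?;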

Further, this style of parsing (as is parsing with third-party libs like byteorder) is not nearly as efficient as a cast, which for these plain-old-data-style structures is entirely provable by the compiler to be safe!

Safecast

I implemented a library called safecast to accomplish a goal very similar to this RFC's, so I'll mention why I designed some things the way I did, along with suggestions I have for the RFC.

Safecast is a hacked-up little procedural macro that accomplishes something very similar to this RFC, but with some constraints due to not being able to compute things like padding space at compile time.

For Safecast I have a #[derive(ByteSafe)] which is very similar to Transmutable here, and the APIs I have are:

Safecast::cast_copy_into<T: Safecast + ?Sized>(&self, dest: &mut T)

This copies the raw bytes of a self: Safecast to an existing buffer of a different T: Safecast. This allows alignment checks to be skipped for the structures, since the bytes are copied byte-by-byte under the hood by a copy_nonoverlapping, which doesn't have alignment constraints.
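
As a rough sketch of why a byte-wise copy sidesteps alignment (illustrative only, not the actual safecast implementation), the core of such an operation can be just a size check plus a u8-level copy_nonoverlapping:

use core::{mem, ptr};

// Marked unsafe only because this sketch omits the Safecast bound that
// would make arbitrary bytes valid for D.
unsafe fn cast_copy_into_sketch<S, D>(src: &S, dst: &mut D) {
    assert_eq!(mem::size_of::<S>(), mem::size_of::<D>(), "size mismatch");
    // Copying through u8 pointers has no alignment requirement beyond 1.
    ptr::copy_nonoverlapping(
        src as *const S as *const u8,
        dst as *mut D as *mut u8,
        mem::size_of::<S>(),
    );
}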

Safecast::cast_copy<T: Safecast>(&self) -> T

This is the same as cast_copy_into but it creates the output T for you; this is the most commonly used form. For example: let foo: u32 = [1, 2, 3, 4].cast_copy();

Safecast::cast<T: Safecast>(&self) -> &[T]

This is more similar to this RFC as it provides casting from self to T. This implements run-time alignment checks as both self and T have to be compatible in their alignments.

Safecast::cast_mut<T: Safecast>(&mut self) -> &mut [T]

A mutable version of cast.

These casting APIs return a [T] rather than T because I don't implement Safecast for [T: Safecast].

Suggestions to this RFC

API should convert Transmutable to other Transmutables rather than always to and from bytes

Instead of providing an API for converting from T: Transmutable to &[u8], I think it should just provide an API for converting one T: Transmutable to another T: Transmutable. This allows more generic conversions, and if you want a slice of bytes you could simply request a conversion to [u8]. (I recognize this RFC doesn't cover implementing Transmutable for [T: Transmutable], but I think that should be implemented in a feature like this.)

This would mean that casting from something to a T would now also require a check on alignment, but in the "default" case of casting from a [u8] the compiler can easily remove these alignment checks, as it knows that all alignments are valid (the alignment check would constant-fold to true).
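
For illustration, a generic Transmutable-to-Transmutable reference cast along these lines might look like the following sketch (the cast_ref name and the placeholder Transmutable trait are assumptions for the example, not the RFC's actual API):

use core::mem;

unsafe trait Transmutable {} // stand-in for the RFC's trait
unsafe impl Transmutable for u8 {}
unsafe impl Transmutable for u32 {}

fn cast_ref<Src: Transmutable, Dst: Transmutable>(src: &Src) -> Option<&Dst> {
    let ptr = src as *const Src as *const u8;
    if mem::size_of::<Src>() >= mem::size_of::<Dst>()
        && ptr as usize % mem::align_of::<Dst>() == 0
    {
        // SAFETY: size and alignment checked above; Transmutable is assumed
        // to guarantee that any bit pattern of Src is valid for Dst.
        Some(unsafe { &*ptr.cast::<Dst>() })
    } else {
        None
    }
}

In the byte-slice direction, align_of::<u8>() == 1, so the alignment check is trivially true and the optimizer can delete it.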

There should be a mut variant

I don't know if it's entirely safe with respect to aliasing rules, but I also think it should be possible to provide casting between Transmutable types with mutability, in place. Perhaps the compiler would need to be aware of aliasing being broken here? But at the compiler level I think it's possible. This would allow in-place modification of structures, which is critical for performance in things like a TCP stack.

I'd recommend introducing a cast_copy and cast_copy_into-style interface like I have in safecast

While casting requires alignments to match correctly, copying does not. Types that implement Transmutable can be safely copied to another Transmutable by doing a byte-by-byte copy of the underlying data in memory. This allows using the API for converting between types in situations where alignment may not always line up and some performance can be sacrificed.

I recognize this is a little out-of-scope as this RFC is explicitly about transmute, but I think it's an important aspect of the API and I'd think it'd be the most common use. For example:

let foo: Structure = std::fs::read(file)?.cast_copy()?;

leads to a really clean format for casting a file to a structure (eg: parsing a header).

Consider making it have an unsafe impl

I think it'd be fine if the trait gets automatically derived based on a #[derive(Transmutable)] but I don't know why it shouldn't be possible to unsafe impl it if you really wanted to.

Things I like about this RFC

  • First, I love that this exists. Thank you! This is my biggest complaint about Rust and I've brought it up numerous times but never got traction with the Rust community.

  • I like how the API 'ignores' extra bytes for instances where [u8].len() exceeds the size of T. This differs from the common behaviours in Rust (slice.copy_from_slice() and integer::from_le_bytes() require exact sizing), but I imagine that if there were a strict size-match requirement, all my code using this API would end up being Structure::from_bytes(&slice[..std::mem::size_of::<Structure>()]), which in my opinion is implied enough by the API that it'd just get annoying.

Thoughts

Please let me know if there is anything I can do to help get this into the language. I recognize that people always suggest "there's a crate for that", but currently this is not possible to implement (correctly) in a procedural macro. Further, I have no interest in bringing in third-party crates using unsafe code to implement transmutes into my kernel for critical parsing like in my networking stack.

This RFC has no impact on the size of the compiled core or std libraries, and would not affect existing code in binary formats. The amount of code required to maintain this feature over time is minimal as it's pretty much identical to any other recursive-derive with hardcoded derives for root-level types (u8, u16, ...). I don't know what the rustc internal derives look like but my Safecast library is only ~200 LoC for the derive and ~100 LoC for the impls for the traits. While my implementations are incomplete, I can't imagine this would be more than ~500 LoC for Rust to maintain, which would almost never change.

Finally, I want to stress how important this is to systems, parser, and protocol developers like myself. It shouldn't be required to write unsafe code to parse basic headers from files in a performant way, nor should it be required to pull in (potentially unmaintained) unsafe third-party crates to do this when it's something that can only be done correctly by the compiler itself.

Sorry for the rant, but this genuinely would make a world of difference to me, -B

9 Likes

I am starting to see a pattern here: :wink:

1 Like

Have you ever heard of scroll? It is made for this kind of binary format parsing and writing.

2 Likes

Also, see https://docs.rs/bytemuck (cc @Lokathor, @Shnatsel)


Personally I find that ::zerocopy has already gotten very close to this design, it just lacks "popularity". (And by popularity, I think that awareness against "unsafe Rust being like C and vice versa", a very common misconception, is sorely needed.)

Imho, building, on top of the UCG, a solid framework for correctly / soundly manipulating raw memory should be one of the main 2020 targets. Else the current unsafe mistakes will simply become more and more pervasive, due to the lack of a visible better alternative.

2 Likes

The biggest thing holding bytemuck back is that (as far as I know) you cannot use a proc macro to verify that a type is a valid candidate for Pod / Zeroable / other unsafe marker traits, because a proc macro runs at expansion time, so you can't just query the types of the fields of a struct and find out which traits those types implement.

Huh? I thought zerocopy implemented that already with its proc macro, and I believe @dhm here also found a way to make this work with a macro-by-example.

@gamozolabs I find this ironic because zerocopy was implemented for this exact purpose: providing safe parsing primitives in a networking stack in Fuchsia.

Ultimately, the code in the standard library is no more safe than code in a third-party crate. Possibly less safe, in fact, since it is much harder to build std with sanitizers or do meaningful fuzzing of it than with a crate.

3 Likes

Indeed, zerocopy is very close to this design. Yet the ecosystem has not stabilized around it - in fact, there are two other crates that do basically the same thing: scroll and structview.

Normally I would be the guy telling you that having a crate is totally fine and you don't need an std abstraction for this, but seeing how we have at least 3 crates reimplementing the same thing already... 5 if you count safe-transmute and bytemuck... maybe I need to update that opinion.

4 Likes

For Zeroable, you can use this to assert that every field implements it: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0749b2ca7b3cc47fbb22322ceffeb84c
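
For reference, one way such a field-by-field assertion can work (a sketch; the linked playground may do it differently): emit a throwaway function whose body only type-checks if every field type implements Zeroable.

struct MyStruct {
    a: u32,
    b: [u8; 4],
}

// What a derive could generate for MyStruct:
#[allow(dead_code)]
fn _assert_mystruct_fields_are_zeroable() {
    fn assert_zeroable<T: bytemuck::Zeroable>() {}
    assert_zeroable::<u32>();
    assert_zeroable::<[u8; 4]>();
}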

Or do you mean conditionally implementing Zeroable if every field implements Zeroable, like Send or Sync?

1 Like

Hm, I did mean the former. I'll have to look into this checker more; it might be simpler than I thought.

!ATTENTION!

Thanks everyone for the great feedback! After reading through it all, Josh and I have created a version 2 of the proposal which you can find here. Please let us know what you think!

3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.