Sure, in that case you can't prove a lack of padding before monomorphizing.
I think there's probably a lot of iteration still to come on the design, but I want to strongly endorse the fundamental idea here: making unsafe code more type-safe. We've already seen some movement in this general direction with types like `MaybeUninit` and `NonNull`; I think this is an area of work which we could really elevate in focus now and in the future.
In Rust 2015, unsafe code was basically "here's C, have fun!" But we've discovered that we can actually apply a lot of type safety techniques to unsafe code to prevent errors even though the code is not type-checked for full memory safety. This is a big improvement for all authors of unsafe code!
One issue I'd highlight as an area of concern, though: it's not really enough to just be a safer API; it also has to be similarly or more ergonomic. We've seen with `NonNull` in particular that many people actually prefer raw pointers over it, even though it provides type safety around null and even a representation optimization, because it's more painful than just writing `if ptr != ptr::null()`. As Gankra has put it: `NonNull` at rest, immediately converted to a raw pointer whenever you need to use it. Not great!
Some specific comments on the discussion:
Unfortunately not, unless you're excluding types that are `repr(align)`, which can be used to overalign types.
Personally I would really consider leaving generics out of the 1.0 of this.
Something with LLVM freeze (which avoids UB, but still has downsides) or initializing the padding somehow.
Over-aligned types also get "over-sized" (see this Rust Playground example). This is necessary to make slices work, since we don't track stride separately from size.
One way to do that would be, given `union Foo { a: u8, b: u32 }`, making `let x = Foo { a: 0 };` initialize 4 bytes instead of just 1 like it does now. But doing this for all unions would break `MaybeUninit` (we don't want `MaybeUninit::<[u16; 100]>::uninit()` to be writing 200 bytes), so it would have to be opt-in or opt-out.
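For concreteness, a minimal sketch of the status quo being discussed (current Rust; the idea above would change the behaviour described in the comment):

```rust
use core::mem::size_of;

#[allow(dead_code)]
union Foo {
    a: u8,
    b: u32,
}

fn main() {
    assert_eq!(size_of::<Foo>(), 4);
    // Today this writes only 1 byte; the other 3 bytes of the union
    // remain uninitialized, which is why reading `x.b` afterwards is UB.
    // Under the proposal, constructing through the smaller variant would
    // initialize all 4 bytes instead.
    let x = Foo { a: 0 };
    let _ = &x;
}
```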
Using `&[MaybeUninit<u8>]` instead of `&[u8]` might avoid UB in LLVM… but is it safe to pass a `MADV_FREE`-cursed page to a syscall?
I agree with this; I'd love to hear more about endianness as well. It might very well be out of scope, but having some clarification on that would be very useful.
I spent some time today thinking about extensions for `async_std::io::{Read, Write}` to read and write bytes with a given endianness. Current status is here: https://github.com/async-rs/async-std/issues/578.
What I'm wondering is: given that we know how to convert from arrays to numbers (with a specified endianness) and vice versa, could we enable something like this?

```rust
let mut reader = Cursor::new(vec![2, 5, 3, 0]);
assert_eq!(517_u16, reader.read_be_bytes().await?);
assert_eq!(768_u16, reader.read_be_bytes().await?);
```
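A synchronous sketch of how such an extension could bottom out in the existing `from_be_bytes` conversions (the trait and method names here are hypothetical, not an existing API):

```rust
use std::io::{self, Read};

// Hypothetical extension trait; a real design would presumably be
// generic over the integer type and have an async counterpart.
trait ReadBeBytes: Read {
    fn read_be_u16(&mut self) -> io::Result<u16> {
        let mut buf = [0u8; 2];
        self.read_exact(&mut buf)?;
        Ok(u16::from_be_bytes(buf))
    }
}

impl<R: Read> ReadBeBytes for R {}

fn main() -> io::Result<()> {
    let mut reader = io::Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517_u16, reader.read_be_u16()?);
    assert_eq!(768_u16, reader.read_be_u16()?);
    Ok(())
}
```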
note: I hope this is not off-topic. I was just thinking about this today, and figured it might be relevant enough to share. Apologies if it is!
The problem with BigEndian representations of integers is that, as the number of bytes of the integer representation increases, all such nominally BigEndian representations switch to a multi-precision LittleEndian-like encoding of chunks so that arithmetic carries can be propagated low-to-high in constant time.
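The carry-propagation argument is easiest to see in code. A minimal multi-precision addition over 32-bit limbs (illustrative, not taken from any particular bignum library):

```rust
// Adding two multi-precision integers limb by limb. The loop must start
// at the least-significant limb so carries can flow upward, which is why
// storing limbs least-significant-first is the natural layout.
fn add_limbs(acc: &mut [u32], rhs: &[u32]) {
    assert_eq!(acc.len(), rhs.len());
    let mut carry = 0u64;
    for (a, &b) in acc.iter_mut().zip(rhs) {
        let sum = *a as u64 + b as u64 + carry;
        *a = sum as u32;
        carry = sum >> 32;
    }
    // Final carry is dropped here (wrapping add), fine for a sketch.
}
```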
Example: on a CPU with a 32-bit BigEndian representation, a `u128` typically consists of four `u32`s with the lowest-significance `u32` first in memory. Even if that constituent memory ordering does not occur for `u128`, it certainly would for larger integers like `u1024`.

Thus on a 32-bit BigEndian architecture, a `u128` might be a succession of BigEndian `u32`s starting with the least significant `u32`, with byte order in memory (3,2,1,0, 7,6,5,4, b,a,9,8, f,e,d,c). On a 64-bit BigEndian architecture that byte order probably would be (7,6,5,4,3,2,1,0, f,e,d,c,b,a,9,8), while on a hypothetical 128-bit BigEndian architecture it would simply be (f,e,d,c,b,a,9,8,7,6,5,4,3,2,1,0).
This calls into question the whole philosophy of `_be_` methods: shouldn't they be `_be4_`, `_be8_` and `_be16_` for the three architectures listed above? (Fortunately we need only consider `_be4_` and `_be8_`, and perhaps `_be2_` if Rust is extended to 16-bit architectures such as the MSP430.)
Hello!
Summary
First of all, thanks for the RFC. This is by far the biggest issue I have with the language as someone who develops parsers, operating systems, and hypervisors in Rust.
Background
This issue is often brushed off with "use a library for it" or "this is not a common problem", but for systems developers it is in fact a very common issue. As someone pushing Rust in my work environment, it is one of the biggest complaints I get about the language from my peers (often from a handful of people every quarter). Thus, here is some perspective.
I'm a security researcher who specializes in writing tools in Rust (as do many of my peers). A daily part of our work is implementing parsers for various protocols and file formats; I'd say I write 30+ parsers a year. This means I'm constantly reading bytes from a socket or file and converting them to Rust structures. This is an extremely painful part of Rust, especially before the `from`/`to_xe_bytes()` APIs, which are still a pain.
For example, if I want to read a file and parse it into a structure, it'll look something like this:
```rust
use std::convert::TryInto; // for the slice -> array conversions below

struct Header {
    magic: u32,
    foo: u64,
    bar: u128,
    baz: i8,
}

let file = std::fs::read(filename).unwrap();
let header = Header {
    magic: u32::from_le_bytes(file[0..4].try_into().unwrap()),
    foo: u64::from_le_bytes(file[4..12].try_into().unwrap()),
    bar: u128::from_le_bytes(file[12..28].try_into().unwrap()),
    baz: i8::from_le_bytes(file[28..29].try_into().unwrap()),
};
```
Doing this a few dozen times per protocol (for each struct or nested structure), 20+ times a year, led me to implement a library for this, `safecast`. Often for small projects I don't like to bring in a dependency, so I frequently have to write macros to compute the offsets into the buffer (or implement an `io::Read`-style consumer so I don't have to know anything but the field size).

Further, this style of parsing (as is parsing with third-party libs like `byteorder`) is not nearly as effective as casts, which for these plain-old-data-style structures are entirely provable by the compiler to be safe!
Safecast
I implemented a library called `safecast` to accomplish a very similar goal to this RFC, so I'll mention why I designed some things the way I did, along with my suggestions for the RFC.

Safecast is a hacked-up little procedural macro that accomplishes something very similar to this RFC, but with some constraints due to not being able to compute things like padding space at compile time.
For Safecast I have a `#[derive(ByteSafe)]`, which is very similar to `Transmutable` here, and the APIs I have are:

`Safecast::cast_copy_into<T: Safecast + ?Sized>(&self, dest: &mut T)`

This copies the raw bytes of `self` into an existing buffer of a different `T: Safecast`. This allows alignment checks to be skipped for the structures, as the bytes are copied byte-by-byte under the hood by a `copy_nonoverlapping`, which has no alignment constraints.
`Safecast::cast_copy<T: Safecast>(&self) -> T`

This is the same as `cast_copy_into` except that it creates the output `T` for you; this is the most commonly used form. For example: `let foo: u32 = [1, 2, 3, 4].cast_copy();`.
`Safecast::cast<T: Safecast>(&self) -> &[T]`

This is more similar to this RFC, as it provides casting from `self` to `T`. It implements run-time alignment checks, since both `self` and `T` have to be compatible in their alignments.
`Safecast::cast_mut<T: Safecast>(&mut self) -> &mut [T]`

A mutable version of `cast`.

These casting APIs return a `&[T]` rather than a `T` because I don't implement `Safecast` for `[T: Safecast]`.
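To illustrate the mechanism behind the copy-style APIs, here is a self-contained sketch of `cast_copy` specialized to one concrete type (the real crate derives this generically; this is illustrative only):

```rust
use core::{mem, ptr};

// A byte-for-byte copy via copy_nonoverlapping on u8 pointers, which
// sidesteps any alignment requirement on the source bytes.
fn cast_copy_u32(src: &[u8]) -> u32 {
    assert!(src.len() >= mem::size_of::<u32>());
    let mut out = 0u32;
    // SAFETY: length was checked above, and every bit pattern is a
    // valid u32, so filling `out` from raw bytes is sound.
    unsafe {
        ptr::copy_nonoverlapping(
            src.as_ptr(),
            &mut out as *mut u32 as *mut u8,
            mem::size_of::<u32>(),
        );
    }
    out
}

fn main() {
    assert_eq!(cast_copy_u32(&[1, 2, 3, 4]), u32::from_ne_bytes([1, 2, 3, 4]));
}
```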
Suggestions to this RFC
API should convert `Transmutable` to other `Transmutable`s rather than always to and from bytes
Instead of providing an API for converting from `T: Transmutable` to `&[u8]`, I think it should just provide an API for converting from one `Transmutable` type to another. This allows more generic conversions, and if you want a slice of bytes you can simply request a conversion to `[u8]`. (I recognize this RFC doesn't cover implementing `Transmutable` for `[T: Transmutable]`, but I think a feature like this should implement it.)

This would mean that casting from something to a `T` would also require an alignment check, but in the "default" case of casting from a `[u8]` the compiler will easily remove these alignment checks, as it knows that all alignments are valid (the alignment check would be constantly provable to always be `true`). See the sketch below.
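A rough sketch of what that generic shape could look like (the trait and function names are illustrative, not the RFC's API, and the marker is assumed to also imply "no padding bytes"):

```rust
use core::mem::{align_of, size_of};

/// Illustrative marker: every bit pattern is valid and there is no padding.
unsafe trait Transmutable {}
unsafe impl Transmutable for u8 {}
unsafe impl Transmutable for u32 {}

/// One generic Transmutable-to-Transmutable conversion; casting to bytes
/// is just the special case Dst = u8, where the alignment check is
/// trivially true and easily optimized out.
fn cast_ref<Src: Transmutable, Dst: Transmutable>(src: &[Src]) -> Option<&Dst> {
    let len_bytes = src.len() * size_of::<Src>();
    if len_bytes >= size_of::<Dst>() && (src.as_ptr() as usize) % align_of::<Dst>() == 0 {
        // SAFETY: size and alignment checked; both types accept any bytes.
        Some(unsafe { &*(src.as_ptr() as *const Dst) })
    } else {
        None
    }
}

fn main() {
    let bytes = [1u8, 2, 3, 4, 5, 6, 7, 8];
    if let Some(x) = cast_ref::<u8, u32>(&bytes) {
        println!("{:#010x}", x);
    }
}
```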
There should be a `mut` variant
I don't know if it's entirely safe with respect to aliasing rules, but I also think it should be possible to cast between `Transmutable` types mutably, in place. Perhaps the compiler would need to be aware that aliasing is being broken here? But at the compiler level I think it's possible. This would allow for in-place modification of structures, which is critical for performance in things like a TCP stack.
I'd recommend introducing a `cast_copy`- and `cast_copy_into`-style interface like I have in `safecast`
While casting requires alignments to match, copying does not. Types that implement `Transmutable` can safely be copied to another `Transmutable` by doing a byte-by-byte copy of the underlying data in memory. This allows the API to be used for converting between types in situations where alignment may not always line up and some performance can be sacrificed.
I recognize this is a little out of scope, as this is explicitly about `transmute`, but I think it's an important aspect of the API and I'd expect it to be the most common use. For example, `let foo: Structure = std::fs::read(file)?.cast_copy()?;` leads to a really clean idiom for casting a file to a structure (e.g. parsing a header).
Consider making it have an `unsafe impl`

I think it'd be fine if the trait gets automatically derived via `#[derive(Transmutable)]`, but I don't see why it shouldn't be possible to `unsafe impl` it if you really wanted to.
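For example, a manual opt-in might look like this (a sketch; the trait name comes from this discussion but its exact shape is unspecified):

```rust
unsafe trait Transmutable {}

// A type the derive might reject or not see through, where the author
// nevertheless knows every bit pattern is valid.
#[repr(C)]
struct RawSector([u8; 512]);

// The `unsafe` keyword shifts the proof burden from the compiler to the
// implementer, exactly like other unsafe marker traits (e.g. Send/Sync).
unsafe impl Transmutable for RawSector {}

fn main() {}
```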
Things I like about this RFC
- First, I love that this exists. Thank you! This is my biggest complaint and I've brought it up numerous times but never got traction with the Rust community.
- I like how the API "ignores" trailing bytes in cases where `[u8].len()` exceeds the size of `T`, even though this differs from common behaviours in Rust like `slice.copy_from_slice()` and `integer::from_le_bytes()`, which require exact sizing. If there were a strict size-match requirement, all my code using this API would end up being `Structure::from_bytes(&slice[..std::mem::size_of::<Structure>()])`, which in my opinion is implied enough by the API that it'd just get annoying.
Thoughts
Please let me know if there is anything I can do to help get this into the language. I recognize that people always suggest "there's a crate for that", but currently this is not possible to implement (correctly) in a procedural macro. Further, I have no interest in bringing third-party crates that use `unsafe` code to implement transmutes into my kernel for critical parsing like my networking stack.
This RFC has no impact on the size of the compiled `core` or `std` libraries, and would not affect existing code in binary formats. The amount of code required to maintain this feature over time is minimal, as it's pretty much identical to any other recursive derive with hardcoded impls for root-level types (`u8`, `u16`, ...). I don't know what the `rustc` internal derives look like, but my Safecast library is only ~200 LoC for the derive and ~100 LoC for the trait impls. While my implementation is incomplete, I can't imagine this would be more than ~500 LoC for Rust to maintain, and it would almost never change.
Finally, I want to stress how important this is to systems, parser, and protocol developers like myself. It shouldn't be necessary to write `unsafe` code to parse basic headers from files in a performant way, nor to pull in (potentially unmaintained) `unsafe` third-party crates to do something that can only be done correctly by the compiler itself.
Sorry for the rant, but this genuinely would make a world of difference to me, -B
I am starting to see a pattern here:
Have you ever heard of `scroll`? It is made for exactly this kind of binary format parsing and writing.
Also, see https://docs.rs/bytemuck (cc @Lokathor, @Shnatsel).
Personally I find that `::zerocopy` has already gotten very close to this design; it just lacks "popularity". (And by popularity, I mean that awareness against the very common misconception that "`unsafe` Rust is like C and vice versa" is sorely needed.)
Imho, building a solid framework for correctly / soundly manipulating raw memory on top of the UCG should be one of the main 2020 targets. Otherwise the current `unsafe` mistakes will simply become more and more pervasive, due to the lack of a visible better alternative.
The biggest thing holding `bytemuck` back is that (as far as I know) you cannot use a proc-macro to verify that a type is a valid candidate for `Pod` / `Zeroable` / other unsafe marker traits, because a proc-macro runs at expansion time, so you can't just query the types of a struct's fields and find out what traits those types implement.
Huh? I thought `zerocopy` implemented that already with its proc macro, and I believe @dhm here also found a way to do this with a macro-by-example.
@gamozolabs I find this ironic, because `zerocopy` was implemented for this exact purpose: providing safe parsing primitives in a networking stack in Fuchsia.
Ultimately, code in the standard library is no safer than code in a third-party crate. Possibly less safe, in fact, since it is much harder to build std with sanitizers or do meaningful fuzzing of it than with a crate.
Indeed, `zerocopy` is very close to this design. Yet the ecosystem has not stabilized around it; in fact, there are two other crates that do basically the same thing: `scroll` and `structview`.

Normally I would be the guy telling you that having a crate is totally fine and you don't need an std abstraction for this, but seeing how we have at least 3 crates reimplementing the same thing already... 5 if you count `safe-transmute` and `bytemuck`... maybe I need to update that opinion.
For `Zeroable`, you can use this to assert that every field implements it: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0749b2ca7b3cc47fbb22322ceffeb84c

Or do you mean conditionally implementing `Zeroable` if every field implements `Zeroable`, like `Send` or `Sync`?
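The assertion in question is presumably a compile-time per-field check along these lines (a sketch only; the playground link above has the actual version):

```rust
unsafe trait Zeroable {}
unsafe impl Zeroable for u8 {}
unsafe impl Zeroable for u32 {}

#[allow(dead_code)]
struct Foo {
    a: u8,
    b: u32,
}

// What a derive could expand to: dead code that only compiles if every
// field's type implements Zeroable.
const _: () = {
    fn assert_zeroable<T: Zeroable>() {}
    #[allow(dead_code)]
    fn check_foo() {
        assert_zeroable::<u8>(); // Foo::a
        assert_zeroable::<u32>(); // Foo::b
    }
};

fn main() {}
```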
Hm, I did mean the former. I'll have to look into this checker more; it might be simpler than I thought.
!ATTENTION!
Thanks everyone for the great feedback! After reading through it all, Josh and I have created a version 2 of the proposal which you can find here. Please let us know what you think!