Auto-trait for stable layout types with no padding?

There's a fairly common pattern of casting structs to bytes, and it has (among others!) footguns of using repr(Rust) structs, tuples (undefined ABI), or types with padding.

repr(Rust) is an invisible default, so this mistake doesn't stand out. It's harder to notice absence of something, like repr(C) or repr(transparent).

Padding is also invisible, and sometimes tricky to spot, like padding at the end of struct caused by alignment. It's also hard to check when using nested structs or type aliases.
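To make the trailing-padding case concrete, here is a minimal sketch. The `Header` struct and `header_padding_bytes` function are hypothetical names used only for illustration, and the numbers assume a typical target where `u32` has 4-byte alignment.

```rust
use std::mem::size_of;

// Hypothetical wire-format header, used only for illustration.
#[repr(C)]
pub struct Header {
    pub tag: u8,   // offset 0, followed by 3 padding bytes so `len` is 4-aligned
    pub len: u32,  // offset 4
    pub flags: u8, // offset 8, followed by 3 trailing padding bytes
}

/// Number of bytes in `Header` that belong to no field.
pub fn header_padding_bytes() -> usize {
    let field_bytes = size_of::<u8>() + size_of::<u32>() + size_of::<u8>();
    size_of::<Header>() - field_bytes
}
```

Nothing in the struct definition hints that half of its bytes are padding; reordering the fields to `len, tag, flags` would shrink the padding to two trailing bytes, which is just as easy to overlook.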

Libraries that try to make cast/transmute safer, like bytemuck, unfortunately can't prevent these footguns at all. Their trait requires an unsafe impl, and the library must trust it as-is. It can't check whether any given implementation is correct. Implementing a Pod trait incorrectly is unfortunately just as easy as getting a hand-written cast to bytes wrong.
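A minimal sketch of the hand-written case described above. `PlainBytes`, `Sample`, and `byte_len` are hypothetical names, not any library's actual API; the point is only that a manual `unsafe impl` is accepted without any layout check.

```rust
use std::mem::size_of;

/// Hypothetical marker trait in the style of a "plain old data" trait.
/// Safety contract: the implementing type has a stable layout and no padding.
pub unsafe trait PlainBytes: Copy + 'static {}

#[derive(Clone, Copy)]
#[repr(C)]
pub struct Sample {
    pub kind: u8,
    pub value: u16, // one padding byte sits between `kind` and `value`
}

// Wrong: `Sample` contains a padding byte, yet this compiles without any diagnostic.
unsafe impl PlainBytes for Sample {}

/// Accepts anything that claims to be `PlainBytes`; it cannot verify the claim.
pub fn byte_len<T: PlainBytes>(_value: &T) -> usize {
    size_of::<T>()
}
```

The sketch deliberately stops short of reading the bytes, since exposing the padding byte as initialized data is exactly the bug the trait contract was supposed to rule out.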

Could Rust help here with auto traits for repr(C)/repr(transparent)/repr(packed) and especially auto traits for no padding?

I thought there was a safe transmute project working on this, but it looks very dead.

I'd like something like that too.

I only see two small problems:

  1. An auto-trait for "no padding" is a semver hazard. Once a type without padding is in a public interface, adding anything to it that causes padding breaks foreign code that uses it with a trait bound on e.g. NoPadding.

    With a Pod auto-trait that combines "stable layout" and "no padding" the semver hazard becomes less problematic, as somebody using repr(C)/repr(transparent) usually intends to guarantee layout properties. Weighing this downside against the perks, I think this could work.

  2. All currently existing auto-traits are leaky, which can (again) cause semver hazards. For example

    fn create_opaque_value() -> impl Sized { 42u8 }
    
    fn consume_sync_value<T: Sync>(_: T) {}
    
    fn main() {
        let token = create_opaque_value();
        consume_sync_value(token);
    }
    

    compiles, despite create_opaque_value not guaranteeing that its result will be Sync. The same would probably also happen with any new ReprC/NoPadding/Pod auto-trait. Edit: ReprC etc. can't be normal auto-traits, so the auto-trait leakiness probably won't apply.

+2, at TigerBeetle we use comptime assert(stdx.no_padding(T)) all over the place all the time, and it does prevent nasty bugs in pervasively zero-copy designs.
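A rough Rust analogue of such a compile-time check can be sketched with a const assertion. This is not TigerBeetle's implementation (theirs is Zig); `assert_no_padding!` and `Packet` are hypothetical names, and the macro relies on the caller listing every field type correctly.

```rust
/// Fails compilation unless the type's size equals the sum of the listed field sizes.
/// Hypothetical helper: the caller must list the type of every field.
macro_rules! assert_no_padding {
    ($ty:ty: $($field:ty),+ $(,)?) => {
        const _: () = assert!(
            ::core::mem::size_of::<$ty>() == 0 $(+ ::core::mem::size_of::<$field>())+,
            "type contains padding bytes"
        );
    };
}

// Hypothetical packet layout, used only for illustration.
#[repr(C)]
pub struct Packet {
    pub id: u32,
    pub kind: u16,
    pub flags: u16,
}

// Compiles: 4 + 2 + 2 bytes of fields fill the whole struct.
assert_no_padding!(Packet: u32, u16, u16);
```

Changing `flags` to `u8` (and updating the macro invocation to match) would turn the padding into a compile error instead of a silent layout change. The weakness is that the field list is maintained by hand, which is the kind of gap a compiler-provided trait would close.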

I don't think just an auto trait works for this, because u8: NoPadding and u16: NoPadding but (u8, u16): !NoPadding.

An auto trait works for AnyBitPatternValid or more limitedly ZeroedIsValid, but not for a lack of padding.
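The composition problem can be sketched as follows. Because tuple layout is unspecified, the sketch uses a hypothetical `#[repr(C)]` struct `Pair` in place of `(A, B)`; `padding_of_pair` is likewise an illustrative name, and the values assume `u16` has 2-byte alignment.

```rust
use std::mem::size_of;

// Stand-in for a tuple, with a layout that is actually specified.
#[repr(C)]
pub struct Pair<A, B>(pub A, pub B);

/// Bytes of `Pair<A, B>` that belong to neither field.
pub const fn padding_of_pair<A, B>() -> usize {
    size_of::<Pair<A, B>>() - size_of::<A>() - size_of::<B>()
}
```

With this, `padding_of_pair::<u8, u8>()` is 0 while `padding_of_pair::<u8, u16>()` is 1, even though every field type is padding-free on its own. An auto trait only looks at whether each field implements the trait, so it cannot express this.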

Right, it would need to be a compiler-implemented marker trait like Sized, not an auto-trait.

They have a derive macro for it.

I'd also like to see some of these traits or similar ones standardized (and some reviving of safe transmute, for that matter). I do think it should be opt-in, like Copy. Reusing an existing attribute like repr(transparent) runs into problems around "this was an implementation detail, not a promise".

If all you need is to turn some struct into bytes wouldn't a freeze be sufficient?

I assume it would need some magic from the compiler (as if the padding was a field with !NoPadding).

Depends on the context, but I think generally no. If the padding is unexpected, it's a bug. It may mean that the exposed data layout doesn't match its spec.

Hi! I'm the person working on project safe transmute. I'm happy to say it's very much not dead — in fact, working on safe transmutation is now my full-time job! We also have an experimental safe transmute trait landed in the compiler!

That said, at the moment, most of the experimentation is happening outside of the standard library. We're iterating on what the right API is, and scraping the defined limits of Rust's memory model. This API iteration work is happening in the zerocopy crate and relevant discussions about Rust's memory model are happening in the Unsafe Code Guidelines repo (here's a sampler).

Right now, we're sprinting on nailing down fallible transmutation and support for unsized types. Soon, I'll be shifting focus back to the compiler-supported analysis to implement support for lifetime checking and to revise our safety analysis.

I want to touch on the point that libraries like bytemuck can't prevent these footguns:

I can't speak to bytemuck, but you might be pleasantly surprised by zerocopy. Our AsBytes trait comes with a fairly sophisticated derive that statically ensures both that the type has no uninitialized bytes, and that the bytes of the type have no provenance. Using zerocopy is way easier than implementing byte casts correctly yourself.

Good to hear. Please update the repo, because “last updated 4 years ago” all over it doesn’t look like any work has been done, and that repo is linked from the official announcement of the wg creation.

NOT A CONTRIBUTION

I've been searching for a way to do this with structs/enums that have components on the heap, and also the other way around, i.e. loading such structs from byte sequences. I understand it's even less safe than going bungee jumping without a cord, but I'd like to look into it anyway. Where can I find more info on how to accomplish that with the current state of the art?

EDIT: I'm looking into zerocopy, but an answer to my original question above would still be very much appreciated.

Okay, I'll bite. What is that about? I have seen a couple of people use that.

Now to (attempt to) answer your actual question: there is zerocopy as you already found. Bytemuck is another crate for similar things.

I believe there are a couple of relevant zero-copy deserialisation crates too. The djkoloski/rust_serialization_benchmark repository on GitHub has benchmarks for Rust serialization frameworks, and some of those are zero-copy. Maybe you can take inspiration from whatever they are doing (or avoid it; I believe abomonation in particular is very unsound).

The way I interpret it is mostly as a marker for people who only want to "follow the thread" that they needn't waste their time reading such a reply.

The main issue with both of them is that I haven't seen any support for heap storage (i.e. Box<T>), which is essential for my use case. If that support does exist, I'd be happy to be corrected; but for example the docs for zerocopy list no FromBytes impl for Vec<T>. The other direction isn't such an issue because Vec<T> can deref to a slice, which does have an impl listing for AsBytes.

A nice-to-have would be support for other collections, e.g. BTreeMap/Set. HashMap/Set would also be nice, but without support for ordering that could potentially get awkward quickly.

I generally assume such disclaimers are something some people are required to include on their posts due to their employment contract or the like.

designated in writing by the copyright owner as "Not a Contribution."

It's impossible for actual heap usage, but the zerovec crate is designed to support truly zero copy deserialization of immutable variable size collections. Generally the approach requires Cow-like functionality, where any mutation of the collection switches back to the usual heap representation.
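The Cow-like behaviour can be sketched with the standard library's `Cow` as a stand-in. This is not zerovec's API; `view_payload` and `push_byte` are hypothetical helpers showing only the borrow-until-mutated idea.

```rust
use std::borrow::Cow;

/// Borrows the payload directly from the input buffer; nothing is copied.
pub fn view_payload(buffer: &[u8]) -> Cow<'_, [u8]> {
    Cow::Borrowed(buffer)
}

/// Mutating through `to_mut` first clones borrowed bytes into an owned `Vec<u8>`.
pub fn push_byte(payload: &mut Cow<'_, [u8]>, byte: u8) {
    payload.to_mut().push(byte);
}
```

As long as the value is only read, it stays a view into the original buffer. The first call to `push_byte` switches it to the usual heap representation and leaves the source buffer untouched.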
