Automatic marker trait for unconditionally valid `#[repr(C)]` types


#1

TL;DR: Adding an automatic marker trait (one that can’t be implemented externally with current features) would remove the need for a significant share of today’s unsafe {} blocks.

Hello, I’ve only been playing with Rust for a few weeks, but I’ve noticed a certain unsafe pattern that can easily be made safe with just a tiny bit of compiler grease.

In low-level/embedded/OS code, it is not uncommon to define #[repr(C)] structures that are used to interpret external data, e.g. I/O-mapped memory areas, or even files when you can depend on the endianness being native to the machine. Unsafe blocks are then used simply to cast references.
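To make the pattern concrete, here is a minimal sketch of the kind of cast I mean. The `Header` struct, its fields, and `header_unchecked` are hypothetical names made up for illustration; the point is only that nothing here is checked, so every requirement lands on the caller.

```rust
/// Hypothetical header of some external data format (illustrative only).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Header {
    pub magic: u32,
    pub version: u16,
    pub flags: u16,
}

/// Reinterprets the start of `bytes` as a `Header` without any checks.
///
/// # Safety
/// The caller must guarantee that `bytes` holds at least
/// `size_of::<Header>()` bytes and that `bytes.as_ptr()` is aligned
/// for `Header`. Neither condition is verified here.
pub unsafe fn header_unchecked(bytes: &[u8]) -> &Header {
    // The cast itself is trivial; the danger lies in the unchecked assumptions.
    unsafe { &*(bytes.as_ptr() as *const Header) }
}
```

Every call site needs its own unsafe block and its own informal argument for why the buffer is long enough and aligned.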

Obviously, this comes with a number of pitfalls. Some Rust types cannot contain arbitrary bit patterns (bool, enums, references, etc.), and while most Rust programmers should know that, it’s inevitable that some people coming from C will unknowingly invoke undefined behavior this way. Additionally, casting pointers around comes with alignment issues, which even fewer people would consider when writing unsafe code.
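Both pitfalls can be illustrated with a small sketch. The helper names `bool_from_byte` and `is_aligned_for` are hypothetical; they just spell out the checks that a raw cast silently skips.

```rust
use std::mem::align_of;

/// Converts a raw byte to `bool` only if it is one of the two valid
/// bit patterns. Reinterpreting any other byte as `bool` through a
/// pointer cast would be undefined behavior.
pub fn bool_from_byte(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Returns `true` if `ptr` is suitably aligned to be read as a `T`.
pub fn is_aligned_for<T>(ptr: *const u8) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}
```

A byte buffer has alignment 1, so a cast to anything with a larger alignment requirement needs a check like the second one, even when the type itself accepts every bit pattern.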

I wrote a tiny crate to alleviate said issues: https://github.com/randomites/plain

The gist of it is that when you know a certain set of assumptions holds for a type T (#[repr(C)], all bit patterns being valid, not being a Drop type, perhaps also that all contents are public; note that these requirements are strictly stronger than those of Copy), then you can perform conversions between &T and &[u8] without ever using unsafe blocks yourself. The wrapper functions just check size and alignment, and the restrictions on T make sure safe code can’t do anything unsafe with those references.
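A minimal sketch of that idea follows. It is not the crate’s actual source: the `Error` enum, the exact signature of `from_bytes`, and the `Pair` example type are assumptions made for illustration, and only the &[u8] to &T direction is shown.

```rust
use std::mem::{align_of, size_of};

/// Marker for types in which every bit pattern is a valid value.
///
/// # Safety
/// Implementors promise that the type is `#[repr(C)]` (or a primitive),
/// accepts every bit pattern, and has no `Drop` impl.
pub unsafe trait Plain {}

/// Reasons a byte slice cannot be viewed as a `T` (hypothetical enum).
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    TooShort,
    BadAlignment,
}

/// Views the start of `bytes` as a `&T`, checking size and alignment first.
pub fn from_bytes<T: Plain>(bytes: &[u8]) -> Result<&T, Error> {
    if bytes.len() < size_of::<T>() {
        return Err(Error::TooShort);
    }
    if (bytes.as_ptr() as usize) % align_of::<T>() != 0 {
        return Err(Error::BadAlignment);
    }
    // SAFETY: length and alignment were checked above, and `T: Plain`
    // promises that whatever bytes are there form a valid `T`.
    Ok(unsafe { &*(bytes.as_ptr() as *const T) })
}

/// Example type: two native-endian 16-bit fields, no padding.
#[repr(C)]
#[derive(Debug)]
pub struct Pair {
    pub a: u16,
    pub b: u16,
}

// The one remaining unsafe line the user has to write per type.
unsafe impl Plain for Pair {}
```

With this in place, a call such as `from_bytes::<Pair>(buffer)` needs no unsafe block at the call site; the only unsafe left for the user is the `unsafe impl` line.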

In the current crate, plain::Plain is an unsafe trait that you have to unsafely implement for the types you work with. However, in theory, it should be quite easy to add a trait core::marker::Plain (name subject to discussion) and let the compiler automatically derive it for admissible types, similar to how Send and Sync are currently handled. Doing this would remove a notable source of unsafety in low-level code.

In case this proposal needs an RFC, I’m absolutely willing to write it up, but I’d need guidance from someone more experienced.

EDIT: Added non-Drop to the list of requirements.


#2

FYI: Send and Sync don’t get special treatment from the compiler (at least w.r.t. being automatically implemented). They are auto traits and any (nightly) library can define new ones. This allows experimenting with this without changing the compiler or putting anything in core.


#3

The only important part that isn’t already in the published crate is determining whether or not a type actually satisfies the constraints. I don’t know how to implement that without (or even with) using compiler internals, so if anyone can help me with that, I’d be much obliged!


#4

Oh, never mind, auto traits aren’t sufficient for this. You’d want to list all the primitive types that are Plain and then include every struct composed of them, while auto traits start out implemented for every type and then one adds exceptions.
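For comparison, the opt-in direction can be sketched on stable Rust today, though without the automatic part. The trait, the `impl_plain!` macro, and `assert_plain` are hypothetical names; structs would still need a manual unsafe impl (or a derive macro) because nothing here inspects their fields.

```rust
/// Hypothetical marker trait: every bit pattern is a valid value.
///
/// # Safety
/// Must only be implemented for types with no invalid bit patterns.
pub unsafe trait Plain {}

/// Implements `Plain` for a list of types in one go.
macro_rules! impl_plain {
    ($($t:ty),* $(,)?) => {
        $(unsafe impl Plain for $t {})*
    };
}

// The admissible primitives are listed explicitly.
impl_plain!(u8, u16, u32, u64, u128, usize);
impl_plain!(i8, i16, i32, i64, i128, isize);
impl_plain!(f32, f64);

// Arrays of Plain elements are Plain as well.
unsafe impl<T: Plain, const N: usize> Plain for [T; N] {}

// `bool`, `char`, references, and enums are deliberately left out.

/// Compiles only if `T: Plain`; useful as a static check.
pub fn assert_plain<T: Plain>() {}
```

Here a type is Plain only if someone said so, which is the opposite default from an auto trait.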


#5

I think you were almost right the first time. The negative impl is supposed to apply to any type that contains it, which is perfect, since the inadmissible primitive types can be enumerated just as well as the admissible ones (although that’s not future-proof, and future additions to Rust could break safety). Thanks for pointing out that RFC.

What remains is a bound on #[repr(C)], unless someone can point me to a guarantee that rustc will never ever try to stash extra information into the padding of #[repr(Rust)] types.
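To show what #[repr(C)] buys here, a small sketch with a hypothetical `Padded` struct: under #[repr(C)] the fields stay in declaration order and padding goes where the C rules put it, so size and offsets are predictable. The default representation makes no such promise about field order.

```rust
use std::mem::size_of;

/// Hypothetical struct with a gap between `tag` and `value`.
#[repr(C)]
pub struct Padded {
    pub tag: u8,
    pub value: u32,
}

/// Returns `(size of Padded, byte offset of the value field)`.
pub fn layout() -> (usize, usize) {
    let p = Padded { tag: 0, value: 0 };
    let base = &p as *const Padded as usize;
    let field = &p.value as *const u32 as usize;
    (size_of::<Padded>(), field - base)
}
```

On common targets where u32 has an alignment of 4, I would expect `layout()` to report a size of 8 and an offset of 4, i.e. three padding bytes after `tag`.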

All data being public in the interface is not a requirement for safety, but rather for a lack of surprises: unsafe code working on the private parts of a Plain type cannot make assumptions about what’s in there, which is counter-intuitive unless you are explicitly aware of Plain.


#6

No, wait, I forgot about enums. I can’t make a negative trait impl for enums, I think. Or tuples.