Many Rust types have trap representations - bit-patterns that fit into the type’s size, but cause UB when used as a value of that type. For example, an all-zeros bit pattern for a safe reference, or the trivial 0-sized bit-pattern for an empty type (I don’t see any good reason to distinguish the two).
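One way to see such a forbidden bit pattern from safe code is to check whether Option can reuse it to encode None without growing the type. This is only a small sketch: the helper name option_is_free is made up for illustration, and it relies on the documented guarantee that Option of a reference or of a NonZero integer has the same size as the wrapped type.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Hypothetical helper: returns true when wrapping `T` in `Option` costs no
/// extra space, i.e. the compiler encoded `None` in a bit pattern that is
/// invalid for `T` itself.
pub fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

/// All-zeros is invalid for a safe reference, so `None` can live there.
pub fn reference_has_trap() -> bool {
    option_is_free::<&'static u8>()
}

/// Zero is invalid for `NonZeroU8`, so `None` can live there as well.
pub fn nonzero_has_trap() -> bool {
    option_is_free::<NonZeroU8>()
}

/// Every bit pattern of `u8` is valid, so `Option<u8>` needs extra space.
pub fn u8_has_trap() -> bool {
    option_is_free::<u8>()
}
```

The first two functions return true and the last returns false, which is the same invalid-value information the backend hands to LLVM.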
In our LLVM backend, this is mostly realized through attributes such as !range and !nonnull, which are emitted on callsites and LLVM load instructions - basically at the convenience of the compiler when values of that type are present (i.e. after a vexpr had created/loaded one).
This generally works just fine - garbage values are not used, and the metadata allows for good optimizations. However, there is one use-case where this leads to trouble.
That case is mem::uninitialized. The intrinsic is often assigned to a local variable, to create a memory buffer. The returned garbage is often wrapped inside a struct, as in
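use std::mem;

struct Data {
    // `type` is a reserved keyword in Rust, so the field is named `kind`.
    kind: u8,
    data: [&'static str; 16]
}
unsafe fn example() -> Data {
    let mut data = Data {
        kind: 0,
        // Garbage contents, expected to be overwritten by `initialize`.
        data: mem::uninitialized()
    };
    initialize(&mut data);
    data
}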
Of course, this also occurs when the type is used generically.
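For comparison, here is a rough sketch of how the same buffer could be written with std::mem::MaybeUninit, which was added to the standard library after this was written and is not part of the proposal here. The Data struct mirrors the one above with the field renamed to kind; the filling loop and the empty-string placeholder are assumptions standing in for whatever initialize would do.

```rust
use std::mem::MaybeUninit;

pub struct Data {
    pub kind: u8,
    pub data: [&'static str; 16],
}

pub fn example() -> Data {
    // Every bit pattern is allowed inside a `MaybeUninit` slot, so creating
    // the array does not assert that any valid `&str` exists yet.
    let mut slots: [MaybeUninit<&'static str>; 16] = [MaybeUninit::uninit(); 16];

    // Stand-in for `initialize`: write a real value into every slot.
    for slot in slots.iter_mut() {
        slot.write("");
    }

    // All 16 slots were written above, so asserting initialization is sound.
    let data = slots.map(|slot| unsafe { slot.assume_init() });

    Data { kind: 0, data }
}
```

The point of the wrapper is that the garbage never has the type &'static str, so the invalid-value metadata never applies to it.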
We have to support this sort of pattern - it is very popular, and these buffers are useful for low-level code. On the other hand, we can’t just allow uninitialized values everywhere - that would create too much trouble, as well as ruin our optimizations.
I can’t figure out a clean solution for this. However, there’s a hacky solution that seems to work: allow for poisoned and partially-poisoned values (where an enum with a poisoned field is a poison value itself, to allow for invalid-value optimizations), but prevent them from being passed to functions, and prohibit every operation other than constructors and explicit calls to the uninitialized intrinsic from creating them.
This means that moving uninitialized data from place to place, including returning it from a function, is UB. I don’t think this would be too much trouble, but I’m open to hearing about use cases where it is.
However, this solution by itself does not restrict references to uninhabited types in any way - you can play with your &! as much as you want, as long as you don’t actually dereference it.