@arielb1 and I were chatting about this today. I think we came to a few conclusions.
The key question comes down to: what constitutes “using” a value?
I think everyone agrees that “using” an invalid value should be illegal. As @arielb1 pointed out in his initial post, this applies to values of type
!, but also values of most any type that has illegal values (e.g.,
Box<T>, etc). One critical point is when returning a value counts as using it – and in particular if
mem::uninitialized deserves special status.
One might imagine a predicate like
VALID(a: T) that says "the memory at address
a can be typed as
T". (Just bear with me for a bit.)
What is using the referent of a reference?
Another key question: under what circumstances is a
&T required to point to memory of type
T? This is another question that is really bigger than
!. It seems clear that if an
&T must have a valid referent and
T=!, then this code should not be reachable or something “UB-like” has happened.
So, for example, I think that in almost any model, if you read the referent of an
&T, then the referent must be valid (i.e.,
let x = *r or
let x = ptr::read(r)). Similarly, if you assign to the referent of a
&mut T, the old value will be dropped, and hence the same is true of
*r = ... (
ptr::write is different in this respect, of course). But models will vary in terms of when a
&T which is not dereferenced must be valid. For example, the memory it points at may have been freed. (We may want some rules around fn entry and exit, for example, as I talked about in this blog post).
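To make the difference between assignment and ptr::write concrete, here is a minimal sketch. The Noisy type, the DROPS counter, and the function names are hypothetical and exist only for illustration; the point is just that assigning through a &mut T runs the destructor of the old referent (so the old referent has to be valid), while ptr::write overwrites it without looking at it.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Counts how many `Noisy` values have been dropped on this thread.
    static DROPS: Cell<u32> = Cell::new(0);
}

// Hypothetical type whose destructor is observable.
struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.with(|d| d.set(d.get() + 1));
    }
}

fn drops_so_far() -> u32 {
    DROPS.with(|d| d.get())
}

/// Returns (drops triggered by `*r = ...`, drops triggered by `ptr::write`).
fn assignment_vs_write() -> (u32, u32) {
    // Assignment through `&mut T` drops the old referent first.
    let mut a = Noisy;
    let r = &mut a;
    let before = drops_so_far();
    *r = Noisy;
    let by_assign = drops_so_far() - before;

    // `ptr::write` overwrites the slot without reading or dropping it,
    // so the old value is simply forgotten.
    let mut b = Noisy;
    let p: *mut Noisy = &mut b;
    let before = drops_so_far();
    unsafe { ptr::write(p, Noisy) };
    let by_write = drops_so_far() - before;

    (by_assign, by_write)
}
```

Calling assignment_vs_write() should report one drop for the assignment and none for the ptr::write, which is why only the former needs the old referent to be a valid T.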
But I think the logic of: “T is uninhabited, therefore &T is uninhabited” is not really valid unless we take a strict position that the referent of an
&T must always be valid. I suspect we want rules that are quite a bit looser.
VALID(a: &T) would not necessarily imply that
VALID(*a: T), though it would presumably require that "
a is not null". Instead,
VALID(*a: T) would only be required at some other times, such as when the pointer is dereferenced.
What about returning a value?
In general it seems like returning a value does require that the value be valid. The key question then is whether to exempt
mem::uninitialized from this requirement. Currently we do not. But that makes
mem::uninitialized unsuitable for almost any type, as @arielb1 pointed out: certainly for any type that is not a simple scalar like
u8 with no illegal values. For example,
mem::uninitialized::<&u8>() is invalid because the returned value
x does not (necessarily) satisfy "
x is not null", and yet it was returned.
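As a rough illustration of why "x is not null" is something the compiler really relies on, the sketch below checks that Option<&u8> fits in the same space as &u8 (the null bit pattern is reused for None). It also shows one way to delay initialization of a reference without ever producing an invalid &u8, assuming today's std::mem::MaybeUninit, which is not part of the discussion above. The function names are hypothetical.

```rust
use std::mem::{self, MaybeUninit};

/// `Option<&u8>` uses the null bit pattern for `None`, which is only
/// sound because a valid `&u8` is never null.
fn reference_has_null_niche() -> bool {
    mem::size_of::<Option<&u8>>() == mem::size_of::<&u8>()
}

/// Delayed initialization of a reference without ever returning or
/// naming a value of type `&u8` that might be null.
fn init_later(target: &u8) -> u8 {
    // The slot makes no claim that it holds a valid `&u8` yet.
    let mut slot: MaybeUninit<&u8> = MaybeUninit::uninit();
    slot.write(target);
    // Only now do we assert that the contents are a valid reference.
    let r: &u8 = unsafe { slot.assume_init() };
    *r
}
```

The design point is that the "maybe invalid" state lives in a wrapper type, so nothing of type &u8 is ever returned before it satisfies its invariant.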
So what are our options?
1. We could deprecate
uninitialized – or at least deprecate it for types that we cannot statically see are reasonable. This would probably involve some ad-hoc rules, much like those for
transmute. I have no idea how much code would be affected, but probably a non-trivial amount. Said code would want to be rewritten to use unions, or at least a
MaybeUninitialized type in libstd (which would be implemented with unions).
2. We could special-case
uninitialized, as @arielb1 initially proposed. This means that returning a value from
uninitialized is not considered a “use”. It does mean that a trivial wrapper around it is impossible, and so forth.
3. We could also do both. =) This would preserve existing code while encouraging people to move off of
uninitialized and onto more future-proof and well-behaved things, like a
In practical terms, the difference between 1 and 3 is that if we only do 1, then
uninitialized::<!>() will still panic, whereas under options 2 or 3 it would not.
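For reference, a union-based wrapper of the kind mentioned in the options above might look roughly like the following. This is only a sketch under assumptions: the method names (new, set, assume_init) are hypothetical and not a proposed libstd API, and it leans on ManuallyDrop so that the union never drops a T on its own.

```rust
use std::mem::ManuallyDrop;

// Hypothetical sketch of a union-based "maybe uninitialized" slot.
union MaybeUninitialized<T> {
    uninit: (),
    value: ManuallyDrop<T>,
}

impl<T> MaybeUninitialized<T> {
    /// Creates a slot that makes no claim about holding a valid `T`.
    fn new() -> Self {
        MaybeUninitialized { uninit: () }
    }

    /// Stores a value, overwriting whatever was there without dropping it.
    fn set(&mut self, value: T) {
        *self = MaybeUninitialized {
            value: ManuallyDrop::new(value),
        };
    }

    /// Extracts the value.
    ///
    /// Safety: the caller must have called `set` beforehand; this is the
    /// point at which the contents are actually "used" as a `T`.
    unsafe fn assume_init(self) -> T {
        unsafe { ManuallyDrop::into_inner(self.value) }
    }
}
```

With this shape, creating an empty slot never returns a T at all, so even a slot for an uninhabited type (say, MaybeUninitialized::<std::convert::Infallible>::new(), with the empty enum standing in for !) involves no invalid value; the validity requirement only kicks in at assume_init.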
I would argue that at minimum we should pursue a
MaybeUninitialized type in libstd based on unions and deprecate
uninitialized. I am not yet sure whether we can get away without special-casing it, but it would be nice if we could – as
! is not yet stable, I guess we have some time to deliberate? (As @arielb1 pointed out, this applies more broadly, but the problem seems most acute for