@arielb1 and I were chatting about this today. I think we came to a few conclusions.
The key question comes down to: what constitutes "using" a value?
I think everyone agrees that "using" an invalid value should be illegal. As @arielb1 pointed out in his initial post, this applies to values of type `!`, but also to values of almost any type that has illegal values (e.g., `&T`, `Box<T>`, etc.). One critical point is whether returning a value counts as using it, and in particular whether `mem::uninitialized` deserves special status.
One might imagine a predicate like `VALID(a: T)` that says "the memory at address `a` can be typed as `T`". (Just bear with me for a bit.)
What is using the referent of a reference `&T`?
Another key question: under what circumstances is a `&T` required to point to memory of type `T`? This is another question that is really bigger than `!`. It seems clear that if an `&T` must have a valid referent and `T=!`, then either this code should not be reachable or something "UB-like" has happened.
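As a rough illustration of the "should not be reachable" reading, here is a minimal sketch that uses an empty enum as a stable stand-in for `!`. The names `Void` and `absurd` are assumptions made up for this example. The compiler accepts an empty match on `*r`, which amounts to treating the dereference as a use of a value that cannot exist. Whether merely holding a `&Void` is already a problem is exactly the open question here, and the sketch does not settle it.

```rust
/// Hypothetical uninhabited type, standing in for `!` on stable Rust.
pub enum Void {}

/// Dereferencing `r` is a use of the referent. Since `Void` has no
/// values, the match needs no arms and this body can never actually run
/// with a valid referent.
pub fn absurd(r: &Void) -> u32 {
    match *r {}
}
```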
So, for example, I think that in almost any model, if you read the referent of an `&T`, then the referent must be valid (i.e., `let x = *r` or `let x = ptr::read(r)`). Similarly, if you assign to the referent of a `&mut T`, the old value will be dropped, and hence the same is true of `*r = ...` (`ptr::write` is different in this respect, of course). But models will vary in terms of when a `&T` which is not dereferenced must be valid. For example, the memory it points at may have been freed. (We may want some rules around fn entry and exit, for example, as I talked about in this blog post.)
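The difference between `*r = ...` and `ptr::write` can be observed directly. The sketch below counts drops with a helper type; `Noisy` and `count_drops` are hypothetical names invented for this example. Assignment drops the old referent, so the old referent has to be valid. `ptr::write` overwrites it without looking at it.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical helper: bumps a shared counter each time it is dropped.
pub struct Noisy<'a>(pub &'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops observed after `*r = ...`, drops observed after `ptr::write`).
pub fn count_drops() -> (u32, u32) {
    let assign_drops = Cell::new(0);
    let write_drops = Cell::new(0);

    // `ManuallyDrop` keeps end-of-scope drops out of the counts.
    let mut slot = ManuallyDrop::new(Noisy(&assign_drops));
    let r: &mut Noisy<'_> = &mut slot;
    *r = Noisy(&assign_drops); // the old referent is dropped here
    let after_assign = assign_drops.get();

    let mut slot2 = ManuallyDrop::new(Noisy(&write_drops));
    let r2: &mut Noisy<'_> = &mut slot2;
    // The old referent is overwritten without being read or dropped.
    unsafe { ptr::write(r2, Noisy(&write_drops)) };
    let after_write = write_drops.get();

    (after_assign, after_write)
}
```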
But I think the logic of "`T` is uninhabited, therefore `&T` is uninhabited" is not really valid unless we take a strict position that the referent of an `&T` must always be valid. I suspect we want rules that are quite a bit looser.
IOW, `VALID(a: &T)` would not necessarily imply that `VALID(*a: T)`, though it would presumably require that "`a` is not null". Instead, `VALID(*a: T)` would only be required at some other times, such as when the pointer is dereferenced.
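To make the shallow/deep distinction concrete, here is a toy model of the predicate over raw data, with `bool` as the referent type. All three function names are assumptions for illustration only, and the referent byte is passed in explicitly so the sketch stays in safe code. The only facts relied on are that a `bool` must be the byte 0 or 1 and that a reference must not be null.

```rust
/// Toy `VALID(a: bool)`: the byte stored at `a` must be 0 or 1.
pub fn valid_bool(byte: u8) -> bool {
    byte <= 1
}

/// Toy shallow `VALID(a: &T)`: only the address itself is checked
/// (non-null); nothing is said about the memory it points to.
pub fn valid_ref_shallow(addr: usize) -> bool {
    addr != 0
}

/// Toy strict variant: the reference is only valid if its referent is
/// too, i.e. `VALID(a: &bool)` implies `VALID(*a: bool)`.
pub fn valid_ref_deep(addr: usize, referent_byte: u8) -> bool {
    valid_ref_shallow(addr) && valid_bool(referent_byte)
}
```

Under the looser rules, a non-null reference whose referent byte is, say, `2` passes the shallow check and only becomes a problem when it is dereferenced. Under the strict position it is invalid from the start.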
What about returning a value?
In general it seems like returning a value does require that the value is valid. The key question then is whether to exempt `mem::uninitialized` from this requirement. Currently we do not. But that makes `mem::uninitialized` unsuitable for almost any type, as @arielb1 pointed out: certainly for any type that is not a simple scalar like `u8` with no illegal values. E.g., `mem::uninitialized::<&u8>()` is invalid because the returned value `x` does not (necessarily) satisfy "`x` is not null", and yet it was returned.
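That `&u8` really does have an illegal value, while `u8` has none, shows up in type layout. A small sketch, with hypothetical function names: `Option<&u8>` is the same size as `&u8` because `None` reuses the null bit pattern, whereas `Option<u8>` needs extra space because every `u8` bit pattern is already a legal value.

```rust
use std::mem::size_of;

/// `&u8` can never be null, so `Option<&u8>` stores `None` as null
/// and needs no extra space.
pub fn reference_has_illegal_value() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

/// Every bit pattern of `u8` is legal, so `Option<u8>` must grow to
/// hold its discriminant.
pub fn scalar_has_no_illegal_value() -> bool {
    size_of::<Option<u8>>() > size_of::<u8>()
}
```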
So what are our options?
1. We could deprecate `uninitialized`, or at least deprecate it for types that we cannot statically see are reasonable. This would probably be some ad-hoc rules much like `transmute`. I have no idea how much code would be affected, but probably a non-trivial amount. Said code would want to be rewritten to use unions, or at least a `MaybeUninitialized` type in libstd (which is implemented with unions).
2. We could special-case `uninitialized`, as @arielb1 initially proposed. This means that returning a value from `uninitialized` is not considered a "use". It does mean that a trivial wrapper is impossible and so forth.
3. We could also do both. =) This would preserve existing code while encouraging people to move off of `uninitialized` and onto more future-proof and well-behaved things, like a `MaybeUninitialized` type.
In practical terms, the difference between 1 and 3 is that if we only do 1, then `uninitialized::<!>()` will still panic, whereas under options 2 or 3 it would not.
I would argue that at minimum we should pursue a `MaybeUninitialized` type in libstd based on unions and deprecate `uninitialized`. I am not yet sure whether we can get away without special-casing it, but it would be nice if we could. As `!` is not yet stable, I guess we have some time to deliberate? (As @arielb1 pointed out, this applies more broadly, but the problem seems most acute for `!`.)
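For concreteness, a union-based `MaybeUninitialized` might look roughly like the sketch below. This is only one possible shape, not a proposed libstd API: the field names, method names, and the `roundtrip` helper are all assumptions. The point is that constructing the wrapper never produces a `T`, so no invalid `T` is ever returned. A `T` only appears once the caller asserts that it has been initialized.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical sketch of a union-based wrapper; names are illustrative.
pub union MaybeUninitialized<T> {
    uninit: (),
    value: ManuallyDrop<T>,
}

impl<T> MaybeUninitialized<T> {
    /// Creates the wrapper without ever materializing a `T`.
    pub fn uninitialized() -> Self {
        MaybeUninitialized { uninit: () }
    }

    /// Stores a value. Any previous content is overwritten without
    /// being dropped, since the union cannot know whether it was valid.
    pub fn set(&mut self, value: T) {
        self.value = ManuallyDrop::new(value);
    }

    /// Extracts the stored value.
    ///
    /// # Safety
    /// The caller must have initialized the wrapper with `set` first.
    pub unsafe fn into_inner(self) -> T {
        ManuallyDrop::into_inner(unsafe { self.value })
    }
}

/// Hypothetical usage: initialize the slot, then take the value out.
pub fn roundtrip() -> String {
    let mut slot = MaybeUninitialized::uninitialized();
    slot.set(String::from("hello"));
    // Sound here because `set` was called just above.
    unsafe { slot.into_inner() }
}
```

With this shape, a trivial wrapper around `uninitialized()` is unproblematic, which is what the special-casing option gives up.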