Here’s another way to think about things. Imagine:
enum Result<V, E> {
    Ok(V),
    Err(E) if E: Inhabited,
}
I imagine there are lots of type-parameterized enums where some variants don’t make sense for some types of parameters, so maybe it makes sense to go this direction.
But what is the point of requiring everyone to special-case uninhabited types if things work fine without? That would make Result<T, !> pretty much useless (the only reason to use it over just T is to make the same code work for both fallible and infallible operations). In fact, arguably it would make uninhabited types fairly useless as a whole, which seems to conflict with the acceptance of RFC 1216…
extern crate libc;

mod ut {
    mod ffi {
        #[repr(C)]
        pub struct UninhabitedTestImpl {
            _priv: u8,
        }

        extern {
            pub fn uninhabited_test_new(d: ::libc::uint64_t) -> *mut UninhabitedTestImpl;
            pub fn uninhabited_test_delete(ut: *mut UninhabitedTestImpl);
            pub fn uninhabited_test_get_data(ut: *mut UninhabitedTestImpl) -> ::libc::uint64_t;
        }
    }

    pub struct UninhabitedTest {
        data: *mut ffi::UninhabitedTestImpl,
    }

    impl UninhabitedTest {
        pub fn new(d: u64) -> UninhabitedTest {
            let ut = unsafe { ffi::uninhabited_test_new(d) };
            if ut.is_null() { panic!("ran out of memory"); }
            UninhabitedTest { data: ut }
        }

        pub fn data(&self) -> u64 {
            unsafe { ffi::uninhabited_test_get_data(self.data) }
        }
    }

    impl Drop for UninhabitedTest {
        fn drop(&mut self) {
            unsafe { ffi::uninhabited_test_delete(self.data) };
        }
    }
}

fn main() {
    use ut::UninhabitedTest;

    let input = 4;
    let ut = UninhabitedTest::new(input);
    let output = ut.data();

    println!("input: {}", input);
    println!("output: {}", output);
}
I’ll admit that having to use a u8 member just to stop lints from giving me warnings is a bit annoying, but as far as I can tell, this code is correct?
Rust isn’t allowed to dereference a *mut _ automatically and this code never does it manually, so only the C code will ever do it.
The only problem would be if the author of the ut module leaks a *mut UninhabitedTestImpl or *const UninhabitedTestImpl or creates &UninhabitedTestImpl or &mut UninhabitedTestImpl. I.e. authors of unsafe code have to be careful and know what they are doing. I don’t see why that is an unreasonable burden, unless I’m missing something?
The issue here is that you’re creating a ‘safe’ reference to a raw pointer. There’s no reason to do this if what you want is an opaque type and there’s no reason that users of the library/module should have the access required to do this either.
I do this literally all the time in my Rust wrappers around C types. Whenever possible I try to use references instead of pointers as the parameter types in my FFI functions, because references denote aliasing and non-null requirements that pointers don’t. For example, I have fn add(result: &mut BIGNUM, a: &BIGNUM, b: &BIGNUM), which indicates that none of the parameters may be NULL and that result may not alias either a or b. Accordingly my wrapper around BIGNUM has as_ref(&self) -> &BIGNUM { unsafe { &*self.0 } }, which apparently (and surprisingly) is dreadfully dangerous.
If Rust had a true opaque type mechanism like extern { type BIGNUM; } then this would work perfectly safely, AFAICT.
I’ll just go with a straight C example. The API for Lua has a lua_State as an opaque pointer by virtue of it being declared but not defined in the public header file. What you’re suggesting is equivalent to dereferencing an incomplete type in C. Why should Rust allow that when even C doesn’t?
If that’s the basis of this ‘problem’ with Rust, then I don’t see why the discussion is still happening, because the goals of having opaque pointers for use in FFI and being able to form safe references to the same types in Rust seem directly opposed.
I’m happy to admit I’m not doing it the right way. Please tell me what the right way of creating a reference to an incomplete type is, such as in this C++ example, which compiles just fine: https://godbolt.org/g/M2jdnP
That is, in fact, conforming C++. C++ allows using the ‘indirection’ operator on a pointer to an incomplete type in limited cases, one of which is to form a reference. But I fail to see how this gives you anything useful. You can’t call functions on it directly, and it can’t be used in any way that would require an lvalue-to-rvalue conversion. I suppose that leaves passing it as a reference to a function that is defined in a context where the type is complete, or taking its address and turning it back into a pointer. It can also be used in some, but not all, metaprogramming techniques.
I still fail to see how having a reference here is useful. The fact that it can’t be null isn’t really useful when you can’t actually do anything with it while it’s incomplete.
In the case of Rust, what point is there in proving that they can’t alias if Rust can’t do anything with the pointers other than pass them to an FFI function? It also doesn’t matter if it can or can’t be null if you can’t dereference it at all.
First of all, that would be a massively breaking change. Empty matches are used all over the place as a stable construct that generates an unreachable intrinsic.
Second of all, people need to write code generators sometimes. That's why RFC 218 was accepted and implemented. I, at least, want to be able to invoke quick_error! with no arguments and get a data type that implements all the error traits, but happens to be impossible to construct because it doesn't have any variants.
The workaround I currently use for opaque types is also really awful:
mod foo { #[repr(C)] pub struct Foo([u8;0]); }
This, like your solution using u8, also runs into the moving problems, but has the advantage of being zero-sized, which means that things like mem::swap-ing through pointers of that type are a no-op.
My suggested solution at one point to this problem was [Pre-RFC] Opaque Structs - but the thread didn’t seem to go anywhere.
There’s a PR to roll back and feature-gate all the changes surrounding uninhabited types in matches here. I also just posted an RFC arguing that these changes should be re-instated here (before having seen @arielb1’s comment).
There are two ways out of this that I can see. One is to say that we hit UB as soon as we try to return a &!. A value returned by get() isn’t just invalid, it’s provably invalid, statically. This is different to trying to return a NULL &u32 - we may not be able to prove that that value is valid, but we can’t prove that it’s invalid either. We could make a distinction between safe and unsafe code where, in safe code, we need to be able to prove that all accessible data is valid to guarantee safety, whereas, in unsafe code, we just need to not be able to prove that some accessible data is invalid.
The other is to recognize that (5) is true - unless your types are lying to you. We shouldn’t force people to consider that possibility. If they do want to consider that possibility, we can say they’re welcome to and this code will run fine:
unsafe {
    let x: &! = get();
    match x {
        y => println!("hi {:?}", y as *const _)
    }
}
But the uninhabitedness changes also allow them to write this code:
unsafe {
    let x: &! = get();
    match x {
    }
}
Why shouldn’t they be allowed to write this? Yes, it’s broken, but it’s broken because they fucked up using unsafe to create an impossible value and then matched on it. I mean, how much leeway do we need to give people, really? If this code shouldn’t be allowed, then neither should exhaustively matching on a bool by testing for both true and false. What if they create a 42: bool and try to match on it? We don’t want their code to break! Better force them to match on every possible bit-pattern of a bool instead.
The current consensus is that you can’t have a NULL: &u8 or a 42: bool any more than you can have a 256: u8 - any attempt to create such is instant UB.