Why is c_void a thing at all?

Inspired by that other thread, I imagine much to the OP’s chagrin…

core::ffi::c_void is a strange type; it is meant to stand for void as used in the C types void *, const void * and perhaps volatile void * and volatile const void * (if anyone uses those). The type isn’t meant to be used anywhere else; in particular it is distinct from the use of void as a quasi-unit type, which is represented by ().

This is a potential source of confusion for programmers coming from C, who may fail to realize that the two uses of void are distinct and try to use c_void as a return type; it is something that has to be specifically taught to them. And since it doesn’t make sense to use c_void anywhere other than as a dummy raw pointer target type (there is no point in having a Vec<c_void>, for example, or in dereferencing c_void pointers, or in performing size_of::<c_void>()-granularity pointer arithmetic; I doubt there is any use for &c_void, &mut c_void or Box<c_void> either), why does it have to be a distinct type at all? It strikes me as spurious orthogonality and a potential source of pitfalls.
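To make the two uses of void concrete, here is a minimal sketch. The functions do_nothing and identity are hypothetical stand-ins written in Rust with the C ABI, so the snippet compiles without linking any C code; they only illustrate how the two C signatures in the comments would be spelled.

```rust
use std::ffi::c_void;

/// Mirrors the C declaration `void do_nothing(void);`.
/// `void` as a return type corresponds to `()`, so no return type is written.
pub extern "C" fn do_nothing() {}

/// Mirrors the C declaration `void *identity(void *p);`.
/// Only here, behind a pointer, does `c_void` appear.
pub extern "C" fn identity(p: *mut c_void) -> *mut c_void {
    p
}
```

Writing `-> c_void` for the first function would be the mistake described above: it would ask for a value of the placeholder type rather than "no value".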

There may be some advantages of having c_void that I am missing, but I have a hard time coming up with any. Would it not have been better to have untyped_ptr (and maybe untyped_mut_ptr) instead of *const c_void and *mut c_void, and dispense with c_void entirely? Perhaps it’s not worth changing at this point anyway, but has this been discussed anywhere?

(Prior art: consider the Object Pascal Pointer type, which has no underlying ‘referent’ type, not even a placeholder one.)

3 Likes

More prior art: Swift, which has a dedicated non-generic Unsafe(Mutable)RawPointer for void *. But I haven’t thought about what it would take to change that for Rust, or libc specifically, or bindgen.

I assume doing something really good here is blocked on RFC 1861 (extern types) in The Rust RFC Book.

3 Likes

It kinda comes down to the reason c_void is an #[repr(u8)] enum and not a ZST.

It might mostly be historical at this point[1], but at least previously LLVM required C's void* to be i8* in LLVM IR. By extension, this meant that for LLVM to recognize and optimize known magic libc functions (i.e. C's void* malloc(size_t); and void free(void*);), they MUST be declared as dealing in LLVM IR's i8*.

This explains the existence of c_void: it's the type that lets C's void* be written as *mut c_void in Rust, such that both get translated to the same i8* in LLVM IR and LLVM can unify them. It would've been a very hard sell to translate C's void* as a pointer to i8 in Rust when exposing C APIs to Rust.
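The "one byte, not a ZST" point can be checked directly. A small sketch; the helper function names are made up for illustration, and the sizes follow from c_void being a #[repr(u8)] enum as described above:

```rust
use std::ffi::c_void;
use std::mem::size_of;

/// Size of the placeholder pointee: one byte, because it is a `#[repr(u8)]` enum.
pub fn c_void_size() -> usize {
    size_of::<c_void>()
}

/// Size of the unit type, which really is zero-sized.
pub fn unit_size() -> usize {
    size_of::<()>()
}

/// A pointer to `c_void` is an ordinary thin pointer, the same size as `*mut u8`.
pub fn pointers_same_size() -> bool {
    size_of::<*mut c_void>() == size_of::<*mut u8>()
}
```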

This is, I feel, a symptom of Rust being a practical language first. c_void is a concession to reality and a way to translate C headers while maintaining type information.

We probably should have a default warning for using core::ffi::c_void in any position that isn't immediately behind a pointer (i.e. isn't *const c_void or *mut c_void, and maybe allow ptr::NonNull<c_void>).

Rust code should not be using c_void except to translate C headers.

And yes, in a perfect world, c_void would be extern { type c_void; }. (Perhaps those used-by-value lints should be future compatibility, so it's maybe kinda (not likely) possible to do this in the (far) future?) Unfortunately, extern type is very blocked on the question of size_of_val/align_of_val. There's been some (kinda not really anymore) movement on resolving this[2], but it's not really anyone's priority.


  1. Once LLVM finalizes the transition to untyped pointers it surely will be. ↩︎

  2. AIUI: Medium term, forbid using extern types in generics at all. Long term, if use cases come forward, consider a new ?DynSized bound to opt in to allowing extern types in generics (and forbid the use of anything which transitively assumes a non-extern type, i.e. that size_of_val/align_of_val are callable / that Rust code can directly own the type). This is intrinsically also tied to custom ?Sized types (i.e. custom DynSized implementations) which comes with its own laundry list of complications which are hard to separate from extern type discussions. ↩︎

9 Likes

That explanation doesn’t quite hold water for me; #[repr(transparent)] struct opaque_ptr(*const u8) would have had the same effect. But pointers are special in Rust; for example, there’s an implicit conversion from *mut T to *const T. So maybe the decision to make a c_void placeholder was simply “people expect these to behave like other pointers, let’s not rock the boat right now”.
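A rough sketch of that trade-off. The opaque_mut_ptr type and the From impl are assumptions added for illustration; only opaque_ptr comes from the post. Raw pointers get the *mut to *const coercion for free, while the newtypes need an explicit conversion:

```rust
/// The newtype suggested above: same layout as `*const u8`, but a distinct type.
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct opaque_ptr(pub *const u8);

/// Hypothetical mutable counterpart (not in the original post).
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct opaque_mut_ptr(pub *mut u8);

/// The newtypes have no built-in coercion, so the conversion must be spelled out.
impl From<opaque_mut_ptr> for opaque_ptr {
    fn from(p: opaque_mut_ptr) -> Self {
        opaque_ptr(p.0 as *const u8)
    }
}

fn takes_const(p: *const u8) -> *const u8 {
    p
}

/// With primitive raw pointers, `*mut u8` coerces to `*const u8` implicitly.
pub fn coercion_demo(p: *mut u8) -> *const u8 {
    takes_const(p)
}
```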

1 Like

I think it would be odd for every C pointer type to be a Rust raw pointer except for void *.

The confusion between () and c_void is unfortunate, but that's mostly an artifact of void * being a really bizarre construct in C. Since Rust aims to have strong C compatibility some of that confusion was inevitably going to spill over into Rust.

Swift gets away with UnsafeRawPointer because it doesn't have raw pointer primitives at all, and it pays for that by having a fairly complex web of pointer types instead. There are advantages to that approach, of course. But I don't know that mixing the two approaches would reduce confusion on the whole.

1 Like

Opaque pointers (see the Opaque Pointers page of the LLVM 16.0.0git documentation) are now on in nightly (:tada:), so we're well on the way to pointee types being irrelevant for things like that.

3 Likes

(I wrote a footnote for that but then wrote another with the same label oops)

(you can write footnotes in-line to avoid the need for labelling:
text with^[a footnote]: text with[1])


  1. a footnote ↩︎

Not really. I am asking why the untyped pointer type has to have a reified referent type at all; the actual choice of the type is pretty unimportant. Nothing stops special-casing an otherwise-opaque untyped_ptr to lower into LLVM i8* as well, which would side-step the whole problem of whether the ultimately meaningless referent should be an enum, (), u8 or an extern type.

I have been reading much Stack Overflow recently, and it’s full of people who instead of answering questions as written, would rather answer another question they thought the OP meant to ask, because the keywords match and they only happen to know the answer to the latter. Please don’t do that.

Well, void * is an odd type. It makes little sense to perform many operations on values of that type (dereferencing, arithmetic, alignment checking) that are perfectly sensible for other pointer types. It pretty much only exists to be compared against other pointers and cast to and from other pointer types; it is arguably not much like other pointer types at all.

On the Rust side, it makes little sense to provide operations like copy, read, swap or offset for *mut c_void and *const c_void. If the argument is that untyped pointers should be usable in generic contexts where any pointer type is expected, there are already cases where that is just a source of pitfalls. Untyped pointers could equally well be represented by a newtype that only exposes those operations that do make sense, like cast or byte_offset.
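What such a newtype might look like, as a sketch only: UntypedMutPtr and its method names are hypothetical, not an existing or proposed std API. The byte offset goes through a cast to u8 so that nothing is ever measured in units of c_void.

```rust
use std::ffi::c_void;

/// Hypothetical untyped pointer that exposes only the operations that make sense.
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct UntypedMutPtr(*mut c_void);

impl UntypedMutPtr {
    /// Forget the pointee type.
    pub fn from_typed<T>(p: *mut T) -> Self {
        Self(p.cast())
    }

    /// Recover a typed pointer; the caller decides what it points to.
    pub fn cast<T>(self) -> *mut T {
        self.0.cast()
    }

    pub fn is_null(self) -> bool {
        self.0.is_null()
    }

    /// Move by `n` bytes (not by multiples of some pointee size).
    pub fn wrapping_byte_offset(self, n: isize) -> Self {
        Self(self.0.cast::<u8>().wrapping_offset(n).cast())
    }
}
```

There is deliberately no read, write, swap or element-wise offset on the wrapper; to do any of those, the caller has to cast to a real pointee type first.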

Object Pascal gets away with Pointer despite having ordinary pointer types as well.

3 Likes

Rust is for the most part designed not to let C/C++ properties spill into its core language. Even repr(C) and extern "C" aim to be defined with compatible but Rust-focused semantics. This sometimes makes interaction trickier, as the semantics of () do not match those of C's void.

As for your question of having c_void vs. untyped_mut_ptr: I guess the motivation here is similar to having a [T] type as opposed to a Slice<'a, T> that is equivalent to &'a mut [T], as in most other languages. Rust's type system has an array of pointer-like types (*const, *mut, &, &mut, Box), and abstracting over the target allows all of these to be used here. In particular, only a single new manually defined type is needed to map both void* and const void* to *mut c_void and *const c_void. It also allows you to use all the infrastructure already written for *mut T and *const T.

2 Likes

@CAD97 I use Option<NonNull<c_void>> all the time with SDL2, it's totally fine.

True. But my point is that this orthogonality is actually harmful, because much of that infrastructure doesn’t actually make sense to use with c_void. Dereferencing is meaningless, pointer arithmetic in units of the size of c_void is meaningless, passing Box<c_void> around is meaningless; to the point that @CAD97's post suggests such things should be linted against by default.

I could almost defend NonNull<c_void> (almost: if you don’t know what the pointer refers to, what’s the use of knowing if it points to anything at all?), but Option<NonNull<c_void>> is just *mut c_void with extra steps. This issue report seems to agree.
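For reference, the "extra steps" amount to roughly this. A small sketch; the helper names are made up for illustration:

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::ptr::{self, NonNull};

/// Thanks to the null-pointer niche, the `Option` adds no space.
pub fn same_layout() -> bool {
    size_of::<Option<NonNull<c_void>>>() == size_of::<*mut c_void>()
}

/// Null becomes `None`; anything else becomes `Some`.
pub fn from_raw(p: *mut c_void) -> Option<NonNull<c_void>> {
    NonNull::new(p)
}

/// `None` becomes null again, so the round trip loses nothing.
pub fn to_raw(p: Option<NonNull<c_void>>) -> *mut c_void {
    p.map_or(ptr::null_mut(), NonNull::as_ptr)
}
```

The two spellings carry the same information; the difference is only whether the null check is forced by the type or left to the caller.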

1 Like

In the C/C++ world, there's lots of APIs that take or return "either a completely valid pointer, or NULL". Some of these APIs use a pointer to an opaque type. In Rust, it's possible to make a strongly typed opaque type, but it can be simpler at times to make a type alias for c_void:

use std::ptr::NonNull;
extern "C" {
    // either returns a valid `LibData` pointer or `NULL`
    pub fn get_data() -> Option<NonNull<LibData>>;
}

// adapted from https://doc.rust-lang.org/1.63.0/nomicon/ffi.html:
use std::marker::{PhantomData, PhantomPinned};
#[repr(transparent)]
pub struct LibData {
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

// vs.:
use std::ffi::c_void;
pub type LibData = c_void;

Or it can be simpler still just to use Option<NonNull<c_void>> on the function signature.

1 Like

So in your view, extern type LibData; would effectively completely eliminate the use-case for c_void? (other than existing code)

It would diminish the use case to an extent, yes. However, there are still plenty of C headers that create opaque types with typedef void lib_data; or typedef void *lib_data_handle;, and automated tools like bindgen will happily convert them into c_void aliases. Some libraries also use weak type aliases to allow implicit conversions to supertypes (e.g., jobject vs. jstring in the JNI headers), and it's not inconceivable that some would just use void * as a common base type.

1 Like

Additionally, not all libraries even use typedef for everything. For example, many C APIs take a void * "user data" parameter. If I was writing Rust bindings for a C API that had a void * parameter or field somewhere I'd use a pointer to c_void on the Rust side.
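A sketch of the "user data" pattern, for illustration. To keep it self-contained, for_each is a hypothetical stand-in written in Rust with the C ABI; in real bindings it would be declared in an extern "C" block and implemented by the C library. All names here are made up.

```rust
use std::ffi::c_void;

/// C: `typedef void (*callback)(int value, void *user_data);`
pub type Callback = extern "C" fn(value: i32, user_data: *mut c_void);

/// Stand-in for a C function
/// `void for_each(const int *items, size_t len, callback cb, void *user_data);`
/// It never looks behind `user_data`; it only hands it back to the callback.
pub extern "C" fn for_each(items: *const i32, len: usize, cb: Callback, user_data: *mut c_void) {
    for i in 0..len {
        // SAFETY: the caller promises `items` points to `len` readable values.
        let value = unsafe { *items.add(i) };
        cb(value, user_data);
    }
}

/// The callback knows what the untyped pointer really refers to.
extern "C" fn add_to_sum(value: i32, user_data: *mut c_void) {
    // SAFETY: `sum_via_callback` always passes a valid, exclusive `*mut i64`.
    let sum = unsafe { &mut *user_data.cast::<i64>() };
    *sum += i64::from(value);
}

pub fn sum_via_callback(items: &[i32]) -> i64 {
    let mut sum = 0i64;
    for_each(items.as_ptr(), items.len(), add_to_sum, (&mut sum as *mut i64).cast());
    sum
}
```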

4 Likes

I don't think so. Sometimes, you need to form a pointer to a void * object, and this is really only expressible if there is a void * type, otherwise the aliasing information won't be correct. That's relevant even to the usual flat-address-space ABIs. The equivalence of different pointer types really only works for non-addressable values.

I'm not sure I understand this: if you mean a pointer to a void object, no such objects exist. If you mean forming a void* object, for example to represent malloc precisely, I suppose that makes sense, but I'm not sure it's anything other than a quirk of LLVM that it couldn't be, for example, *mut ().

By aliasing information, do you mean type-based-aliasing? That doesn't exist in Rust so you can lie about the type of pointers as much as you like as long as the backing memory is valid at the types you use it at (and for void */extern types you never actually "use" the backing memory, so the type behind the pointer is irrelevant).
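A tiny illustration of that last point, with a hypothetical function name: the pointer can pass through any number of "wrong" pointee types, as long as the memory is only accessed at a type it is valid for.

```rust
use std::ffi::c_void;

/// Casts a `*mut u32` through several unrelated pointee types and back.
/// None of the intermediate types is ever used to access the memory.
pub fn round_trip(value: &mut u32) -> u32 {
    let typed: *mut u32 = value;
    let as_void = typed.cast::<c_void>();
    let as_unit = as_void.cast::<()>();
    let as_bytes = as_unit.cast::<u8>();
    let back = as_bytes.cast::<u32>();
    // SAFETY: `back` has the same address and provenance as `typed`,
    // and the memory really holds a `u32`.
    unsafe {
        *back += 1;
        *back
    }
}
```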

1 Like