[Pre-RFC] usize is not size_t

Indeed, casting a pointer to a usize and back is already necessarily a lossy operation: the integer carries only the address, not the provenance.

So I wonder if the path forward should not involve fully embracing the idea that pointers and integers are fundamentally distinct kinds of values. As far as I am concerned, in an ideal universe, there would be no way to convert a usize to a pointer -- instead one would be required to explicitly declare which provenance that pointer should have, e.g. by giving another pointer whose provenance should be used:

/// Returns a pointer pointing to `addr`, with the provenance
/// taken from `provenance`.
fn ptr_from_int<T>(addr: usize, provenance: *const T) -> *const T;
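To make the signature above concrete, here is a minimal sketch of how such a function could be written on a conventional (non-CHERI) target using nothing but wrapping pointer arithmetic. This is only an illustration, not a proposed implementation; the `demo` helper is made up for the example.

```rust
/// Sketch: returns a pointer to `addr`, derived from `provenance`.
/// Assumes a conventional target where `ptr as usize` yields the address.
fn ptr_from_int<T>(addr: usize, provenance: *const T) -> *const T {
    // Byte distance from where `provenance` points to the requested address.
    let offset = addr.wrapping_sub(provenance as usize);
    // Pure pointer arithmetic: the result is still derived from `provenance`,
    // so no integer is ever cast back to a pointer.
    (provenance as *const u8).wrapping_add(offset) as *const T
}

/// Hypothetical usage: recompute the address of element 2 as an integer,
/// then rebuild a usable pointer from it via the array's base pointer.
fn demo() -> u32 {
    let data = [10u32, 20, 30, 40];
    let base = data.as_ptr();
    let addr = base as usize + 2 * std::mem::size_of::<u32>();
    let p = ptr_from_int(addr, base);
    // In bounds of `data` and derived from `base`, so the read is valid.
    unsafe { *p }
}
```

The point of the sketch is that the only cast in the whole function goes from pointer to integer; the other direction is replaced by offsetting an existing pointer.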

I assume this API is easy to support on CHERI as well. After all, using usize to represent pointer offsets is still perfectly fine; the issue is "just" that casting a pointer to usize is lossy in ways that are much more obvious than on regular Rust targets. In other words, the only operation that is problematic if one considers Rust-on-CHERI with usize being 64 bits in size is the int-to-ptr cast, which I think is a cursed operation anyway. Two birds, one stone!

Basically, what I imagine is that with the CHERI target, usize-to-ptr casts would fail to compile. We could have an allow-by-default lint against such casts that helps people ensure their code is portable to CHERI. (transmute between pointers and usize would also fail due to their different size, but then again that is already a cursed operation. This one might be harder to lint against, but it should be possible.)

ptr_from_int, together with the existing ptr as usize cast that extracts the address (and loses provenance), is enough to implement things like "packing extra booleans into the aligned part of a pointer".
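As a rough sketch of that packing scheme, assuming the `ptr_from_int` signature from above (written here with wrapping arithmetic for a conventional target) and a pointee type with at least 4-byte alignment, it could look like this. The names `TAG_MASK`, `pack`, `unpack`, and `demo_roundtrip` are made up for the example.

```rust
/// Hypothetical helper from this post (not a std API), sketched for
/// conventional targets via wrapping pointer arithmetic.
fn ptr_from_int<T>(addr: usize, provenance: *const T) -> *const T {
    let offset = addr.wrapping_sub(provenance as usize);
    (provenance as *const u8).wrapping_add(offset) as *const T
}

/// Low address bits that are always zero in a well-aligned `*const u64`.
const TAG_MASK: usize = std::mem::align_of::<u64>() - 1;

/// Stores two booleans in bits 0 and 1 of the pointer's address.
fn pack(ptr: *const u64, a: bool, b: bool) -> *const u64 {
    debug_assert_eq!(ptr as usize & TAG_MASK, 0, "pointer must be aligned");
    let bits = (a as usize) | ((b as usize) << 1);
    // ptr -> usize extracts the address; the provenance comes from `ptr`.
    ptr_from_int((ptr as usize) | bits, ptr)
}

/// Recovers the original pointer and the two booleans.
fn unpack(packed: *const u64) -> (*const u64, bool, bool) {
    let addr = packed as usize;
    let ptr = ptr_from_int(addr & !TAG_MASK, packed);
    (ptr, addr & 1 != 0, addr & 2 != 0)
}

/// Round-trips a pointer with two flags and reads through the result.
fn demo_roundtrip() -> (u64, bool, bool) {
    let value: u64 = 0xDEAD_BEEF;
    let packed = pack(&value, true, false);
    let (ptr, a, b) = unpack(packed);
    // `ptr` has the same address and provenance as `&value`.
    (unsafe { *ptr }, a, b)
}
```

Note that the packed pointer is never dereferenced directly; it only becomes usable again after `unpack` has cleared the tag bits.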

I think it also suffices to implement schemes such as the OCaml garbage collector, where the last bit of a word is used to distinguish pointers from ints, if we further assume some global const DUMMY_PROVENANCE: *const () that can be used to create pointers that cannot be dereferenced (but that can be cast back to usize). Then we could use *const () as the type for such a "pointer or int" value.
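A sketch of what such a "pointer or int" word might look like, under loud assumptions: `DUMMY_PROVENANCE` is stood in for by a null pointer, `ptr_from_int` is emulated with wrapping arithmetic as on a conventional target, and `Word`, `Decoded`, `from_int`, `from_ptr`, `decode`, and `demo_words` are names invented for the example. As in OCaml, an integer n is stored as 2n+1, so its top bit is lost.

```rust
/// Assumption: a null pointer stands in for the proposed global constant.
const DUMMY_PROVENANCE: *const () = std::ptr::null();

/// Hypothetical helper from this post (not a std API).
fn ptr_from_int<T>(addr: usize, provenance: *const T) -> *const T {
    let offset = addr.wrapping_sub(provenance as usize);
    (provenance as *const u8).wrapping_add(offset) as *const T
}

/// A word that is either a real pointer (low bit 0) or an integer (low bit 1).
type Word = *const ();

enum Decoded {
    Int(usize),
    Ptr(*const u64),
}

/// Encodes `n` as 2n+1; the result must never be dereferenced.
fn from_int(n: usize) -> Word {
    ptr_from_int((n << 1) | 1, DUMMY_PROVENANCE)
}

/// Stores a real pointer unchanged, keeping its provenance.
fn from_ptr(p: *const u64) -> Word {
    debug_assert_eq!(p as usize & 1, 0, "pointer must be at least 2-aligned");
    p as *const ()
}

/// Inspects the low bit of the address to tell the two cases apart.
fn decode(w: Word) -> Decoded {
    let addr = w as usize;
    if addr & 1 == 1 {
        Decoded::Int(addr >> 1)
    } else {
        Decoded::Ptr(w as *const u64)
    }
}

/// Encodes one integer and one pointer, then decodes both.
fn demo_words() -> (usize, u64) {
    let boxed: u64 = 7;
    let n = match decode(from_int(21)) {
        Decoded::Int(n) => n,
        Decoded::Ptr(_) => unreachable!(),
    };
    let v = match decode(from_ptr(&boxed)) {
        Decoded::Ptr(p) => unsafe { *p },
        Decoded::Int(_) => unreachable!(),
    };
    (n, v)
}
```

Only the `Ptr` case ever reaches a dereference, and that pointer got its provenance from a real allocation rather than from an integer.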
