`*move` raw pointers

Over the years, there has been ample discussion on the concept of a "&move"/"&own"/"move reference"/"owning reference". However, its raw pointer cousin has been comparatively neglected. This topic aims to correct that shortfall.


For & (reference to value you don't own and can't modify), we have *const. For &mut (reference to value you don't own but can modify), we have *mut. But for Box, Vec, Arc, and all the other safe-code references to values you own… we have, erm, NonNull? Except that's not a real pointer type, so you have to call .as_ptr() before using it (and no convenient coercions!). To construct it, you need new().unwrap() (the concerns of "pointer to value you own" and "pointer guaranteed to be non-null" are sadly not separated).

In practice, when writing unsafe code to manage owned memory, one must use an awkward mix of NonNull (correctly represents semantic intent, but inconvenient to use in practice), *const (a real raw pointer type, and with the correct variance, but you can't write to it, and it conveys the wrong semantic intent), and *mut (a real raw pointer you can write to, but with the wrong variance, and semantic intent not fully correct). Unsafe code is tricky enough without this extra hassle; can we do better?
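To make the friction concrete, here is a minimal sketch in today's Rust. The helper names `leak_as_non_null` and `bump_and_reclaim` are made up for illustration; only `Box::into_raw`, `Box::from_raw`, `NonNull::new` and `NonNull::as_ptr` are real APIs.

```rust
use std::ptr::NonNull;

/// Hypothetical helper: leak a Box and hold the allocation as a NonNull.
fn leak_as_non_null(value: u32) -> NonNull<u32> {
    let raw: *mut u32 = Box::into_raw(Box::new(value));
    // Box::into_raw never returns null, yet the checked constructor still
    // hands back an Option, so the unwrap() is pure ceremony here.
    NonNull::new(raw).unwrap()
}

/// Hypothetical helper: write through the pointer, then reclaim the value.
fn bump_and_reclaim(owned: NonNull<u32>) -> u32 {
    // NonNull is not itself a raw pointer, so as_ptr() comes first.
    let raw: *mut u32 = owned.as_ptr();
    // SAFETY: `raw` came from Box::into_raw and has not been freed.
    unsafe {
        *raw += 1;
        *Box::from_raw(raw)
    }
}
```

Both the `unwrap()` and the `as_ptr()` are exactly the kind of detour described above; neither carries any information about ownership.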


*move (*own?) would be a raw pointer type with the following characteristics:

  • Covariant
  • Allows writes
  • Implicitly coerces to *mut and *const
  • Constructed from a place via addr_of_move!(place), which requires the place to be something you can move out of (so addr_of_move!(*(&mut foo)) doesn't work for example). This macro would in fact logically perform such a move; the place would no longer be considered accessible by the borrow checker, and drop_in_place would not be called on it.
    • No construction via coercion from & or &mut
    • As with the other pointer types, can also be constructed via cast from a pointer with different mutability
  • Basis for NonNull
  • The raw pointer type of choice for dealing with owned memory
    • Backward compatibility will be an issue: the standard library is full of APIs like Box::into_raw() that use *mut as a substitute for *move.
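For comparison, the closest approximation of addr_of_move! I can write today looks roughly like the sketch below. It assumes ManuallyDrop as the stand-in for "drop_in_place would not be called" and a plain *mut as the stand-in for *move; the function name `move_out_via_raw` is invented for this example.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical emulation of `addr_of_move!(place)` with today's tools.
fn move_out_via_raw(value: String) -> String {
    // ManuallyDrop suppresses the automatic drop of the place.
    let mut slot = ManuallyDrop::new(value);
    // Stand-in for `addr_of_move!(slot)`. Nothing stops later use of `slot`;
    // that is the check the borrow checker would add under the proposal.
    let owned: *mut String = &mut *slot;
    // SAFETY: `owned` points at an initialized String that is read exactly
    // once, and `slot` is never touched or dropped afterwards.
    unsafe { ptr::read(owned) }
}
```

The difference under the proposal would be that the compiler, not the programmer, tracks that `slot` is no longer accessible.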

In my Notes on partial borrows a few months ago, I observed that if &mut T were to be made a subtype of &T (as it is logically), Rust's variance rules would need to handle such mutability-based subtyping differently from lifetime-based subtyping. This is because &&mut T is logically a supertype of &&T; &T is contravariant with respect to the mutability of T. However, Box<T> is logically covariant with respect to T's mutability, just as it is with respect to T's lifetime. In a hypothetical Rust with both mutability variance and *move [1], *const T would be contravariant with respect to T's mutability, but *move T would be covariant. In my view, this further demonstrates that *move is a distinct concept that deserves its own type.


  1. and setting aside backward compatibility for the moment ↩︎


I have a lot to say about &move and *mut, but it's getting late, so I'll have to remember to do it tomorrow. But basically,

  • *move is unrelated to NonNull. Nullability and move capability are orthogonal. The proper analogue is the perma-unstable Unique pointer type, which powers Box and Vec.
  • The variance must be the same as *mut, since *move T can be both read and written. Also, &move is a subtype of &mut in the same sense as &mut T <: &T, thus it cannot provide more relaxed guarantees, including variance.
  • *move T is the proper solution to #[may_dangle] --- it is (almost) a universal type having that property.
  • It must have mostly the same magic as &move, since it's the unsafe pointer powering all safe operations with &move.
  • The latter two points are also the reason it must be a separate type and cannot be modelled with *mut and a bit of careful programming. Variance isn't a good enough reason to add a new pointer type.

Agreed, and I even state this in the post, but

does not follow. From the Rustdoc:

Unlike *mut T, NonNull<T> was chosen to be covariant over T. This makes it possible to use NonNull when building covariant types [...]

Covariance is correct for most safe abstractions, such as Box, Rc, Arc, Vec, and LinkedList. This is the case because they provide a public API that follows the normal shared XOR mutable rules of Rust.

It's NonNull's role as a covariant mutable pointer that *move would replace, not the actual "non-null" aspect of it.

Box<T> can be read from and written to, and is covariant. Being the raw-pointer version of Box et al is the primary motivation for *move.
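The variance difference can be checked with lifetimes in today's Rust. This is only a sketch; the function names are made up, and the commented-out function is there to show what the compiler rejects.

```rust
use std::ptr::NonNull;

// Covariant: a pointer to a `&'static str` is accepted where a pointer to a
// shorter-lived `&'a str` is expected.
fn shorten_non_null<'a>(p: NonNull<&'static str>) -> NonNull<&'a str> {
    p
}

// `*const T` is covariant as well, but cannot be written through.
fn shorten_const<'a>(p: *const &'static str) -> *const &'a str {
    p
}

// `*mut T` is invariant, so the analogous function is rejected:
// fn shorten_mut<'a>(p: *mut &'static str) -> *mut &'a str { p }
```

NonNull is the only one of the three that is both writable and accepted by this kind of signature, which is the role *move would take over.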

&move/*move imply ownership, and therefore responsibility for dropping; & and &mut do not. Therefore, no subtyping relationship can be derived between them.


For the name, I think *own is a better option, for the reference as well. It conveys more meaning.

I think that new pointer types should be non-null by default, with Option<ptr> as the nullable version. Even in unsafe code, null pointers are the exception, not the norm. The only case where null pointers are commonly used, in my experience, is FFI, and Option<ptr> pretty much always models those cases perfectly. In fact, I'd like to see all pointer types transition to this model across an edition.
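The existing NonNull already works this way, so here is a small sketch of what the model looks like at an FFI-style boundary. The function `first_byte_or_zero` and its contract are hypothetical; the layout fact it relies on (Option<NonNull<T>> has the same size as *mut T, with None represented as null) is documented for NonNull.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Option<NonNull<T>> occupies exactly one pointer; None is the null pointer.
fn same_size_as_raw() -> bool {
    size_of::<Option<NonNull<u8>>>() == size_of::<*mut u8>()
}

/// Hypothetical C-style entry point where null means "no buffer".
extern "C" fn first_byte_or_zero(buf: Option<NonNull<u8>>) -> u8 {
    match buf {
        // SAFETY (assumed contract): a non-null `buf` points at one readable byte.
        Some(p) => unsafe { *p.as_ptr() },
        None => 0,
    }
}
```

The null case becomes an ordinary match arm instead of a check the callee has to remember.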


If there were no backward compatibility considerations, I think this would be a no-brainer. But given that they do exist, I wonder how much we can do with pattern types and a non-null pattern.


Very much against this. It would mean that if I declare an FFI function which takes a *(const/mut/move) T and someone passes in a null pointer, I get immediate UB. That is much worse than getting UB on the pointer's dereference, which may not even happen in my code. How is immediate, hidden UB better than UB on an explicit unsafe operation? Besides, nullability isn't even the only pointer validity guarantee, nor the most important one. Neither alignment nor liveness can be properly encoded in the type system anyway, nor could they be enforced over FFI even if Rust types allowed it.

Null pointers are also super common in C/C++ code. Plenty of functions take or return null as some sentinel value. Overall this means that pointer types would be unsuitable for FFI, and would have to be exclusively used in the Option<ptr> form, which is a significant hit to the ergonomics, and goes against the (natural, and encouraged) assumption that raw pointers are the same as C pointers.

This is already an annoyance and footgun for function pointers, but at least in that case there are strong reasons, both for safety and target architecture behaviour, why function pointers should be non-null and entirely different from data pointers. With pure data, it would cause more UB than it would avoid.


That might just be an argument for disallowing (or at least linting against) the use of non-null pointer types in FFI (just as with non-repr(C) types currently), thus forcing the use of an Option around them instead? Granted, there is an ergonomic hit, as you mention.


Covariant and mutable aren't good enough reasons for a new pointer type. NonNull already achieves it, without any changes to the language. It's not quite a pointer, but the difference is, frankly, minor. Again, avoiding a couple of short conversions isn't enough to warrant a new fundamental type. Besides, if pattern types ever get stabilized, we'll be able to declare a non-null type coercible to a pointer purely in library code.

That analogy only goes so far. Unlike Box<T>, which is always guaranteed to be heap-allocated, *move T may point anywhere, including the stack. It would also be a core type, thus available even on #[no_std] systems with no heap allocation.

Thus *move T should be treated as a raw-pointer version of &move T, rather than Box<T>. And &move T has the capability, but not the obligation, to drop its contents. It's a reference, so one should be able to use it like any other reference type: create and drop without using it, include it in method resolution, create traits like IndexMove, DerefMove and the like. This means that it must not drop its contents unconditionally, only in specific circumstances. This code should work:

let s = String::new();
let _ = &move s;
dbg!(s);

Similarly, it should be possible to use &move T as &mut T, purely for some local mutations.

let mut s = String::new();
let m = &move s;
*m = String::from("hello");
dbg!(s);

Thus &move T should be able to coerce to &mut T, and to be reborrowed as &mut T. This means that it cannot provide weaker requirements than &mut T, and must be invariant w.r.t. T for the same reasons. *move T, being the unsafe unchecked version of &move T, must also be invariant.

This does introduce an issue when using it inside a Box or Vec. My opinion is that there should be a way to exactly specify the variance of types, and the current approach is a mistake. The result of the current restrictions is that types need to introduce hacky workarounds, like using *const T where *mut T is appropriate, or even storing *mut () and casting to *mut T for actual pointer operations, instead of using the proper pointer type and declaring their variance.
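A sketch of the second workaround, for anyone who hasn't run into it. `OwnedCell` is a made-up type; it erases the pointer to *mut () and then declares ownership (and with it covariance) through PhantomData<T>.

```rust
use std::marker::PhantomData;

/// Hypothetical owning cell that wants covariance in T while still
/// writing through its pointer.
struct OwnedCell<T> {
    // Type-erased so the invariance of `*mut T` does not leak into the struct.
    ptr: *mut (),
    // Declares "owns a T" and makes OwnedCell<T> covariant in T.
    _owns: PhantomData<T>,
}

impl<T> OwnedCell<T> {
    fn new(value: T) -> Self {
        OwnedCell { ptr: Box::into_raw(Box::new(value)).cast(), _owns: PhantomData }
    }

    fn set(&mut self, value: T) {
        // SAFETY: `ptr` came from Box::<T>::into_raw and is uniquely owned.
        unsafe { *self.ptr.cast::<T>() = value }
    }

    fn get(&self) -> &T {
        // SAFETY: same provenance as above; the T is initialized.
        unsafe { &*self.ptr.cast::<T>() }
    }
}

impl<T> Drop for OwnedCell<T> {
    fn drop(&mut self) {
        // SAFETY: reconstitutes the Box created in `new` exactly once.
        unsafe { drop(Box::from_raw(self.ptr.cast::<T>())) }
    }
}

// Compiles only because OwnedCell<T> is covariant in T.
fn shorten<'a>(c: OwnedCell<&'static str>) -> OwnedCell<&'a str> {
    c
}
```

Every access needs a cast back to the real type, which is the hackiness I mean.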

With regard to subtyping, I model the relation between safe references in the following way: there are capabilities Read, Write and Move, which describe everything you can do to the pointed data. Read and Write are self-explanatory. By Move I mean the capability to mark data as initialized or uninitialized (so an alternative name could be Init). This means that the usual move operation is either Read+Move, for moves out of a place, or Write+Move, for moves into the place (or even Read+Write+Move, when we are moving into an occupied place and must drop the old value). The references are thus strictly ordered with respect to their capabilities: &move T <: &mut T <: &T, because &move T: Read + Write + Move, &mut T: Read + Write, &T: Read (assuming no UnsafeCell inside). UnsafeCell, as always, complicates everything, because &UnsafeCell<T>: Write, though it isn't Move.
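As a toy illustration of that ordering (not a proposal for how the compiler would represent it), the capabilities can be modelled as plain flags. All names here are invented for the sketch.

```rust
/// Toy model of the Read / Write / Move capabilities described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Caps {
    read: bool,
    write: bool,
    mov: bool,
}

const REF_SHARED: Caps = Caps { read: true, write: false, mov: false }; // &T
const REF_MUT: Caps = Caps { read: true, write: true, mov: false }; // &mut T
const REF_MOVE: Caps = Caps { read: true, write: true, mov: true }; // &move T
const REF_CELL: Caps = Caps { read: true, write: true, mov: false }; // &UnsafeCell<T>

/// `from` can stand in for `to` if it has every capability `to` requires.
fn usable_as(from: Caps, to: Caps) -> bool {
    (from.read || !to.read) && (from.write || !to.write) && (from.mov || !to.mov)
}
```

In this model `usable_as(REF_MOVE, REF_MUT)` and `usable_as(REF_MUT, REF_SHARED)` hold while the reverse directions do not, and `REF_CELL` shows how UnsafeCell gives a shared reference Write without Move.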


We could just as well lint any dereferences on raw pointers which aren't explicitly checked to be non-null.

What's the point of having an option in the language which is always wrong? It should be made to Just Work instead.