Also known as: &own, rvalue references, first class places
At a high level, &move T is a theoretical third reference type which conveys ownership of and responsibility for dropping the T, but not its backing memory. A major reason it's considered potentially beneficial to Rust is that it enables library code to model the ABI behavior of pass-by-reference directly, eliminating repeated memcpy moves of a value in the source, without dropping down to the use of &mut MaybeUninit<T>. (Depending on the exact address uniqueness rules, it's not too uncommon to see LLVM not optimizing out such repeated stack copies for somewhat large values.) A second major reason boils down to feature compatibility with C++, though how exactly that gets expressed is highly dependent on the individual.
This sketch only handles "always uninitializing" move references, in essence serving as a safer version of &mut ManuallyDrop<T>. A separate but related concept which can be called "typestate references" also handles the initialization case, which would today be best unsafely modeled with &mut MaybeUninit<T>. It is the author's belief that the former can be made coherently sound even in the face of unwinding, but the latter cannot. For the purpose of the Abstract Machine reborrowing, &move T acts like &mut ManuallyDrop<T> (assuming a non-memory-recursive borrow validity).
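As a rough illustration of the "safer &mut ManuallyDrop<T>" framing, here is a minimal sketch in today's Rust. The names Noisy, consume, and drops_seen_by_caller are hypothetical and exist only for this example; the unsafe contract on consume is exactly the part a real &move T would let the compiler check instead.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Hypothetical helper: bumps a shared counter when dropped.
pub struct Noisy<'a>(pub &'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Stand-in for a hypothetical `fn consume(x: &move Noisy)`.
///
/// # Safety
/// `slot` must hold an initialized value, and the caller must treat the
/// slot as uninitialized once this function returns.
pub unsafe fn consume(slot: &mut ManuallyDrop<Noisy<'_>>) {
    // The callee takes responsibility for dropping the value, but the
    // backing memory still belongs to the caller's stack frame.
    unsafe { ManuallyDrop::drop(slot) };
}

/// Returns the drop count before and after handing the value to `consume`.
pub fn drops_seen_by_caller() -> (u32, u32) {
    let counter = Cell::new(0);
    let mut slot = ManuallyDrop::new(Noisy(&counter));
    let before = counter.get();
    // SAFETY: `slot` is initialized here and is never used again.
    unsafe { consume(&mut slot) };
    (before, counter.get())
}
```

The value is dropped exactly once, inside the callee, and no memcpy of the value is needed to get it there.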
The first question to ask is a two-pronged syntax one, but this syntax question has heavy implications on what semantics are reasonable.
- When calling a function taking a &move reference, do you pass an argument by-value or by-move-ref? E.g. to call fn f(_: &move i32), given let x: i32: f(x), like a non-move-ref function, like how C++ references behave; or f(&move mut x), like other Rust references behave?
- When passing a &move reference binding as a function argument, does using the value act like using a reference, or like using the referenced place by-value? E.g. to call fn g(_: i32), given let y: &move i32: g(*y), like how other Rust references behave; or g(y), (sort of) like C++ references behave?
I'll consistently refer to the concept as &move T, but note that the expression construction syntax cannot just be &move $place[1], since &move || {} is a valid expression (reference to move closure temporary).
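The parsing conflict can be checked in today's Rust. In this small sketch (the function name call_through_reference is made up for the example), the &move prefix is already taken by "shared reference to a move closure":

```rust
/// Shows that `&move || ...` already parses in current Rust.
pub fn call_through_reference() -> i32 {
    let n = 41;
    // Parses as `&(move || n + 1)`: a shared reference to a temporary
    // `move` closure, not a move reference to some place.
    let f = &move || n + 1;
    f()
}
```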
I currently lean weakly towards implicitly creating move references, because the function interface behaves identically to accepting the move by value (assuming destructive moves). Exception: Copy types, since the existing value must not be clobbered. We might want a way to explicitly get a destructive move on them for optimization purposes[2] anyway, though.... Using implicit construction also keeps C++-minded readers from assuming that &move mut $place is Rust's version of C++'s std::move(lvalue). Or worse, mem::ref_move!($place).
For the use of move references, I think they should remain only usable as references unless dereferenced, to avoid the incidental complexity present in C++ with forwarding references in templates and the use of std::reference_wrapper. (Patterns get a new binding mode of ref move which is the default when binding behind &move.) It would be allowed to call fn f(_: &move i32) as f(x) with x: i32 or with x: &move i32, to minimize the burden of this mismatch.
The big semantic question for &move is how do you guarantee destruction, such that Pin<&move T> is functional[3]? If you simply drop the pointee when the move reference is dropped (or moved out of), forgetting (or leaking etc) the reference would forget to drop the pointee.
The simplest answer is just not to guarantee destruction. If the move reference is forgotten, that's equivalent to forgetting the pointee, just as if it had been manipulated by value. Under this answer, Pin<&move T> is unsound to create for T: !Unpin. This is unfortunate, because address-stable types are a major use case for owning references, since they cannot be owned and passed around by value.
The answer which the moveit crate takes is to make MoveRef into a fat reference type (extra-fat for an unsized pointee), including an extra reference to the appropriate drop flag in the memory's owning scope, such that that scope can run the drop glue if the called function fails to actually drop the value. Unfortunately, this scheme isn't without its flaws: it increases the size of move references to handle an edge case, and the library still has to provide "drop flags" which abort if the value wasn't dropped, since sometimes the lending scope can't actually drop the value at scope end.
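To make the "fat reference carrying a drop flag" idea concrete, here is a hand-rolled sketch. It is not moveit's actual API: Slot, MoveRef, lend, and into_inner as written here are hypothetical names for this example, and the abort-on-leak variant mentioned above is not modelled.

```rust
use std::cell::Cell;
use std::mem::{self, MaybeUninit};

/// Hypothetical lender-side storage: the value plus its drop flag.
pub struct Slot<T> {
    value: MaybeUninit<T>,
    live: Cell<bool>,
}

impl<T> Slot<T> {
    pub fn new(value: T) -> Self {
        Slot { value: MaybeUninit::new(value), live: Cell::new(true) }
    }

    /// Lends the value out, or returns `None` if it was already consumed.
    pub fn lend(&mut self) -> Option<MoveRef<'_, T>> {
        if self.live.get() {
            Some(MoveRef { value: &mut self.value, live: &self.live })
        } else {
            None
        }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        if self.live.get() {
            // SAFETY: the flag is only set while the value is initialized.
            unsafe { self.value.assume_init_drop() };
        }
    }
}

/// Hypothetical fat move reference: the place plus the lender's drop flag.
pub struct MoveRef<'a, T> {
    value: &'a mut MaybeUninit<T>,
    live: &'a Cell<bool>,
}

impl<T> MoveRef<'_, T> {
    /// Moves the value out and tells the lender not to drop it.
    pub fn into_inner(self) -> T {
        self.live.set(false);
        // SAFETY: a `MoveRef` is only created while the value is initialized.
        let value = unsafe { self.value.assume_init_read() };
        mem::forget(self); // skip our own `Drop`, the value has moved
        value
    }
}

impl<T> Drop for MoveRef<'_, T> {
    fn drop(&mut self) {
        self.live.set(false);
        // SAFETY: still initialized, and the cleared flag stops the lender.
        unsafe { self.value.assume_init_drop() };
    }
}

/// Hypothetical helper: bumps a shared counter when dropped.
pub struct Noisy<'a>(pub &'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Drop counts right after forgetting the reference, then after the lender's scope ends.
pub fn forgotten_reference_still_drops() -> (u32, u32) {
    let counter = Cell::new(0);
    let after_forget;
    {
        let mut slot = Slot::new(Noisy(&counter));
        mem::forget(slot.lend());
        after_forget = counter.get();
    } // the flag is still set, so the lender runs the drop glue here
    (after_forget, counter.get())
}

/// Drop counts right after dropping the reference, then after the lender's scope ends.
pub fn dropped_reference_drops_once() -> (u32, u32) {
    let counter = Cell::new(0);
    let after_drop;
    {
        let mut slot = Slot::new(Noisy(&counter));
        drop(slot.lend());
        after_drop = counter.get();
    } // the flag was cleared, so nothing is dropped a second time
    (after_drop, counter.get())
}

/// Drop counts after the lender is gone, then after the moved-out value is dropped.
pub fn moving_out_transfers_ownership() -> (u32, u32) {
    let counter = Cell::new(0);
    let mut slot = Slot::new(Noisy(&counter));
    let value = slot.lend().expect("slot starts initialized").into_inner();
    drop(slot);
    let after_slot = counter.get();
    drop(value);
    (after_slot, counter.get())
}
```

The extra `live` pointer in every MoveRef is the size cost described above: it is only consulted when the reference is leaked or consumed.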
My now preferred solution (and the reason I wrote out this sketch) is deceptively subtle for how simple it is: put the drop flags in the scope manipulating the move reference. Not for the move reference itself (though it would still have them, as an owned value), but for the indirectly owned pointee. Effectively, immediately after the binding of some let x: &move T, insert the moral equivalent of defer! { if builtin#initialized(*x) { drop(*x); } }. The only "issue" is that this must be done as part of the language implementation, which has access to the drop flags, rather than as a library.
The impact of this is that the referenced place is always dropped as if it were a local place binding, because the point of move references is manipulating places as if they were moved into local scope, but without actually changing the pointee's address. Essentially, this gives fn(x: &move T) identical semantics to fn(ref move x: T), except for the address of the pointee staying the same as in the caller's scope. (This equivalence also ties into why I think I like the pass-as-value call syntax.)
Thus, forgetting the move reference does nothing: the drop flags for the referenced place are tracked separately and the remaining value at that place is still dropped at the end of scope. Of course, if the move reference is dropped, that drops the pointee then and there, manipulating the drop flags such that it won't get double dropped, and much the same goes for moving out the value or creating a fresh reborrowed move reference which takes over responsibility for dropping the pointee. Similarly, it's still valid to write mem::forget(*x) if forgetting the pointee is what is desired; there's just (somewhat unfortunately) no way to forget it without moving it (to counteract and enable maintaining the pinning guarantee), unless we add a mem::forget_in_place function.
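The by-value behavior that move references would mirror can be observed in today's Rust, where the compiler already keeps a drop flag for a conditionally moved local. In this sketch the parameter x stands in for the pointee *x of a hypothetical x: &move Noisy; the names Noisy, maybe_forget, and drops_for are assumptions made for the example.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical helper: bumps a shared counter when dropped.
pub struct Noisy<'a>(pub &'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// By-value analogue of a callee receiving a move reference.
fn maybe_forget(x: Noisy<'_>, forget_it: bool) {
    if forget_it {
        // Moving the value out clears the drop flag for `x` on this path,
        // which is the analogue of writing `mem::forget(*x)`.
        mem::forget(x);
    }
    // Implicit at scope end: if `x` is still initialized, drop it.
}

/// Number of times the value was dropped by the callee.
pub fn drops_for(forget_it: bool) -> u32 {
    let counter = Cell::new(0);
    maybe_forget(Noisy(&counter), forget_it);
    counter.get()
}
```

Only explicitly moving the value into mem::forget skips the drop; every other path through the callee drops it at scope end.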
And the majority of the indirect place drop flag tracking functionally already exists in the compiler, for Box. The box itself and the heap place's initialization states are tracked separately; this is what allows you to move out of a box (the "DerefMove") and still free the box allocation at the end of scope, or to even move a value back in, recompleting the box and allowing you to manipulate the complete box again. The "only" change that would need to be made for move references is decoupling the dropping of the "stack part" from the dropping of the "heap part," such that the latter can happen without the former. (Plus of course, all of the rules for creation of move references serving as a mut region on the borrowed place and considering the place deinitialized once access is regained.)
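The Box behavior described here can be exercised directly. A small sketch (box_round_trip is a made-up name) that moves a value out of a box, moves a new one back in, and then uses the complete box again:

```rust
/// Moves out of a box's heap place, then refills it.
pub fn box_round_trip() -> (String, String) {
    let mut boxed = Box::new(String::from("first"));
    // Move the value out of the heap place. The allocation stays alive,
    // and the compiler tracks `*boxed` as uninitialized.
    let taken = *boxed;
    // Move a value back in. No old value is dropped, because the place
    // is known to be empty, and the box is whole again.
    *boxed = String::from("second");
    // Moving out once more; the allocation is freed at scope end.
    (taken, *boxed)
}
```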
The semantics feel both reasonably workable and simple in hindsight, to the point that I'm somewhat surprised I haven't seen this approach of handling drop flags in previous discussion. Perhaps I just missed it or don't recall it; that's quite possible.
This seems almost too obvious and simple, and I fear I've overlooked some concern that would make this interpretation of move references impossible.
I don't think Rust is likely to accept an RFC for move references in the near term, but if Rust does support move references in the future, I currently believe that these drop semantics have the most straightforward and predictable behavior of any potential option (with the exception of just ignoring the issue and Pin<&move T>), so they should be used. (I'm less confident about the implicit construction syntax, but it does seem to have its benefits.)
[1] I've used &move mut $place as the expression construction syntax here, following behind the unstable &raw mut $place syntax behind ptr::addr_of_mut!. This syntax should be unambiguous, but it does clash with another far-future pseudo-proposal that offers mut || {} as potential syntax to indicate that a yield closure loops on return rather than poisoning and panicking on later resumes.
[2] The short version of it is that if the address of the source place has escaped (i.e. been given to a noninlined function), it's impossible for the compiler to prove that any use of the value is the last in order to eliminate a defensive copy before pass-by-reference, which is necessary to ensure the two values, which are both "live" at the same time, have disjoint addresses. *&mut $place is an opaque way of navigating around this and invalidating any extant references, but only so long as that place computation isn't just completely dropped from MIR, losing knowledge of that side effect.
[3] As discussed in @mcy's second blog post about the moveit crate.