Moving out of raw pointers with FnOnce

Today I was faced with this conundrum: given a mutable pointer to a closure of type F: FnOnce(), i.e. a *mut F, how can I invoke call_once on it and consume the F instance, and thus uninitialise its memory? This is instead of copying the memory the pointer points to and invoking call_once on that.

Here are the results of my investigation so far. Given a pointer f: *mut F:

  1. ptr::read(f)() will perform a copy onto the stack before invocation, even with optimisations enabled. This makes sense, since semantically ptr::read(f) should not modify the memory pointed to by f.
  2. ptr::replace(f, MaybeUninit::uninit().assume_init())() appears to compile to code identical to (1), despite explicitly giving the compiler permission to trash the memory pointed to by f (for example by invoking in place). Note that for large enough F, the ABI of call_once happens to take self by pointer, so in principle the compiler could pass the pointer directly instead of copying first.
  3. (*f)() is not legal, since you cannot move out of a raw pointer.
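For concreteness, here is a minimal sketch of option (1) as a helper function. The names call_once_from_ptr and run_from_slot are made up for illustration, and the sketch assumes the pointee is never used or dropped again after the call.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical helper: moves the closure out of `*f` and invokes it.
///
/// # Safety
/// `f` must point to a valid, initialized `F`, and the pointee must not be
/// used or dropped again afterwards (it has logically been moved out).
pub unsafe fn call_once_from_ptr<F, R>(f: *mut F) -> R
where
    F: FnOnce() -> R,
{
    // `ptr::read` makes a bitwise copy and leaves the source bytes untouched.
    let closure = unsafe { ptr::read(f) };
    closure()
}

/// Hypothetical driver: parks a closure in storage that will not drop it,
/// then calls it through a raw pointer.
pub fn run_from_slot<F, R>(f: F) -> R
where
    F: FnOnce() -> R,
{
    // `ManuallyDrop` prevents a double drop once the value has been read out.
    let mut slot = ManuallyDrop::new(f);
    let p: *mut F = &mut *slot;
    unsafe { call_once_from_ptr(p) }
}
```

Whether the copy made by ptr::read survives optimisation is exactly the question here; the sketch only pins down the semantics.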

In my opinion, (2) looks the most promising in that it appears to give the compiler enough room to elide the copy if it so chooses, and it is simply not spotting that optimisation opportunity.

My questions are then:

  1. Is there perhaps a different incantation to (2) that achieves the desired effect?
  2. Is this simply a missed optimisation opportunity, that could perhaps be resolved by a new optimisation pass at the MIR level?
  3. Is there some aspect of Rust's semantics that in fact makes copy elision in (2) unsound?
  4. Does my goal instead require an extension to the language or a new intrinsic?

This is UB. Do not pass go, do not collect $200; this invocation is UB 99.99% of the time.[1]

ptr::read(f)() is the operation you want.

We'd need a stronger ptr::read that always[2] deinitializes the place IIRC; the current one semantically performs a byte copy and leaves the existing bytes intact. Then copy elision can be done.

IIUC, MIR doesn't pass arguments by pointer (I don't know at what point this is done, exactly), so it can't be an MIR opt either. That said, at the LLVM level, it would look similar to your (2) – read out then write undef – so it should optimize the same as what you happen to get from (2).

(But it's still UB! Holding the uninitialized value is UB per Rust semantics.)


  1. It is only not UB if mem::uninitialized() is not UB, because that is effectively what you've done. This is sound only for zero-sized types, MaybeUninit<T>, and [MaybeUninit<T>; N]. ↩︎

  2. including for Copy types, IIUC! ↩︎


Good catch. I believe ptr::replace(f as *mut MaybeUninit<F>, MaybeUninit::uninit()).assume_init()() would be the correct way of expressing this? That said, this version doesn't make any difference to the generated code either.
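Spelled out as a function, that MaybeUninit-based version might look like the sketch below. The names take_and_call and run_in_slot are hypothetical, and it assumes f points to an initialized F that nobody reads as an F afterwards.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical helper: takes the closure out of `*f`, leaves the slot
/// uninitialized, and invokes the closure.
///
/// # Safety
/// `f` must point to a valid, initialized `F`. After the call the pointee is
/// uninitialized and must not be read or dropped as an `F`.
pub unsafe fn take_and_call<F, R>(f: *mut F) -> R
where
    F: FnOnce() -> R,
{
    // View the slot as `MaybeUninit<F>`, so that storing "uninit" into it is
    // allowed and no uninitialized `F` value is ever produced.
    let slot = f as *mut MaybeUninit<F>;
    let taken = unsafe { ptr::replace(slot, MaybeUninit::uninit()) };
    // Only the value read out of the slot is asserted to be initialized.
    let closure = unsafe { taken.assume_init() };
    closure()
}

/// Hypothetical driver: stores the closure in a `MaybeUninit` slot and calls
/// it through a raw pointer.
pub fn run_in_slot<F, R>(f: F) -> R
where
    F: FnOnce() -> R,
{
    let mut slot = MaybeUninit::new(f);
    unsafe { take_and_call(slot.as_mut_ptr()) }
}
```

The important difference from (2) is that assume_init() is only ever applied to the value read out of the slot, never to the fresh uninit value.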

What advantage are you seeking by writing uninit, versus just leaving the old value after ptr::read? The memory is no longer usable either way...

The difference is that the compiler doesn't know that in the ptr::read case. It's perfectly fine to mem::forget(ptr::read(p)) (for a valid pointer to a valid object) and that doesn't invalidate the object behind the pointer.
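A small sketch of that point, using a String instead of a closure (the function name read_then_forget is made up for illustration): the bitwise copy is forgotten, and the original stays fully usable.

```rust
use std::mem;
use std::ptr;

/// Hypothetical demo: copy a value out with `ptr::read`, forget the copy,
/// and keep using the original.
pub fn read_then_forget(text: &str) -> String {
    let mut original = String::from(text);
    let p: *mut String = &mut original;

    // The copy aliases the same heap buffer, but it is never dropped,
    // so `original` remains the sole owner of that buffer.
    unsafe {
        mem::forget(ptr::read(p));
    }

    // The object behind the pointer was not invalidated by the read.
    original.push('!');
    original
}
```

Because this is allowed, the compiler cannot assume the memory behind the pointer is dead after a plain ptr::read.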

That said, I fully expect (pseudo IR)

%temp = alloca sizeof $F
memcpy src: %arg0, dst: %temp, len: sizeof $F
memcpy imm: undef, dst: %arg0, len: sizeof $F
call $F::call_once, arg0: %temp

to be immediately optimized to eliminate the undef store

%temp = alloca sizeof $F
memcpy src: %arg0, dst: %temp, len: sizeof $F
call $F::call_once, arg0: %temp

and give exactly what's gotten from using ptr::read.

The ideal pseudo IR @dylanede is after is instead

call $F::call_once arg0: %arg0

where the bytecopy of the impl FnOnce is elided. This is a valid optimization only in the case where the argument pointee is deïnitialized; IOW, this is a valid transformation of my initial IR, but not of the middle IR.

(Caveat: of course, this is only really meaningful when the ABI of F is to pass by pointer to memory, not when it's passed in registers or on the stack. Since the Rust ABI isn't using extern "thiscall" (IIUC) and F is being passed by-value, I don't know how often this is the case.)


Do you perhaps mean

memcpy imm: undef, dst: %arg0, len: sizeof $F

?

Yes :upside_down_face:

I made up an ugly SSA IR and immediately typo'd it. There's a reason I use Rust....

I think I've found a way to achieve what I'm after. The downsides are that it depends on the alloc crate and the unstable allocator_api feature, and that, despite GlobalAlloc not being used, pulling in alloc makes the compiler insist that you define a #[global_allocator] if you're on #![no_std]. With that out of the way, given this code:

#![feature(allocator_api)] // unstable: requires a nightly compiler

extern crate alloc;

use core::{alloc::{AllocError, Allocator, Layout}, mem::MaybeUninit, ptr};
use alloc::boxed::Box;

struct NoOpDealloc;

unsafe impl Allocator for NoOpDealloc {
    #[inline(always)]
    fn allocate(&self, _: Layout) -> Result<ptr::NonNull<[u8]>, AllocError> {
        panic!()
    }
    #[inline(always)]
    unsafe fn deallocate(&self, _: ptr::NonNull<u8>, _: Layout) {
        // No-op
    }
}

you can now invoke an F: FnOnce() mutable pointer in place by using Box::from_raw_in(f, NoOpDealloc)(). I've checked that this elides the copies onto the stack.

A possible caveat with that is that Box adds noalias on its pointer, IIRC. That might even be the reason it's able to avoid copies as you want, but it will be UB if that's violated.

It looks like one idea I've had for a while would help here: have a reference type that borrows a value for an infinite (not 'static!) lifetime. Let's call it an "owning ref" here. It combines properties of Box and &mut T such that the code that creates such a reference can only create it once, as if creating the reference moved the value, while the reference still has a lifetime as if it were just a normal &'a mut T. Additionally, the reference is responsible for dropping: its own Drop impl unconditionally calls drop_in_place() on its pointer.

Note that this is somewhat similar to Pin. To soundly create a Pin on the stack, one shouldn't allow the value to be accessed after creating it. There's even a macro to do just that. The main difference between this and Pin is that moving out of an owning ref is allowed, while moving out of a Pin isn't. It seems that safe stack pinning could be built on top of this, but I'm not sure.

@dylanede for your use case the compiler could auto-impl FnOnce for this kind of owning ref, and you'd use that instead of a raw pointer.

This is typically called &move or &own.
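A rough library-level approximation of the idea, to make it concrete. OwnRef and its methods are hypothetical names, not an existing API, and since FnOnce cannot be implemented for user types on stable, the sketch uses an ordinary call method instead.

```rust
use std::marker::PhantomData;
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Hypothetical "owning reference": borrows the storage for `'a`,
/// but owns (and is responsible for dropping) the value inside it.
pub struct OwnRef<'a, T> {
    ptr: *mut T,
    _storage: PhantomData<&'a mut MaybeUninit<T>>,
}

impl<'a, T> OwnRef<'a, T> {
    /// # Safety
    /// `slot` must hold an initialized `T`, and nothing may treat the slot
    /// as initialized once the returned `OwnRef` has been created.
    pub unsafe fn new(slot: &'a mut MaybeUninit<T>) -> Self {
        OwnRef { ptr: slot.as_mut_ptr(), _storage: PhantomData }
    }

    /// Moves the value out of the referenced storage.
    pub fn into_inner(self) -> T {
        let value = unsafe { ptr::read(self.ptr) };
        // Skip our own Drop impl so the value is not dropped twice.
        mem::forget(self);
        value
    }
}

impl<T> Drop for OwnRef<'_, T> {
    fn drop(&mut self) {
        // The owning ref unconditionally drops the value it points to.
        unsafe { ptr::drop_in_place(self.ptr) }
    }
}

impl<'a, F> OwnRef<'a, F> {
    /// Stand-in for a compiler-provided `FnOnce` impl on the owning ref.
    pub fn call<R>(self) -> R
    where
        F: FnOnce() -> R,
    {
        (self.into_inner())()
    }
}
```

Note that into_inner still goes through ptr::read, so this sketch only models the ownership rules; eliding the copy would need the language-level version.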


Do you have some more resources on existing proposals to add this?

