sync::Once, per instance


#1

sync::Once::call_once requires &'static self. Why not &self?

On several occasions I needed something like this to run once per instance of an object, but this API is too limited for this:

impl Handle {
    fn do_stuff(&mut self) {
        self.once.call_once(|| self.stuff.lazy_initialization());
    }
}

so I roll my own with Mutex<bool>, which isn’t as elegant, and probably isn’t as efficient as it could be.


#2
    pub fn call_once<F>(&'static self, f: F) where F: FnOnce() {
        // Fast path, just see if we've completed initialization.
        if self.state.load(Ordering::SeqCst) == COMPLETE {
            return
        }

        let mut f = Some(f);
        self.call_inner(false, &mut |_| f.take().unwrap()());
    }

    fn call_inner(&'static self,
                  ignore_poisoning: bool,
                  init: &mut FnMut(bool)) {
        let mut state = self.state.load(Ordering::SeqCst);

        'outer: loop {
            match state {
                // If we're complete, then there's nothing to do, we just
                // jettison out as we shouldn't run the closure.
                COMPLETE => return,

                // If we're poisoned and we're not in a mode to ignore
                // poisoning, then we panic here to propagate the poison.
                POISONED if !ignore_poisoning => {
                    panic!("Once instance has previously been poisoned");
                }

                // Otherwise if we see a poisoned or otherwise incomplete state
                // we will attempt to move ourselves into the RUNNING state. If
                // we succeed, then the queue of waiters starts at null (all 0
                // bits).
                POISONED |
                INCOMPLETE => {
                    let old = self.state.compare_and_swap(state, RUNNING,
                                                          Ordering::SeqCst);
                    if old != state {
                        state = old;
                        continue
                    }

                    // Run the initialization routine, letting it know if we're
                    // poisoned or not. The `Finish` struct is then dropped, and
                    // the `Drop` implementation here is responsible for waking
                    // up other waiters both in the normal return and panicking
                    // case.
                    let mut complete = Finish {
                        panicked: true,
                        me: self,
                    };
                    init(state == POISONED);
                    complete.panicked = false;
                    return
                }

                // All other values we find should correspond to the RUNNING
                // state with an encoded waiter list in the more significant
                // bits. We attempt to enqueue ourselves by moving us to the
                // head of the list and bail out if we ever see a state that's
                // not RUNNING.
                _ => {
                    assert!(state & STATE_MASK == RUNNING);
                    let mut node = Waiter {
                        thread: Some(thread::current()),
                        signaled: AtomicBool::new(false),
                        next: ptr::null_mut(),
                    };
                    let me = &mut node as *mut Waiter as usize;
                    assert!(me & STATE_MASK == 0);

                    while state & STATE_MASK == RUNNING {
                        node.next = (state & !STATE_MASK) as *mut Waiter;
                        let old = self.state.compare_and_swap(state,
                                                              me | RUNNING,
                                                              Ordering::SeqCst);
                        if old != state {
                            state = old;
                            continue
                        }

                        // Once we've enqueued ourselves, wait in a loop.
                        // Afterwards reload the state and continue with what we
                        // were doing from before.
                        while !node.signaled.load(Ordering::SeqCst) {
                            thread::park();
                        }
                        state = self.state.load(Ordering::SeqCst);
                        continue 'outer
                    }
                }
            }
        }
    }

I personally don’t see any reason for the 'static bound. Here’s the routine extracted for ease of reference.


#4

Fun git blame dive:

  • std: Second pass stabilization of sync (Dec 2014)

    -    pub fn doit<F>(&'static self, f: F) where F: FnOnce() {
    +    #[stable]
    +    pub fn call_once<F>(&'static self, f: F) where F: FnOnce() {
    
  • libstd: use unboxed closures (Dec 2014)

    -    pub fn doit(&'static self, f: ||) {
    +    pub fn doit<F>(&'static self, f: F) where F: FnOnce() {
    
  • std: Rewrite the sync module (Nov 2014)

    -    pub fn doit(&self, f: ||) {
    +    pub fn doit(&'static self, f: ||) {
    
     The second layer is the layer provided by `std::sync` which is intended to be
     the thinnest possible layer on top of `sys_common` which is entirely safe to
     use. There are a few concerns which need to be addressed when making these
     system primitives safe:
    
       * Once used, the OS primitives can never be **moved**. This means that they
         essentially need to have a stable address. The static primitives use
         `&'static self` to enforce this, and the non-static primitives all use a
         `Box` to provide this guarantee.
    

The author of this diff is @alexcrichton. sync::Once contains only a pointer to (privately hidden) Waiters, which are all stack-allocated. The 'static bound to sync::Once is thus unnecessary to guarantee that any OS primitives are non-relocatable.

I’ve submitted this PR to change this, along with this reasoning as to why it was 'static and why it shouldn’t need to be. I was unable to find any other discussions within rust-lang/rust. At the very least the PR should provide prior art and documentation for why the 'static bound is required.


#5

For a workaround, consider using parking_lot's Once, which takes &self.

Differences from the standard library Once

  • Only requires 1 byte of space, instead of 1 word.
  • Not required to be 'static.
  • Relaxed memory barriers in the fast path, which can significantly improve performance on some architectures.
  • Efficient handling of micro-contention using adaptive spinning.

#6

The PR has been merged! std::sync::Once::call_once now does not require 'static self on master!


#7

Looking at this, does anyone know why sync::Once uses SeqCst ordering? Release/Acquire is strong enough for the kind of synchronization done here.


#8

It’s almost certainly just because it was more “obviously correct” during the initial implementation. We should be able to relax it.


#9

That, and load(SeqCst) generates the same assembly as load(Acquire), so there is no performance degradation on the hot path (Once has already been executed).


#10

oO how can that be? There should be a fence, even on x86!


#11

See https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html.

Since x86 loads and stores implicitly have Acquire and Release semantics, you only need to add a fence to stores to make them SeqCst.


#12

Ah, that makes sense. You need a fence in either loads or stores, not both - and then it makes more sense to have them on stores, I guess.

On other platforms (e.g. ARM) there will be more differences, though.


#13

Stores are less frequent than loads and delayed stores have less impact on any processing pipeline than would delayed loads.


#14