Trying to understand correctness of synchronization primitives

I’m looking at the WaiterQueue used in std::sync::once, in particular I was curious about how thread::park and Thread::unpark can be used correctly.

In this use-case, park and unpark are used together with an AtomicBool like this:

parking:

// the AtomicBool is in `node.signaled`
while !node.signaled.load(Ordering::Acquire) {
    thread::park();
}

(source)

unparking:

(*queue).signaled.store(true, Ordering::Release);
// ^- FIXME (maybe): This is another case of issue #55005
// `store()` has a potentially dangling ref to `signaled`.
queue = next;
thread.unpark();

(source)

As far as I can tell, #55005 is about something different, unrelated to what I’m discussing here.

To come to my question, I’m trying to understand the logic here, in particular the code for unparking. Let me quote the nomicon:

Intuitively, an acquire access ensures that every access after it stays after it. However operations that occur before an acquire are free to be reordered to occur after it. Similarly, a release access ensures that every access before it stays before it. However operations that occur after a release are free to be reordered to occur before it.

Now, on the unparking side, an Ordering::Release is used, so I would naively want to assume that the compiler could legally re-order this into something like

thread.unpark();
(*queue).signaled.store(true, Ordering::Release);
queue = next;

except it can’t (I hope) since otherwise, we could have this kind of interleaving

// unparking thread:
thread.unpark();
// CONTEXT SWITCH
// parked thread, leaving its `thread::park()` call
while !node.signaled.load(Ordering::Acquire) // -> `signaled` still false, loop again
thread::park(); // we’re parked again, waiting for the next `unpark`
// CONTEXT SWITCH
// unparking thread:
(*queue).signaled.store(true, Ordering::Release);
queue = next;
// ...

where the parked thread is not woken up successfully.

So my question is: What kind of atomics-intricacy prevents this reordering? Or is there a bug in the standard library and what’s actually needed is something like the following?

(*queue).signaled.swap(true, Ordering::SeqCst);
queue = next;
thread.unpark();

I’m getting really insecure about atomics these days—perhaps the thread.unpark() call would even need be changed to logically depend on the result of the .swap() to really ensure that no reordering happens?

Edit: Iʼm only just now noticing that the example on the thread::park docs uses the same kind of AtomicBool flag with a Release store. Also: sorry if this thread seems like a better fit for URLO; I thought it should fit IRLO, too, because Iʼd like to potentially get some bugs fixed in the standard library or at least have its documentation improved by adding more explanation for why this example code is safe.

2 Likes

I believe what makes the stdlib not incorrect (currently?) is that Thread::unpark performs a SeqCst access, preventing reordering. I can't speak authoritatively for whether this is actually enough, or whether this property is guaranteed to user code, but intuitively I think it does prevent the bad reordering / interleaving you show here.

I'm not really up to speed today mentally, so I don't know if there's anything preventing any reordering.

However, I can say that SeqCst doesn't in itself prevent much. It's just AcqRel with the additional property that it is well ordered to other SeqCst accesses even on other atomics. Unless there's some other SeqCst on some other atomic, it adds nothing on top of AcqRel.

Edit: Take it with a grain of salt, due to my lack of caffee, but I believe the thing that makes it correct is the access inside parking/unparking. Because unpark does SeqCst, which is superset of AckRel, which is superset of Release, that one ensures that the changes to signal are visible to other threads (in other words, the signal.store does not prevent unpark getting before it, but the unpark prevents the signaled.store getting after it, not beacuse ordering between atomics due to SeqCst, but because in addition to being atomic, it is also an ordinary memory access which is synchronized by that Release).

I think you are correct. The gymnastics with signaled piggybacks with park/unpark, which uses the bigger hammer, and therefore signaled may use just Acquire and Release.