`unordered` as a solution to "Bit-wise reasoning for atomic accesses"

Heh. That does remove some of the urgency, so to speak. Does anyone know of any other examples of Rust code that intentionally uses non-atomic loads instead of Relaxed ones [edit: in situations where a race is possible]?

Or even examples of code that does use Relaxed but could potentially go faster with Unordered. This includes loads that, after inlining, are:

  • Completely unused
    • Might be rare; I'm not really sure
  • Performed repeatedly in a loop, but would be okay to hoist out of the loop (i.e. the value is not expected to change during the loop, or the code doesn't care if it does)
    • Though hoisting is only possible if the compiler can prove there are no aliasing writes within the loop, which is often hard, especially with the noalias woes
  • Performed multiple times in succession, when one load would suffice
    • But the compiler is already allowed to merge such loads under Relaxed, so Unordered adds nothing here

None of the above low-likelihood examples appear to be worth the risk of adding a new memory ordering mode that lacks provable theoretical foundations.
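
To make the loop-hoisting case from the list concrete, here is a minimal sketch. The function names and the `limit` counter are made up for illustration and are not from any real codebase. The first version reloads the atomic on every iteration; the second does by hand the hoisting that a weaker ordering might let the compiler do on its own. Whether a given compiler performs this transformation for Relaxed loads is something I'd treat as an open question rather than a guarantee.

```rust
use std::sync::atomic::{AtomicU32, Ordering::Relaxed};

/// Reloads `limit` on every iteration. If another thread stores to
/// `limit` mid-loop, later iterations may observe the new value.
fn count_below_reload(limit: &AtomicU32, data: &[u32]) -> usize {
    let mut n = 0;
    for &x in data {
        if x < limit.load(Relaxed) {
            n += 1;
        }
    }
    n
}

/// Manually hoisted: a single load before the loop. This is only
/// equivalent if the code does not care whether `limit` changes
/// while the loop is running.
fn count_below_hoisted(limit: &AtomicU32, data: &[u32]) -> usize {
    let limit = limit.load(Relaxed);
    data.iter().filter(|&&x| x < limit).count()
}

fn main() {
    let limit = AtomicU32::new(10);
    let data = [1, 5, 10, 20];
    // With no concurrent writer, both versions agree.
    assert_eq!(count_below_reload(&limit, &data), 2);
    assert_eq!(count_below_hoisted(&limit, &data), 2);
}
```

If the hoisted form is acceptable, writing it by hand like this gets the benefit today without any new ordering mode.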


The most recent tokio blog post includes snippets of code that do an `unsync_load`.

loop {
    let head = self.head.load(Acquire);

    // safety: this is the **only** thread that updates this cell.
    let tail = self.tail.unsync_load();

    if tail.wrapping_sub(head) < self.buffer.len() as u32 {
        // Map the position to a slot index.
        let idx = tail as usize & self.mask;

        // Don't drop the previous value in `buffer[idx]` because
        // it is uninitialized memory.
        self.buffer[idx].as_mut_ptr().write(task);

        // Make the task available
        self.tail.store(tail.wrapping_add(1), Release);

        return;
    }

    // The local buffer is full. Push a batch of work to the global
    // queue.
    match self.push_overflow(task, head, tail, global) {
        Ok(_) => return,
        // Lost the race, try again
        Err(v) => task = v,
    }
}

From the article I don't think `Self::push_overflow` writes `self.tail`, but it does "include stronger atomic operation [than Acquire]".

Thanks, but I just edited my last post to clarify – I meant non-atomic loads that can race. In that case, if the comment is correct, there are no potentially racing stores so no UB.
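
For reference, a minimal sketch of that single-writer pattern using only std. The `Tail` wrapper, its `unsync_load` method, and `run` are hypothetical stand-ins written for this post, not tokio's actual implementation; the sketch assumes Rust 1.70+ for `AtomicU32::as_ptr`. The non-atomic read is sound here only because the thread doing it is also the only thread that ever stores to the cell; other threads just do atomic loads, and a non-atomic read concurrent with atomic reads is not a data race.

```rust
use std::sync::atomic::{AtomicU32, Ordering::{Acquire, Release}};
use std::thread;

/// Hypothetical wrapper around the queue's tail index.
struct Tail(AtomicU32);

impl Tail {
    /// Non-atomic read of the underlying integer.
    ///
    /// # Safety
    /// No other thread may store to this cell while this runs.
    unsafe fn unsync_load(&self) -> u32 {
        // `as_ptr` exposes the raw integer; `read` is a plain load.
        unsafe { self.0.as_ptr().read() }
    }
}

/// Bumps the tail `n` times from a single writer thread while a
/// second thread only ever reads it atomically.
fn run(n: u32) -> u32 {
    let tail = Tail(AtomicU32::new(0));
    thread::scope(|s| {
        // Reader: atomic loads only, never stores.
        s.spawn(|| {
            let _ = tail.0.load(Acquire);
        });
        // Writer: the only thread that stores to `tail`, so its own
        // non-atomic reads cannot race with any store.
        for _ in 0..n {
            let t = unsafe { tail.unsync_load() };
            tail.0.store(t.wrapping_add(1), Release);
        }
    });
    tail.0.load(Acquire)
}

fn main() {
    assert_eq!(run(3), 3);
}
```

If a second thread ever stored to `tail`, the `unsync_load` would become a data race and therefore UB, which is the case I was actually asking about.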
