Heh. That does remove some of the urgency, so to speak. Does anyone know of any other examples of Rust code that intentionally uses non-atomic loads instead of Relaxed ones [edit: in situations where a race is possible]?
Or even examples of code that does use Relaxed but could potentially go faster with Unordered. This includes loads that, after inlining, are:

- Completely unused
  - Might be rare; I'm not really sure
- Performed repeatedly in a loop, but would be okay to hoist out of the loop (i.e. the value is not expected to change during the loop, or the code doesn't care if it does)
  - Though hoisting is only possible if the compiler can prove there are no aliasing writes within the loop, which is often hard, especially with the noalias woes
- Performed multiple times in succession, when one load would suffice
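To make the hoisting case concrete, here is a hypothetical sketch (not from any real codebase; the names and the `CONFIG` atomic are invented for illustration). The compiler generally cannot hoist the Relaxed load out of the loop itself, since it must assume another thread may store to `CONFIG` mid-loop; doing it by hand expresses "I don't care if the value changes during the loop":

```rust
use std::sync::atomic::{AtomicU32, Ordering::Relaxed};

static CONFIG: AtomicU32 = AtomicU32::new(7);

// One Relaxed load per iteration. Legal to merge in principle, but
// compilers are conservative about doing so.
fn sum_naive(xs: &[u32]) -> u32 {
    xs.iter().map(|x| x ^ CONFIG.load(Relaxed)).sum()
}

// Hand-hoisted: a single load, reused. Roughly the transformation a
// weaker ordering would make easier for the compiler to do itself,
// assuming it can rule out aliasing writes inside the loop.
fn sum_hoisted(xs: &[u32]) -> u32 {
    let cfg = CONFIG.load(Relaxed);
    xs.iter().map(|x| x ^ cfg).sum()
}
```

Both versions agree whenever no other thread writes `CONFIG` during the loop; they can only differ if a concurrent store lands mid-iteration, which is exactly the case the hoisted version declares it doesn't care about.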
None of the above low-likelihood examples appear to be worth the risk of adding a new memory ordering mode that lacks provable theoretical foundations.
```rust
loop {
    let head = self.head.load(Acquire);

    // safety: this is the **only** thread that updates this cell.
    let tail = self.tail.unsync_load();

    if tail.wrapping_sub(head) < self.buffer.len() as u32 {
        // Map the position to a slot index.
        let idx = tail as usize & self.mask;

        // Don't drop the previous value in `buffer[idx]` because
        // it is uninitialized memory.
        self.buffer[idx].as_mut_ptr().write(task);

        // Make the task available
        self.tail.store(tail.wrapping_add(1), Release);

        return;
    }

    // The local buffer is full. Push a batch of work to the global
    // queue.
    match self.push_overflow(task, head, tail, global) {
        Ok(_) => return,
        // Lost the race, try again
        Err(v) => task = v,
    }
}
```
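For reference, the `unsync_load` in that snippet can be sketched as a plain non-atomic read through the atomic's storage. This is an assumption modeled on tokio's wrapper, not a quote of its source; the `UnsyncU32` name is invented here:

```rust
use std::sync::atomic::{AtomicU32, Ordering::Release};

struct UnsyncU32(AtomicU32);

impl UnsyncU32 {
    /// Plain, non-atomic load.
    ///
    /// Safety: the caller must guarantee there are no stores that can
    /// race with this read (e.g. this thread is the only writer).
    unsafe fn unsync_load(&self) -> u32 {
        // `AtomicU32::as_ptr` exposes the underlying storage; a raw
        // read compiles to an ordinary load with no ordering.
        std::ptr::read(self.0.as_ptr())
    }
}
```

The safety comment is doing all the work: with no racing stores there is no data race, so the non-atomic read is sound, which is exactly the point being debated in this thread.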
From the article, I don't think `Self::push_overflow` writes `self.tail`, but it does "include stronger atomic operations [than Acquire]".
Thanks, but I just edited my last post to clarify: I meant non-atomic loads that can race. In that case, if the comment is correct, there are no potentially racing stores, so no UB.