Is this a data race?

This video took very long for ulimately showing a neat example for a non-locking stack, in a very short time :smiley:

1 Like

Another subtlety here is "the same as what?" Thread 2's 92 is the same as thread 1's 92, but it's different from the initial value of 0. So in the specific example at the top of this thread, both store()s are changing a value which might be either 0 or 92 to being definitely 92, which we can't plausibly describe as "not a write". At best that's "maybe a write; you'll find out at runtime".

If the initial value was also 92, then things at least seem interesting again. But then CAD97's earlier post and my previous post and so on apply again almost verbatim, so we definitely still need to consider it a write/mutation/whatever for data race purposes even though the value doesn't change.

This is definitely a data race and the program has UB. A colleague once showed me a real optimization that gets broken by considering such code non-racy, here is what they wrote:

Consider the following example

1 {
2 X_na = 1;
3 X_na = 0;
4 Y_rel = 1;
5 X_na = 0;
6 }
7 || // this means "run the two blocks in parallel"
8 {
9   if(Y_acq){
10      a = X;
11  }
12}

In this program a = X in the second thread (line 10) races with the X_na = 0 in the first thread (line 5)

and can read from X_na = 0 writes in the first thread (line 3 and 10).

So in this case a=X should return 0 instead of undef according to the proposed assumption.

Now, it is possible to perform an elimination X_na = 0; Y_rel = 1; X_na = 0; ~> Y_rel = 1; X_na = 0;

where the X_na = 0 in line 3 is eliminated.

In this target program a=X may read two different values (1 or 0 from lines 2, 10) and in race with X_na = 0 (line 10).

As a result a=X should return undef and result in more behaviors in the target program.

This shows that if the program contains a "data race where the write doesn't change the value stored at that location", the optimization is incorrect. Thus, it is crucial that this is indeed considered UB.

@josh you are confusing "data race" and "race condition". A race condition is when two accesses (at least one store) can occur in different orders in multiple executions, introducing non-determinism. Those can be okay. A data race is a particular kind of race condition where at least one access is "non-atomic". Data races are always problems, as the compiler assumes that non-atomic accesses are never subject to race conditions -- IOW, a data race is undefined behavior.

12 Likes

I'm not arguing that the original example is not a data race, but the X_na example seems fundamentally different to me. In the original example, all writes to X are atomic and the read from X is not atomic; while in the X_na example it is the writes that are not atomic.

Well, per the definition of a data race and its many clarifications in its thread, which of the operations is non-atomic clearly doesn't matter; all the reads and writes have to be atomic or else it's UB.

So what exactly do you mean by "fundamentally different"? Are you trying to ask why it doesn't matter which operations are non-atomic? Are you trying to imply there's some better alternative model we haven't already considered in this thread? Without any other context it seems like just another "that's not intuitive" to which the answer is "yeah, some parts of this low-level concurrency stuff aren't intuitive, and they can't be made intuitive without giving up significant performance."

What I mean is not that the original example is not a data race as per definition; I explicitly said I'm not arguing that. What I mean is that the "real optimization that gets broken by considering such code non-racy" does not apply to the original example.

The proposal in this thread (if I understood correctly) was to "not count" the original example as a data race because the racing write puts the same value into the location as is already stored there. The example I posted shows that adjusting data races to exclude such cases would be a mistake. Of course there might still be ways to declare the original example as non-racy, but those would have to be more complicated than "the write doesn't actually change the value that is stored".

3 Likes

Based on my understanding of the C++20 memory model, nothing here prevents the non atomic read from being UB. If you really want to read non atomically you need to enforce some ordering that make all concurrent stores and reads impossible. Something like:

static X: AtomicUsize = AtomicUsize::new(0);

// setup thread
let LOCK = Arc::new(Mutex::new(()));

// thread 1
{
    let _ = LOCK.lock().unwrap();
    X.store(92, Ordering::Relaxed);
}

{
    let _ = LOCK.lock().unwrap();
    let non_atomic_x: &usize = unsafe { transmute::<&AtomicUsize, &usize>(&X) };
    assert_eq!(*non_atomic_x, 92);
}

// thread 2
let _ = LOCK.lock().unwrap();
X.store(92, Ordering::Relaxed);

Generally non atomic reads on atomic variables are a very bad idea and whenever possible should be replaced with at least relaxed reads, as they should and with LLVM these days generally do generate the same code. As someone mentioned earlier, one reason for keeping this UB, is staying abstracted away from cache coherency, alignment and other hardware details.

Onto another populare misconception. Let's say we 'fix' our code, and transform the non atomic read into a relaxed atomic read, like this:

// thread 1
X.store(92, Ordering::Relaxed);
assert_eq!(X.load(Ordering::Relaxed), 92);

// thread 2
X.store(92, Ordering::Relaxed);

Based on the C++20 memory model, it's totally legit for that assert to fail. Nothing prevents the load from being reordered before the store. A semicolon does not create a sequence point, atomic barriers and other mechanisms do.

People from the C++ community made a tool to explore the memory model and the resulting complex interactions.

http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/ here is the tool, no https :frowning:

Enter this code and click run:

int main() {
    atomic_int x=0;
    {{{
        {
            x.store(92, memory_order_relaxed);
            x.load(memory_order_relaxed);
        }
        |||
        {
            x.store(92, memory_order_relaxed);
        } 
    }}}
    return 0;
}

As you can see there are 18 different valid ways for this 'simple' program to be executed. And in 6 of them x.load sees 0, see Execution candidate no. 1 - 6. Expressed with the red rf(read from) relationship to the original value and the subsequent c:Rrlx(Read relaxed) x=0

Ignore non consistent results.

If you want to learn more about the tool and it's background, take a look at this talk https://youtu.be/VogqOscJYvk.

2 Likes

Hm, this is not true? They act on the same memory location, so "program order" applies. That is, I read what you've said as that compiler can reolder these two statemetns:

x = true;
assert!(x);

which is I hope is not true.

1 Like

I think the tool shows all interleavings, even those which are forbidden by the memory model. Those that are actually allowed are marked "consistent". Among "consistent" executions, the assert never fails.

To get at this from a different angle, tool shows a zero returning (but inconssiten) execution for this snippet:

int main() {
    int x = 0;
    x = 92;
    x;
}
1 Like

Huh, that actually makes a lot of sense. Guess I got confused by the allowed reordering of independent variables. Always wondered what consistent meant in this context, your theory seems very plausible. Skimming through the original paper https://www.cl.cam.ac.uk/~pes20/cpp/popl085ap-sewell.pdf doesn't really make it more clear what consistent it is towards.

1 Like

Note that the load in the program by @Voultapher happens with x.load(memory_order_relaxed); and not, as in the example given in this thread, without a memory order. That is the critical problem, mixing atomic and non-atomic stores and loads. Indeed, if you replace the load with a non-atomic one this, the tool indicates a data race (Edit: I'm no longer sure if the tool does what I think it does. This is a problem inventing ad-hoc C-like syntax and giving neither a complete glossary of operations and semantics):

int main() {
    atomic_int x=0;
    {{{
        {
            x.store(92, memory_order_relaxed);
            // Actually, I'm completely sure if this is what I think it is. 
            // The statement `x` alone is also indicated as a race...
            *x;
        }
        |||
        {
            x.store(92, memory_order_relaxed);
        } 
    }}}
    return 0;
}
1 Like