Atomic cmpxchg with volatile semantics


#1

(I waffled on whether this belonged on Users or Internals; send me off to Users as needed.)

In an embedded application, there is a 32-bit (word-sized) memory-mapped register. It contains several bit fields. I want to update one of them atomically without accidentally clobbering changes to others.

In C++ I might model this problem as follows:

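#include <atomic>
#include <cstdint>

// `register` is a reserved word in C++, so the register is named `reg` here.
// `mask` and `value` are assumed to be defined elsewhere.
extern std::atomic<uint32_t> volatile reg;

void update_register() {
    bool success = false;
    do {
        // Load explicitly: std::atomic is not copyable, so `auto val = reg;` won't compile.
        uint32_t val = reg.load();
        success = reg.compare_exchange_weak(
            val, (val & ~mask) | value);
    } while (!success);
}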

This does what I want: the volatile std::atomic preserves accesses/ordering, and the compiler produces (ARM) code using ldrex / strex to get correct atomic semantics.

Rust, like LLVM, models the concept of “volatile accesses” at the operation level instead of the type level, using the volatile_load and volatile_store intrinsics.

Separately, Rust has atomic access intrinsics, seemingly modeled after LLVM’s atomic operations.

However, to implement cases like this, LLVM operations that can access memory sport a volatile flag, among other attributes (memory ordering). Rust’s atomic intrinsics capture some of the cartesian product of these attributes (e.g. there are 17 compare-exchange intrinsics representing a flattening of the attributes of a single LLVM instruction), but not the concept of access volatility.

As a result, it’s not clear how I get atomic operations with LLVM/C11 volatile semantics in Rust.

I am aware that I could work around this by

  1. Wrapping accesses with a lock instead of using atomic updates, and
  2. Bypassing the compiler and using inline assembly,

but I’d like to avoid both if possible.

Thanks!


#2

On ARM systems, ldrex/strex do not work on Device memory, only on Normal memory. Therefore a volatile atomic doesn’t really make sense in this case. You can use ptr::write_volatile and ptr::read_volatile to perform non-atomic volatile accesses to Device memory, or just plain atomics if you are accessing Normal memory.
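
For reference, a minimal sketch of the non-atomic volatile variant, assuming the register is reachable through a raw `*mut u32`. The function name and parameters are made up for illustration, and `value` is assumed to already lie within `mask`, as in the C++ example above.

```rust
use core::ptr;

/// Replaces the bits selected by `mask` with `value` using one volatile
/// read and one volatile write. This is NOT atomic: an interrupt that
/// fires between the read and the write can still have its changes to
/// other fields overwritten.
///
/// # Safety
/// `reg` must be valid, aligned, and safe to read and write.
/// (Hypothetical helper; the name and signature are illustrative.)
unsafe fn update_field_volatile(reg: *mut u32, mask: u32, value: u32) {
    unsafe {
        let old = ptr::read_volatile(reg);
        ptr::write_volatile(reg, (old & !mask) | value);
    }
}
```

The compiler will not elide or merge the two accesses, but nothing here protects the sequence from being interrupted, which is the gap the original question is about.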


#3

Fascinating. A subtle bullet point in the ARMv7-M architecture reference manual agrees with you. It appears I’ve got quite a bit of code in the wild that is technically undefined behavior. (For the record, ldrex/strex work fine on device memory in Cortex-M parts through the M4, but it’s technically illegal and might not work on future parts.)

Does this imply that there is no way to write an interrupt-safe read-modify-write of a device register on ARMv7-M without disabling interrupts? I think it might.


#4

Ah, after looking at the reference manuals, it seems that this is implementation defined rather than undefined. Also, my experience is mostly on ARMv7-A, not ARMv7-M, so there might be some further differences.

Anyways, back to your question: Rust doesn’t currently expose volatile atomic operations. However, LLVM currently performs very little optimization on atomics, so you might be able to get away with the existing primitives. Alternatively, you can use inline asm to implement volatile versions yourself.
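
A rough sketch of the "existing primitives" route, assuming the register can be viewed as an `AtomicU32` (how that reference is obtained from the register address is left out). The function name is made up, and these are ordinary atomic accesses with no volatile guarantee, so this leans on LLVM not optimizing them away. Whether the underlying exclusive accesses are permitted on Device memory is a separate, hardware-level question.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Atomically replaces the bits selected by `mask` with `value` using a
/// compare-exchange loop, and returns the previous register contents.
/// These are plain (non-volatile) atomic operations.
/// (Hypothetical helper; the name and signature are illustrative.)
fn update_field_atomic(reg: &AtomicU32, mask: u32, value: u32) -> u32 {
    let mut old = reg.load(Ordering::Relaxed);
    loop {
        let new = (old & !mask) | value;
        // The weak variant may fail spuriously, so retry with the value
        // that was actually observed.
        match reg.compare_exchange_weak(old, new, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(prev) => return prev,
            Err(current) => old = current,
        }
    }
}
```

This mirrors the C++ loop from the first post: on failure the freshly observed value is fed back in, so concurrent changes to other bit fields are preserved.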


#5

[details=ARM-specific digging, background, and conclusions for those who are interested (not particularly Rust-specific)]I did some forensic computer architecture on this.

The modern atomic instructions were added in ARMv6, before the A/M/R profile split. Initially they carried the caveat “LDREX and STREX operations shall only be performed on memory supporting the Normal memory attribute.” (ARM ARM circa 2005, section A2.9.)

At some point during the evolution of ARMv7-A, this was loosened to its current version: “It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be performed to a memory region with the Device or Strongly-ordered memory attribute.” (Wording present as of Issue C of the ARMv7-A ARM, circa 2012.)

ARMv7-M was not updated and contains the ARMv6 wording precisely.

In practice, the Cortex-M3 and M4 processors appear to follow the ARMv7-A model: ldrex/strex work fine against Device memory. (Contrast with some v7-A processors that straight-up fault in this condition.)

Moreover, this pattern is the only way I can find to perform efficient interrupt-safe read-modify-write sequences on device registers from unprivileged code on ARMv7-M, because unprivileged code cannot efficiently disable/reenable interrupts. (And many applications do not allow interrupts to be disabled.)

So I’m going to keep doing things the illegal way.[/details]

Thanks! I am using inline assembler for now, much as I did in C++ before volatile atomics became reliable there.