Pre-pre-RFC: Exploring API design space for volatile atomics

Continuation of Volatile atomic operations? - help - The Rust Programming Language Forum.

The C/C++ atomics model allows volatile atomic operations. They’re required when atomics live in volatile memory, e.g. when an atomic is in shared memory and is used to synchronize between processes. Currently Rust doesn’t support volatile atomics, so the only way to do volatile atomic accesses is by using inline assembly. That feels suboptimal.

I see several ways to support volatile atomics in the language:

Just duplicate everything

We could duplicate all the atomic types, i.e. make VolatileAtomicU8-like types. Alternatively, we could duplicate all the methods, i.e. make .fetch_add_volatile()-like methods. Both these paths lead to even more code duplication, and, worse, docs duplication near atomics.

Wrapper type

Make a Volatile<T> wrapper type that makes accesses to the wrapped value volatile. This could allow for Volatile<AtomicU8> to implement volatile atomic methods. It still kinda duplicates all the methods (unless we add a (private?) trait to abstract over atomics), but it will look better in docs. Potential source of confusion: Volatile<u8> being a wrapper and AtomicU8 being a separate type may feel kinda arbitrary at first glance. Upside: such a wrapper could be useful with non-atomic values too.
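For the non-atomic half of this idea, a rough sketch on stable Rust might look like the following. The Volatile<T> name and its read/write methods are assumptions made up for illustration, built on the existing ptr::read_volatile and ptr::write_volatile. The T: Copy bound means this sketch does not cover Volatile<AtomicU8>, which would presumably need compiler support.

```rust
use std::cell::UnsafeCell;
use std::ptr;

/// Hypothetical wrapper (not a std type): every access to the inner value
/// goes through a volatile load or store.
#[repr(transparent)]
pub struct Volatile<T: Copy> {
    // UnsafeCell allows writes through `&self` and keeps the type `!Sync`,
    // so plain (non-atomic) volatile accesses cannot race across threads.
    value: UnsafeCell<T>,
}

impl<T: Copy> Volatile<T> {
    pub const fn new(value: T) -> Self {
        Self { value: UnsafeCell::new(value) }
    }

    /// Volatile load of the wrapped value.
    pub fn read(&self) -> T {
        // SAFETY: the pointer comes from a live UnsafeCell, so it is valid,
        // aligned and initialized; `T: Copy` makes the bitwise copy sound.
        unsafe { ptr::read_volatile(self.value.get()) }
    }

    /// Volatile store to the wrapped value.
    pub fn write(&self, value: T) {
        // SAFETY: same pointer validity argument as in `read`.
        unsafe { ptr::write_volatile(self.value.get(), value) }
    }
}
```

Usage would be along the lines of `let reg = Volatile::new(0u8); reg.write(reg.read() + 1);`. Note that the read-modify-write here is two separate volatile accesses, not one atomic operation.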

Special orderings

pub struct OrderingPlusVolatileness {
    pub ordering: Ordering,
    pub volatile: bool,
}

impl From<Ordering> for OrderingPlusVolatileness {
    fn from(ordering: Ordering) -> Self {
        Self { ordering, volatile: false }
    }
}

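impl Ordering {
    pub fn volatile(self) -> OrderingPlusVolatileness {
        OrderingPlusVolatileness { ordering: self, volatile: true }
    }
}

impl AtomicU8 {
    // Signature sketch only: takes `&self` and returns the previous value, like today's `fetch_add`.
    pub fn fetch_add(&self, val: u8, ordering: impl Into<OrderingPlusVolatileness>) -> u8;
}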

That’s not quite backwards-compatible (consider e.g. use_fn_ptr(AtomicU8::fetch_add)), but I think it could be done in the new edition (crates with old editions will see old non-generic methods).

Pros: looks kinda nice at the call site (x.fetch_add(1, Ordering::Relaxed.volatile())), almost backwards-compatible, doesn’t create API duplication. Cons: conflates two nearly unrelated things into one value, not really backwards-compatible.
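To get a feel for the call site on today's stable Rust, here is a minimal sketch that emulates the proposal with extension traits, since inherent methods can't be added to std types from outside. OrderingExt, FetchAddExt and fetch_add_with are hypothetical names invented for this example, and the volatile flag is only carried around, not acted upon, because stable Rust has no way to request a volatile atomic operation.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OrderingPlusVolatileness {
    pub ordering: Ordering,
    pub volatile: bool,
}

impl From<Ordering> for OrderingPlusVolatileness {
    fn from(ordering: Ordering) -> Self {
        Self { ordering, volatile: false }
    }
}

/// Hypothetical stand-in for an inherent `Ordering::volatile` method.
pub trait OrderingExt {
    fn volatile(self) -> OrderingPlusVolatileness;
}

impl OrderingExt for Ordering {
    fn volatile(self) -> OrderingPlusVolatileness {
        OrderingPlusVolatileness { ordering: self, volatile: true }
    }
}

/// Hypothetical stand-in for the generic `AtomicU8::fetch_add`.
pub trait FetchAddExt {
    fn fetch_add_with(&self, val: u8, ordering: impl Into<OrderingPlusVolatileness>) -> u8;
}

impl FetchAddExt for AtomicU8 {
    fn fetch_add_with(&self, val: u8, ordering: impl Into<OrderingPlusVolatileness>) -> u8 {
        let requested = ordering.into();
        // ASSUMPTION: a real implementation would forward `requested.volatile`
        // to codegen. Here the flag is ignored and a normal atomic is used.
        self.fetch_add(val, requested.ordering)
    }
}
```

With these traits in scope, both `x.fetch_add_with(1, Ordering::Relaxed)` and `x.fetch_add_with(1, Ordering::Relaxed.volatile())` type-check, which is the ergonomic point of this option.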

Fences

volatile_load_fence(&x); // Ensure _next_ load is volatile.
x.fetch_add(1, Ordering::Relaxed);
volatile_store_fence(&x); // Ensure _previous_ store is volatile.

I’m not sure this is easy/possible to implement? But this option has by far the smallest API surface, which I think is nice. It’s also completely backwards compatible and not that unpleasant to use (although most useful atomic operations would require two fences, as shown above).

Variation:

// Adds fences above and below.
x.with_volatile(|| {
    x.fetch_add(1, Ordering::Relaxed);
});

Don’t Do Anything

Always an option. Volatile atomic accesses remain only accessible (pun not intended) via inline assembly, which is non-portable. Upside: API remains simple and discoverable for by far the most common use case (non-volatile atomics).

Anything else?

I’m quite interested in pushing this story forward. I’m willing to write an RFC and try to implement this, although I’ll probably need a mentor for a change this complex (I think it would involve libs, MIR optimizations and codegen?).


This doesn't require an RFC, just create an ACP

I'm leaning towards just adding extra volatile methods. It's not like the documentation page for the atomic types is crazy long already. Duplicating methods is something we commonly do, look at the various integer arithmetic modes.

That said, there are a couple other options:

  1. Add an optional generic parameter to each method
pub fn fetch_xor<const VOLATILE: bool = false>(&self, val: bool, order: Ordering) -> bool

abool.fetch_xor(false, Ordering::SeqCst);
abool.fetch_xor::<true>(false, Ordering::SeqCst);

To make it more readable, you could use two unit structs implementing a common trait

trait Volatility {
    const VOLATILE: bool;
}
struct Volatile;
struct NonVolatile;

impl Volatility for Volatile {
    const VOLATILE: bool = true;
}
impl Volatility for NonVolatile {
    const VOLATILE: bool = false;
}

pub fn fetch_xor<V: Volatility = NonVolatile>(&self, val: bool, order: Ordering) -> bool

abool.fetch_xor(false, Ordering::SeqCst);
abool.fetch_xor::<Volatile>(false, Ordering::SeqCst);

Or just wait for adt_const_generics and use an enum.

  2. Put the Volatile methods on an accessor struct
pub fn volatile(&self) -> VolatileAtomicU8

abool.fetch_xor(false, Ordering::SeqCst);
abool.volatile().fetch_xor(false, Ordering::SeqCst);
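The accessor-struct option can be mocked up on stable Rust roughly as below. VolatileAtomicBool and the VolatileAccess extension trait are hypothetical names for this sketch, and the method body falls back to an ordinary atomic operation because a volatile atomic can't be expressed in stable Rust today. It only illustrates how the borrowing view groups the methods.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical borrowing view that would host the volatile methods.
pub struct VolatileAtomicBool<'a> {
    inner: &'a AtomicBool,
}

/// Extension trait standing in for an inherent `AtomicBool::volatile`.
pub trait VolatileAccess {
    fn volatile(&self) -> VolatileAtomicBool<'_>;
}

impl VolatileAccess for AtomicBool {
    fn volatile(&self) -> VolatileAtomicBool<'_> {
        VolatileAtomicBool { inner: self }
    }
}

impl VolatileAtomicBool<'_> {
    pub fn fetch_xor(&self, val: bool, order: Ordering) -> bool {
        // ASSUMPTION: placeholder body. A real implementation would emit a
        // volatile atomic RMW; this sketch uses the normal operation.
        self.inner.fetch_xor(val, order)
    }
}
```

Because the view only borrows the atomic, the same AtomicBool can be used through both the normal and the volatile methods without changing its type or representation.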

We only need a new type if volatile memory differs by representation. Are there situations where non-volatile / volatile mix is not only wrong but results in data races (by the memory model)? That's not been implied anywhere so I'd find the introduction of an (owning) type a little strange.

There's good motivation to differentiate in the form of methods in any case. The idea of a borrowing wrapper type to group those methods more clearly is interesting. It does fracture the documentation a little bit, but removes a bunch of redundant qualification. Hm.


We only need a new type if volatile memory differs by representation. Are there situations where non-volatile / volatile mix is not only wrong but results in data races (by the memory model)? That's not been implied anywhere so I'd find the introduction of an (owning) type a little strange.

It doesn’t matter from the safety perspective. For shared memory nobody is stopping someone from just producing regular &AtomicU8 and using it. Separate type is just a usability thing: docs get separated, autocomplete doesn’t suggest useless methods etc.

Volatile atomics were discussed recently on Zulip: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/volatile.20atomic.20in.20Rust.3F

Correct me if I'm wrong: doesn't the default type for the V type param in fetch_xor violate the invalid_type_param_default rule? See: Tracking issue for invalid_type_param_default compatibility lint · Issue #36887 · rust-lang/rust (github.com)

Ah probably. I forgot about that rule.
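For reference, the marker-type variant still works without a default, at the cost that every caller has to name the parameter. A minimal sketch, where the free function fetch_xor_as is a hypothetical stand-in for the generic method and the volatile branch is not actually implemented:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

pub trait Volatility {
    const VOLATILE: bool;
}

pub struct Volatile;
pub struct NonVolatile;

impl Volatility for Volatile {
    const VOLATILE: bool = true;
}

impl Volatility for NonVolatile {
    const VOLATILE: bool = false;
}

/// Hypothetical generic operation; `V` has no default, so callers must
/// always write `::<Volatile>` or `::<NonVolatile>`.
pub fn fetch_xor_as<V: Volatility>(atomic: &AtomicBool, val: bool, order: Ordering) -> bool {
    // ASSUMPTION: a real implementation would pick a volatile or non-volatile
    // intrinsic based on this constant. The sketch reads it and then uses the
    // ordinary atomic operation either way.
    let _wants_volatile = V::VOLATILE;
    atomic.fetch_xor(val, order)
}
```

The call sites then look like `fetch_xor_as::<NonVolatile>(&abool, true, Ordering::SeqCst)`, which is noisier than the defaulted version proposed above.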

Can you say more about how atomics and volatile are used together? When I think volatile I think of things where "every read matters" because it's a special address, not memory, but when I think atomic I think things like CAS loops, where there's lots of spurious uninteresting reads.

Or is it just "I have to use an atomic write because I have no synchronization", and things like CAS don't make any sense with it?

The problem is highlighted in the original thread: with non-volatile atomic accesses, the compiler can just optimize them away when it thinks that the accesses are unobservable in the current program. If something outside the program is interacting with memory (for example, when sharing memory between processes or perhaps when dealing with MMIO), this optimization is incorrect, so we need a way to prevent it.

My question is why the combination, though. Why does sharing memory between processors need more than normal atomics? In what situation does a write to that shared memory need to be volatile too?

(I understand the uses for each of them individually.)


Because compiler assumes that if the atomic operation is never observed in the current program, it’s never observed at all and can completely optimize that operation out: https://godbolt.org/z/9dTc6dPE. This assumption is wrong in the presence of shared memory.
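A small illustration of the kind of code this is about (not the code behind the godbolt link, just a sketch with a made-up function name): the atomic below never escapes the function, so under the as-if rule the compiler is allowed to drop the individual atomic operations and return a constant.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// The atomic is a local that nothing else in the program can see, so the
/// optimizer is permitted to fold these operations into `return 2`.
pub fn local_counter() -> u32 {
    let counter = AtomicU32::new(0);
    counter.store(1, Ordering::SeqCst);
    counter.fetch_add(1, Ordering::SeqCst);
    counter.load(Ordering::SeqCst)
}
```

The result is the same either way, which is exactly why the optimization is legal inside one program. The concern in this thread is what happens when the same reasoning is applied to memory that another process or a device can observe.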

My thought was that this was when it can prove the accesses are unobservable, which doesn't apply to things like shared memory which come from outside the observable memory space.

This likely prevents this problem in practice now, but I feel like this logic is kind of brittle. What if the atomic value is in a static (mut) variable that was pinned to a specific address with linker magic? How does the compiler decide what’s inside the observable memory space? If there was an API to create a fresh provenance, which could allow porting mmap() to strict_provenance, would the memory accessed through a pointer with that fresh provenance be “observable” from the perspective of the compiler? Reasoning like this relies heavily on the poorly-defined semantics of Rust’s (and LLVM’s) abstract machine. On the other hand, volatile accesses are quite simple to reason about.


I doubt this, my understanding is that what volatile even means is part of those poorly-defined semantics, it's used in a lot of places it's not needed in C due to cargo-culting, and incorrectly used as an alternative to atomics very often.


Will the volatile wrapper type work on normal integer types too? That would be very useful, since currently you need to either go through pointers or use crates to work with volatile memory mapped registers in embedded.

Whatever solution we come up with should ideally be composable so you can have either volatile, atomic, both or neither. I feel like this is actually a spot where C++ has a better story (atomic<T> as opposed to many separate types). It would be great if we could at least have Volatile<u8> and Volatile<AtomicU8> just do the right thing.

What was the reason why there's AtomicU8 instead of Atomic<u8>? Will the same issue apply to Volatile<T>?


This particular pitfall can’t be triggered by adding volatile atomics.

As far as I understand it, volatile means that access counts as an I/O effect, similarly to foreign function call, i.e. compiler isn’t allowed to assume its behaviour. Am I wrong in this? Anyway, I feel like properly explaining and understanding one part of the memory model is still easier than properly understanding the entire memory model.

I actually can’t remember. I’ll try to find this discussion before opening a proper RFC/ACP.


Before we can add this to libs this needs approval by... probably T-opsem and maybe T-compiler? I.e. it needs to be specified first what these operations even mean and whether they can be supported.

Then we can talk about the name. AIUI it's unclear whether volatile-for-shared-memory vs. volatile-for-mmio are the same thing or should have separate names. But that can't be decided if the semantics of those operations hasn't been defined yet.


Is it possible to introduce a function fn(*const T) -> *const T or fn(&T) -> &T which tells the compiler that writes through the resulting pointer are observable?

let my_atomic = AtomicBool::new(true);

// might be optimized out
my_atomic.store(false, Ordering::Relaxed);

// always done
red_bikeshed(&my_atomic).store(false, Ordering::Relaxed);

This means that the state of &my_atomic does not escape if it is passed into red_bikeshed. Only reads and writes through the returned reference matter.

This would on one hand give you a guaranteed mechanism for mmap and friends or even embedded programming that certain writes to a memory location are always done. And it gives you fine grained access when this is the case.

From what I understand about LLVM (not much) this can be implemented.

I’ve looked into the implementation details a bit. It seems like the fence-like approach, as well as this one, would be harder to implement. To make an access volatile, we need to mark the instruction as volatile by adding a call to LLVMSetVolatile() somewhere around here: https://github.com/rust-lang/rust/blob/6316ac83d770fb86dc581751b24cddf622376899/compiler/rustc_codegen_llvm/src/builder.rs#L1083-L1098. The obvious way to do this is to pass the “volatile” flag through the call stack, which means that the atomic method needs to “know” whether it’s volatile or not.