[Pre-ACP?] Volatile Atomics and weak/strong Volatiles

Shinribo · August 1, 2025, 12:42pm

Continuation of

Pre-pre-RFC: Exploring API design space for volatile atomics

Rust currently has Atomics and Volatile Accesses; However it lacks Volatile Atomics and potentially differentiation between weak/strong Volatile Accesses.

Why volatile Atomics?

Volatile Atomics are mainly used in System and Bare-Metal programming, such as OS-Kernels. In these Areas the need arives to perform memory operations atomically but at the same time these atomic operations should act on data structures for communicating wit hardware, which are not regularly read from. An example is the Memory Management Unit (MMU) in x86 CPUs. A bare metal programm managing the page tables used by the MMU to translate virtual addresses into physical addresses usually only writes into these data structures, potentially allowing the Compiler to optimize those accesses away. This means we need volatile Accesses, but the x86 MMU is also allowed to write to these datastructures to set state bits (Accessed and Dirty), inadvertenly leading to a race condition between the programm and the MMU; The only reason this is not instant UB on x86 is because of the strong memory model x86 uses. On other Architectures such as ARM and RISC-V this could actually be UB. To solve this we need volatile Atomics. There are other (weaker) examples such as Inter-Process-Communication, where you are writing the code for IPC and cant rely on a underlying OS.

Tl.Dr there is a need for volatile Atomics for System/Bare-Metal Programming in Rust.

Pre-pre-RFC: Exploring API design space for volatile atomics discusses multiple ways how this could be expressed in the API. While its a subjective view, prepending the function names with “volatile_” looks reasonable:

pub fn store(&self, val: usize, order: Ordering)

pub fn volatile_store(&self, val: usize, order: Ordering)

Aditionally we may need to think about volatile as a concept again. Currently it is defined as “no reorder, no tearing, no read/write aggregation, access must happen“, However this may hinder optimizations if its too strong for the current need.

Thats why i propose splitting volatile into strong and weak variants: Strong: Same gurrantees as current volatile. Weak: “access must happen“

Especially when writing IPC Code the accesses only need the “Access must happen“ gurrantee, as the other required gurrantees are satisfied by synchronization mechanisms (Eg. Locks,…).

Vorpal · August 1, 2025, 12:50pm

While I agree this area is underdeveloped, you need to take into consideration the multitude of previous discussions on this topic. Here, on zulip as well as at

How does your proposal relate to these? What makes it possible to move forward with this, unlike the other proposals and efforts?

Jules-Bertholet · August 1, 2025, 2:03pm

See especially ## Pre-Pre-RFC: `core::arch::{load, store}` and stricter volatile semantics · Issue #321 · rust-lang/unsafe-code-guidelines · GitHub

Shinribo · August 12, 2025, 8:41am

I likely missed some discussions, but those that i read and event interacted with boiled down to either not making progress without a visible reason or i was brushed off that my example (same as in post above) were solveable without volatile rust. There isnt really pressure to work on volatile_atomics with some people even arguing that its useless. From my view bolting volatile onto atomics looks non invasive especially as volatile atomics are supported by llvm, i dont see any real reason not to add that. The only reason i can think of is that the lang devs want to get things right the first time to prevent features becoming severe breaking changes, but at least in this case it could be solved by gating volatile_atomics behind a nightly feature flag until the whole atomics, volatile and memory model thing is solved.

talchas · August 15, 2025, 2:10am

Yeah, my impression is that it's some amount of wanting to get it right, but a lot it just being low priority due to the very few use cases, and even fewer use cases when compilers have not been reliable about generating the expected code for at least regular volatile accesses (I don't know offhand if llvm actually screws up volatile atomics, I believe the specific example I saw was both gcc and non-atomic, but it certainly wasn't valid for any hardware control via volatile). And the cases which do still need it are generally very close to the hardware and often are using asm already, so using it for one more case doesn't really hurt (and if you need a special barrier like RISCV, that's probably not going to be part of the volatile atomics code generation).

And both of those examples in your initial post can use regular atomics. You don't need volatile atomics for memory shared with the MMU (unless there's archs out there which limit the instructions you're allowed to use to access it, in which case you'd additionally need the compiler to make guarantees about what instructions are used, whether volatile or not). The compiler cannot optimize the writes away entirely, because the pointer to the page table has been shared with asm (which could do anything, like spawn a thread that reads it) at some point. The thing that non-volatile atomics permits in that situation is combining multiple writes, if you flip a bunch of flags in sequence or such (and don't have any special asm in between), which is not a problem for editing a page table.

Similarly, IPC doesn't have a problem unless normal atomic instructions just don't work for some reason, and in that case volatile atomics probably will not help and just use the same instructions. mmap() or whatever other operation you used to get your shared memory could have spawned a thread, and thus the compiler can't break things.

The one case I know of where volatile atomics could be useful if compilers were entirely trustworthy about what instructions volatile accesses generate, is DMA situations where you write data to shared memory using normal non-atomic accesses, and then you want to do some Release operation(s) on some magic hardware address to actually start the operation. In this case combining two stores would be wrong (because it's not really just a memory location; each write is a separate message to the hardware), and thus volatile is needed. But see RISCV needing special fences, and compilers screwing up codegen around volatile, so kernels just use asm anyways.

Shinribo · August 21, 2025, 5:56pm

Is there some way to prevent the compiler from doing false dead store elimination optimizations to my page table code other than hoping that it just works when the root pointer is obtained from assembly? Maybe a list of things that cause the compiler to not try those optimizations?

talchas · August 21, 2025, 9:00pm

A compiler which believed that it was dead would be completely wrong. At that point there's really no hope for any sort of correctness.

Topic		Replies	Views
Pre-pre-RFC: Exploring API design space for volatile atomics language design	44	1583	March 11, 2024
Need input regarding very weak memory models (GPUs, VMs...)	12	1815	April 5, 2020
Add volatile operations to core::arch::x86_64 language design	21	2335	October 26, 2019
Atomic cmpxchg with volatile semantics	5	2511	March 25, 2019
Volatile and sensitive memory language design	100	21874	April 19, 2021

[Pre-ACP?] Volatile Atomics and weak/strong Volatiles

pub fn store(&self, val: usize, order: Ordering)

Related topics