I see that Rust's documentation says this, but I don't understand why. The first version of the C++ paper about deprecating useless parts of volatile lists "Shared memory with untrusted code" (i.e. my use case) as the first item on its list of legitimate use cases.
As I understand it, volatile guarantees that the compiler won't invent reads that were not in the source code and that the compiler doesn't assume that two volatile reads from the same memory location yield the same value. AFAICT, these constraints should be enough to rule out UB (but not garbage values) at least on ISAs where non-atomic loads/stores and relaxed loads/stores use the same instructions. Is the UB issue something that would only manifest on Itanium, or is there a more practical reason why concurrent accesses need to be UB with volatile? (I understand that volatile isn't well-suited for intentional communication between threads, but that's different from whether adversarial concurrency can cause UB (as opposed to mere garbage values) for the thread using volatile.)
I think that optimization is permitted even if the function is visible outside the crate, because with C++11 atomics, the compiler is allowed to reduce the set of possible executions.
Ah, indeed. I gather that Itanium is the only architecture for which the relaxed memory order affects the generated instructions in addition to affecting optimization compared to non-atomic.
I don't need to support Itanium, and I'm very confident I won't have to start supporting Itanium. I have to care about x86, x86_64, armv7 and aarch64. Support for POWER8 and higher, MIPS, and RISC-V would be nice but not essential at this time.
How can that be, considering that unordered is supposed to model non-volatile Java (for Java's meaning of volatile)?

These are all stronger guarantees than what I need. Both volatile and relaxed are too strong as well: I'm pretty sure the only optimization constraints I need are:

1) the compiler must not invent reads of the memory locations that the source code tells it to write to (i.e. the compiler can't use that memory as spill space prior to writing what the programmer asked to write, since reading back the spilled values could go wrong), and

2) if the compiler generates instructions to read the same memory location twice, it must not assume that it gets the same value both times (but it is OK to optimize away the second read and reuse the value already read).
Relaxed is too strong, because it provides indivisibility. In addition to Rust not having a way to generate relaxed SIMD loads/stores, when using C to get LLVM to generate them, LLVM emits a library call that takes a lock to provide indivisibility.
Volatile is too strong, because it prohibits reordering writes within a sequence of writes and reads within a sequence of reads, as well as combining adjacent operations into wider reads/writes. Optimizations like that should be fine for my scenario, since I don't care what garbage values an adversarial thread observes, and in the presence of adversarial writes I'm OK with reading garbage as long as the compiler doesn't use assumptions about the consistency of the garbage to eliminate any safety checks.
(Last week, I discussed this with SpiderMonkey and Cranelift developers under the premise that volatile would be UB with concurrency (and, therefore, excluded from consideration), and we settled on relaxed, with relaxed ruling out SIMD. Then I read the C++ paper about selectively deprecating volatile, which gives my use case as the first item on its list of legitimate uses, so now I'm trying to understand whether I could get SIMD back with volatile...)