Volatile and sensitive memory

Rust does not have volatile variables, so I'm not sure that applies.

1 Like

It doesn't apply. Mixing volatile and non-volatile accesses to the same location is fine as far as the language is concerned. But of course, if you accidentally do a non-volatile access to MMIO memory, that can cause all sorts of trouble (but no UB).

3 Likes

What are the semantics then? If it's not UB, the behavior is implementation-defined (by rustc), and should be spelled out somewhere. The docs point to C11, which implies that it is undefined. The docs also spell out that read_volatile and write_volatile are guaranteed to always commit reads and writes to memory, but I'm not sure I believe that's true at the intersection of mixed volatile/main-memory accesses and LLVM's questionably documented semantics (IIRC it assumes that if a pointer is ever accessed as main memory, it is always accessed as if it were main memory).

I disagree that this is "not the language's problem". If we provide volatile intrinsics and guarantees thereof, we've already made it our problem. If we aren't careful, we'll wind up in the same position as C: https://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

2 Likes

AFAIK, the key problem here is that volatile is about tightly controlling the stream of hardware load/store instructions targeting a given memory location, whereas the mere existence of a single non-volatile load/store operation (or indeed, of an unused reference to that memory location) is enough to allow rustc and LLVM to generate virtually any extra loads and stores they like in addition to the volatile ones for the purpose of prefetches, register spills, optimized struct field readout/modification, etc.

1 Like

Right, but my understanding is that if LLVM figures out it can generate those extra loads and stores, it will also figure that it doesn't need to maintain the ordering or commit-to-load/store semantics of the volatile loads/stores (certainly, the as-if semantics of generating extra loads and stores are enough to allow this).

I believe that in practice gcc (not sure about clang) does something like this in the event of mixed-access UB:

extern volatile int my_register;

void foob() {
  // This cast causes GCC to infer that |my_register| is
  // main memory, not volatile, and erases the cv-qualification
  // from the global.
  *((int *)&my_register) = 0x42;
}

It was either earlier in this discussion or in another volatile discussion where someone observed that LLVM likes to de-volatile-ize loads and stores if it observes that that address is accessed like a normal pointer (e.g., marking its address as dereferenceable). I think the paper I linked makes more reference to this kind of stuff (it's also generally worth reading just because of how hilarious it is).

Edit: coincidentally, a friend sent me a cppcon talk about lolvolatile, and perhaps the bit about LLVM deleting "volatile" markers in the IR might be made up? I'm not an LLVM wizard- I'll leave adjudicating that point to someone who isn't making things up.

1 Like

If you access a volatile object created from C or C++ through a non-volatile lvalue, the behavior is undefined. As Rust code that calls into C code is held to C's invariants, I would assume that extends to making a non-volatile access to a volatile object. Theoretically, it should be possible to avoid problems by wrapping the variable in some Cell wrapper that only loads/stores through the intrinsics, but as far as I can tell, the fact that UnsafeCell<T> has the same layout as T isn't well-defined (and such a wrapper would probably rely on UnsafeCell).
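
To make that concrete, here is a minimal sketch of the kind of wrapper I mean; the name and details are hypothetical, and it leans on the UnsafeCell layout assumption above:

use core::cell::UnsafeCell;
use core::ptr;

// Hypothetical wrapper: every access goes through the volatile
// intrinsics, never through an ordinary load or store.
#[repr(transparent)]
pub struct VolatileCell<T>(UnsafeCell<T>);

impl<T: Copy> VolatileCell<T> {
    pub fn get(&self) -> T {
        unsafe { ptr::read_volatile(self.0.get()) }
    }

    pub fn set(&self, value: T) {
        unsafe { ptr::write_volatile(self.0.get(), value) }
    }
}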

This is a question about what LLVM, not clang, does. LLVM provides a much larger volatile surface than C++, and potentially different definedness semantics (LLVM does not have the utterly insane notion of a "volatile object", but rather, much like Rust, has volatile loads and stores (and volatile memcpy, apparently???)).

I think the bit about FFI interactions with "volatile objects" (which, from LLVM's point of view, are a C/C++ fiction, and which Rust knows absolutely nothing about) is perhaps more subtle than anyone but the UCG folks can pass judgement on.

3 Likes

Volatile objects are rather useful to consider, and volatile semantics can certainly be necessary to require and enforce. How that is enforced is going to be language-dependent. C and C++ use volatile objects to enforce volatile semantics in the cases where they are needed.

Volatile objects are not quite as useful as they sound. Volatile only affects loads and stores, and, as such, is only meaningful behind a pointer. For example:

void poke(volatile int device);

does not put volatile semantics on device. The correct and useful notion is that of a volatile pointer, or perhaps a volatile address. Objects (or, perhaps more pedantically, lvalues) in C do not need to have observable addresses unless their addresses are taken.

One could also imagine other problems with this view. For example, under one aggressive reading of the as-if rule, volatile objects with automatic storage duration (i.e., volatile stack variables) could be assumed to be in main memory, and their loads and stores unobservable (since the compiler has free rein to hammer the stack however it wants already). In general, any volatile object which did not come from an extern or static global, or which is behind a pointer created by the compiler rather than forged by the programmer, cannot (to my knowledge) be observed to be volatile.

2 Likes

So far, this part of Rust isn't specified yet.

My thinking is that we specify it somewhat along the lines of what I outlined here. Basically, in the Rust Abstract Machine, volatile reads/writes are considered externally observable events, akin to syscalls. Unlike syscalls, however, they are severely limited in the effects they are allowed to have -- the exact limits here will have to be carefully determined. Furthermore, in the platform-specific part of the rustc documentation, we would then specify the effect of that externally observable event in terms of the low-level memory model of that specific platform.

I have not seen this claim before, so I don't think it was in this discussion. Do you have a link? That would be a serious problem for the desired volatile semantics for Rust.

Oh God please no, can we stick to Rust-only volatile for now?^^

I would say the correct and useful notion is that of a volatile access. LLVM got that right, Rust copied it from LLVM, and C/C++ are moving in that direction. Everything else is a library/syntactic sugar built on top of that, where accesses through some pointers are automatically marked as volatile.

3 Likes

You still need to say what it means to access a region of memory through both volatile access and any other kind of access (e.g., taking @HadrienG's VolatileT[1], passing it by value, wrapping it in a NonNull, and reading it), which I don't think your proposal specifies. Edit: Actually, is the takeaway here "as long as some subsequence of the program's loads and stores matches the volatile loads and stores, the implementation conforms"? I mean, you could do that, but I feel it would be strictly more reasonable to declare "volatile and non-volatile access to the same address" as UB, less so for optimization opportunities and more so because mixing such accesses is always a programmer error.

I'm also somewhat concerned with whether forging a pointer and doing volatile accesses on it is UB or not. Forging a pointer is the standard way of accessing an MMIO device (mostly because you can't always expect to control the linker script).

[1] I still think this is the wrong syntax sugar for volatile; you really want it to wrap a raw pointer and have the constructor accept a usize address, but I digress.
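
For what it's worth, the shape I have in mind is roughly this (a sketch only; all names are made up, and it glosses over the pointer-forging questions above):

use core::ptr;

// Hypothetical pointer-wrapping volatile handle, constructed from a raw
// usize address.
pub struct VolatilePtr<T>(*mut T);

impl<T: Copy> VolatilePtr<T> {
    /// Safety: `addr` must be a valid, suitably aligned MMIO address
    /// for the entire lifetime of the handle.
    pub unsafe fn new(addr: usize) -> Self {
        VolatilePtr(addr as *mut T)
    }

    pub fn read(&self) -> T {
        unsafe { ptr::read_volatile(self.0) }
    }

    pub fn write(&self, value: T) {
        unsafe { ptr::write_volatile(self.0, value) }
    }
}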

I don't. I looked for a reference and I couldn't find one? I might be wrong here, better to check with someone who works on LLVM. =/ I would still be willing to believe that selling LLVM on an address being like a C++ T& is enough to destroy volatile semantics though, even if all of the volatile reads and writes, as written, still occur.

I agree, but only in that volatile access is the appropriate primitive for an SSA IR (insofar as your IR should feel like a sugary RISC with infinite registers). At the user level, though, I still argue volatile pointer is more reasonable (though I only really believe in *volatile word).

1 Like

AFAIK, the nasty thing is that this is always a programmer error in some use cases of volatile (e.g. hardware MMIO), but not in other use cases.

For example, I could totally see myself doing something like this pseudocode:

// Request a page of memory from the OS.
let memory_page: &mut [MaybeUninit<u8>] = os_api::allocate_page();

// Make sure that allocation actually occurs before benchmark starts,
// working around the lazy allocation stupidity of some OSes.
memory_page.as_mut_ptr().write_volatile(MaybeUninit::new(42));
compiler_fence(Ordering::SeqCst);

// Benchmark actual workload, without memory allocation timing noise.
// Volatile is neither necessary nor desirable here, we don't want to
// needlessly inhibit the compiler's memory access optimizations.
run_microbenchmark(|| do_something_with(memory_page));

1 Like

That is a necessary but not sufficient condition for a conforming implementation. But yes, I see no issue in the spec with mixed accesses -- it is up to the platform-specific documentation to explain how exactly hardware-level loads and stores interact with the Rust Abstract Machine Memory Model, but presumably the most reasonable option is to say that, e.g., a volatile store followed by a non-volatile load will load the value previously stored, and vice versa.

UB is the worst possible kind of "lint" for a programmer error. This is the equivalent of "your car had a dent, so I helpfully scrapped it to fix that problem for you".

Also, I don't agree that it is always an error. For example, there is a crate (the name eludes me right now) that uses volatile writes to overwrite a variable with zeroes before it gets dropped. This is useful for security purposes (don't leave behind secrets), and typically there will be non-volatile accesses to the same location.
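
Sketched very roughly, that pattern looks something like this (not that crate's actual code, just the shape of it):

use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

// Rough sketch of the "wipe secrets before drop" pattern.
fn zero_secret(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // Volatile store so the compiler cannot elide a write it would
        // otherwise consider dead.
        unsafe { ptr::write_volatile(byte as *mut u8, 0) };
    }
    // Keep later accesses from being reordered before the wipe.
    compiler_fence(Ordering::SeqCst);
}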

It is UB to access a dangling pointer, volatile or not. But a pointer to MMIO-backed memory is not dangling. In general, a compiler cannot deduce that such a forged pointer is dangling, for the reasons you mentioned (with the explicit exception of 0/NULL, which is always considered dangling). The compiler can, however, consider a pointer dangling that was allocated via a language allocation operation (malloc in C, things like Box::new in Rust) and then moved beyond the bounds of the allocation, or used after deallocation.

Also, the access is UB if the address does not actually exist. This is more relevant for non-volatile accesses where optimizations could change when and where e.g. a page fault happens (the compiler is not required to preserve that, it may assume no such fault happens).

Sure, VolatileUsizePtr makes for a more useful abstraction. This is similar to how we have AtomicUsize and similar types, but the actual underlying primitives are atomic accesses, not atomic objects. As we are discussing language semantics here, we are concerned only with the underlying primitives, not with the higher-level abstractions.

5 Likes

Oh, yeah, that's fine; I believe C explicitly allows that? My wording this morning was just wrong. I'm not happy about the idea of "this is originally volatile, touching it as non-volatile is ok", but any sane compiler probably can't observe that, since your primitive is going to be, like you said, volatile load/store. BoringSSL does something like this, but they seem to do the hilarious asm volatile("") trick instead...

@HadrienG Maybe we really want volatile versions of copy_nonoverlapping and write_bytes? I have no idea what the semantics of volatile llvm.memcpy are though.
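
In the meantime, the element-by-element version is easy enough to spell out by hand (a sketch; whether the compiler must keep this as count separate accesses is exactly the open question):

use core::ptr;

/// Hypothetical stand-in for a volatile copy_nonoverlapping.
/// Safety: same requirements as ptr::copy_nonoverlapping.
unsafe fn volatile_copy_nonoverlapping<T: Copy>(src: *const T, dst: *mut T, count: usize) {
    for i in 0..count {
        let value = ptr::read_volatile(src.add(i));
        ptr::write_volatile(dst.add(i), value);
    }
}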

Is this written down somewhere? The understanding I've always taken in C/C++ is that forging pointers that C/C++ didn't initialize is UB, but in practice you either don't care or you pass --im-a-kernel-author-i-know-what-im-doing-thanks or whatever. (Unfortunately, my copy of C11 is buried under all my moving boxes and I don't think control-f'ing around the PDF is worth answering this question.)

The impression I get, the more I talk to people about this, is that "volatile" is an obnoxiously overloaded term that means a lot of things to different people. They all seem to be some variant of "optimization barrier around load/store", but it is not clear we all want the same type of barrier...

2 Likes

I think this is basically specified by omission: the key point is that the compiler cannot assume it knows about all allocations. It can set the rules for allocations it "understands" (like those created via language primitives), but it cannot impose those rules on the rest of the world.

I wouldn't know where to search for this in the C standard. In a proper formal definition, I would look at the correctness theorem and search for the initial state: the Abstract Machine on which the source program runs, and which is simulated by the target binary, has to be correctly simulated not just for the empty initial state, but also when there already exists some "stuff" in the Abstract Machine, such as preallocated memory. I would also look at the model of syscalls to get an idea of what things like mmap are allowed to do (allocation functions that the compiler does not understand).

I can't say anything authoritative for C/C++, but I can tell you that when the time comes for writing a Rust spec I will argue strongly against such an approach. :wink: But in my interpretation of C/C++, doing so is legal. At least I have yet to have someone point me at a part of the spec that would forbid this.

I don't think this is a great place to be in... some obnoxious C/C++ standards bugs stem from the difficulty of proving a nonexistence theorem in the language of the standard. Strict aliasing is especially bad in this regard.

I tend to find that it's safer to read C/C++ as a list of everything that is not UB, due to its emergent nature. Thankfully, the UCG WG is not a dysfunctional international committee, so I expect whatever standardese you wind up writing won't have that caveat. I'm sure you'll be hearing from me when such a time comes and the semantics of this sort of kernel code stuff are finally specified. =)

2 Likes

This is probably zeroize.

This crate isn't about tricks: it uses core::ptr::write_volatile and core::sync::atomic memory fences to provide easy-to-use, portable zeroing behavior which works on all of Rust's core number types and slices thereof, implemented in pure Rust with no usage of FFI or assembly.

1 Like

Oh, totally agreed. There's many things about the C/C++ standard that are very much not a great place to be in. :wink:

2 Likes

When we finally get to the point of having a "complete" Rust spec, I'm sure this would be stated more explicitly somewhere, if only in a non-normative note saying "notice we never said X, that's intentional because ...". But I don't think there's any short-term Rust-specific worry here.

2 Likes

The semantics that @mcy assumes would require the C standard to state that an implementation is allowed to assume that only those allocations that it can "observe" are valid. The C standard does not state this anywhere, and the reasons for this are kind of obvious (e.g., it would completely break separate compilation, among many other things).

That is, in C, if one casts the address 42 into an int * and dereferences it, the C standard allows an implementation to assume that this pointer points to a valid allocation, dereferenceable for sizeof(int), aligned to an alignof(int) boundary, etc. An implementation that can prove that this is not the case can assume that such an execution path is unreachable. Implementations cannot, in general, prove this, except for very obvious cases: e.g., if you cast 13 to an int* and dereference it, then it is clear that the pointer is not properly aligned, and thus that code is unreachable. In particular, if you do have a valid allocation behind some memory address, and an implementation "proves" that this is not the case, then that implementation is incorrect because its proof is incorrect. That's in a nutshell how MMIO with "magic" hardcoded addresses in C works. I'm not sure how Rust could do anything differently here.
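
The Rust version of that pattern would look about the same (the address and register name here are made up for illustration):

// Hypothetical device register at a made-up, platform-defined address.
const UART_DATA: *mut u32 = 0x1001_3000 as *mut u32;

fn poke_uart() {
    unsafe {
        // The compiler cannot prove that no allocation exists at this
        // address, so it must assume the volatile accesses are valid.
        UART_DATA.write_volatile(b'!' as u32);
        let _status = UART_DATA.read_volatile();
    }
}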

1 Like