Volatile and sensitive memory

In Rust, there is no volatile modifier. Instead, there are unstable “volatile read” and “volatile write” primitives, and it seems like those will be stabilized soon. So, since there is no volatile modifier, does that mean that the Rust compiler must assume that all memory is volatile by default? In particular, given x: &u64, the compiler generally won’t know whether x is a reference to a value stored in volatile memory, right? Thus, it would never be able to do speculative reads of x, and in general it would have to use ACCESS_ONCE-style [1] access for every such value, unless it can prove that x is not a reference to volatile memory, which is in general very hard. Is that right?

It seems like that can’t be right, because it would be horrible for performance. But I can’t figure out how the proposed “volatile read” and “volatile write” functions work otherwise. In particular, “volatile read” and “volatile write” only seem to make sense under the assumption that there are no speculative reads/writes to the memory they access, yet there’s no way to indicate to rustc that speculative reads/writes are not allowed.

[1] https://lwn.net/Articles/508991/

You cannot volatile_store or volatile_load an object that is pointed to by a reference; that would be a violation of the aliasing rules, just like having an exclusive reference and a shared reference to the same object is.

1 Like

I’m not sure what you mean. Here’s some code. Is the compiler allowed to rewrite easy into easy_optimized? Is it allowed to rewrite tricky into tricky_extra_load?

#![feature(core_intrinsics)]

fn easy(x: &u64, y: &u64) -> u64 {
    if *x != 0 {
        *x
    } else {
        *y
    }
}

fn easy_optimized(x: &u64, y: &u64) -> u64 {
    // The compiler replaces the two loads of *x in `easy` with a single load.
    let ret = *x;
    if ret != 0 {
        ret
    } else {
        *y
    }
}

fn tricky(x: &u64, y: &u64) -> u64 {
    if unsafe { std::intrinsics::volatile_load(x as *const u64) } != 0 {
        unsafe { std::intrinsics::volatile_load(x as *const u64) }
    } else {
        *y
    }
}

fn tricky_extra_load(x: &u64, y: &u64) -> u64 {
    // The compiler adds a spurious read of *x when compiling `tricky`.
    let _ = *x;
    if unsafe { std::intrinsics::volatile_load(x as *const u64) } != 0 {
        unsafe { std::intrinsics::volatile_load(x as *const u64) }
    } else {
        *y
    }
}

fn main() {
    let x = 1;
    let y = 2;
    easy(&x, &y);
    easy_optimized(&x, &y);
    tricky(&x, &y);
    tricky_extra_load(&x, &y);
}

I think the assumption is that if memory is volatile, it will only be used via the volatile read/write primitives – and if you mix accesses, we are free to do whatever we want to the other accesses.
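
For illustration, here's a minimal sketch of that discipline, written against the ptr::{read_volatile, write_volatile} spelling rather than the intrinsics: once a location is treated as volatile, every access to it goes through the volatile primitives, and nothing ever reads or writes it through a plain dereference or reference.

use std::ptr;

// Sketch: the u64 behind `p` is accessed exclusively through the volatile
// primitives, never through an ordinary `*p` or a reference.
unsafe fn bump(p: *mut u64) {
    let v = ptr::read_volatile(p); // volatile load: never speculated or duplicated
    ptr::write_volatile(p, v + 1); // volatile store: never removed by the optimizer
}

fn main() {
    let p = Box::into_raw(Box::new(0u64)); // from here on, touch *p only via volatile ops
    unsafe {
        bump(p);
        assert_eq!(ptr::read_volatile(p), 1);
        drop(Box::from_raw(p)); // reclaim the allocation
    }
}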

2 Likes

Niko, does that mean that the Rust compiler will never insert a read anywhere the Rust code doesn’t put one? I.e., no speculative loads?

I don’t think “volatile memory” is a concept in the C spec. According to (the last draft of) the C11 spec, the volatile modifier on a variable/memory location means that expressions involving it are evaluated strictly as required by the abstract machine. That is, as far as I can tell, it just controls uses of the variable, not the actual storage itself.

This is exactly what {read,write}_volatile offer: explicit control over the uses of a variable. They allow/are designed for building an abstraction like the following, which is fairly similar to the volatile qualifier:

use std::ptr;

struct VolatileCell<T> {
    x: T
}
impl<T> VolatileCell<T> {
    fn get(&self) -> T {
        unsafe { ptr::read_volatile(&self.x) }
    }
    fn set(&mut self, x: T) {
        unsafe { ptr::write_volatile(&mut self.x, x) }
    }
}

(In practice it may make more sense to have this use UnsafeCell internally and &self for set.)
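
For concreteness, a minimal sketch of that variant, with the extra assumption that it is limited to Copy types so that get can hand out a plain copy of the value:

use std::cell::UnsafeCell;
use std::ptr;

struct VolatileCell<T: Copy> {
    value: UnsafeCell<T>,
}

impl<T: Copy> VolatileCell<T> {
    fn new(value: T) -> VolatileCell<T> {
        VolatileCell { value: UnsafeCell::new(value) }
    }
    // Both accessors take &self; UnsafeCell is what makes mutation through
    // a shared reference legal here.
    fn get(&self) -> T {
        unsafe { ptr::read_volatile(self.value.get()) }
    }
    fn set(&self, value: T) {
        unsafe { ptr::write_volatile(self.value.get(), value) }
    }
}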

2 Likes

Note that in your example, the compiler is allowed to insert a spurious read of *x because &T and &mut T parameters to a function are marked as dereferenceable in LLVM IR, which means they can be safely dereferenced. If you had used *const T or *mut T instead then the compiler would not have been allowed to add a spurious read.
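
To illustrate (as far as I understand the current codegen), a raw-pointer version of tricky would carry no dereferenceable attribute, so the compiler gets no license to invent a read that the source doesn't contain; here it is spelled with ptr::read_volatile, which is the same operation as the intrinsic:

use std::ptr;

// x is *const u64, so it is not marked `dereferenceable` in the generated IR
// and the compiler may not add a read of *x that the source doesn't perform.
unsafe fn tricky_raw(x: *const u64, y: &u64) -> u64 {
    if ptr::read_volatile(x) != 0 {
        ptr::read_volatile(x)
    } else {
        *y
    }
}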

1 Like

No, that's not what I mean. It means that if you have a volatile load, it won't be speculatively loaded in advance. And if you want to ensure that no speculative loads etc. are done for some memory, you should only be using volatile accesses. As @huon said, addresses are not volatile -- accesses are. Declaring a variable as volatile is really just a way of saying "all accesses to this variable are volatile". I believe LLVM and C have roughly this same model. Here are LLVM's docs, for example. (I'd be interested to see counterexamples though.)

2 Likes

Let’s back up. Consider this:

fn foo(_x: &u64) { }

In Rust, is the compiler allowed to rewrite it to this?:

fn foo(x: &u64) {
   let _ = *x;
}

Yes it is, because you are using a &u64, which guarantees that it points to a valid instance of u64. Now, this isn't actually specified in the documentation, but Rust (at least the way it is currently implemented) will also assume that it points to normal memory where reading the value multiple times will not result in any side-effects and will always return the same value.

Raw pointers do not make any of these assumptions, so Rust would not be allowed to rewrite this:

fn foo(_x: *const u64) { }

into this:

fn foo(x: *const u64) {
   let _ = unsafe { *x };
}

1 Like

int main() {
    volatile int not_really_volatile;
    volatile int *p = &not_really_volatile;
    (void)*p;
    return 0;
}

This gets compiled to:

main:                                   # @main
        xorl    %eax, %eax
        retq

Even though not_really_volatile is declared volatile, the C compiler is allowed to use its intrinsic knowledge of the fact that it allocated it in non-volatile storage to optimize away the load.

Quoting @Amanieu: “Yes it is, because you are using a &u64, which guarantees that it points to a valid instance of u64. Now, this isn't actually specified in the documentation, but Rust (at least the way it is currently implemented) will also assume that it points to normal memory where reading the value multiple times will not result in any side-effects and will always return the same value.”

So, would it be fair to say that a Rust reference is equivalent to a non-NULL C pointer to a non-volatile object?

So, would it be fair to say that a Rust pointer is equivalent to a C pointer to a volatile object?

In particular, if this is all true, it means that no function could ever safely use reference types and then later cast them to pointers for the purpose of calling volatile_load or volatile_store, as the compiler may insert speculative loads ahead of the cast from reference to pointer. This has the pretty amazing consequence, AFAICT, that one cannot create a safe (doesn't require the use of unsafe) API that ever uses volatile memory.

If so, that actually sounds pretty OK to me for the time being. However, it would be nice to also have a safe (i.e. one that doesn't require the use of unsafe) API for volatile memory in Rust, because raw pointers are too unsafe. AFAICT, that would require an extension to the type system, or at least the creation of volatile types analogous to Rust's atomic types.

Also, IMO it is very important that it is clearly documented in the Rust reference that references are non-volatile and pointers are volatile, if this is all true.

Dereferencing doesn't imply reading ("lvalue-to-rvalue conversion") in C/C++; try something like int unused = *p; and you'll observe a read operation being generated.

Edit: Or maybe not; there's some pedantry involved. I need to check in the standard whether conversions to void invoke lvalue-to-rvalue conversion.

Update: As often happens with C++, everything is more interesting: the rules changed between C++03 and C++11. *p invokes lvalue-to-rvalue conversion for volatile glvalue expressions (Clause 5 Expressions [expr]). You can observe the difference by running Clang with different -std options on your snippet. I haven't checked the various versions of the C standard; they may differ from each other and from C++ as well.

Update 2: But my point is that Clang certainly doesn't "use its intrinsic knowledge of the fact that it allocated it in non-volatile storage to optimize away the load".

You are actually correct; the code above generates this warning:

warning: expression result unused; assign into a variable to force a volatile load [-Wunused-volatile-lvalue]

Yes.

Yes.

@Amanieu

I’ve updated my comment; everything is more interesting.

Yes.

Not quite. Regular reads and writes are not volatile, so you'll need to use the volatile functions if that's what you want.

How do you figure? I can have a reference to a raw pointer, and the aliasing rules on the reference don't affect the raw pointer's aliasing rules at all. Similarly, I can have a newtype around a raw pointer, have it use volatile access under the hood, and none of that will be affected by the fact that accesses to the newtype are non-volatile.

Yes.

No, a Rust pointer is equivalent to a plain C pointer. A C compiler is not allowed to insert a spurious load if a pointer is never dereferenced (volatile dereferences don't count). This is because, in C, dereferencing a (non-volatile) pointer requires the pointer to point to a valid object in normal memory, otherwise undefined behavior happens. Since the compiler assumes undefined behavior never happens, it can assume that the pointer is safely dereferenceable. Rust pointers work the same way.

The only use for volatile is to access memory-mapped I/O, which is always unsafe since you're writing to an arbitrary memory location outside of Rust's control. You can create a safe wrapper around a raw pointer to allow safe access to a specific I/O memory location, but the actual access will still need to use unsafe code.
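
A minimal sketch of such a wrapper (the names here are made up, and the address handed to the constructor is entirely the caller's responsibility, hence the unsafe fn new):

use std::ptr;

// The names here are made up; `addr` is assumed to be a memory-mapped
// I/O register that the caller promises is valid and mapped.
struct MmioReg {
    addr: *mut u32,
}

impl MmioReg {
    // Unsafe: the caller guarantees `addr` really is a live, mapped register.
    unsafe fn new(addr: *mut u32) -> MmioReg {
        MmioReg { addr: addr }
    }
    // Once that promise has been made, the volatile accesses can be
    // exposed as safe methods.
    fn read(&self) -> u32 {
        unsafe { ptr::read_volatile(self.addr) }
    }
    fn write(&self, value: u32) {
        unsafe { ptr::write_volatile(self.addr, value) }
    }
}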

OK, let me make sure I understand:

Given this declaration of a C function:

uint64_t volatile *f();

Is this the proper Rust FFI declaration for this function?:

extern {
    fn f() -> *mut u64;
}

Can we assume that as long as Rust code never attempts to load or store from the result of f, the compiler will never attempt any loads or stores?
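
Just as a sketch of what I mean, not a confirmed answer: assuming the declaration above is right, I'd expect the call site to only ever touch the returned pointer through the volatile functions, something like:

use std::ptr;

extern "C" {
    // C: uint64_t volatile *f();
    fn f() -> *mut u64;
}

fn read_from_f() -> u64 {
    unsafe {
        let p = f();
        // Only touch *p through the volatile functions; never create a &u64
        // to it or dereference it directly.
        ptr::read_volatile(p)
    }
}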

Let’s say I want to allocate a buffer that Rust will never load or store from. Is there a way to do that? For example, consider:

struct Foo {
    buffer: std::cell::UnsafeCell<[u8; 64]>
}

impl Foo {
    fn new() -> Foo {
        Foo {
            buffer: std::cell::UnsafeCell::new([0u8; 64])
        }
    }
}

fn main() {
    let _ = Foo::new();
}

It seems like the compiler is allowed to insert spurious loads from, and stores to, the buffer. Is there any way to avoid that? In particular, I would like to pass this buffer to some non-Rust code so that it can use it for whatever it wants, but I want Rust to be responsible for the allocation. Is this possible? In C, we would use volatile uint8_t buffer[64].
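
To make the second part concrete, here is roughly what I have in mind, with a hypothetical foreign function consume_buffer standing in for the non-Rust code (this is only a sketch of the question, not something I know to be free of the spurious-access problem):

use std::cell::UnsafeCell;

extern "C" {
    // Hypothetical foreign function that uses the buffer however it likes.
    fn consume_buffer(buf: *mut u8, len: usize);
}

struct Foo {
    buffer: UnsafeCell<[u8; 64]>,
}

impl Foo {
    fn new() -> Foo {
        Foo { buffer: UnsafeCell::new([0u8; 64]) }
    }
    fn hand_to_foreign_code(&self) {
        // UnsafeCell::get returns *mut [u8; 64] without creating a reference
        // to the contents; the foreign code only ever sees a raw pointer.
        let p = self.buffer.get() as *mut u8;
        unsafe { consume_buffer(p, 64) };
    }
}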

Perhaps this is a better way of phrasing the second part: What is the proper FFI declaration for this?:

struct Foo {
    volatile uint8_t buffer[64];
};

#[repr(C)]
struct Foo {
    buffer: ???
}