Include racy reads in Rust memory model with `MaybeInvalid<T>`

Are the proofs in question publicly available? It would be interesting to look at them to see whether the new operation would invalidate any of the existing steps in the proof. (I had intended to prove it to be a sound extension to the memory model by quantifying over all possible implementations and showing that none could be affected, but it would probably be easier to check the existing proofs to make sure that they still work.)

https://plv.mpi-sws.org/scfix/paper.pdf is the paper with a model that comes closest to the C++/Rust memory model. Strangely they don't seem to have a link to the actual proofs, I will ask the authors about that.

1 Like

Ah I found it, it was just a bit hidden: Repairing Sequential Consistency in C/C++11. The "full paper with appendix" has the proofs.

3 Likes

So it turns out that this paper already proves a model in which racing an atomic read against a non-atomic write produces an undefined value rather than undefined behaviour, in a very trivial way – it makes no distinction between relaxed-atomic writes and non-atomic writes at all (they compile into the same code on every system it mentions, and none of the proofs treat non-atomic writes and relaxed writes any differently). It does make a distinction between non-atomic and relaxed-atomic reads for performance purposes, but the distinction is specific to reads (the proof basically works by proving that any cycle that they want to disallow contains a raced non-atomic read, and thus would be undefined behavior under C11 – but the fact that raced non-atomic writes are also undefined behaviour under C11 is not involved in the proof).

As such, under this memory model, seqlocks should be possible to implement simply by using a relaxed-atomic read in the potentially racy step (a byte at a time for opsem purposes, but optimisable into reading all the bytes at once), because the write that it reads from can be interpreted as though it were a relaxed-atomic write (even if it were non-atomic in the source code, it would have been compiled as a relaxed-atomic write because the memory model makes no distinction between the two cases). In the situation where no race occurred, it will read the correct value; and in the situation where there was a race, it will read some arbitrary value (without violating any assumptions of the memory model proofs) which is then ignored by the program, so the actual value read doesn't matter.

That said, the paper only proves the correctness of its own compilation scheme (together with that of a few simple optimisations). Atomics have two reasons to exist (ensuring that the hardware doesn't break the semantics of the program and ensuring that compiler optimsiations don't break the semantics of the program), so in addition to showing that the memory model allows an atomic read to race against a non-atomic write (like the paper does), you would also have to show that the compiler also allows this. Fortunately, LLVM is intended to do so (its documentation specifically states that a raced read reads undef rather than producing undefined behavior), so this is another problem which turns out to already have been solved in advance.

3 Likes

Reading arbitrary values can be UB if the values have validity invariants, such as bool. The model would have to promise only ever reading a value that has ever been written. Or seqlocks would have to be limited to MaybeUninit or perhaps zerocopy::FromBytes types.

At the risk of rehashing one of our previous discussions, I believe that inline assembly is a perfectly valid escape hatch for SeqLock. In other words, something like this could be considered sound in the context of SeqLock:

unsafe fn racy_read<T>(src_ptr: *const T) -> MaybeUninit<T> {
    let mut dst: MaybeUninit<T> = MaybeUninit::uninit();
    unsafe {
        core::arch::asm!(
            "<read T from src_ptr into dst>",
            in(reg) src_ptr,
            in(reg) size_of::<T>(),
            in(reg) &mut dst,
            // Ideally, this also would use `readonly` and `pure`,
            // but it's impossible to return a generic `T` from `asm!`
            options(preserves_flags),
        );
    }
    dst
}

Yes, we can not properly describe what the block does in terms of AM, but we can say that it reads something into dst and the resulting data can be interpreted as T (by calling assume_init) only if the SeqLock validity check has passed. The write probably also should be accompanied by an (empty) asm block to "observe" the write.

Validity of this reasoning relies on "external" knowledge of knowing that the assembly block does on the given CPU arch, i.e. we rely on racy reads being well defined on the assembly level.

Whether such operation should be or could be exposed on the language level is a separate question.

1 Like

The idea is that you read into a MaybeUninit (using a special command designed to do possibly racy reads, that contains any necessary synchronization) and only assume_init it once you determine that the read is unraced.

Doing this is technically UB according to C11, but nothing seems to actually rely on the UB (the proofs about the memory model don't rely on it and the compiler doesn't exploit it either), so it would be possible to define Rust to make it not UB without needing any actual changes to anything other than documentation. This thread is, from my point of view, discussing whether it should continue to be UB or whether we should change the rules.

Storing the value as initialised (rather than MaybeUninit) without checking to see if the read has been raced on would be UB even under the current compiler, pretty much for the reasons you mention.

3 Likes