I would describe it as something like a CAS version of a SIMD masked load on a u8 vector (I think we only need byte-level precision) -- we never even load the other bytes, so if they are uninit it doesn't matter. That avoids having to talk about freeze.
However, even then the implementation as a loop does not have the same liveness properties as the AM semantics: if one thread tries to do such a CAS while another thread constantly does atomic writes with random values to the same memory, then in the AM the first thread would be guaranteed to terminate. However, a loop-based implementation (even if we only introduce the loop in machine IR, entirely avoiding all optimization-related questions) could make the first thread loop forever.
IOW, this pseudocode would always terminate under AM semantics (assuming a fair scheduler), but the produced binary could fail to terminate:
    static X;
    static FLAG;
    thread::scope(|s| {
        s.spawn(|| {
            do_a_masked_cas(&X);
            set_flag(&FLAG);
        });
        s.spawn(|| {
            while !get_flag(&FLAG) {
                write_random(&X);
            }
        });
    });
So no, this does not solve the liveness concerns.
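For concreteness, a loop-based lowering of such a masked CAS could look roughly like the sketch below. The name `masked_cas` and its mask convention (0xFF in a byte position means "compare this byte") are made up for illustration. It works on a fully initialized `AtomicU32`, so it only models the control flow, not the uninit aspect, and it assumes that ignored bytes are left unchanged on success.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: compare-and-swap only the bytes of `x` selected by `mask`.
/// Returns `Ok(previous)` on success, or `Err(observed)` if a compared byte differed.
pub fn masked_cas(x: &AtomicU32, mask: u32, expected: u32, new: u32) -> Result<u32, u32> {
    let mut cur = x.load(Ordering::Relaxed);
    loop {
        // Only the bytes selected by `mask` take part in the comparison.
        if (cur ^ expected) & mask != 0 {
            return Err(cur);
        }
        // Assumption of this sketch: ignored bytes keep their current value.
        let desired = (cur & !mask) | (new & mask);
        match x.compare_exchange_weak(cur, desired, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(prev) => return Ok(prev),
            // Some byte changed in the meantime (possibly an ignored one), or the
            // weak CAS failed spuriously: retry. A concurrent writer can keep
            // forcing this branch, which is the liveness problem.
            Err(actual) => cur = actual,
        }
    }
}
```

The `Err(actual)` arm is where the implementation diverges from a single atomic step: a thread that keeps writing to the same location, even if it only ever changes ignored bytes, can make the full-width CAS fail on every iteration.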
Also, a static mask indicating which bytes to compare and which bytes to ignore would not suffice to permit CAS on enums with fields, as for them the padding mask depends on the enum discriminant.
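A small sketch of why the mask depends on the discriminant, using a `#[repr(u8)]` enum so that the layout is defined. The type `E` and the helper `compare_mask` are hypothetical, and the byte positions assume a typical target where `u32` has alignment 4.

```rust
/// With `#[repr(u8)]`, the enum is laid out like a union of `#[repr(C)]`
/// structs, each starting with a `u8` tag.
#[repr(u8)]
pub enum E {
    A(u8),  // tag at byte 0, payload at byte 1, bytes 2..8 are padding
    B(u32), // tag at byte 0, bytes 1..4 are padding, payload at bytes 4..8
}

pub const E_SIZE: usize = std::mem::size_of::<E>();

/// Hypothetical: which bytes a masked CAS would have to compare (0xFF) or
/// ignore (0x00), given the tag byte. Assumes the 8-byte layout described above.
pub fn compare_mask(tag: u8) -> [u8; 8] {
    match tag {
        0 => [0xFF, 0xFF, 0, 0, 0, 0, 0, 0],
        1 => [0xFF, 0, 0, 0, 0xFF, 0xFF, 0xFF, 0xFF],
        _ => panic!("invalid tag"),
    }
}

/// Byte offset of the payload inside `e`, measured at run time.
pub fn payload_offset(e: &E) -> usize {
    let base = e as *const E as usize;
    match e {
        E::A(x) => x as *const u8 as usize - base,
        E::B(x) => x as *const u32 as usize - base,
    }
}
```

Byte 1 is data for `A` but padding for `B`, and bytes 4..8 are the other way around, so no single fixed mask is right for both variants.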