Pre-RFC: core::ptr::simulate_realloc

Oh 100% useful, I mean it looks like we've reached a consensus on what the lowest level helper should look like and the discussions are just over things on top of it specific to use-cases. Very relevant ofc, just not a blocker per se.

Okay, so would a compiler fence work to prevent these moves of atomics? An actual CPU fence instruction is of course not ideal as that is not actually needed.
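For concreteness, here is a minimal sketch of what "a compiler fence between the two accesses" would look like. The helper name `store_then_load` is made up for illustration; `compiler_fence` is the real function from `std::sync::atomic`. It emits no CPU instruction and only restricts how the compiler may reorder memory accesses across it. Whether that is actually sufficient for mirrored mappings is exactly the open question here, so treat this as the shape of the idea, not a verified fix.

```rust
use std::sync::atomic::{compiler_fence, AtomicU32, Ordering};

/// Hypothetical helper: write to one atomic, then read another,
/// with a compiler-only fence between the two accesses.
pub fn store_then_load(a: &AtomicU32, b: &AtomicU32, value: u32) -> u32 {
    a.store(value, Ordering::Relaxed);
    // No hardware fence is emitted; this only constrains compiler reordering
    // of the memory accesses before and after this point.
    compiler_fence(Ordering::SeqCst);
    b.load(Ordering::Relaxed)
}
```

In a single thread this behaves like a plain store followed by a plain load; the fence only matters for what the optimizer is allowed to move.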

Another possibility would be an empty inline assembly block (taking the relevant pointers as inputs), which should also act as a black box and prevent code movement.
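A rough sketch of the inline assembly variant, assuming a target where `asm!` is available (e.g. x86_64 or aarch64). The function name `store_then_load_asm` is hypothetical. The template contains only an assembler comment, so no instruction is emitted, but because the block is not marked `nomem`, the compiler has to assume it may read or write memory and so treats it as opaque.

```rust
use std::arch::asm;
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: same store-then-load shape, but separated by an
/// empty asm block that takes both pointers as inputs.
pub fn store_then_load_asm(a: &AtomicU32, b: &AtomicU32, value: u32) -> u32 {
    a.store(value, Ordering::Relaxed);
    unsafe {
        // The template is only a comment that mentions both operands.
        // Without `nomem`, the compiler must assume memory may be touched.
        asm!(
            "/* {0} {1} */",
            in(reg) a as *const AtomicU32,
            in(reg) b as *const AtomicU32,
            options(nostack, preserves_flags),
        );
    }
    b.load(Ordering::Relaxed)
}
```

As with the compiler fence, this only describes what the optimizer sees; it says nothing about whether the language semantics bless mirrored mappings.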

If you want synchronized sequencing between an atomic write and an atomic read of a different location, you need SeqCst ordering to establish that. Whether a compiler fence is enough is unclear. My current understanding is that a compiler fence works like an atomic fence, but only for code running on the same physical core (e.g. an interrupt handler); if that is right, it could potentially be sufficient.

Roughly, you'd model it like this: a write to a mirrored memory location spawns a new logical Abstract Machine (AM) thread that mirrors the write to every mirrored location with the same atomicity, except that the mirrored writes must occur immediately after the originating write in the global SeqCst ordering if the originating write is sequenced in that ordering. (I.e. they happen on the same "physical core", so fences apply, and they are sequenced-after the originating write. Tying the SeqCst ordering together is more difficult, and the existing model has no way to give multiple writes the same sequence point, which is what is actually desired.)

I don't know whether compiler fences are actually considered strong enough for that model to be fully respected, but it seems close enough. I do still believe SeqCst is required to establish the ordering between different memory locations, even if I can't conceptualize how AcqRel and SeqCst would practically differ.
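On how the two orderings differ in practice: the usual illustration is the store-buffering litmus test, sketched below with ordinary (non-mirrored) atomics. Each thread writes one location and then reads the other. With SeqCst on all four operations, both threads reading 0 is forbidden, because the operations must fit into one total order. With Release stores and Acquire loads instead, both reading 0 is an allowed outcome. The function name `store_buffering_once` is made up for this sketch.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Hypothetical helper: run one round of the store-buffering litmus test
/// and return what each thread observed in the other thread's location.
pub fn store_buffering_once() -> (u32, u32) {
    let x = AtomicU32::new(0);
    let y = AtomicU32::new(0);
    thread::scope(|s| {
        // Thread 1: write x, then read y.
        let t1 = s.spawn(|| {
            x.store(1, Ordering::SeqCst);
            y.load(Ordering::SeqCst)
        });
        // Thread 2: write y, then read x.
        let t2 = s.spawn(|| {
            y.store(1, Ordering::SeqCst);
            x.load(Ordering::SeqCst)
        });
        (t1.join().unwrap(), t2.join().unwrap())
    })
}
```

With SeqCst, at least one of the two returned values is always 1. Swapping in `Ordering::Release` for the stores and `Ordering::Acquire` for the loads removes that guarantee, which is the write-then-read-a-different-location case described above.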

Side note: This strategy bans global program analysis optimizations from programs that use IPC shared memory. Something to remember for everyone proposing one of those.

Some global analyses are still possible, but yeah, a global program analysis has to model mmap and similar operations accurately.

I have now posted the rewritten RFC at the previously linked URL - here. Feel free to continue the synchronisation discussion in this thread, of course, if anyone wants to.

Considering that, aren't memory-mapped circular buffers UB in any C/C++ code? Or is it okay for those languages to contain UB like that?

They are at least highly questionable. All the concerns about optimizations breaking this code apply, so I would not consider such code correct in C/C++ either unless the compiler documents some support for this pattern.