This issue uncovered the following very interesting method in bytes
:
#[inline]
fn kind(&self) -> usize {
// This function is going to probably raise some eyebrows. The function
// returns true if the buffer is stored inline. This is done by checking
// the least significant bit in the `arc` field.
//
// Now, you may notice that `arc` is an `AtomicPtr` and this is
// accessing it as a normal field without performing an atomic load...
//
// Again, the function only cares about the least significant bit, and
// this bit is set when `Inner` is created and never changed after that.
// All platforms have atomic "word" operations and won't randomly flip
// bits, so even without any explicit atomic operations, reading the
// flag will be correct.
//
// This function is very critical performance wise as it is called for
// every operation. Performing an atomic load would mess with the
// compiler's ability to optimize. Simple benchmarks show up to a 10%
// slowdown using a `Relaxed` atomic load on x86.
#[cfg(target_endian = "little")]
#[inline]
fn imp(arc: &AtomicPtr<Shared>) -> usize {
unsafe {
let p: &u8 = mem::transmute(arc);
(*p as usize) & KIND_MASK
}
}
#[cfg(target_endian = "big")]
#[inline]
fn imp(arc: &AtomicPtr<Shared>) -> usize {
unsafe {
let p: &usize = mem::transmute(arc);
*p & KIND_MASK
}
}
imp(&self.arc)
}
This got flagged by miri because of the transmute from an &UnsafeCell<T>
to an &T
(which is not okay, because the shared pointer expects to point to something frozen). But that’s not even the problem: The justification for the non-atomic access here just doesn’t hold. The C++11 model and also LLVM’s model say that a read-write race is either UB or yields undef
, but it certainly won’t give you correct values for some bits. LLVM’s undef
-on-race is explicitly happening on a byte-for-byte basis.
Any thoughts?
@carllerche One thing I am wondering, have you benchmarked using a relaxed load? EDIT: D’oh, you did, sorry for not reading properly. That is somewhat surprising, in particular on x86, but I guess LLVM is just too conservative around relaxed accesses.
Also, from the comment in your code, it is not clear to me whether you are saying that this is not UB, or whether you are saying that this is UB but does not cause trouble in practice. I am convinced that, if anything, it is the latter – and the comment should clearly say so.