Types as Contracts

unsafe fn is_ub_1012A() -> u32 {
    let mut x = 0;
    let n = (&x as *const u32 as usize);
    unsafe {
        // this is UB - `x` was never mutably borrowed,
        // but it is somehow aliased.
        *(n as *mut u32) = 1;
    }
    x
}

unsafe fn is_not_ub_1012B2() -> u32 {
    let mut x = 0;
    let n = (&x as *const u32 as usize);
    unsafe { 
        // this is not UB, but if we just CSE p with n, and
        // dead-code eliminate the casts, we get to
        // the first example, which is UB.
        let mut p = (&mut x as *mut u32 as usize);
        // (desugared `assume`)
        let eq = p == n;
        if eq {
            *(p as *mut u32) = 1;
        } else {
            std::intrinsics::unreachable(); // trigger UB
        }
    }
    x
}

I think 1012A/1012B example shows that we can't have all of the following in a single IR:

  1. If every mutable borrow of an address is only used locally, the value behind that address can't be modified except directly through that borrow (this is implied by the vague principle that "every mutation must happen through a mutable borrow").
  2. equal integers are equivalent - if 2 integer values are numerically equal at every potential dynamic execution, they can be interchanged while preserving semantics.
  3. DCE of integer equality.
  4. Removal of a conditional with 1 branch being immediate UB ("assume-based optimization") - other conditional optimizations work too.
  5. Dead code elimination of both int-to-ptr and ptr-to-int casts

I think (1) is an important (and as your examples show, it might be replaceable with many other aliasing rules), and (2) (3) and (4) are important "integer" optimizations I would like to keep, so (5) is the odd one out.

If we want 1 IR to do all the high-level optimizations, I don't see a good way of getting around this proof. We could have a "2 IR" semantics - say, non-constant branches can never be eliminated in MIR and are the "point of broadcast", while a lower IR can remove them. We're probably going to have a "2 IR" semantics anyway - a peephole optimizer doesn't have to lower a dead int -> ptr cast to anything - but we need an "higher level" IR for inlining, and I think we want it to have 1 through 4.