Types as Contracts

arielb1 · October 16, 2017, 1:28pm

unsafe fn is_ub_1012A() -> u32 {
    let mut x = 0;
    let n = (&x as *const u32 as usize);
    unsafe {
        // this is UB - `x` was never mutably borrowed,
        // but it is somehow aliased.
        *(n as *mut u32) = 1;
    }
    x
}

unsafe fn is_not_ub_1012B2() -> u32 {
    let mut x = 0;
    let n = (&x as *const u32 as usize);
    unsafe { 
        // this is not UB, but if we just CSE p with n, and
        // dead-code eliminate the casts, we get to
        // the first example, which is UB.
        let mut p = (&mut x as *mut u32 as usize);
        // (desugared `assume`)
        let eq = p == n;
        if eq {
            *(p as *mut u32) = 1;
        } else {
            std::intrinsics::unreachable(); // trigger UB
        }
    }
    x
}

I think 1012A/1012B example shows that we can't have all of the following in a single IR:

If every mutable borrow of an address is only used locally, the value behind that address can't be modified except directly through that borrow (this is implied by the vague principle that "every mutation must happen through a mutable borrow").
equal integers are equivalent - if 2 integer values are numerically equal at every potential dynamic execution, they can be interchanged while preserving semantics.
DCE of integer equality.
Removal of a conditional with 1 branch being immediate UB ("assume-based optimization") - other conditional optimizations work too.
Dead code elimination of both int-to-ptr and ptr-to-int casts

I think (1) is an important (and as your examples show, it might be replaceable with many other aliasing rules), and (2) (3) and (4) are important "integer" optimizations I would like to keep, so (5) is the odd one out.

If we want 1 IR to do all the high-level optimizations, I don't see a good way of getting around this proof. We could have a "2 IR" semantics - say, non-constant branches can never be eliminated in MIR and are the "point of broadcast", while a lower IR can remove them. We're probably going to have a "2 IR" semantics anyway - a peephole optimizer doesn't have to lower a dead int -> ptr cast to anything - but we need an "higher level" IR for inlining, and I think we want it to have 1 through 4.

Topic		Replies	Views
Blog post: Thoughts on trusting types and unsafe code Unsafe Code Guidelines	46	6276	February 9, 2017
"Tootsie Pop" model for unsafe code language design	107	12421	June 15, 2016
Pointers Are Complicated II, or: We need better language specs	146	10614	January 17, 2021
Stacked Borrows: An Aliasing Model For Rust Unsafe Code Guidelines	48	8184	October 18, 2018
Add rustc flag to disable mutable no-aliasing optimizations? compiler	125	8533	April 10, 2021

Types as Contracts

Related topics