Blog post: observatonal equivalence and unsafe code


There’s an annoying clash of notation here, P ⊑ Q is read as “P is less deterministic than Q”, or “P is less defined than Q”, which rather annoyingly means it has more behaviour. In many models, P ⊑ Q iff ⟦P⟧ ⊇ ⟦Q⟧. Sigh. (If you want the history for this, it’s essentially the Smyth powerdomain which is the model of nondeterminism you get for observational refinement.)

1 Like

Good point. I vote we try to refer to this with the standard powerdomain notation, or it’ll be lifetime bound variance all over again…

I guess the way to think of it is that a safe Rust N is isomorphic to a function F: P -> B that maps programs to observed behavior, and thus is naturally contravariant over what types of programs are allowed.

(Of course, as Rust is Turing-complete, we can think of any number of whole-program transforms that allow Rust N to simulate Rust N + 1 , such as rewriting the program into continuation-passing style to avoid using the heap–but obviously that’s a non-local transformation and breaks all our guarantees about what’s allowed and what isn’t.)


You can restrict the power of a system language to make it easier prove the correctness of the code, keeping the performance reasonable enough to write file systems for a microkernel OS. As explained in this talk, “Refinement through Restraint: Bringing Down the Cost of Verification”:

I think it’s quite related to the tower of Rust0, Rust1, Rust2, etc, you write about.


You responded too fast. That was supposed to take longer.

So the major thing I have to turn over in my head in this “asymmetric” model is of course that unsafe code wants to assume limits on its context (e.g., my caller won’t be able to observe that I use setjmp/longjmp, because they are not poking about at my stack; or, my caller won’t use setjmp/longjmp themselves). I guess that in your post this is what you meant by part (B) here, right?


I’ll try to be more sluggish in the future.

Yes, when proving that some unsafe code appears safe, we would like to able to do this just quantifying over safe contexts, e.g. which can’t mess around with the call stack. We’d like to know that if P is apparently safe wrt safe contexts, then P is apparently safe to apparently safe contexts.


There’s another ingredient typically used when doing refinement – not only does the context have to be safe, it also has to use the “hole” (i.e., the unsafe code) at a particular type. The same piece of code can be safe at one type and unsafe at another type. The type, in some sense, describes the interface between the safe and the unsafe code, it says what the safe code is allowed to do. (This is btw. how I often describe the virtue of Rust to type system people – even though the type checker is fairly limited in what it can understand, and there’s lot of code you can’t write if you follow its rule, the type system is actually pretty expressive without being excessive in terms of number of concepts or so. It can concisely describe complicated contracts like “As you iterate over the Vec, you must not mutate it” or “After you release the Mutex, you must no longer use any pointers you acquires to its interior while you held it”. Then we can put those types at the holes of a safe context and fill the hole with unsafe code that we merely claim has this type; the type checker can check that the user follows the interface we write without having any clue about the subtle invariants it expresses. To me, that’s the true power of Rust.)

In all the papers I read on contextual refinement, P ⊑ Q means "Every behavior of P is also a behavior of Q" / “P has fewer behaviors than Q” (which is why P is “more refined”), so it corresponds to ⟦P⟧ ⊆ ⟦Q⟧ and the two inclusion actually match up (assuming those semantic brackets give the set of possible behaviors of the program). For example, see page 7 of this paper. A typical way to use this would be start from a “rough sketch” of the program you want, maybe just pre- and postconditions, and then you stepwise refine this until you got a program with the efficiency you want. (There’s people literally doing that in a proof assistant, with the proofs going all the way up to some assembly code.) Clearly, if you do this, you’d only want to remove behaviors and hence make sure that your final program actually satisfies the spec you started with. So, “more refined = fewer behaviors”.

However, I “grew up” in a world dominated by operational, small-step semantics. It is possible that domain theorists use an inverted notation.


Yes, observational refinement/equivalence are typed notions, in that you care about the type of the hole. Part of the challenge here is that unsafe code can bypass the type system, which is part of why unsafe contexts are more powerful than safe ones.

You can probably tell I grew up in a denotational world, I probably got the inverted ordering on refinements from the failures-divergences model of CSP, e.g. p. 5 of Brookes and Roscoe (1985).

closed #28

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.