Blog post: observational equivalence and unsafe code


I wrote this blog post:

Here’s just a few paragraphs (snipped somewhat out of order) that try to summarize what it’s about:

But I want to bring this back to the core point: the focus in the unsafe code guidelines has been on exploring what unsafe code can do “in the small”. Basically, what types it ought to use to achieve certain kinds of aliasing and so forth. But I think it’s also very important to nail down what unsafe code can do “in the large”. How do we know whether (say) abomonation, deque, and so forth represent legal libraries?

The TL;DR of this post is that I think we can limit the capabilities of unsafe code to be “things you could have written using the safe code plus a core set of unsafe abstractions” (ignoring the fact that the safe implementation would be unusably slow or consume ridiculous amounts of memory). This is a helpful and important thing to be able to nail down.


Rust3a: converting from &mut T to &Cell<T>

It seems as fundamental as what Rayon does.
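For concreteness, here is a sketch of the conversion this "Rust3a" level would permit (the standard library now exposes exactly this as Cell::from_mut; the free function below and its name are my own):

```rust
use std::cell::Cell;

// Sketch of the &mut T -> &Cell<T> conversion; std provides this
// as Cell::from_mut, but this standalone version shows the idea.
fn as_cell<T>(t: &mut T) -> &Cell<T> {
    // SAFETY: Cell<T> has the same in-memory representation as T,
    // and the unique &mut T guarantees there are no other aliases
    // for the lifetime of the returned shared reference.
    unsafe { &*(t as *mut T as *const Cell<T>) }
}

fn main() {
    let mut x = 1;
    {
        let c = as_cell(&mut x);
        c.set(2); // shared mutation through &Cell<T>
    }
    assert_eq!(x, 2);
}
```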


Indeed, these are some very good questions – thanks for bringing them up. I thought about some of this myself, while trying to find a formal definition for what the Rust types “mean”. In some sense, this is approaching the problem from the other end: Once I find a definition I am happy with (that’s still work in progress), it will carve out a precise subset of “things unsafe code can do” – namely, all those things which we can still prove fall into the “meaning” of types, and nothing else. I will try hard to make it support at least what you called “Rust5” plus at least some forms of abomonation. Of course this is just a first proposal, and it may turn out later that there are things considered “safe” by the community which my model rules out.

On another note, if I understood some recent discussion about async/await properly, then this is incompatible with scoped threads. That would then be another example of two extensions to what is deemed “safe” (i.e., neither of them can be simulated with just safe Rust5) that are mutually exclusive.

One thing confused me on first reading (it became clear only when finishing the post): You start out explaining that setjmp/longjmp and Rayon are mutually exclusive, but that these are qualitatively different extensions: setjmp/longjmp grants new powers, but Rayon does not. If this were true, then setjmp/longjmp would have to be unsafe even without Rayon; after all, Rayon would not constitute an “extension” and hence it can’t possibly break other unsafe extensions. You explain later that Rayon actually is an extension, but I still think it is confusing because it actually says the opposite of what is the entire point of the post: If an unsafe library does not grant new powers on top of “base Rust” (e.g., Rust5), then it cannot possibly be mutually incompatible with other unsafe libraries; and hence only libraries that truly grant new powers have to be careful about this point.

Btw, I assume Mutex and RwLock are also in Rust5?


@RalfJung Actually async/await is compatible with scoped threads:

  • If you are using stackless coroutines (i.e., just a struct with a state machine) then this isn’t an issue because crossbeam::scope runs in a lambda, which you can’t await from (you can only await in the top-level async function).

  • If you are using stackful coroutines then all you need to do is make sure you unwind the coroutine’s stack before you destroy it. This will run the scope destructor, which will in turn block until scoped threads have finished running.


A very nice and well-structured post, as always. Small nit: You’re a word in “Not quite how to think about that yet”.

The ‘language levels’ description seems to be very relevant to my electrolysis project, which is all about reasoning about safe code while replacing unsafe code with (hopefully) observationally equivalent code.

The only unsafe parts I encountered when verifying [T]::binary_search were mem::swap and [T]::len/index. I’d be curious at what level of the hierarchy you would place those. [T] is, of course, already special in itself, but replacing it with a trait containing only len and index_mut should enable safe Rust to provide observationally equivalent impls of it. Because that trait would basically describe any type of sequential list, I’ve axiomatized it as such.
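As a sketch of that abstraction (the trait and method names below are my own invention):

```rust
// A hypothetical trait describing "any sequential list",
// abstracting over [T]'s len/index as suggested above.
trait Seq<T> {
    fn len(&self) -> usize;
    fn index(&self, i: usize) -> &T;
    fn index_mut(&mut self, i: usize) -> &mut T;
}

// Safe Rust can then provide an observationally equivalent impl,
// e.g. backed by a Vec (or any other list-like structure).
impl<T> Seq<T> for Vec<T> {
    fn len(&self) -> usize { Vec::len(self) }
    fn index(&self, i: usize) -> &T { &self[i] }
    fn index_mut(&mut self, i: usize) -> &mut T { &mut self[i] }
}

fn main() {
    let mut v = vec![10, 20, 30];
    assert_eq!(Seq::len(&v), 3);
    *Seq::index_mut(&mut v, 1) = 25;
    assert_eq!(*Seq::index(&v, 1), 25);
}
```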

mem::swap is probably more interesting because it feels like we could just make it a trait fn and auto-implement it for any safe type, which would work except for anything containing &mut. It would probably be easiest to declare it an intrinsic just like most other fns in mem. Because I don’t have to worry about affine types on the theorem prover side (and currently assume type parameters not to be instantiated by anything containing &mut), I can have a straightforward implementation.
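A sketch of what that auto-implemented trait might look like: trivial for Copy leaves, derived field by field for aggregates, and breaking down exactly where a field is a &mut (the trait and method names are hypothetical):

```rust
// Hypothetical structural swap: primitive at the leaves, recursing
// through aggregates. Not derivable for types holding &mut.
trait Swap {
    fn swap_with(&mut self, other: &mut Self);
}

impl Swap for u32 {
    fn swap_with(&mut self, other: &mut Self) {
        let tmp = *self;
        *self = *other;
        *other = tmp;
    }
}

// The compiler could auto-derive instances like this one
// for any aggregate of swappable fields.
impl<A: Swap, B: Swap> Swap for (A, B) {
    fn swap_with(&mut self, other: &mut Self) {
        self.0.swap_with(&mut other.0);
        self.1.swap_with(&mut other.1);
    }
}

fn main() {
    let (mut a, mut b) = ((1u32, 2u32), (3u32, 4u32));
    a.swap_with(&mut b);
    assert_eq!(a, (3, 4));
    assert_eq!(b, (1, 2));
}
```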

By now, I’ve moved up to Rust1, completely axiomatizing Vec - again, as a simple inductive list. Which makes me curious about your claim about simulating Vec just via Box. By using the latter to create a singly-linked list type, and given the abstracted [T] trait I’ve described above, shouldn’t contiguous and non-contiguous lists be observationally equivalent to safe Rust (except for pointer comparisons, which I assume would break many other supposed equivalences)?
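For instance, a minimal sketch of that Box-based singly-linked list (Rust0 plus heap allocation only, no unsafe code):

```rust
// A safe, non-contiguous list built from Box alone.
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}

impl<T> List<T> {
    fn len(&self) -> usize {
        match self {
            List::Nil => 0,
            List::Cons(_, rest) => 1 + rest.len(),
        }
    }

    fn index(&self, i: usize) -> &T {
        match self {
            List::Nil => panic!("index out of bounds"),
            List::Cons(x, rest) => {
                if i == 0 { x } else { rest.index(i - 1) }
            }
        }
    }

    fn push_front(self, x: T) -> List<T> {
        List::Cons(x, Box::new(self))
    }
}

fn main() {
    let l = List::Nil.push_front(3).push_front(2).push_front(1);
    assert_eq!(l.len(), 3);
    assert_eq!(*l.index(2), 3);
}
```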

Moving up to Rust2 would be as easy as assuming Rc<T> = Arc<T> = T for me, which, again assuming T does not contain any &mut, should be a safe axiomatization. There are of course a few fns revealing information about reference counts that I couldn’t implement, but the same would be true for any hypothetical safe Rust implementation.

Rust3 (non-atomic mutation) is probably right out of the window for me because the no-aliasing &mut guarantees are the whole premise of my project. Rust4 (asynchronous threading) on the other hand shouldn’t be a problem for me exactly because of that premise. And even if I weaken the premise at some point (by, say, optionally introducing a heap managed by separation logic), I will certainly not touch Rust5 (communication between threads and processes) any time soon because that is one of the hardest problems known to formal verification.


Someone posted a good example of this issue in pure Rust somewhere recently, but I can’t even remember which venue it was posted in. Maybe someone else will know. In ‘safe’ code, with one of the coroutine libraries (probably coroutine-rs) and crossbeam, you can trivially introduce memory unsafety by using a coroutine yield to bypass the scoped thread’s destructor scope.

The problem is that there’s a trade-off between different language extensions, which are safe in the presence of the base language (RustN for some N) but not safe in the presence of each other. I think asking whether they can be observationally emulated by lower levels is a good heuristic for distinguishing between these extensions.


@withoutboats I posted this example, maybe that’s what you are referring to? The solution is simple: libfringe needs to unwind the stack of the coroutine when it is dropped (PR). It’s just a bug in the current implementation of coroutines and not a language-level issue.


(Note: the current signature of join would probably prohibit this, since it does not reflect the fact that the first closure is known to execute in the original thread, and hence requires that it close over only sendable data, but I’ve contemplated changing that.)

It’s off-topic, but just a reminder – the first closure doesn’t execute in the original thread if you’re calling join from outside the thread pool (as we discussed in rayon#20).


One concern I have with this article is that I feel like it doesn’t explicitly state that these rules only apply to abstraction boundaries between safe and unsafe code. That is, I should hope that setjmp / longjmp are usable within an unsafe function (since the caller is then responsible for making sure that it doesn’t do anything like what Rayon does).

I also think you are missing some useful unsafe abstractions. Page table remapping can allow two physical addresses to point to the same memory, an important optimization in some new in-memory databases (breaks various sorts of optimizations that depend on exact address comparison, among other things). Nontemporal streaming loads and stores, as well as burst mode writes, have extremely weird consistency semantics and are probably also not replicable in safe code. Certain types of DMA likely enforce new restrictions on unsafe code. And so on.

I recall that mem::swap also adds a new capability (the ability to call drop on a type T when you only have &mut T, possibly? It was a while ago). Not sure if something else in your article already implies it (it wouldn’t surprise me if Vec can emulate swap).

I also think there are a handful of hardware features that can increase the amount you can safely abstract. In particular, on hardware with known cycle-accurate timing and/or direct-mapped scratchpads, you can safely interleave instructions across hardware threads on the same data without the possibility of data races (especially if you can know that you’re the only process pinned to a core – something else that it is currently not possible to express in safe Rust, to my knowledge).

To be clear, I’m not convinced all of these can be wrapped in a completely safe abstraction (at least, not a process-level abstraction), but I definitely think they are useful things to be able to do in Rust.


I’m not sure it makes sense to talk about these things as “levels” - Cell does not require heap allocation for example. They are separate “features” which go above and beyond Rust0, and in some cases they are related. It could be the case that one feature builds upon another, but it could equally be the case that two features are mutually exclusive, and that you can only choose one.

For example, scoped_thread_local, as it was implemented, was unsafe in the presence of coroutines, and yet both of those things are “valid” extensions to Rust0 on their own. To avoid an explosion of marker traits like Send, Rust will either have to become opinionated (via the unsafe code guidelines) about which of these exclusive features are valid, or else have it be configured at compile time and accept that you can’t combine two libraries relying on incompatible features.


A minor nitpick: rather than observational equivalence, I think we might want to work with observational refinement. This is defined in a very similar way to observational equivalence: P ⊑ Q whenever, for any context C, any observable behaviour of C[Q] is an observable behaviour of C[P].

The reason for being interested in observational refinement is that there are some optimizations that are only valid in one direction: for example, replacing let x = a.load(Relaxed); let y = a.load(Relaxed); with let x = a.load(Relaxed); let y = x; is a perfectly valid optimization, as it only removes possible behaviour.
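In Rust terms, that transformation looks roughly like this; every outcome of after is a possible outcome of before, but not vice versa, so the two are related by refinement rather than equivalence:

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

// Before the optimization: the two relaxed loads may observe
// different values if another thread stores in between.
fn before(a: &AtomicUsize) -> (usize, usize) {
    let x = a.load(Relaxed);
    let y = a.load(Relaxed);
    (x, y)
}

// After the optimization: x == y is now forced, removing some of
// the behaviours `before` could exhibit under concurrent stores.
fn after(a: &AtomicUsize) -> (usize, usize) {
    let x = a.load(Relaxed);
    let y = x;
    (x, y)
}

fn main() {
    let a = AtomicUsize::new(7);
    let (x, y) = after(&a);
    assert_eq!((x, y), (7, 7));
    // Single-threaded, so `before` happens to agree here too.
    assert_eq!(before(&a), (7, 7));
}
```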

A possible way to go would be to say P appears safe if there is some safe Q such that Q ⊑ P. This depends on language level in two ways: a) we are asking Q to be safe, and b) when we are quantifying over C, we might want to restrict ourselves to safe contexts.

One nice property of this defn is that (er I think) it’s compositional: if P appears safe and Q appears safe, then P∥Q appears safe.

Also, if we define CHAOS to be the most nondeterministic safe program, then any safe Q has CHAOS ⊑ Q, which means that P appears safe iff CHAOS ⊑ P.
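Collecting those definitions in one place (writing beh(·) for the set of observable behaviours of a program in a context):

```latex
\begin{aligned}
P \sqsubseteq Q &\iff \forall C.\; \mathrm{beh}(C[Q]) \subseteq \mathrm{beh}(C[P]) \\
P \text{ appears safe} &\iff \exists \text{ safe } Q.\; Q \sqsubseteq P \iff \mathit{CHAOS} \sqsubseteq P
\end{aligned}
```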


I feel like the details matter here a lot. Do you have a link to said discussion, or can you say more about what made async/await incompatible? Maybe you are referring to this Reddit thread talking about coroutines? (which I also linked to from the post)

In any case, what @Amanieu wrote seems correct:

This seems right. You can’t free the stack frames until you unwind, though you could leak them forever.


This confuses me a bit. [T] is not “unsafe” Rust, right? I guess you mean the various methods that exist (like len()) on [T], which are implemented unsafely? You’re right that I overlooked those, I considered them part of Rust0 I suppose. =)

Yes, it seems like there are a small number of such functions (similar to len) that are probably a better Rust1 than “access to the heap”.

Well, my point was that you can (in Rust0 – with no unsafe code at all) create a &[T] and dereference into it, like so:

fn foo() {
    let x: [i32; 5] = [0, 1, 2, 3, 4];
    bar(&x); // &[i32; 5] coerces to &[i32] – no unsafe code involved
}

fn bar(x: &[i32]) {
    println!("{}", x[2]); // built-in bounds checking and indexing
}
Doing this coercion requires implementing the Unsize trait, but that is not unsafe to do. The actual code for performing said coercion is built into the core Rust language (Rust0), as is the code for bounds checking and indexing. So it seems to me that you really do have to provide adjacent memory, at least if you want to mirror the current setup. But it seems to me to be a minor point – and I could imagine some of this code moving out of the compiler, if we ever found a nice way to do it.

I’m confused by a few things here. First, why can you assume T does not contain &mut? Also, how do you address the point I raised in my post: that Rc<T> can be applied to any T, not just T: Clone, and that this cannot be implemented in general without true sharing?

Maybe I don’t really understand your project though =). I guess it is trying to axiomatize a subset of Rust, more than the full Rust language?


Heh, I had forgotten that (again), but yes. =) Still, you could (in principle) change the signature, if you wanted. It’s just probably a bad idea.


Are you saying that this is a problem with my post? I think this is precisely what I was trying to get at. I also think that in drafting unsafe code guidelines, we are going to have to make some choices. I don’t yet know how to deal with irreconcilable unsafe code (e.g., setjmp vs rayon). I can imagine that in some projects one or the other might be essential, so it’s not entirely clear that we can pick a winner (and, as I pointed out, some constraints might not be relevant in all contexts).

But perhaps drilling more into these areas can give us a framework by which unsafe code can declare its compatibility requirements, so we can at least tell you when two extensions are incompatible. And then we might have a kind of “core set” of unsafe abstractions that programs opt into by default – if you are incompatible with this level, that will narrow the set of people that can use you.

This seems to me to be not unlike libstd or any other similar portability hazard. I feel like I keep coming back to this same basic approach in a number of contexts.


Ah yes, I actually meant to write about that specific point, but I guess I forgot to do so. =) Perhaps in a follow-up.

Indeed, I made no claim to completeness. =) Those are good examples though. It seems obvious that we won’t want to include every random memory model thing that hardware can produce though.

It seems like here you are talking about tricks unsafe code could play without fear of “detection” by safe code? Statements like this last one feel a bit ambiguous to me. =)

In any case, diving into some of the more esoteric frontiers of what hardware has to offer like this seems interesting to me. I was not aware of a number of these extensions you mention. Certainly my biggest concern with trying to enumerate what unsafe code can do is that we limit ourselves – and yet if we don’t enumerate what unsafe code can do, then of course we are also limiting ourselves.

Having some notion that we will describe ways for unsafe code to declare its compatibility does seem to offer an escape valve.

It’s not clear to me how urgent this problem is, but then again it has already arisen (in the form of coroutines). Perhaps a reasonable starting point would be to try and enumerate the various capabilities we know of across libstd and unsafe code, and encourage people to think about which of those capabilities would be sufficient to model the unsafe code in question, even if this is just documentation to start. This would be a starting point at least.


Agreed, it’s more of a DAG.


I think I was really just re-stating your post more than anything else. :slight_smile:


[quote=“nikomatsakis, post:13, topic:4148”]

This confuses me a bit. [T] is not “unsafe” Rust, right?[/quote] I’d argue it’s still special in that there’s no safe way to construct a &[T] (except for the array coercion, I forgot about that). If we replaced it with the mentioned trait, we could safely implement [T] for [T; $N] in Rust0 and for Vec<T> in Rust1, with trait object coercion replacing the builtin array/Vec’s Deref coercion. This might not be very useful (even if safe Rust cannot distinguish it from contiguous memory, we’d still like to have that assumption for unsafe code) and, as you said, a minor point, it’s just something I was thinking about to make Rust0 smaller.

Yeah, I’m focusing on a reasonable subset of safe Rust that actually allows me to verify a few programs in 6 months’ time. If I ever have the chance in the future, I’d love to do something like seL4’s AutoCorres project, starting at a full semantics and memory model of Rust (as soon as we have those things :slight_smile:), then doing verified transformations to a simpler and simpler language. But for this masters’ thesis, I’ll just start at the topmost abstraction level and see how far I can get without an explicit heap.

I’m not sure why I mentioned &mut here. As long as T doesn’t contain any UnsafeCell (which it won’t, since there’s no way my formalization could axiomatize that one), I can assume it’s immutable inside Rc/Arc and axiomatize Rc::clone as just returning self - there are no affine types in my target language Lean.
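In Lean, that axiomatization can be sketched as follows (a simplification; the definitions and names here are illustrative, not my actual formalization):

```lean
-- Rc T is axiomatized as just T itself…
def Rc (T : Type) : Type := T

-- …so clone simply returns self. This is sound only because the
-- target logic has no affine types and T contains no UnsafeCell.
def Rc.clone {T : Type} (self : Rc T) : Rc T := self
```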

All in all, my project probably isn’t too relevant to the Rust semantics discussions since it only touches the parts of the semantics everyone is already agreeing on. Still, I wanted to show that there basically exists an implementation of the language levels design, so (to me) it’s not a purely theoretic discussion (though still academic :slight_smile:).


I have more questions, but let me start here to buy myself time to think. =) Do you have this backwards: “any observable behavior of C[Q] is an observable behavior of C[P]”? I am just surprised because the P ⊑ Q notation suggests “P subset of Q” to me, but it seems that the behavior of Q is in fact a subset of P?