Stacked Borrows: An Aliasing Model For Rust

Hmm, that makes sense.

Some other notes / nitpicks:


Given a function argument of type &T, I think your model already requires that the argument is a valid pointer (aligned, not trapping, not null), even if the function body never dereferences it: otherwise, the retagging and barrier operations wouldn’t make sense. Thus, rustc is justified in applying LLVM dereferenceable, as it currently does. However, it might be worth stating this more explicitly.

On the other hand, given a function argument of type &&T, if the function body never dereferences it, the inner pointer could be invalid. So if LLVM hypothetically offered a “deferenceable N times” attribute, rustc couldn’t use it.

De-reference-ification optimization

Another hypothetical optimization I’ve talked about in the past is: among function arguments that are immutable references to word-sized values (e.g. &usize, &&T), automatically detect cases where the function implementation doesn’t care about pointer identity for that argument, and transform the ABI-level signature to take the value directly (usize, &T). Of course, it isn’t always possible to statically determine whether the function cares about identity. But in many cases it is possible; in particular, if the function body passes the reference as an argument to another function, rustc can propagate the “doesn’t care” flag from the callee to the caller.

If &&T can point to an invalid pointer, there would be some restrictions on that optimization:

  • It would still be okay to do the transformation to pass &T directly, but the new argument couldn’t be marked dereferenceable.
  • If T is itself word-sized, it would not be okay to do the transformation twice and pass T by value.
  • On the other hand, the compiler could assume validity if it additionally detected that the argument is unconditionally dereferenced (again by the function body or its callees). But that isn’t always the case; e.g. if the function has an early exit (perhaps using the ? operator) before it happens to touch that argument, it’s not an unconditional dereference.

Probably not the end of the world.

Unsized types

You mentioned "mem::size_of<T>() many bytes"; I guess that for unsized T, that should that be mem::size_of_val(t) many bytes? …I can’t actually think of any interesting or problematic subtleties related to that, but I’m mentioning it anyway just for the record.

Reinterpreting memory

Unfortunately… I agree with gereeter that transmuting between pointers and references, or transmuting &&mut T to &&T as they suggested, or doing the equivalent of a transmute by copying bytes, etc. should be legal. I suspect there’s a fair amount of existing code that uses such transmutes, and it would be surprising (in the “surprise! UB” sense :wink: ) if they didn’t work – especially considering that they’re legal in C.

For this to be accommodated, I guess it would have to be possible for arbitrary values to carry tags, not just reference-typed ones – including raw pointers, usize, and even u8. I suppose this contradicts your desire to make non-reference types as “non-magical” as possible. But I think the alternative is worse.

Local variables


fn foo() -> i32 {
    let mut x = 2;
    let _ = &mut x as *mut i32; // does not escape

Can the compiler optimize this to always return 2?

If x were immutable, that optimization would arguably be justified in your model, under the assumption that an immutable local variable acts like an immutable reference. But not if x is mutable.

In some sense this is logical. If the raw pointer were passed to bar() rather than thrown away, the optimization would clearly be invalid; but you want raw pointers to be non-magical and untracked, so the fact that it’s thrown away shouldn’t make a difference and it should be invalid here as well.

However, from a compiler perspective, I think that would be quite surprising. The raw pointer cast seems like dead code, but removing it would change semantics. And LLVM is definitely going to want to remove it, so you’d have to use some special mechanism to trick LLVM into thinking the pointer value escaped.

I think it would be better to make this optimization valid by introducing an additional rule, something about values that are completely non-escaping.

Such a rule would also allow something like

fn no_op(p: &mut i32) -> &mut i32 {
    unsafe { &mut *(p as *mut i32) }

to be optimized into a true no-op.

This really is a interesting example, and I think it does in some sense “feel” like it or something like it ought to be allowed. I think it is generally fine to say that your specific code is UB (in my mind at least is pretty flagrantly mutates something that has had a shared reference taken which is invalid) as long as there is a clear guideline for what should be done instead. Specifically, regardless of how ugly the resultant code is, it should definitely be possible to create a value of that specific Node type (no changes like requiring it to be more generic or giving it repr(C)) using unsafe code that is cyclic in the way you describe.

This is the best I was able to come up with. As long as UnfinishedNode and Node are layout-compatible (which is a guarantee I think we can choose to provide, though we don’t right now), this should do the same thing as your code without undefined behaviour.

I think I would be in favor of a strong enough guarantee about struct layout to make this sort of code work (possibly with some extra annotations). I don’t think this would be at all onerous, though it might get in the way of struct layout randomization; I don’t actually think it will though. As a specific strawman proposal designed to be extra conservative:

Two monomorphised types S and T are layout-compatible if any of the following hold.

  • S = T
  • S is a repr(transparent) wrapper around S2 and S2 and T are layout-compatible.
  • T is a repr(transparent) wrapper around T2 and T2 and S are layout-compatible.
  • S and T are both built-in pointer types (&mut, &, *mut, *const) and the pointer payloads are either
    • Both Sized,
    • Both slices,
    • The same trait
  • S and T are both structs with the same number of fields and, when paired up in declaration order, every pair of fields
    • Contain layout-compatible types
    • Have the same name (this is a bit extreme, but I wanted to be conservative)
  • S and T are compatible according to repr(C) or other specific representation rules (e.g. packed, repr on enums, etc.)

If we made those guarantees, I think the cyclic data-structure use case is not too heinous to deal with.

The primary difficulty is that it is really hard to maintain data dependency chains and make sure that you don’t track things the way they were at the source level. Yes, it is painful to not be able to remove the dead cast, but that will be easily cleaned up later, and without it, you have to be sure to not break a different part of the chain. The traditional (well, I mean the one I’ve seen in multiple places; this isn’t really old enough to have tradition) example is

fn foo() -> i32 {
    let mut x = 2;
    let ptr = &mut x as *mut i32 as usize; // does not escape
    bar(ptr - ptr);

Now bar in some sense has been passed the pointer, and unless you want to deal with some complicated questions of what information content certain values leak about the pointer, you need to let bar access x. But then you can’t rewrite ptr - ptr to 0. Complicating pointer-to-integer casts is annoying, but it seems to me like the least bad option.

See also the sort-of-tangent to the Types as Contracts discussion which I think did a great job of demonstrating the tradeoffs.

Additionally, I don’t think it would cause any confusion past the AST, as I’m imagining that the lowering from AST to MIR would lower a reference-to-pointer cast as two statements in the MIR: a (brand-new) PushRaw operation and then a transmute to do the actual cast. Then, the compiler would be free to remove the cast, just not the PushRaw which has side effects. The PushRaw in some sense acts like a compiler barrier - it prevents certain movement of operations, but when MIR gets lowered to LLVM, it would just disappear.

This is just a MIR-level model, so it is fine if LLVM removes the cast - we just can’t remove it (or, well, the push of Raw, which is a bit different) before lowering to LLVM. We do need to be careful to compile to valid LLVM, but as long as MIR undefined behaviour is a superset of LLVM undefined behaviour, which it should be since we throw away a bunch of extra information, then LLVM will generate valid code.

Indeed, but as specified it is not a superset. LLVM will assume that alloca’ed variables whose addresses do not escape are not mutated externally; in the example I gave, it actually does optimize the function to always return 2. That’s why I said it would be necessary to trick LLVM into thinking the address did escape.

As long as UnfinishedNode and Node are layout-compatible … I think I would be in favor of a strong enough guarantee about struct layout to make this sort of code work (possibly with some extra annotations). I don’t think this would be at all onerous

I tried that approach too, in the real problem this example was derived from. But it turned extremely burdensome at the source code level, because in fact Node is not a simple type at all; it’s a collection of fairly complex types. (I not actually deserializing a graph, I’m making immutable, thread-safe copies of the heap in a scripting language implementation.) But it is possible.

in my mind at least is pretty flagrantly mutates something that has had a shared reference taken which is invalid

I’m not sure that’s true (in the sense that I think there’s a reasonable aliasing model for which it’s ok). From this example : you could argue that the lifetime (in the “define what’s UB sense”, not the static checker sense) of the reference created on line 31 ends immediately, since it’s never actually used. It is reachable from another reference but it’s never actually reached. (And we can’t outlaw the mere existence of a chain of references that would point to mutating memory if someone were to follow them, because there are plenty of totally safe ways to do that.)

Then the ‘launder’ on line 37 discards all the old references and creates a fresh one, so the compiler can’t assume anything about (transitive) loads through it from before line 37.

The model that fits my intuition is analogous to data dependent reads in a processor memory model, and (I think) a close fit for ‘noalias’ in clang. For shared refs:

  • Assume a global clock, and timestamps of last write to each address (stealing your idea here)

  • When a new ref is taken (or a ref is cast from a pointer, or from an &mut ref), it gets a timestamp tR. If any load through that ref observes a memory timestamp tM > tR, it’s UB.

  • When a ref is created by loading through another ref, it gets the parent ref’s timestamp.

  • Thus the compiler can assume that, given some shared root references, nothing loaded through any transitive ref path will mutate, and any load can be moved to as early as the initial ref creation or as late as the last load. But it can’t make any assumptions about data that is never loaded.

(I don’t know how mutable refs would fit in this model, but hopefully your stacked borrows would adapt

An unrelated observation (I made it earlier but now realize that it applies more broadly). For owning types like Rc<T>, Vec<T> and especially Box<T> - if you start with an immutable root, the whole graph is immutable, and the compiler should be able take advantage of that just like with plain refs. But the model falls back to the free-for-all ‘raw pointer’ case because they these types use pointers in their implementation. Does this inhibit possible optimizations or sanitizer checks? (If so, maybe special rules for std::ptr::Unique would fix that?)

1 Like

I think weak memory is of little concern here because all potentially racing accesses must happen through &UnsafeCell, so there is no freezing and no checking going on – everything is Raw.

Why do you think this is ruled out? The second reference would re-freeze, but that’s okay.

Uh, good question. The return value would be activated, and then that would be UB because the location is not frozen.

Aha! I agree. I want that, too, and I hadn’t thought about what happens.

If we would do things like freezing recursively through pointer indirections, things would work out as the inner T would be frozen anyway. But with the model as I described it, I think this cast would actually not be legal because, well, it’s a transmute of &mut T to &T, and then see above.

We could say that a shared reference, upon retagging, starts a freeze if things are not already frozen, but I am worried that this might weaken shared references because now we can no longer know that stuff has already been frozen.

Or we could make creating a shared reference (from an &mut) special in the sense that it is the only operation which proceeds recursively and freezes everything. But that also seems surprising, when nothing else recurses through references.

Note that at least in my mind, the tag is just a timestamp, it does not know whether it is Uniq or Frz. So transmuting a &mut T to a &T actually “reinterprets” the tag. Maybe the tag should come with that extra information as you seem to assume it does, and maybe that helps – retagging would activate the Uniq, but then freeze. I think that solution sounds best, from the ones I have seen so far.

I do not follow. The inserted write is visible to whoever passed you the &mut – someone else will have access again when you are done. Spurious writes are only okay if the reference is never used again by anyone else in the future. Am I missing something?

However, this means that we can only add noalias when we can show that the corresponding reference is never dropped.

Correct. I was focusing on aliasing in this post; the next post will be about which party can assume which invariants on types to hold when, and that’s when I am going to come to this point.


Ah, that is an interesting one – and a good case for doing some kind of validation recursively through references.

On the other hand, we have plenty of unsafe code that hands around references to not-completely-initialized data. So I am a bit worried about how such code might interact with this kind of optimization – not very well, probably.


Making integers carry tags invalidates all sorts of arithmetic optimizations, so that is a no-go.

Raw pointers carrying tags that are being ignored would be fine though. However, that may or may not be enough for this situation, I would have to look at some examples.

This is a hard question in C, and it is juts as hard in Rust. We recently wrote a paper about it. I do not plan to solve those questions with this model.

The intent is that the compiler may optimize, but that has nothing to do with Rust’s reference types any more, this is the usual reasoning that you would also do in C about not being able to guess addresses.

Thanks, this helps a lot!

As @gereeter wrote, this code is UB:

    let a: &'static mut Node = alloc(Node {
        name: "a",
        next: &EMPTY,
    let b: &'static mut Node = alloc(Node {
        name: "b",
        next: unsafe { &*(a as *mut Node) },
    }); = b;

You are creating a shared reference to a, and then mutating it. I think allowing this would inhibit most of the optimizations we want to do around shared references.

UnfinishedNode with an UnsafeCell is one possibility to get around this. Another one might be to use raw pointers, though transmuting a raw pointer to a shared reference is also somewhat subtle…

The launder does not affect the data behind the reference though – it launders the outer & in &'static Node, but it does not have any effect on the references stored inside that struct.

And if it did, the a as *const _ – casting a shared ref to a raw ptr – will activate the source reference, so it is like a use.

Oh, data dependency models are usually pretty much impossible to use for a source language. (E.g.: does x * 0 depend on x?)

This is an interesting suggestions. I find it a bit odd, TBH – I’d expect it to maintain its own timestamp. But I think one could adapt my model just a tiny bit to have a very similar effect (otherwise it already seems to have a lot in common with yours): When a shared ref is loaded, we don’t activated it – we just retag (give it a new timestamp and initiate that). You would probably want to do the same for mutable references.

My intuition was that you would want the ref you load to already “have been valid”.

However, this might not interact too well with the barrier… not activating the reference being loaded means that it might actually not be derived from the parameter… for example

fn foo(x: &mut i32, z: usize) -> i32 {
  *x = 12;
  let y : &mut i32 = bar(usize); // imagine this aliases x
  *y = 13;
  *x // can we optimize this to 12?

Under a model that doesnt activate incoming references, y might actually push its own Uniq on top of x, and then later when x is used again that gets popped. So the above could would be allowed. That does not seem right to me.

So, I am not sure I like “loading a ref inherits the parent timestamp”. This makes it completely irrelevant where the loaded ref is coming from, i.e., how it was computed; that’s a lot of information that would go missing. Maybe it works if we do this only for shared references, I would have to think more about that.

The compiler should understand that things are immutable once the & surfaces. Before that, I find it hard to communicate this intent – but yeah, maybe there is a way to make Unique mean something. I am not sure what it would mean, though.

Question for my own edification: as far as I gather, this model does not deal very well with “transmutes”. What about the “types as contracts” model - is it any better at handling them?

You are creating a shared reference to a , and then mutating it. I think allowing this would inhibit most of the optimizations we want to do around shared references.

I thought your model was being careful work only in terms of actual loads and stores, and not say anything about data which was reachable through the type system but never actually reached. In this example the mutation to some data reachable from ‘a’ is never observed.

The launder does not affect the data behind the reference though – it launders the outer & in &'static Node , but it does not have any effect on the references stored inside that struct.

I guess I’m confused about the precise terminology in use here. Immediately after let p = some_address as *const (&'a i32, &'a i32); let r = &*p;, is there one reference or three? I was assuming the former–i.e. references that are part of some data structure sitting in memory aren’t part of the formal model until they’re loaded–but apparently that was wrong?

I think my first example had unnecessarily long ref lifetimes which confused the issue. Here is a cleaner version, where the work-in-progress nodes are stored in pointers instead of refs. Is there UB in this version?

Edit: yes, I was mixing up a half-baked model in my head with your model. I think your model does define what happens for all of my examples. Bits of the data-dependency-like model were stuck in my head because that’s the only way I can see to reasonably deal with transmutes. Cases like what to do with pointer * 0 fortunately aren’t actually a problem if you’re only tracking references.

The trouble with transmutes comes from having extra data stored at pointers, so there is a representation difference between mutable references, shared references and raw pointers. “Types as Contracts” had the explicit goal to not do that, so it did not have this problem.

However, it made lifetimes matter. I am not sure which is worse…

Then you are misunderstanding. My model explicitly performs “activation” of all references that enter or leave a function – passed in as argument, passed out as argument, loaded from memory, stored to memory, passed in as return value, passed out via return. This gives us some nice invariants to work with – it is needed, for example, for the dereferencable annotation. Also we do retagging on references coming in, we can’t do that without looking at the data. The model is mostly access based, but not 100% so.

However, when doing “activation”, I propose to not recurse through references. So for example, when a function receives (as incoming argument, incoming return value or loaded from memory) a (&mut i32, &mut i32), then these references are both activated. However, when it receives a &mut &mut i32, the inner reference is not touched at all. IOW, we look at the data coming in (two references in case of the pair, one reference in the second example), but we do not go into memory to perform some recursive action.

I do not understand how you are counting references. The second line is a raw-ptr-to-shared-ref, so it will do the following to all locations “covered” by the ref – the ref has type &(&i32, &i32), so that would be 16 locations (with 8 bytes pointer size):

  1. Activate a Raw (so some_address must be frozen or have a Raw in the stack)
  2. Bump the clock and assign the newly created shared ref the current timestamp.
  3. Initiate Frz(current_time), which will freeze the location.

The values stores in these locations (which happen to be references again in this case) do not matter at all here.

Yes. Let’s look at the most important steps happening:

    unsafe {
        (*a).next = &*b; // <--
        (*b).next = &*a;
        &*(a as *const Node)

At the marked line, we create a shard ref covering whatever memory b is stored in. This follows the steps I described above, so in the end, that memory is frozen. In the next line, is mutated again, which is okay because there is a Raw in the stack – but will end the freezing for that part of memory (the last 8 bytes of b), because you cannot mutate something frozen.
Finally, later aka is accessed. This will activate that shared reference, asserting that memory is still frozen – triggering UB because memory is in fact not frozen any more.

Taking a step back, at (*a).next = &*b; you are creating a shared reference. That is a promise that the memory pointed to by this reference will never be mutated while this reference is valid. The reference points at wherever b sits in memory. The next thing you do is mutate b. The only way now to maintain your promise is if we say that this invalidate the reference you just created. (With types as contracts, btw, this would already be UB – types as contracts takes lifetimes into account, so the moment you create an &'static pointing at something, that memory is marked as read-only forever. Which makes sense, because that’s literally what &'static is.)

We have to find a way to make this work without mutating memory pointed to by a shared reference. One way is to opt-out of the promise I mentioned above by using Cell or UnsafeCell. So if you replace &'static Node by Cell<&'static Node> or UnsafeCell<&'static Node>, you should be good. Would that be an acceptable solution? It changes the type throughout the program, but it also reflects better the fact that the memory here is being mutated while there is a shared ref pointing at it.

1 Like

I thought your model was being careful work only in terms of actual loads and stores

… However, when it receives a &mut &mut i32 , the inner reference is not touched at all. IOW, we look at the data coming in (two references in case of the pair, one reference in the second example), but we do not go into memory to perform some recursive action.

That’s what I meant to say - the model is “shallow”, so reference-typed values that are just sitting out in some data structure in memory don’t have any special properties in this model until they’re loaded. (As opposed to the “types as contracts” model, where they do have special properties and my example is obviously UB.)

But your description of how the example would trigger UB doesn’t seem to be following this “shallow” model:

Finally, later aka is accessed. This will activate that shared reference, asserting that memory is still frozen – triggering UB because memory is in fact not frozen any more.

This will activate what shared reference? No reference is used more than once in the example, and all references are gone by the end of the unsafe block; a: *mut Node is the only live variable of any type at that point. New references are later derived from it (to access a.b.a), but those will have later timestamps. If that triggers UB, then so would let mut x = Box::new(5); &*x; *x = 6; *x. It’s the same sequence of steps: Start with a root raw pointer; derive a shared ref, freezing the memory; mutate the memory, unfreezing it; later, derive a new shared ref from the root pointer and use it to read the memory.

BTW I’m not arguing that my example should necessarily be legal; as long as there is some way to create cycles of references, and there seems to be general consensus that that’s desirable. But I think this model actually allows quite a lot of questionable stuff. E.g.:

    // of course lifetimes won't let this compile, but assume the same structure was created
    // by unsafe pointer casts or transmutes or whatever.
    let mut i = 0;
    let r1 = &mut &mut &mut i;
    let r2 = &mut &mut &mut i;
    ***r1 = 1;
    ***r2 = 2;
    println!("{}", ***r1);

This clearly should be UB. But it appears to be allowed by the stacked borrows model, because ***r1 loads a chain of new references from memory, so they get fresh timestamps and pushed onto the stack.

This is fixable for shared refs, by having the derived ref take its ancestor’s timestamp instead of a fresh timestamp (so the memory must have been immutable since the initial root shared ref was created), but I’m not sure how to fix it for unique refs without tracking the whole graph of which ref was created from which.

The model is shallow, but when the code goes “down into the data”, the model follows along.

No, why would they? The timestamp is stored together with the reference in memory and fetched again when the reference is loaded from memory.

No, this is fine. There is no shared reference that lives across the mutation. It is a very different sequence of steps.

Your code matches let mut x = Box::new(5); let ref = &*x; *x = 6; *ref. Or, slightly closer

let mut x = Box::new(5);
let shr = Box::new(&*Box::into_raw(x)); // put a shared ref into memory
*x = 6;
**shr // load the shard ref again and then deref it

That is UB.

It is not allowed, and (I think I said this above but it seems that got missed :smiley: ) you do not just get fresh timestamps when loading from memory. Instead, you first load the old timestamp, activate it – meaning make it the top of the stack, or cause UB if that does not work – and then retag (assign a fresh timestamp).

It was you who first talked about “just” assigning a fresh timestamp when loading. I replied that I find this dubious for &mut (search for “interesting suggestion” in this post). Thanks for providing an example :slight_smile:

So, to repeat – reference identity, including the timestamp, is generally preserved by a memory round-trip. I did not say that explicitly in the post because I would never have thought of doing anything else.^^ Many steps in the model would not make any sense otherwise. Also we probably want to optimize a store immediately followed by a load into just re-using the value; if memory round-trips changed the value that would not be correct.

Ah. I see what you meant now. Sorry for the misunderstanding. As @RalfJung mentioned, this is justified on the LLVM side by citing the nondeterminism of allocation (in their Twin Memory Allocation paper). We could theoretically port this argument over directly to Rust, but it adds a line of reasoning that I thought we were trying to avoid. In essence (if I understand correctly), that paper argues that if there is a set of nondeterministic choices that result in undefined behaviour, then the code is undefined code, and so we can assume that this doesn’t happen. To quote the paper (section 3.2):

Since there is at least one execution where the program would trigger UB, the compiler can assume q cannot alias with p (or with any object at all).

This is an explicit departure from the notion that UB should be machine checkable, since we are declaring the code undefined even if the particular execution it takes is perfectly fine. Now, I might be willing to make this small concession, since the only type of nondeterminism relevant to this question is where allocations get located; things like I/O, pseudorandom number generation, etc. don’t have to count. However, it is a big leap from where we are today.

That might simplify things? But that guarantee can only really be made with safe code, I think (yes, it is UB even outside of safe code, but since we’re trying to define UB, things feel like they get trickier). Besides, many of the problems show up when working purely immutably with data. The problem is that the freezes and such from creating references act as writes in the abstract, possibly creating (handle-able) races from something that isn’t writing at all.

But if the two threads have incomparable timestamps, then thread 1 could freeze, have its freeze overwritten by thread 2’s freeze, and then try to access the data. The access would be undefined since thread 2’s freeze would not have come before the creation of thread 1’s reference. Perhaps for clarity I should have written the creation of the references and the access of the references as separate steps.

Oof. That seems quite undesirable. I would instinctively want to say that since there is an implicit coercion that doesn’t even change the runtime representation of the references going from &mut to &, it should be perfectly safe.

Yeah, I think this gets at why I wanted unique pointers to have an ID separate from the timestamp - it completely dodges all ugly questions about reinterpreting things. Also, yes, I think I was imagining that there was some sort of (abstract) runtime tagging of whether a pointer was unique or shared. As long as we have tags, we might as well add more information to them, I guess.

The spurious writes I’m thinking about are writing the value that was already present, i.e.

*ptr = *ptr;

These definitely access the memory location, but there should be no problem if there are no aliasing/racing accesses and Uniq(ptr) is at the top of the memory’s stack. It does nothing to the value of memory and nothing to the stack.

I don’t actually think this is a good direction to follow, but I don’t think the “never dropped” analysis would be that hard in most cases - mutable references passed to functions are reborrowed, generally, not moved, so they just aren’t dropped until the end of the function.

Yes, that's where I was misunderstanding. I understood "shallow" to mean that references sitting out in memory had no special properties at all (in particular, no identity). The only per-memory-address metadata was your struct MemoryByte, with the stack of borrows / freeze state of that address, but no special metadata for a reference that might be stored there. So loading a value of type &i32 would have to get a new timestamp from somewhere. Hence my talk about "root pointers" - if all you have is an &&&i32, there is just one reference in the model until you start following the chain.

I'm pretty sure this model would actually work fine for shared references. But yeah, it's not at all the same as your model. Thanks, now it makes much more sense.

This is a much stricter model, in some ways more strict than "types as contracts", since transmuting or casting a pointer won't let you change the restrictions on indirectly reachable references. So how the model handles transmutes is important.

(That's another reason I was confused for so long--I assumed the model had to let you fill in a chunk of memory by some unspecified means, then declare "this a *mut T with lifetime starting now" and use it as such, as long as the bits in that memory are valid for a T.)

I hope we can avoid such a limitation. Rust's current support for transmutes and working with the with the bit representation of types is really useful, and a big selling point for low-level programming. (In contrast to the C/C++ model, where the standard effectively refuses to admit that objects are made of bits, and so makes things that should be trivial either UB or impossible unless you go the linux kernel route and tell the compiler to follow different rules for your codebase.)

1 Like

I have wondered about this optimization too. It seems though that if we are analyzing to see whether the &&usize escapes the function, we could also analyze to see whether the usize value is actually used, no? In that case, we could do the optimization without caveats.

Well, I suppose the concern is that there would be cases where it is “maybe used”…? I suppose that without termination analysis that is eminently possible. So never mind perhaps. =)

My first thought was – what if integers carry tags but arithmetic does not propagate them, so only straightforward copies were allowed?

But that’s not enough, since, as you know, there’s still the problematic optimization of "if we know that a == b, then replace b with a".

Still, at the risk of repeating myself, I really think that a model that doesn’t allow copying values through memcpy is unacceptably surprising.

I’m curious whether and how LLVM intends to fix the aforementioned issue for C, and whether the resulting model will be sane/predictable enough that it could be used here.

From the LLVM bug thread:

Nuno Lopes 2017-11-07 11:44:34 PST

Ralf is part of the memory model team btw. We will share a document + patches starting late next week.

Do you know if that document ever got shared, and if so do you have a link? :slight_smile:

There is no such thing as undefined code. Undefined behavior is a property of an execution. If a program has multiple executions because of non-determinism, then if any execution has UB, that is sufficient license for the compiler to optimize. This is the same for Rust, C and LLVM. I do not see any other way to define this – we cannot require all executions to exhibit UB, that would be prohibitive for optimizations. For example:

fn main() {
  let x = Box::into_raw(Box::new(0)) as usize;
  let y = 0x42440usize as *mut i32;
  println!("{}", unsafe { *y });

This could be defined behavior if x is really allocated at that address. Still we say the program has UB because it can have UB.

Yes and no. It is, in principle, still checkable by exploring all executions. Again, this is not new – for example, once concurrency enters the picture, this is also the case even if you entirely ignore integer-pointer-casts.

Now, the question is whether this is still feasible to check. And if the number of executions is large, it is not. Which is why I am trying to reduce the extend to which we rely on non-determinism for this. However, for the kind of reasoning LLVM does around pointer-integer-casts, I do not see any alternative. You do not even have to go all the way to twin memory allocations to see this, the following example is sufficient:

fn main() {
  let x = Box::into_raw(Box::new(0)) as usize;
  let y = Box::into_raw(Box::new(1)) as usize;
  if x < y { /* cause UB */ }

Essentially, the moment the actual integer address of an allocation matters, you have a huge amount of non-determinism and no way to explore it exhaustively.

And, again, I do not follow your terminology. What is “undefined code”? UB is a per-execution property.

This was talking about a transmute from &mut to &. These do have different runtime representations in my model – and anyway coercions do not imply “same runtime representation”, consider e.g. unsizing.

Oh then I misunderstood. What you are asking for is for the thing tagging & and the thing tagging &mut to be distinguishable. If one was a timestamp and one an ID they’d still both be 64bits of data. OTOH, if we sacrifice a bit or two to introduce a distinction, we can do that and still use timestamps for &mut.

Essentially, currently I plan to use the Borrow enum from my post as tag.

I see. Yes inserting those for &mut should be fine, not sure why you’d want to do that. :wink: (We’d have to take care that this does not require the value at that location to be valid, but that should be doable.)

The reference metadata is part of that reference. There is no reference without the metadata. So this is not per-memory-address metadata, it is just part of the data that is stored in memory.

Oh sure, I think one should be able to copy through memcpy. What does not work – and is, in fact, broken in LLVM (though I do not know of a miscompilation) is doing that while using an integer type in memcpy. You want a type that can hold arbitrary data, including (part of) a pointer with its metadata – a type that you cannot do anything with, no arithmetic, no deref, just write it somewhere else. C intends to make char that type, but that does not actually work because C also allows arithmetic on char. Unfortunately, LLVM has no type suitable for this purpose either – and without a concrete miscompilation, it will be hard to convince them to change that. :confused:

That document is the twin allocation paper that has been mentioned here multiple times :slight_smile:

Could you explain briefly why freeing the compiler to optimize if there is UB is a desirable goal? I would think it would be better to simply abort/fail to compile for UB. Why is well-defining what UB is, then using that definition to allow for optimizations useful? If something is UB, and I optimize it, isn’t it still UB? How is a program with UB useful?

EDIT: As I think about this more, I think I understand: The “compiler” like LLVM can optimize UB in the LLIR presented to it because it can assume that the higher-level compiler has already ruled out the UB case and that it will never occur in practice, therefore, the LL compiler can optimize “knowing” that the UB will never actually occur even though it can’t statically prove that from the information (structured IR) presented to it. Is that roughly correct?

I’m sure Ralf will have a better answer here, but since I think I understand this stuff I’d like to double-check that I do understand it by writing down what I believe the answers are in case I need correction.

This is a bit of a strange question because license to optimize is the only reason why we want unspecified, implementation-defined and undefined behavior to exist in the first place. I’m not sure there’s a more direct/helpful answer than “if that optimization wasn’t desirable, then that code wouldn’t be UB.”

I think what’s happened here is some confusion over the compiler knowing for sure that a program will exhibit UB, and the compiler making optimizations based on the assumption that UB will not be exhibited. When the compiler actually does know UB is going to happen, that absolutely should be a compiler error (afaik this is totally uncontroversial). But when we ask the compiler to optimize a function that takes two shared references, we don’t expect the compiler to go check all the call sites of that function and prove the references cannot possibly alias, which it likely can’t do in the presence of sufficiently complex unsafe code or external crates. Instead, we say it’s UB for shared references to alias, so it can go ahead and compile the function with the obvious optimizations you’d expect, and the presence of unsafe/external/etc code near one of its call sites does not magically deoptimize anything.

Pretty much the same as the first question. If you have a desirable optimization that can’t be done without declaring something UB, then we can’t actually make that optimization work in practice with other people’s unsafe code unless the authors of that unsafe code know what is and isn’t UB.

Well, yeah. Rust-level UB is different from LLVM IR-level UB is different from x86-level UB. I think we’re mostly talking about Rust-level UB, which is just a property of the Rust source code, not a property of the final executable. How much optimization was done by the Rust compiler has nothing to do with whether the Rust source code had Rust-level UB, unless the compiler is literally editing your source code for some reason. If you meant “does the x86 executable have x86-level UB if compiled from Rust source code with Rust-level UB?” then the answer is “it may or may not; Rust-level UB means the Rust compiler is allowed to do anything at all with it.”

A program that actually exhibits UB is never useful in theory. In practice, it’s only useful if your compiler happens to be acting as if it wasn’t exhibiting UB, either because you configured it to do so or because it’s not fully exploiting that license to optimize yet.

I think this is the correct intuition for why LLVM IR has more complex (and harder to auto-check) rules for UB than we would like surface Rust to have. Though surface Rust still needs some notion of what UB is to justify all the optimizations we want around arbitrary unsafe code.


Note that the “nondeterminism” here is not what a system programmer would call “nondeterminism”.

  1. For a system programmer, nondeterminism generally refers to “environment nondeterminism” - which random value is read from /dev/urandom, which request arrives first on a socket, et cetera. The compiler is not allowed to make arbitrary choices here - if a code has a buffer overflow when it gets requests in the wrong order, the program should still be well-defined as long as requests arrive in the correct order.
  2. For twinned allocation, nondeterminism is something that the compiler can choose - it can place stack arguments in which location it wants, and leave “traps” in arbitrary memory locations, to “catch” code that performs “random” memory accesses. This nondeterminism essentially only happens in theory - if code mangles memory, the compiler can retroactively claim that the address the mangling code accessed was a “trap”, and therefore UB happened. This has nothing to do with “environment” nondeterminism.

I’m trying to differentiate this model from the ACA model (, to explore model space.

One significant difference is that ACA has lifetimes, while this model doesn’t. However, I don’t think this is such a big difference as it seems.

Axis 1 - lifetimes

I think the barrier + use semantics are fairly close to making a borrow’s lifetime last as long as it is either live or kept alive by a barrier. This is not necessarily a valid “Rust” assignment of lifetimes - it can violate “artificial” lifetime constraints (in ACA, you could fix a lifetime to anything you want by using lifetime constraints), and “interior borrow constraints”, and of course it is a dynamic, rather than a static, notion.

By “interior borrow constraints” I mean the following:

fn foo(x: &mut Option<&mut u32>) { unsafe {
    let x_ptr: *mut Option<&mut u32> = x;
    let t = &mut *x;
    let x_inner: &mut u32 = match t {
        &mut Some(ref mut x_inner) => &mut **x_inner,
        None => return
    *x_ptr = None;

By ACA, this is UB - the reborrow in x_inner asserts ownership of x for the entire lifetime of x_inner, which conflicts with the write *x_ptr = None;. By the stacked borrow model, t is not live after the match statement, so there is no conflict.

Axis 2 - lifetimes of raw pointer

This is a bit similar to axis 1.

fn foo(x: &mut u32) -> u32 {
    let raw: *mut u32 = x;
    let t = &mut *x;
    *t = 0;
    unsafe { *raw } // is this UB?

In the stack-based model, if I understood it right (did I), creating t pops of the Raw from the stack, which makes further accesses from a raw pointer UB.

In the ACA model, this depends on how the lifetimes are assigned, so the situation is more vague. If the lifetime of the raw pointer is the same as x, this is OK, because as long as the raw pointer is alive, x does not protect the memory behind it.

The intent IIRC was that the compiler will be conservative and make it OK.

Axis 3 - stack vs. cactus stack

The ACA model is OK with creating a “cactus” of mutable references, as long as only 1 of them is used in the end. I think that this works less well with the “stacked borrow” model, but that this is orthogonal to the “strict liveness” restriction (i.e., it is possible to create a “cactusy” lifetimeless system).

I think the “cactus” behavior is essential for “unsafe mutability polymorphism”, as in

fn foo(a: bool, r: &mut u32) {
    let r1a = &mut *(r as *mut u32);
    let r2a = &mut *(r1a as *mut u32);
    let r1b = &mut *(r as *mut u32);
    let r2b = &mut *(r1b as *mut u32);
    if a {
        *r2a = 0;
// using r2b is now illegal
    } else {
        *r2b = 0;
// using r2a is now illegal