Documentation should only refine types (or why Pin::get_unchecked_mut is a ticking bomb)

This is a summary of the thread. The original post is below the line at the end of this summary.

Let's use the following notations:

  • [T].typ the semantics of the type T, also called the safety invariant (see Ralf's blog)
  • [T].doc the documented semantics of a particular occurrence of the type T, usually in the context of an unsafe function (Ralf's blog also calls this the safety invariant)
  • [T].val the validity invariant of the type T (see Ralf's blog)
  • [T].opt the semantics of the type T that the compiler is allowed to use

We have at least the following relations for compiler correctness:

  • [T].doc ⊆ [T].val (documented behaviors must at least be valid)
  • [T].val ⊆ [T].opt (the compiler must not assume more than validity allows)

General observations:

  • [T].val should only grow (otherwise UB may be introduced in existing well-defined programs) thus we should choose it as small as possible (see Ralf's blog)
  • [T].opt should only shrink (otherwise optimization may regress in existing programs)

My claim is that we want [T].doc ⊆ [T].typ (and implicitly [T].typ = [T].opt). This can be split into 3 parts:

  • [T].doc ⊆ [T].val which is already true (see above)
  • [T].typ = [T].val which claims that we don't want to prevent optimizations
  • [T].val = [T].opt which claims that the compiler should be able to use as much information as possible

The counter-argument to [T].typ = [T].val is that existing code already assumes [T].val to be bigger than [T].typ. For example, [&mut T].val doesn't prevent invariants from referring to the pointed-to content without going through that pointer, i.e. based on the validity invariant alone, it is not possible to repurpose the pointed-to content without calling drop first.

The counter-argument to [T].val = [T].opt is that we want to keep some wiggle room to adjust [T].val higher if we realize existing code assumes even higher [T].val than we were already aware of.

I'd say that both of those observations are a consequence of Hyrum's Law, the lack of a well-defined Rust semantics, and the lack of expressiveness in the type system. Programmers just used the simplest type available when writing unsafe code, even when it was not a refinement of its semantics. And because the program compiled to something sensible, they started relying on this behavior.

I'd say that for the types for which [T].typ ≠ [T].val, we can't do anything. And we might eventually introduce new replacement types highT and lowT such that [highT].typ = [highT].val = [T].typ and [lowT].typ = [lowT].val = [T].val as well as the appropriate sub-typing rules to be compatible with legacy code. (This might be technically challenging, so might never happen in Rust, even future editions.)

For [T].val = [T].opt, if we don't need to plan for existing misuse, then we can take this equality for free.

TLDR: Eventually, compilers will use as much type information as possible. Since they don't see documentation, if documentation extends the set of allowed behaviors compared to what the type allows, then the compiler may miscompile by assuming that those additional behaviors can't occur. In other words, documentation is not just about soundness in the source language: because types are used to optimize compilation, preserving behavior from the source to the target language requires that the program behavior (defined by its documentation, not its type) always be included in its type, i.e. documentation should only ever be a refinement.


This post is about Rust, but to simplify we will work under some assumptions:

  • We assume an arbitrary target language called Target.
  • We assume a syntax for Rust. It is a set of constructs. A grammar describes how to combine constructs. Valid combinations form trees (e.g. expressions, types, statements, functions, structs, traits, modules, libraries, and eventually binaries).
  • We assume a semantics for Rust. It associates a set of behaviors to syntax trees. Behaviors are deliberately left unspecified. We write [a] the semantics of a.
  • We assume a syntax and semantics for Target. We use the same notation for the Target semantics (since we assume the syntax of Rust and Target are distinguishable). We assume Rust and Target behaviors are the same (to avoid working modulo bisimulation).
  • We assume a type system for Rust. A type system associates a typing rule for each syntax construct. A typing rule is a set of premises and a conclusion about the language construct of the rule. Premises and conclusions have the form of a judgment a : A where a is a syntax tree and A is from the type system syntax (e.g. e : Γ ⊢ τ). Judgments also have a semantics (e.g. a : A means [a] ⊆ [A]).
  • We assume a particular set of behaviors (e.g. programs that don't exhibit undefined behavior). And we assume the Rust type system is sound with respect to that set, i.e. the semantics of well-typed programs is included in that particular set (e.g. well-typed programs don't exhibit undefined behavior).
  • We assume a compiler from Rust to Target. A compiler c translates a program p correctly iff [c(p)] ⊆ [p] i.e. it doesn't introduce new behaviors. A compiler c is correct if it correctly translates all well-typed programs.

When writing Rust, there are 2 ways to express program behavior:

  • Types: The advantage is that the compiler automatically checks them before translating; the programmer doesn't need to worry about proofs. The disadvantage is that only sets of behaviors that are the semantics of some type can be expressed.
  • Documentation: The advantage is that any set of behaviors may be expressed. The disadvantage is that the compiler doesn't understand it, so the programmer needs to prove it. Note that this is a restrictive meaning of the word documentation, as it only refers to the semantic aspect of documentation (i.e. the aspect that describes program behavior).
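To make the trade-off concrete, here is a minimal sketch (function names invented for illustration) contrasting an invariant carried only by documentation with the same invariant carried by a type:

```rust
use std::num::NonZeroU32;

/// Pre (documentation only): `divisor` must be non-zero.
fn speed_doc(distance: u32, divisor: u32) -> u32 {
    // The compiler cannot check the documented contract; breaking it
    // panics at runtime.
    distance / divisor
}

/// The same invariant expressed as a type: every call site is checked,
/// and the compiler may also optimize on the non-zero niche.
fn speed_typ(distance: u32, divisor: NonZeroU32) -> u32 {
    // `divisor.get()` is statically known to be non-zero.
    distance / divisor.get()
}

fn main() {
    assert_eq!(speed_doc(10, 2), 5);
    assert_eq!(speed_typ(10, NonZeroU32::new(2).unwrap()), 5);
}
```

The type-level version is exactly the "refinement" direction the post argues for: the set of accepted values shrinks, so everything the type claims is visible to the compiler.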

We'll consider 3 different semantics for syntax trees. When we have a : A, i.e. a of type A, we'll write:

  • [a].typ its semantics according to its type, i.e. [A]
  • [a].doc its semantics according to its documentation (this is usually defined as a diff wrt [a].typ)
  • [a].opt its semantics as seen by the compiler when translating (in particular when optimizing)

In the absence of documentation (i.e. [a].doc = [a].typ), we want [a].typ ⊆ [a].opt to avoid miscompilation due to aggressive optimizations. Similarly, in the presence of documentation, we want [a].doc ⊆ [a].opt.

I argue that we actually want [a].doc ⊆ [a].typ, i.e. documentation only refines types. The first reason is that compiler correctness in the presence of documentation then naturally follows from compiler correctness in the absence of documentation. The second reason is that compilers tend to use as much type information as possible to optimize. This reads as [a].opt converging towards [a].typ (being less and less of an over-approximation) and thus not leaving much freedom for [a].doc to extend [a].typ. In particular, if ever [a].opt = [a].typ, then we actually need [a].doc ⊆ [a].typ for compiler correctness.

One could then argue that forcing documentation to only refine types takes away its expressiveness advantage. But this is why Rust provides lower-level types. Those types extend the semantics of high-level types. For example, *mut T is the low-level type of &mut T. When one wants to extend the semantics of &mut T, they actually use *mut T instead and restrict its semantics.


&mut T, NonNull<T>, and *mut T

I only consider the lifetime and non-null aspects of those 3 types, for simplicity. The functions below are identity functions. They just change the type and documentation of their argument.

fn from(reference: &mut T) -> NonNull<T>;
// Here, we forget about the lifetime:
//   [reference].typ ⊆ [result].typ

fn as_ptr(self: NonNull<T>) -> *mut T;
// Here, we forget that the pointer is non-null:
//   [self].typ ⊆ [result].typ

/// Post: result is non-null.
fn as_ptr(self: NonNull<T>) -> *mut T;
// Here, we preserve the semantics thanks to the documentation:
//   [self].typ = [result].doc ⊆ [result].typ

/// Pre: ptr is non-null.
unsafe fn new_unchecked(ptr: *mut T) -> NonNull<T>;
// Here, we preserve the semantics and make it fully apparent in the type:
//   [ptr].typ ⊇ [ptr].doc = [result].typ

/// Pre: *self is not aliased for 'a.
unsafe fn as_mut<'a>(self: &mut NonNull<T>) -> &'a mut T;
// Here, we preserve part of the semantics (the actual lifetime could be longer) and make it fully apparent in the type:
//   [self].typ ⊇ [self].doc ≃ [result].typ

In all functions, the documentation is a refinement of the type. In particular we have the following hierarchy: [&mut T] ⊆ [NonNull<T>] ⊆ [*mut T]. Using refinement we can document a *mut T to make it look like a NonNull<T> or even a &mut T. We didn't do the opposite, i.e. document that a NonNull<T> can actually be null, or that a &mut T has a longer lifetime.
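As a sanity check, this hierarchy can be walked with the actual std API; a minimal sketch (the helper name is invented):

```rust
use std::ptr::NonNull;

// Walk up the hierarchy (safe: each step forgets information), then back
// down (unsafe: each step must be justified by documentation).
fn roundtrip(reference: &mut i32) -> i32 {
    // [&mut T] ⊆ [NonNull<T>]: forget the lifetime (safe).
    let non_null: NonNull<i32> = NonNull::from(reference);
    // [NonNull<T>] ⊆ [*mut T]: forget non-nullness (safe).
    let raw: *mut i32 = non_null.as_ptr();
    // Going back down filters behaviors, so it is unsafe: we must prove
    // the documentation (`raw` came from a reference, so it is non-null).
    let again = unsafe { NonNull::new_unchecked(raw) };
    unsafe { *again.as_ptr() }
}

fn main() {
    let mut x = 42;
    assert_eq!(roundtrip(&mut x), 42);
}
```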

Note in the last examples that the unsafe keyword is needed because the set of behaviors is filtered, and if this filtering is incorrect, the program might be unsound. It is okay (and preferred) to omit the unsafe keyword when an incorrect filtering would only affect correctness, not soundness.

Pin<&'static mut T> and &'static mut T

The functions below are the identity function. They just change the type and documentation of their argument.

fn static_mut(r: &'static mut T) -> Pin<&'static mut T>;
// Here, we forget that we can repurpose the memory before drop:
//   [r].typ ⊆ [result].typ

/// Post: *result is dropped before its memory is repurposed.
unsafe fn get_unchecked_mut(self: Pin<&'static mut T>) -> &'static mut T;
// Here, we preserve the semantics but don't make it apparent in the type.
//   [self].typ = [result].doc ⊇ [result].typ

In the second function, the documentation is not a refinement of the type. In particular we have the following hierarchy: [&'static mut T] ⊆ [Pin<&'static mut T>]. There are a few ways to see why:

  1. The first function is safe.
  2. Let's try to define the semantics of Pin<&mut T>. We need to introduce a notion of memory and typing environment. So we actually define [Γ ⊢ Pin<&mut T>], the semantics of Pin<&mut T> under a given typing environment Γ, which models among other things the notion of memory. We have that [Γ ⊢ &mut T] is the subset of pointers of [Γ ⊢ Pin<&mut T>] whose content is not referred to by other invariants in Γ, such that moving out of them doesn't break invariants that relied on that content. Conversely, [Γ ⊢ Pin<&mut T>] is thus the set of pointers with correct alignment, aliasing, initialization, etc., but without any requirement on other parts of the environment mentioning their content. (Note that this means it should be possible to unpin a pointer once its content is not referred to anymore, essentially meaning the program behavior doesn't depend on this memory being pinned.)
  3. We can use the fact that if [τ] ⊆ [σ] then we could pass a τ to functions that take a σ. Let's assume by absurd that [Pin<&mut T>] ⊆ [&mut T], then we could pass a pinned pointer to mem::swap. This is not allowed, so the assumption is indeed absurd. (Note however that it is also possible that [Pin<&mut T>] and [&mut T] are not comparable, i.e. neither is a refinement of the other.)
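For reference, here is a minimal sketch of the pattern under discussion, using PhantomPinned to opt out of Unpin (type and function names invented); the comments mark where the pinned-ness lives only in documentation:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct NotUnpin {
    value: i32,
    _pin: PhantomPinned, // opts the type out of Unpin
}

// The only way from Pin<&mut NotUnpin> to &mut NotUnpin is the documented
// escape hatch: the returned &mut carries its pinned-ness only in
// documentation, not in its type (this is the pattern of the thread).
fn bump(mut pinned: Pin<&mut NotUnpin>) -> i32 {
    let inner: &mut NotUnpin = unsafe { pinned.as_mut().get_unchecked_mut() };
    inner.value += 1;
    inner.value
}

fn main() {
    let mut x = NotUnpin { value: 1, _pin: PhantomPinned };
    // Safety: we promise never to move `x` after this point.
    let pinned = unsafe { Pin::new_unchecked(&mut x) };
    assert_eq!(bump(pinned), 2);
}
```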

As before, the unsafe keyword is used on the second function because it filters the set of behaviors, and doing this wrong has soundness implications. The difference from before is that it filters behaviors through its post-condition instead of its pre-condition. This is a dangerous pattern.

This is probably not problematic currently (and might never be) if there's no optimization pass that relies on &mut T being movable (that's probably related to the problem of Pin on #[repr(packed)]). But that risk increases over time.


Two possible mitigations:

  • Use *mut T instead of &mut T as the return type of Pin::get_unchecked_mut? Or possibly another low-level variant of &mut T.
  • Document in the compiler that [&mut T].opt is actually trying to over-approximate [Pin<&mut T>].typ and not [&mut T].typ.

The way it currently works is that for a ?Unpin pointee, &mut Pointee does not get the noalias attribute. Just like with & and UnsafeCell, the properties that a reference asserts of the pointee depends on the actual pointee type.

So &mut T where T: Unpin guarantees what we normally think of as the &mut properties. There's an extant suggestion to replace the "!Unpin hack" with an explicit "UnsafeMutCell" as well, which will make the application of weaker &mut more obvious.

Interesting, so if I understand correctly, this means that when we write &mut T (which means &mut T where T: ?Unpin), the semantics doesn't include noalias, while &mut T where T: Unpin does. In other words, [&mut T where T: Unpin] ⊆ [&mut T where T: ?Unpin] = [&mut T]. (This wasn't the case before Pin's introduction, right? Did this have a performance impact?)

However, I'm not sure this is enough to fix the issue, since the property the optimizer may wrongly believe the pointer has is that the pointed content is not referred to by outside invariants, and thus could be moved or deallocated without drop (or more precisely without adapting those invariants, which drop is supposed to do). Unless noalias also rules that out.

noalias is the LLVM annotation that means roughly "no other pointer is used to mutate the pointee while this pointer is still used." Without a noalias annotation, the LLVM pointer is just a pointer to a location that we know nothing about.

Of course, the Rust Abstract Machine can and does know more about locations and references/pointers (see: Stacked Borrows), and how the Abstract Machine semantics is defined (see: MIR and Stacked Borrows) impacts the optimization potential of given types.

Correct, but IIRC we didn't actually apply noalias until after Pin was a thing due to extant LLVM miscompilations with noalias that took a long time to resolve. The exact interaction between Unpin and optimizations is still an open question.

But the fact is that we already have [&T where T: Freeze] ⊆ [&T where T: ?Freeze] for UnsafeCell. Freeze is the compiler-internal trait for "has no UnsafeCell before indirection", and rustc emits LLVM noalias for &T where T: Freeze but not for &T where T: ?Freeze.

Both of these do have optimization implications; if you look at the LLVM IR of this example, you will see that &mut dyn Trait does not get a noalias annotation, whereas &mut (dyn Trait + Unpin) does.

pub trait Example {}

pub fn nonoalias(_: &mut dyn Example) {}
pub fn noalias(_: &mut (dyn Example + Unpin)) {}
; playground::nonoalias
; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind readnone uwtable willreturn
define void @playground::nonoalias(
    {}* nocapture noundef nonnull readnone align 1 %data,
    [3 x i64]* noalias nocapture noundef readonly align 8 dereferenceable(24) %vtable
) unnamed_addr #0 {
  ret void
}

; playground::noalias
; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind readnone uwtable willreturn
define void @playground::noalias(
    {}* noalias nocapture noundef nonnull readnone align 1 %data,
    [3 x i64]* noalias nocapture noundef readonly align 8 dereferenceable(24) %vtable
) unnamed_addr #0 {
  ret void
}

Thanks for the info! So the way I understand it, Rust is mitigating the issue by keeping [&mut T].opt strictly larger than [&mut T].typ (essentially the second suggestion in the initial post). This relies on 2 assumptions:

  • Rust doesn't optimize when translating to LLVM (which eventually translates to Target and optimizes that translation). By optimize I mean "the translation is correct only if some behaviors are impossible".
  • Rust deliberately uses loose types in LLVM, e.g. c(&mut T) = c(T)* and not c(T)* noalias. This means [&mut T].opt = [c(T)*].typ ⊇ [&mut T].typ. This leaves a lot of room for the documentation to extend the set of possible behaviors for programs of type &mut T as long as they fit [c(T)*] in LLVM (i.e. they may alias).

Rust still manages to keep some performance when the type is further constrained with Unpin (similarly with Freeze). In those cases, documentation cannot extend the set of behaviors as much (in particular, they can't alias anymore), and one would need to use *mut T if they need to.
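As a side note, dropping to *mut T indeed removes the noalias claim from the type; a minimal sketch (helper name invented) of two raw pointers legally aliasing the same location:

```rust
// Two raw pointers to the same location: *mut T makes no noalias claim,
// so the compiler must assume writes through `p` are visible through `q`.
fn write_through_aliases(x: &mut i32) -> i32 {
    let p: *mut i32 = x;
    let q: *mut i32 = p;
    unsafe {
        *p = 1;
        *q += 1; // legal: raw pointers may alias
        *p
    }
}

fn main() {
    let mut x = 0;
    assert_eq!(write_through_aliases(&mut x), 2);
}
```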

I personally find this a risky strategy, because to prove that a piece of documentation won't get miscompiled, one needs to know how Rust translates to LLVM and what the LLVM semantics is. But I guess that's also the most pragmatic solution, given that Rust semantics is currently defined by how Rust translates to LLVM and what the LLVM semantics is. I just hope Rust will eventually have a semantics and let the compiler use as much type information as possible to optimize (i.e. [T].opt = [T].typ), in which case documentation should only refine types, which is the safest and most natural option.

Still, the fact remains that Pin::get_unchecked_mut exists and is stable, so deprecating it as "this cannot be used soundly" would be very unfortunate... and Pin<&mut T> is also just a newtype wrapper around &mut T, so it necessarily means that &mut-validity-at-the-language-level must not imply that the pointee can be moved.

This also ties into reference stability, in a way: it's not officially guaranteed, but in practice it's a good property to maintain (and everyone would like to) that given a place x, ptr::addr_of!(x).addr() remains consistent. Reference stability means that the compiler cannot transparently move a location.

I don't think it's ever valid for the compiler to use &mut T in a way incompatible with it being pinned.


Pin<&mut T> is also just a newtype wrapper around &mut T

Yes, I also think the design of Pin<T> (i.e. a wrapper around a pointer-style type T instead of a wrapper around a lower-level pointer type) forces all pointer-style types T to have their semantics in optimization context be strictly larger than their actual language semantics. The optimization semantics must allow behaviors where invariants depend on the pointed content without going through that pointer (making it unsound to repurpose the pointed content before dropping it), while the language semantics makes it impossible for invariants to mention the pointed content without going through the pointer (which is what makes mem::swap safe to use).

it necessarily means that &mut-validity-at-the-language-level must not imply that the pointee can be moved

It depends on what you call language-level. I would state it like this instead:

  • &mut-validity-at-optimization-level (i.e. [&mut T].opt) doesn't imply that the pointee can be moved.
  • &mut-validity-at-language-level (i.e. [&mut T].typ) implies that the pointee can be moved (so the opposite of what you say).
  • &mut-validity-at-documentation-level (i.e. [p].doc where p : &mut T) may or may not imply that the pointee can be moved depending on the documentation (this is probably what you meant by language-level, but I would argue that documentation is by definition not part of the language otherwise it would be a type).

This also ties into reference stability

I agree this is related. If it ever becomes guaranteed, then it would be a good way to fix this issue, by documenting that [T].opt (where T is a pointer-style type) contains behaviors that forbid moving the content out before drop (essentially the second suggestion of the initial post).

I don't think it's ever valid for the compiler to use &mut T in a way incompatible with it being pinned.

Yes, we are now stuck with that because of Pin design. And this is not just about &mut T but all pointer-style types.

Ideally, pointers would have a single type *T that can be refined with a composable set of properties:

  • variance
  • lifetime
  • aliasing
  • alignment
  • initialization
  • etc

With that design, Unpin would just be a property of the pointer. Without this property, the pointer is pinned. With such a design, we don't need Pin::get_unchecked_mut anymore, since you can already call, on the pinned pointer, all the functions you would be able to call. And the type system prevents you from calling mem::swap.
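A rough sketch of what such composable pointer properties could look like, using invented marker types (this is a speculative design for illustration, not a real proposal or existing API):

```rust
use std::marker::PhantomData;

// Hypothetical: a single pointer type refined by composable marker
// properties; here only the pinned-ness property is modeled.
struct Movable;
struct Pinned;

struct Ptr<T, Pinness> {
    raw: *mut T,
    _marker: PhantomData<Pinness>,
}

impl<T, P> Ptr<T, P> {
    /// Reading is allowed whether or not the pointee is pinned.
    unsafe fn read(&self) -> T where T: Copy {
        *self.raw
    }
}

impl<T> Ptr<T, Movable> {
    /// Moving the pointee out is only offered when the `Movable`
    /// property is part of the type — no `get_unchecked_mut` needed.
    unsafe fn replace(&mut self, value: T) -> T {
        std::ptr::replace(self.raw, value)
    }
}

fn main() {
    let mut x = 1_i32;
    let mut movable = Ptr::<i32, Movable> { raw: &mut x, _marker: PhantomData };
    assert_eq!(unsafe { movable.replace(2) }, 1);
    assert_eq!(unsafe { movable.read() }, 2);

    let pinned = Ptr::<i32, Pinned> { raw: &mut x, _marker: PhantomData };
    // `pinned.replace(3)` would not compile: `replace` is only defined
    // for `Ptr<T, Movable>`, so the type system rules out `mem::swap`-style
    // moves instead of relying on documentation.
    assert_eq!(unsafe { pinned.read() }, 2);
}
```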

But this might need some research to make sure inference works well enough in practice. And this is probably not a short-term solution.

This is a fairly common pattern recently, I think -- making the construction or declaration of a given value unsafe rather than each use. For example, there are proposals to make extern static unsafe to declare but safe to use -- the declaration makes safety promises about all use sites. (I think there were other examples that I cannot think of right now.)

The same happens here, I think. I agree the pattern is somewhat risky, and I personally would rather make each use site unsafe, but I can accept that other people disagree and consider it more ergonomic to make the declaration/construction sites unsafe.

(Note that I skipped over most of your type theory, since it seems unrelated to the point you are making about Pin here. It would help to focus on the point you are trying to make, rather than prefacing it with an entire lecture. :wink: It should not be necessary to understand that entire lecture in order to understand what is problematic about get_unchecked_mut. We all have limited time so a point should be made in as much space as necessary and no more, or else fewer people will have the time to be able to actually read and understand the point. )

Indeed this is just a temporary hack. Eventually, some !Unpin types will still have aliasing guarantees. Also the OP talked about things being "movable" but not about aliasing. So this seems orthogonal to me.


making the construction or declaration of a given value unsafe rather than each use

The pattern I'm describing is not about where to put the unsafe keyword. But rather about what the programmer is allowed to document without risking miscompilation. In particular making mem::swap (and other moving usages of &mut T) unsafe would not fix the issue. This is also neither about soundness nor the unsafe keyword (although in practice it's related).

Let me try a very generic description which hopefully gets to the point:

  • We assume 2 types T and S, and a property P, such that T refines S with P.
  • Let F be the identity function from S to T with documentation that the result doesn't really have P (in practice, this might require unsafe somewhere).
  • We assume a function A taking T that the compiler is able to optimize because T satisfies P. (Note that A doesn't need to rely on P and may document that it doesn't, in practice T might have been chosen because it's easier to write than S and it's usually called when a T is available, or other historical reasons.)
  • Let B be the composition of F followed by A.
  • The compiler will miscompile B. In particular calling B with arguments that don't satisfy P might not behave as expected (in practice, this might be a soundness issue).
  • Note however that at source level, the program is correct. The problem only occurs because of compilation.

Now let's take:

  • T = &mut X
  • S = Pin<&mut X>
  • P = no invariant refers to the pointer content without going through this pointer
  • F = Pin::get_unchecked_mut
  • A = any function taking &mut X which is not repurposing the content without drop
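The abstract pattern can also be instantiated with a toy example (all names invented; the positivity claim lives only in documentation, just as pinned-ness does for Pin::get_unchecked_mut):

```rust
// S = LooseInt (any i32), T = PositiveInt, P = "the value is positive".
// PositiveInt's claim is documentation only: nothing enforces it here.

#[derive(Clone, Copy)]
struct LooseInt(i32);
#[derive(Clone, Copy)]
struct PositiveInt(i32);

/// F: the identity, but the documentation takes P back.
/// Post: the result might NOT actually be positive.
fn f(s: LooseInt) -> PositiveInt {
    PositiveInt(s.0)
}

/// A: a function on T. Its body doesn't rely on P, but a smarter
/// compiler could one day exploit P (e.g. delete the `< 0` branch).
fn a(t: PositiveInt) -> i32 {
    if t.0 < 0 { 0 } else { t.0 }
}

/// B = A ∘ F: correct at source level today, but only as long as the
/// compiler doesn't optimize `a` under the assumption P.
fn b(s: LooseInt) -> i32 {
    a(f(s))
}

fn main() {
    assert_eq!(b(LooseInt(3)), 3);
    assert_eq!(b(LooseInt(-5)), 0); // would change if P were ever exploited
}
```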

How to avoid this kind of issue? Don't write documentation that extends the set of behaviors defined by the type; only write documentation that refines the type. This is because compilers only see types, not documentation, and compilers use the set of behaviors defined by the type to optimize. The other solution, which is apparently the one Rust is taking, is to blindfold the compiler such that it hallucinates behaviors beyond those the type actually defines. This way, it can't optimize on properties that are not also satisfied by those hallucinated behaviors. That's valid, but risky, because eventually someone will forget why those hallucinated behaviors are needed and will try to optimize more. This solution also misses optimization opportunities (which is also why it's risky). All in all, I'd say this kind of pattern creates some form of tech debt in the language design. It's ok to have some, but we should try to limit it.

It would help to focus on the point you are trying to make, rather than prefacing it with an entire lecture.

Yes that was too optimistic :slight_smile: I tried to put a TLDR though, but seems it didn't help either.


A few tweet-sized notes that may help with understanding the pattern.

Restricting values or operations

/// Restricts the set of possible values. 
/// (As a consequence, extends the set of possible operations.)
pub struct Refine<T>(T);
// Example: NonNull
// Plays nicely with the compiler.

/// Restricts the set of possible operations.
/// (As a consequence, extends the set of possible values.)
pub struct Extend<T>(T);
// Example: Pin
// Doesn't play nicely with the compiler and needs some compiler annotation.

Communication between language and compiler

Besides their usage in the language (to prove properties like soundness), types are used to communicate with the compiler. They describe the set of possible behaviors. This helps with modular compilation since the compiler doesn't need to know all call sites to infer the type of arguments a function takes.

If the language lies about the types in the wrong direction (not enough behaviors), the compiler may miscompile. If the language lies in the right direction (too many behaviors), the compiler may not optimize as much as it could. For &mut T, Rust lies in the right direction, it omits to say that the content may be moved out before drop.

Unsafe as a way to extend behaviors by refining behaviors

In a world where documentation only refines types, documenting the arguments of a function amounts to documenting the function by extending its type. The way to solve that is the unsafe keyword. It extends the set of behaviors of the function by allowing all functions that take refinements (including the bottom type).

/// Pre-condition: x is positive
unsafe fn foo(x: i32);
// We have:
// - { x : i32 | x > 0 } ⊆ i32
// - fn(i32) ⊆ fn({ x : i32 | x > 0 }) ⊆ unsafe fn(i32)

We can't annotate non-function types with unsafe (yet? otherwise unsafe &mut T would be a possible return type for Pin::get_unchecked_mut). So the solution to extend a type is to refine an existing extension. For example to extend &mut T, there's NonNull<T> or even *mut T.
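A minimal sketch of that subset relation in code (function names invented): a total function on all of i32 trivially satisfies the refined contract of the unsafe one, so fn(i32) ⊆ unsafe fn(i32) in the post's notation:

```rust
/// Pre: `x` is positive.
unsafe fn foo(x: i32) -> i32 {
    // The body may rely on the documented refinement; here we only
    // check it in debug builds.
    debug_assert!(x > 0);
    100 / x
}

// A total function on all of i32 satisfies the refined contract,
// so it could safely stand in wherever `foo` is expected.
fn foo_total(x: i32) -> i32 {
    if x > 0 { 100 / x } else { 0 }
}

fn main() {
    // The caller discharges the precondition, justifying `unsafe`.
    assert_eq!(unsafe { foo(4) }, 25);
    assert_eq!(foo_total(4), 25);
    assert_eq!(foo_total(-1), 0);
}
```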

Circumstances that lead to extending types

A language realizes a heavily used type is actually too strict and wants to provide an extended version (i.e. a version that restricts the set of possible operations). There are 2 options:

  • Add a new type with sub-typing to the old. Update the signature of functions that actually are correct on the super-type. (I would argue sub-typing is a very important property for language evolution: it permits this kind of change to be made seamlessly and safely.)
  • Define the loose type using the strict type with encapsulation. Provide functions to cheat, i.e. to use the strict type for values that only inhabit the loose type, in order to reuse existing functions on the strict type that are actually correct on the loose type.

Thanks a lot for summarizing, that was very helpful. :slight_smile:

However, I am still not sure I agree there is a problem here. It would help if you had a concrete example of an optimization that you think the compiler should be able to perform, but that would be unsound when combined with Pin::get_unchecked_mut. (Or any other kind of concrete example.)

If I understood correctly, then I think what you are saying is related to the distinction between "validity invariants" and "safety invariants" that I explain in detail in this blog post. &mut has a validity invariant (which, interpreted broadly, includes a bunch of noalias guarantees) and no function is ever allowed to violate this (or else UB). It is not okay to write documentation that says "oh and the mutable reference returned here violates the validity invariant". When doing optimizations, the compiler exploits the validity invariant. In contrast, it is okay to temporarily break the safety invariant, since that is not an invariant that the compiler even knows about.

You seem to be concerned that Pin::get_unchecked_mut, in its documented postcondition, says it returns something that violates the validity invariant of its return type. I agree that would be a huge problem and should never be done! However, I don't think it is the case. Yes, we have to sort out the interaction of self-referential generators and &mut, but that is not really related to get_unchecked_mut as far as I can tell.

Specifically, IIUC @ia0 is claiming that address stability should not be part of the validity requirements of for<T: *> &mut T, and the compiler should be able to optimize assuming that this is true; for instance moving a small type into more local memory (or registers), operating on it there, and only writing it back at the end — basically, write coalescing without preserving reference stability — and that Pin::get_unchecked_mut makes such assumptions about &mut T unsound.

More generally, though, I believe @ia0 is arguing that removing safety invariants from a type via documentation is a "ticking bomb," because a future smarter compiler (or library version) could upgrade safety invariants to validity invariants. "The pointee doesn't care about its address" is a safety invariant of &mut (via mem::swap), and Pin::get_unchecked_mut returns (and holds internally) a &mut which does not uphold the safety invariant.

So I think it's potentially worth breaking invariants into three categories:

  • This is a validity invariant today; breaking it at any point is UB.
  • This is guaranteed to only be a safety invariant; you can break and restore it if you're very careful.
  • This is a library validity requirement; no language UB will immediately occur if you break the invariant temporarily, and you might be able to restore it as if it were only a safety invariant, but it may be upgraded into a true validity invariant in the future.

What would be a concrete example of such a transformation? When I say "concrete" I mean concrete, as in, please show me some code before and after optimizations. :slight_smile:

A library surely couldn't do that; validity invariants are defined by the compiler. A compiler could, but that would be a breaking change in the language spec.

When a library says it is UB to do something, then that is library UB. Whether or not this is also language UB is an implementation detail of the library that can change in the future, so that is not something clients can build on either way. Only when libraries explicitly document that certain invariants can be broken temporarily, then that is okay to do (like we did with str and its UTF-8 requirement).

&mut is a primitive type though so there's not really a notion of "library UB" here. There still is, of course, both a safety invariant and a validity invariant, and they are not equivalent. (The only way I know to precisely express the safety invariant is using Iris and a lot of heavy PL technology.) When we one day fix the validity invariant of &mut then that becomes a stable guarantee that it will not ever be any stronger than this, since strengthening it later would be a breaking change.


A library could upgrade a safety invariant to a validity invariant:
Imagine that before Rust 1.25 there was my_lib::NonNull with a safety invariant of not being null. Then, in a new version soon after std::ptr::NonNull is stabilized in Rust 1.25, my_lib::NonNull changes to wrap std::ptr::NonNull, and thus the safety invariant becomes a validity invariant.
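A minimal sketch of this upgrade, with invented names standing in for the hypothetical my_lib::NonNull before and after the change:

```rust
// Hypothetical pre-1.25 definition: non-nullness is only a safety
// invariant documented on the constructor.
mod my_lib_old {
    pub struct MyNonNull(pub(crate) *mut u8);
    impl MyNonNull {
        /// Pre: `p` is non-null.
        pub unsafe fn new_unchecked(p: *mut u8) -> Self {
            MyNonNull(p)
        }
    }
}

// After wrapping std's NonNull, the same invariant becomes a validity
// invariant: a null MyNonNull is now immediate UB, not merely a broken
// library contract.
mod my_lib_new {
    use std::ptr::NonNull;
    pub struct MyNonNull(pub(crate) NonNull<u8>);
    impl MyNonNull {
        /// Pre: `p` is non-null.
        pub unsafe fn new_unchecked(p: *mut u8) -> Self {
            MyNonNull(NonNull::new_unchecked(p))
        }
    }
}

fn main() {
    let mut x = 0_u8;
    let old = unsafe { my_lib_old::MyNonNull::new_unchecked(&mut x) };
    let new = unsafe { my_lib_new::MyNonNull::new_unchecked(&mut x) };
    assert!(!old.0.is_null());
    assert_eq!(new.0.as_ptr(), &mut x as *mut u8);
}
```

The API is identical before and after, which is exactly why a client who relied on temporarily breaking the old safety invariant would silently become UB.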


I am still not sure I agree there is a problem here. It would help if you had a concrete example

Yes, I'm not claiming there is a problem now, just that this pattern might eventually lead to problems (unless Rust gives up on future optimizations). I don't have a concrete example on real hardware because I don't know exotic hardware well enough, but I think it's reasonable to assume some hardware with explicit control over the CPU cache (to get better performance guarantees), the catch compared to existing CPU caches being that this explicit cache has its own address space. Programs would need to explicitly move their data to the cache before a tight loop and then move it back after. Knowing that something is behind a &mut in a tight loop might make it profitable to keep the structure in cache. It's fine to temporarily move the structure because nothing depends on its content being at a specific address. This example may be far-fetched, but I'd say it's still a glimpse into what could go wrong.

I think what you are saying is related to the distinction between "validity invariants" and "safety invariants"

Thanks for the blog post! Yes it's related indeed:

  • Validity invariant = [T].opt
  • Custom safety invariant = [T].doc
  • Rust safety invariant = [T].typ
  • You argue that [T].opt should be different from [T].doc, but I think that's because you assume the invariant is the same at all program points (which, I guess, comes from the fact that in your post the safety invariant is apparently just for safe code). I'd argue that between updating the pointer and the length when growing a vector, you still have an invariant; it's just more complex and involves local variables too. The compiler could be allowed to use that invariant, but it would need to reverse-engineer it, since it's just documentation, not a type, and compilers don't see documentation.
  • You say we should "mak[e] as few things valid as we can". I agree, but this actually means taking opt = typ since that's the lower bound. Otherwise you are forever forbidding sets of optimizations.

Actually let me fix what I just said:

  • Validity invariant is a new concept [T].val
  • We have [T].val ⊆ [T].opt and the difference between both is what you call the wiggle room. We want programmers to obey [T].val but the compiler to only see [T].opt, such that we can move either bound towards the other to either allow more optimization (opt going down) or more flexibility (val going up).
  • Currently we only have [T].typ ⊆ [T].val and [T].doc ⊆ [T].val. My argument was based on [T].typ = [T].val because it doesn't forbid optimizations. But if Rust decides that giving up on those optimizations in exchange for more flexibility is worth it (a trade-off which, I think, comes from Rust not having a good subtyping story, the source of this tech debt), then that makes sense. But it's very important that [T].val be very well documented, so that [T].val ⊆ [T].opt holds at all times, otherwise we lose compiler correctness.

So to sum up we have this lattice:

typ ⊆ val  // diff = how much optimization/flexibility we forbid/allow forever
val ⊆ opt  // diff = how much optimization/flexibility we may enable in the future
doc ⊆ val

we have to sort out the interaction of self-referential generators and &mut, but that is not really related to get_unchecked_mut as far as I can tell

I think it's related, because whoever looks at the definition of Pin<&mut T> sees the &mut T without any language annotation, unlike MaybeUninit, which is another example of extending a type. This &mut T is the same as the result of get_unchecked_mut. It already violates the type invariant (though apparently not the optimizer invariant, a.k.a. the validity invariant, since &mut T is not even noalias there and there is this concept of reference stability).

EDIT: Forgot to mention that MaybeUninit doesn't only do the lang-item better: the assume_init function also has a pre-condition, not a post-condition, so it doesn't expose a non-lang-annotated extended type to the world.

That sounds like something that we can express in current Rust. It sounds a bit like take_mut, in fact? But I guess you are saying this is take_mut without the user actually explicitly having written take_mut. I now get what @CAD97 meant when talking about a lack of address stability.

TBH I don't think we can do such optimizations in Rust. You can already do println!("{:p}", mutable_ref) and get a guarantee that this prints an address that does not change under reborrowing etc. Rust is too low-level a language to allow optimizations like that, in my opinion.


It sounds a bit like take_mut

Yes, that's take_mut. But any function not allowed on a pinned &mut T is a lost optimization opportunity. Maybe a swap example would be a memory architecture with hot and cold memory, where the hot memory is much smaller than the cold one but much faster. So the compiler might want to swap between hot and cold as needed.

TBH I don't think we can do such optimizations in Rust

Today, most probably. The question is whether we want to forbid them forever. In theory, in a language that is both optimized and flexible (i.e. has powerful subtyping), you always have doc ⊆ typ = val = opt. In such a language, whenever you realize you need a new type (e.g. because some particular doc comes up often in practice and you found a way to type-check and infer it), you introduce this new type with its associated subtyping rules. Old code continues to work, and both old and new code get the maximum optimization they can (old code can always start using the new type for more optimization, and it can do so incrementally thanks to subtyping). In such a language, to allow full expressiveness using doc, you need enough super-types all the way up to the top type, and so for all types.

AFAICT a part of the pointer lattice in current Rust looks like:

```
        *mut T
       /      \
NonNull<T>   Pin<T>
       \      /
        &mut T
```
At any point, you can attach some doc to lower/restrict the type.
(And you ideally choose the lowest such type to avoid proving too many things).

In Rust, because in practice programmers didn't obey doc ⊆ typ (either because there wasn't a super-type to restrict, or because it was too high in the lattice and thus too painful to use, with too many properties to prove), we ended up having to add the notion of val so that we can keep doc ⊆ val. (The fact that val ⊆ opt just reflects that compilers are still racing for optimizations; their work is not over, and as long as they know they race toward val and not typ, we're good.) And this notion of val forbids the optimizations that lie between typ and val.

I guess the situation is not that bad for a few reasons:

  1. Optimizations this early in the language design of Rust are not that important.
  2. As long as compilers know they shouldn't optimize more than val (i.e. val ⊆ opt) we're good.
  3. In the future, we might find a solution to introduce new types to replace the old ones?

What I am saying is that such an optimization would already be wrong, since it changes the output of println!("{:p}", mutable_ref).

The problem/challenge with low-level languages like Rust is that you have to basically pin down everything to let unsafe code do the crazy things it needs to do. So the thing high-level languages do, where you have a super abstract language model and you can "discover" more optimizations later -- that's basically impossible. Optimizations are always bounded by what is "observable", and high-level languages make it impossible for you to observe most low-level things -- but in Rust, almost everything is observable.

So, we basically have to decide which optimizations we want, and forbid the rest forever. That may sound grim, but consider the dual, from the unsafe programmer's perspective -- the more wiggle-room we leave the compiler for more optimizations in the future, the less things unsafe code can do. It's already super hard to make sure unsafe code is correct against a concrete language model; if you add tons of wiggle-room then at some point it becomes just impossible. Higher-level languages get to move the "optimization frontier" because they have no unsafe code; in Rust, we have an inherent conflict between "allow more optimizations" and "allow more unsafe code", so just leaving these choices open is not a great option.


The problem/challenge with low-level languages like Rust

I think that's where our visions differ. Rust is the first language that claims to be both a high-level and a low-level language (as far as I can tell). It does so by providing a lattice of types to choose from and thus continuously navigate between low- and high-level programming as needed. Let's simplify by looking at just 2 pointer types:

  • *mut T: This is the low-level pointer type. You don't get much optimization but you have more control.
  • &mut T: This is the high-level pointer type (at least I initially thought, I'm realizing with this post that this pointer type is actually quite low-level and Rust doesn't really have a high-level pointer type). You get more optimization but you have less control.

The programmer can choose the right type depending on what they need and convert between those types as needed. The language doesn't impose anything on its programmers.

in Rust, we have an inherent conflict between "allow more optimizations" and "allow more unsafe code"

the more wiggle-room we leave the compiler for more optimizations in the future, the less things unsafe code can do

Those statements are true if the language provides only a single type. But by providing a lattice of types, the language lets the programmer decide what they need: high-level (optimization) or low-level (control). The more types the language provides, the better the granularity to choose among and thus the fewer properties to prove manually (there's of course some design challenge regarding polymorphism, but bounded subtyping should help).

So yes, when looking at &mut T, Rust looks like a low-level programming language, because it uses this single type to mean multiple things (namely a high-level &mut T and a lower-level Pin<&mut T>). This is because Pin<&mut T> has a low- to high-level conversion function (namely get_unchecked_mut) with a post-condition instead of a pre-condition (other types don't do that), which means the low-level type leaks into the high-level one, so you get [Pin<&mut T>].typ ⊆ [&mut T].val.

I think this bug is specific to Pin, because the other types that Rust added in existing lattices (like NonNull<T>, MaybeUninit<T>, etc) have correct conversion functions:

  • Going from high- to low-level is safe.
  • Going from low- to high-level is "unsafe" and requires a pre-condition that the argument is already in the high-level semantics.

I think it's fine to say that this is too late to fix &mut T now. It has become a monster where typ is high-level but val is low-level (curious to know other types with typ ⊊ val). It is still possible to add a new high-level pointer type with typ = val though if needed.

But in the future, I think it is important to try to keep doc ⊆ typ = val = opt (in particular typ = val) and add new types as needed with correct conversion functions.


In terms of validity invariants, &mut T and Pin<&mut T> are identical. In terms of safety invariants, neither is a refinement of the other: you can't safely convert from &mut T to Pin<&mut T> (because unsafe code is allowed to assume that a type once pinned is forever pinned), and you can't safely convert from Pin<&mut T> to &mut T (because then you could move it). So I don't see where you get one being higher-level than the other.

In a language with mutation, I don't think it makes sense to put so much weight on the distinction between preconditions and postconditions. In Rust, whenever you have an unsafe operation that returns a reference, there's often some kind of safety postcondition involved.

That includes the mere act of unsafely converting a raw pointer to a reference, if the raw pointer is something that safe code can deallocate. For example:

```rust
fn main() {
    let x: Box<i32> = Box::new(1);
    let x_ptr: *const i32 = &*x as *const _;
    let x_ref: &i32 = unsafe { &*x_ptr }; // line 4: the only unsafe code
    std::mem::drop(x);                    // line 5
    println!("{}", *x_ref);               // line 6: UB
}
```

Line 6 is UB, but wouldn't be UB if not for line 5. However, the only unsafe code is the conversion on line 4, so the UB has to be attributed to the conversion. That's equivalent to saying that the conversion has the safety postcondition "don't access the returned reference after deallocating it".

When it comes to library rather than language safety invariants, one example is Rc::get_mut_unchecked, an unsafe method that returns a reference with the postcondition that "Any other Rc or Weak pointers to the same allocation must not be dereferenced for the duration of the returned borrow." Another similar example is RefCell::try_borrow_unguarded.

So far, all of my examples are of the form "don't access the reference at all after X invalidation event", because these sorts of postconditions are most common. For them, there is a way to argue that they're not "real" postconditions. In my code snippet above, you could argue that println!("{}", *x_ref); extends the lifetime of x_ref at compile time, and hypothetically, the UB would be caused immediately at the point of conversion, because you're converting to a reference type with too long of a lifetime. But that reasoning is only valid if lifetimes affect runtime semantics, an idea which has been repeatedly considered and rejected, on the grounds that it would be too unpredictable and difficult to reason about for unsafe code authors.

Regardless, there are other examples of the form "don't mutate the returned reference in X way". For example, given an &mut str, you can call the unsafe as_bytes_mut to get an &mut [u8]. But this has the postcondition:

The caller must ensure that the content of the slice is valid UTF-8 before the borrow ends and the underlying str is used.

You might argue that this method shouldn't exist, and you should be forced to use raw pointers instead. But what would that gain? Being forced to use raw pointers would mean you couldn't use any of the normal mutating methods on &mut [u8] or &mut u8 – say, bytes.fill(42) or bytes[2] += 1.