Unsoundness in `Pin`

I don't think we should deprecate Pin::new. I just think it has the highest long-term risk!

3 Likes

Maybe it's worth adding a bound specifically for Pin::new, like P: PinPtr (where PinPtr is an unsafe trait) so at least smart pointer types have to opt-in to being constructed safely. There just aren't that many smart pointer types, so it wouldn't be a huge burden.

That's what I proposed earlier, but Ralf made a convincing argument it wasn't a good solution. Even setting aside proofs, consider this: we can't backwards compatibly add new requirements to that trait's contract, which would mean we wouldn't be able to add new impls to Pin that rely on it.

I just spoke with Niko for a bit, and we agreed that the current implementation of PartialEq should be treated as a straight forward bug. It is unsafe in a generic context to go from &Pin<P> -> &P, any place we do this and then pass &P to user defined code, except for Clone and the Deref traits, is a bug. Our attitude was that we should fix the impl to instead forward directly to the target type.

My suggestion was meant to be in addition to the proposed changes to eg. PartialEq. You said you thought Pin::new had a higher long term risk due to the very broad generic implementation, adding the PinPtr bound would mitigate that risk, or at least leave some wiggle room in case further issues are discovered. I suppose it depends on how real you consider that risk to be.

But this returns to the problem with unknown unknowns: we need to be able to concretely express the contract on whatever that trait would be, or it doesn't mean anything. In effect, based on what we know and understand right now, its the same as just putting the contract on CoerceUnsized.

1 Like

True, but:

  1. We have the option to over-restrict that contract: we can always relax constraints later.
  2. If we get our definition of that contract wrong, fixing it will break less of the ecosystem. When we discover our "unknown unknown" we need only find crates that implement PinPtr (probably very few) and audit them as to whether they follow the new contract or not. In all likelihood, the fact that they're "trying" to be smart pointer types it probably enough to make them valid. I expect most danger to come from people "abusing" Deref in some way who don't consider themselves to be implementing a pointer type, and those people would be unlikely to implement the unsafe trait regardless of what contract it has.

Anyway I'm not really advocating either way on this, just thought it was worth mentioning.

1 Like

Well, I'd argue that you're seeing the world through compiler-tinted glasses. :slight_smile: From a user code perspective, "Copy is not implemented" and "trying to assert that Copy is implemented causes a compilation error" are indistinguishable. There's no reason that unsafe code shouldn't be able to rely on this to enforce its invariants. It's just that, due to lifetime parametricity, there's no real use case for doing so. But, for instance, I'd argue that this useless API is sound:

pub struct Wrap<'dummy, T> {
    pub t: T,
    pd: PhantomData<*mut &'dummy ()>,
}

unsafe impl<T> Sync for Wrap<'static, T> {}

pub fn with_wrap_nonsync<T>(
      t: T, f: impl for<'dummy> FnOnce(&Wrap<'dummy, T>)) {
    f(&Wrap { t, pd: PhantomData })
}

pub fn with_wrap_sync<T: Sync>(
      t: T, f: impl FnOnce(&Wrap<'static, T>)) {
    f(&Wrap { t, pd: PhantomData })
}

However, the unsafe code that implements such an API does have to consider how wrapper types deal with subtyping. In the above example, if Wrap were changed to make 'dummy contravariant, the API would be unsound because you could convert from &Wrap<'non_static> to &Wrap<'static>. But of course that's not Rust's fault: references being covariant over their type parameter is part of their contract, and it's the unsafe code's responsibility to take that into account.

I suppose the same applies to Pin. You could write unsafe code that is unsound iff a user takes advantage of Pin's covariance, but that's the code's fault, not Pin's.

1 Like

Here is an example of lifetimes influencing traits, where a non-static lifetime will fail to compile

1 Like

Above I was mostly arguing that for the particular example under consideration, IMO it is wrong to blame Pin::new.

But after trying to convince myself of the correctness of Pin::new, I agree with you: that is highly non-trivial! This seems to rely on coherence as we are arguing about "P::Deref for our current P" when establishing the invariant of Pin<P>. It might be the case that even with unsafe marker traits, we will still need some amount of coherence-based reasoning.

One thing I am wondering about: assuming we use unsafe marker traits but keep Pin::new as-is, is there still a soundness hole?

I tried to come up with a precise invariant for Pin<P> that is defined generically in P, but I think that is out of reach of our current formal models. The issue here is the conceptual mismatch that I already mentioned before stabilization: really we want Pin<Ptr><T> (where Ptr is "higher-kinded": it has kind Type -> Type) instead of Pin<Ptr<T>>. I have some idea for how to define the invariant for the former, but no good handle on the latter.

My best guess for this invariant is something like "the Deref impl, if it exists, returns a shared reference to a pinned P::Target; and the DerefMut impl, if it exists, returns a mutable reference to a pinned &mut P::Target. (We have no Rust type syntax for directly expressing "shared/mutable reference to pinned P::Target"; this is basically PinRef<P::Target>/PinMut<P::Target> from the previous non-generic approach. Those I know how to formally define.) Note how strongly this relies on coherence -- exactly what I wanted to avoid with unsafe marker traits.

2 Likes

We held our first "lang team design meeting" today, and dedicated an hour to discussing this topic. You can find the minutes on the lang-team repo, and there is a recording on YouTube. I don't have time for a detailed write-up, but it seemed like we were centralizing on a solution based around strengthening negative reasoning. In particular, I think with a rough set of actions like:

  • Introduce some (unstable) technique to impl<T> !DerefMut for &T shortly
    • this would prevent the impl DerefMut for &LocalType we've seen in this thread
  • Improve the documentation of new_unchecked to describe the invariants to be proven and to emphasize that, in order to rely on 'negative reasoning', there should be an explicit negative impl (presently only available on nightly, but coherence can temporarily be used to similar effect for non-fundamental types; we don't think this is a good basis long term though)
    • @RalfJung also expects to do a more formal treatment here to increase confidence
  • Introduce or sketch out what it might look like to permit negative impls of this kind more generally
    • this is a larger design effort but we have lots of pre-existing thoughts that can be collated
7 Likes

We had some more conversations about this at lunch at MPI-SWS (the research institute I work at), musing about how we'd formally verify the generic Pin<P>, and also the lang team had a design meeting about this. I'll write down some thoughts mostly coming from the "formal specification" perspective; I think @nikomatsakis and/or @withoutboats will write down some more practically minded notes (or it seems Niko did that while I was still typing^^).

As mentioned in my previous post, the big question for a more formal approach to this is: what exactly is the invariant of Pin<P>? In other words, what is the property that makes some value v safely usable at type Pin<P>? (This is the safety invariant of Pin<P>, so I will say "a safe T" or "safe at/for T" when I mean "a value for which the safety invariant of T holds".) While thinking about this, to my surprise I realized that I had a hard time figuring that out for my proposal based on maker traits, while we actually came up with a reasonably-sounding answer for @withoutboats' proposal relying on coherence.

Marker-trait-based invariant

The basic idea for this is to have three unsafe traits, matching the three traits that are relevant for Pin: unsafe trait DerefPin: Deref {}, unsafe trait DerefMutPin: DerefPin + DerefMut {} and unsafe trait ClonePin: DerefPin + Clone {}.

The key trait is DerefPin, and the most important part of its unsafe contract (that the implementer has to prove) is to pick an invariant for what a "pinned P-ptr to Deref::Target" is. So, for example, the unsafe impl<'a, T> DerefPin for &mut 'a T would have to decide what, logically speaking, a pinned mutable reference to T is. It would pick something like "an exclusive borrow for lifetime 'a of a T that is in the pinned typestate" (this is following the formalism that I laid down last year a while ago in this post and this follow-up). It's as if the DerefPin trait had a "virtual" or "logical" field defining that invariant -- an "associated invariant". Furthermore, the DerefPin unsafe contract would demand that Deref::deref can be called for (pointers to) values satisfying this invariant, and that they will then return a value that is a safe Pin<&Deref::Target>. (Yes this is cyclic, but we can define the notion of safe pinned references in advance -- they correspond exactly to the PinMut and PinShr types that we had before the generic Pin<P> was introduced). DerefMutPin and ClonePin would have similar contracts, referring to the invariant picked by DerefPin.

We would then have to add a trait bound to Pin, like struct Pin<P: DerefPin>, and then we could say that the invariant for Pin<P> is exactly whatever the DerefPin impl picked.

I think this is solid, and it avoids talking about "the impl of this trait for this type", instead only ever referring to impls that are "in scope" through explicit bounds. There's "just" one problem: Pin::new. That function is safe, so we have to prove that it returns a safe Pin<P>, and the only thing we have to go by is that P::Target: Unpin. This tells us that for P::Target's (safety) invariant, the "normal" and the pinned typestate are the same. But on its own that doesn't help a bit, when what we have to do is show that the value we return satisfies the invariant for "this is a pinned P-ptr to P::Target". I guess we could add an axiom to the contract for DerefPin that says something like "further, if Self::Target is Unpin (in the semantic sense, not specifically implementing some trait), then the normal invariant for P implies the pinned-P-ptr-to-Target invariant".

Pin::new_unchecked is fine though, it just puts the burden on the caller to prove whatever P requires to be proven for pinned-P-pointers.

So, overall, this would (formally) require adding a DerefPin bound everywhere, thus breaking everything, and also supporting Pin::new requires an extra axiom instead of just "falling out" of the construction. That seems awkward, and not nearly as clean as I had hoped.

Coherence-based invariant

Turns out stating this invariant is much simpler than I thought. It goes something like this: a value v is safe for Pin<P> if all of the following conditions hold:

  1. If P: Deref, then passing a reference to v to Deref::deref is safe to do (will not cause UB), and will return a value that is a safe shared pinned reference (basically, the return value is safe for PinShr<'_, Deref::Target>, if PinShr was still a thing). PinShr is defined like above, i.e., something like "a pointer pointing to some memory that is in the shared-pinned typestate for Deref::Target".
  2. If P: DerefMut, then passing a reference to v to DerefMut::deref_mut is safe to do, and will return a value that is a safe mutable pinned reference (basically, PinMut<'_, Deref::Target>).
  3. If P: Clone, then passing a reference to v to Clone::clone is safe to do, and will return a value that is safe for Pin<P>. This is recursive, but the recursive occurrence is of the kind that I think is fine. (For the initiated: the recursive occurrence is, I think, in a "positive position", meaning the function of which we need a fixed point is monotone, and such functions do indeed have a suitable fixed point that we can use for our definition.)

With this, whoever calls new_unchecked has to do the appropriate verification of deref, deref_mut and clone for the specific value they use.
This clearly explains what went wrong at least in some of the exploits by @comex: whoever constructs a Pin<&T> has to prove the second clause above, and they did this by assuming it to be vacuously true because &T does not implement DerefMut. Except, they were wrong about this. Similar for Pin<&mut T> and Clone.

This also explains why Pin::new is okay: for Deref::Target: Unpin, the argument is a valid P so clearly calling Deref::deref and the other functions on a shared reference to that value is fine. PinShr<'a, T> is the same as &'a T, so the first clause is satisfied. Also, PinMut<'a, T> is the same as &'a mut T, so the second clause is satisfied. And the third clause is the "inductive case", and thus satisfied by the "induction hypothesis". (Yeah this sounds like magic, but I am pretty sure if we pick the least fixpoint this is exactly what will happen.)

Basically, the observation here is that the only pinned pointer types that "really matter" are shared and mutable references. Everything else only matters insofar as we can call as_ref and as_mut to get a pinned shared/mutable reference, and then go on with that.

However, even with the proposed negative impls for the #[fundamental] types, we'd still rely on the orphan checker to make sure that Rc and Arc don't implement DerefMut. That's probably okay, but it seems suboptimal at best to rely on this for safety -- who knows in which ways we will be able to extend/improve the orphan check in the future. So, I think we should require either a specific impl or a specific "negative impl" when reasoning about requirements like the one for Pin here. Then we don't rely on the details of the orphan rule any more. We'd still rely on coherence, but I would feel fairly good about this overall setup. I actually came to like this better than my marker-trait-based proposal. If this kind of coherence breaks, we have a totally different set of soundness issues...

We (at MPI-SWS) hope to formalize this some time soon-ish (and when I say "we" I really mean another PhD student and an intern, not me :wink: ). We don't want to actually formally verify the trait system, so we'll make some assumptions of the form "the Deref impl for this type is exactly that", or "this type does not implement Clone", but we should be able to verify everything else -- and this also makes it clear what exactly we rely on from the trait system. (One "fun" aspect of this will be that even Pin<&mut T> will be defined like above, so actually verifying the specific operations for that might be non-trivial. What we really want is to know that this is PinMut<T>, but we might have to do something silly like call <&mut T as DerefMut>::deref_mut to get there, even though that's a NOP... well, we will see, that's our problem. :smiley: )

10 Likes

Expanding a bit on what niko and ralf said (thanks for the detailed description of the proof work, ralf), I think the best solution in the long term is to support explicit statements that a type does not and will not implement a trait, and then add the relevant statements for all the pointer types. This would allow ralf's coherence based reasoning without basing anything on coherence - for each type that doesn't implement DerefMut or Clone, they can point to the specific line of code that guarantees the type does not implement that trait.

There are reasons to support this feature unrelated to pin (in fact my first big RFC was proposing this back in 2015(!!)), but we don't have to go as far as changing coherence reasoning just yet.

Someone with more knowledge of coherence internals needs to review this, but I think we just need to make this change to the language: today, with the optin_builtin_traits feature turned on, you can add negative impls but only of auto traits. We need to remove the restriction to auto traits, and allow negative impls of all traits when that feature is turned on (so not on stable yet). Then, in std, we add these impls:

impl<'a, T: ?Sized> !DerefMut for &'a T { }
impl<'a, T: ?Sized> !Clone for &'a mut T { }
impl<T: ?Sized> !DerefMut for Rc<T> { }
impl<T:  ?Sized> !DerefMut for Arc<T> { }

This would completely close the fundamental based hole and allow Ralf's team to prove the correctness of the pin APIs with no reliance on the orphan rules, without any change to the APIs themselves. Tada!

It also provides a path toward better supporting users who want to do the kind of coherence based reasoning that has run pin afoul. Users who want to reason about types not implementing traits will eventually have support for explicitly stating that the type does not implement that trait, which they are required to do if they want to create a provably safe abstraction based on that fact (which is resistant to any changes to the orphan rules).


This leaves only the problem with CoerceUnsized, which we didn't talk about today. tl;dr: I think we should just make coerceunsized an unsafe trait. There's a larger discussion about how its a distinct problem and justifies this different solution, but let's focus on the other problem for now.

12 Likes

Right, CoerceUnsized remains open. I don't actually understand that trait well enough to say much here. If we make it unsafe, then what exactly is the contract that unsafe impls have to satisfy? Also, doesn't the compiler automatically implement this trait for tons of things, like i32 -> dyn Send? What is the argument for why all these automatic implementations satisfy the contract?

1 Like

Thanks for the write-up Ralf. Re-reading it was helpful.

I am quite happy with a solution based on negative impls, which are a feature that I've long wanted to pursue, but I want to dig a bit more into the difference between "marker traits" and "coherence-based invariant" for a second.

It seems like the key difference is that, in the coherence-based invariant, we did not really define a "custom invariant" for "what a pinned P is". Instead, we stated a single invariant that all pinned pointers must follow, and it is based around the behavior of the three traits (Deref, DerefMut, and Clone).

The only place where a "custom" pinned invariant comes into play is with respect to P::Target -- that is, you have to show that when you deref P you get back a valid pointer to memory in the shared-pinned typestate for P::Target.

Is that correct?

1 Like

CoerceUnsized is just an alternative API to Pin::new_unchecked, imposing the same invariants. By doing Pin::new followed by a CoerceUnsized to an unsized + ?Unpin type (e.g. Pin<Box<i32>> -> Pin<Box<dyn Display>>) you have a constructed a pinned pointer to a non-Unpin target. Therefore, implementing CoerceUnsized for a pointer type is morally equivalent to creating a safe wrapper around Pin::new_unchecked for that pointer type.

This leaves two (edit: three) options:

  • Make CoerceUnsized unsafe
  • Add an unsafe marker subtrait for CoerceUnsized which controls specifically whether a pinned version of that pointer can be coerced
  • Change nothing about CoerceUnsized but add an unsafe marker trait bound on Pin::new and require types that implement that marker + CoerceUnsized to be valid to coerce pinned.

This is strictly a trade off between API complexity & expressiveness. CoerceUnsized is unstable with no near term plan to change that so its not very important to decide definitively at this moment.

(Also in your final question you're confusing CoerceUnsized with Unsize. CoerceUnsized controls which pointers you can wrap types in to get the automatically generated unsize coercion you're talking about.)

2 Likes

Well... kind-of. :wink: The custom "what is a pinned T" (aka the "pinned typestate") exists in every proposal, that is just part of how to model pinning.

What the unsafe marker trait proposal adds on top of that is picking an invariant for a "P-ptr-to-something-pinned", e.g., a "Box-to-something-pinned" or a "Arc-to-something-pinned". In the coherence-based proposal, that invariant is fixed to be just about the three relevant traits.

(Maybe that's what you meant, but you talk of "a pinned P" was confusing. The pointer isn't pinned, it points to something pinned. Orthogonally to that of course we also have Pin<&mut Box<T>>, at which point the Box itself is pinned, which is entirely independent from a "Box-to-something-pinned".)

Okay, so the first option would basically mean that the safety contract of CoerceUnsized itself talks explicitly about pinning? Like, CoerceUnsized<U>: T would have to contain a clause like:

  • forall v, if v is a safe Pin<U> then v is a safe Pin<T> (maybe I swapped U and T here)

Does that make sense? What is the argument for this being satisfied, e.g., by Arc's instance, and why does @comex' counterexample fail to satisfy this contract?

If we follow that route, we should add this to the CoerceUnsized docs and audit all existing instances.

It's not important to make a call either way, but I think it is important to know the safety contract of a trait that we make unsafe. That's why I asked. Just making a trait unsafe without knowing what exactly the contract is doesn't really help if the goal is to increase our confidence in the entire thing.

2 Likes

For sure! So yea its the same contract as Pin::new_unchecked and we just somehow need to make it so that types which meet the bound for P in both of these two impls have to have an unsafe impl where they assert they uphold that contract.

Yes, this is what I meant. I see why saying "P is pinned" could be confusing, indeed.

Are DerefMut::deref and Deref::deref_mut typos?

3 Likes

Yes, sorry for that and thanks for pointing it out. Fixed.

1 Like