Unsoundness in `Pin`

Hm, good point. As far as I can tell, all of these trait impls also have to be sufficiently well-behaved.

But this means our docs for Pin::new_unchecked are incomplete! They should list all the trait impls that users have to promise are "well-behaved" (whatever that means). So in particular, given the derive, we should explicitly list Clone, Hash, PartialEq, Eq, PartialOrd, Ord. Your proposal doesn't free users from having to check all of them, it just changes when they have to check it and how this is expressed in the type system.

Clone is the only one here that returns a new Pin... but also the others get access to &Ptr<T> when really everybody should only be able to access Pin<Ptr<T>>. I cannot come up with a solid correctness argument here, but I also cannot come up with a counterexample.

I don't understand the question. What if the way we implement Vec is unsound? Then we have a bug. But the bug cannot involve reasoning about surprising trait impls, because there is nothing in the correctness argument that would interact with the trait system in interesting ways.

The marker traits make arguing for correctness much simpler by excluding a huge and subtle subsystem of Rust entirely from the discussion. Of course, it doesn't make bugs impossible. Nobody would say that...

Part of why this feels so different for me is likely that we actually have a framework for formally verifying things like Vec or the old separate-ptr-type pinning API. We do not have a formal framework for verifying anything that interacts with the trait system in interesting ways, like Pin. So there is a huge cliff here in terms of what is formally understood and what not, and for me, that makes a big difference. Also empirically, I feel this bug shows that there is a complexity cliff here. This is not to say that it is impossible to get the complex thing right, just that is is harder.

Now I am even more confused. I agree that stating abstractly the requirements is hard, but adding the unsafe marker traits would make it much simpler! Then we can concretely refer to them in all Pin methods that need to rely on anything.

This is exactly why I am proposing the unsafe marker traits -- they make these requirements basically "fall out" of a solid correctness argument for Pin.

The fact that not even you can state the abstract correctness argument is a big warning bell IMO. I thought you had a very concrete idea for those in your mind and we only got a small part of that out for the new_unchecked docs. Had I known that I would have opposed stabilization -- we shouldn't stabilize APIs of which we cannot state the soundness requirement.

De facto, right now Pin for user-defined ptr types is "unstable" in the sense that the contract for that is in flux and might change -- and there likely exist multiple, mutually incompatible interpretations, such that code relying on one combined with code relying on the other is unsound. We have seen that before (Leakpocalypse is an instance of this).

Interesting. On first sight, this is a good argument for @withoutboats' proposal to rule out certain DerefMut impls.

8 Likes

18 posts were split to a new topic: Transitiveness in Pinning

I'd definitely like to see the "correctness requirements" stated under both models. I think it's a bit incomplete here to say that "you can just reference the unsafe PinSafeDerefMut trait" -- we still have to write what the conditions on that trait are.

I did find one thing interesting in this quote by @withoutboats:

The key here is that you have to "prove there is no way" to do something. This does sound a bit tricky. It seems like in both cases we'd prefer to be able to enumerate the ways that something might happen and prove that, in those ways, it does not. It also helps to explain why fundamental is so key here, since we are reasoning about the absence of impls, and that is much harder to do with a fundamental type like &T.

1 Like

Getting &Ptr<T> is safe for most pointer types, but dangerous for Rc. If you could go from Pin<Rc<T>> to &Rc<T>, you could clone it to get a new unpinned Rc<T>, drop the pinned version, then (if there were no other references) use Rc::get_mut to get &mut T.

However, in most cases you can't add your own impls to Rc<T> because it's not marked #[fundamental]. And the existing impls of the relevant traits for Rc<T> are safe because they delegate to the impls for T, passing on only &T rather than &Rc<T>.

But the coherence rules are more flexible for traits that have type parameters! And that includes the comparison traits, like PartialEq. You can write an impl like:

impl PartialEq<Rc<RemoteType>> for LocalType

or even

impl PartialEq<LocalType> for Rc<RemoteType>

though the latter is legal only if RemoteType is non-generic (but that's due to change in the future).

Combine that with the impl for Pin:

impl<P, Q> PartialEq<Pin<Q>> for Pin<P>
where
    P: PartialEq<Q>,
{
    fn eq(&self, other: &Pin<Q>) -> bool {
        self.pointer == other.pointer
    }
    // [...]
}

And sure enough... here is a new exploit, for what I'll call Method 4:

Playground link

This exploit cheats in one respect. There's nothing in the standard library that actually depends on the pin guarantee of Pin<&T>; there's only futures, which require a mutable reference. So the exploit includes a copied-and-pasted version of @withoutboats' pin-cell crate. This crate exposes a variant of RefCell which works with Pin: given a Pin<&PinCell<T>>, you can get Pin<&mut T>, with uniqueness being checked at runtime. The implementation uses unsafe code, but the interface is intended to be compatible with the design of Pin.

Without pin-cell, you can still observe move-before-Drop, but no dramatic crashes.

(The reason I copied and pasted pin-cell instead of just depending on it is that the playground doesn't have it available. The exploit does not depend on PinCell being crate-local.)

I'm not sure how to fix this hole. A PartialEq impl between an Rc<T> and a non-Rc type is a little strange, but it's not obviously insane, unlike the impls involved in the other exploits. It seems considerably more likely that banning such impls altogether would break real code. So I think we should go with the unsafe-marker-trait approach, at least when it comes to PartialEq and friends (we could still ban crazy impls of DerefMut and Clone). We'd have to add some sort of bound to either Rc::pin, which is currently implemented for all T: Sized, or to Pin's PartialEq impl. Not sure which is better.

14 Likes

Nice work!

This whole situation is really quite a mess.

I think (one of) the main takeaways from this is that using derive on Pin is a really bad idea. In addition to adding bounds to Pin::deref and Pin::deref_mut, we need to ensure that every single trait impl (which can access the private field of Pin) is either completely harmless (e.g Copy) or relies on an unsafe guarantee that the user-supplied impl is not evil.

Specifically, I think that we should not try to fix this by adding additional restrictions to unrelated types/methods (e.g. Rc::pin). If we're relying on all potentially problematic std pointer types opting "out* of bad behavior, I think we're virtually guarantees to miss something. Even if we catch all current problems, anyone adding a new ats pointer type will need to ask 'Does this somehow completely break Pin).

I think the right solution is to inspect every method or trait impl on Pin that accesses the private field. We need to ensure that there is either no way of letting user code subvert the guarantees of Pin, or that we have an unsafe marker trait which asserts that the user impl is sane.

I'm not sure how we can do this while keeping Pin ergonomic for the majority of users, and (hopefully) avoiding an explosion of PinSafeX traits for every that X that Pin has a (module-local, can access the private field) impl of

4 Likes

The new example of @comex does not break the obligation stated by @withoutboats. All the Pin<&mut T> in that example are immediately polled and dropped. Nevertheless, it clones a Rc out of a Pin before dropping the pin. Should Pin guarantee existence on immutable pointers? PinCell assumes such guarantee, but the posed proof obligation says otherwise.

1 Like

There's also a guarantee that the memory location will not be reclaimed before running Drop. I don't think anything in the std library relies on that, thus the requirement of using PinCell to observe the unsoundness.

As a semi-general workaround, we could instead implement Hash, PartialEq, Eq, PartialOrd, and Ord when T: Trait instead of when Ptr<T>: Trait, using the Deref implementation manually. This would limit the places we expose &Ptr<T> to only Deref, DerefMut, and Clone. This makes both approaches (additional proof obligations on new_unchecked or an added unsafe trait) a bit more scalable: we could later add trait implementations that manually call deref without having to worry as much about the trait implementation being correct. After all, we already have impl<P: Deref> Deref for Pin<P>.

I suppose there are places where this would create a change in behavior, but I can't see a useful example of that: even tricks like pointer equality can still be used, since the underlying PartialEq impls still get references.

(I'll throw my hat in to say that I much prefer the obligations being on new_unchecked, i.e. at construction time, instead of being at use time: my fuzzy intuition is that this places the minimal burden on users of Pin, since they don't need to follow any complicated unsoundness decisions to understand what this weird PinDeref trait is.)

7 Likes

Would this even change stable std behavior? (I know it could break custom pin-capable pointers.) IIUC, the only pinned types that you can build soundly are Pin<&>, Pin<&mut>, Pin<Box>, Pin<Rc>, and Pin<Arc> (and any impl Deref<Target: !Unpin>).

All of these implement {Hash, PartialEq, Eq, PartialOrd, Ord} by delegating to the deref target.

I guess the breakage could be observed via a smart pointer wrapper that implements PartialEq as identity equivalence rather than structural equivalence. Then, for T: Unpin, we'd have P: PartialEq while T: ?PartialEq, and the two impls don't give the same result.

Well, it's breaking, but this seems like to me the best way to bound the guarantees required by Pin::new_unchecked. Pin only needs to interact with Deref(Mut) and Clone to do its job; the other impls are really just convenience impls that turned into an attack vector.

2 Likes

That said, we control the full set of trait implementations on Pin (that can access &P<T>). To be less breaking, we can just list all trait implementations that convert from &Pin<P<T>> to &P<T> rather than Pin<&P<T>> or &T, and require that they always treat the &P<T > as if it were Pin<&P<T>> if it's going to be used with Pin::new_unchecked. That handles the trait leakage in theory, even if it still requires reasoning about external impls (and doesnt handle the impl for #[fundamental] or CoerceUnsized cases).

I think due to the pain of reasoning negatively about external impls (especially as we relax coherence rules!) we should avoid negative reasoning of such as much as possible. Unfortunately, Pin's guarantee is intherantly negative, so that's difficult.

I think my current preference is to tighten up all impls except Copy/Clone/Deref/DerefMut to directly apply the deref first, prevent nonsense impls of Deref/DerefMut/Clone on #[fundamemtal] types, and be very careful with CoerceUnsized.

Hi, author of stable_deref_trait here.

When originally creating the crate, I chose the gaurentees based on a review of the existing crates owning_ref, rental, and shawshank, and which invariants they appeared to be relying on. The fact that you can impl DerefMut for &T means that shawshank is very likely unsound, and owning_ref and rental probably are as well (which is more alarming, since owning_ref is a much more popular crate).

Regardless of what else we do, I think we should prevent impls of DerefMut on &T. The fact that this was even possible came as a surprise to me, and presumably to the authors of many other unsafe crates as well.

9 Likes

Has there been any work on formalizing Rust's trait system? Or some "well-known" reasons why it's impossible or unwise to do so? I didn't see anything about traits in RustBelt.

Wow, you managed to weaponize this, nice. :smiley:

I think this confirms my suspicion that we are making assumptions about every one of these traits and that so far, we do not have a clear handle on what these assumptions are -- let alone how to enforce them. I also see it as another argument for using unsafe marker traits, to make these assumptions more explicit.

Agreed. Everything that has direct access to the private field of Pin is highly suspicious.

I'd even consider moving out code like this that just safely composes the core API into a separate module, to make sure it doesn't access said field. But there might not be enough code to warrant this.

I don't think it is inherently negative. I mean, we have a WIP positive formalization of it in Coq. It's just very hard to turn that into something intuitive, into something that can be explained without all the formal boilerplate around it.

Interesting proposal, I like that idea. If we go with unsafe marker traits, this would mean we need fewer of those.

This goes well with my proposed module separation: we could impl PartialEq for Pin (and all the others) outside the module that has access to the private field of Pin. That can be done no matter whether we put the obligation on new_unchecked or on as_mut.

By users, I assume you mean people using Pin with existing smart ptr types? For them nothing changes with my proposal as the existing pin-supporting smart ptr types will implement the right trait bounds. I doubt many users will write code that uses Pin and is generic over the smart ptr type.

Then there are also those users that use Pin with their own smart ptr type. I do not know how common that is. But for those users, I think the "complicated unsoundness decision" are crucial for them to understand to make their use of the API sound. I think explicit unsafe marker traits will help with that.

No and no, to my knowledge. It's just hard. But I know several research groups have an interest in doing this. We are just all still busy with the borrow checker. :wink:

Of course, then combining the trait system with reasoning about unsafe code is yet another totally different game.

5 Likes

Yea I agree with this: I think we should "shore up" our pointer types so they work the way we believe they would work (specifically the issue with references) and then continue to place the burden on whoever implements constructors to pinned pointers.

So, this is exactly the sort of thing I was worried about. In my opinion probably Pin::new was too ambitious and we should not have any safe constructor for Pin generic over P. Deprecating Pin::new will be a bummer, though. Possibly we should have a marker trait for Pin::new implemented by all our pointer types. Just one marker trait though, which asserts correctness for all of the behavior of pin.

1 Like

I don't see how Pin::new is at fault here. It's the PartialEq impl for Pin that is wrong. It unsoundly creates a &Rc<T> from a &Pin<Rc<T>>, and we should prevent it from doing so.

What would be the contract that Pin::new violates here? Its only "fault" is the bad combination of Pin<Apple<T>> with Pin<Rc<T>>, but these could be defined in independent parts of the crate tree. We'd have to change our contract for Pin::new_unchecked to cover PartialEq for all combinations of all pinning smart ptr types. Surely that is not a solution.

Instead, we have to require, in impl PartialEq for Pin, that the concrete PartialEq impl we are using is well-behaved. That is the assumption that we are missing currently. (Or, as has been proposed above, we could use Pin::as_ref to step over the smart ptr PartialEq and directly use the one for T, which does not require any extra assumption in the PartialEq beyond what as_ref needs.)

I think blaimg Pin::new here is a big mistake, and will not lead to a reasonable safety contract. We end up having to reason about a combinatorial explosion of any two pinning ptrs that might be compared with each other.

The issue here is making a promise for a trait that's not a super-trait of the unsafe trait. Like, a DerefPin trait that wants to make a promise about the DerefMut impl for the same type if it exists -- we couldn't currently legally implement that trait for &T as we have seen. This is exactly what underpins the soundness holes we saw here.

7 Likes

Suppose we agree to change the PartialEq etc. impls so that we only depend on the correctness of DerefMut and Clone, and we also agree to somehow make &mut T: Clone and &T: DerefMut affirmatively impossible. (Both of these sound like good ideas to me.) And then the remaining question is whether to also add unsafe marker traits.

What would the contract for those traits be? "Don't try any funny business"? Is there any contract we could spell out which it would be possible to violate; is there any example we could point to of what not to do? Having to write unsafe impls to certify that you're not doing anything which, to the best of our knowledge, isn't even possible to do, feels like a faintly bizarre state of affairs to me.


My second thought is a bit fuzzier. Some time ago there was a debate over whether to add traits such as unsafe trait TrustedOrd: Ord to the standard library, with the purpose that data structures like BTreeMap would be able to assume that these traits were correctly implemented, allowing simpler and more efficient implementations of methods for the data structure, instead of having to account for the possibility of pathological impls like ones which invoke random(). The outcome of this debate (which makes a lot of sense to me) was that we shouldn't, because the purpose of Rust is not to be able to say "well, it's your fault -- you wrote unsafe!" when something goes wrong, but to reduce the risk of things going wrong outright.

These situations are very different. But a common element is the unsafe trait -- it strikes me that adding an unsafe trait is a very easy way to obtain a system which is sound in the technical sense, while not necessarily having an impact on the likelihood of code going wrong.

(In the extreme, we could add a marker trait unsafe trait Safe, remove all other unsafe annotations, bound all methods by where Safe<()>, and make the contract of Safe be that it may only be implemented for () if one's program is safe! This would be sound, but not in any useful way.)


Above, Ralf wrote:

I don't know if this is true, but it's at least a good motivation. If we can formulate sensible and comprehensible contracts for these traits, and believe that their presence will reduce the risk of people writing programs that go wrong, then let's add them. Which circles back to the first part of my comment, which was, "can we?".

4 Likes

This made me think about unsafe fields. It wouldn't have prevented this problem, but having

pub struct Pin<P> {
  unsafe pointer: P
}

would have required manual implementations of all relevant traits on Pin, with an unsafe block. This makes them extra visible when auditing the code, and preferably requiring an explaination of the impls safety. But even then, the problem is still difficult as

impl<P, Q> PartialEq<Pin<Q>> for Pin<P>
where
    P: PartialEq<Q>,
{
    fn eq(&self, other: &Pin<Q>) -> bool {
        unsafe { self.pointer == other.pointer }
    }
    // [...]
}

doesn't look very dangerous to me, and the subtle details of this thead easily overlooked.

5 Likes

Is it feasible to start by determining some non-minimal, but sufficient conditions for Pin safety? We can then backwards-compatibly determine more permissive sufficient conditions until we find the necessary conditions.

3 Likes

I'm concerned that this would make Pin much more confusing when custom pointer types are involved. Currently, it's possible to treat Pin<P<T>> (in safe code) as 'just' a restricted version of P<T>. There are methods that you cannot call for !Unpin types, like Pin::get_mut, but the methods that you can call just pass through to the underlying type.

I think this is a good thing - it shouldn't be necessary to understand the details of Pin if you're just using a Pin in safe code. A Pin just (conceptually) acts as an another wrapper type, like Box, Rc, Cell, etc.

Making trait impls bypass the wrapper type would (as far as I'm aware) make Pin unique as far as ats wrapper types go. Of course, Pin is already unique in terms of its constructor requirements - but this only affects people using the new_unchecked constructor. I'm worried that this will create problems with custom wrapper types with unusual (but valid) impls of PartialEq and other traits. Users will essentially have to remember, 'wrapper types usually just pass through trait impls, except for Pin, which does something different'.

I think we should aim to make using an existing Pin as straightforward and unsurprising as possible. The 'weirdness' of Pin should only be noticeable when you need to use new_unchecked, or make a custom wrapper type for use with Pin These are the cases where you actually need to understand what Pin means.

4 Likes

Yes. For Deref and DerefMut, the contract is something like "if the argument is in the pinned typestate, then the result must be in the pinned typestate, too". We'd obviously need a bunch of docs to explain all this typestate business, but there is a contract here, and we even have a rough formalization of it in an under-development Coq mechanization.

This is exactly what "method 1" violates: it returns something unpinned even if the argument is pinned.

For Clone and PartialEq, I don't know the details yet, but I expect them to be similar.

The comparison does not actually work. With BTreeMap, we obtain a data structure that is correct under more circumstances, i.e., we actually reduce the proof burden on clients. On the other hand, the "coherence-based assumptions" proposal (the one where Pin::new_unchecked and friends impose requirements on traits they don't depend on) does not mean that a user has to prove less than they would have to prove with an unsafe marker trait. The things I said above about Deref preserving the pinned typestate all still apply in this proposal. The only different to unsafe marker trait is the part of the pinning API where the proof obligation has to be checked by the other:

  • With unsafe marker traits, the proof obligation arises when calling Pin::as_mut or Pin::clone or one of the other methods that actually invoke the trait method that is mentioned in the proof obligation.
  • With coherence-based assumptions, the same proof obligation arises when calling Pin::new_unchecked (or when otherwise constructing a Pin).

In other words, avoiding unsafe marker traits here does nothing to alleviate the issue you are raising.

I know this is a strawman, but I have to push back on this. Every proposal we discussed here still allows modular reasoning, i.e., you can verify some piece of code once and then make it re-usable as a library. This is the key to making type systems useful, and it is the key for why safe typed abstractions of unsafe code are so useful: you can safely compose everything.

Safe<()> is not an extreme variant of a marker trait, it is qualitatively different by giving up on modularity and composition.

With unsafe marker traits, we are still modular. With coherence-based assumptions, we need a modular way to reason about the trait system, which is exactly what we got wrong so far. We certainly hope that there is a way to reason modularly about the trait system, but this demonstrably is a hard problem. For me it also makes a bit difference that we have no formal guidance whatsoever so far for doing this.

This is a valid concern, but I feel the alternative of having to make up contracts for each of these traits is worse.

5 Likes