Unsoundness in `Pin`

Why would this be enough?

The compiler knows that DummyTrait is not implemented for the type the user provides due to coherence (assuming the type is defined in the crate with the exploit impl), and so that impl doesn't prevent the exploit impl, no?

Seems like the simplest thing would be to use specialization to have Pin::as_mut and Pin::clone panic in the exploitable cases, which would be backwards compatible (assuming no code relies on this behavior).

Even better, make them unavailable at compile time, although that would require support for specialized negative impls.

1 Like

Here's just the first method without the macro to switch between both; I found that easier to follow.


As you said yourself in the next paragraph, this is not at all about stack vs. heap, and it also has nothing inherently to do with pointers. It is about containers in general and that you can pin a container without pinning its content (whether that content is stored in-line or behind a ptr indirection is irrelevant). And that is a core feature of pinning.

No, I disagree on the causal connection you are drawing here. The proper fix for this complex discussion is support for pinning in the compiler.


I have noticed that as a common point of confusion, but I see no strong connection between that and the safety of as_ref. Back when the API got discussed, I was worried about safe as_ref (mostly because it made the formalism more tricky), but my proposed alternative would still make pinning a non-structural property. I think changing pinning the way you suggested (that everything recursively is pinned) would make it basically useless -- after all, self-referential async fns still want to have the ability to move out of their unpinned local variables.

Whether pinning propagates into inner data and whether as_ref is safe are distinct considerations, the only connection is that one of the four possible combinations (pinning always propagates + as_ref is safe) is unsound. But propagating pinning was never a reasonable option for us (AFAIK).

I also don't see the soundness issue as being caused by not propagating pinning. The soundness issue was introduced when we moved from Pin+PinMut+PinBox (dedicated pinned pointer types) to the generic Pin wrapper, and made that rely on existing safe traits only (Deref+DerefMut).

So, I agree with all of your factual statements but with basically none of the conclusions and causal relationships you are deducing from those. :wink:


The proper fix IMO is to require some unsafe trait in Pin::new and Pin::deref/Pin::deref_mut that is used as an explicit marker that the type and Deref/DerefMut impl are well-behaved:

unsafe trait PinDeref: Deref {}
unsafe trait PinDerefMut: PinDeref+DerefMut {}

Then these two should be used everywhere that Deref/DerefMut is used in the Pin API. In fact I had a bad feeling about this exact point when the API landed, but was unable to actually construct a tangible example... maybe I should trust my feelings more. :wink:
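
For illustration, here is a minimal sketch of how those bounds could look, written as free functions standing in for Pin's methods (this is not the actual std API, just the shape of the proposal):

use std::ops::{Deref, DerefMut};
use std::pin::Pin;

// Implementing these is an explicit (unsafe) promise that the type's
// Deref/DerefMut impls are well-behaved for pinning purposes.
unsafe trait PinDeref: Deref {}
unsafe trait PinDerefMut: PinDeref + DerefMut {}

// Stand-ins for Pin::as_ref / Pin::as_mut: the projections are only
// available to pointer types that opted in via the unsafe marker traits.
fn pin_as_ref<P: PinDeref>(pin: &Pin<P>) -> Pin<&P::Target> {
    pin.as_ref()
}

fn pin_as_mut<P: PinDerefMut>(pin: &mut Pin<P>) -> Pin<&mut P::Target> {
    pin.as_mut()
}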

Alas, now it is probably too late for the proper fix.

15 Likes

Talking about our stability guarantees, aren't soundness problems exempt?

25 Likes

As a soundness fix, I'd argue adding an unsafe trait bound to Pin::new/deref/deref_mut is about as non-invasive as a breaking change can get. The only requirement is to add the marker trait impl, which is about as trivial as a fix can get.

The other "morally correct" fix of preventing "lying" deref/mut implementations on #[fundamental] types is more theoretically breaking, as it removes potentially correct code (in the absence of pin) without providing a simple alternative. Although, this one has the advantage of not breaking anything unless it actually has a chance of being unsound, and there's a decent chance it's not even used for legitimate cases.

9 Likes

You're right, it wouldn't be enough as written. I misunderstood the coherence rules around blanket impls.

For the record, I'm somewhat persuaded by the suggestions by others to use a different approach entirely, like @RalfJung's PinDeref and PinDerefMut. But it is possible to fix the DummyTrait approach, so I'll explain how. (I actually tested it this time...)

With my original suggestion:

impl<'a, T> DerefMut for &'a T where T: DummyTrait { ... }

the compiler does not prevent downstream crates from adding impls for local types that don't impl DummyTrait. This is because the way std could create a conflicting impl would be to add a "blanket impl", impl<T> DummyTrait for T (possibly with where clauses), and adding a blanket impl of a trait is always a breaking change (see last two paragraphs).

However, an impl like impl<T> DummyTrait for DummyStruct<T> is not considered a blanket impl, and is not a breaking change. Therefore, if you have:

pub trait DummyTrait {}
pub struct DummyStruct<T>(T);
impl<'a, T> DerefMut for &'a T where DummyStruct<&'a T>: DummyTrait { … }

Trying to add an impl in a downstream crate now produces this error (here, cra is the parent crate playing the role of std):

error[E0119]: conflicting implementations of trait `cra::DerefMut` for type `&Foo`:
 --> src/main.rs:3:1
  |
3 | impl DerefMut for &'static Foo {}
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: conflicting implementation in crate `cra`:
          - impl<'a, T> cra::DerefMut for &'a T
            where cra::DummyStruct<&'a T>: cra::DummyTrait;
  = note: upstream crates may add new impl of trait `cra::DummyTrait` for type `cra::DummyStruct<&Foo>` in future versions

It's also not possible for a downstream crate to try to satisfy the dummy impl by implementing DummyTrait:

error[E0117]: only traits defined in the current crate can be implemented for arbitrary types
 --> src/main.rs:3:1
  |
3 | impl DummyTrait for DummyStruct<&'static Foo> {}
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ impl doesn't use types inside crate

The other option is to reserve the impl (basically a poor man's negative impl):

#[rustc_reservation_impl=
    "a custom implementation would be permitted since &_ is fundamental, but be very unexpected, \
    and Pin requires the lack of this impl for soundness"]
impl<T: ?Sized> DerefMut for &'_ T {
    fn deref_mut(&mut self) -> &mut T {
        panic!("not allowed")
    }
}

#[rustc_reservation_impl=
    "a custom implementation would be permitted since &_ is fundamental, but be very unexpected, \
    and Pin requires the lack of this impl for soundness"]
impl<T: ?Sized> Clone for &'_ mut T {
    fn clone(&self) -> Self {
        panic!("not allowed")
    }
}
11 Likes

Ah, cool. Didn't know about that one.

I think this actually has nothing to do with Pin.as_ref and Pin.as_mut, and everything to do with the CoerceUnsized impl for Pin. Here's the POC with the as_ref and as_mut calls removed. Note that as_mut is still called so that we can poll the future, but this occurs after we've constructed the 'evil' future. Once we have Pin<&dyn Dubious>, it's game over.

It's entirely unnecessary to call as_ref or as_mut when constructing the 'evil' pin, due to the CoerceUnsized impl for Pin. You can simply write let evil: Pin<&dyn Dubious<'_>> = Pin::new(&Wrapper), which, due to the implicitly inserted cast, will just work.

The actual calls to as_ref or as_mut, in and of themselves, do nothing. When we start with a Pin<&Wrapper>, calling as_ref will give us a Pin<&<&Wrapper as Deref>::Target> - a.k.a. a Pin<&Wrapper>, since <&Wrapper as Deref>::Target == Wrapper. However, the CoerceUnsized impl for Pin means that the return value is implicitly cast to the desired type (Pin<&dyn Dubious<'_>>).

I think the root issue here is that the CoerceUnsized impl for Pin allows you to violate the contract of Pin without using unsafe. When you construct a Pin normally, you must either:

  1. Use Pin::new, which asserts that the target type is Unpin.
  2. Use Pin::new_unchecked, which requires you to assert via unsafe that your pointer type upholds the contract of Pin.

However, using the CoerceUnsized impl for Pin, you can write Pin<T> as Pin<U> when T: CoerceUnsized<U> holds. At no point were you required to assert (via unsafe) that U actually upholds the contract of Pin. As @comex showed, this can be weaponized via dyn Trait - you can go from a Pin<&T> to a Pin<&dyn MyTrait>, despite the fact that &dyn MyTrait does not deref to an Unpin type (dyn MyTrait). You should have been required to prove that dyn MyTrait is Unpin (it isn't) or that &dyn MyTrait upholds the contract of Pin (it doesn't, via its 'malicious' DerefMut impl).
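
To make that concrete, here is a self-contained sketch (the Wrapper and Dubious names mirror the PoC, with all the interesting bodies elided): it compiles on stable without any unsafe, even though nothing ever asserted that &dyn Dubious is a well-behaved pinned pointer.

use std::pin::Pin;

trait Dubious {}

struct Wrapper; // Wrapper is Unpin, so Pin::new is allowed
impl Dubious for Wrapper {}

fn demo(w: &Wrapper) {
    // Pin::new only checks that Wrapper: Unpin...
    let pinned: Pin<&Wrapper> = Pin::new(w);
    // ...and the CoerceUnsized impl for Pin then silently turns this into a
    // pinned trait object, with no assertion about &dyn Dubious at all.
    let _evil: Pin<&dyn Dubious> = pinned;
}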

10 Likes

Basically, I think that @comex's Method 1 and Method 2 are both special cases of Method 3. In particular:

I think this is the other way around - Method 1 does require coercion, but the coercion is obscured by the intermediate Pin.as_ref call.

3 Likes

I think there are two ways to solve this:

  1. Restrict the CoerceUnsized impl for Pin to only allow coercing to types U such that <U as Deref>::Target: Unpin holds (see the sketch after this list). Essentially, this is the same requirement as Pin::new. Note that this would by design rule out coercing to types which deref to !Unpin types - see approach two.

  2. Add a new unsafe trait PinSafePointer, similar to @RalfJung's proposed PinDeref and PinDerefMut. This is added as a bound on U in the CoerceUnsized impl for Pin. Implementing this trait for a type T requires that every possible value of T is safe to pass to Pin::new_unchecked. Effectively, this is expressing the contract of Pin::new_unchecked at the type level, rather than the value level.
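
As a rough sketch of option 1 (nightly-only, and using a stand-in MyPin wrapper, since the real change would be to std's own impl), the restricted coercion could look like this:

#![feature(coerce_unsized)]

use std::ops::{CoerceUnsized, Deref};

struct MyPin<P>(P);

// Only offer the coercion when the *target* pointer type derefs to an
// Unpin type -- the same condition that Pin::new checks.
impl<P, U> CoerceUnsized<MyPin<U>> for MyPin<P>
where
    P: CoerceUnsized<U>,
    U: Deref,
    <U as Deref>::Target: Unpin,
{
}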

Personally, I would lean towards option 1. I think this discussion has shown that CoerceUnsized is very confusing and subtle, especially when it interacts with Pin. I think allowing Pin coercions is fine in the 'obviously correct' case of coercing to a type that derefs to an Unpin type.

However, I think the kind of coercions involved in the POC (coercing to a type which derefs to a !Unpin type) will essentially never occur in real code. If someone has a desire to convert between two such Pin types, I think they should be required to explicitly acknowledge the implications of doing so by first calling Pin.into_inner_unchecked, calling deref/deref_mut, and constructing a new Pin via Pin::new_unchecked.

The one thing I'm not certain about is how this interacts with dynamically sized types. Does restricting the CoerceUnsized impl make certain uses impossible, or only achievable via transmute?

1 Like

They are, but if we end up with a hacky fix that appears to have minimal breakage in the ecosystem and an elegant/"real" fix that has catastrophic breakage, we may very well take the hacky one.

1 Like

In general, I think it's a giant red flag to have a type MyType such that

  1. MyType has at least one private field
  2. An impl CoerceUnsized<MyType> for ... exists, where ... is anything (a type variable, a concrete type, etc).

With this combination, you're saying that:

  1. You don't want consumers outside the module constructing arbitrary instances of MyType (i.e. with fields set to arbitrary values). It's likely that there's some additional invariant you want to maintain.
  2. You do want consumers outside the module to construct instances of MyType from arbitrary values of some other type.

While it's possible that both could be true, it seems really unlikely that anyone would actually want this. I wonder if it would make sense to have a lint for this.
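
For concreteness, here is a hypothetical example of the flagged combination (nightly-only, since CoerceUnsized is unstable; Guarded is a made-up type):

#![feature(coerce_unsized)]

use std::ops::CoerceUnsized;

pub struct Guarded<P> {
    ptr: P, // private: outside the module, you are supposed to go through `new`
}

impl<P> Guarded<P> {
    pub fn new(ptr: P) -> Guarded<P> {
        // ...imagine some invariant being checked or established here...
        Guarded { ptr }
    }
}

// ...yet this impl hands out Guarded<U> values that never went through
// `new` for U, purely via an implicit unsizing coercion.
impl<P, U> CoerceUnsized<Guarded<U>> for Guarded<P> where P: CoerceUnsized<U> {}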

EDIT: Cell, Rc, and Arc meet the above conditions, but are perfectly fine, since they can in fact be constructed for any T. Maybe it would be better to make CoerceUnsized an unsafe trait.

4 Likes

I'm not sure if this is the right thread, but what could have been done differently to avoid this problem? Should Pin have baked another year or were we doomed to do something wrong just due to the complexity of the API?

10 Likes

The implicated functions are Pin::as_mut (method 1) and Pin::clone (method 2), not Pin::as_ref.

The original PoC uses as_ref() (method 1) and as_mut() (method 2) to upcast to a trait object; you've shown that that's unnecessary, and you can rely on implicit coercion instead. But that's not the interesting part. dyn MyTrait does not impl Unpin, but you don't get any extra abilities from the type not being Unpin. It simply means that the type system no longer guarantees that it is safe to unpin – but at that point the concrete type is still Wrapper, which actually is safe.

The dangerous part happens when it calls as_mut() for method 1 – a different call from the one you removed – or clone() for method 2. That's when the Wrapper is swapped out for the future.

Don't believe me? Well...

Here is a version that does the coercion before pinning. It goes from Box<Wrapper> to Box<dyn Dubious> to Pin<Box<dyn Dubious>> to Pin<&mut dyn Dubious>, so there's no coercion happening within the pin.

I'd like to make a version that doesn't use trait objects at all, but the lifetimes don't work out; having a type contain a mutable reference to itself makes the borrow checker act rather weird. I may have been wrong when I said trait objects weren't the only possibility.

5 Likes

Interesting observation! However, I think this is also consistent with what I said: Pin<&dyn Dubious> would not be a problem if we couldn't actually employ its malicious DerefMut impl.

The issue is having a Pin<Ptr<T>> where Ptr (with its Deref/DerefMut) does not uphold the contract, and then actually using that Deref/DerefMut. If we guarded all operations that use the Deref/DerefMut by some unsafe trait bound, then creating Pin<&dyn Dubious> would not be a problem any more.

Basically, we have to either guard the construction of Pin<Ptr<T>> or the use, making sure that Ptr is a well-behaved pointer type. We currently try doing that during construction, but maybe we should try guarding the uses instead. That's an easier argument to make as we do not have to make "Ptr is well-behaved" part of the invariant and maintain it everywhere; we can just check it locally when we need it.

(And as @comex's latest post shows, even with CoerceUnsized removed we still would not properly "guard the construction.")

That I agree with.

Good question. We even had a sketch of a formal model for pinning, but that model was always about the three separate pointer types (PinBox<T>/PinMut<T>/Pin<T>). The switch to the generic Pin was done without formal modelling.

Actually trying to prove things about Pin could have helped, but one of our team actually did that over the summer and he did not find this issue. The problem is that our formalization does not actually model traits. So while we do have a formal version of the contract that Deref/DerefMut need to satisfy, we cannot even state things like "for any Pin<T> where T: Deref, this impl satisfies the contract".

Basically, our formal model actually models the variant I proposed, where we do use an unsafe marker trait, and associate the contract to that. That was my mental model for the generic Pin pointer wrapper from the beginning, which is why I was so surprised when the actual API ended up just relying on the (safe!) Deref/DerefMut traits.

So maybe one lesson is that we don't actually understand our trait system enough to write APIs that rely on coherence, i.e., that make statements like "the (unique) Trait impl for T satisfies X", as opposed to locally demanding "give me some Trait impl for T satisfying X". When assuming things about a trait impl, we should always do that through some unsafe trait, and never through an unsafe function that makes assumptions about "whatever happens to be" the trait impl for its type parameter T. (This is not to say that the latter is impossible to get right, but it is clearly much more tricky, as now the whole set of coherence rules becomes safety-critical for this API.)

16 Likes

I filed an issue so that this has a tracking number:

But I think we should keep discussion here now.

6 Likes

Just to check my understanding, what makes these types special is precisely the safe wrappers that give you Pin<P> for Box, &mut, and &. In other words, if we add a new safe wrapper, we have to check for similar exploits with those types, but in general we don't have to worry beyond that?

I'm not sure that works -- reservation impls were designed to allow overlap from downstream crates. They act in somewhat surprising ways, IIRC. See the tracking issue for more details.

3 Likes

This seems much more defensible than the "fake clone impls", which are quite hard for me to understand. We could certainly consider phasing it in with a warning period, given the soundness justification. There is (additionally) a need for a "pure deref" concept, though, and I hesitate to have too many marker traits -- but I suppose we could add this one as-is and then layer in a pure one that (likely, but not certainly) subsumes it.

1 Like

Interesting indeed. I'm curious whether anyone can estimate the likely breakage from the various proposed fixes?

1 Like