The Pin unsoundness and `CoerceUnsized`

I guess another fix would be that CoerceUnsized somehow embeds the Pin conditions: i.e., we guarantee that if you coerce a pinned version of this pointer into a pinned version of the target pointer, the pin conditions still hold. I find that less appealing because it makes the details of pinning "non modular" -- every smart pointer type has to think about it.

I creaed a github issue to track CoerceUnsized specifically

Do we really need to guarantee this for all CoerceUnsized impls? From my perspective, the issue isn't that CoerceUnsized guarantees too little - it's that Pin's CoerceUnsized impl requires too much from the (safe) user-provided CoerceUnsize impl on the pointer type.

I think it might be better for Pin's CoerceUnsized impl to require that the pointer type implement an unsafe marker trait (e.g. PinCoerceable) which has whatever requirements we would otherwise be imposing on CoerceUnsized.

One downside to this is that it makes using CoerceUnsized on Pin more difficult in generic code - it might be necessary to add PinCoerceable to a bunch of generic functions. However, I don't think you ever need to use the CoerceUnsized impl on Pin - you can always do Pin::into_inner, coerce the inner value, and then use Pin::new_unchecked on the coerced value. So, while this could potentially be annoying (if using CoerceUnsized in a generic context is even something that people want to do), adding the unsafe auto trait wouldn't make any usecases impossible.

2 Likes

It's true, we could take that approach. Still, I feel like any reasonable Deref impl already guarantees this property, no?

That's true - any Deref impl that breaks that Pin contract will at a minimum be extremely suprising to users, and would likely cause problems even if Pin was not involved. However, I think it's better to to try minimize the ways that weird or questionable code can lead to unsoundness.

If all CoerceUnsize impls require "pin safety" (e.g. upholding the Deref requirements), then I think we run the risk of people implementing CoerceUnsize without considering Pin, potentially leading to unsoudness if consumers try to use the type with Pin. If we require an explicit statement that the pin-related guarnatess are being upheld (via an additinal unsafe marker trait), I think we can reduce this risk.

I think a marker trait is also helpful when reviewing code. This seems analogous to marking functions as unsafe. By marking private functions as unsafe, even when you could "just" verify that all of the callers uphold the contract, you make it clear to reviewers that there are additional requirements to check. Similarly, a reviewer seeing something like unsafe impl PinCoercible for MySmartPtr will know that the Deref impl needs additional review (or that they need to look up what PinCoercible means). If a CoerceUnsize impl always implies "pin safety", then it may be less obvious to reviewers that additional verification is needed.

1 Like

Before we go too far down this road, I think I'd prefer to focus on this question:

What is the safety condition we require in the first place?

In particular, I posited that a "reasonable" Deref impl suffices, but I am not sure if that's really true. This seems like something @RalfJung might have the best intuition for.

The question is basically:

What additional guarantee does Pin::new require in order to ensure that the coercions one can do are safe?

1 Like

It was mentioned that Pin::new is creating the pointer and might thus be viewed as responsible. I would like to disagree. Really, it is the coercion that is creating instances. The current functionality of unsizing coercion is also specifically restricted and this may offer a way out. In its essence, coercion takes two seemingly unrelated types and asserts that the bytes-sequences that are valid as instances of the first are also valid as the second. This is true for many pointer types but not for Pin.

That would then be on the language implementation to offer a way to define more specific trait bounds for the impl of CoerceUnsize without disabling it completely. The relevant impl has of course already landed but adding more preconditions onto CoerceUnsize seems as much a breaking change as changing that impl as it can not be named itself.

It seems hard to imagine someone implementing an unsafe trait without understanding what guarantee they are upholding by implementing it (whether the guarantees are upheld by CoerceUnsized or a new marker trait, the trait by which they are upheld will be unsafe to implement).

I guess this is the point that @Aaron1011 was raising, too. We basically have two options:

  • extend CoerceUnsized such that Pin::new is always safe for all possible coercions
  • add add'l conditions for when a Pin<P> can be coerced to Pin<Q> , such that Pin::new works but the resulting Pin<P> can't always be coerced to Pin<Q> (even though P could've been coerced to Q)

I am kind of ok with either route but I think the choice between them depends somewhat on what the conditions are that makes coercing Pin provable in the first place. If they are relatively simple conditions that should basically always be true but for "rogue" CoerceUnsized impls, then maybe it's simpler to just require them in the first place. If they are relatively specialized to Pin, I might be inclined to take the "helper trait" route.

1 Like

When some pointer calls Pin::new_unchecked with itself it needs to assert the soundness for all values that might possibly be produced with it. For the values created through CoerceUnsize any pointer type P must thus at the moment make guarantees about all types U where P: CoerceUnsized<U>.

impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
    P: CoerceUnsized<U>,

Firstly, this is a positive result since it means that Pin is not broken in general but maybe only for some pointer types. Secondly, it means that it must not be possible to cause an impl of CoerceUnsized<M> for some unrelated user type M in safe code. Since the trait is a lang-item I would not expect this to be possible anyways or caught by other mechanisms. But otherwise, the issues looks unrelated to me.

How did it fail then, when CoerceUnsized is not directly responsible? Critically we had Pin broken for &_ as demonstrated in the exploits. As the &_ is building the Pin let us have a look at how it implements the trait. Since all standard pointer types are similar, they have some inner field *const T, we will have an impl like the following:

impl<'a, 'b:'a, T, U> CoerceUnsized<&'a const U> for &'b const T
where
    T: Unsize<U> + ?Sized,
    U: ?Sized,
{ }

If a pointer type wants to provide a useful coercion as above it suddenly opts-in to depending on the user-specified T: Unsize<U>. And importantly, a user can safely add a new impl since adding a trait impl of an object-safe trait T: UserTrait also adds a new T: Unsize<dyn UserTrait>. This was exploited, in all of the examples. In particular, both the DerefMut and Clone exploit required that Future: Unsize<Wrapper<'a>>. This is what finally allowed coercing Pin<&Wrapper<'a>> into Pin<&dyn Dubious<'a>> which then was exploited further. Note that it could not have been constructed directly.

I would argue that this coercion was broken, and not what was achieved with the dyn Dubious<'a> afterwards and fixing it properly might fix all exploits and not even require a negative impl for Clone on mutable references!

Finally, note the DerefMut based exploit from above. It too requires sensible implementations of CoerceUnsized giving implicit and safe leverage over Pin. Simply making CoerceUnsized an unsafe trait effectively makes writing such StrangeWrappers hard and is an evolution risk to Pin. Could adding new methods to Pin suddenly make previously safe coercsions unsafe? Can we write any coercible, custom pointer if this is the case?

impl<U, T: CoerceUnsized<U>> CoerceUnsized<StrangeWrapper<U>> 
  for StrangeWrapper<T> {}
// Later, exploiting coercion different dyn type:
let pin_sw_mf: Pin<StrangeWrapper<&mut MyFuture>> = Pin::new(
    StrangeWrapper(mf));
// Wait, have we opted into this with `CoerceUnsize` alone?
let mut pin_sw_dyn: Pin<StrangeWrapper<&mut DynFuture>> = pin_sw_mf;

Since the fix should not affect all unsizing of pointer themselves we can only try to fix the inital impl in the conversion chain:

impl<P, U> CoerceUnsized<Pin<U>> for Pin<P>
where
    P: CoerceUnsized<U>,

It does not seem to make a difference whether we add a trait bound to CoerceUnsized or locally to the trait impl directly in terms of achievable soundness. The former is definitely more orthogonal though, which is your option two.

1 Like

It doesn't obviate the need for negative impls. To quote myself from the original thread:

4 Likes

Yeah, that's exactly what I was wondering. :wink:

Ah, that sounds just about right! And because the invariant of Pin is all about "passing this particular value to Deref will do xyz", if CoerceUnsized is also about Deref then that should connect nicely. I don't have nearly enough intuition to do any kind of "proof sketch" here like I did for Pin itself, but the pieces all make sense.

@comex what do you think? It sounds to me like this "catches" your counterexample because that has two Deref impls that return different things, and then you can coerce between them.

2 Likes

Would it be possible to embed the invariants for CoerceUnsized in unsafe methods? If CoerceUnsized requires there to be only one data pointer and a set of PhantomData, maybe something like:

trait CoerceUnsized<T> 
where 
    T: ?Sized,
{
    type UnderlyingPointerData: Sized;
    type ExtraPhantomData;

    unsafe const fn into_raw(self) -> (*mut Self::UnderlyingPointerData, PhantomData<Self::ExtraPhantomData>);
    unsafe const fn from_raw(*mut Self::UnderlyingPointerData, Self::PhantomData<Self::ExtraPhantomData>) -> Self;
}

For me, this looks like it correctly specifies the invariants required by CoerceUnsized. Don't think this is practical though.

An unsafe method/function means that the caller needs to check the invariant, not the implementer. Also, CoerceUnsized is not supposed to invoke any user defined code, it should just add metadata to the pointer.

5 Likes

I think the main problem here might be with clone: its seems very plausible there exists a combination of Clone on a custom pointer and CoerceUnsized on that pointer that, when combined with the Clone and CoerceUnsized impls on Pin, give you a falsely pinned pointer. But we can't require that Clone return a pointer that points to the same value.

In that case I think we would just need to embed the pinned condition into CoerceUnsized..

2 Likes

For my argument above to work, we basically have to require that doing the unsizing coercion does not change the Deref/DerefMut/Clone impl that will be called. Because if that changes, there is just no way to show the proposed Pin invariant for the newly coerced pointer.

With that constraint, how could Clone and CoerceUnsized cooperate to cause a problem here?

This doesn't really answer your question, but requiring it to be the same impl is probably too strict when it comes to slices. In particular, we have Clone impls for Box<[T]> and Box<[T; N]>; they behave equivalently (you could say that cloning commutes with unsizing), but they're not the same impl.

1 Like

I was responding to Niko's proposal that they dereference to the same value, which is different from invoking the same impl. With comex's comment, I think we need to require the entire Pin conditions here: to impl CoerceUnsized<U> for T, you need to prove the pin invariant for U if T: Deref.

That would tie pinning and CoerceUnsized together pretty tightly though.

Also, if all you have is that x: Pin<T> satisfies the pin invariant (which is all about calling certain functions with &x as argument), it will be pretty hard to prove that from this it follows that x also satisfies the invariant for Pin<U> if the functions end up being different in that case. You end up having to prove a theorem like: forall x such that Box<[T; N]>::clone(&x) is safe to call, it is also the case that Box<[T]>::clone(&x) is safe to call. It is a big warning sign to me that we are supposed to deduce something from the fact that some value can be safely passed to some function -- usually the only thing you can do with such a proof is justify a call to that function. (In logical terms, there isn't really a good elimination rule for safety proofs.)

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.