Copy + !Unpin

Technically, this is valid today, as Copy is not bound by Unpin.

Is there any reason to support Copy + !Unpin? If we could turn back the stabilization clock, would we want to add the bound?

I think that Copy types should be Unpin because it becomes very easy to accidentally move a type that is Copy without even knowing it. Especially because Pin implements Deref. So Copy not implying Unpin looks like a possible footgun.

If PhantomPinned wasn’t Copy we may have been able to change this, but (unfortunately) at this point it is too late without breaking stability guarantees.

I think Copy actually does semantically imply Unpin. You must be able to Clone the struct, creating a new owned value, by doing a memcpy. This would mean that doing something that requires !Unpin would mean implementing Copy (though not Clone!) would be unsound.

Also, if a Copy type is !Unpin, you can trivially get around the restriction, as you’ve noted, by just safely copying the value.

Even though it’s technically a breaking change, I’d be behind adding the Unpin bound to Copy retroactively. Unfortunately, this is basically impossible since we actually want PhantomUnpinned to be Copy, right?

I am also behind adding Unpin bound to Copy. I don’t think it is valuable to have PhantomUnpinned be Copy. If you want a new value you could just make a new one or clone an old one. Ergonomics of PhantomUnpinned are not as important as preserving safety.

Think it this way: a type is !Unpin means it MAY contain self reference, so simply copy it may still result in a valid object, however, the new copy will contain some pointer pointing to the old object, not the new.

According the situation, this may or may not be useful, but even it is, handling them is hightly dangerous - it looks like the “copy” should have a lifetime constraint to the old, like a &T to the owned type. If the original object was dropped before the copied, the copied object will become invalid immediately.

!Unpin marks a type T as strictly requiring an unsafe guarantee given to pinned instances of a type with relation to its Drop implementation, that is <T as Drop>::drop must be ran before invalidating the storage behind any pinned instance. But for Copy types there is no Drop::drop to run which seems to make the whole guarantee trivial to provide. Does that mean this is technically safe?

pub fn pin_copy<T: Copy>(t: &mut T) -> Pin<&mut T> {
    unsafe { Pin::new_unchecked(t) }
}

Whatever the answer, I don’t think it necessarily indicate if Copy should have a bound of Unpin. First of all, the latter is an auto marker trait while the former is not automatically derived. Note that this also means that Copy is not automatically provided for existential return types -> impl …—whereas Unpin is—and wherever the bound of Copy must be named you can then add Unpin if needed as well. And secondly, the semantics of Copy do not require the semantics of Unpin. Adding the bound would trade 7 characters on some bounds for conflating two different type concepts and complicating their implementation in the compiler (keep in mind both are lang items).

Unpin is only a lang item so that the generator transform can implement it for non-static generators, it has no lang specialness itself.

Types can not implement Unpin because they are being used in an intrusive collection; these don’t contain self references, but instead are referred to at that address by other members of the collection. Copying them is not necessarily invalid for any pinning releated reasons, it just makes this not a member of the collection.

I’m not sure you can create a good API which has a type that meets Copy but not Unpin, but its certainly possible to write valid code that relies on the guarantees of Pin without be invalidated by a Copy.

EDIT: actually I don’t think its possible to write an intrusive collection that doesn’t depend on Drop. Can’t think of a safe API that would have a Copy + !Unpin type. Of course that doesn’t mean we would choose to “forbid” this in the language.

Let’s see this with an example:

// #[derive(Clone)]
pub
struct SelfReferential {
    string: String,
    at_string: NonNull<String>,
    _pin_sensitive: PhantomPinned,
}

impl SelfReferential {
    pub
    fn new (string: String) -> Pin<Box<Self>>
    {
        let mut pinned_box = Box::pin(Self {
            string,
            at_string: NonNull::dangling(),
            _pin_sensitive: PhantomPinned,
        });
        unsafe {
            pinned_box
                .as_mut()
                .get_unchecked_mut() // safety: no moves
                .at_string
                = NonNull::from(&pinned_box.string);
        }
        pinned_box
    }

    pub
    fn at_string<'__> (self: Pin<&'__ Self>) -> &'__ String
    {
        unsafe {
            // # Safety:
            //
            //  - the only way to get `Pin<&Self>` with the public API is by using
            //    `Self::new().as_ref()`, which guarantees that `at_string` is valid by construction
            self.get_ref().at_string.as_ref()
        }
    }
}

Imho think this is a sketchy version of an otherwise justified / plausible pattern.

Now, with this API, using #[derive(Clone)] (and thus also #[derive(Copy)] if the String was, let’s say, a [u8; N] instead) is unsound, since

  • SelfReferential::new("Hello, world!".into()).as_ref().get_ref().clone()
    

    creates a SelfReferential struct whose at_string pointer is dangling; then it suffices to get a pinned shared reference to it (e.g., with Box::pin(...).as_ref() or with stack pinning) to trigger the UAF.

  • Playground


What can we conclude from all this? Naive #[derive(Clone)] (i.e., exactly what happens with Copy) is indeed dangerous to use with !Unpin data, since it let us create a replicate of the !Unpin value elsewhere in memory, where it can afterwards be pinned behind a Pin<P> pointer. Thus, the static / type-level invariant of “my !Unpin value, when Pinned, is always well-formed” gets broken.

This can be avoided with a custom Clone implementation, that clears the pointer instead of copying it (thus also checking the dereference of at_string on runtime :frowning:):

Click to show the code
pub
struct SelfReferential {
    string: String,
    // Runtime invariant: non-null iff not dangling
    at_string: Option<NonNull<String>>,
    _pin_sensitive: PhantomPinned,
}

impl Clone for SelfReferential {
    #[inline]
    fn clone (self: &'_ Self) -> Self
    {
        Self {
            string: self.string.clone(),
            at_string: None, // cleared instead of copied
            _pin_sensitive: PhantomPinned,
        }
    }
}

impl SelfReferential {
    pub
    fn new (string: String) -> Pin<Box<Self>>
    {
        let mut pinned_box = Box::pin(Self {
            string,
            at_string: None,
            _pin_sensitive: PhantomPinned,
        });
        unsafe {
            pinned_box
                .as_mut()
                .get_unchecked_mut() // safety: no moves
                .at_string
                = Some(NonNull::from(&pinned_box.string));
        }
        pinned_box
    }

    pub
    fn at_string<'__> (self: Pin<&'__ Self>) -> Option<&'__ String>
    {
        unsafe {
            // # Safety:
            //
            //  - It is not possible given the public API
            //    to break the runtime invariant:
            self.get_ref()
                .at_string
                .as_ref()
                .map(|x| x.as_ref())
        }
    }
}

This shows the limits of Clone::clone, and maybe the need for the following trait(s) / ideas:

trait PinnedClone {
    fn pinned_clone (self: &'_ Self) -> Pin<Box<Self>>;
}

// and, for cloning + stack pinning:

/// # Safety
///
///  - write_clone must init the memory `target` points to,
///    keeping in mind that it may be pinned right afterwards. (1)
unsafe trait WriteClone {
    /// # Safety
    ///
    ///  - `target` must verify that
    ///    it is safe to `ptr::write::<Self>` to. (2)
    unsafe fn write_clone (self: &'_ Self, target: *mut Self);

    fn with_pinned_clone<R, F> (
        self: &'_ Self,
        f: F,
    ) -> R
    where
        F : FnOnce(Pin<&mut Self>) -> R,
    {
        let mut slot = MaybeUninit::<Self>::uninit();
        let at_slot = slot.as_mut_ptr();
        unsafe {
            // Safety: (2) is verified
            self.write_clone(at_slot);
        }
        struct UnsafeDropGuard(*mut Self);
        impl Drop for UnsafeDropGuard { fn drop (&mut self) {
            unsafe { ptr::drop_in_place(self.0); }
        }
        unsafe {    
            // Safety: invariants from (1)
            let _guard = UnsafeDropGuard(at_slot);
            let slot = Pin::new_unchecked(&mut *at_slot);
            f(slot)
        }
    }
}

// we could even have impl<T : WriteClone> PinnedClone for T { ... }
// (if MaybeUninit is #[repr(transparent)])

TL,DR: Copy and auto-derived Clone is indeed problematic with !Unpin, leading to either “not unsound for all inputs” (i.e., unsafe fn) APIs, or APIs that require runtime checks.

The idea of cloning data that is automatically pinned could be further explored.

2 Likes

I have found a use for Copy + !Unpin, creating a intrusive Directed Acyclic Graph (DAG). This DAG allows you to keep references into itself, and they won’t be invalidated because links are stored in a Pin<Box<DAG>>. Using Pin here serves as documentation that these memory locations must be stable for safety, but it doesn’t need to be exposed to the external api.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b811af272ddf4bcfd964661149b2f880