First experience and thoughts about `Pin` API

earthengine · January 17, 2019, 1:08am

As my first exercise to see how the new std::future::Future and Pin APIs work, I decided to implement join, which polls two Futures concurrently.

Here is what I ended up with:

use std::future::Future;
use std::task::{Poll::{self, Ready, Pending}, LocalWaker};
use std::pin::Pin;
use std::cell::Cell;

pub fn join<T1,T2>(f1: impl Future<Output=T1>, f2: impl Future<Output=T2>) 
    -> impl Future<Output=(T1,T2)> {
    struct JoinFuture<T1,T2,F1,F2>(F1,F2,Cell<Option<T1>>,Cell<Option<T2>>);
    impl<T1,T2,F1,F2> JoinFuture<T1,T2,F1,F2> {
        fn get_f1<'a>(self: Pin<&'a mut Self>) -> Pin<&'a mut F1> {
            unsafe { self.map_unchecked_mut(|v| &mut v.0) }
        }
        fn get_f2<'a>(self: Pin<&'a mut Self>) -> Pin<&'a mut F2> {
            unsafe { self.map_unchecked_mut(|v| &mut v.1) }
        }
    }
    impl<T1,T2,F1,F2> Future for JoinFuture<T1,T2,F1,F2>
        where F1: Future<Output=T1>,
              F2: Future<Output=T2>,
    {
        type Output = (T1, T2);
        fn poll(mut self: Pin<&mut Self>, lw: &LocalWaker) -> Poll<Self::Output> {
            match (self.2.take(), self.3.take()) {
                (Some(_), Some(_)) => unreachable!(),
                (Some(v1), _) => match self.as_mut().get_f2().poll(lw) {
                    Ready(v2) => Ready((v1, v2)),
                    _ => { self.2.set(Some(v1)); Pending }
                },
                (_, Some(v2)) => match self.as_mut().get_f1().poll(lw) {
                    Ready(v1) => Ready((v1, v2)),
                    _ => { self.3.set(Some(v2)); Pending }
                },
                _ => match (self.as_mut().get_f1().poll(lw), self.as_mut().get_f2().poll(lw)) {
                    (Ready(v1),Ready(v2)) => Ready((v1, v2)),
                    (Ready(v1), _) => { self.2.set(Some(v1)); Pending },
                    (_, Ready(v2)) => { self.3.set(Some(v2)); Pending },
                    _ => Pending,
                }
            }
        }
    }
    JoinFuture(f1, f2, Cell::new(None), Cell::new(None))
}

To make this work, I had to:

Use unsafe blocks, because a Pin<&mut T> only guarantees that the T object itself will not move until it is dropped. It says nothing about its fields, and we have to create Pin pointers to those fields, which requires an additional guarantee.
Use Cells. Technically I might be able to implement this without them, but with their help I can match on (self.2.take(), self.3.take()) using only shared references to self.
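
The Cell point can be shown in isolation. This is a minimal sketch with hypothetical names (Slots, take_both, restore_first are not from the code above): Cell<Option<T>>::take and Cell::set only need a shared reference, which is why the match on (self.2.take(), self.3.take()) works without borrowing self mutably.

```rust
use std::cell::Cell;

/// Hypothetical holder for two results that arrive at different times.
pub struct Slots {
    pub first: Cell<Option<u32>>,
    pub second: Cell<Option<&'static str>>,
}

/// Takes both values through a shared reference, leaving `None` behind.
pub fn take_both(slots: &Slots) -> (Option<u32>, Option<&'static str>) {
    (slots.first.take(), slots.second.take())
}

/// Puts a value back, again without needing `&mut Slots`.
pub fn restore_first(slots: &Slots, value: u32) {
    slots.first.set(Some(value));
}
```

The trade-off is the one visible in the poll implementation: every branch that does not consume a taken value has to remember to set it back.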

Overall, this is not too bad. Some concerns:

Would it be possible to mark a struct to say: this struct may be Unpin and therefore movable, but a certain field always moves together with the whole struct? This is the guarantee required by the unsafe block. Something like a #[sticky] field attribute would be good. The current std::marker::PhantomPinned cannot be used here, as it would make the whole struct unmovable.
The as_mut method for Pin<&mut T> plays a role very similar to a reborrow. I remember somebody had a Reborrow trait proposal. Maybe it's time to revisit it and make reborrowing work for Pin<&mut T> or similar types.
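
The reborrow-like role of as_mut can be sketched as follows. This uses the Context-based poll signature that was stabilized after this thread (not the LocalWaker one used above), and CountDown, NoopWake and poll_n are hypothetical helpers made up for the illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future: pending `self.0` times, then ready.
pub struct CountDown(pub u8);
impl Future for CountDown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            // CountDown is Unpin, so the pinned reference derefs mutably.
            self.0 -= 1;
            Poll::Pending
        }
    }
}

/// Polls `fut` at most `max` times; returns the poll count and the output.
pub fn poll_n<F: Future>(mut fut: Pin<&mut F>, max: usize) -> Option<(usize, F::Output)> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    for i in 1..=max {
        // `poll` consumes a `Pin<&mut F>`. `as_mut` hands out a shorter-lived
        // one, so `fut` is still usable on the next iteration.
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return Some((i, v));
        }
    }
    None
}
```

Without the as_mut call, the first iteration would move fut into poll and the loop would not compile, whereas a plain &mut F would be reborrowed implicitly.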

comex · January 17, 2019, 2:16am

This really cries out for a way to mark methods as accessing disjoint fields. After all, there is no inherent need for the Cell, since poll is called with a unique reference. We could avoid it by adding methods to access the other two fields:

fn get_t1<'a>(self: Pin<&'a mut Self>) -> &'a mut Option<T1> {
    unsafe { &mut self.get_unchecked_mut().2 }
}

…and this would be nice and regular and possible to automate using a macro. But with these types of methods, you can’t mutably borrow more than one field at the same time.

I suppose an alternative is to have one method that borrows all the fields:

struct JoinFuture<T1,T2,F1,F2> {
    f1: F1,
    f2: F2,
    t1: Option<T1>,
    t2: Option<T2>,
}
struct JoinFutureBorrowMut<'a, T1,T2,F1,F2> {
    f1: Pin<&'a mut F1>,
    f2: Pin<&'a mut F2>,
    t1: &'a mut Option<T1>,
    t2: &'a mut Option<T2>,
}
impl<T1,T2,F1,F2> JoinFuture<T1,T2,F1,F2> {
    fn borrow_mut<'a>(self: Pin<&'a mut Self>) -> JoinFutureBorrowMut<'a, T1, T2, F1, F2> {
        unsafe {
            let this = self.get_unchecked_mut();
            JoinFutureBorrowMut {
                f1: Pin::new_unchecked(&mut this.f1),
                f2: Pin::new_unchecked(&mut this.f2),
                t1: &mut this.t1,
                t2: &mut this.t2,
            }
        }
    }
}

(Again, this is amenable to implementation via macro.)

With this, I was able to both avoid the Cell and make the poll implementation less verbose overall:

fn poll(self: Pin<&mut Self>, lw: &LocalWaker) -> Poll<Self::Output> {
    let this = self.borrow_mut();
    match (&this.t1, &this.t2) {
        (Some(_), Some(_)) => unreachable!(),
        (Some(_), _) => match this.f2.poll(lw) {
            Ready(v2) => Ready((this.t1.take().unwrap(), v2)),
            _ => Pending
        },
        (_, Some(_)) => match this.f1.poll(lw) {
            Ready(v1) => Ready((v1, this.t2.take().unwrap())),
            _ => Pending
        },
        _ => match (this.f1.poll(lw), this.f2.poll(lw)) {
            (Ready(v1),Ready(v2)) => Ready((v1, v2)),
            (Ready(v1), _) => { *this.t1 = Some(v1); Pending },
            (_, Ready(v2)) => { *this.t2 = Some(v2); Pending },
            _ => Pending,
        }
    }
}

Playground link
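
For readers on current stable Rust, here is a self-contained adaptation of the same idea. It is a sketch, not the playground code: it assumes the later-stabilized Context-based poll signature instead of LocalWaker, restructures the poll body slightly, and the names Join, JoinProj, project, PendingOnce and poll_once are chosen for this illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Join of two futures; finished outputs wait in `t1` / `t2`.
pub struct Join<T1, T2, F1, F2> { f1: F1, f2: F2, t1: Option<T1>, t2: Option<T2> }

/// Every field borrowed at once: futures stay pinned, slots are plain `&mut`.
struct JoinProj<'a, T1, T2, F1, F2> {
    f1: Pin<&'a mut F1>,
    f2: Pin<&'a mut F2>,
    t1: &'a mut Option<T1>,
    t2: &'a mut Option<T2>,
}

impl<T1, T2, F1, F2> Join<T1, T2, F1, F2> {
    fn project<'a>(self: Pin<&'a mut Self>) -> JoinProj<'a, T1, T2, F1, F2> {
        // SAFETY (a promise made by hand, not checked by the compiler):
        // `f1` and `f2` are only ever exposed pinned and never moved out,
        // and `Join` has no `Drop` impl that could move them.
        unsafe {
            let this = self.get_unchecked_mut();
            JoinProj {
                f1: Pin::new_unchecked(&mut this.f1),
                f2: Pin::new_unchecked(&mut this.f2),
                t1: &mut this.t1,
                t2: &mut this.t2,
            }
        }
    }
}

impl<T1, T2, F1, F2> Future for Join<T1, T2, F1, F2>
where
    F1: Future<Output = T1>,
    F2: Future<Output = T2>,
{
    type Output = (T1, T2);
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.project();
        // Poll only the futures that have not produced a value yet.
        if this.t1.is_none() {
            if let Poll::Ready(v) = this.f1.poll(cx) { *this.t1 = Some(v); }
        }
        if this.t2.is_none() {
            if let Poll::Ready(v) = this.f2.poll(cx) { *this.t2 = Some(v); }
        }
        match (this.t1.take(), this.t2.take()) {
            (Some(a), Some(b)) => Poll::Ready((a, b)),
            // Not finished: put back whatever was taken.
            (a, b) => { *this.t1 = a; *this.t2 = b; Poll::Pending }
        }
    }
}

pub fn join<F1: Future, F2: Future>(f1: F1, f2: F2) -> Join<F1::Output, F2::Output, F1, F2> {
    Join { f1, f2, t1: None, t2: None }
}

/// Hypothetical test helper: pending on the first poll, ready afterwards.
pub struct PendingOnce(pub bool);
impl Future for PendingOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 { return Poll::Ready(()); }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical test helper: polls once with a waker that does nothing.
struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }
pub fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}
```

Pairing PendingOnce with std::future::ready exercises the stored slot: the ready value is kept across the first poll and its future is not polled again on the second.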

Nemo157 · January 17, 2019, 8:41am

You may want to take a look at pin-project, which provides just such a macro. Alternatively, pin-utils provides macros for creating the per-field methods.

Nemo157 · January 17, 2019, 8:45am

This is written impl<T1, T2, F1, F2> Unpin for JoinFuture<T1, T2, F1, F2> where F1: Unpin, F2: Unpin {}. You just need to conditionally implement Unpin based on whether the fields you project to are also Unpin.
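
A minimal sketch of such a conditional impl, on a hypothetical WithSlot type rather than the JoinFuture from this thread, and using the Context-based poll signature that was stabilized later. The Unpin impl depends only on the field that is projected to with Pin; the slot's type does not matter because it is never handed out pinned.

```rust
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper: one future plus a slot that is never pinned.
pub struct WithSlot<F, T> {
    fut: F,
    slot: Option<T>,
}

impl<F, T> WithSlot<F, T> {
    pub fn new(fut: F, slot: Option<T>) -> Self { WithSlot { fut, slot } }
}

// Conditional on `F` only. Without this impl the auto trait would also
// demand `T: Unpin`.
impl<F: Unpin, T> Unpin for WithSlot<F, T> {}

impl<F: Future + Unpin, T> Future for WithSlot<F, T> {
    type Output = (F::Output, Option<T>);
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Self: Unpin`, so the pinned reference derefs mutably without `unsafe`.
        let this: &mut Self = &mut *self;
        match Pin::new(&mut this.fut).poll(cx) {
            Poll::Ready(v) => Poll::Ready((v, this.slot.take())),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Compiles only when `T: Unpin`.
pub fn assert_unpin<T: Unpin>() {}

/// `PhantomPinned` in the slot does not stop the wrapper from being `Unpin`.
pub fn check() {
    assert_unpin::<WithSlot<std::future::Ready<u8>, PhantomPinned>>();
}

/// Hypothetical test helper: polls once with a waker that does nothing.
struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }
pub fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}
```

Note that the Future impl here carries an F: Unpin bound, so this sketch avoids unsafe only for futures that are themselves Unpin.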

earthengine · January 17, 2019, 10:04pm

This does not help much in my case. It is equivalent to adding F1: Unpin and F2: Unpin trait bounds to my function. That would of course let me implement join without unsafe code, but it is less generic and not what I wanted.

What I actually want is for the compiler to help prevent attempts to move a field out of a struct, even when the struct is Unpin. This is exactly what we promise in the unsafe block, and if the compiler could check it, the block would not need to be unsafe.

CAD97 · January 17, 2019, 11:13pm

So what you want, if I’m interpreting correctly, is a way to safely go from Pin<&mut T> to &mut T.field where T.field: Unpin.

This is equivalent to wanting Pin<&mut T> to Pin<&mut T.field>, as Pin<P<U>> is equivalent to P<U> when U: Unpin.

In fact, pin_project's “derive” actually strips the pin for you.
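
The equivalence for Unpin targets can be seen with nothing but safe std APIs. In this sketch Counter and bump are hypothetical names; Pin::new, DerefMut on Pin and Pin::get_mut are all safe and all require the target to be Unpin.

```rust
use std::pin::Pin;

/// Hypothetical struct; all fields are `Unpin`, so the struct is too.
pub struct Counter {
    pub hits: u32,
}

pub fn bump(mut pinned: Pin<&mut Counter>) -> u32 {
    // `DerefMut` is available on `Pin<&mut T>` because `Counter: Unpin`.
    pinned.hits += 1;
    // `Pin::get_mut` is a safe function with a `T: Unpin` bound; it strips
    // the pin and returns the plain reference.
    let plain: &mut Counter = pinned.get_mut();
    plain.hits += 1;
    plain.hits
}
```

Going the other way is just as cheap: Pin::new(&mut counter) wraps a plain reference without any unsafe code.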

earthengine · January 17, 2019, 11:42pm

I have looked into pin_project. Yes, it is ergonomic and easy to use. However, in terms of safety it is not much better than the unsafe code blocks above. We still have to make promises that the compiler cannot check, even though we are not using the unsafe keyword explicitly.

CAD97 · January 17, 2019, 11:50pm

And this is an inherently hard and unsafe task. It’s like writing Vec; it’s not something that needs to be safe.

Working directly with Pin is probably going to always require you (or a “derive” you use) to use unsafe. Pin projection is unsafe. And one of the best things about Pin's stabilized design is that it is purely a library type.

Using pin_project encapsulates the unsafety for you (though personally I’d prefer opting in to stripping the pin rather than keeping it on fields for safer-by-default). It’s the same as using Vec and the entire Rust philosophy: it’s possible to write machine-checked safe code on top of human-promised safe code (that does generally unsafe things).

earthengine · January 18, 2019, 12:02am

I feel like Vec's story is quite different here. With Vec, if you only use the safe APIs, you get a panic when you get things wrong (for example, an out-of-bounds index). The safety promise of Vec is: your error is either caught at compile time, or at runtime with a panic. You cannot cause UB whatever you do in safe code. This is not as good as full compiler checks, but it is the next best thing.

With Pin, if we don’t keep the promise, the result is UB: the program works as if there were no issue, but in some cases it just behaves weirdly.

Can we have the same thing as in Vec? That is, when you accidentally move a U out from under a Pin<&mut U> returned from get_mut_runtime_checked(), the program panics.

CAD97 · January 18, 2019, 2:11am

The point I’m trying to make isn’t that Pin<_> is Vec<_> in terms of safety abstractions, if anything, it’s that Pin<_> is ptr::NonNull<_>. That is, it’s an unsafe-to-use abstraction that you build safe abstractions on top of.

And Pin is the same as Vec with respect to safe code. Anything you do safely to any type in Rust is free from UB. This is the guarantee of Rust, and it holds for Pin. It’s just that the guarantee Pin provides (that a type instance won’t be moved by untrusted code) isn’t enforced by the compiler, which is why going from Pin<P<_>> to &_ is safe but going to &mut _ is unsafe. Pin is the guarantee of immobility.

Basically, the point I’m trying to make is that you shouldn’t need to use Pin directly the same way you shouldn’t need to use ptr::NonNull directly. Other layers of the abstraction have been written already that you can use safely. join has to be written once, and async/await! hides the details of immobility required to drive a Future.

Pin projection would definitely be nice to have safely. But it doesn’t have to be a feature of the language or even the stdlib. It would help a small number of libraries (not applications, probably, which would just use async/await! and libraries built on Future) for a very noticeable cost (defining how exactly pin projection works) when we have today a macro that solves the issues in a reasonably-safe-to-use manner (that is, without writing unsafe yourself, I don’t think you can use pin_project's attribute macros to cause unsafety – correct me if I’m wrong here).

comex · January 18, 2019, 2:35am

I think there may be some confusion about how Pin works. Going from Pin<&mut Struct> to Pin<&mut Field> is safe: it upholds the Pin invariant. There is not even a need for a runtime check. But there are two caveats:

The language does not provide a way to actually perform that operation without first obtaining a real pointer to the struct, hence the need for an unsafe implementation.
It’s only safe as long as there is not also a way for safe code to go from Pin<&mut Struct> to plain &mut Field (i.e. it’s either-or). This means you have to limit what Drop impls can do, which I guess pin_project takes care of.
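
The Drop caveat arises because Drop::drop always receives a plain &mut self, even for a value that was pinned. One convention, which as far as I know is the one suggested in the std::pin module documentation, is to re-wrap self immediately and do the real work in a helper that only sees the pinned reference. Node, PINNED_DROPS and inner_drop below are hypothetical names for this sketch.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how often the pinned cleanup path ran (demonstration only).
pub static PINNED_DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical address-sensitive type.
pub struct Node {
    pub value: u32,
    _pin: PhantomPinned,
}

impl Node {
    pub fn new(value: u32) -> Self {
        Node { value, _pin: PhantomPinned }
    }
}

impl Drop for Node {
    fn drop(&mut self) {
        // `drop` gets `&mut self` even if the value was pinned. Re-wrapping is
        // acceptable here because the value is never used again after `drop`.
        let this = unsafe { Pin::new_unchecked(self) };
        inner_drop(this);

        fn inner_drop(this: Pin<&mut Node>) {
            // Only pinned access from here on: fields can be read, but safe
            // code cannot move anything out of a pinned `Node`.
            let _ = this.value;
            PINNED_DROPS.fetch_add(1, Ordering::SeqCst);
        }
    }
}
```

The point of the indirection is that the body of inner_drop is held to the same rules as every other method that takes Pin<&mut Self>.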

I somewhat disagree with @CAD97, in that I think built-in pin support probably would help applications significantly. I admittedly don’t have much experience with Rust async/await, but even if it often obviates the need to manually handle Futures, it doesn’t always – nor should it, since not everything you might want to do with them can be expressed with async/await. But that’s somewhat beside the point.

earthengine · January 18, 2019, 3:45am

Actually, pin_project doesn't take care of it. It simply requires the user to uphold this, which is why I said it is not much better than using explicit unsafe code.

Here is another example that requires handling Futures manually: the alt function. Given two Futures, it polls them concurrently and produces a result as soon as at least one of them is ready:

pub enum EitherOr<T1,T2> {
    This(T1),
    That(T2),
    Both(T1,T2),
}
pub fn alt<T1, T2>(
    f1: impl Future<Output=T1>,
    f2: impl Future<Output=T2>
) -> impl Future<Output=EitherOr<T1,T2>> {
    struct AltFuture<F1,F2>(F1,F2);
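    // The impl needs its own T1/T2: an inner item cannot use the generic
    // parameters of the enclosing function.
    impl<T1,T2,F1,F2> Future for AltFuture<F1,F2>
    where
        F1: Future<Output=T1>,
        F2: Future<Output=T2>,
    {
        type Output=EitherOr<T1,T2>;
        fn poll(self: Pin<&mut Self>, lw: &LocalWaker) -> Poll<Self::Output> {
            // Project to both fields at once; neither future is ever moved.
            let this = unsafe {
                let this = self.get_unchecked_mut();
                (
                    Pin::new_unchecked(&mut this.0),
                    Pin::new_unchecked(&mut this.1),
                )
            };
            // Both futures are polled on every call, so neither is favoured.
            match (this.0.poll(lw), this.1.poll(lw)) {
                (Ready(v1), Ready(v2)) => Ready(EitherOr::Both(v1, v2)),
                (Ready(v1), _) => Ready(EitherOr::This(v1)),
                (_, Ready(v2)) => Ready(EitherOr::That(v2)),
                (_, _) => Pending,
            }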
        }
    }
    AltFuture(f1, f2)
}

(Note that this is more flexible than futures::Future::select: it does not require the futures to have the same type. More importantly, it is less biased: when both futures are Ready, it doesn't have to pick one arbitrarily.)

CAD97 · January 18, 2019, 4:16am

(futures:0.1)::Future::select doesn’t have a restriction on input type? And the current state is an n-ary macro fn: (futures-preview:0.3-alpha.12)::select!. That macro does require a unified output mapped type, but you can still do an either-or enum mapping yourself, and it allows recovering the non-exhausted inputs, unlike your implementation.

If I read the implementation of select correctly, it doesn’t have a bias either: it’s random in what order the futures get polled, and once a selection has been made, no more polls are done.

earthengine · January 18, 2019, 5:04am

Oh, I mean futures::Future::select is too restrictive; my alt doesn't have such restrictions, as the input futures can have different types.

I am curious about this, as it is either impossible for the given signature, or it has to pay the cost of a random number generator.

Also, a "random" bias is the worst kind of bias, as it is nondeterministic.

To produce a unified result, you have to pick a Future to poll, and if you don't poll all Futures in the same poll cycle, you are biased toward the one you just picked. If you do poll both, then you may end up with both results in hand. In that case you either have to combine them (if you have a monoid, but that puts a much bigger restriction on the return type), or you have to drop one of them (or restore it to unresolved). You then either use a random generator to pick which one to drop or restore, or you are biased toward one of them.

So my solution is to return both of them in a new EitherOr variant; this is my way to avoid bias.

Of course, my design is also easy to extend to allow recovery of the unresolved future, and to use a macro to allow n-ary operations.

CAD97 · January 18, 2019, 5:22am

Neither futures 0.1 nor 0.3-alpha.12 has any restrictions on the input type of select other than that it is a future and that you map all the outputs to a unified type. The only restriction they apply that you don’t is on the output type of the future, directly in 0.1 and indirectly in 0.3. The inputs can be unrelated types, though you seem to be implying they can’t.

And yes, futures 0.3 select uses rand::thread_rng to shuffle the array of potential polling functions to determine the order it polls in. All futures are polled in that random order if none complete, and short circuits at the first completed one. Nothing is lost.

But we’re off topic here, as this isn’t talking about Pin anymore. Sure, maybe the select! implementation isn’t exactly what you’d prefer. But implementing your own alt only requires a small safe cost of #[pin_project] in terms of pin projection.

(I’d very much love to hate to see the “OptionHList” mess required for an n-ary alt that polls all the futures and gives back all the successes.)

earthengine · January 18, 2019, 5:51am

OK, let's get back to the topic. A small safe cost is still not "zero cost". Functions like alt, select, and join have to be in std, or at least in a crate, along with all the other convenience methods that might be useful. Only then will people be able to eliminate Pin from their safe code.

(I would be interested in writing such a crate; just tell me what other useful stuff people would need. I probably would not write and_then or other common combinators, as they are trivial with async/await. We have to think again before accepting a function that was useful in the old ways.)

(continue the off-topic)

My original design returned an (Option<T1>, Option<T2>), but its (None, None) case is unreachable, so I switched to EitherOr. For an n-ary alt, I would definitely use n-tuples of Options if we don't need recovery, and Eithers if we do, instead of defining new enums, as the all-None case is negligible.
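
The relationship between the two shapes can be sketched directly. The enum is redeclared here with derives so the snippet stands alone; into_options and from_options are hypothetical helper names. The conversion to a pair of Options never produces (None, None), and the reverse direction is partial for exactly that reason.

```rust
/// Same shape as the `EitherOr` above, with derives added for comparison.
#[derive(Debug, PartialEq)]
pub enum EitherOr<T1, T2> {
    This(T1),
    That(T2),
    Both(T1, T2),
}

impl<T1, T2> EitherOr<T1, T2> {
    /// Lossless conversion to the tuple-of-options shape.
    pub fn into_options(self) -> (Option<T1>, Option<T2>) {
        match self {
            EitherOr::This(a) => (Some(a), None),
            EitherOr::That(b) => (None, Some(b)),
            EitherOr::Both(a, b) => (Some(a), Some(b)),
        }
    }

    /// The reverse direction: `(None, None)` has no counterpart.
    pub fn from_options(a: Option<T1>, b: Option<T2>) -> Option<Self> {
        match (a, b) {
            (Some(a), Some(b)) => Some(EitherOr::Both(a, b)),
            (Some(a), None) => Some(EitherOr::This(a)),
            (None, Some(b)) => Some(EitherOr::That(b)),
            (None, None) => None,
        }
    }
}
```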

(Also, there is a strong reason not to feature recovery: we are in a pinned struct, and its contents are guaranteed not to move until dropped. This means that even if we ignore the futures that didn't resolve, they are still available somewhere on the stack, and the user can still access them and do whatever they want. This is another reason why the std::future helpers need to be redesigned.)

newpavlov · January 18, 2019, 10:56am

A slightly off-topic question: is the conversion from &mut T to Pin<&mut T> done automatically for T: Unpin? In other words, I still don’t quite get how generators with fn resume(self: Pin<&mut Self>) will work. Do I have to pin one explicitly before using it even if it’s not self-referential, or will it be done automatically somehow behind the scenes?

Nemo157 · January 18, 2019, 12:08pm

There is Pin::new for this conversion, e.g. see that this test is able to avoid using any unsafe code when interacting with movable generators.

newpavlov · January 18, 2019, 12:44pm

So if we get for-loop integration, we will have to write code like this:

struct Foo { .. }
impl Generator for Foo { .. }

let result = for val in Pin::new(Foo::new(bar)) { .. }
// or
let result = for val in Foo::new(bar).pin() { .. }

Even though this custom generator does not do any self-referencing? Does not look that nice.

A good example where generators can result in a nicer API is a chunks_exact variant which returns the leftover slice instead of simply omitting it:

// `chunk` has type `&[T; 16]` and `leftover` has type `&[T]` with length < 16
let leftover = for chunk in Pin::new(slice.const_chunks::<16>()) { .. }

Looks quite clunky to me…

Nemo157 · January 18, 2019, 12:53pm

@newpavlov I don’t think so, I’m pretty sure that would be done via an impl<G> IntoIterator for G where G: Generator + Unpin which could be used like

let gen = || { yield 1; yield 2 };
for val in gen {
} 

let self_ref_gen = static || { yield 1; yield 2};
pin_mut!(self_ref_gen);
// or `let self_ref_gen = Box::pin(self_ref_gen);` to use the heap
for val in self_ref_gen {
}

I actually have an experimental crate near this area that I can probably use to verify this all works over the weekend.
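
Until that can be checked against the real (unstable) Generator trait, the idea can be approximated on stable Rust. Everything in this sketch is a stand-in: Gen is a hypothetical trait with a pinned resume, GenIter is a wrapper used because a downstream crate cannot add a blanket IntoIterator impl, and UpTo is a made-up movable generator.

```rust
use std::pin::Pin;

/// Hypothetical stand-in for the generator trait: `resume` takes a pinned
/// receiver and returns `Some(item)` until the generator is finished.
pub trait Gen {
    type Item;
    fn resume(self: Pin<&mut Self>) -> Option<Self::Item>;
}

/// Adapter that iterates any `Gen + Unpin`.
pub struct GenIter<G>(pub G);

impl<G: Gen + Unpin> Iterator for GenIter<G> {
    type Item = G::Item;
    fn next(&mut self) -> Option<G::Item> {
        // `Pin::new` is safe because `G: Unpin`; the caller never sees `Pin`.
        Pin::new(&mut self.0).resume()
    }
}

/// Hypothetical movable generator counting from `cur` up to (excluding) `end`.
pub struct UpTo {
    pub cur: u32,
    pub end: u32,
}

impl Gen for UpTo {
    type Item = u32;
    fn resume(mut self: Pin<&mut Self>) -> Option<u32> {
        if self.cur < self.end {
            let v = self.cur;
            self.cur += 1;
            Some(v)
        } else {
            None
        }
    }
}
```

With this, for v in GenIter(UpTo { cur: 1, end: 4 }) needs no pinning at the call site; a generator that is not Unpin would still have to be pinned by the caller first.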
