Pre-RFC Local Wakers
Currently, wakers in Rust are all Send + Sync. This implies that waker implementations must all be thread safe, which forbids useful optimizations in thread-per-core runtimes.
The following API additions are proposed to add support for local wakers.
LocalWaker
LocalWaker would be a struct analogous to Waker, but without the Send + Sync trait bounds. Just like thread safe wakers it would be constructed from a RawWaker and a RawWakerVTable.
Context
Context would get two additional methods: local_waker() to get a LocalWaker, and set_local_waker(&mut self, waker: &LocalWaker) to set it.
If a local waker isn't set, Context will use the Waker it was given at construction to create a LocalWaker. This way, all runtimes would support local wakers by default, while having the ability to specialize if they want to. Opting out of Waker is also possible if needed, by just panicking on wake().
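The fallback rule above can be sketched with toy types. This is only an illustration of the proposed behavior, not the real std::task::Context: ToyWaker, ToyLocalWaker and ToyContext are hypothetical stand-ins built on plain closures, and the method names merely mirror the ones proposed here.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Toy stand-in for `Waker`: a thread-safe callback (hypothetical).
#[derive(Clone)]
pub struct ToyWaker(pub Arc<dyn Fn() + Send + Sync>);

/// Toy stand-in for the proposed `LocalWaker`: no `Send + Sync` bound (hypothetical).
#[derive(Clone)]
pub struct ToyLocalWaker(pub Rc<dyn Fn()>);

impl ToyWaker {
    pub fn wake_by_ref(&self) {
        (self.0)()
    }
}

impl ToyLocalWaker {
    pub fn wake_by_ref(&self) {
        (self.0)()
    }
}

/// Toy stand-in for `Context` with the two proposed methods.
pub struct ToyContext<'a> {
    waker: &'a ToyWaker,
    local_waker: Option<&'a ToyLocalWaker>,
}

impl<'a> ToyContext<'a> {
    pub fn from_waker(waker: &'a ToyWaker) -> Self {
        ToyContext { waker, local_waker: None }
    }

    /// Lets a runtime install a specialized local waker.
    pub fn set_local_waker(&mut self, waker: &'a ToyLocalWaker) {
        self.local_waker = Some(waker);
    }

    pub fn waker(&self) -> &'a ToyWaker {
        self.waker
    }

    /// Returns the installed local waker, or one that forwards to the
    /// thread-safe waker when none was set.
    pub fn local_waker(&self) -> ToyLocalWaker {
        match self.local_waker {
            Some(local) => local.clone(),
            None => {
                let waker = self.waker.clone();
                ToyLocalWaker(Rc::new(move || waker.wake_by_ref()))
            }
        }
    }
}
```

In this sketch a future that only calls local_waker() works on any runtime, and a runtime that cares about the cost can call set_local_waker to avoid going through the thread-safe path.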
LocalWake (possibly)
LocalWake would be a trait analogous to Wake, that would use Rc instead of Arc. It would look roughly like this:
pub trait LocalWake {
    fn wake(self: Rc<Self>);
    fn wake_by_ref(self: &Rc<Self>) {
        self.clone().wake();
    }
}
impl<W: LocalWake + 'static> From<Rc<W>> for LocalWaker {
    fn from(waker: Rc<W>) -> LocalWaker { /* .. */ }
}
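As a rough illustration of how an implementor might use such a trait, the sketch below repeats the trait definition so that it compiles on its own and adds a hypothetical WakeCounter type that is not part of the proposal. It leaves out the From<Rc<W>> conversion, since that depends on LocalWaker itself.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// The trait as proposed above, repeated so the snippet is self-contained.
pub trait LocalWake {
    fn wake(self: Rc<Self>);

    fn wake_by_ref(self: &Rc<Self>) {
        self.clone().wake();
    }
}

/// Hypothetical waker that only counts how many times it was woken.
/// A plain `Cell` is enough because the value never leaves its thread.
pub struct WakeCounter {
    pub count: Cell<u32>,
}

impl LocalWake for WakeCounter {
    fn wake(self: Rc<Self>) {
        self.count.set(self.count.get() + 1);
    }
}
```

Calling wake_by_ref on an Rc<WakeCounter> bumps a non-atomic reference count and a non-atomic counter, which is the kind of saving the trait is meant to allow.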
Drawbacks
Supporting both local wakers and thread safe wakers would likely require two allocations instead of one, which disincentivizes specialization. This could lead runtimes to pick one kind and stick with it. However, nothing prevents a runtime from making this behavior customizable to its users' needs.
Also, if a runtime decides to support local wakers only, then it is going to be incompatible with most futures in the ecosystem, since most would only use waker().
Example
This is a use case example that shows the kinds of things we would be able to do with local wakers, which are just too expensive or too difficult to do with thread safe wakers.
Let's say that we want to implement a join! macro that doesn't poll spuriously. We might want to give each joined future a separate waker, so we can tell which futures were woken.
We might try something like this:
pub struct JoinWaker {
    task_waker: Cell<Option<LocalWaker>>,
    // this tells which futures have been woken
    flags: Cell<u64>,
}
This join macro would be limited to 64 futures, since there are only 64 flags. Then we can create an array of raw waker vtables and use them to construct a different waker for each future. Each waker vtable will set a different flag when woken.
// each raw waker vtable would flag a different bit.
// and they would all wake the task_waker
const JOIN_RAW_WAKER_VTABLES: [RawWakerVTable; 64] = /* .. */;
It would also be necessary to replace the task_waker on every poll, since we are not guaranteed to always be given the same waker. Therefore, we store it in a Cell, so we can replace it on poll.
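The bookkeeping described above could look roughly like the following. LocalWaker does not exist on stable Rust, so this sketch substitutes a hypothetical TaskWaker alias (an Rc'd closure) for it, and the wake_index method stands in for what the vtable for a given future would do. The method names are assumptions made for illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for the proposed `LocalWaker`.
pub type TaskWaker = Rc<dyn Fn()>;

pub struct JoinWaker {
    task_waker: Cell<Option<TaskWaker>>,
    // one bit per joined future; a set bit means "was woken"
    flags: Cell<u64>,
}

impl JoinWaker {
    pub fn new() -> Self {
        JoinWaker {
            task_waker: Cell::new(None),
            flags: Cell::new(0),
        }
    }

    /// Called at the start of every poll, since the task's waker may change.
    pub fn set_task_waker(&self, waker: TaskWaker) {
        self.task_waker.set(Some(waker));
    }

    /// What the waker for future `index` would do when woken.
    pub fn wake_index(&self, index: u32) {
        assert!(index < 64, "only 64 flags are available");
        self.flags.set(self.flags.get() | (1u64 << index));
        // A `Cell` cannot be borrowed, so take the waker out, put a clone
        // back, and call the copy we kept.
        let waker = self.task_waker.take();
        self.task_waker.set(waker.clone());
        if let Some(wake) = waker {
            wake();
        }
    }

    /// Returns the set of woken futures and clears it.
    pub fn take_flags(&self) -> u64 {
        self.flags.replace(0)
    }
}
```

Nothing here locks or uses atomics; every operation is a plain load or store on thread-local data.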
If we wanted to make this kind of macro today, we would need to write it like this:
pub struct JoinWaker {
    task_waker: Mutex<Option<Waker>>,
    flags: AtomicU64,
}
On every call to wake and on every poll we would need to lock the mutex, and the flags would now need to be atomic, which they did not need to be before. If our runtime uses a thread-per-core architecture, this amount of unnecessary synchronization might be a deal breaker.
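For comparison, here is a minimal sketch of the thread safe variant that compiles on stable Rust today. It uses the std Wake trait with one Arc per future instead of an array of vtables, which is a simplification and not necessarily how such a macro would be written. IndexedWaker, CountingTask, waker_for and the method names are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

pub struct JoinWaker {
    task_waker: Mutex<Option<Waker>>,
    flags: AtomicU64,
}

impl JoinWaker {
    pub fn new() -> Arc<Self> {
        Arc::new(JoinWaker {
            task_waker: Mutex::new(None),
            flags: AtomicU64::new(0),
        })
    }

    /// Called on every poll; needs the lock even on a single thread.
    pub fn set_task_waker(&self, waker: &Waker) {
        *self.task_waker.lock().unwrap() = Some(waker.clone());
    }

    /// Returns the set of woken futures and clears it.
    pub fn take_flags(&self) -> u64 {
        self.flags.swap(0, Ordering::AcqRel)
    }
}

/// Hypothetical per-future waker that remembers which flag it owns.
pub struct IndexedWaker {
    shared: Arc<JoinWaker>,
    index: u32,
}

impl Wake for IndexedWaker {
    fn wake(self: Arc<Self>) {
        Self::wake_by_ref(&self);
    }

    fn wake_by_ref(self: &Arc<Self>) {
        self.shared.flags.fetch_or(1u64 << self.index, Ordering::AcqRel);
        // Clone the waker under the lock, then wake after the guard is dropped.
        let waker = self.shared.task_waker.lock().unwrap().clone();
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

/// Builds the waker handed to the future at position `index`.
pub fn waker_for(shared: &Arc<JoinWaker>, index: u32) -> Waker {
    assert!(index < 64, "only 64 flags are available");
    Waker::from(Arc::new(IndexedWaker {
        shared: Arc::clone(shared),
        index,
    }))
}

/// Hypothetical task waker used only to observe wake-ups.
#[derive(Default)]
pub struct CountingTask {
    pub wakes: AtomicU64,
}

impl Wake for CountingTask {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}
```

Every wake in this version pays for an atomic read-modify-write on the flags, a mutex lock, and atomic reference counting on the cloned Waker, even when all of it happens on one thread.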