Disclaimer: I am in no way suggesting this for now, or even for shortly after 1.0; there are too many other things that should take precedence. I’m just throwing this out there to see if it bounces.
So iterators are all nifty and stuff, right, but they’re kind of a pain to write. Let’s face it, Python already solved this:
def count_count_things(n, noun):
for i in range(1, n+1):
suffix = "s" if i > 1 else ""
yield "%d... %d %s%s, ah ah ah!" % (i, i, noun, suffix)
Even C# can do this (note: written off the top of my head, no guarantee that it’ll compile; it’s been a while!):
IEnumerable<String> CountCountThings(int n, String noun) {
for(int i=1; i<=n; ++i) {
var suffix = (i > 1) ? "s" : "";
yield return String.Format(
"{0}... {0} {1}{2}, ah ah ah!",
i, noun, suffix);
}
}
We gotta get in on this action, I tells ya! But I have a problem with both of these. Both of them look like functions, but kinda-sorta aren’t. Unlike regular functions, when you call them, you’re actually constructing an object that implements a state machine which is equivalent in behaviour to the body of the function, but none of the code you wrote actually runs.
Does that sound familiar? It should.
let evens: Vec<_> = (0..5).map(|e| 2*e).collect();
Look at that closure; it looks like an expression, but kinda-sorta isn’t. Unlike other expressions, when you evaluate them, you’re actually constructing an object that implements a function which is equivalent in behaviour to the body of the expression, but none of the code you wrote actually runs.
headasplode.gif
That’s right, generators are a kind of closure. ahem:
fn count_count_things(n: u32, noun: String) -> impl FnCoro() -> String {
move || {
for i in 1..n+1 {
let suffix = if i > 1 { "s" } else { "" };
yield format!("{}... {} {}{}, ah ah ah!", i, i, noun, suffix);
}
}
}
Bam. Generators using closure syntax. Assuming a general state machine transform, and using Iterator
for now until I come back to what exactly FnCoro
looks like, the above would be transformed into something roughly akin to:
fn count_count_things(n: u32, noun: String) -> Iterator<String> {
count_count_things_impl {
_state: 0,
n: n,
noun: noun,
_i_iter: None,
_i: None,
}
}
struct count_count_things_impl {
_state: u32,
n: u32,
noun: String,
_i_iter: Option<Range<u32>>,
_i: Option<u32>,
}
impl Iterator for count_count_things_impl {
fn next(&mut self) -> Option<String> {
loop {
match self._state {
0 => {
// 1..n+1
self._i_iter = Some(1..n+1);
self._state = 1;
},
1 => {
// for i in 1..n+1 {
// let suffix = if i > 1 { "s" } else { "" };
// yield format!("{}... {} {}{}, ah ah ah!", i, i, noun, suffix);
let _i_iter = self._i_iter.as_ref().unwrap();
self._i = Some(_i_iter.next());
let i = self._i.as_ref().unwrap();
let suffix = if *i > 1 { "s" } else { "" };
let _result = format!("{}... {} {}{}, ah ah ah!"), i, i, noun, suffix);
self._state = 2;
return Some(_result);
},
_ => {
// }
self._i = None;
self._state = 1;
}
}
}
}
}
“But wait…”, I hear you say, “…why would you use closures? I mean, what possible use does the argument list provide?”
Oh girl, you will not believe how ballin’ that’s gonna be.
See, the Pythonistas realised long ago that generators, with a relatively little tweak, could do so much more than just yield a sequence of values.
>>> def biggerer():
... n = None
... while True:
... n = yield n
... n = n * 2
>>> an_biggerer = biggerer()
>>> next(an_biggerer) # kickstart
>>> an_biggerer.send(42)
84
>>> an_biggerer.send(12)
24
>>> an_biggerer.send("wowzers!")
'wowzers!wowzers!'
Still not impressed? Python 3.4’s entire asyncio
package, the thing that allows one to write high-performance asynchronous, IO-bound code in Python, relies on this. It works by having all “coroutines” (i.e. generators following the coroutine protocol) yield special objects that describe how and when to “resume” a suspended coroutine, usually when an IO operation completes. When the scheduler (i.e. event loop) resumes a coroutine, it .send(...)
s the result of the operation.
A simple example of this stolen from the first blog post I could find:
import asyncio
import time
@asyncio.coroutine
def sleepy():
print("before sleep", time.time())
yield from asyncio.sleep(5)
print("after sleep", time.time())
asyncio.get_event_loop().run_until_complete(sleepy())
(Ignore the man behind the yield from
curtain; for now, consider it a kind of subroutine-y yield
variant.)
There’s actually nothing special about asyncio
, mind; I used generators in Python 3.3 to build a completely asynchronous GUI framework for fun. Non-blocking UI operations without threads is a beautiful thing.
“But… we have libgreen
…”
Which is complicated. This is simple. In fact, one of the best things about this over something like libgreen
is that all context switches are explicit. This, it turns out, actually does matter. If you switch away, then switch back, things may have changed. Resources that were there prior to the switch may have been destroyed. An invariant you established prior to the switch may no longer hold.
The question is, how to reconcile this with coroutine closures? Waitaminute; we have that arg list!
fn biggerer() -> impl FnCoro(i32) -> Option<i32> {
let mut next = None;
move |yield: i32| {
loop {
let n = yield next;
next = Some(n * 2);
}
}
}
“Ok, hold on now. First of all, what’s with yield
in the arg list?”
Coroutine closures don’t have arguments in the conventional sense. Anything you “pass” to one of them gets injected as the result of the last yield
expression. If you don’t specify it, it’s implicitly ()
.
“Another problem: in your Python example, you had to ‘kickstart’ the generator. How’s that going to work here?”
Ok, so I’m skimming details a bit. Turns out that you can’t really just call a FnCoro
. In fact, FnCoro
would really look like this:
trait FnCoro<Arg> where <Self as FnCoro<Arg>>::Output: Sized {
type Output;
fn start(&mut self) -> CoroResult<<Self as FnMut<Args>>::Output>;
fn resume(&mut self, send: CoroResume<Arg>) -> CoroResult<<Self as FnMut<Args>>::Output>;
}
enum CoroResult<Output> {
Terminated,
Yielded(Output),
}
enum CoroResume<Arg> {
Value(Arg),
Unwind(Box<Any+'static /*'*/>),
}
“Wait, wait, wait… Unwind
?!”
Well, if you yield control to another coroutine in order to do some work (like file IO), and that operation fails, you need some way to propagate that failure. In Python, this is done by calling the throw
method to raise an exception inside a generator.
“You still haven’t explained that yield from
business from earlier.”
Basically, the problem for Python was that delegating to a sub-generator was painful. To give you an idea how painful, this:
def writer_wrapper(coro):
yield from coro
is equivalent to this:
def writer_wrapper(coro):
"""Works. Manually catches exceptions and throws them"""
coro.send(None) # prime the coro
while True:
try:
try:
x = (yield)
except Exception as e:
coro.throw(e)
else:
coro.send(x)
except StopIteration:
pass
(Thanks to this StackOverflow question for the example.)
An additional benefit was that yield from
allowed the interpreter to “short circuit” resume/suspend. Normally, on resumption, the interpreter would have to go back down through the generators one level at a time, resuming each in turn. Then, on the innermost yield
, it’d have to slowly wind its way back out, unwinding one stack frame at a time.
yield from
allows the interpreter to just remember which frame was the inner-most, and jump there directly. On suspension, it remembers the stack frame and jumps back out. At this point, I hope it goes without saying that we want yield from
as well.
“Hang on, you called it FnCoro
… so how does this interact with the other Fn*
traits, or Iterator
for that matter?”
The other Fn*
traits are a bit of a wash. There’s various ways they could interact:
-
FnCoro(R) -> Y
could implementFnMut(Option<CoroResume<R>>) -> CoroResult<Y>
. -
FnCoro() -> Y
could implementFnMut() -> Option<Y>
. -
FnMut(Option<CoroResume<R>>) -> CoroResult<Y>
could implementFnCoro(R) -> Y
. -
FnOnce() -> Option<Y>
or similar could implementFnCoro() -> Y
.
None of them particularly stand out to me, except perhaps FnCoro
implementing FnMut
. In fact, you could merge start
and resume
by adding Start
variant to CoroResume
.
As for Iterator
:
struct CoroIter<F>(F);
impl<F, Y> IntoIterator for F where F: FnCoro() -> Y + Sized {
type Iter = CoroIter<F>;
fn into_iter(self) -> CoroIter<F> {
self.start();
CoroIter(self)
}
}
impl<F, Y> Iterator for CoroIter<F> where F: FnCoro() -> Y + Sized {
type Item = Y;
fn next(&mut self) -> Option<Y> {
match self.0.resume(CoroResume::Value(()) {
CoroResult::Terminated => None,
CoroResult::Yielded(v) => Some(v)
}
}
}
“So you’re saying that this weird-ass addition to closures gives us trivial iterators and an explicit, high performance asynchronous primitive?”
Yes.
There are really only two absolutely massive problems:
-
Edit: turns out
yield
is reserved. That’s 50% less massive problems after just an hour! At this rate, it’ll be implemented in no time!Originally:
yield
isn’t a keyword. C# solved this by using a keyword “pair” in the form ofyield return
, which would work, but is kinda fugly. Plus, then we’d haveyield from return
sincefrom
isn’t a keyword either. Yech. -
I’m really unsure how in the high-flying bacon lifetimes and drop semantics are going to interact with this. shudder
Thoughts?