Crazy Idea: Coroutine Closures

DanielKeep · February 10, 2015, 1:24pm

Disclaimer: I am in no way suggesting this for now, or even for shortly after 1.0; there are too many other things that should take precedence. I’m just throwing this out there to see if it bounces.

So iterators are all nifty and stuff, right, but they’re kind of a pain to write. Let’s face it, Python already solved this:

def count_count_things(n, noun):
    for i in range(1, n+1):
        suffix = "s" if i > 1 else ""
        yield "%d... %d %s%s, ah ah ah!" % (i, i, noun, suffix)

Even C# can do this (note: written off the top of my head, no guarantee that it’ll compile; it’s been a while!):

IEnumerable<String> CountCountThings(int n, String noun) {
    for(int i=1; i<=n; ++i) {
        var suffix = (i > 1) ? "s" : "";
        yield return String.Format(
            "{0}... {0} {1}{2}, ah ah ah!",
            i, noun, suffix);
    }
}

We gotta get in on this action, I tells ya! But I have a problem with both of these. Both of them look like functions, but kinda-sorta aren’t. Unlike regular functions, when you call them, you’re actually constructing an object that implements a state machine which is equivalent in behaviour to the body of the function, but none of the code you wrote actually runs.

Does that sound familiar? It should.

let evens: Vec<_> = (0..5).map(|e| 2*e).collect();

Look at that closure; it looks like an expression, but kinda-sorta isn’t. Unlike other expressions, when you evaluate them, you’re actually constructing an object that implements a function which is equivalent in behaviour to the body of the expression, but none of the code you wrote actually runs.

headasplode.gif

That’s right, generators are a kind of closure. ahem:

fn count_count_things(n: u32, noun: String) -> impl FnCoro() -> String {
    move || {
        for i in 1..n+1 {
            let suffix = if i > 1 { "s" } else { "" };
            yield format!("{}... {} {}{}, ah ah ah!", i, i, noun, suffix);
        }
    }
}

Bam. Generators using closure syntax. Assuming a general state machine transform, and using Iterator for now until I come back to what exactly FnCoro looks like, the above would be transformed into something roughly akin to:

fn count_count_things(n: u32, noun: String) -> Iterator<String> {
    count_count_things_impl {
        _state: 0,
        n: n,
        noun: noun,
        _i_iter: None,
        _i: None,
    }
}

struct count_count_things_impl {
    _state: u32,
    n: u32,
    noun: String,
    _i_iter: Option<Range<u32>>,
    _i: Option<u32>,
}

impl Iterator for count_count_things_impl {
    fn next(&mut self) -> Option<String> {
        loop {
            match self._state {
                0 => {
                    // 1..n+1
                    self._i_iter = Some(1..n+1);
                    self._state = 1;
                },
                1 => {
                    // for i in 1..n+1 {
                    //     let suffix = if i > 1 { "s" } else { "" };
                    //     yield format!("{}... {} {}{}, ah ah ah!", i, i, noun, suffix);
                    let _i_iter = self._i_iter.as_ref().unwrap();
                    self._i = Some(_i_iter.next());
                    let i = self._i.as_ref().unwrap();
                    let suffix = if *i > 1 { "s" } else { "" };
                    let _result = format!("{}... {} {}{}, ah ah ah!"), i, i, noun, suffix);
                    self._state = 2;
                    return Some(_result);
                },
                _ => {
                    // }
                    self._i = None;
                    self._state = 1;
                }
            }
        }
    }
}

“But wait…”, I hear you say, “…why would you use closures? I mean, what possible use does the argument list provide?”

Oh girl, you will not believe how ballin’ that’s gonna be.

See, the Pythonistas realised long ago that generators, with a relatively little tweak, could do so much more than just yield a sequence of values.

>>> def biggerer():
...     n = None
...     while True:
...         n = yield n
...         n = n * 2
>>> an_biggerer = biggerer()
>>> next(an_biggerer) # kickstart
>>> an_biggerer.send(42)
84
>>> an_biggerer.send(12)
24
>>> an_biggerer.send("wowzers!")
'wowzers!wowzers!'

Still not impressed? Python 3.4’s entire asyncio package, the thing that allows one to write high-performance asynchronous, IO-bound code in Python, relies on this. It works by having all “coroutines” (i.e. generators following the coroutine protocol) yield special objects that describe how and when to “resume” a suspended coroutine, usually when an IO operation completes. When the scheduler (i.e. event loop) resumes a coroutine, it .send(...)s the result of the operation.

A simple example of this stolen from the first blog post I could find:

import asyncio
import time

@asyncio.coroutine
def sleepy():
    print("before sleep", time.time())
    yield from asyncio.sleep(5)
    print("after sleep", time.time())

asyncio.get_event_loop().run_until_complete(sleepy())

(Ignore the man behind the yield from curtain; for now, consider it a kind of subroutine-y yield variant.)

There’s actually nothing special about asyncio, mind; I used generators in Python 3.3 to build a completely asynchronous GUI framework for fun. Non-blocking UI operations without threads is a beautiful thing.

“But… we have libgreen…”

Which is complicated. This is simple. In fact, one of the best things about this over something like libgreen is that all context switches are explicit. This, it turns out, actually does matter. If you switch away, then switch back, things may have changed. Resources that were there prior to the switch may have been destroyed. An invariant you established prior to the switch may no longer hold.

The question is, how to reconcile this with coroutine closures? Waitaminute; we have that arg list!

fn biggerer() -> impl FnCoro(i32) -> Option<i32> {
    let mut next = None;
    move |yield: i32| {
        loop {
            let n = yield next;
            next = Some(n * 2);
        }
    }
}

“Ok, hold on now. First of all, what’s with yield in the arg list?”

Coroutine closures don’t have arguments in the conventional sense. Anything you “pass” to one of them gets injected as the result of the last yield expression. If you don’t specify it, it’s implicitly ().

“Another problem: in your Python example, you had to ‘kickstart’ the generator. How’s that going to work here?”

Ok, so I’m skimming details a bit. Turns out that you can’t really just call a FnCoro. In fact, FnCoro would really look like this:

trait FnCoro<Arg> where <Self as FnCoro<Arg>>::Output: Sized {
    type Output;

    fn start(&mut self) -> CoroResult<<Self as FnMut<Args>>::Output>;
    fn resume(&mut self, send: CoroResume<Arg>) -> CoroResult<<Self as FnMut<Args>>::Output>;
}

enum CoroResult<Output> {
    Terminated,
    Yielded(Output),
}

enum CoroResume<Arg> {
    Value(Arg),
    Unwind(Box<Any+'static /*'*/>),
}

“Wait, wait, wait… Unwind?!”

Well, if you yield control to another coroutine in order to do some work (like file IO), and that operation fails, you need some way to propagate that failure. In Python, this is done by calling the throw method to raise an exception inside a generator.

“You still haven’t explained that yield from business from earlier.”

Basically, the problem for Python was that delegating to a sub-generator was painful. To give you an idea how painful, this:

def writer_wrapper(coro):
    yield from coro

is equivalent to this:

def writer_wrapper(coro):
    """Works. Manually catches exceptions and throws them"""
    coro.send(None)  # prime the coro
    while True:
        try:
            try:
                x = (yield)
            except Exception as e:
                coro.throw(e)
            else:
                coro.send(x)
        except StopIteration:
            pass

(Thanks to this StackOverflow question for the example.)

An additional benefit was that yield from allowed the interpreter to “short circuit” resume/suspend. Normally, on resumption, the interpreter would have to go back down through the generators one level at a time, resuming each in turn. Then, on the innermost yield, it’d have to slowly wind its way back out, unwinding one stack frame at a time.

yield from allows the interpreter to just remember which frame was the inner-most, and jump there directly. On suspension, it remembers the stack frame and jumps back out. At this point, I hope it goes without saying that we want yield from as well.

“Hang on, you called it FnCoro… so how does this interact with the other Fn* traits, or Iterator for that matter?”

The other Fn* traits are a bit of a wash. There’s various ways they could interact:

FnCoro(R) -> Y could implement FnMut(Option<CoroResume<R>>) -> CoroResult<Y>.
FnCoro() -> Y could implement FnMut() -> Option<Y>.
FnMut(Option<CoroResume<R>>) -> CoroResult<Y> could implement FnCoro(R) -> Y.
FnOnce() -> Option<Y> or similar could implement FnCoro() -> Y.

None of them particularly stand out to me, except perhaps FnCoro implementing FnMut. In fact, you could merge start and resume by adding Start variant to CoroResume.

As for Iterator:

struct CoroIter<F>(F);

impl<F, Y> IntoIterator for F where F: FnCoro() -> Y + Sized {
    type Iter = CoroIter<F>;

    fn into_iter(self) -> CoroIter<F> {
        self.start();
        CoroIter(self)
    }
}

impl<F, Y> Iterator for CoroIter<F> where F: FnCoro() -> Y + Sized {
    type Item = Y;

    fn next(&mut self) -> Option<Y> {
        match self.0.resume(CoroResume::Value(()) {
            CoroResult::Terminated => None,
            CoroResult::Yielded(v) => Some(v)
        }
    }
}

“So you’re saying that this weird-ass addition to closures gives us trivial iterators and an explicit, high performance asynchronous primitive?”

Yes.

There are really only two absolutely massive problems:

Edit: turns out yield is reserved. That’s 50% less massive problems after just an hour! At this rate, it’ll be implemented in no time!

Originally: yield isn’t a keyword. C# solved this by using a keyword “pair” in the form of yield return, which would work, but is kinda fugly. Plus, then we’d have yield from return since from isn’t a keyword either. Yech.
I’m really unsure how in the high-flying bacon lifetimes and drop semantics are going to interact with this. shudder

Thoughts?

kennytm · February 10, 2015, 1:57pm

FYI, there is https://github.com/rust-lang/rfcs/issues/388 for something similar.

Is there an RFC for this impl Trait return type thing?

While yield isn't a keyword, it is reserved since 0.8, so we should be free to turn it into a keyword in the future. We could use a macro to express yield_from!(subcoro).

DanielKeep · February 10, 2015, 2:11pm

Pfft. That doesn't have my genius idea of horribly mutilating closures! It's clearly not as awesome.

There's something somewhere, but I really can't be bothered trying to track it down right now. Basic summary: -> impl Trait means "returns a type that implements Trait, but it's unnameable." I believe it was originally introduced to make returning iterators less of a pain in the sitting hardware.

Aww yiss.

As for the macro... we could, but it kinds feels like a really important thing to make first-class, if only to promote its use. Since there is no pre-existing yield syntax, we can just say "yield from is distinct from yield (from)".

glaebhoerl · February 10, 2015, 2:40pm

Abstract return types are https://github.com/rust-lang/rfcs/pull/105.

Generators have deep connections to delimited continuations and exceptions, though I’m not deeply familiar with the territory myself. This and this might be food for thought.

I confess to only having skimmed the OP, but I wonder which parts of the closure connection are truly important here? An idea that’s been sitting in my head for a very long time is that we could extend the anonymous closures mechanism from the Fn traits to trait object literals in general, with the same kind of capture-environment-as-self magic, e.g. |n| n + 1 would be equivalent to Fn { .call(n) = n + 1 } (specific syntax is highly debatable, here I’m punning on my favored syntax for struct literals).

Then for iterators we could do (using my preferred syntax for abstract return types):

fn count_count_things(n: u32, noun: String) -> abstract Iterator<String> {
    let mut i = 0;
    move Iterator {
        .next() = {
            i += 1;
            if i >= n { 
                None 
            } else {
                let suffix = if i > 1 { "s" } else { "" };
                Some(format!("{}... {} {}{}, ah ah ah!", i, i, noun, suffix))
            }
        }
    }
}

Of course this is missing exactly the yield magic which is the topic of discussion, but I wonder if this might be a cleaner base to build on than trying to shoehorn things into the Fn hierarchy?

DanielKeep · February 10, 2015, 3:11pm

I think it would be hard and/or weird to not have a first-class concept. The problem is that, if I take the above and try to apply it to FnMut (which FnCoro is basically a specialisation of), I get:

fn count_count_things(n: u32, noun: String) -> abstract FnMut(CoroResume<()>) -> CoroResult<String> {
    move FnMut {
        .call_mut(resume) = {
            for i in 1..n+1 {
               let suffix = if i > 1 { "s" } else { "" };
               yield format!("{}... {} {}{}, ah ah ah!", i, i, noun, suffix));
            }
        }
    }
}

The problem here is that the resume argument is required for the state machine, but is never (and cannot ever) be used by the actual body. That’s not even counting the fact that the contents of resume, when you’re dealing with a coroutine, will teleport from the argument into the result of the last yield out of nowhere. Also, call_mut has to return a CoroResult<String>, but that’s just kinda been waved away. It’s just… weird.

One of the things I liked about creating a new kind of closure is that it makes the generator more or less a single logical unit. Oh, sure, the explicit call interface reveals it for what it really is, but you don’t need to worry about that unless you’re manually interacting with a coroutine.

Also, removing the for loop is cheating. I went with that example because I wanted to keep things reasonably simple, but yield can show up more or less anywhere: you have to express it as a state machine.

system · March 25, 2019, 8:23am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
[Pre-RFC] Coroutines language design	18	3317	March 25, 2019
[Post-RFC] Stackless Coroutines language design	67	15085	March 25, 2019
We can have nice AsyncIterator::map now! language design	0	363	November 6, 2024
Blog Post: Futures Concurrency III language design	2	917	May 16, 2022
Help test async/await/generators/coroutines! announcements	34	22277	March 25, 2019

Crazy Idea: Coroutine Closures

Related topics