Given the resurgence of interest in async IO libraries, I am thinking of re-introducing my RFC for coroutine support in Rust, updated to keep up with the times (though coroutines are useful for more than just async).
Anyhow, here's the latest draft. I have a feeling it is coming off a bit terse… I’d appreciate feedback on what seems unclear and what sections should be expanded.
You will get an error because Y couldn’t be inferred. This isn’t a problem with regular closures because you can specify the return type with || -> Ret. With coroutines, you’ll instead have to invoke takes_coro::<String, _, _>(...), which can be confusing.
However, I don’t know whether in practice that’s just a minor nit or a real problem.
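To make the inference problem concrete, here is a minimal sketch that runs on stable Rust by using ordinary closures in place of coroutines. CoResult is a stand-in for the enum in the draft, and takes_coro with the parameter order <Y, R, F> is made up for this illustration, so treat both as assumptions rather than the RFC's actual definitions.

```rust
// Stand-in for the RFC's result type; the exact definition is an assumption.
#[derive(Debug, PartialEq)]
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

// Hypothetical consumer: drives the "coroutine" to completion and
// collects everything it yields.
fn takes_coro<Y, R, F>(mut coro: F) -> (Vec<Y>, R)
where
    F: FnMut() -> CoResult<Y, R>,
{
    let mut yielded = Vec::new();
    loop {
        match coro() {
            CoResult::Yield(y) => yielded.push(y),
            CoResult::Return(r) => return (yielded, r),
        }
    }
}

fn main() {
    // This body never yields, so nothing pins down `Y`:
    // let _ = takes_coro(|| CoResult::Return(())); // error: type annotations needed

    // Option 1: turbofish on the callee.
    let (ys, ()) = takes_coro::<String, _, _>(|| CoResult::Return(()));
    assert!(ys.is_empty());

    // Option 2: annotate the closure's return type.
    let (ys, ()) = takes_coro(|| -> CoResult<String, ()> { CoResult::Return(()) });
    assert!(ys.is_empty());

    // When the body does yield, `Y` is inferred without any annotation.
    let mut left = 2;
    let (ys, ()) = takes_coro(move || {
        if left > 0 {
            left -= 1;
            CoResult::Yield(left)
        } else {
            CoResult::Return(())
        }
    });
    assert_eq!(ys, vec![1, 0]);
}
```

The annotation is only needed in the corner case where the body has no yield at all to infer Y from.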
I've considered that, but then we'd need to add a third variant to CoResult. And how would that work with parameterized coroutines? Also, we'd probably have to box the inner iterator.
I am hoping that Rust doesn't need this, because, unlike Python, it should be able to inline iteration over the inner.
I didn't explicitly mention this, but you should be able to write || -> CoResult<String,()> {...}, just like for regular closures.
Please see Motivating Example #3. We'll probably need some syntax sugar for this to be ergonomic, but Rust syntax extensions should be able to cover that.
I am thinking that yield would have the same precedence as return; anything else would be confusing. So you'd have to parenthesize in the latter case: (yield a)?
Firstly, thanks for this (pre-)RFC, it looks really promising.
Wrt. the Iterator adapter, wouldn’t the following be better, or did I miss something?
impl<G, T> Iterator for G where G: FnMut() -> CoResult<T, ()> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        match self() {
            Yield(x) => Some(x),
            Return(_) => None,
        }
    }
}
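Outside the standard library, a blanket impl of Iterator for a bare type parameter is rejected by the orphan rules, so a version that can be tried today needs a wrapper type. The following is a sketch under that assumption: CoroIter and count_to_three are hypothetical names, CoResult is a stand-in for the draft's enum, and a plain closure plays the coroutine.

```rust
#[derive(Debug, PartialEq)]
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

// Hypothetical newtype around anything callable like a parameterless coroutine.
struct CoroIter<G>(G);

impl<G, T> Iterator for CoroIter<G>
where
    G: FnMut() -> CoResult<T, ()>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        match (self.0)() {
            CoResult::Yield(x) => Some(x),
            CoResult::Return(()) => None,
        }
    }
}

// Closure standing in for a coroutine that yields 0, 1, 2 and then returns.
fn count_to_three() -> impl FnMut() -> CoResult<u32, ()> {
    let mut n = 0;
    move || {
        if n < 3 {
            n += 1;
            CoResult::Yield(n - 1)
        } else {
            CoResult::Return(())
        }
    }
}

fn main() {
    let collected: Vec<u32> = CoroIter(count_to_three()).collect();
    assert_eq!(collected, vec![0, 1, 2]);
}
```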
Lastly, I wonder whether the restrictions on borrows across yields might sometimes be a bit surprising (though needed), e.g. if the borrow is to some value on the heap guarded by a hoisted struct, or in a case like this:
// in coro
|outer| {
    for inner in outer.iter() {
        // isn't the OuterIter now on the stack (=> hoisted),
        // and inner has a borrow into it?
        for element in inner.iter() {
            // now inner and InnerIter live across a yield, which they are
            // not allowed to, because they borrow from the hoisted OuterIter?!
            yield element;
        }
    }
}
I probably misunderstood or overlooked something, but if not, wouldn’t this be kind of a problem? (Coros would still be helpful, but a bit less so, and they would also be more confusing.)
EDIT: I just noticed there is an error: |outer| would be a parameter returned from yield, so the correct version would bring outer into the scope of the coroutine by closing over it.
Correction:
type ElCoro = FnMut() -> CoResult<El, ()>;
fn random_fn<'a>(outer: &'a Some2DCollectionOfEl) -> impl ElCoro + 'a {
    || {
        for inner in outer.iter() {
            // ...
            for element in inner.iter() {
                // ...
                yield element;
            }
        }
    }
}
Indeed, it would! Those examples are more like sketches, I doubt they'd compile.
Depends on the thing you are iterating over:
If it's something like Vec<Vec<i32>>, this should be fine, because the outer iter returns a &Vec<i32>, and the inner iter would reference into that value (which lives in the outer collection), not into the outer iter.
If, however, you are iterating by value over, say, Vec<[i32;10]>, the outer iter would return a [i32; 10], which would get hoisted, and, yes, we'd have a problem.
IMO, this case should be infrequent and is easy to fix by switching to iteration by reference.
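The by-reference case can be checked on stable Rust by writing out by hand the state that a coroutine would hoist. This is only a sketch: Elements is a hypothetical name, and the struct layout is an assumption about what a compiler might generate, not the RFC's actual lowering. The point it illustrates is that both stored iterators borrow from the collection with lifetime 'a, and neither borrows from the other.

```rust
use std::slice::Iter;

// Hand-written stand-in for the state a coroutine would hoist into its
// environment: both iterators must survive across each "yield".
struct Elements<'a> {
    outer: Iter<'a, Vec<i32>>,
    inner: Option<Iter<'a, i32>>,
}

impl<'a> Elements<'a> {
    fn new(data: &'a [Vec<i32>]) -> Self {
        Elements { outer: data.iter(), inner: None }
    }
}

impl<'a> Iterator for Elements<'a> {
    type Item = &'a i32;

    fn next(&mut self) -> Option<&'a i32> {
        loop {
            // Resume the inner loop where the last "yield" left off.
            if let Some(inner) = self.inner.as_mut() {
                if let Some(element) = inner.next() {
                    return Some(element); // plays the role of `yield element`
                }
            }
            // Inner loop finished (or not started): advance the outer loop.
            // `row` is a `&'a Vec<i32>` pointing into `data`, not into `self.outer`.
            let row = self.outer.next()?;
            self.inner = Some(row.iter());
        }
    }
}

fn main() {
    let data = vec![vec![1, 2], vec![], vec![3]];
    let flat: Vec<i32> = Elements::new(&data).copied().collect();
    assert_eq!(flat, vec![1, 2, 3]);
}
```

With by-value rows, the inner iterator would have to borrow from another field of the same struct, which safe Rust cannot express; that is the problematic case described above.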
while let Yield(x) = coro1 {
print!("{} ", x); // prints "0 1 2 3 4 5 6 7 8 9 "
}
@vadimcn Is the absence of parentheses intended? coro1() seems more logical and readable.
Should coroutines be introduced with a special keyword, to distinguish them from regular closures?
For example, coro |a, b| { ... }
I would prefer so, even if coro sounds a bit weird and a better name might be preferred. I think this would be more readable:
fn iter(&'a self) -> impl Iterator<Item = T> + 'a {
    coro {
        let mut i = 0;
        while i < self.len() {
            yield self[i];
            i += 1;
        }
    }
}
as || could become optional in that case. And when the body of a function is only a coroutine it would allow a sugar like:
coro fn iter(&'a self) -> impl Iterator<Item = T> + 'a {
    let mut i = 0;
    while i < self.len() {
        yield self[i];
        i += 1;
    }
}
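For comparison, roughly the same iterator can be written on stable Rust with std::iter::from_fn, where the loop counter that the coroutine would keep alive across yields becomes captured closure state. Samples is a hypothetical container invented for this sketch, and the mapping to yield in the comments is an informal reading, not a defined desugaring.

```rust
// Hypothetical container, only here to give `iter` a `self` to index into.
struct Samples {
    values: Vec<i32>,
}

impl Samples {
    fn iter<'a>(&'a self) -> impl Iterator<Item = i32> + 'a {
        // `i` plays the role of the local that survives across each `yield`.
        let mut i = 0;
        std::iter::from_fn(move || {
            if i < self.values.len() {
                let item = self.values[i];
                i += 1;
                Some(item) // corresponds to `yield self[i]`
            } else {
                None // corresponds to falling off the end of the coroutine
            }
        })
    }
}

fn main() {
    let samples = Samples { values: vec![10, 20, 30] };
    let seen: Vec<i32> = samples.iter().collect();
    assert_eq!(seen, vec![10, 20, 30]);
}
```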
Now I'm a bit dissatisfied with the idea of encoding async/await as a combination of functions and macros on top of generators. I would very much like all monadic types to benefit from the same sugar. It is still unclear to me whether we could allow something as powerful as F# computation expressions without relying on pure source transformation.
I think we need to support efficient iterator nesting. An interesting approach is described in this paper from Spec#, in the section titled Translation of nested iterators. This approach allows nested iterators to perform in linear time. Have you considered this approach?
@burakumin: I think the absence of parentheses is not intended there. Wrt. the coro (or similar) keyword, I think it is neither good nor bad, but I’m not the biggest fan of the “more readable” variants you provided. For one thing, I’m not a fan of unnecessary special cases like dropping the || when the coro has no parameters (coro {...} instead of coro || {...}); it’s just not worth typing two fewer characters. Also, I prefer to have nothing in a function signature that is not part of the actual signature; this includes struct destructuring in the parameter list, the mut in (..., mut foo: Bar, ...), and your coro function prefix. (Though I still use the mut in the end.) While this is just a very subjective preference, the main argument against it is that yield is already a reserved keyword, while coro isn’t (and using yield there also seems suboptimal).
TL;DR: nested loop/coro optimization: yes! But as an implementation detail.
@gabomgp I don’t think it is necessary to include any language/API elements for this. In many cases (assuming -> impl FnMut() -> CoResult<...> can be used) the compiler has all the information and freedom it needs to do any necessary optimization.
Though it is true that, e.g., in the following case:
|| {
    let another_coro = some_how();
    for e in another_coro {
        yield e;
    }
}
it might be nice to have some additional optimization. It should be possible to do this with a MIR pass that detects such yield-from patterns and transforms the code accordingly. Though yes, it might not necessarily be super easy to detect such patterns, so to give the compiler a hint, and for convenience, a yield_from!(coro) macro could be introduced, defined to be semantically equivalent to the while let loop from the RFC (not depending on the Iterator adapter). Note that this might need a bit of extra work wrt. cases that include some “wrapping” similar to the await! example.
E.g. (note: I rarely write macros, so there might be some errors in it):
macro_rules! yield_from {
    ($coro:expr) => {
        // must be mutable: calling an FnMut needs a mutable borrow
        let mut hoisted_coro = $coro;
        while let CoResult::Yield(e) = (hoisted_coro)() {
            yield e;
        }
    }
}
//... in a function or similar
|| {
    // normal use, compiler gets a hint for optimization
    let some_coro = get_it();
    yield_from!(some_coro);
}
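Since yield is not available on stable Rust, the delegation that yield_from! describes can only be emulated there. Below is a rough hand-expansion under that assumption, with closures standing in for coroutines: range_coro and chain_coros are hypothetical helpers, CoResult is a stand-in for the draft's enum, and the stage counter is a guess at the kind of resume point a compiler would track.

```rust
#[derive(Debug, PartialEq)]
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

// Closure standing in for an inner coroutine that yields `lo..hi`.
fn range_coro(lo: u32, hi: u32) -> impl FnMut() -> CoResult<u32, ()> {
    let mut next = lo;
    move || {
        if next < hi {
            next += 1;
            CoResult::Yield(next - 1)
        } else {
            CoResult::Return(())
        }
    }
}

// Hypothetical hand-expansion of:
//     || { yield_from!(first); yield_from!(second); }
// The `stage` counter is the resume point a compiler would generate.
fn chain_coros<T>(
    mut first: impl FnMut() -> CoResult<T, ()>,
    mut second: impl FnMut() -> CoResult<T, ()>,
) -> impl FnMut() -> CoResult<T, ()> {
    let mut stage = 0;
    move || loop {
        let step = match stage {
            0 => first(),
            1 => second(),
            _ => return CoResult::Return(()),
        };
        match step {
            // Forward the inner yield unchanged to our own caller.
            CoResult::Yield(e) => return CoResult::Yield(e),
            // Inner coroutine is done: move on to the next stage.
            CoResult::Return(()) => stage += 1,
        }
    }
}

fn main() {
    let mut coro = chain_coros(range_coro(0, 2), range_coro(10, 12));
    let mut seen = Vec::new();
    while let CoResult::Yield(e) = coro() {
        seen.push(e);
    }
    assert_eq!(seen, vec![0, 1, 10, 11]);
}
```

Each resume of the outer closure costs one extra call into the inner one, which is the overhead a dedicated yield-from optimization would try to remove for deep nesting.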
@vadimcn: I wonder about an implementation detail wrt. hoisted variables. If I’m not mistaken, a variable being hoisted can mean it lives on the heap. Now, when a coro is resumed, will the variable be used from there, moved back onto the stack, or some mixture of the two? Both seem to make sense (1) depending on the actual coro, and I’m not sure we can rely on LLVM to optimize this correctly.
(1): just another function call before the next yield -> stays where it is (heap?); more complex coro -> gets moved back onto the stack
No, just a typo. Invocations should use parens, of course.
@gabomgp, @naicode:
I think optimization of nested iterators should be done by the compiler. We might not even have to do anything special, because inlining passes should take care of tight inner loops over nested iterators.
The hoisted variables live in the coroutine environment struct, just like captured vars of regular closures do. The environment itself may end up on the heap, but coroutine code doesn't need to worry about that, it will always access them through the self pointer.
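As a rough picture of that layout, here is a hand-written environment struct for a simple counting coroutine. This is a sketch under assumptions: CounterEnv, resume and drain are hypothetical names, CoResult is a stand-in for the draft's enum, and the real compiler output would also need a state tag for bodies with more than one yield point.

```rust
#[derive(Debug, PartialEq)]
enum CoResult<Y, R> {
    Yield(Y),
    Return(R),
}

// Hypothetical environment for a coroutine like
//     || { for i in 0..limit { yield i; } }
// `limit` is a captured variable; `i` is a local hoisted out of the stack
// frame because it must survive across the yield.
struct CounterEnv {
    limit: u32, // captured
    i: u32,     // hoisted
}

impl CounterEnv {
    // Every access goes through `self`, wherever the struct happens to live.
    fn resume(&mut self) -> CoResult<u32, ()> {
        if self.i < self.limit {
            let current = self.i;
            self.i += 1;
            CoResult::Yield(current)
        } else {
            CoResult::Return(())
        }
    }
}

// Drives the environment until it returns, collecting the yielded values.
fn drain(env: &mut CounterEnv) -> Vec<u32> {
    let mut out = Vec::new();
    while let CoResult::Yield(x) = env.resume() {
        out.push(x);
    }
    out
}

fn main() {
    // Environment on the stack.
    let mut on_stack = CounterEnv { limit: 3, i: 0 };
    assert_eq!(drain(&mut on_stack), vec![0, 1, 2]);

    // Same environment moved to the heap; `resume` is unchanged.
    let mut on_heap = Box::new(CounterEnv { limit: 3, i: 0 });
    assert_eq!(drain(&mut on_heap), vec![0, 1, 2]);
}
```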