Recursion for async/.await

Please, take a look at RecursiveFuture in Playground, it is possible to have Zero-Cost Abstraction even with recursive calls:

No it's not. You cannot use alloca to extend the memory for your recursive future. Any memory you allocate will get dropped from the stack once the call to RecursiveFuture::create_self returns.

2 Likes

That is why I have called it alloca-like !!

We could implement alloca_like function and dealloca_like that will allocate and deallocate by increamenting/decrementing Stack Address

unsafe fn alloca_like(size: usize, layout: Layout) -> *mut u8;
unsafe fn dealloca_like(ptr: *mut u8, layout: Layout) -> bool; // Success or Failure (called from wrong Stack Frame)

You can't just make up functions that can't be implemented. It's probably possible to implement a function that rewrites its own calling stack to make more space in the stack space above, but it would need serious compiler support. I don't know any languages that implement anything like that.

2 Likes

What the alloca()-like function does or doesn't do isn't the issue. The issue @canndrew's correctly pointing out is that returning from any function changes the stack address. So when create_self() returns, any stack memory it allocated is going to get deallocated. That's just how stacks work.

This is essentially just another way of saying that recursive types fundamentally require some kind of heap allocation, whether or not async/await is involved.

3 Likes

We could implement alloca_like function and dealloca_like that will allocate and deallocate by increamenting/decrementing Stack Address

And once RecursiveFuture::create_self returns the stack address will go back to where it was and the memory is gone again.

Or if you mean to inline create_self so that poll gets the stack space then the stack memory will be gone once poll returns.

you could desugar boxing of async fn recursive() { recursive().await } to

async fn recursive() {
   recursive_boxed().await
}

fn recursive_boxed() -> Box<dyn Future> {
   Box::pin(recursive().await)
}

If we're bikeshedding the attribute name, then I suggest avoiding recursive, because there are other uses for boxed futures, e.g. to reduce size of Future objects. When your program is fully async, async fn main()'s Future can get huge, because it ends up containing all of program's state for all code paths.

Is there a reason why this doesn't compile? Is it a fundamental limitation, or just rustc not understanding that Box::pin fixes the problem here?

async fn recursive() {
    Box::pin(recursive()).await
}

The compiler, or a compiler plugin could do that. A simple proc-macro can't, or at least can't do it 100% reliably, since the recursive call could be hidden inside a macro or something.

I agree with you, but it is not an issue either !!

In alloca_like we could just copy current Stack Frame after our allocated memory in that way after create_self function finishes memory would not be deallocated :wink:

What is the benefits of it ?

I guess that may be good reason to make it a compiler built-in/lang item after all.

In this case we would have in compiled .rlib two different functions !!

I do not think it is a good idea ...

I'm not sure, I think that this should work. It's surprising that it doesn't.

It's a minor optimization. It keeps the top-level recursive unboxed, so the first non-recursive call wouldn't immediately cause allocation. It moves cost from calling recursive functions in general to recursing calls themselves.

But anyway it will 'cause allocation on the heap !!

What if someone what to compile this code on embedded system without heap-allocator ?

I think RecursiveFuture with alloca-like function is better ... but maybe because I propose it :))))))

The alloca-like approach cannot work, even if the conflicts around the stack frame are resolved, because futures' contents are never allocated from the stack to begin with! They are allocated in some impl Future object, which is usually stored on the heap, or occasionally stored as a local within the stack frame of something like block_on.

Their whole allocation strategy is designed around computing the maximum "stack" size up front, as the size of that impl Future object. You can't alloca from a struct or enum- they don't have any built-in room to expand the way the CPU's call stack does. This is why recursive async functions error- they're creating a recursive type internally.

The closest thing to alloca-ing from an impl Future object is... boxing the new field! You could get a little bit closer to alloca by reserving some extra space in the top-level task and using it for a special-purpose allocator within that task, moving things closer to green threads, but the key here is that you cannot use stack memory for impl Future storage, because the stack completely goes away every time you suspend.

10 Likes

Yeah, seems like you are right !! I've forget about that allocation is responsibility of scheduler ...

Then the best solution is just provide additional trait like RecursiveFuture:

async fn recursive() {
    // Do some stuff
    recursive.await;
    // Do some stuff
}

will be compiled to:

fn recursive() -> impl RecursiveFuture {
    // Do some stuff
    recursive.await;
    // Do some stuff
}

See Rust Playground

And the scheduler will decide how allocate the data structure coroutine ... ?

You keep suggesting similar things, without realizing that recursion in state machines creates infinitely sized types. It's not just about allocation, it's a logical impossibility of describing it as a type without indirection.

8 Likes

Surprised there's been no mention of tail call optimization yet? With TCO tail calls can be done without extra allocation, it's one of the places where TCO would be really nice to have.

4 Likes