Please, take a look at RecursiveFuture in Playground, it is possible to have Zero-Cost Abstraction even with recursive calls:
No it's not. You cannot use alloca to extend the memory for your recursive future. Any memory you allocate will get dropped from the stack once the call to RecursiveFuture::create_self returns.
You can't just make up functions that can't be implemented. It's probably possible to implement a function that rewrites its own calling stack to make more space in the stack space above, but it would need serious compiler support. I don't know any languages that implement anything like that.
What the alloca()-like function does or doesn't do isn't the issue. The issue @canndrew's correctly pointing out is that returning from any function changes the stack address. So when create_self() returns, any stack memory it allocated is going to get deallocated. That's just how stacks work.
This is essentially just another way of saying that recursive types fundamentally require some kind of heap allocation, whether or not async/await is involved.
If we're bikeshedding the attribute name, then I suggest avoiding recursive, because there are other uses for boxed futures, e.g. to reduce size of Future objects. When your program is fully async, async fn main()'s Future can get huge, because it ends up containing all of program's state for all code paths.
The compiler, or a compiler plugin could do that. A simple proc-macro can't, or at least can't do it 100% reliably, since the recursive call could be hidden inside a macro or something.
I agree with you, but it is not an issue either !!
In alloca_like we could just copy current Stack Frame after our allocated memory in that way after create_self function finishes memory would not be deallocated
It's a minor optimization. It keeps the top-level recursive unboxed, so the first non-recursive call wouldn't immediately cause allocation. It moves cost from calling recursive functions in general to recursing calls themselves.
The alloca-like approach cannot work, even if the conflicts around the stack frame are resolved, because futures' contents are never allocated from the stack to begin with! They are allocated in some impl Future object, which is usually stored on the heap, or occasionally stored as a local within the stack frame of something like block_on.
Their whole allocation strategy is designed around computing the maximum "stack" size up front, as the size of that impl Future object. You can't alloca from a struct or enum- they don't have any built-in room to expand the way the CPU's call stack does. This is why recursive async functions error- they're creating a recursive type internally.
The closest thing to alloca-ing from an impl Future object is... boxing the new field! You could get a little bit closer to alloca by reserving some extra space in the top-level task and using it for a special-purpose allocator within that task, moving things closer to green threads, but the key here is that you cannot use stack memory for impl Future storage, because the stack completely goes away every time you suspend.
You keep suggesting similar things, without realizing that recursion in state machines creates infinitely sized types. It's not just about allocation, it's a logical impossibility of describing it as a type without indirection.
Surprised there's been no mention of tail call optimization yet? With TCO tail calls can be done without extra allocation, it's one of the places where TCO would be really nice to have.