When will Rust have TCO/TCE? A proposal on the subject


#1

Moved this post from: https://users.rust-lang.org/t/when-will-rust-have-tco-tce/20790/5

I know Rust has had some RFCs about the subject. I know there is even a reserved keyword become for explicit tail call elimination. However, the implementation of TCO/TCE seems to be held back.

I have read some topics and I understood Rust is limited by LLVM’s tail call demanded calling conventions. Could not we change the design of TCO in Rust? Could not rustc inline functions and convert recursion to local jumps?

I have to confess reading “functional features” in descriptions about Rust while Rust not having proper recursion support makes me sad.

I know some of you will say iterators are enough, however, they do not fit well in mutual recursion (e.g. a parser) where manually transforming things into a loop ends in a ver-big-hard-to-read function.

A possible syntax for the design I am proposing:

letrec! {
    fn is_even(n: u128) -> bool {
        if n == 0 { true } else { is_odd(n - 1) }
    }

    fn is_odd(n: u128) -> bool {
        if n == 0 { false } else { is_even(n - 1) }
    }
}

The macro should be built-in, generated machine code should be equivalent to this (fictional goto statement for illustration purposes):


fn is_even(n: u128) -> bool {
    let mut is_even_arg0;
    let mut is_odd_arg0;
    is_even_arg0 = n;
    goto 'is_even;

    'is_even: {
        let n = is_even_arg0;
        if n == 0 { 
            return true 
        } else { 
            is_odd_arg0 = n - 1;
            goto 'is_odd; 
        }        
    }

    'is_odd: {
        let n = is_odd_arg0;
        if n == 0 { 
            return false 
        } else { 
            is_even_arg0 = n - 1;
            goto 'is_even; 
        }        
    }
}


fn is_odd(n: u128) -> bool {
    let mut is_odd_arg0;
    let mut is_even_arg0;
    is_odd_arg0 = n;
    goto 'is_odd;

    'is_odd: {
        let n = is_odd_arg0;
        if n == 0 { 
            return false 
        } else { 
            is_even_arg0 = n - 1;
            goto 'is_even; 
        }        
    }

    'is_even: {
        let n = is_even_arg0;
        if n == 0 { 
            return true 
        } else { 
            is_odd_arg0 = n - 1;
            goto 'is_odd; 
        }        
    }
}

I would offer myself to write a compiler plugin implementing this macro to demonstrate what I mean. We do not have access to goto/jmp in plain Rust, but it is possible to emulate it for demonstration purposes.


#2

I would be interested in seeing this implemented as a procedural macro to experiment with.


#3

Do we have procedural macro in plain_macro! style yet?

I think we don’t. However, we do have compiler plugins which have the option of defining built-in macros.


#4

Yes we have function-like procedural macros. They are stable to define in 1.29 and stable to use in 1.30.


#5

Is there any tutorial or example?


#6

https://github.com/dtolnay/syn/tree/master/examples/lazy-static


#7

Thanks! I am already trying to write one. I found a limitation related to generics. However, I think rustc would not have this limitation. The limitation is related to the fact proc-macros have only access to syntax.


#8

This is useful to know because if type information is required for this functionality, then the feature is going to involve language changes beyond just a macro in the compiler. Rust macros always only have access to syntax and not type information, equally true whether the macro is built into the compiler or provided by a proc macro library.

That said, my experience has been that in the majority of cases where a macro author thinks that their macro needs type information, it is possible to implement the same macro without type information just with creative use of existing language features.


#9

I can think a way of monomorphizing functions manually. However, it has a bit more limited inference because of obvious reasons. It is a little bit laborious to implement too.

Maybe there is a creative workaround, as you suggest =P


#10

If I’m not mistaken, having just though about it, the problem would be in invoking the same generic function with multiple types, correct? A completely flat in-lining of the functions would definitely suffer from needing to monomorphize.

I think a trampolining solution could be created though. I’m playing with that currently. (EDIT: no, I’m pretty sure it falls prey to needing monomorphization as well.)


#11

Whatever macro solution is come up with, it’s going to be shortchanged for not having type information. It’s probably possible to work around parts of it, but I think it needs type information to be complete.

For the generic/monomorphization problem, consider:

tco! {
    pub fn foo() -> Box<dyn Display> {
        if rand() {
            spam("hi")
        } else {
            spam("hi".to_string())
        }
    }

    fn spam(d: impl Display) -> Box<dyn Display> {
        Box::new(d)
    }
}

Never mind that this is a stupid use of TCO, it’s just plain old inlining, this would require any trampoline setup to have multiple versions of the generic fn for every type it’s called with.

Using a trampoline:

enum NextCall {
    foo(,),
    spam(impl Display,),
    __DONE(Box<dyn Display>)
}

pub fn foo() -> Box<dyn Display> {
    trampoline(NextCall::foo())
}

fn trampoline(mut next: NextCall) -> Box<dyn Display> {
    fn foo() -> NextCall {
        if rand() {
            NextCall::spam("hi")
        } else {
            NextCall::spam("hi".to_string())
        }
    }
    fn spam(d: impl Display) -> NextCall {
        NextCall::__DONE(Box::new(d))
    }
    loop {
        next = match next {
            NextCall::foo() => foo(),
            NextCall::spam(d) => spam(d),
            NextCall::__DONE(ret) => return ret,
        }
    }
}

The problem is that our unfolded state needs to be able to hold everything that the call might be called with. This is easily done via monomorphization, but this is impossible to do purely via syntax.

Relatedly, any syntax-based macro solution is going to mess up in some cases around shadowing and finding the tco’d fns via paths other than just naming them. A type-aware plugin could use resolve to actually do the right thing in all cases.


#12

I have actually built a trampoline once, but it never occurred to me using procedural macros to achieve that. Indeed, it is a much better idea than using an expensive allocation of thunks on every return.

Here is the trampoline I’ve made: https://docs.rs/tramp/0.3.0/tramp/

Example:

#[macro_use] extern crate tramp;

use tramp::{tramp, Rec};

fn factorial(n: u128) -> u128 {
    fn fac_with_acc(n: u128, acc: u128) -> Rec<u128> {
        if n > 1 {
            rec_call!(fac_with_acc(n - 1, acc * n))
        } else {
            rec_ret!(acc)
        }
    }

    tramp(fac_with_acc(n, 1))
}

assert_eq!(factorial(5), 120);

Although the example is a simple recursion, it supports mutual tail call recursion. In fact, it supports even non-recursive tail call elimination.

The problem with this implementation is that the thunk https://docs.rs/tramp/0.3.0/tramp/struct.Thunk.html makes an allocation at each recursive tail call.


#13

I’ve got a version working!

tco! {
    pub(crate) fn factorial(n: u128) -> u128 {
        fac(n, 1)
    }

    fn fac(n: u128, acc: u128) -> u128 {
        if n > 1 {
            fac(n - 1, acc * n)
        } else {
            acc
        }
    }
}

fn main() {
    for i in 0..35 {
        println!("{}! = {}", i, factorial(i));
    }
}

Unfortunately this example doesn’t really blow the stack much so I can’t prove it’s using a trampoline, but I plan on adding a few more examples to make sure it works. A mutually recursive is_even/is_odd definition should blow the stack without TCO pretty well, shouldn’t it?

The above example’s tco!{} expands into this code on the playground. (EDIT: this link is incorrect, will fix later)


#14

TCO is hard(er) in Rust because what looks like a tail call isn’t necessarily actually a tail call due to Drop deferring code to the end of scope.

Interestingly, implementing it via a trampoline like this conveniently means that Drop just works.

I’ve still no idea how to get ? working like you’d expect it to (or honestly anything other than a tail call return) simply, but I know those are possible, if ugly, as purely syntactical transformations. It’s generics that I’m unsure whether we’d be able to handle at all.

Obviously as a building block you’d want to support generics, but I’m curious how much of a restriction it’d be in practice. In this implementation you’re already quite limited, in that TCO is only (currently) usable within a singular exposed fn (the first and only the first is required to be non-private, the rest are made “private” to the tco!{} block) and its helpers. (I do want to make it possible to expose other entry points as well though; it was just easiest to only expose the first for an initial draft.)


#15

I believe that an implementation at the language level would execute Drop before calling the function. This is the importance of having an explicit become keyword to invoke tail call optimization. Using become implies calling every Drop before, which means you the borrow checker should prevent you from passing, say, a temporary local reference to the callee.

Btw, nice implementation!


#16

I fixed the playground link; here’s what it expands to:

pub(crate) fn factorial(n: u128) -> u128 {
    __trampoline(__NextCall::factorial(n))
}
enum __NextCall {
    factorial(u128),
    fac(u128, u128),
    __DONE(u128),
}
fn __trampoline(mut next: __NextCall) -> u128 {
    pub(crate) fn factorial(n: u128) -> __NextCall {
        __NextCall::fac(n, 1)
    }
    fn fac(n: u128, acc: u128) -> __NextCall {
        if n > 1 {
            __NextCall::fac(n - 1, acc * n)
        } else {
            __NextCall::__DONE(acc)
        }
    }
    loop {
        next = match next {
            __NextCall::factorial(n) => factorial(n),
            __NextCall::fac(n, acc) => fac(n, acc),
            __NextCall::__DONE(res) => return res,
        }
    }
}

fn main() {
    for i in 0..35 {
        println!("{}! = {}", i, factorial(i));
    }
}

But yeah, given the problems with generics, I think a language become would be the best way of handling TCO, and become would indeed happen after Dropping anything with drop glue. The trampoline macro is interesting, but I think the language-level solution is just more robust.

(If someone wants me to publish my taillcallopt crate, help me clean out the issues and I’ll do so.)