This is a pre-RFC for explicit proper tail calls via become
.
My motivation for this is that during work on a Scheme VM in Rust, I learned that tail calls are the best way to implement jump tables, but Rust doesn’t guarantee proper tail calls, so this would crash in debug mode.
Introduction
Many languages support proper tail calls, which means that a call that does nothing after the call and returns the return value of the callee consumes no stack space. These are mostly functional languages, such as Scheme, ML variants, and Haskell. Other languages that support proper tail calls include ECMAscript 6 (though not supported by most implementations) and Lua. Perl supports an explicit tail call (via the syntax goto &sub
).
This proposal is like Perl, in that tail calls are explicit.
Motivation
Proper tail calls allow for iteration to be expressed as tail-recursive loops. They also eliminate the stack space penalty of functions that simply call another function with slightly different arguments. Finally, they allow jump tables to be expressed as arrays of functions, which can speed up interpreter loops.
Syntax
A proper tail call is written using the become
keyword. This keyword must precede a function or method call or an overloaded operator invocation.
Examples
/// Infinite loop
pub fn infinite_loop() -> ! {
become infinite_loop()
}
/// Iterate over a slice recursively
pub fn recursive_slice_iteration(x: &[u8]) {
do_something(x[0]);
become(recursive_slice_iteration(x[1..]))
}
Semantics
A become
statement is typed like a return
statement. That is, it is considered to diverge, and the argument of become
must have the same type as that of the function calling become
.
The execution and lifetime semantics of a become
statement are as if the following occurred:
- The lifetimes of all objects that are not being passed by value to the callee end, as if their variables went out of scope. Destructors are called for objects that implement
Drop
. Note: this is an observable change in behavior (other than reduction in stack usage) if the destructor has observable side-effects. - All objects that are being passed to the callee by value are moved into temporary variables, even if they implement
Copy
. Their original variables go out of scope. - Execution is transferred to the callee, with the temporary variables as arguments.
This implies that:
- it is forbidden for a tail-called function to be have a reference to the immediate caller’s stack frame (lifetime error).
- It is permitted to pass arguments by value to a tail-called function, even if they are local variables and even if they have destructors.
There are additional restrictions on tail-called functions. These are not necessary for soundness, but are a consequence of the implementation details of Rust.
- A tail-called function must not be
extern
(except forextern "rust"
orextern "rust-intrinsic"
), since tail-calls are incompatible with the standard C calling convention. - If tail-called function is in a crate to which the calling crate will be dynamically linked, the call must not be made through the PLT, due to LLVM limitations. I suspect this is because the linker-generated trampolines will mishandle the code. Therefore, the compiler will generate code as if the GCC option
-fno-plt
was given (which is a good idea anyway). This may require “mangling” the assembler generated by LLVM to rewrite@function
to@object
(it did for GHC). This is a robust transformation and is used in GHC’s production-quality LLVM backend. - Tail calls are not supported on MIPS and System Z (at least per the LLVM docs). This is a bug and will be fixed by patches to LLVM. Tail calls are also not supported on GPU backends; this may be harder to fix (I have no idea).
Implementation
Tail calls map directly to the corresponding support in LLVM. Rust uses the fastcc
calling convention, which supports tail calls. If Rust changes calling conventions, tail calls would need to be implemented for the new calling convention.
An optional part of this includes switching to -fno-plt
code generation. I prefer this solution, since it is faster and avoids PLT-related security bugs, but it is not mandatory except for actual tail calls.
Alternative approaches
It would be possible to forbid any local variables that implement Drop
from being in scope (except those passed to the callee by value) when a tail-call is made. This would avoid a silent semantics change when become
is replaced by return
(or visa versa), but would likely require a large number of manual drop
calls.