Clang attribute [[musttail]] in Rust?

Anyway to use the [[musttail]] attributes in Rust (including the nightly compiler)?

Document: Attributes in Clang — Clang 16.0.0git documentation

This attributes ensures the function call at function return is tail call optimized. So the stack size is O(1) instead of O(n), and there are many performance improvement opportunities with this feature

Related blog:

I do not find this in Rustc unstable book

How to implement:

  1. LLVM IR "Call" instruction already support "musttail"
  2. Add a new attribute "musttail", which can be applied to function call (or function declaration).
  3. The HIR and MIR needs to verify this and keep track of this attribute
1 Like

This has been discussed for at least 8 years, but there's nothing right now.

become is reserved for it, though.

It's a bit frustrating that this has been postponed multiple times, the last one in 2018. We're in 2022, can this discussion finally be restarted?

I would like to at least have the compiler provide the tools to iterate it on the crates ecosystem, even if it works only on nightly; or provide a nightly-only attribute, or something, and see where it goes (multiple people wrote compiler patches for tail calls to work, but they are stale now; it would be much better to have a #[tail_call] nightly-only macro)

And for a nightly-only experiment, it would be okay if it worked only on supported architectures. #[cfg_attr(not(target_arch = 'wasm32'), tail_call)] exists

Here's the (recently updated) process for adding things: Propose a change to the language - The Rust Language Design Team

To have a chance to stabilize, it would need a plan to stabilize on all supported architectures. IIRC last it came up there was no native tail call support on WASM, for example. So, in general, it might not be possible to enable arbitrary tail calls.

The question, then, is what kind of more-restricted construct could meet your needs? Is intra-crate calls enough? rustc could, hypothetically, always rewrite such calls into a loop itself.

Or maybe it doesn't need even that much. The most feasible version might be an inside-a-function custom control flow construct, like described in [Pre-RFC] Safe goto with value - #9 by scottmcm. Then, instead of become dispatch(…) it might be continue 'dispatch, locally in the function so there isn't even tail call optimization needed since it'd just be one big CFG.

4 Likes

The [[musttail]] attribute is introduced in Clang in 2021. At the time of the initial discussion on 2014, Clang does not have it. I may propose a RPC when I have time. I think it's probably the time to have it in Rust.

Some ideas:

  1. I do not like the become keywords in the 8 year RFC PR. Because The tail call requirement should not change the behavior of the Rust program.
  2. Since this is not supported by all architecture, I think it is more appropriate to introduce it under the target_feature attribute.
#[target_feature(enable = "musttail")]
return f();

The drawback is that this is the first target feature which is shared by multiple targets.

  1. If this target does not support musttail, a compiler error is issued. This can be done by adding a "musttail" field in the target spec json, produced by rustc +nightly -Z unstable-options --target=wasm32-unknown-unknown --print target-spec-json

Restrictions:

  1. Only targets supported by Clang will be supported in Rust. Compile error otherwise.
  2. The musttail attr can only be applied to a return.
  3. The musttail attr can only be applied to a call expression.
  4. The returned function cannot be a lambda function (Not supported in Clang either) This can be relaxed to support stateless lambda function.
  5. No destructor (Types explicitly implement the drop trait) call is allowed on tail call return. They must be out of lifetime, or be explicitly manually dropped earlier.
  6. Two functions must have the same ABI
  7. The call arguments must not contain any reference to the local variables of the parent function.

Other questions:

  1. Can function pointer be supported (This is supported by clang)?
  2. Should the function under return have exactly same return value as the parent function. Should Type coercions be allowed?

Existing proposals:

  1. become keyword
  2. C++ 23 proposal return goto Note that this proposal strongly against an attribute instead of a keyword

Community feedback suggested that an attribute on a regular-syntax return statement might be preferable for linguistic reasons. However, this is impossible: a standard attribute can be ignored without changing the meaning of a correct program, while changing a return goto statement to a return statement (or vice-versa) significantly changes the meaning of both the statement itself, and the calling function’s scope in terms of object lifetimes

2 Likes

The drawback is that this is the first target feature which is shared by multiple targets.

Aren't the x86 features also shared between multiple targets (the x86 and the x86_64 targets specifically)?

That second thing is why become exists. It can drop all the locals (not used in the call) before the call instead of after, to make it way less of a pain to use tail calls.

7 Likes

There is an active proposal to add a return_call instruction to WASM. It's already implemented by V8/Chrome but feature gated. The only blocker seems to be that no second engine implements the proposal yet.

Firefox is working on an implementation, so it's probably only a matter of time until native tail calls are standardized in WASM.

2 Likes

Not all backends support tail calls either. For example Cranelift doesn't.

1 Like

Why would that matter? The tail calls should be handled in the compiler frontend. If the tail call pattern is too complex to be optimized away, the compiler should error out.

Why "should"? Notably that's not what clang's support does.

But also this is why above I was talking about a more limited construct, because if the frontend has to do it, that can only ever work in restricted scenarios (at least if separate compilation is going to be a thing).

So the question, then, is in what scenarios people actually need guaranteed tail calls, rather than just as an optimization. Some of them are easy, like tail recursion the frontend could certainly do. But do cross-crate calls ever need guaranteed tail calls, for example?

Precisely because it is unreasonable to expect that every target and every backend has proper support for tail calls. It's also looks complex enough that it's more reasonable to implement once in the frontend, rather than in every possible backend.

It's a damn important optimization, if you happen to rely on it. That's a sufficient reason for me. But I admit, I didn't think about cross-crate tail calls. The most important optimization for me is tail recursion, at most mutual tail recursion between several functions in a module.

One useful case that sits in between direct (mutual) tail recursion and cross-crate tail calls: indirect tail calls, as you might use in e.g. an interpreter.

The functions may all be defined locally but they have their address taken, and they are chosen dynamically. The function pointers may even be stored in a data structure.

1 Like

Of course. But I bet I'm never actually relying on it for, say, random things like become Regex::new(…). So it's totally ok for that kind of thing to not be supported.

The question is just where and how to draw the line.