Summary
This RFC proposes a new codegen option, -C split-stacks, that causes the compiler
to emit the function prologue required to support (potentially) discontiguous stacks that
can be grown at will at runtime.
This proposal also includes a new function attribute, #[no_split_stack], which is a
per-function opt-out of the split-stack function prologue.
Motivation
Split stacks, also called “segmented stacks” or “linked stacks”, are a technique for
growing a thread’s stack while the thread is running. Rather than reserving a large amount
of memory up-front for a thread’s stack (and crashing if that limit is ever exceeded), one can
reserve a small amount of memory and expand it as needed.
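To make the growth strategy concrete, here is a toy model of it in Rust. It is only an illustration: the `SplitStack` type and its `enter` and `leave` methods are hypothetical names invented for this sketch, and it tracks byte counts rather than real stack memory. `enter` plays the role of the function prologue (check for room, link a new segment if there is none) and `leave` plays the role of function return.

```rust
/// Toy model of a split stack: a chain of fixed-size segments.
/// Hypothetical type for illustration; it tracks byte counts, not real memory.
struct SplitStack {
    segment_size: usize,
    /// Bytes in use per live segment; the last entry is the current segment.
    used: Vec<usize>,
}

impl SplitStack {
    fn new(segment_size: usize) -> Self {
        SplitStack { segment_size, used: vec![0] }
    }

    /// Models the prologue: reserve room for a frame, linking a new
    /// segment when the current one is too full. Returns the segment index.
    fn enter(&mut self, frame_size: usize) -> usize {
        assert!(frame_size <= self.segment_size, "frame larger than a segment");
        let last = self.used.len() - 1;
        if self.used[last] + frame_size > self.segment_size {
            // Not enough room: start a fresh segment holding this frame.
            self.used.push(frame_size);
        } else {
            self.used[last] += frame_size;
        }
        self.used.len() - 1
    }

    /// Models function return: release the frame and drop the segment
    /// once it is empty (the first segment is always kept).
    fn leave(&mut self, frame_size: usize) {
        let last = self.used.len() - 1;
        self.used[last] -= frame_size;
        if self.used[last] == 0 && self.used.len() > 1 {
            self.used.pop();
        }
    }

    fn segments(&self) -> usize {
        self.used.len()
    }
}

fn main() {
    let mut stack = SplitStack::new(64);
    // Three nested "calls" with 32-byte frames: the third does not fit
    // in the first 64-byte segment, so a second segment is linked.
    stack.enter(32);
    stack.enter(32);
    stack.enter(32);
    println!("segments while nested: {}", stack.segments());
    stack.leave(32);
    stack.leave(32);
    stack.leave(32);
    println!("segments after returning: {}", stack.segments());
}
```

A real implementation performs the check in a few machine instructions at function entry and switches the stack pointer to the new segment; the model only captures the bookkeeping.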
Split stacks were useful enough that even Rust used them for a time: the old pre-1.0
“libgreen” runtime relied on them to implement green threads atop libuv. Rust
was hardly the first to use split stacks: the Go programming language made
heavy use of them for goroutines until its developers switched to contiguous stacks
(although contiguous stacks still require the compiler support detailed in this RFC) [1].
Microsoft’s Midori project also made use of split stacks to great effect [2].
While Rust has departed from the “libgreen” days, in both philosophy and implementation as
a systems programming language, Rust still stands as a potential C++ replacement as a
language for runtime implementations. Even if Rust does not use split stacks for its own runtime,
it is still useful for rustc to be able to emit split-stack code, for interoperating with (or
even implementing) a runtime that uses them. Both C and C++ can already use split
stacks via the clang/gcc option -fsplit-stack.
Detailed Design
LLVM provides the compiler support required for split stacks [3], while libgcc
provides the runtime support [4]. Users could also supply the runtime support themselves, by
implementing the API described in [4].
For us, as users of LLVM, the work consists of:
- Adding a new codegen option, -C split-stacks, that, when enabled, adds the “split-stack” function attribute on all functions (except ones marked by the below attribute). This will ensure that LLVM emits the proper function prologue for split stack functions.
- Adding a new attribute, #[no_split_stack], that, when adorning a function, will prevent the function from having the split stack prologue. When writing a split-stack runtime, this attribute would be used on all code paths that involve allocating new stack segments.
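As a rough sketch of how the two pieces would be used together, consider the fragment below. Neither the flag nor the attribute exists in current compilers, so the attribute appears only in a comment to keep the example compilable today. The function names `allocate_segment` and `sum_to` are hypothetical and stand in for a runtime's segment allocator and for ordinary user code.

```rust
// Assumed build command under this proposal (not an existing rustc flag):
//     rustc -C split-stacks runtime.rs

/// Hypothetical runtime hook that hands out memory for a new stack segment.
/// Under this proposal it would be marked `#[no_split_stack]`: if the code
/// that grows the stack itself ran the stack-growth check, it could recurse
/// into the allocator without end.
// #[no_split_stack]   // proposed attribute; commented out so this compiles today
fn allocate_segment(len: usize) -> Vec<u8> {
    vec![0u8; len]
}

/// Ordinary recursive function. With `-C split-stacks` enabled, it would
/// receive the split-stack prologue like every other unmarked function.
fn sum_to(n: u64) -> u64 {
    if n == 0 { 0 } else { n + sum_to(n - 1) }
}

fn main() {
    let segment = allocate_segment(4096);
    println!("segment bytes: {}, sum: {}", segment.len(), sum_to(100));
}
```

The point of the opt-out is placement rather than performance: every function reachable while a new segment is being allocated has to run on whatever stack space is already available.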
It’s worth noting that rustc has emitted split-stack code in the past and it is known
to work, so this work is not entirely unprecedented.
How We Teach This
Split stacks (and the general problem of stack switching) are difficult to teach and
understand, even for those with a background in computer science. A few blog posts about
split stacks with images would go a long way in illustrating what a split-stack runtime
actually does. We can also draw on existing documentation for
split-stack runtimes (mostly Go) to explain the mechanism.
Drawbacks
Like -C panic=abort, -C split-stacks requires that all crates linked to the crate being
compiled also be compiled with -C split-stacks in order to be most effective. Tools like
xargo [5] make this practical.
LLVM does not support split stacks for every target. We can check whether a target
supports them using the target specification JSON [6].
Unresolved Questions
libgcc allocates memory itself for stack segments, which may be inappropriate for no_std
scenarios. no_std users of -C split-stacks could certainly provide their own
implementation of __morestack and friends. What should we do for no_std?