I've been thinking about this a lot recently. Here are my thoughts. Any feedback would be welcome.
EDIT: switched from "empty" to "user-defined" and updated with insights from the discussion so far.
Summary
We redefine a naked function as a function using the "user-defined" calling convention, which we specify below. A naked function can define an arbitrary contract with its callers, and thus it is always an unsafe fn. Such a contract may include requirements about the state of the CPU registers, stack, or memory when the naked function is called. The compiler offers no help in enforcing this contract. This design is chosen to be maximally flexible, allowing naked functions to be used in a wide range of contexts. Functions using the "user-defined" ABI cannot be called directly from normal Rust. Instead, they must be invoked through some other means, such as (inline) assembly, C, or hardware. Alternatively, a function pointer can be unsafely cast to the "C" ABI or some other ABI if it truly can be called with that convention.
Finally, we define how code is generated for a naked function.
Why this proposal?
Currently, it's not really clear what's allowed in a naked function, what code is generated, and what is UB. Basically, the RFC just says that no prologue or epilogue is emitted. Moreover, the current implementation doesn't really emit errors, even for things that are clearly wrong, like using the Rust ABI for a naked function, even though that ABI is unspecified.
Naked functions are a promising mechanism for writing low-level code without requiring the user to write a separate assembly file and link it. In particular, I would like naked functions to be a very general mechanism that can be used for implementing things like context switch routines, interrupt handlers, or other ABIs.
Specification
We can define a naked function by using the "user-defined" calling convention:
unsafe extern "user-defined" fn foo() -> ! { ... }
foo's body is allowed to contain arbitrary Rust code.
Restrictions
Violating any of the following results in a compile-time error:
- foo must be unsafe
- foo must not declare any formal parameters
- foo must not be const or inlined
- foo must not be marked with #[track_caller]
- foo must return either () or !
Some of these restrictions may be relaxed by future RFCs.
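To make the restrictions above concrete, here is a rough sketch of the check as a plain function over a simplified description of a declaration. This is not how rustc would implement it; FnDecl, ReturnType, Violation, and check_user_defined are all hypothetical names made up for illustration.

```rust
/// Hypothetical, simplified view of a function's return type.
#[derive(Debug, Clone, PartialEq)]
pub enum ReturnType {
    Unit,          // `()`
    Never,         // `!`
    Other(String), // anything else, e.g. "u32"
}

/// Hypothetical, simplified view of a function declaration.
#[derive(Debug, Clone)]
pub struct FnDecl {
    pub is_unsafe: bool,
    pub param_count: usize,
    pub is_const: bool,
    pub is_inline: bool,
    pub has_track_caller: bool,
    pub ret: ReturnType,
}

/// One variant per restriction listed in the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Violation {
    NotUnsafe,
    HasParameters,
    ConstOrInline,
    TrackCaller,
    BadReturnType,
}

/// Returns every restriction that `decl` violates, in the order the
/// proposal lists them. An empty vector means the declaration is accepted.
pub fn check_user_defined(decl: &FnDecl) -> Vec<Violation> {
    let mut violations = Vec::new();
    if !decl.is_unsafe {
        violations.push(Violation::NotUnsafe);
    }
    if decl.param_count != 0 {
        violations.push(Violation::HasParameters);
    }
    if decl.is_const || decl.is_inline {
        violations.push(Violation::ConstOrInline);
    }
    if decl.has_track_caller {
        violations.push(Violation::TrackCaller);
    }
    if !matches!(decl.ret, ReturnType::Unit | ReturnType::Never) {
        violations.push(Violation::BadReturnType);
    }
    violations
}
```

In a real compiler each violation would presumably become its own diagnostic; collecting them all rather than stopping at the first is just a choice made for this sketch.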
The "user-defined" calling convention
foo is an unsafe fn. That is, the caller of foo has an obligation to show that the preconditions of foo are satisfied and that calling foo will not violate any system invariants. Determining these preconditions and invariants is left entirely to the developer writing foo, and making sure they hold is left entirely to the caller of foo. This is done to make naked functions maximally flexible. For example, they can safely be used in interrupt handler contexts, where the stack may be in an unusual state, or when context switching, where the stack may be inaccessible altogether.
foo is not allowed to assume anything about the state of the machine at its entry point unless it is explicitly required by the contract with the caller. This includes assumptions about how foo was called, the contents of the registers, the contents of the stack, etc. Note that this means that foo may need to first set up a stack before it can use local variables or call other functions.
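As an illustration of the "set up a stack" step, here is a minimal sketch of how one might pick the initial stack pointer value for a freshly allocated stack buffer. It assumes a downward-growing stack and a 16-byte alignment requirement (as in the x86_64 System V ABI); STACK_ALIGN and initial_stack_pointer are hypothetical names, and actually loading the value into the stack pointer register would still need inline assembly, which is not shown.

```rust
/// Alignment assumed for the initial stack pointer (an assumption of this
/// sketch; 16 bytes matches the x86_64 System V ABI).
pub const STACK_ALIGN: usize = 16;

/// Computes the highest `STACK_ALIGN`-aligned address that lies inside (or
/// one past the end of) `stack`, suitable as the initial stack pointer for a
/// stack that grows downward.
///
/// Returns `None` if the buffer is too small to leave any usable space
/// below an aligned address.
pub fn initial_stack_pointer(stack: &mut [u8]) -> Option<*mut u8> {
    let base = stack.as_mut_ptr() as usize;
    let end = base.checked_add(stack.len())?;
    // Round the end address down to the required alignment.
    let top = end & !(STACK_ALIGN - 1);
    if top > base {
        // `top - base` is at most `stack.len()`, so the result stays within
        // the buffer or exactly one past its end.
        Some(stack.as_mut_ptr().wrapping_add(top - base))
    } else {
        None
    }
}
```

The pointer is only computed here, never dereferenced, so the function itself is safe. Whether the memory is actually usable as a stack (mapped, writable, not shared) is part of the contract the naked function's author has to spell out.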
In addition to the contract with the caller, foo is required to comply with the following:
- If foo returns (), it must ensure that returning is not UB. This is architecture-specific. For example, on x86_64, foo must ensure that executing the retq instruction is not UB by making sure that the stack pointer points at a valid return address when retq executes, since retq pops the return address from the top of the stack.
- If foo returns !, control flow must never return from foo. Returning constitutes UB.
Symbols and Scopes
The foo function defines a name-mangled symbol, just like any other normal function. foo can be used as a function pointer, just like any other function name.
The body of foo is a lexical scope, just like a normal Rust function, and can contain other symbols and refer to symbols in enclosing scopes.
Generated code
Calling a naked function
A naked function cannot be called from normal Rust (i.e. foo()) because the compiler does not know the calling convention and ABI. Thus, any attempt to invoke a function or function pointer with an extern "user-defined" ABI will result in a compile-time error.
A naked function can be invoked instead through (inline) assembly, C, or some other mechanism, such as a hardware interrupt vector. Alternatively, if the naked function's contract states that it can be invoked through some other ABI, an unsafe cast to a function pointer with that ABI can be done, and the cast pointer can be invoked directly.
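Here is a rough sketch of what the cast-then-call path could look like. Since "user-defined" is not an ABI that today's compiler accepts, the sketch stands in an opaque pointer wrapper for "a pointer to a user-defined function": it cannot be called directly, only unsafely reinterpreted as a "C" function pointer. UserDefinedFn, as_c_abi, handler, and CALLS are hypothetical names, and handler is just an ordinary extern "C" function playing the role of a naked function whose contract permits C-ABI calls.

```rust
use std::mem;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a pointer to an `extern "user-defined"` function.
/// It deliberately offers no way to call it directly.
#[derive(Clone, Copy)]
pub struct UserDefinedFn(*const ());

impl UserDefinedFn {
    /// Wraps a raw code address. Creating the wrapper is safe; using it is not.
    pub fn from_addr(addr: *const ()) -> Self {
        UserDefinedFn(addr)
    }

    /// Reinterprets the pointer as a C-ABI function taking no arguments.
    ///
    /// # Safety
    /// The address must be non-null and must point to code whose contract
    /// really allows it to be entered and left using the "C" ABI.
    pub unsafe fn as_c_abi(self) -> unsafe extern "C" fn() {
        unsafe { mem::transmute::<*const (), unsafe extern "C" fn()>(self.0) }
    }
}

/// Counts invocations so the effect of the call is observable.
pub static CALLS: AtomicUsize = AtomicUsize::new(0);

/// Plays the role of a naked function whose contract says
/// "may be called with the C ABI, no arguments".
pub unsafe extern "C" fn handler() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}
```

A caller would then write something like `let f = unsafe { UserDefinedFn::from_addr(handler as *const ()).as_c_abi() }; unsafe { f() };`. The two unsafe steps mirror the two obligations in the proposal: justifying the cast against the function's contract, and justifying the call itself.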
EDIT: this is the old text, for posterity...
foo can be entered from rust code using a normal function call (foo()), but the compiler will not assume that foo is always entered from rust code. In fact, calling foo from normal rust code may violate the contract and trigger UB. For example, an interrupt handler may assume that it will only ever be invoked by the hardware. Callers of foo are responsible for making sure that they uphold their end of the contract with foo.
When rust code does call a naked function, the compiler will emit code in the caller that follows the C ABI and architecture-specific calling conventions for calling a function. For example, on x86_64, it will save caller-saved registers and use the call foo instruction (except foo would be name-mangled), which will save the return address on the stack (there are no arguments, so none need to be passed).
If the naked function returns (), then code emitted in the caller after the function call will assume that foo obeys the C ABI and calling convention. foo is responsible for making sure it upholds any necessary invariants so that this is not UB.
Body of foo
The body of foo will generate code as follows:
- No function prologue or epilogue is generated whatsoever. Callee-saved registers are not saved; the implementor of foo must save them if needed.
- Inline assembly at the beginning or end of a naked function will be placed at the entry and exit of the code generated for the naked function, with no intervening instructions. This means that inline assembly can be used to do necessary setup and teardown for the user-defined calling convention.
- Local variables will be lazily allocated on the stack. That is, whenever a local is first declared, instructions are generated that make space on the stack for the local. Such instructions are not to be relocated around inline asm.
- Rust statements in the body of foo will generate the same code as they would in any other Rust function. It is the responsibility of foo to guarantee that these generated statements are not UB. For example, foo may call a normal Rust function in its body, but foo must ensure that there is a valid stack to do so.
- When control flow reaches the end of foo, a ret instruction is inserted if the return type of foo is (). If control flow reaches the end of foo and its return type is !, then UB occurs.