I've been thinking about this a lot recently. Here are my thoughts. Any feedback would be welcome.
EDIT: switched from "empty" to "user-defined" and updated with insights from the discussion so far.
Summary
We redefine a naked function as a function using the "user-defined" calling convention, which we specify below. A naked function can define an arbitrary contract with its callers, and thus it is always an unsafe fn. Such a contract may include requirements about the state of the CPU registers, stack, or memory when the naked function is called. The compiler offers no help in enforcing this contract. This design is chosen to be maximally flexible, allowing naked functions to be used in many different contexts. Functions using the "user-defined" ABI cannot be called directly from normal Rust. Instead, they must be invoked through some other means, such as (inline) assembly, C, or hardware. Alternatively, a function pointer can be unsafely cast to the "C" ABI or some other ABI if it truly can be called with that convention.
Finally, we define how code is generated for a naked function.
Why this proposal?
Currently, it's not clear what's allowed in a naked function, what code is generated, and what is UB. The RFC says only that no prologue or epilogue is emitted. Moreover, the current implementation doesn't really emit errors, even for things that are clearly wrong, like using the Rust ABI for a naked function, even though that ABI is unspecified.
Naked functions are a promising mechanism for writing low-level code without requiring the user to write a separate assembly file and link it. In particular, I would like naked functions to be a very general mechanism that can be used for implementing things like context switch routines, interrupt handlers, or other ABIs.
Specification
We can define a naked function by using the "user-defined" calling convention:
unsafe extern "user-defined" fn foo() -> ! { ... }
foo's body is allowed to contain arbitrary Rust code.
Restrictions
Violating any of the following results in a compile-time error:
- foo must be unsafe
- foo must not declare any formal parameters
- foo must not be const or inlined
- foo must not be marked with #[track_caller]
- foo must return either () or !
Some of these restrictions may be relaxed by future RFCs.
The "user-defined" calling convention
foo is an unsafe fn. That is, the caller of foo has an obligation to show that the preconditions of foo are satisfied and that calling foo will not violate any system invariants. Determining these preconditions and invariants is left entirely to the developer writing foo, and making sure they hold is left entirely to the caller of foo. This is done to make naked functions maximally flexible. For example, they can safely be used in interrupt handler contexts, where the stack may be in an unusual state, or when context switching, where the stack may be inaccessible altogether.
foo is not allowed to assume anything about the state of the machine at its entry point unless it is explicitly required by the contract with the caller. This includes assumptions about how foo was called, the contents of the registers, the contents of the stack, etc. Note that this means that foo may need to first set up a stack before it can use local variables or call other functions.
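The split of responsibility described here is the same one that ordinary unsafe fn already uses: the author states the preconditions, and the caller discharges them. Because the "user-defined" ABI is only proposed and does not compile today, the sketch below uses a plain unsafe fn as a stand-in. The names read_slot and read_from_ref are hypothetical and chosen only for illustration.

```rust
/// Hypothetical stand-in for a function with a caller-enforced contract.
///
/// # Safety
/// `slot` must be non-null, properly aligned, and point to an initialized
/// `u32` that is valid for reads for the duration of the call.
unsafe fn read_slot(slot: *const u32) -> u32 {
    // The callee relies only on what the contract states, nothing more.
    unsafe { slot.read() }
}

/// A safe wrapper that discharges the contract on behalf of its callers.
fn read_from_ref(value: &u32) -> u32 {
    // SAFETY: a shared reference is always non-null, aligned, and points to
    // an initialized value that stays valid while `value` is borrowed.
    unsafe { read_slot(value as *const u32) }
}
```

A naked function's contract would typically talk about registers and the stack rather than pointer validity, but the obligation falls on the caller in the same way.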
In addition to the contract with the caller, foo is required to comply with the following:
- If foo returns (), it must ensure that returning is not UB. This is architecture-specific. For example, on x86_64, foo must ensure that executing the retq instruction is not UB by making sure that the value at the top of the stack, *(stack pointer), is a valid return address at the point where retq executes.
- If foo returns !, control flow must never return from foo. Returning constitutes UB.
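For comparison, this is what the ! return type means in ordinary Rust, where the compiler checks that no path returns. In a naked function under this proposal, the same guarantee would have to be upheld by the author instead. The function names below are hypothetical, and the sketch assumes the default panic strategy (unwind).

```rust
use std::panic;

/// Hypothetical example of an ordinary diverging function. The compiler
/// rejects any body in which control could reach the end of the function.
fn never_returns() -> ! {
    panic!("diverging on purpose");
}

/// Reports whether `never_returns` left by unwinding instead of returning.
fn diverges() -> bool {
    // The closure is annotated as returning `()` so that no inference
    // fallback for the never type is involved.
    panic::catch_unwind(|| -> () { never_returns() }).is_err()
}
```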
Symbols and Scopes
The foo function defines a name-mangled symbol, just like any other normal function. foo can be used as a function pointer, just like any other function name.
The body of foo is a lexical scope, just like a normal rust function, and can contain other symbols and refer to symbols in more general scopes.
Generated code
Calling a naked function
A naked function cannot be called from normal rust (i.e. foo()) because the compiler does not know the calling convention and ABI. Thus, any attempt to invoke a function or function pointer with an extern "user-defined" ABI will result in a compile-time error.
A naked function can be invoked instead through (inline) assembly, C, or some other mechanism, such as a hardware interrupt vector. Alternately, if the naked function's contract states that it can be invoked through some other ABI, an unsafe cast to a function pointer with that ABI can be done, and the cast pointer can be invoked directly.
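The pointer cast described above can be sketched in today's Rust. Since extern "user-defined" does not exist yet, an ordinary extern "C" function stands in for a naked function whose contract says it may be called with the "C" ABI, and it is exposed only as an opaque address. All names here (stand_in, opaque_entry, call_as_c, calls) are hypothetical.

```rust
use std::mem;
use std::sync::atomic::{AtomicU32, Ordering};

static CALLS: AtomicU32 = AtomicU32::new(0);

/// Stand-in for a naked function whose contract permits "C" ABI calls.
extern "C" fn stand_in() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Exposes the entry point as an address that carries no ABI information.
fn opaque_entry() -> *const () {
    stand_in as extern "C" fn() as *const ()
}

/// # Safety
/// `entry` must point to code that can be called with the "C" ABI,
/// takes no arguments, and returns normally.
unsafe fn call_as_c(entry: *const ()) {
    // The cast asserts the ABI; the compiler cannot check this claim.
    let f: unsafe extern "C" fn() = unsafe { mem::transmute(entry) };
    unsafe { f() }
}

/// Number of times the stand-in has run so far.
fn calls() -> u32 {
    CALLS.load(Ordering::SeqCst)
}
```

The transmute is where the caller takes on the contract: if the target does not actually follow the "C" ABI, the call is UB.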
EDIT: this is the old text, for posterity...
foo can be entered from rust code using a normal function call (foo()), but the compiler will not assume that foo is always entered from rust code. In fact, calling foo from normal rust code may violate the contract and trigger UB. For example, an interrupt handler may assume that it will only ever be invoked by the hardware. Callers of foo are responsible for making sure that they uphold their end of the contract with foo.
When rust code does call a naked function, the compiler will emit code in the caller that follows the C ABI and architecture-specific calling conventions for calling a function. For example, on x86_64, it will save caller-saved registers and use the call foo instruction (except foo would be name-mangled), which will save the return address on the stack (there are no arguments, so none need to be passed).
If the naked function returns (), then code emitted in the caller after the function call will assume that foo obeys the C ABI and calling convention. foo is responsible for making sure it upholds any necessary invariants so that this is not UB.
Body of foo
The body of foo will generate code as follows:
- No function prologue or epilogue is generated whatsoever. Callee-saved registers are not saved; the implementor of foo must do it if it is needed.
- Inline assembly at the beginning or end of a naked function will be placed at the entry and exit of the code generated for the naked function, with no intervening instructions. This means that inline assembly can be used to do necessary setup and teardown for the user-defined calling convention.
- Local variables will be lazily allocated on the stack. That is, whenever a local is first declared, instructions are generated that make space on the stack for the local. Such instructions are not to be relocated around inline asm.
- Rust statements in the body of foo will generate the same code as they would in any other Rust function. It is the responsibility of foo to guarantee that these generated statements are not UB. For example, foo may call a normal Rust function in its body, but foo must ensure that there is a valid stack to do so.
- When control flow reaches the end of foo, a ret instruction is inserted if the return type of foo is (). If control flow reaches the end of foo and its type is !, then UB occurs.