I think "just don't let control reach the end of a naked function returning !" is a reasonable requirement; inserting extra trapping instructions is not the expected behavior if you're writing a function that simply shouldn't return. (It might be reasonable in debug mode; it definitely isn't in release mode.)
(Also, for cases where we do want to trap instead of UB, we should use something more like a ud2 instruction or equivalent, which traps, rather than halting.)
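To make the trap-versus-halt distinction concrete, here is a minimal sketch of a diverging function that ends in a trapping instruction. The name `trap` is hypothetical, and the sketch assumes x86_64 for the `ud2` path; on other architectures it falls back to `std::process::abort()`, which is a stand-in rather than an equivalent instruction.

```rust
/// Diverges by executing a trapping instruction (x86_64 only).
#[cfg(target_arch = "x86_64")]
pub fn trap() -> ! {
    // `ud2` raises an invalid-opcode exception. `options(noreturn)` tells the
    // compiler that control never leaves the asm block, so the block has type `!`.
    unsafe { std::arch::asm!("ud2", options(noreturn)) }
}

/// Fallback for other architectures (an assumption made for this sketch):
/// abort the process instead of emitting a specific trap instruction.
#[cfg(not(target_arch = "x86_64"))]
pub fn trap() -> ! {
    std::process::abort()
}
```

Because the return type is `!`, the compiler checks that control can never fall off the end of `trap`.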
If a naked function can only be called via asm/FFI, and the body of a naked function must be 100% inline asm, how is it an improvement over just doing the whole thing in asm?
(I know the opening post of this thread says "arbitrary Rust code" in the function body, but I've completely lost track of who's advocating for or against that and whether the current nightly feature works like that in theory and/or in practice)
When compared with external/global asm, naked fn + single asm block allows for name mangling, proper visibility, is documented in rustdocs, and can itself call name mangled functions (by passing them to the asm block with a “symbol” constraint).
If we can make it work reliably, I’m pretty sure nobody is against arbitrary rust code in naked functions.
When control flow reaches the end of foo, a ret instruction is inserted if the return type of foo is (). If control flow reaches the end of foo and its type is !, then UB occurs.
The more I think about it, the less I’m a fan of the automatic instruction insertion advocated for here. For one thing, can we even reliably insert such a specific instruction for all architectures? And would it not lead to surprises? Wouldn’t most functions need to run the epilogue before returning?
I’d rather we ban the return keyword and enforce that naked functions must end with an asm block containing the necessary “ret”. For functions returning !, such an asm block would be unnecessary.
Another thing that is underspecified imo:
Inline assembly at the beginning or end of a naked function will be placed at the entry and exit of the code generated for the naked function, with no intervening instructions. This means that inline assembly can be used to do necessary setup and tear down for the user-defined calling convention.
In what state is the epilogue asm called? Will the stack variables be deallocated/esp be back to what it was when the prologue ended?
I would like to make this clear once and for all, since this seems to be a common misconception:
You can only use asm! in a naked function, and NOTHING ELSE.
This is explicitly stated in both the GCC and Clang documentation.
The fact that this is not currently enforced by rustc is a bug, not a feature. The use of any local variables, including any inserted by rustc for temporary values, can and will cause stack corruption, especially in debug builds. Just because your code seems to work correctly in release builds does not mean it doesn't have undefined behavior; it just means you got lucky, and your code may break when built with a future compiler.
It is, in fact, difficult to achieve technically. MIR doesn't even have a concept of "here's where this variable is declared", every function just has a list of locals entirely separate from the instruction stream (there are liveness markers but I believe they are just best-effort). LLVM, for its part, has nothing like this "barrier" you imagine. If you want two instructions to not be reordered w.r.t. each other, you're going to have to define them such that they both have some (real or imaginary) effect that would make it a behavioral change to swap the instructions. And alloca is not defined to have any such effect (currently; but I don't expect a proposal to change that will have much success).
But, more importantly, even if you could convince LLVM to not reorder any allocas w.r.t. inline asm and always treat them as dynamic allocas, this still won't give you anything robust. LLVM will happily reorder "pure"/side-effect-free instructions (e.g. arithmetic operations) relative to inline asm and allocas. The code generated from those operations may need to use the stack to, for example:
spill values when running out of registers
call functions (e.g. compiler-builtins or libm functions)
move values between different register classes where there's no register-register move instruction available (e.g. RV32IFD code moving f64s between a 64 bit FPR and a pair of 32 bit GPRs)
Working around that would entail a fully general code movement barrier that affects every instruction, even "pure" operations. Such a thing is entirely incompatible with an optimizing compiler. So I do not think this idea of allowing anything more than inline asm in naked functions is or can be made workable.
You can only use asm! in a naked function, and NOTHING ELSE.
Yes, I am aware. Hence why I said "if we can make it work reliably". I do not believe that to be possible, but at face value this is what the idea seems to be about.
IMO, the best thing would be for Rust to check the single-asm invariant (as is requested by this open PR to the naked fn RFC), but allow symbols to be passed to it, e.g. (using the ASM v2 RFC syntax):
This would then allow naked functions that call arbitrary Rust code by simply branching to that function. This is currently not possible (except through no_mangle) because the asm macro as it exists today cannot pass mangled function names (I tried using the X constraints but got an ICE in LLVM).
Would this (passing constraints to the asm block) be allowed by LLVM?
Basically, what I want from naked functions is a way to put a label on some asm code that Rust can reason about (through visibility and mangling), and similarly a way to give that asm code symbols into the Rust code. Once those building blocks are available, I believe we can mostly do whatever we want WRT interrupt handlers and whatnot.
It is actually possible with the current asm! macro, but the way to do it is pretty obscure. Basically you need to cast the function to usize (since normally a fn is a ZST unless coerced to a function pointer) and use the "s" constraint.
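The ZST point can be checked directly. The sketch below is only an illustration, with hypothetical names (`target`, `fn_item_size`, and so on): it shows that a function item occupies zero bytes until it is coerced to a function pointer, and that an address only exists after that coercion. It does not reproduce the "s" constraint itself, which belongs to the old asm! macro.

```rust
/// Hypothetical function whose address we want to hand to assembly.
pub fn target() -> u32 {
    7
}

/// Size of the function item `target`: each fn item has its own
/// zero-sized type, so there is no address stored in the value.
pub fn fn_item_size() -> usize {
    std::mem::size_of_val(&target)
}

/// Size after coercing the item to a function pointer.
pub fn fn_pointer_size() -> usize {
    let ptr: fn() -> u32 = target;
    std::mem::size_of_val(&ptr)
}

/// The address as an integer, obtained by going through a function
/// pointer and a raw pointer before the final cast to usize.
pub fn address_of_target() -> usize {
    let ptr: fn() -> u32 = target;
    ptr as *const () as usize
}
```

Here `fn_item_size()` is 0, while `fn_pointer_size()` matches the size of `usize` on the usual targets.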
Making this more accessible is one of the main reasons why I added the sym operand type in the new inline asm RFC.
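For reference, a rough sketch of what a sym operand looks like in the asm! syntax that was later stabilized. This uses an ordinary function rather than a naked one, and the names `answer` and `call_via_sym` are hypothetical. The asm path assumes x86_64 with a non-Windows C ABI (return value in rax, no shadow space); every other target takes a plain Rust call as a fallback.

```rust
/// Hypothetical Rust function that the assembly should call. It is not
/// marked no_mangle, so its symbol name is mangled as usual.
pub extern "C" fn answer() -> u64 {
    42
}

/// Calls `answer` from inline assembly by passing it as a `sym` operand.
#[cfg(all(target_arch = "x86_64", not(windows)))]
pub fn call_via_sym() -> u64 {
    let result: u64;
    unsafe {
        std::arch::asm!(
            // `{f}` is replaced with the (mangled) symbol name of `answer`.
            "call {f}",
            f = sym answer,
            // The C ABI on this target returns the u64 in rax.
            out("rax") result,
            // Mark every register the C ABI does not preserve as clobbered.
            clobber_abi("C"),
        );
    }
    result
}

/// Fallback for other targets (an assumption made for this sketch):
/// call the function directly from Rust.
#[cfg(not(all(target_arch = "x86_64", not(windows))))]
pub fn call_via_sym() -> u64 {
    answer()
}
```

The asm block deliberately omits the `nostack` option, so the compiler keeps the stack suitably aligned for the call.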