Idea: Naked Functions 2.0

josh · January 16, 2020, 4:16am

I think "just don't let control reach the end of a naked function returning !" is a reasonable requirement; inserting extra trapping instructions is not the expected behavior if you're writing a function that simply shouldn't return. (It might be reasonable in debug mode; it definitely isn't in release mode.)

(Also, for cases where we do want to trap instead of UB, we should use something more like a ud2 instruction or equivalent, which traps, rather than halting.)
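A minimal sketch of that distinction, assuming x86-64 and the `asm!` syntax as it was later stabilized (not what existed when this thread was written). `trap` is a hypothetical helper name. On other architectures the sketch falls back to `std::process::abort()`, which is only a stand-in, not an equivalent instruction.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical helper: raises an invalid-opcode exception (#UD) immediately.
/// `ud2` always faults, whereas `hlt` in ring 0 merely pauses the CPU until
/// the next interrupt and then continues.
#[cfg(target_arch = "x86_64")]
pub fn trap() -> ! {
    // `noreturn` tells the compiler control never leaves the asm block,
    // so the block has type `!`.
    unsafe { asm!("ud2", options(noreturn)) }
}

/// Fallback for other architectures (an assumption of this sketch).
#[cfg(not(target_arch = "x86_64"))]
pub fn trap() -> ! {
    std::process::abort()
}
```

Calling `trap()` terminates the process, so it only belongs on paths that must never be reached.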

Ixrec · January 16, 2020, 10:40am

Question from the clueless outsider:

If a naked function can only be called via asm/FFI, and the body of a naked function must be 100% inline asm, how is it an improvement over just doing the whole thing in asm?

(I know the opening post of this thread says "arbitrary Rust code" in the function body, but I've completely lost track of who's advocating for or against that and whether the current nightly feature works like that in theory and/or in practice)

bjorn3 · January 16, 2020, 10:57am

According to the OP arbitrary rust code would be allowed in the body:

roblabla · January 16, 2020, 12:38pm

When compared with external/global asm, naked fn + single asm block allows for name mangling, proper visibility, is documented in rustdocs, and can itself call name mangled functions (by passing them to the asm block with a “symbol” constraint).

If we can make it work reliably, I’m pretty sure nobody is against arbitrary rust code in naked functions.

“When control flow reaches the end of foo, a ret instruction is inserted if the return type of foo is (). If control flow reaches the end of foo and its type is !, then UB occurs.”

The more I think about it, the less I’m a fan of the automatic instruction insertion advocated for here. For one thing, can we even reliably insert such a specific instruction for all architectures? And would it not lead to surprises? Wouldn’t most functions need to run the epilogue before returning?

I’d rather we ban the return keyword and enforce that a naked function must end with an asm block containing the necessary “ret”. For functions returning !, such an asm block would be unnecessary.

Another thing that is underspecified imo:

“Inline assembly at the beginning or end of a naked function will be placed at the entry and exit of the code generated for the naked function, with no intervening instructions. This means that inline assembly can be used to do necessary setup and tear down for the user-defined calling convention.”

In what state is the epilogue asm called? Will the stack variables be deallocated/esp be back to what it was when the prologue ended?

Amanieu · January 16, 2020, 1:10pm

I would like to make this clear once and for all, since this seems to be a common misconception:

You can only use asm! in a naked function, and NOTHING ELSE.

This is explicitly stated in both the GCC and Clang documentation.

The fact that this is not currently enforced by rustc is a bug, not a feature. The use of any local variables, including any inserted by rustc for temporary values, can and will cause stack corruption, especially in debug builds. Just because your code seems to work correctly in release builds does not mean it doesn't have undefined behavior; it just means you got lucky, and your code may break when built with a future compiler.

hanna-kruppe · January 16, 2020, 1:34pm

It is, in fact, difficult to achieve technically. MIR doesn't even have a concept of "here's where this variable is declared", every function just has a list of locals entirely separate from the instruction stream (there are liveness markers but I believe they are just best-effort). LLVM, for its part, has nothing like this "barrier" you imagine. If you want two instructions to not be reordered w.r.t. each other, you're going to have to define them such that they both have some (real or imaginary) effect that would make it a behavioral change to swap the instructions. And alloca is not defined to have any such effect (currently; but I don't expect a proposal to change that will have much success).

But, more importantly, even if you could convince LLVM to not reorder any allocas w.r.t. inline asm and always treat them as dynamic allocas, this still won't give you anything robust. LLVM will happily reorder "pure"/side-effect-free instructions (e.g. arithmetic operations) relative to inline asm and allocas. The code generated from those operations may need to use the stack to, for example:

- spill values when running out of registers
- call functions (e.g. compiler-builtins or libm functions)
- move values between different register classes where there's no register-register move instruction available (e.g. RV32IFD code moving f64s between a 64-bit FPR and a pair of 32-bit GPRs)

Working around that would entail a fully general code movement barrier that affects every instruction, even "pure" operations. Such a thing is entirely incompatible with an optimizing compiler. So I do not think this idea of allowing anything more than inline asm in naked functions is or can be made workable.

roblabla · January 16, 2020, 3:49pm

"You can only use asm! in a naked function, and NOTHING ELSE."

Yes, I am aware. Hence why I said "if we can make it work reliably". I do not believe that to be possible, but at face value this is what the idea seems to be about.

IMO, the best thing would be for Rust to check the single-asm invariant (as is requested by this open PR to the naked fn RFC), but allow symbols to be passed to it, e.g. (using the ASM v2 RFC syntax):

#[naked]
unsafe fn test() -> () {
    asm!("call {}
          ret", sym(test_impl));
}

extern fn test_impl() -> u32 {
    1
}

This would then allow naked functions that call arbitrary rust code by simply branching to that function. This is currently not possible (except through no_mangle) because the asm macro as it exists today cannot pass mangled function names (I tried using the X constraints but got an ICE in llvm).
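For comparison, here is a rough sketch of the same idea using the `sym` operand in the `asm!` syntax that was later stabilized. It is not a naked function: it uses an ordinary wrapper so that it runs on a stable toolchain. It assumes x86-64 on a Unix-like target, and `call_test_impl` is a hypothetical name. Other targets fall back to a direct call.

```rust
#[cfg(all(target_arch = "x86_64", unix))]
use std::arch::asm;

// Stand-in for `test_impl`, with an explicit C ABI.
extern "C" fn test_impl() -> u32 {
    1
}

/// Hypothetical wrapper: reaches `test_impl` through its (mangled) symbol.
#[cfg(all(target_arch = "x86_64", unix))]
pub fn call_test_impl() -> u32 {
    let result: u32;
    unsafe {
        asm!(
            "call {}",
            sym test_impl,      // expands to the mangled symbol name
            out("rax") result,  // the C ABI returns the u32 in eax/rax
            clobber_abi("C"),   // caller-saved registers may be overwritten
        );
    }
    result
}

/// Fallback for other targets (an assumption of this sketch): a plain call.
#[cfg(not(all(target_arch = "x86_64", unix)))]
pub fn call_test_impl() -> u32 {
    test_impl()
}
```

Because the symbol is resolved by the compiler, `test_impl` needs neither `no_mangle` nor public visibility.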

Would this (passing constraints to the asm block) be allowed by LLVM?

Basically, what I want from naked functions is a way to put a label on some asm code that Rust can reason about (through visibility and mangling), and similarly a way to give that asm code symbols into the Rust code. Once those building blocks are available, I believe we can mostly do whatever we want WRT interrupt handlers and whatnot.

Amanieu · January 16, 2020, 4:55pm

It is actually possible with the current asm! macro, but the way to do it is pretty obscure. Basically you need to cast the function to usize (since normally a fn is a ZST unless coerced to a function pointer) and use the "s" constraint.

Making this more accessible is one of the main reasons I added the sym operand type in the new inline asm RFC.
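The ZST remark can be checked without any assembly. The sketch below is only an illustration, with hypothetical helper names: it compares the size of a function item with that of a function pointer and obtains the address as an integer, which is the value the old-style constraint would receive.

```rust
use std::mem::size_of_val;

// Stand-in function; only its type and address matter here.
extern "C" fn test_impl() -> u32 {
    1
}

/// Size of the function *item* type: zero, since the type itself
/// already identifies the function.
pub fn fn_item_size() -> usize {
    size_of_val(&test_impl)
}

/// Size after coercion to a function *pointer*: pointer-sized.
pub fn fn_pointer_size() -> usize {
    let p: extern "C" fn() -> u32 = test_impl;
    size_of_val(&p)
}

/// Address of the function as an integer, via an explicit cast chain.
pub fn fn_address() -> usize {
    test_impl as *const () as usize
}
```

On common targets `fn_item_size()` is 0 and `fn_pointer_size()` equals the size of `usize`, which is why the cast is needed before the value can be handed to an asm operand.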

system · April 15, 2020, 4:55pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
