Unnecessary `push rax` in inline assembly

Take a look at this snippet:

The compiler generates unnecessary stack manipulation code (`push rax` and `add rsp, 8`). IIUC the compiler does this to align the stack pointer for an assembly block that may use the stack (i.e. when we do not provide `options(nostack)`), but in this case the stack already gets aligned by the pushes for the clobbered registers, so the extra push looks redundant.

Is there a way to work around this? Unfortunately, I cannot use `options(nostack)`, since I need to clobber LLVM-reserved registers (`rbp` and `rbx`) and that cannot be done with `out(reg) _`, so I have to save and restore them on the stack by hand.

Maybe I can use `options(nostack)` in the snippet above anyway, assuming that the code between the push/pop pair (i.e. where the snippet uses `nop`) does not manipulate the stack? After all, the main effect of the option is documented like this:

If this option is not used then the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call.
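The snippet from the opening post is not reproduced in this thread text. As a rough guess at the pattern being described (four callee-saved registers listed as clobbers, with `rbp` and `rbx` saved by hand around a `nop`), something like the following would fit. The function name `clobber_sketch` and the choice of `r12` through `r15` are assumptions for illustration, not the original code.

```rust
// Hypothetical reconstruction of the pattern under discussion, x86-64 only.
#[cfg(target_arch = "x86_64")]
pub fn clobber_sketch() {
    unsafe {
        core::arch::asm!(
            // rbp and rbx cannot be named as operands, so save them by hand.
            "push rbp",
            "push rbx",
            // Placeholder for the real work that would clobber rbp/rbx.
            "nop",
            "pop rbx",
            "pop rbp",
            // Assumed clobber list: four callee-saved registers that the
            // compiler has to spill in the prologue.
            out("r12") _,
            out("r13") _,
            out("r14") _,
            out("r15") _,
            // No options(nostack): the block itself pushes to the stack.
        );
    }
}

// Fallback so the sketch still compiles on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn clobber_sketch() {}
```

Because `nostack` is not set, the block is allowed to use `push`/`pop`, and in exchange the compiler has to make sure `rsp` is suitably aligned when the block is entered.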

`options(nostack)` also means that there may be a red zone below the stack pointer, which must not be clobbered because the compiler may be using it. I don't think there is a way to avoid this unnecessary `push rax` short of using naked asm instead.

I don't think this is correct – the stack is misaligned on entry to the function. On non-Windows x86-64 the stack must be 16-byte aligned at the point where you make a function call, but the `call` instruction itself pushes an 8-byte return address, so every function that uses the normal calling convention starts with a stack that is misaligned by exactly 8. Then four registers are spilled (so that you can clobber them), which pushes 32 bytes onto the stack, so it's still misaligned by 8.

A good way to think about it is that pushing 8 bytes onto the stack is never "unnecessary" unless you don't call functions at all, in the sense that if the stack is aligned without the push, then it would be misaligned with the push (as the stack requires 16-byte alignment) – thus it's either necessary or actively harmful, there are no situations where it doesn't matter. So if the push here weren't needed, generating it would be unsound / a code generation bug.

If you specify three registers as clobbered rather than four, you'll observe that no `push rax` is added, and in fact it would be incorrect/unsound to add one.
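The alignment argument in this reply can be checked with a little arithmetic. The sketch below is only a model: it assumes the System V x86-64 convention (16-byte alignment at the call site, an 8-byte return address pushed by `call`, 8 bytes per spilled register), and the function names are made up for illustration.

```rust
/// Bytes by which `rsp` is off a 16-byte boundary after `pushes` 8-byte
/// pushes, counted from function entry. Assumes the caller had `rsp`
/// 16-byte aligned right before `call` pushed the 8-byte return address.
pub const fn misalignment_after(pushes: u64) -> u64 {
    const RETURN_ADDRESS_BYTES: u64 = 8;
    const PUSH_BYTES: u64 = 8;
    (RETURN_ADDRESS_BYTES + PUSH_BYTES * pushes) % 16
}

/// True if one extra 8-byte push (such as `push rax`) is needed to
/// restore 16-byte alignment after spilling `spilled` registers.
pub const fn needs_padding_push(spilled: u64) -> bool {
    misalignment_after(spilled) != 0
}
```

Under these assumptions `needs_padding_push(4)` is true (8 + 32 = 40, which is 8 past a 16-byte boundary) and `needs_padding_push(3)` is false (8 + 24 = 32), matching the four-versus-three observation above.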


Ah, true. `push rax` is used to compensate for the return address pushed by `call`.

I wish we had an option "this asm block does not require stack pointer alignment".
