Why does C++ pass 128 bits via registers, while Rust uses the stack?

If I compile the following code (Linux + amd64):

#include <cstdint>

struct Small {
  int32_t a, b, c, d;
};

int32_t sum1(Small s)
{
    return s.a + s.d;
}

then the disassembly shows that sum1 takes its argument in two 64-bit registers: godbolt link.

But for some reason, Rust for similar code:

pub struct Small {
    a: i32,
    b: i32,
    c: i32,
    d: i32,
}

#[no_mangle]
pub fn sum1(s: Small) -> i32
{
    s.a + s.d
}

uses the stack instead of registers: godbolt link.

Is there any reason behind this difference?

This is just how the Rust ABI is currently implemented. If you want to match C, you can define your function as extern "C".
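For illustration, here is a minimal sketch of what the extern "C" version could look like. The name sum1_c is made up for this example, and #[repr(C)] is my own addition (it is not mentioned above): it gives the struct the C field layout, which you normally want when a type crosses a C ABI boundary. On x86-64 Linux I would expect this to follow the System V calling convention and receive the struct in two registers, like the C++ version, but check the disassembly for your own target.

```rust
// Assumption: #[repr(C)] is added so the struct uses the C field layout.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Small {
    pub a: i32,
    pub b: i32,
    pub c: i32,
    pub d: i32,
}

// Hypothetical name. `extern "C"` asks the compiler to use the platform's
// C calling convention for this function instead of the unstable Rust ABI.
pub extern "C" fn sum1_c(s: Small) -> i32 {
    s.a + s.d
}
```

The function is still callable from ordinary Rust code, e.g. sum1_c(Small { a: 1, b: 2, c: 3, d: 4 }), so only the calling convention changes, not how you use it.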

Note that Rust does not "use the stack" in your snippet. Instead, it generates a sum1 that accepts a pointer to the struct, and that pointer may point anywhere in memory, not necessarily to the stack. As for which ABI is better... it depends. Yes, the Rust ABI may result in additional stack usage, but the C++ ABI may result in unnecessary loads and register pollution (e.g. fields b and c in your example, which are loaded into registers but never used). Overall, I don't think it matters much in practice, since small functions for which the difference could be noticeable are likely to be inlined.
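To make the "accepts a pointer" point concrete, here is a rough source-level sketch. sum1_by_ref is a hypothetical function written only for comparison; it is an analogy for what the by-value sum1 looks like at the machine level according to the description above, not a statement of what the compiler is guaranteed to emit (the Rust ABI is unstable).

```rust
#[derive(Clone, Copy)]
pub struct Small {
    pub a: i32,
    pub b: i32,
    pub c: i32,
    pub d: i32,
}

// By-value signature, as in the original question.
pub fn sum1(s: Small) -> i32 {
    s.a + s.d
}

// Hypothetical by-reference variant: the caller hands over an address and
// the callee loads only the fields it actually needs (a and d).
pub fn sum1_by_ref(s: &Small) -> i32 {
    s.a + s.d
}
```

The reference can point at a local variable, a static, or heap memory such as a Box<Small>, which is why "uses the stack" is not quite the right description.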


We do pass aggregate types that fit in a single register in said register: see rust/compiler/rustc_target/src/callconv/mod.rs at 15469f8f8ae0a77577745cf56d562600fdb6539a in rust-lang/rust on GitHub. In the past we did the same when the aggregate fit in two registers, but we stopped doing this in https://github.com/rust-lang/rust/pull/94570 because it caused the autovectorizer to be less effective.


Does this mean that the autovectorizer in Rust will do a better job than in C++ in cases where C++ passes a 128-bit parameter via registers? (Assuming both use LLVM.)