When a program uses the system malloc/free as its allocator, tracking objects' size and alignment for deallocation is redundant and causes unnecessary code bloat.
This seems to happen because __rust_dealloc is magic and doesn't get inlined, so the optimizer can't see that its arguments are unused. This opaque layer of abstraction sits between allocations/deallocations in Rust and the calls to malloc/free, which makes it not quite a zero-cost abstraction.
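To make the "unused arguments" point concrete, here is a minimal sketch of an allocator that forwards to malloc/free. The names `MallocFree`, `MALLOC_ALIGN` and `roundtrip` are made up for illustration, and the alignment bound is an assumption about the platform's malloc, not something std guarantees. It is not how the real System allocator is implemented; it only shows that such a dealloc has no use for the Layout it is given.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::ffi::c_void;

// C allocator entry points, declared by hand for this sketch.
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Hypothetical allocator that forwards straight to malloc/free.
pub struct MallocFree;

/// Assumption: malloc returns memory aligned to at least two pointer widths.
const MALLOC_ALIGN: usize = 2 * std::mem::size_of::<usize>();

unsafe impl GlobalAlloc for MallocFree {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if layout.align() > MALLOC_ALIGN {
            // Over-aligned requests are out of scope for this sketch.
            return std::ptr::null_mut();
        }
        unsafe { malloc(layout.size()) as *mut u8 }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, _layout: Layout) {
        // free only needs the pointer: the size and alignment that the
        // caller materialized are discarded here.
        unsafe { free(ptr as *mut c_void) }
    }
}

/// Allocates, writes, reads back and frees one u64 through the allocator.
pub fn roundtrip() -> u64 {
    let layout = Layout::new::<u64>();
    unsafe {
        let p = MallocFree.alloc(layout) as *mut u64;
        assert!(!p.is_null());
        p.write(42);
        let value = p.read();
        MallocFree.dealloc(p as *mut u8, layout);
        value
    }
}
```

Because callers reach dealloc only through __rust_dealloc, the fact that `_layout` is ignored is invisible at the call site.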
This:
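// Dropped normally: goes through Box's Drop and __rust_dealloc.
pub fn actual(_drop: Box<[u8; 123456]>) {
}
// Skips Box's Drop and hands the pointer straight to free.
pub fn expected(drop: Box<[u8; 123456]>) {
    unsafe { free(Box::leak(drop).as_mut_ptr().cast()); }
}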
Compiles to (godbolt link):
example::actual:
        mov     esi, 123456
        mov     edx, 1
        jmp     qword ptr [rip + __rust_dealloc@GOTPCREL]
example::expected:
        jmp     qword ptr [rip + free@GOTPCREL]
Note that Drop of a Box needlessly sets the size and alignment of the allocation, because it can't know that this particular allocator implementation won't use them.
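The values being passed can also be observed at runtime. The following is a rough sketch, assuming a wrapper around std::alloc::System; the names `Recording`, `BIG_FREES` and `drop_big_box` are hypothetical. It counts dealloc calls that arrive with size 123456 and alignment 1, which are the two constants loaded into esi and edx in the assembly above.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counter: dealloc calls seen with size 123456 and align 1.
pub static BIG_FREES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper that records the Layout before delegating to System.
pub struct Recording;

unsafe impl GlobalAlloc for Recording {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // The layout arrives here because Box's Drop computed it for us.
        if layout.size() == 123456 && layout.align() == 1 {
            BIG_FREES.fetch_add(1, Ordering::Relaxed);
        }
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: Recording = Recording;

/// Drops one Box<[u8; 123456]> and returns how many matching frees were seen.
pub fn drop_big_box() -> usize {
    let before = BIG_FREES.load(Ordering::Relaxed);
    // black_box keeps the optimizer from eliding the allocation entirely.
    let b: Box<[u8; 123456]> = std::hint::black_box(Box::new([0u8; 123456]));
    drop(b);
    BIG_FREES.load(Ordering::Relaxed) - before
}
```

An allocator like this one does use the layout, which is why the caller has to provide it unconditionally; the cost only looks wasted when the installed allocator ignores it.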
Rest of the repro code:
use std::alloc::*;
#[global_allocator]
static A: Alloc = Alloc;
struct Alloc;
unsafe impl GlobalAlloc for Alloc {
    #[inline(always)]
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 { std::process::abort(); }
    #[inline(always)]
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) { std::process::abort(); }
    #[inline(always)]
    unsafe fn alloc_zeroed(&self, _layout: Layout) -> *mut u8 { std::process::abort(); }
    #[inline(always)]
    unsafe fn realloc(
        &self,
        _ptr: *mut u8,
        _layout: Layout,
        _new_size: usize
    ) -> *mut u8 { std::process::abort(); }
}
extern "C" {
    fn free(_: *mut u8);
}
So it seems like there's a missed optimization opportunity here.