It can be very useful for user programs to be able to take arbitrary actions when heap allocation and deallocation occur.
My motivation is that I want to build a heap profiler for Servo, one that’s similar to Firefox’s DMD. This profiler would record a stack trace on each allocation, and use that to emit data about which parts of the code are performing many allocations (contributing to heap churn) and which ones are responsible for allocating live blocks (contributing to peak memory usage). This may sound like a generic tool that could be built into Rust itself, but it’s likely to end up with Servo-specific parts, so flexible building blocks would be better than a canned solution.
There are lots of other potential uses for this, and this kind of facility is common in other systems. E.g. glibc provides one for malloc/realloc/free, and there’s also the more general LD_PRELOAD (on Linux) and DYLD_INSERT_LIBRARIES (on Mac), which allow you to hook any library functions.
What this needs.
- A program needs to be able to specify that it wants to opt in to this feature, and a way to specify the wrapping functions (one each for std::rt::heap::allocate, reallocate, reallocate_inplace, deallocate, and possibly usable_size and stats_print). The opting-in could be at compile-time or runtime; the latter is probably preferable because it’s more flexible, but this is not a strong preference.
- The allocation/deallocation functions need to call any provided wrappers.
- A way for wrappers to temporarily disable wrapping while they are running, so that we don’t get infinite recursion if the wrapper itself triggers a call to the function that it wraps.
I have a basic, gross, proof-of-concept implementation. (I’ll show just the part relating to the wrapping of allocate; the other functions are very similar.) It adds the following code to src/liballoc/heap.rs, which defines a struct for holding the wrapper function and a function for setting the wrapper.
pub type AllocateFn = unsafe fn(usize, usize) -> *mut u8;
pub type AllocateFnWrapper = fn(AllocateFn, usize, usize) -> *mut u8;

struct AllocatorWrappers {
    allocate: AllocateFnWrapper,
}

static mut wrappers: Option<AllocatorWrappers> = Option::None;

pub fn set_allocator_wrappers(allocate: AllocateFnWrapper) {
    let new_wrappers = AllocatorWrappers {
        allocate: allocate,
    };
    unsafe { wrappers = Option::Some(new_wrappers) };
}
It also modifies allocate like so:
 #[inline]
 pub unsafe fn allocate(size: uint, align: uint) -> *mut u8 {
-    imp::allocate(size, align)
+    match wrappers {
+        Option::None => imp::allocate(size, align),
+        Option::Some(ref h) => (h.allocate)(imp::allocate, size, align),
+    }
 }
In the normal case this adds a single, perfectly-predictable branch to the alloc/dealloc path, which is hopefully small in relation to the cost of an alloc/dealloc.
And here is part of a sample program that uses it.
use std::cell::Cell;

// If a wrapper contains code that itself calls the function being wrapped,
// we'll hit infinite recursion. Therefore, each wrapper needs to be able to
// temporarily disable wrapping. This is achieved via a thread-local flag.
thread_local!(static WRAPPERS_ENABLED: Cell<bool> = Cell::new(true));

fn my_allocate(real_allocate: AllocateFn, size: usize, align: usize) -> *mut u8 {
    WRAPPERS_ENABLED.with(|wrappers_enabled| {
        // Only run the hook body if we are not already inside a wrapper.
        if wrappers_enabled.get() {
            wrappers_enabled.set(false);
            println!("my_allocate({}, {})", size, align);
            wrappers_enabled.set(true);
        }
    });
    // Perform exactly one real allocation, whether or not the hook body ran.
    unsafe { real_allocate(size, align) }
}
fn main() {
    // Want to do this as early as possible.
    set_allocator_wrappers(my_allocate, my_reallocate, my_reallocate_inplace, my_deallocate);

    // ... do stuff ...

    // Without this, I get "thread '<main>' panicked at 'cannot access a TLS
    // value during or after it is destroyed'". Presumably the problem is that
    // the destructor for WRAPPERS_ENABLED gets run, so we can't access the TLS
    // any more, and then deallocate() is called for something else. Urgh.
    WRAPPERS_ENABLED.with(|wrappers_enabled| {
        wrappers_enabled.set(false);
    });
}
As I said, it’s totally gross.
- set_allocator_wrappers() should be called as early as possible, so that it doesn’t miss any allocations. It’s also entirely non-thread-safe, which is probably ok if you do call it right at the start of main(), but is still nasty. Putting the wrappers table inside an RwLock or something might be possible but it would be a shame, performance-wise, to do that for a data structure that’s written once and then read zillions of times. I figure there must be a more Rust-y way of doing this. Could a #[feature("...")] annotation work? We really want these wrappers to be enabled by the language runtime at start-up, rather than the program itself having to do it.
- The thread-local storage to avoid infinite recursion isn’t so bad, though it would be nice if this could be handled within the Rust implementation so that each individual program doesn’t have to reimplement it. The hack at the end of main – to deal with deallocate calls once the TLS is destroyed – is gross, too.
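To make the "written once, read zillions of times" concern concrete, here is a rough, standalone sketch of a wrapper slot that needs no lock on the read path. It is not the prototype's code: it assumes a current Rust toolchain with std::sync::OnceLock, lives outside liballoc, and uses a made-up imp_allocate stand-in so that it can run on its own. Whether something like this could be used inside the allocator itself is an open question.

```rust
use std::sync::OnceLock;

pub type AllocateFn = unsafe fn(usize, usize) -> *mut u8;
pub type AllocateFnWrapper = fn(AllocateFn, usize, usize) -> *mut u8;

// Written at most once, then only read. Readers never take a lock.
static ALLOCATE_WRAPPER: OnceLock<AllocateFnWrapper> = OnceLock::new();

/// Installs the wrapper. Returns `false` if a wrapper was already installed,
/// in which case the existing one is left in place.
pub fn set_allocator_wrappers(allocate: AllocateFnWrapper) -> bool {
    ALLOCATE_WRAPPER.set(allocate).is_ok()
}

/// Hypothetical stand-in for `imp::allocate`: returns a dangling pointer
/// whose address equals `align`, without touching the heap.
unsafe fn imp_allocate(_size: usize, align: usize) -> *mut u8 {
    align as *mut u8
}

/// Same dispatch shape as the modified `allocate` above: call the real
/// function directly unless a wrapper has been installed.
pub unsafe fn allocate(size: usize, align: usize) -> *mut u8 {
    match ALLOCATE_WRAPPER.get() {
        None => unsafe { imp_allocate(size, align) },
        Some(wrapper) => (*wrapper)(imp_allocate, size, align),
    }
}
```

The trade-off in this sketch is that the wrapper cannot be replaced or removed once set, which fits the "install at start-up" use case but not much else.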
It was really just a prototype to see if I could get something working. And it does.
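The thread-local flag handling could also be factored out so that each wrapper doesn't repeat it. The following is only a sketch under assumptions: with_wrapping_disabled is a hypothetical helper name, and it relies on LocalKey::try_with from current Rust, which reports an error instead of panicking when the thread-local has already been destroyed. In that case the hook body is simply skipped.

```rust
use std::cell::Cell;

pub type AllocateFn = unsafe fn(usize, usize) -> *mut u8;

thread_local!(static WRAPPERS_ENABLED: Cell<bool> = Cell::new(true));

/// Hypothetical helper: runs `hook` with wrapping disabled on this thread.
/// Returns `true` if the hook ran, `false` if it was skipped because we are
/// already inside a wrapper or the thread-local is no longer accessible.
pub fn with_wrapping_disabled<F: FnOnce()>(hook: F) -> bool {
    // Re-enables wrapping when dropped, so the flag is restored even if
    // `hook` panics.
    struct Reenable<'a>(&'a Cell<bool>);
    impl<'a> Drop for Reenable<'a> {
        fn drop(&mut self) {
            self.0.set(true);
        }
    }

    WRAPPERS_ENABLED
        .try_with(|enabled| {
            if !enabled.get() {
                return false; // re-entrant call: skip the hook body
            }
            enabled.set(false);
            let _guard = Reenable(enabled);
            hook();
            true
        })
        .unwrap_or(false) // thread-local already destroyed: skip the hook body
}

/// The sample wrapper from above, rewritten on top of the helper.
pub fn my_allocate(real_allocate: AllocateFn, size: usize, align: usize) -> *mut u8 {
    with_wrapping_disabled(|| println!("my_allocate({}, {})", size, align));
    unsafe { real_allocate(size, align) }
}
```

With a helper of this shape, the hack at the end of main would not be needed, at the cost of silently missing any allocations that happen after the thread-local is gone.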
So… I’m wondering if (a) a feature like this seems like something that might be accepted into Rust, and (b) if there are better ways of implementing it. I hope so, because my ideas for measuring Servo’s memory usage are dead in the water without this.
This idea has a small amount of overlap with this old RFC – I want a custom allocator for the entire program, basically – but is much smaller.
Thank you for reading.