I am building an LD_PRELOAD library in Rust to track file system dependencies by intercepting calls such as readlink, open, fopen, etc. The goal is to add build dependency tracking for build tools like GNU make, cargo, etc. In fact, I had a version mostly working in C, and it ran fine on the rustc compiler as well.
I started rewriting it in Rust and have mostly loved the experience so far, but I ran into a hang when using the LD_PRELOAD library with the rustc compiler.
The problem is that the LD_PRELOAD library intercepts readlink(), which is called by the jemalloc code in rustc. The test code does nothing more than intercept readlink and then call the original readlink, and it looks like this ends up in recursion.
Question: How can I rewrite the code below so that this recursion is prevented? I am asking this here after discussing it in the user forum in the below thread.
I am trying to see if people with an understanding of rustc internals can provide some advice.
Basically, I am looking to figure out how I can get my LD_PRELOAD program to work with the rustc compiler.
One thing I think I need to figure out is how to do some of the static global variable initialization, such as
dlsym_next(concat!("readlink", "\0"));
static mut ONCE: Once = Once::new();
etc
in the test code below, without triggering the jemalloc code that makes the readlink/open calls that cause the recursion. All these intercepts work fine in my C version, so I suspect it is not the interception itself but the initialization code that triggers rustc's jemalloc code, which calls readlink again and then blocks on a mutex lock.
I am hoping the rustc experts here may have some advice on how to avoid the jemalloc recursion and do the static global initialization without triggering it.
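To make the initialization question concrete, here is a minimal sketch of one alternative to `static mut` plus `Once`: a cache whose constructor is a `const fn`, so the static needs no runtime initialization and the cache itself never touches the heap. `SymbolCache`, `get_or_resolve` and `REAL_READLINK` are hypothetical names, not part of the test code. It is only an assumption that this helps: the resolver passed in would still be dlsym, and whatever dlsym allocates is outside the cache's control.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Caches one resolved symbol address (hypothetical helper).
/// `new` is a `const fn`, so a `static` of this type is fully
/// initialized at compile time.
pub struct SymbolCache {
    addr: AtomicUsize, // 0 means "not resolved yet"
}

impl SymbolCache {
    pub const fn new() -> Self {
        SymbolCache { addr: AtomicUsize::new(0) }
    }

    /// Returns the cached address, calling `resolve` only while the slot
    /// is still empty. Two threads (or a nested call on the same thread)
    /// may both run `resolve`; that is harmless as long as it always
    /// returns the same address.
    pub fn get_or_resolve(&self, resolve: impl FnOnce() -> usize) -> usize {
        let cached = self.addr.load(Ordering::Acquire);
        if cached != 0 {
            return cached;
        }
        let resolved = resolve();
        self.addr.store(resolved, Ordering::Release);
        resolved
    }
}

// Hypothetical slot for the real readlink address.
pub static REAL_READLINK: SymbolCache = SymbolCache::new();
```

In the hook this could be used as something like `REAL_READLINK.get_or_resolve(|| unsafe { dlsym_next("readlink\0") as usize })`. The difference from `Once` is that a nested call made while the resolver is still running does not wait; it simply runs the resolver again.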
I even tried making the readlink syscall directly after interception, but I still can't avoid the hang. It just hangs differently.
(gdb) bt
#0 0x00007f144590e4ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f1445909dcb in L_lock_883 () from /lib64/libpthread.so.0
#2 0x00007f1445909c98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x000056401d0150ed in malloc_mutex_lock_final (mutex=0x56401d2351d8 <init_lock>) at ../jemalloc/include/jemalloc/internal/mutex.h:141
#4 rjem_je_malloc_mutex_lock_slow (mutex=0x56401d2351d8 <init_lock>) at ../jemalloc/src/mutex.c:84
#5 0x000056401cfea09f in malloc_mutex_lock (tsdn=0x0, mutex=<optimized out>) at ../jemalloc/include/jemalloc/internal/mutex.h:205
#6 malloc_init_hard () at ../jemalloc/src/jemalloc.c:1506
#7 malloc_init () at ../jemalloc/src/jemalloc.c:217
#8 imalloc (sopts=<optimized out>, dopts=<optimized out>) at ../jemalloc/src/jemalloc.c:1986
#9 calloc (num=1, size=32) at ../jemalloc/src/jemalloc.c:2138
#10 0x00007f14456fd550 in dlerror_run () from /lib64/libdl.so.2
#11 0x00007f14456fd058 in dlsym () from /lib64/libdl.so.2
#12 0x00007f14497b6cc7 in ldpreload::dlsym_next::h7cab03892daf2fab (symbol=...) at src/lib.rs:43
#13 0x00007f14497b7429 in ldpreload::readlink_local::get::$u7b$$u7b$closure$u7d$$u7d$::haeb8628401923c4c () at src/lib.rs:71
#14 0x00007f14497b73b0 in std::sync::once::Once::call_once::$u7b$$u7b$closure$u7d$$u7d$::haf60324ae025b6d9 () at /rustc/8d69840ab92ea7f4d323420088dd8c9775f180cd/src/libstd/sync/once.rs:264
#15 0x00007f14497de848 in std::sync::once::Once::call_inner::hfbdd978c729db7b8 () at src/libstd/sync/once.rs:416
#16 0x00007f14497b7329 in std::sync::once::Once::call_once::hc3adf9476c282443 (self=0x7f1449a720a0 <ldpreload::readlink_local::get::ONCE::h19ff01a158e3f42e>, f=...) at /rustc/8d69840ab92ea7f4d323420088dd8c9775f180cd/src/libstd/sync/once.rs:264
#17 0x00007f14497b6db9 in ldpreload::readlink_local::get::hc45eb880fee9d18f (self=0x7f14498379f4) at src/lib.rs:70
#18 0x00007f14497b7488 in ldpreload::readlink_local::readlink::$u7b$$u7b$closure$u7d$$u7d$::h8cc84822c0f8be3c () at src/lib.rs:83
#19 0x00007f14497b7b40 in core::option::Option$LT$T$GT$::unwrap_or_else::hb73d717eb9eefcaa (self=..., f=...) at /rustc/8d69840ab92ea7f4d323420088dd8c9775f180cd/src/libcore/option.rs:428
#20 0x00007f14497b6e93 in readlink (path=0x56401d023e2d "/etc/malloc.conf", buf=0x7ffde2568650 "", bufsiz=4096) at src/lib.rs:79
#21 0x000056401cfeee12 in malloc_conf_init () at ../jemalloc/src/jemalloc.c:913
#22 malloc_init_hard_a0_locked () at ../jemalloc/src/jemalloc.c:1281
#23 0x000056401cfe8f4f in malloc_init_hard () at ../jemalloc/src/jemalloc.c:1517
#24 malloc_init () at ../jemalloc/src/jemalloc.c:217
#25 imalloc (sopts=<optimized out>, dopts=<optimized out>) at ../jemalloc/src/jemalloc.c:1986
#26 malloc (size=size@entry=72704) at ../jemalloc/src/jemalloc.c:2038
#27 0x00007f14420dfae0 in pool (this=0x7f1444c4eca0 <(anonymous namespace)::emergency_pool>) at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_alloc.cc:117
#28 __static_initialization_and_destruction_0 (__priority=65535, __initialize_p=1) at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_alloc.cc:244
#29 _GLOBAL__sub_I_eh_alloc.cc(void) () at ../../../../gcc-5.5.0/libstdc++-v3/libsupc++/eh_alloc.cc:307
#30 0x00007f1449a828f3 in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#31 0x00007f1449a7415a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#32 0x0000000000000001 in ?? ()
#33 0x00007ffde256a2af in ?? ()
#34 0x0000000000000000 in ?? ()
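Reading the backtrace above from the bottom up: frames #21 to #24 show jemalloc still inside malloc_init_hard when it reads /etc/malloc.conf through readlink, and frames #6 to #11 show dlsym calling calloc, which enters malloc_init_hard again and waits on init_lock. So a hook that runs at that point presumably must not allocate at all. Below is a rough sketch of such a path using the raw syscall through libc's `syscall` wrapper. `raw_readlink` is a hypothetical name, the syscall numbers cover Linux on x86_64 and aarch64 only, and readlinkat with AT_FDCWD is used in place of readlink because aarch64 has no plain readlink syscall.

```rust
use std::os::raw::{c_char, c_long};

// `unsafe extern` needs Rust 1.82 or newer; on older toolchains drop
// the leading `unsafe`.
unsafe extern "C" {
    // libc's generic syscall(2) wrapper; std already links libc on Linux.
    fn syscall(number: c_long, ...) -> c_long;
}

// readlinkat(2) syscall numbers from the Linux syscall tables.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
const SYS_READLINKAT: c_long = 267;
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
const SYS_READLINKAT: c_long = 78;

// Special dirfd value meaning "resolve relative to the current directory".
const AT_FDCWD: c_long = -100;

/// Hypothetical allocation-free fallback: asks the kernel directly,
/// without dlsym, Once, or any heap use. Like readlink(2), it returns
/// the number of bytes placed in `buf`, or -1 with errno set.
pub unsafe fn raw_readlink(path: *const c_char, buf: *mut c_char, bufsiz: usize) -> isize {
    unsafe { syscall(SYS_READLINKAT, AT_FDCWD, path, buf, bufsiz) as isize }
}
```

The idea would be to take this path whenever the hook cannot be sure that the allocator has finished initializing, and only go through dlsym afterwards.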
The test code is as follows:
extern crate core;
extern crate libc;
#[macro_use]
extern crate ctor;
use libc::{c_char, c_int, c_void, size_t, ssize_t};
use std::sync::atomic;
#[cfg(any(target_os = "macos", target_os = "ios"))]
pub mod dyld_insert_libraries;
/* Some Rust library functionality (e.g., jemalloc) initializes
 * lazily, after the hooking library has inserted itself into the call
 * path. If the initialization uses any hooked functions, this will lead
 * to an infinite loop. Work around this by running some initialization
 * code in a static constructor, and bypassing all hooks until it has
 * completed. */
static INIT_STATE: atomic::AtomicBool = atomic::AtomicBool::new(false);

pub fn initialized() -> bool {
    INIT_STATE.load(atomic::Ordering::SeqCst)
}
// extern "C" fn initialize() {
//     Box::new(0u8);
//     INIT_STATE.store(true, atomic::Ordering::SeqCst);
// }
// /* Rust doesn't directly expose __attribute__((constructor)), but this
//  * is how GNU implements it. */
// #[link_section = ".init_array"]
// pub static INITIALIZE_CTOR: extern "C" fn() = ::initialize;
#[ctor]
fn initialize() {
    // Intended to force lazy allocator initialization before hooks are enabled.
    Box::new(0u8);
    INIT_STATE.store(true, atomic::Ordering::SeqCst);
    println!("Constructor");
}
#[link(name = "dl")]
extern "C" {
    fn dlsym(handle: *const c_void, symbol: *const c_char) -> *const c_void;
}

const RTLD_NEXT: *const c_void = -1isize as *const c_void;

/// Looks up the next definition of `symbol` after this library.
/// `symbol` must be NUL-terminated, since it is passed to C as-is.
pub unsafe fn dlsym_next(symbol: &'static str) -> *const u8 {
    let ptr = dlsym(RTLD_NEXT, symbol.as_ptr() as *const c_char);
    if ptr.is_null() {
        panic!("redhook: Unable to find underlying function for {}", symbol);
    }
    ptr as *const u8
}
#[allow(non_camel_case_types)]
pub struct readlink {
    __private_field: (),
}

#[allow(non_upper_case_globals)]
static readlink: readlink = readlink { __private_field: () };

impl readlink {
    // Resolve the real readlink via dlsym(RTLD_NEXT) once and cache it.
    fn get(&self) -> unsafe extern fn (path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t {
        use ::std::sync::Once;
        static mut REAL: *const u8 = 0 as *const u8;
        static mut ONCE: Once = Once::new();
        unsafe {
            ONCE.call_once(|| {
                REAL = dlsym_next(concat!("readlink", "\0"));
            });
            ::std::mem::transmute(REAL)
        }
    }

    // Exported hook: run my_readlink once initialized; otherwise, or if
    // my_readlink panics, fall through to the real readlink.
    #[no_mangle]
    pub unsafe extern "C" fn readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t {
        println!("readlink");
        if initialized() {
            println!("initialized");
            ::std::panic::catch_unwind(|| my_readlink(path, buf, bufsiz)).ok()
        } else {
            println!("not initialized");
            None
        }.unwrap_or_else(|| readlink.get()(path, buf, bufsiz))
    }
}
pub unsafe fn my_readlink(path: *const c_char, buf: *mut c_char, bufsiz: size_t) -> ssize_t {
    println!("my_readlink");
    readlink.get()(path, buf, bufsiz)
}
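One more thing that may matter in the test code above: `println!` goes through std's buffered, locked stdout, and as far as I can tell that buffer is heap-allocated on first use, so printing from inside the hook could itself call into malloc while jemalloc is still initializing. Here is a minimal sketch of a trace helper that avoids this by calling write(2) on stderr directly. `trace` is a hypothetical name, and if write is itself among the intercepted functions it would need the same raw-syscall treatment.

```rust
use std::os::raw::{c_int, c_void};

// `unsafe extern` needs Rust 1.82 or newer; on older toolchains drop
// the leading `unsafe`.
unsafe extern "C" {
    // write(2) from libc, which std already links.
    fn write(fd: c_int, buf: *const c_void, count: usize) -> isize;
}

/// Hypothetical best-effort trace output to stderr (fd 2).
/// Takes a plain byte string, so there is no formatting, no buffering
/// and no heap use on this path. Returns what write(2) returned.
pub fn trace(msg: &[u8]) -> isize {
    unsafe { write(2, msg.as_ptr() as *const c_void, msg.len()) }
}
```

With this, `println!("readlink")` in the hook would become `trace(b"readlink\n")`.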