Summary
Make the Rust allocator callable from C.
Motivation
When using both C/C++ and Rust in the same project, it can be necessary to create Rust Boxes, Vecs and Strings from the C/C++ side. As the Rust code may use an allocator other than malloc, it is not allowed to use malloc on the C/C++ side. Instead the Rust allocator needs to be called. Currently this requires everyone to individually create wrapper functions on the Rust side that export a C interface. It would be much easier to allow directly using the Rust allocator from C/C++.
Guide-level explanation
It becomes possible to allocate and free memory using the Rust allocator in C/C++ code linking to Rust code. This is useful when Rust and C/C++ code interact heavily. For example:
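#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
void *_RNvNtC4rust5alloc5alloc(size_t, size_t);
bool rust_box_is_42(uint32_t *);
bool box_is_42(uint32_t val) {
    uint32_t *box_ptr = _RNvNtC4rust5alloc5alloc(sizeof(uint32_t), alignof(uint32_t));
    *box_ptr = val; // Initialize the allocation before handing it to Rust.
    return rust_box_is_42(box_ptr); // Rust takes ownership and frees the box.
}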
#[no_mangle]
unsafe extern "C" fn rust_box_is_42(box_ptr: *mut u32) -> bool {
    // `Box::from_raw` would not be allowed if `box_ptr` was allocated using `malloc`.
    let box_ = Box::from_raw(box_ptr);
    *box_ == 42
}
Or the other way around:
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
void _RNvNtC4rust5alloc7dealloc(void *, size_t, size_t);
uint32_t *get_rust_boxed_num(void);
bool box_is_42(void) {
    uint32_t *box = get_rust_boxed_num();
    bool is_42 = *box == 42;
    // It would not be valid to free `box` using `free`.
    _RNvNtC4rust5alloc7dealloc(box, sizeof(uint32_t), alignof(uint32_t));
    return is_42;
}
#[no_mangle]
extern "C" fn get_rust_boxed_num() -> *const u32 {
    Box::into_raw(Box::new(rand()))
}
It is not allowed to override (or define) any of the allocator symbols yourself, whether in C or Rust. They must always be defined by rustc through either #[global_allocator] or libstd.
Reference-level explanation
The following functions, which forward to the global Rust allocator, will be exported:
#[linkage_name = "_RNvNtC4rust5alloc5alloc"]
extern "C" fn alloc(size: usize, align: usize) -> *mut u8;
#[linkage_name = "_RNvNtC4rust5alloc11alloc_zeroed"]
extern "C" fn alloc_zeroed(size: usize, align: usize) -> *mut u8;
#[linkage_name = "_RNvNtC4rust5alloc7realloc"]
extern "C" fn realloc(ptr: *mut u8, size: usize, align: usize) -> *mut u8;
#[linkage_name = "_RNvNtC4rust5alloc7dealloc"]
extern "C" fn dealloc(ptr: *mut u8, size: usize, align: usize);
These functions directly correspond to the respective methods on GlobalAlloc. All functions have the same safety invariants as the corresponding methods on GlobalAlloc. In addition, the given size and align must satisfy the requirements of Layout::from_size_align.
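To make the size and align requirements concrete, here is a minimal sketch of a defensive variant that validates its arguments before forwarding to the global allocator. The names checked_rust_alloc and checked_rust_dealloc are hypothetical and not part of this RFC; the functions proposed here would instead leave these checks to the caller. A real export would also carry #[no_mangle], as in the examples above.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical checked allocation (not part of the RFC).
/// Returns null instead of hitting undefined behaviour when `size` and
/// `align` do not form a valid, non-zero-sized `Layout`.
pub unsafe extern "C" fn checked_rust_alloc(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        // `GlobalAlloc::alloc` additionally requires a non-zero size.
        Ok(layout) if layout.size() != 0 => unsafe { alloc(layout) },
        // `align` was zero or not a power of two, the size overflowed
        // when rounded up to `align`, or the size was zero.
        _ => std::ptr::null_mut(),
    }
}

/// Hypothetical checked deallocation (not part of the RFC).
/// `size` and `align` must match the values used for the allocation.
pub unsafe extern "C" fn checked_rust_dealloc(ptr: *mut u8, size: usize, align: usize) {
    if ptr.is_null() {
        return;
    }
    if let Ok(layout) = Layout::from_size_align(size, align) {
        unsafe { dealloc(ptr, layout) }
    }
}
```

With this sketch, a request such as size 4 and align 3 is rejected with a null pointer, whereas passing the same arguments to the unchecked functions proposed here would violate their safety invariants.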
These functions are directly visible to all code statically linked into a Rust executable or shared library that depends on liballoc, and to all code that links to either libstd.so or a shared library that contains #[global_allocator]. They will always be available at runtime when liballoc is linked in, but it may be necessary to tell the linker that it is fine if it can't immediately find the allocator symbols, for example using --undefined.
This RFC can be implemented by renaming the methods of the allocator shim and then adding documentation that the signatures of the functions in the allocator shim must not change.
Drawbacks
Why should we not do this?
Rationale and alternatives
The signatures of these functions are already effectively stable, as they correspond 1-to-1 with methods on the stable GlobalAlloc trait.
The OOM handling functions could also be stably exported. However, defining the OOM handler isn't stable yet, and several open design choices there would affect the exported C API, so this RFC does not propose to export a C API for the OOM handler.
It would be possible to not export these functions. This is the status quo. However, it requires everyone to write their own wrapper functions.
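For comparison, the status quo means each project writes wrappers along these lines. This is only a sketch: the function names are hypothetical, and a real crate would also mark both functions #[no_mangle] (as in the guide-level examples) so that C code can link to them.

```rust
/// Hypothetical per-project wrapper: allocate a `Box<u32>` for the C side.
pub extern "C" fn my_project_box_u32_new(val: u32) -> *mut u32 {
    Box::into_raw(Box::new(val))
}

/// Hypothetical per-project wrapper: free a pointer previously returned by
/// `my_project_box_u32_new`. Passing any other non-null pointer is
/// undefined behaviour.
pub unsafe extern "C" fn my_project_box_u32_free(ptr: *mut u32) {
    if !ptr.is_null() {
        // Rebuild the box so that dropping it releases the allocation
        // through the Rust allocator.
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

A pair like this is needed for every type that crosses the boundary, which is the duplication that exporting the allocator functions directly would avoid.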
Prior art
C/C++ allow directly calling the global allocator from other languages using malloc and free. Many "managed" languages like Python or Java also allow directly using the global allocator to allocate native objects using a C API.
Unresolved questions
What name should be used? @eddyb suggested the names used in this RFC in https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/upstreaming.20LLVM.20patch.20for.20__rust_.20alloc.20functions/near/247438389 such that backtraces look nice and it is clear that the functions are coming from Rust and not C or C++. It would also be possible to use names like __rust_alloc and __rust_dealloc, or __rustalloc_alloc and __rustalloc_dealloc. These have the advantage of being easier to memorize and looking less weird.
Should the extern "C" or extern "C-unwind" ABI be used? In other words, should these functions be allowed to unwind in the future?
How do we ensure that --undefined is never necessary to allow linking to succeed?
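As background for the ABI question above, the following sketch shows what "allowed to unwind" means in practice. The function may_unwind is hypothetical and unrelated to the allocator; it only illustrates that a panic may propagate out of an extern "C-unwind" function, assuming the program is built with the default panic=unwind strategy.

```rust
use std::panic;

/// Hypothetical function (not part of the RFC) declared with the
/// "C-unwind" ABI, so a panic may propagate out to its caller.
pub extern "C-unwind" fn may_unwind(fail: bool) -> u32 {
    if fail {
        panic!("simulated failure");
    }
    42
}

/// Returns true if calling `may_unwind(fail)` unwound with a panic.
pub fn unwound(fail: bool) -> bool {
    panic::catch_unwind(move || may_unwind(fail)).is_err()
}
```

If the allocator functions were declared extern "C" instead, a panic could not cross that boundary as an unwind, which is why the choice constrains whether these functions may ever unwind.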
Future possibilities
None that I know of.