RFC 2884 proposes a design for allowing functions to return unsized or large types directly on the heap. It does this by allowing callers to provide a pointer to an allocation of the right size to the callee.
With the current design of the RFC, functions that return unsized types are transformed into a generator. When called for the first time, the generator returns the size of the allocation it needs to the caller. When called the second time with a pointer to said allocation, it writes out its return value to that pointer. Unfortunately, this generator-based design imposes two limitations on functions that return unsized types:
- They can't be converted into a function pointer.
- They can't store their unsized return value on their stack using alloca.
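To make the two-call protocol described above concrete, here is a rough sketch on stable Rust that models the generator by hand. The names `ReturnUnsizedState`, `layout`, `write_into`, and `box_return_unsized` are hypothetical and do not come from the RFC; the real design would have the compiler generate the state machine.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr;

/// Hypothetical stand-in for the compiler-generated state of
/// `fn return_unsized(a: u32) -> [u32]` under the generator-based design.
struct ReturnUnsizedState {
    a: u32,
}

impl ReturnUnsizedState {
    /// First call: report the layout and slice length the caller must provide.
    fn layout(&self) -> (Layout, usize) {
        (Layout::array::<u32>(3).unwrap(), 3)
    }

    /// Second call: write the return value into the caller's allocation.
    ///
    /// # Safety
    /// `dest` must be valid for writes of the layout reported by `layout`.
    unsafe fn write_into(self, dest: *mut u8) {
        unsafe { ptr::write(dest as *mut [u32; 3], [self.a; 3]) }
    }
}

/// Caller side: drive both steps and wrap the result in a `Box`.
fn box_return_unsized(a: u32) -> Box<[u32]> {
    let state = ReturnUnsizedState { a };
    let (layout, len) = state.layout();
    unsafe {
        let raw = alloc(layout);
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        state.write_into(raw);
        Box::from_raw(ptr::slice_from_raw_parts_mut(raw as *mut u32, len))
    }
}
```

In this model the function is really a piece of state plus two entry points, which gives some intuition for why it cannot be represented as a single plain function pointer.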
There is an alternative desugaring for functions that return unsized values that solves both of these issues. Instead of compiling to a generator, a function that returns an unsized value gets an extra implicit "emplacer" parameter, which is a closure that tells it how to construct the allocation it needs. For example,
fn return_unsized(a: u32) -> [u32] {
    [a, a, a]
}
Becomes, approximately:
fn return_unsized(a: u32, emplacer: dyn FnOnce(Layout, <[u32] as Pointee>::Metadata, dyn FnOnce(*mut ()))) {
    emplacer(
        Layout::from_size_align(mem::size_of::<u32>() * 3, mem::align_of::<u32>()).unwrap(),
        3,
        |dest| unsafe { ptr::write(dest as *mut [u32; 3], [a, a, a]) },
    )
}
This allows return_unsized to construct the allocation for its unsized value without ever yielding to its caller, obviating the need for a generator.
The emplacer closure could be passed in via one of the various "context" proposals that people have suggested.
A more complete desugaring:
fn return_unsized(a: u32) -> [u32] {
    [a, a, a]
}

fn main() {
    // Function with unsized return can coerce to function pointer!
    let fn_ptr: fn(u32) -> [u32] = return_unsized;
    dbg!(Box::new_with::<[u32]>(|| fn_ptr(1)));
}
would desugar to something like this:
#![feature(ptr_metadata)]

extern crate alloc;

use alloc::alloc::{alloc, Layout};
use core::mem::{self, MaybeUninit};
use core::ptr::{self, Pointee};

fn box_new_with<T: ?Sized>(
    unsized_ret: impl FnOnce(dyn FnOnce(Layout, <T as Pointee>::Metadata, dyn FnOnce(*mut ()))),
) -> Box<T> {
    let mut uninit_box = MaybeUninit::uninit();
    unsized_ret(|layout, meta, closure| {
        let box_ptr = unsafe { alloc(layout) as *mut () };
        closure(box_ptr);
        let init_box = unsafe { Box::from_raw(ptr::from_raw_parts_mut(box_ptr, meta)) };
        uninit_box.write(init_box);
    });
    return unsafe { uninit_box.assume_init() };
}

fn return_unsized(
    a: u32,
    emplacer: dyn FnOnce(Layout, <[u32] as Pointee>::Metadata, dyn FnOnce(*mut ())),
) {
    // a more complicated function could do stuff here
    // call to "emplacer" must be last thing `return_unsized` does
    emplacer(
        Layout::from_size_align(mem::size_of::<u32>() * 3, mem::align_of::<u32>()).unwrap(),
        3,
        |dest| unsafe { ptr::write(dest as *mut [u32; 3], [a, a, a]) },
    )
}

fn main() {
    let fn_ptr: fn(
        u32,
        dyn FnOnce(Layout, <[u32] as Pointee>::Metadata, dyn FnOnce(*mut ())),
    ) = return_unsized;
    dbg!(box_new_with::<[u32]>(|e| fn_ptr(1, e)));
}
(Playground for closest equivalent that works on today's Rust)
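For readers who want something that compiles on stable Rust, the following is one possible approximation of the desugaring; it is a sketch and not necessarily what the linked playground contains. It makes several assumptions: by-value `dyn FnOnce` parameters are replaced with `&mut dyn FnMut` (so the "called exactly once" guarantee is only a convention here), the pointer metadata is hard-coded as the `usize` length of a `[u32]` instead of going through `Pointee`, and the `Emplacer` alias and `demo` function are hypothetical names introduced for illustration.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr;

/// Stand-in for the implicit emplacer parameter. The `usize` is the slice
/// length (the pointer metadata of `[u32]`), and the inner closure writes the
/// value to the destination pointer it is given.
type Emplacer<'a> = &'a mut dyn FnMut(Layout, usize, &mut dyn FnMut(*mut u8));

fn return_unsized(a: u32, emplacer: Emplacer<'_>) {
    // The call to the emplacer is the last thing the function does.
    emplacer(
        Layout::array::<u32>(3).unwrap(),
        3,
        &mut |dest: *mut u8| unsafe { ptr::write(dest as *mut [u32; 3], [a, a, a]) },
    );
}

/// Heap emplacer specialised to `[u32]`. It trusts the callee to initialise
/// exactly `len` elements, and assumes a non-zero-sized layout.
fn box_new_with(unsized_ret: impl FnOnce(Emplacer<'_>)) -> Box<[u32]> {
    let mut result: Option<Box<[u32]>> = None;
    unsized_ret(&mut |layout: Layout, len: usize, write| {
        assert!(layout.size() != 0, "sketch only handles non-empty values");
        let raw = unsafe { alloc(layout) };
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        // Let the callee write its return value into the fresh allocation.
        write(raw);
        let slice_ptr = ptr::slice_from_raw_parts_mut(raw as *mut u32, len);
        result = Some(unsafe { Box::from_raw(slice_ptr) });
    });
    result.expect("emplacer was never called")
}

fn demo() -> Box<[u32]> {
    // The desugared function is an ordinary function, so it coerces to a
    // function pointer.
    let fn_ptr: fn(u32, Emplacer<'_>) = return_unsized;
    box_new_with(|e| fn_ptr(1, e))
}
```

Because the emplacer is just an extra argument, the caller decides where the value lives, and `return_unsized` never needs to suspend and resume.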
When called without an implicit emplacer, functions that return unsized types would use a default emplacer provided by Rust that allocates the unsized value at the end of the callee's stack, and then ensures it is copied back to the caller's stack when the function returns. This may require functions that return unsized values to have a different calling convention. The special "magic alloca emplacer" could also be exposed as an intrinsic that other emplacers can call.
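Stable Rust has no alloca, so the default emplacer cannot be written directly today. As a loose illustration only, the sketch below swaps in a different technique: a fixed-capacity buffer in the caller's frame, with the value handed to a continuation instead of being copied back on return. The `Emplacer` alias, `with_stack_slot`, and the capacity `CAP` are all hypothetical and chosen for this example.

```rust
use std::alloc::Layout;
use std::mem::{self, MaybeUninit};
use std::{ptr, slice};

/// Same stand-in emplacer shape as a stable-Rust approximation would use:
/// layout, slice length, and a closure that writes to a destination pointer.
type Emplacer<'a> = &'a mut dyn FnMut(Layout, usize, &mut dyn FnMut(*mut u8));

fn return_unsized(a: u32, emplacer: Emplacer<'_>) {
    emplacer(
        Layout::array::<u32>(3).unwrap(),
        3,
        &mut |dest: *mut u8| unsafe { ptr::write(dest as *mut [u32; 3], [a, a, a]) },
    );
}

/// Hypothetical stack emplacer: places the `[u32]` in a fixed-size local
/// buffer (NOT a real alloca) and passes it to `consume`.
fn with_stack_slot<R>(
    unsized_ret: impl FnOnce(Emplacer<'_>),
    consume: impl FnOnce(&[u32]) -> R,
) -> R {
    const CAP: usize = 16; // assumed upper bound on the slice length
    let mut slot = [MaybeUninit::<u32>::uninit(); CAP];
    let mut len = None;
    unsized_ret(&mut |layout: Layout, n: usize, write| {
        // Reject anything that does not fit the fixed buffer.
        assert!(layout.align() <= mem::align_of::<u32>());
        assert!(layout.size() <= mem::size_of::<[u32; CAP]>());
        assert_eq!(layout.size(), n * mem::size_of::<u32>());
        write(slot.as_mut_ptr() as *mut u8);
        len = Some(n);
    });
    let n = len.expect("emplacer was never called");
    // The sketch trusts the callee to have initialised the first `n` elements.
    let values = unsafe { slice::from_raw_parts(slot.as_ptr() as *const u32, n) };
    consume(values)
}
```

A real implementation would size the slot dynamically and would likely need the calling-convention support mentioned above; the fixed capacity here only keeps the example within safe stack usage on stable Rust.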