Returning a large type into a `Box` without copies

pubfnbar · July 12, 2025, 9:16pm

I've been under the impression that rustc tends to return large types from functions by allocating the space for it before the function call, passing the address of the allocated memory to the function, and writing the return value inside the function through that pointer.
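Written out by hand, that return-slot convention looks roughly like the sketch below. It is an illustration only: the lowering rustc really performs depends on the target ABI and on optimization, and `foo_into` is a hypothetical name that does not appear in the example further down. The array is kept small here so the slot can safely live on the stack.

```rust
use std::mem::MaybeUninit;

// Deliberately small so the caller's slot fits on the stack.
const N: usize = 1_000;

// Hand-written analogue of "return through a pointer": the caller owns the
// storage and the callee writes the result straight into it.
// `foo_into` is a hypothetical name used only for this sketch.
fn foo_into(slot: &mut MaybeUninit<[u32; N]>) -> &mut [u32; N] {
    let base = slot.as_mut_ptr().cast::<u32>();
    for i in 0..N {
        // SAFETY: `i < N`, so the write stays inside the slot.
        unsafe { base.add(i).write(10) };
    }
    // SAFETY: every element was written by the loop above.
    unsafe { slot.assume_init_mut() }
}

fn main() {
    // The caller reserves the space before the call...
    let mut slot = MaybeUninit::<[u32; N]>::uninit();
    // ...and the callee fills it in place, without a by-value return.
    let arr = foo_into(&mut slot);
    arr[0] = 42;
    println!("{} {}", arr[0], arr[1]); // prints "42 10"
}
```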

In the following example, calling foo_ok and bar_ok suggests that the same thing also happens when the function call expression is passed as the argument to a std::boxed::box_new call: if it didn't, you'd expect both foo_ok and bar_ok to cause a stack overflow, which they don't.

But in the same example code, the slightly different functions foo_bad and bar_bad do cause a stack overflow when used in the same way. Can someone explain this behavior to me? By the way, compiling in debug or release mode doesn't seem to make a difference. You can try the following code on the playground.

#![feature(liballoc_internals)]
#![allow(dead_code)]
#![allow(internal_features)]
#![allow(unused_variables)]

fn foo_ok() -> [u32; 10_000_000] {
    let arr = [10; 10_000_000];
    arr
}

fn foo_bad() -> [u32; 10_000_000] {
    let mut arr = [5; 10_000_000];
    arr[0] = 42;
    arr
}

fn bar_ok(take_five: bool) -> [u32; 10_000_000] {
    if take_five {
        [5; 10_000_000]
    } else {
        [0; 10_000_000]
    }
}

fn bar_bad(take_five: bool) -> [u32; 10_000_000] {
    let arr = if take_five {
        [5; 10_000_000]
    } else {
        [0; 10_000_000]
    };
    arr
}

fn main() {
    {
        let foo = std::boxed::box_new(foo_ok());
        let bar_true = std::boxed::box_new(bar_ok(true));
        let bar_false = std::boxed::box_new(bar_ok(false));

        // Prints 10 5 0
        println!("{} {} {}", foo[0], bar_true[0], bar_false[0]);
    }

    {
        // Any one of the following lines causes stack overflow
        // let foo = std::boxed::box_new(foo_bad());
        // let bar_true = std::boxed::box_new(bar_bad(true));
        // let bar_false = std::boxed::box_new(bar_bad(false));
    }
}

josh · July 13, 2025, 5:23am

There are currently multiple proposals in flight for "initializer expressions" or "placement new", which would allow initializing a new value directly in a Box without putting it on the stack first.

pubfnbar · July 13, 2025, 6:48am

So, it's a simple case of std::boxed::box_new not being fully implemented for all its intended use cases, but it (or some other syntax/function for the same purpose) is being worked on.

the8472 · July 13, 2025, 10:27am

Currently the most reliable method to avoid it is to create an uninit box, write the values into it (one by one for large arrays or slices), and then call assume_init.
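A minimal sketch of that approach, assuming a toolchain where `Box::new_uninit` and `Box<MaybeUninit<T>>::assume_init` are stable (Rust 1.82 or later). The helper name `boxed_array_in_place` and its parameters are made up for illustration; it reproduces what `foo_bad` computes but never materializes the array on the stack.

```rust
use std::mem::MaybeUninit;

const N: usize = 10_000_000;

// Hypothetical helper: builds the array directly in heap memory.
fn boxed_array_in_place(fill: u32, first: u32) -> Box<[u32; N]> {
    // Allocates room for the array on the heap without constructing it.
    let mut uninit: Box<MaybeUninit<[u32; N]>> = Box::new_uninit();
    let base = uninit.as_mut_ptr().cast::<u32>();
    for i in 0..N {
        // SAFETY: `i < N`, so the write stays inside the allocation.
        unsafe { base.add(i).write(fill) };
    }
    // SAFETY: index 0 is in bounds; this overwrites the first element.
    unsafe { base.write(first) };
    // SAFETY: all `N` elements have been initialized above.
    unsafe { uninit.assume_init() }
}

fn main() {
    let foo = boxed_array_in_place(5, 42);
    println!("{} {} {}", foo[0], foo[1], foo.len()); // prints "42 5 10000000"
}
```

Only a pointer ever crosses a function boundary by value here, so the result does not depend on whether the optimizer elides a 40 MB move.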

Ddystopia · July 14, 2025, 5:51pm

Dead while waiting for a language feature

Ddystopia · July 14, 2025, 5:54pm

The most recent work I am aware of is Public view of rust-lang | Zulip team chat, but I haven't yet seen it go anywhere public.

Ddystopia · July 14, 2025, 5:57pm

In my production code, I usually have fn(Pin<&mut Uninit>) -> Pin<&mut Init> signatures, but you lose the Drop and will most likely need some (Initialized) bool in Uninit to ensure the user does not try to reuse the allocation.
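One possible reading of that pattern, as a rough sketch rather than the poster's actual code: a single hypothetical `Slot<T>` type stands in for `Uninit`, `init_with` is an invented method name, and the `initialized` flag plays the role of the bool mentioned above. Unlike the variant described in the post, this sketch restores `Drop` by consulting that flag.

```rust
use std::mem::MaybeUninit;
use std::pin::Pin;

// Hypothetical caller-owned storage; the flag guards against double init.
struct Slot<T> {
    value: MaybeUninit<T>,
    initialized: bool,
}

impl<T> Slot<T> {
    fn new() -> Self {
        Slot { value: MaybeUninit::uninit(), initialized: false }
    }

    // Initializes the slot once and hands back a pinned reference to the value.
    // Note: `make()` still produces its result by value, so this shows the API
    // shape only; it does not by itself avoid a stack temporary.
    fn init_with(self: Pin<&mut Self>, make: impl FnOnce() -> T) -> Pin<&mut T> {
        // SAFETY: the value is never moved out of the pinned slot.
        let this = unsafe { self.get_unchecked_mut() };
        assert!(!this.initialized, "slot already initialized");
        let value = this.value.write(make());
        this.initialized = true;
        // SAFETY: `value` lives inside the pinned slot, which will not move.
        unsafe { Pin::new_unchecked(value) }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        if self.initialized {
            // SAFETY: the flag is only set after the value has been written.
            unsafe { self.value.assume_init_drop() };
        }
    }
}

fn main() {
    let mut slot = std::pin::pin!(Slot::<[u32; 4]>::new());
    let arr = slot.as_mut().init_with(|| [7; 4]);
    println!("{}", arr[0]); // prints "7"
}
```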

scottmcm · July 14, 2025, 7:16pm

BTW, if your actual use case is large arrays, build them up in vectors or with iterator APIs. There's no good reason to use [u8; 10_000_000] ever. (Or, really, even [RealisticNontrivialType; 2_000].)

use std::iter::repeat_n;

let foo: Box<[u8]> = repeat_n(0, 9_999_999).chain([1, 2]).collect();
dbg!(foo.len());

works even in debug mode https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=8438fa1a1436b2a12e78dde9eab67482

Vorpal · July 14, 2025, 7:32pm

I would add a bit of nuance.

In embedded there are use cases for statically allocated arenas and buffers when you don't have dynamic allocation. But that is more likely to be 8 or 16k on the upper end (most likely smaller than that).

When working with microphone sound data over I2S I did need a static buffer for doing DMA of the samples to; that ended up being 8k (treated as a ring buffer of 4x 2k buffers). And since it is a static it won't run into the copying issues as far as I know.

Ddystopia · July 14, 2025, 8:07pm

Framebuffers for ui tend to be 200kb+ sometimes, but you don't usually move them by value

scottmcm · July 14, 2025, 8:08pm

And that's still about 3 orders of magnitude smaller than the type I said.

There's absolutely a place for types like [u8; 1024] or even [u16x8; 256].

But if you're doing megabytes at a time, did you really need the array type, especially outside of a box?

Vorpal · July 14, 2025, 8:50pm

@Ddystopia mentioned frame buffers. That does seem like a relevant use case. Same for textures etc. (Of course frame buffers on embedded are a bit smaller than on the latest GPU. I have not worked with graphics on embedded so I don't know the actual values. But their 200k doesn't seem unreasonable. And as they said: you don't move them by value for obvious reasons.)

And "outside box" by ref or mut ref is the norm on embedded of course. Not because alloc is expensive (it is, but if you are dealing with MB anyway, it isn't that bad). But without an MMU you are dealing with physical addresses, which makes it much harder to deal with memory fragmentation, for example. You definitely don't want to waste any of your precious RAM.

And without an MMU you lack the safety net of an OS, so traditionally in C if you don't allocate and free, you can't have use-after-free (much less of a concern in Rust of course).

Ddystopia · July 16, 2025, 9:43am

Oh, turns out format_args! can now be stored in a let binding, and it makes use of super let!

github.com/rust-lang/rust

Auto merge of #140748 - m-ou-se:super-format-args3, r=jdonszelmann

committed 07:13PM - 19 Jun 25 UTC

bors

+296 -392

Allow storing `format_args!()` in variable

Fixes https://github.com/rust-lang/r…ust/issues/92698

Tracking issue for super let: https://github.com/rust-lang/rust/issues/139076

Tracking issue for format_args: https://github.com/rust-lang/rust/issues/99012

This change allows:

```rust
let name = "world";
let f = format_args!("hello {name}!"); // New: Store format_args!() for later!
println!("{f}");
```

This will need an FCP.

This implementation makes use of `super let`, which is unstable and might not exist in the future in its current form. However, it is entirely reasonable to assume future Rust will always have _a_ way of expressing temporary lifetimes like this, since the (stable) `pin!()` macro needs this too. (This was also the motivation for merging https://github.com/rust-lang/rust/pull/139114.)

(This is a second version of https://github.com/rust-lang/rust/pull/139135)

mattfbacon · July 19, 2025, 5:07pm

No one actually answered the question... I am also interested in why the placement init works sometimes but not others. It should not have to be explicit because that would violate Separation of Concerns: explicit placement init is (should be) reserved for manual optimisation and IMO is not related to the original question.

Ddystopia · July 19, 2025, 6:57pm

Rust is generally really bad at optimizing moves... It is a real problem on embedded, where the stack is limited, while Rust can easily use 3-10 times more stack than needed.
