Out argument for efficient initialisation

Object construction in C++

A constructor in C++ initialises its object in an already allocated memory location pointed to by the implicit and hidden this pointer. This design decision allows for a number of niceties:

  • The same constructor function can be used for direct stack allocation, direct heap allocation, or other in-place allocation with an explicit address.
  • Copy elision in assignment (auto foo = Foo{5}) is also possible.

Return-value optimisation (for unnamed temporary values in the return statement) and named return-value optimisation (for other locally constructed objects returned by value) are also supported in C++, and at least made easier to implement thanks to this design.

Because the final object is already allocated beforehand an arbitrary deeply nested object construction will not produce a cascade of copy operations propagating bottom-up. Each memory location is written to once, and possibly in linear order, not repeatedly copied layer by layer up the composition hierarchy.

An opportunity for Rust

Hidden out argument

It would be nice if something similar to this could be supported and standardised in Rust. If Rust supported some sort of out argument then the same optimisations and functionality could be implemented. The out argument would be similar to a normal mutable pointer but with the assumption that the variable is uninitialised. Each individual field in that object would, by construction, be initialised:

fn new(a: i32) -> Foo {
    Foo {
        x: a,
        y: Bar::new(a),
    }
}

into:

fn new2(x: i32, out: *mut Foo) {
    unsafe {
        std::ptr::write(out.x, a);
        Bar::new2(a, &mut out.y);
    }
}

RVO transformations

Any function that returns by value is eligible for the hidden out-argument optimisation. Instead of returning the value the function is transformed to write into the supplied memory location. Primitives are simply written to the memory location. Because the value of the function is written to the provided place it no longer needs to be “returned”. This kind of transformation should be able to deal with both named and unnamed return-value optimisation, as well as copy elision in assignments.

Fallible and infallible initialisations

A constructor in C++ may only fail by throwing an exception. This trick is of course used to hide the unhappy path. A more rusty way could be to look for an instance of the Try trait. For example:

fn try_new(a: i32) -> Option<Foo>

could be transformed into:

fn try_new2(a: i32, place: *mut Foo) -> Option<()>

The call-side would be unchanged:

let foo = Foo::try_new(5)?;

let foo = box Foo::try_new(5)?;

container.push_into(|| Foo::try_new(5)?);

Or perhaps an explicit choice would have to be made?

let foo <- Foo::try_new(5)?;

Seems like something that would fit well with an &in reference type for (at least partial) compiler checked memory safety (or &out, I’m of the opinion that the linked descriptions are backwards, while they make sense on their own I believe using the type as part of the documentation for function signatures makes more sense).

I believe we already RVO every return type larger than a pointer.

2 Likes

If the RVO is indeed guaranteed then transformations like these are probably more greasy than they are useful. There is just one other feature in this sketch, however, that I still think is interesting or at least unusual:

The failure/success information (encoded by the type implementing Try) is separated from the constructed / initialised value.

If you construct a complex object, and any part might fail, then instead of wrapping each part in its own Option<Part> or Result<Part,_> the whole composition is RVOed as one compact entity.

Option and Result are both super nice, but they also risk unwanted cohesion when aggregating objects.

It didn't take long for a new design to spring up in the space left by placement! Readers interested in additional context may want to look at the summary of the last take of placement: https://github.com/rust-lang/rust/issues/27779#issuecomment-378416911

What you describe sounds a little like one of the strawman alternatives brought up in that comment:

A call to a fn() -> T can be satisfied by a fn(&uninit T) function of the same name (allowing you to assign fields directly in the function body via the uninit reference)

The direction of faillible construction sounds reasonable. There may be some interesting interactions with adapters and the exact mechanism that makes it work would need a little thought I believe. One note I have from previous thoughts as an unresolved question - what if you want to return an Option<(T1, T2)>, but want to place each of them in different locations? It also seems a little sad that we're limited to result-like types and not wrapper types in general.

One thing I have previously liked about this approach is that it lets you send off fields of a struct to be placement initialized as well (e.g. place_field_a(&uninit x.a)), though the same thing is trickier with raw pointer types because you have to do offset calculation yourself, which is a bit of a pain.

My vague preference has been for new pointer types (over raw) just because of the additional conveniences, like not forgetting to initialise fields.

Another previous unresolved question is around placement initialisation of enums, particularly placing their fields. I'm not sure how (if at all) this can be supported.

1 Like

What you describe sounds a little like one of the strawman alternatives brought up in that comment: A call to a fn() -> T can be satisfied by a fn(&uninit T) function of the same name (allowing you to assign fields directly in the function body via the uninit reference)

Yes similar, but why is it a straw man?

One note I have from previous thoughts as an unresolved question - what if you want to return an Option<(T1, T2)>, but want to place each of them in different locations?

If we forget about the implicit transformations that I mentioned and make all of this explicit then tuples can be initialised in-place (at different locations) by using more pointers, one for each item:

fn init(t1: *mut T1, t2: *mut T2) -> Option<()>

It also seems a little sad that we’re limited to result-like types and not wrapper types in general.

Any type implementing Try could be used for error handling.

trickier with raw pointer types because you have to do offset calculation yourself

&mut x.a still works with raw pointers.

My vague preference has been for new pointer types (over raw) just because of the additional conveniences, like not forgetting to initialise fields.

Yes, mine too, maybe &out? The *mut was just a conservative sketch.

Another previous unresolved question is around placement initialisation of enums, particularly placing their fields. I’m not sure how (if at all) this can be supported.

I don't see a technical reason for why we couldn't init enums if we wanted to go this path of initialiser functions with &out pointers.

I think that this is why _0 (the return value) in MIR is considered owned by the caller, and thus why one can just write to _0 to set the return value. This optimization is part of what the following PR was targeting, iirc:

Sadly it hasn't made it in yet...

2 Likes

Thanks, @scottmcm! I didn’t know.

This solves many cases of RVO, and infallible initialisation for placement new using the same trick!

The only thing that Rust cannot do now is RVO for fallible constructors, as I mentioned earlier, because the return value has to be wrapped in a Try impl.

See also: my strawman proposal from the other thread, for a trait that would allow function call arguments to be constructed in place, as a supplement to guaranteed copy elision for assignments, returns, etc., as proposed here. This would allow plain old container.push(Foo::try_new(5)?); to do the right thing.

In that context it was just a simple example, I didn't want people discussing it as if it were a full proposal.

Sure, but I think one of the hard questions about placement is to try and make it ergonomic to do things that don't touch the stack. If it's a stated limitation of the proposal then that's fine, but I don't think saying "you can always use raw pointers" is relevant without additional notes on how it relates to the RFC.

I referring to arbitrary other wrapper types like Either that aren't necessarily about errors. Again, it's not necessarily a bad thing that it's not handled, but one of the failings I observed about the last proposal was that it wasn't explicit enough about things it can't do, so getting all this up front is nice.

I'm interested in solving this:

enum X { A([u8; 1024], [u8; 1024]) }
let x: &uninit X = ...;
x <- X::A(foo(), bar());

How can I know that foo and bar are constructing their return values into the correct place? The compiler certainly may do what I want, but that's not sufficient for placement. With normal structs this is easier because you can actually take a reference to their fields, but you can't do that with enum fields without match...except it's uninitialized, so match isn't allowed - you could imagine a new language construct that looks a bit like match (but for construction) to solve this though.

let X::A(ref mut v, ref mut w) = unsafe {
    X::A(std::mem::uninitialized(), std::mem::uninitialized())
};

*v = foo();
*w = bar();

I’m puzzled by this thread. What is being proposed here? I see roughly three options, but none of them are very convincing to me:

  • Is the proposal to let people manually write out all functions that want to return large values with out arguments? That would be a huge pain, very repetitive and error prone, and also penalizes callers that don’t need placement — even if it’s done via special new reference types (which have their own large complexity cost).
  • Is the proposal that the compiler desugaring placement to such functions at AST/HIR level? Why would we do that, instead of just doing proper move propagation and performing RVO and other “ABI optimizations” in MIR and codegen?
  • Maybe this is not supposed to be a specific implementation strategy? But rather a thinking tool? Then it doesn’t seem to really help me with thinking about the hard questions of how to expose ergonomic guaranteed placement to users. It’s already quite clear what codegen we want, the question is how to get there.
5 Likes

I too am confused by this proposal.

I started this thread because I wanted to compare constructors in C++ with initialiser functions and RVO in Rust.

At that time I did not know how RVO was implemented in Rust, which is why I came up with a sketch myself (the caller supplies the initialiser function with a pointer to the data to be initialised in-place).

I wanted to highlight that even with RVO as currently designed, there’s a difference between C++ and Rust.
C++ can be more performant because its constructors can do in-place initialisation and handle errors (with exceptions). In Rust, wrapping return values in, say Result<>, makes RVO impossible in some situations, thus Rust cannot compete with C++ in this regard.

There is nothing about Rust or the return type Result<T, E> that prevents efficiently placing of T into some location and communicating errors out-of-band. It's just not implemented yet, but see Result-based error handling optimizes very poorly for large payloads · Issue #42047 · rust-lang/rust · GitHub

2 Likes

Ah, beautiful! :smile:

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.