[lang-team-minutes] feature status report: placement in and box

I think without concrete uses in, say, libcollections, this feature should not be established.

This may be a foolish question, but why is new syntax even needed in the first place? My understanding is that the purpose of box EXPR is to force an optimization that will occur anyway in release mode. Why not add an attribute that can be specified on Box::new, Vec::push, etc. that forcibly inlines the argument?

For box EXPR/Box::new, that may be possible, but it would have to be a rather ad-hoc, magical attribute. And it would presumably come with some weird rules for the implementation of new to guarantee that the optimization actually applies (remember that this is intended to apply not just to Box but also to Rc, Arc, and third-party types). So I doubt whether that feature would actually be easier to implement or less of an addition to the language. Then there’s the fact that box EXPR is simply more ergonomic, and nicely matches box patterns, which are desirable too.

With “placement in” syntax, e.g. as replacement for Vec::new (edit: Vec::push, sorry for the confusion), it’s not really possible to avoid new syntax, see this old FAQ (particularly question #4).

Box patterns? That would be a definite usability improvement for working with boxes…

I suppose what I’m asking is whether we’ve considered some equivalent mechanism to how C++ forwards arguments (I’m not 100% on how this pans out in C++, so this is just wishful thinking on my part).

With "placement in" syntax, e.g. as replacement for Vec::new (edit: Vec::push, sorry for the confusion), it's not really possible to avoid new syntax, see this old FAQ

I remain unconvinced that we cannot simply use closures for emplacement, i.e. do this: vec.emplace(|| LargeValue { ... })


The FAQ (A12) claims that this is incompatible with fallible constructors, i.e. the new syntax allows one to write vec <- try!(run_code()), which will bail out from the current function on error, whereas the vec.emplace(|| try!(run_code())) form cannot do that, obviously.

The catch here is that in order for run_code() to be fallible, it must return a Result<T,E>, which wraps the value we want to emplace. Well, guess where this intermediate value will be stored? That's right: on the stack. Which defeats the whole purpose of emplacement. The vec <- try!(run_code()) form will appear to work, but will not in fact do what the programmer had intended!

What would work with arrows is this: vec <- run_code(try!(compute_argument())). However computation of arguments is pretty easy to factor out of the construction expression:

let arg = try!(compute_argument());
vec.emplace(|| run_code(arg));
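
To make the closure form concrete, here is a minimal sketch under stated assumptions: Emplace is a hypothetical extension trait (nothing like it exists in std), parse_and_store is a made-up caller, and ? stands in for try!. Whether the value really gets built in place is still up to the optimizer here; the sketch only shows the API shape and where the early return happens.

```rust
use std::num::ParseIntError;
use std::ptr;

/// Hypothetical extension trait; not part of the standard library.
trait Emplace<T> {
    fn emplace<F: FnOnce() -> T>(&mut self, ctor: F);
}

impl<T> Emplace<T> for Vec<T> {
    fn emplace<F: FnOnce() -> T>(&mut self, ctor: F) {
        // Make room first, so the slot exists before the value is computed.
        self.reserve(1);
        let len = self.len();
        // SAFETY: `reserve(1)` guarantees capacity for index `len`, and the
        // length is only bumped after the slot has been written. If `ctor`
        // panics, the length is unchanged and the slot is never dropped.
        unsafe {
            ptr::write(self.as_mut_ptr().add(len), ctor());
            self.set_len(len + 1);
        }
    }
}

/// The fallible step runs (and may bail out) before the closure is built.
fn parse_and_store(vec: &mut Vec<[u64; 4]>, input: &str) -> Result<(), ParseIntError> {
    let arg = input.parse::<u64>()?; // early return happens here, outside the closure
    vec.emplace(|| [arg; 4]);
    Ok(())
}
```

Calling parse_and_store(&mut v, "7") appends one element, while an unparsable input returns the error and leaves the Vec untouched.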

So... do we gain enough by adding all that placement machinery into the language, when a simple solution works nearly as well?

A12 is about desugaring PLACE <- EXPR (or whatever the syntax would be) to something like PLACE.emplace(|| EXPR). Such a desugaring would silently change the meaning of EXPR w.r.t. control flow. It’s true that people could write out the closure form themselves and then they’d have no one to blame but themselves if they tried to return (or break, or …) from the closure.
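
A small illustration of that control-flow difference in today's Rust, using plain Vec::push rather than any placement API. The function names push_direct and push_via_closure are made up for this sketch.

```rust
/// `return` in an ordinary argument expression leaves the enclosing function.
fn push_direct(v: &mut Vec<i32>, input: Option<i32>) -> bool {
    v.push(match input {
        Some(x) => x,
        None => return false, // exits push_direct; nothing is pushed
    });
    true
}

/// The same `return` inside a closure only leaves the closure.
fn push_via_closure(v: &mut Vec<i32>, input: Option<i32>) -> bool {
    let ctor = move || match input {
        Some(x) => x,
        None => return 0, // exits the closure only; 0 becomes its value
    };
    v.push(ctor());
    true
}
```

With input None, push_direct pushes nothing and reports false, whereas push_via_closure still pushes a value and reports true. A desugaring that wrapped EXPR in a closure behind the user's back would turn the first behaviour into the second.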

So, I misspoke when I said new syntax is strictly necessary. If we require people to write out closures manually, that could generate the desired code. For that matter, we could also have a macro that does the current desugaring (as long as we decided that we don’t want auto-ref). One of the placement RFCs in fact gave an example implementation in the form of a macro.

So I am again reduced to an appeal to ergonomics: Placement is strictly more efficient than other forms of construction, so it should be more convenient than those other methods, to ensure it is used over them.

AFAIK, C++ uses:

  1. Variadic template methods that basically accept any possible combination of arguments.
  2. Perfect forwarding (rvalue references + reference collapsing + std::forward) to forward the arguments to the constructor.
  3. The fact that constructors are already a special case, not just static methods that return a value.

I think especially point 1 is problematic. It means unrestricted overloading with variable arguments and without any constraints (traits), which is very far from how generics work in Rust.

Yes, especially given that this in a C++ constructor is a pointer to as-yet-uninitialized memory.

I think a fairly direct parallel in Rust would be something like:

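impl<T> Vec<T> {
    fn emplace_back<C>(&mut self, ctor: C)
        where C: FnOnce(*mut T)
    {
        // reserve_uninitialized_slot is a hypothetical helper that grows the
        // buffer if needed and returns a pointer to the next free slot.
        let this: *mut T = self.reserve_uninitialized_slot();
        // The closure plays the role of the C++ constructor and must
        // initialize *this in place.
        ctor(this);
        // ... plus some panic handling
    }
}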
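
For anyone who wants to play with this, here is a compilable variant written as an extension trait, since user code cannot add inherent methods to Vec. EmplaceBack, Big and build are names invented for the sketch, and the method is marked unsafe because nothing checks that the closure actually initializes the slot.

```rust
use std::ptr;

/// Hypothetical extension trait mirroring the `emplace_back` idea.
trait EmplaceBack<T> {
    /// # Safety
    /// `ctor` must fully initialize the `T` behind the pointer before returning.
    unsafe fn emplace_back<C: FnOnce(*mut T)>(&mut self, ctor: C);
}

impl<T> EmplaceBack<T> for Vec<T> {
    unsafe fn emplace_back<C: FnOnce(*mut T)>(&mut self, ctor: C) {
        self.reserve(1);
        let len = self.len();
        // SAFETY: index `len` is within capacity after `reserve(1)`.
        let this: *mut T = unsafe { self.as_mut_ptr().add(len) };
        // If `ctor` panics, `len` is untouched, so the half-built slot is
        // never dropped; that is all the panic handling this sketch needs.
        ctor(this);
        // SAFETY: the caller promises that `ctor` initialized the slot.
        unsafe { self.set_len(len + 1) };
    }
}

struct Big {
    id: u32,
    payload: [u8; 1024],
}

fn build() -> Vec<Big> {
    let mut v = Vec::new();
    // SAFETY: the closure writes a complete `Big` through the out-pointer.
    unsafe {
        v.emplace_back(|this| ptr::write(this, Big { id: 7, payload: [0xAB; 1024] }));
    }
    v
}
```

Note that ptr::write still takes the value as an argument, so eliminating the stack temporary remains an optimization rather than a guarantee.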

The closure's environment can take the place of arbitrary variadic templates, but making the initialization happen in place is the tricky part to me.

Whatever solution is chosen must make the following program always correct (playground):

const N: usize = 512;  // works on my platform for 256 but not 512 =/
type FixedSizeMatrix = [[f64; N]; N];

fn main() {
    let u: Box<FixedSizeMatrix> = Box::new([[0.; N]; N]);
}

The current issue is that inside Box::new(EXPR), the result of EXPR is allocated on the stack and then passed to new, which moves it to the heap. It works in release because the optimizer removes the unnecessary stack allocations.
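
For comparison, one way to sidestep the stack temporary in today's Rust without any new syntax. This is a workaround sketch, not something proposed in this thread; zeroed_matrix is a made-up name, and it relies on the standard conversion from a boxed slice to a boxed array.

```rust
use std::convert::TryFrom;

const N: usize = 512;
type FixedSizeMatrix = [[f64; N]; N];

/// Builds the zeroed matrix directly in heap storage.
fn zeroed_matrix() -> Box<FixedSizeMatrix> {
    // The element handed to `vec!` is a single row (4 KiB), not the whole
    // 2 MiB matrix; the rows themselves live in the Vec's heap buffer.
    let rows: Box<[[f64; N]]> = vec![[0.0_f64; N]; N].into_boxed_slice();
    // Converting a boxed slice of length N into a boxed array of length N
    // keeps the existing allocation.
    match Box::<FixedSizeMatrix>::try_from(rows) {
        Ok(matrix) => matrix,
        Err(_) => unreachable!("the slice has exactly N rows"),
    }
}
```

This only helps when the value can be built element by element, which is exactly why it does not answer the general question.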

IIUC placement in is a language construct (analogous to placement new) to force the result of EXPR to be allocated at a concrete memory location, which solves this problem (and others). I do not want to run into what Bjarne describes as: "Something must be done. This is something. Therefore we must do this".

The original placement box RFC says:

This provides a way for a user to specify (1.) how the backing storage for some datum should be allocated, (2.) that the allocation should be ordered before the evaluation of the datum, and (3.) that the datum should preferably be stored directly into the backing storage (rather than allocating temporary storage on the stack and then copying the datum from the stack into the backing storage)

After re-reading the RFC, I see that there are many use-cases being discussed. But if I focus on the simplest example shown above, I don't feel that the following question has been answered with sufficient clarity in the RFC and associated discussions:

Why cannot the example above be made to work correctly and reliably without extra syntax?

That is, without focusing on other usages / applications of a possible placement in language feature, could the language require that when the result of an expression is directly consumed by another expression, the expression must be inlined when possible (for some definition of possible) and the temporary storage guaranteed to be eliminated?

Not in the presence of panicking and other divergent expressions. We want the heap storage to be allocated before the expression is evaluated. If the expression panics, the heap storage needs to be freed without executing the destructor. Maybe it can be done without custom syntax, but it can’t be done without a placer interface that the containers have to implement.
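
A rough sketch of that ordering requirement, assuming a hypothetical helper box_with and guard type FreeOnUnwind (neither is a real API). It allocates first, evaluates the expression second, and on panic releases the raw memory without running T's destructor. The value still passes through a local here, so this shows the ordering and cleanup only, not guaranteed in-place construction.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem;
use std::ptr;

/// Frees the raw allocation on drop without running `T`'s destructor.
struct FreeOnUnwind<T> {
    ptr: *mut T,
}

impl<T> Drop for FreeOnUnwind<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `alloc` with exactly this layout.
        unsafe { dealloc(self.ptr as *mut u8, Layout::new::<T>()) }
    }
}

/// Hypothetical helper: allocate first, then evaluate `ctor`.
fn box_with<T, F: FnOnce() -> T>(ctor: F) -> Box<T> {
    let layout = Layout::new::<T>();
    // Zero-sized types need no allocation; this sketch simply rejects them.
    assert!(layout.size() != 0, "sketch does not handle zero-sized types");
    // SAFETY: the layout has non-zero size.
    let raw = unsafe { alloc(layout) } as *mut T;
    if raw.is_null() {
        handle_alloc_error(layout);
    }
    let guard = FreeOnUnwind { ptr: raw };
    // If `ctor` panics, `guard` is dropped during unwinding and the memory
    // is released; no `T` exists yet, so no destructor of `T` runs.
    let value = ctor();
    // SAFETY: `raw` is valid, aligned, and currently uninitialized.
    unsafe { ptr::write(raw, value) };
    mem::forget(guard);
    // SAFETY: `raw` was allocated by the global allocator with `T`'s layout
    // and now holds an initialized `T`.
    unsafe { Box::from_raw(raw) }
}
```

The guard is the part a placer interface would have to standardize: every container needs its own answer to "what do I undo if the expression never produces a value".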

I think there are two separate issues: what the syntax looks like for the user and what the syntax looks like for the library.

Since user code is more common, obviously we should optimize for that case. In particular, it would be great if you could just write Box::new(expr) or Rc::new(expr) or vec.push(expr) and have the compiler do the right thing.

As far as I can tell, the only reason this optimization can’t be done completely automatically is order of evaluation issues in the case of placement (note that there is no such concern for placeless allocation like Box::new and Rc::new).

In the case of Vec::push, doing the allocation changes the state of the vec, and there are interesting questions surrounding what will happen if EXPR involves borrowing vec, as well as if it panics or completes abruptly. This sounds similar to the non-lexical lifetimes issue, so maybe we should wait to see how they decide to handle stuff like vec.push(vec.capacity()) first.
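
For reference, this is how the vec.push(vec.capacity()) case behaves with current compilers, where the argument is evaluated before push runs. push_own_capacity is a name made up for the sketch.

```rust
/// Returns (capacity observed before the push, value that was pushed).
fn push_own_capacity() -> (usize, usize) {
    let mut vec: Vec<usize> = Vec::with_capacity(1);
    vec.push(0); // use up the initial slot so the next push may have to grow
    let before = vec.capacity();
    // Accepted by current compilers: the argument is evaluated first, while
    // the mutable borrow taken for `push` is not yet active.
    vec.push(vec.capacity());
    (before, vec[1])
}
```

The pushed value is the capacity from before any reallocation. An allocate-first placement form would have to either observe the new capacity or be rejected by the borrow checker, and as far as I can tell a closure form such as vec.emplace(|| vec.capacity()) falls into the second camp.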

Sure, but I've seen similar arguments used against direct usage of closures for emplacement.

This sounds like an advantage of the closure form, really: vec.place_back(|| run_code()?) will give you a type error, whereas vec <- run_code()? will compile, but won't do the intended optimization.

vec.place_back() <- x is not even shorter than vec.place_back(|| x)

To win on the number of characters typed, we'd need to figure out auto-ref'ing, so one could write vec <- x. Which opens a wholly new can of worms, it seems.

Without auto-ref we could still have &mut vec <- x, or plain vec <- x if vec is actually a &mut Vec (not at all rare). But yes, auto-ref is a big win. I don't see what can of worms it opens, other than the current implementation (HIR expansion) being unsuitable — but at least for box, the implementation may already change for other reasons (type inference) as outlined in the OP.

And it's not just, and not primarily, about the number of characters typed. In my and other people's vision, <- would be the default, the most simple and aesthetic way of inserting in a collection: the <- syntax is visually evocative, it's the same for different collections (Vec::push vs. HashSet::insert), and it has less visual noise than the alternatives (no parentheses, no || prefix).
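
To illustrate the "same for different collections" point without new syntax, here is a sketch using a hypothetical InsertWith trait (invented for this example, not an existing or proposed API). It gives Vec and HashSet one insertion entry point, which is roughly the role <- would play.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical trait giving every collection the same insertion entry point.
trait InsertWith<T> {
    fn insert_with<F: FnOnce() -> T>(&mut self, ctor: F);
}

impl<T> InsertWith<T> for Vec<T> {
    fn insert_with<F: FnOnce() -> T>(&mut self, ctor: F) {
        self.push(ctor());
    }
}

impl<T: Eq + Hash> InsertWith<T> for HashSet<T> {
    fn insert_with<F: FnOnce() -> T>(&mut self, ctor: F) {
        self.insert(ctor());
    }
}

/// Generic code no longer cares whether it is pushing or inserting.
fn fill<C: InsertWith<u32>>(collection: &mut C) {
    for i in 0..3 {
        collection.insert_with(|| i * 10);
    }
}
```

The parentheses and the || prefix are the visual noise being argued about; the uniformity itself does not depend on the operator.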

While reading this, it occurred to me that the closure form actually isn't equivalent in the number of moves! In the following example (after whatever temporaries and moves are involved in try! or ?), the payload of the Ok variant is moved into the local arg, then moved into the closure object, and then finally moved into the Vec's backing buffer.

let arg = try!(compute_argument());
vec.emplace(|| arg);

Contrast this with the <- equivalent (vec <- try!(compute_argument());), which does indeed need a temporary for the Result just like the above code, but in the Ok case directly (again, after whatever happens in the try!) copies the payload into the Vec's memory. And despite the closure being quite inlinable, the move from the Result temporary into the local occurs before the memory allocation, so we can't be sure it gets reordered. (Not to mention that the closure isn't 100% guaranteed to be inlined, that LLVM has been and still is less-than-stellar about optimizing memcpys, and a million other minor wrinkles.)

So there's that. But even if it were fully equivalent, I don't think we'd want the error that the closure implies. That would assume people only use placement (in whatever form) when they absolutely positively need the optimization and cannot tolerate anything less. This is antithetical to the vision of placement becoming the default syntax: it doesn't have to be faster in every case (the ? case needs a temporary regardless of how you approach it), it just has to be always at least as fast as the alternatives (Vec::push and your hypothetical Vec::emplace).

Besides, Rust makes a point of being relatively explicit about costs, if you know what to look for, but it doesn't go out of its way to actively penalize slight inefficiencies. If someone does not know, or does not care, that a ? expression involves an extra temporary, forcing them to move the ? out of the closure doesn't help them; it just forces them to write uglier code and prolongs the "fight the compiler" phase. Furthermore, even if we wanted to highlight these situations, a lint could do the job just as well, but one could silence it.

I just want to mention again that I don’t think it makes any sense to settle on a design for this until we’ve figured out a design for &in / &place pointers (i.e. the opposite of &move pointers).

Neither &move nor “out-pointers” are really ergonomic without being parametrized on the allocation and owning it.

IOW, the Place types in the RFC are one of the few realistic versions of out-pointers for Rust (you could also imagine having a single type with a generic parameter, but the two are mostly equivalent), there just is no Place type provided for stack slots, only everything else. That makes p <- x what would otherwise be *p = x.

I really don't like vec <- x because it seems confusing and ambiguous to me. For example, with a vec, you usually want to push to the back, but for a deque, you can efficiently append to either end. And it's not like you can't insert into the beginning of a vec either, it's just slower than inserting to the back. So it's not immediately clear what vec <- x even means.

Apart from that, Go uses <- for an unrelated operation (sending and receiving on a channel), so that would increase confusion as well.

Placement could be conceivably implemented for channels as well, since they're basically collections purpose-built for communication and synchronization. It makes enough sense to me that one could do

tx <- MyMessage(42)

instead of

tx.send(MyMessage(42))

with all of the benefits of in-place allocation that this could potentially get you.
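
For reference, the method-call form as it works today with std::sync::mpsc. The tx <- form is hypothetical, and round_trip is a name made up for the sketch; here the message is constructed first and then moved into the channel.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, PartialEq)]
struct MyMessage(u32);

/// Sends one message from a worker thread and returns what was received.
fn round_trip() -> MyMessage {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        // The message is built here and then moved into the channel.
        tx.send(MyMessage(42)).expect("receiver is still alive");
    });
    let msg = rx.recv().expect("sender sent one message");
    worker.join().expect("worker thread panicked");
    msg
}
```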

Are there any updates on this? Is box syntax dependent on the rest of the functionality here, or would it be possible to stabilise box as it works in nightly before having the rest of the details sorted?

One problem with stabilizing box as it works in nightly: It might prematurely stabilize a strong connection between box <expr> and Box<T> (for an expr: T), which in turn might make it difficult to generalize box in the future to other container types in a backward-compatible fashion.

Basically, we’ve been holding off stabilizing box until we make more progress on generalizing box (or, alternatively, deciding that generalizing box is not worth the effort/cost…)