Closures don't move Copy types by default?


#1

I ended up with some code that looked like this: https://is.gd/MFLKC9

To my surprise, the compiler complains that mask does not live long enough. But mask is just an integer, I would expect it to be copied, right? Or if the closure is capturing it by reference… well, if you try dereferencing mask in one of the closures the compiler will insist that no, it’s not a reference. So we have the compiler insisting that a non-reference value is being borrowed. This seems contradictory. Making the closure move fixes this problem, conveniently enough.
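The code behind the link is not reproduced here, so the following is only a guess at its general shape: a closure that uses an integer `mask` and has to outlive the function that created it. The names `make_filter` and `mask` are assumptions made for illustration.

```rust
// Hypothetical reconstruction; not the code behind the link above.
fn make_filter(mask: u32) -> Box<dyn Fn(u32) -> bool> {
    // Without `move`, the closure would borrow `mask`, which stops existing
    // when `make_filter` returns, so the compiler rejects this version:
    // Box::new(|x| (x & mask) != 0)

    // With `move`, the closure gets its own copy of the integer.
    Box::new(move |x| (x & mask) != 0)
}
```

Calling `make_filter(0b0001)(3)` then gives `true`, and `make_filter(0b0001)(2)` gives `false`.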

I’m really not sure if this is intended behaviour or a compiler bug, so it was suggested I post the issue here. Thanks in advance.


#2

This is – at present, at least – the expected behavior. If you do not supply the move keyword, then the closure accesses all the variables from the stack by reference, unless it cannot do so (i.e., because it must be moved).

Keep in mind that both copying and using in place have legitimate meanings when mutability is involved. There is a kind of ambiguity that arises:

let mut i = 1;
let x = || i;
i += 1;
let j = x(); // is `x()` going to return `1` or `2` here?

Right now, x will return 2, because – since you did not write move – the closure is “attached” to the creating stack frame and uses its values in place. If Copy types were moved by default, it would return 1.

You could imagine taking the binding of the variable into account (i.e., here we would use in place, because i is declared mut, but if it were not mut, we would copy). That makes me uncomfortable because marking a variable as mut or not mut currently has no effect whatsoever on the semantics of your program, but in this case it would change the meaning of the code quite drastically.

In a way, I feel we picked the wrong keyword for closures. I prefer to think of move closures as “detached” from the surrounding stack frame (in which case they must copy all the data they must use into themselves), whereas non-move closures always use the variables “in place”.
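As a small sketch of the "detached" reading, here is the earlier snippet with move written explicitly. The function name `detached_copy` is made up for this illustration.

```rust
// Hypothetical helper: returns (value seen by the closure, final value of `i`).
fn detached_copy() -> (i32, i32) {
    let mut i = 1;
    let x = move || i; // `move` copies the current value of `i` into the closure
    i += 1;            // allowed: the closure holds its own copy, not a borrow
    (x(), i)
}
```

Here `detached_copy()` returns `(1, 2)`: the closure keeps the value it copied at creation time, while the outer `i` moves on.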


#3

Yeah, I just expected that the way you “use a variable in place without copying it” is via a reference, and thus that if a closure is “attached” to the surrounding scope, then the variables it uses from that scope would be explicit references. In-place variables don’t have lifetimes after all. But in your example, x returns i32, not &i32; if i were a reference, I would be able to say *i to copy the Copy value the reference is pointing to.

To make it more explicit:

// This works
let i = 1;
let x: &dyn Fn() -> i32 = &|| { i };

// This does not
let i = 1;
let x: &dyn Fn() -> &i32 = &|| { i };

i is i32, not &i32, even inside the closure where “the closure accesses all the variables from the stack by reference”. Except the type of the variable is not a reference. Though I can see how suddenly turning i: i32 into i: &i32 would be less-than-expected behaviour as well. So I can’t say either way is right or wrong.

It really would be nice to be able to explicitly describe the environment of a closure in Rust, if only so one can explain the differences between these cases more clearly. Right now pretty much everything involving closures seems to come down to “well there’s some special behaviour behind the scenes and you just have to know what that behaviour is without being able to explicitly see the differences”. (But that’s another rant.)
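One way to picture the hidden environment is to write it out by hand. The sketch below is only an approximation of what the compiler does, and `ByRef`, `ByMove`, and `environments` are invented names: the non-move closure behaves roughly like a struct holding `&i32` whose body dereferences it implicitly, while the move closure behaves like a struct holding an `i32`.

```rust
// Rough hand-written stand-in for `|| i`: the environment holds a reference,
// and the body reads through it, so the result is still a plain `i32`.
struct ByRef<'a> {
    i: &'a i32,
}

impl<'a> ByRef<'a> {
    fn call(&self) -> i32 {
        *self.i
    }
}

// Rough hand-written stand-in for `move || i`: the environment owns a copy.
struct ByMove {
    i: i32,
}

impl ByMove {
    fn call(&self) -> i32 {
        self.i
    }
}

// Hypothetical driver that builds both environments from the same variable.
fn environments() -> (i32, i32) {
    let i = 1;
    let by_ref = ByRef { i: &i }; // tied to the lifetime of `i`
    let by_move = ByMove { i };   // independent copy
    (by_ref.call(), by_move.call())
}
```

Both calls return `1`; the difference lies in the lifetime parameter on `ByRef`, which is what the borrow checker tracks for the non-move closure.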


#4

Without moving, it doesn’t compile at all:

error[E0506]: cannot assign to `i` because it is borrowed

So in this case, moving Copy types would only enable patterns that were errors before. And all the interior-mutable types are not Copy, so that doesn’t change here either.

Are there other examples of valid code that would change semantics?


#5

It doesn’t guarantee you’re fully detached though. You can move a local reference into a closure, thus still carry a non-'static lifetime.
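A minimal sketch of that point, with the hypothetical name `sum_with`: the closure is a move closure, yet what it moves in is a reference, so its type still carries the lifetime `'a`.

```rust
// The closure owns its captures, but the capture is a `&'a [i32]`,
// so the returned closure cannot outlive the borrowed slice.
fn sum_with<'a>(data: &'a [i32]) -> impl Fn(i32) -> i32 + 'a {
    move |extra| data.iter().sum::<i32>() + extra
}
```

For example, with `let v = vec![1, 2, 3];`, the call `sum_with(&v)(4)` evaluates to `10`, but the closure cannot be kept around after `v` is dropped.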


#6

Yep, anything that doesn’t involve aliasing:

let mut i = 1;
{
    let mut x = || i += 1;
    x();
    x();
}
println!("{}", i); // prints 3

#7

OK, capturing a &mut versus copying the value makes a big difference.
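To make the difference concrete, here is the same loop body written both ways. The function names are assumptions for this sketch; the only change between them is the move keyword.

```rust
// Non-move closure: captures `i` by mutable reference and updates it in place.
fn increment_in_place() -> i32 {
    let mut i = 1;
    {
        let mut x = || i += 1;
        x();
        x();
    }
    i
}

// Move closure: captures a copy of `i`, so the outer `i` is never touched.
fn increment_a_copy() -> i32 {
    let mut i = 1;
    {
        let mut x = move || i += 1;
        x();
        x();
    }
    i
}
```

`increment_in_place()` returns `3`, while `increment_a_copy()` returns `1`, which is the silent change in meaning that moving Copy types by default would introduce.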


#8

True, good point. I still consider the example relevant – that is, if I saw that code, and it compiled, I would expect it to be mutating i in place (but it wouldn’t be). I would find that surprising, given the “mental model” I described elsewhere (move closures are “detached” from the stack frame’s local variables and have their own copy; normal closures are not).

If UnsafeCell<T> were Copy, as has been proposed, or we had a way to unsafe impl Copy, then one could readily make an example where the change in behavior would be observed in an interesting way. As it is, hmm, I’m not sure if it can easily be done. (There’s no real reason that Cell-like types can’t be Copy, except that it would be such an easy footgun, more-or-less precisely for the reasons I’m showing here.)

I think the canonical example that made us not even try to infer “move” was something like this. It’s not quite the same scenario though:

let mut i = 1;
{
    let mut x = || i += 1; // which `i` is mutated here?
    x();
}
println!("{}", i);

More generally, I think that if the type of the variable being captured is “freeze” (i.e., no interior mutability, and hence truly immutable when shared), then I suppose that closures which use it will have the property that they never observe mutation from the outside (and hence making something move can only eliminate errors, precisely when such a mutation would be taking place).


#9

My personal take on this is that it is a mistake to conflate mutable references with values. Values are immutable by default, and variables are immutable (as in mathematics, they stand for an unknown value). Only objects (that is, containers) should be mutable. Another way to think about it is that variables do not have an address (they are not l-values) but containers are an address (they are l-values). I think all these confusing access modes stem from a failure of the abstraction to make these very different things distinct.

By distinct I mean distinct in the type system, so ‘Int’ is an immutable integer value or variable. If we want something mutable, the Int itself cannot change because it is a value, so you need an integer container. For example, a single-element Int array “[Int; 1]” would work: you can change the value in the location. In other words, having a location and mutability are essentially the same thing (although we might enforce read-only access to such a location). We might want to introduce a special-case container in the type system for a singleton mutable location, something like “Mut Int”. Note that here “Mut” is a type constructor just like Array; it is not a separate annotation. (A further note: you cannot have a reference to a variable because it has no location; you can only reference a container.)

This is the approach I am taking in my own language project, and is an example of what I said in my post about chalk of designing the language semantics around what is natural in the logic of the type system. I have always found l-values and r-values in ‘C’ to be hard to explain to beginners, and I think this is again because the type system abstraction does not fit the semantics well.


#10

That’s a pretty good example of why you wouldn’t want to move Copy types silently, but then, why are these variable names that act like references not actually references? That would make everything perfectly reasonable and consistent with the rest of the language:

let mut i = 1;
{
    let mut x = || i += 1; // which `i` is mutated here?
    x();
}
println!("{}", i);

Here, the type of i in the closure is not a reference type, but it is magically treated like a reference, in that it can’t be allowed to outlive the i outside the closure, and the borrow checker will make sure that is the case. Making it an actual reference would also make the difference between a move closure and a normal closure completely obvious: external variables in a normal closure are references, external variables in a move closure are not. (That could be useful in the grand quest of being able to actually write down a bloody closure’s type, but that’s a different saga.) So if it acts like a reference, why is it not a reference?

The obvious answer is that as far as the compiler is concerned, a variable inside a closure declared outside the scope of it isn’t a reference. There’s no separate pointer pointing to another part of the stack frame, the compiler already knows where i is and can just use that directly. But it seems that from the perspective of how Rust the language views the world this is a weird piece of double-think.
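For comparison, the "explicit reference" version can already be spelled out by hand today, as in this sketch (the function name and the binding `r` are invented for illustration). It takes the reference outside the closure and moves that reference in, so the body has to dereference it explicitly.

```rust
// Hypothetical helper showing the capture written as an explicit `&mut i32`.
fn explicit_reference_capture() -> i32 {
    let mut i = 1;
    {
        let r = &mut i;              // the borrow is visible in the source
        let mut x = move || *r += 1; // the closure owns `r`, not `i`
        x();
    }
    i
}
```

`explicit_reference_capture()` returns `2`, the same result the implicit capture gives, with the reference now visible in the types.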