Borrow the full stable name in closures for ergonomics

An annoying thing in Rust is that closures only borrow and capture the first part of a stable name, rather than all of it.

This often results in unexpected spurious borrow check errors when a closure attempts to mutably borrow and capture “self” (or some other big structure variable) when you of course expected to only borrow a relevant field, requiring to manually desugar the closure capture in the proper way.

It also means that moving code inside an once closure that is executed immediately can result in code that unexpectedly does not compile due to this issue.

I think this should be fixed, and as far as I can tell this should be a compatible change.

The size of closure types would change, and broken unsafe code modifying data that is borrowed elsewhere would cause breakage, but I don’t think these would be problems in practice, and are not guaranteed anyway.

To be clear, what would happen is that:

take_closure(|| a.b.c.foo())

would be desugared to:

let capture = &a.b.c;
take_closure(move || capture.foo());

as opposed to the current and annoying:

let capture = &a;
take_closure(move || capture.b.c.foo());

Obviously the “stable name” must stop whenever an automatic deref or anything else that results in Rust code executing would be required, as moving that out of a closure would be an incompatible change due to possible side effects.

if the user wants to override this behavior, they can just introduce a variable in the closure:

take_closure(|| {let a = &a; a.b.c.foo()})

Note that a direct implementation of this will result in increased memory usage for stored closures due to a higher number of capture variables. Thus, it’s important to, as an optimization, capture the shortest part of the stable name that would not cause borrow check issues.

2 Likes

I agree that's necessary, but it's a significant downside to the proposal, as it makes it harder for users to know what will be captured.

The current alternative is to manually capture the deeper component if that's what you really want.

Why would the user care about what’s captured exactly?

The design is such that semantics are guaranteed to be the same (if that’s not actually the case with my proposal as written, then it should of course be fixed as needed before stabilization).

Thus, if the code passes the borrow checker, there’s no need to care about it; if it doesn’t, then the compiler errors should show what was borrowed exactly.

If the user really wants to fully control capture, they can introduce a variable both inside and outside the closure:

// we REALLY want to capture exactly a.b for some obscure reason
let capture = &a.b;
take_closure(move || {let b = capture; b.c.foo()}}

Since I think the common case is by far not caring about it (in fact I can’t imagine any case where one would care, but not need to use an explicit Fn trait implementation possibly with #[repr©] instead of a closure), it seems reasonable that the shortest amount of typing should result in the implicit behavior.

I don't know what the full language ramifications are, but big +1 if closures didn't have to capture self just to capture self.foo. Do non lexical lifetimes have any bearing on this particular inconvenience?

I agree that's necessary, but it's a significant downside to the proposal, as it makes it harder for users to know what will be captured.

When learning Rust, when I passed self.foo to a closure, I actually expected foo to be captured rather than the entire self. Isn't that consistent with how function parameters are passed?

2 Likes

Users need to understand what’s captured for the exact same borrow-check issues that are motivating you here in the first place. Sometimes those errors are not just spurious quirks of how fields are captured, but may be a real conflict.

Right now, I can look at a.b.c in a closure and know that a is borrowed. If we could consistently make it so only c is borrowed, I think that would be OK, probably even preferable. But having a big caveat that it depends on whether Deref is involved will make this harder to understand.

That’s not necessarily a showstopper, but it’s important to consider.

1 Like

Think about move closures too. If you write move || a.b.c.foo(), it can’t move c if a or b implement Drop. But you might have hoped to only move c and still have access to other fields, just like the borrowing issues.

1 Like

The difference is that function parameters are evaluated right at the call site, whereas captured values shouldn't evaluate any code until the closure is invoked (if ever).

Thanks for explaining this distinction.

The deref/drop issue can be explained as an “info” in borrow checker error messages.

[borrow check error]

`a.b` has been captured here ---^ instead of `a.b.c` because `a.b` implements Drop|Deref and thus behavior would differ.
If you want to capture `a.b.c` instead, introduce a variable outside the closure like this: `let [mut] capture = &|&mut|[blank] a.b.c;`

borrowchk is a major problem of closures. Often when I wanted to use a closure to have less repetition, I got borrow issues. Instead I now just define a function local macro. You dont need to pass the “captured” variables to a macro as arguments as long as they are in scope where the macro is defined, so it can be very short. Borrowchck runs after macro expansion, so everything is fine :).

I’m in favor of this and I have been thinking for some time of making a similar proposal. This seems to fit a theme. That is, we chose to always borrow local variables because it’s a simple and consistent rule. But I think that we should make an effort to be smarter and less annoying, even if we can’t do a perfect job.

I imagine that roughly the way this would work is that the closure would borrow paths like a.b.c that involve either field access or deref of & or &mut references.

Some complications that come to mind:

  • if you have a field foo: Box<Foo>, and you borrow self.foo.bar, this involves running a custom deref, and hence the closure will still just capture self.foo.
  • we want to do some coallescing, so if you access self.foo.bar and self.foo in one closure, we naturally want to capture self.foo.

Other than this, I believe the proposal ought to be backwards compatible, modulo (as you say) the low-level details of what precisely gets captured (which I also don’t consider to be stable).

2 Likes

Good catch. I don't consider this a show-stopper, but it's good not to forget.

This is definitely the thing to be aware of. And we can do a trial period where we enable the new behavior only on nightly. But my guess is that users will not be surprised or confused any more so than they are today. Essentially, today, when you run into this, you learn that the compiler always borrows just local variables, and then you know how to fix this. Under the new model, you learn that the compiler always borrows some prefix of the path -- and sometimes it's not as long as you like, and then you apply the refactoring. But the principle doesn't strike me as all that different.

I think this falls pretty squarely into the usual pattern that we've identified, where we start out by making the borrow checker (and other language rules) as simple and predictable as we can. This is probably a good starting point. But ultimately the rules fade into the background -- and I think that it's preferable to make them as smart as we can -- without making it too hard to figure out what's going on when you do run into a problem. And I don't see this as making it significantly harder to figure out what's going on when you do hit a problem.

(I think I am just sort of restating a twist on the language ergonomic guidelines that Aaron blogged about a while back. In short, I think this is a case where the "elided information" has very low power and also fairly low context dependence.)

2 Likes

I just realized that these Deref/Drop caveats already occur even in non-closure code, e.g.

fn foo(x: &mut (i32, i32)) -> i32 {
    let a = &x.0; // borrows only the first field
    x.1 += 1; // the second field can still be mutably borrowed
    *a
}

fn bar(x: &mut Box<(i32, i32)>) -> i32 {
    let a = &x.0; // here `x` gets entirely borrowed by `Deref`
    x.1 += 1; // error[E0506]: cannot assign to `x.1` because it is borrowed
    *a
}

So… I guess I’ve convinced myself that applying similar rules when dealing with closures wouldn’t actually change the language difficulty much. :slight_smile:

3 Likes

It would be nice to have something that offers a bit more control. It seems like there’s a lot of confusion and inflexibility over how the captures work: e.g. https://users.rust-lang.org/t/chaining-futures/11251/11

Consider the typical Tokio code:

let core = Core::new();
let h = core.handle();
my_stream.for_each(move |...| {
    let h2 = h.clone();
    let h3 = h.clone();
    my_future.and_then(move |...| {
        do_something(h2)
    }).and_then(move |...| {
        do_another_thing(h3)
    })
})

The way I see it, there way captures work right now is a bit non-local: the h2 and h3 variables are declared in the outer closure, but they really belong to the inner closure. Likewise, h is declared outside the outer closure, but it actually belongs to the outer closure.

Instead, it may be helpful to explicitly write which parts are being captured within the closure itself:

let core = Core::new();
my_stream.for_each(|...| {
    let h = move(core.handle());
    my_future.and_then(|...| {
        do_something(move(h.clone()))
    }).and_then(|...| {
        do_another_thing(move(h.clone()))
    })
})

The hypothetical move(...) syntax helps clarify what is actually being captured. It’d be just a shorthand for:

let core = Core::new();
my_stream.for_each({ let __1 = core.handle(); move |...| {
    let h = __1;
    my_future.and_then({ let __2 = h.clone(); move |...| {
        do_something(__2)
    } }).and_then({ let __3 = h.clone(); move |...| {
        do_another_thing(__3)
    } })
} })

If someone is interested in turning this into an RFC, I would love to mentor it! Please let me know.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.