Local closures borrowing ergonomics

There's a relatively frequent annoying problem with closures in Rust, which I haven't seen discussed a lot. I am wondering if I miss any relevant discussions?

The problem comes up when using closures to abstract some code repetition locally.

For example, let's say you have:

let mut acc: Vec<Result<Item, Error>> =  Vec::new();

then, you have a bunch of code that adds Ok values there:

let mut acc: Vec<Result<Item, Error>> =  Vec::new();
if cond1 {
    acc.push(Ok(x1));
    acc.push(Ok(x2));
}
if cond2 {
    acc.push(Ok(y));
}

You can abstract over that with a closuer:

let mut acc: Vec<Result<Item, Error>> =  Vec::new();
let mut add = |v| acc.push(Ok(v));
if cond1 {
    add(x1);
    add(x2);
}
if cond2 {
    add(y3);
}

The problem now though is that the add closure borrows the vec. If there's even one case where you want to use the env without the helper function, you just can't:

let mut acc: Vec<Result<Item, Error>> =  Vec::new();
let mut add = |v| acc.push(Ok(v));
if cond1 {
    add(x1);
    add(x2);
}
if unlucky {
    acc.push(Err(e)); // :-(
}
if cond2 {
    add(y3);
}

The problem here is that add borrows vec for while the add is live, while what we actually want here is to only borrow acc for the duration of the call itself, such that we can use it in between the call to add.

The two common workarounds here are:

making add a macro:

let mut acc: Vec<Result<Item, Error>> =  Vec::new();
macro_rules! add { ($v:expr) => (acc.push(Ok($v))) }

adding acc as an explicit parameter to closure:

let mut acc: Vec<Result<Item, Error>> =  Vec::new();
let mut add = |acc: &mut Vec<Result<Item, Error>>, v| acc.push(Ok(v));

Neither version is particularly ergonomic or nice.

Are there any hypothetical or in-progress language features which can help with this pattern?

12 Likes

Why are macros so bad? Is the definition site or the use site a bigger problem?

I am asking because, well, ‘in-progress’ is a bit of a stretch, but I have been thinking about designing a macro fn facility, a limited form of macros defined (and used) mostly like functions, but one that unlike normal functions would be able to (a) have argument types inferred and (b) express call-by-name. I conceived of it primarily to address the unwrap_or versus unwrap_or_else (genericity over side effects) problem. I haven’t come up with a satisfactory design for this feature yet, but this sounds like a potential use case for a scope-bound variant of it.

3 Likes

You can use a local RefCell for situations like this, but I agree that it seems like there should be a way to do this statically.

I see a closure not as a function, but as a struct and a function, and with that point of view I would expect the closure to keep hold of the borrow even between calls. I would be very surprised if a struct yielded its borrow between uses, and I would be equally surprised if a closure yielded its borrow. For example

let mut counter = 0;
let mut sequence = || {
    counter += 1;
    counter
};

If counter is modified between calls to sequence, then sequence is not a sequence after all.

8 Likes

Just off the top of my head, feels like we'd need to do escape analysis on the closure value to get this to work... I guess the borrow checker could help with this, but that seems hard given the current model.

I agree with this, but I have also encountered the situation like the above, where I only wanted to borrow while the closure was live.

Maybe a variant 'function like closure' could be introduced, one that implicitly adds local variables as arguments, rather than into an implicit struct.

let add = fn |v| acc.push(Ok(v))

I'm not sure how you would pass this closure into other functions though. It would be frustrating if

add(x1) // worked
option.map(|| add(x2)) // didn't work

This idea would also be complicating the language, and for a pretty niche quality of life improvement.

Either way, changing the behavior of the existing closure would break assumptions people have about how they work, or at least complicate them.

1 Like

I'm actually already turning on this idea, maybe I should collect my thoughts a bit more, but I think improving type inference (if possible) would be a bigger win.

I wouldn't mind writing:

let mut add = |acc, v| acc.push(Ok(v));

It's the writing of:

let mut add = |acc: &mut Vec<Result<Item, Error>>, v| acc.push(Ok(v));

That I find ugly.

But I also only come across this type of situation a few times per code base, so I haven't really found it to be a big issue.

3 Likes

Is the following better?

let add = |acc: &mut Vec<_>, v| acc.push(Ok(v));
2 Likes

Yeah in that case, and that does work, but I was using that example more to invoke the spirit of the problem, and because it was the original one used. Usually when the problem comes up for me it's because the type is wrapped in at least one layer of Options or Results:

let is_json = |r : Result<http::Response<_>, Error>| {...}

Edit: Also while it is "better" than the previous example, I still tend to view both as unergonomic. I don't like having to define types in closures at all. For me, not writing types in closures is the whole point, captures are just a bonus. Usually when forced to type a closure I just delete it and accept the dry code, especially when the type definitions are longer than the expression. But I'll accept that this opinion might be unreasonable.

Ironically, representing the closure as a struct provides enough control to achieve what is desired.

struct Add<'a> {acc: &'a mut Vec<Result<i32, ()>>}
impl Add<'_> {
    fn call(&mut self, v: i32) {self.acc.push(Ok(v));}
}

let mut acc: Vec<Result<i32, ()>> =  Vec::new();
let mut add = Add {acc: &mut acc};
add.call(1);
add.call(2);
{
    let acc = &mut add.acc;
    acc.push(Err(()));
}
add.call(3);
6 Likes

A possible solution is to have the compiler track &muts so that the compiler knows that there is an &mut to "acc" inside the "add" variable and when it sees the "acc.push(Err(e))" line it can automatically rewrite it to "add.acc.push(Err(e))" that would reborrow the &mut inside the closure (where we must also allow access to closure internals by code in the same module or at least function).

It might be too magical though, and lead to unintentional invariant breaking if an &mut is accidentally made public in a struct (or if no module boundary is added) when invariants need to be preserved.

Another possible alternative is to let the user access closure captures by name and do this manually, i.e. writing "add.acc.push(Err(e))" in the code.

6 Likes

That's interesting -- maybe there could be a pub declaration on the closure? It would still only have local effect though, since the actual type is anonymous and can't escape. (Even impl Trait still doesn't allow field access.)

  • let mut add = |v| acc.push(Ok(v)) captures acc by &mut (because of the usage)
  • let mut add = move |v| acc.push(Ok(v)) captures acc by value (because of the move keyword)
  • let mut add = pub |v| acc.push(Ok(v)) would capture acc by &mut (because of the usage), but make it possible to re-borrow it between call (because of the pub keyword)

I really like the idea, but I don’t think that pub is the clearer possible keyword.

I think that re-borrowing is a better idea than adding an implicit argument, because this doesn’t change the Fn trait being implemented. When re-borrowing, I would personally prefer to have the re-borrow implicit (ie. being able to write acc.push(Err(e))) instead of explicit (add.acc.push(Err(e))).

5 Likes

I prefer the explicit access because it can reasonably work with move closures too.

Another possibly weird case is if the closure only needs to capture by &, but outside you still want some &mut access. I think that's still possible to support, but would need some deeper magic or more explicit declaration of captures.

This would be my most preferred syntax for accessing the captured value, but I feel like I'd probably want some way to explicitly denote the incoming capture and name it:

let mut acc: Vec<Result<i32, ()>> =  Vec::new();
let mut add = { acc: &mut acc } |v| acc.push(Ok(v));
add(1);
add(2);
add.acc.push(Err(()));
add(3);

I would prefer no new syntax for a new type of a closure (like pub || etc). That's because it adds more things that Rust users need to learn. It will be yet another case where new Rust users get stuck fighting with the borrow checker, have to ask in the forum, and are told to use an obscure syntax they wouldn't have guessed themselves.

If the borrow checker could make it "just work" with the normal syntax, it'd be great. Perhaps the logic could be:

If the closure doesn't escape the current function (or its type isn't observed anywhere), then change the closure from holding captured references to getting them as arguments on every call site, i.e. instead of being FnMut(T) with a capture, let it be fn(&mut Vec<T>, T) with an implicit argument re-borrowed on every call.

let mut acc: Vec<Result<Item, Error>> =  Vec::new();
let mut add = |v| acc.push(Ok(v));
if cond1 {
    add(/* implicit: &mut add, */ x1);
    add(/* implicit: &mut add, */ x2);
}
if unlucky {
    acc.push(Err(e)); // :-)
}
if cond2 {
    add(/* implicit: &mut add, */ y3);
}

or maybe closures could be coerced to either Fn(explicit args) or fn(catpured args, explicit args). Basically UFCS for closures. Methods with &self can desugar a function with a regular argument, so why wouldn't closures desugar their captures to arguments?

7 Likes

Some questions that I have with this:

  • Does this mean that if I do end up passing the closure to something (e.g., an Iterator method), I can get errors all of a sudden?
  • Will drop(acc) appropriately indicate that add needs to be dropped first?
  • Do the unsafe guidelines need updated for what can be done with "smuggled" parameters? For example, if a shared ref gets stored into some other vector:
let mut store: Vec<&_> = Vec::new();
let mut add = |v| { acc.push(Ok(v)); store.push(acc) };
if cond1 {
    add(/* implicit: &mut add, */ x1);
    add(/* implicit: &mut add, */ x2);
}
if unlucky {
    acc.push(Err(e)); // Does the `store` borrow get tracked here?
}
if cond2 {
    add(/* implicit: &mut add, */ y3);
}
drop(acc); // Allowed?
1 Like

Very good questions!

Does this mean that if I do end up passing the closure to something (e.g., an Iterator method), I can get errors all of a sudden?

Yes.

Will drop(acc) appropriately indicate that add needs to be dropped first?

I think it'd behave just like use of multiple variables, so it depends where you use them:

let acc = Vec::new();
let add = |v| acc.push(v);
add(1)
drop(acc);
add(2);

desugars to:

let acc = Vec::new();
add(&mut acc, 1);
drop(acc);
add(&mut acc, 2);

so you get diagnostics like for the desugared code block (can't borrow in add after move to drop).

if a shared ref gets stored into some other vector:

Is this even allowed? It requires borrowing for a lifetime longer than the closure's. But yeah, it's a tricky case. Perhaps arguments escaping from the closure should also prevent coercion/desugaring to the implicit-reborrow closure type.

Seems that it is already disallowed. So unsafe probably also needs to adhere to such guidelines.

fn main() {
    let mut store = Vec::new();
    let mut acc = Vec::new();
    let mut add = |v| { acc.push(v); store.push(&acc); };
    add(1);
    add(2);
    println!("store: {}", store.len());
    println!("acc: {}", acc.len());
}
error[E0521]: borrowed data escapes outside of closure

They are limited in that they are always inlined. Not bad per se, but not ideal, either.