Add a lint to warn against early returns inside closures

At a quick glance, an early return statement inside a closure can be easily mistaken for an early exit from the enclosing function (if there is one). Therefore an early return statement in a closure should be replaced with a labeled break out of the (explicitly labeled) closure body block. I think the lint (for example closure_early_return) should warn by default.

fn foo() {
    for i in 0..10 {
        if i % 2 == 1 {
            // Exits early from `foo`
            return;
        }
        println!("{i}");
    }
}

fn bar() {
    (0..10).for_each(|i| {
        if i % 2 == 1 {
            // Exits early from this closure (not immediately obvious)
            return;
        }
        println!("{i}");
    });
}

fn better_bar() {
    (0..10).for_each(|i| 'this_closure: {
        if i % 2 == 1 {
            // Exits early from this closure (obvious at first glance)
            break 'this_closure;
        }
        println!("{i}");
    });
}

I understand the interest in this, but I'm afraid it would result in far too many false positives.

20 Likes

Am I right in thinking that all false positives would be cases where the closure is not inside a function or another closure, or are there other cases? Wouldn't Clippy be able to detect those cases and allow them without a warning?

People actually want to return things from closures, and sometimes they write that as an early-exit from the closure. Plenty of times the closure even has the same return type as the enclosing function. I do think the kind of mistake you’ve identified happens in real code, but it’s also the most natural way to write a closure-scoped early exit in Rust, and it’s too late to penalize that (by default, at least) just because it overlaps with cases where that’s a mistake.

8 Likes

I don't see how returning things from closures changes anything because you can just as easily break things out of closures as you can return them. The penalty is that you need to label your closure block, but the label is what makes the control flow more obvious compared to a return statement (and you could just type 'c as the label if the name is not important).

I didn't assume that this lint would help catch mistakes (I never even thought of that) and it seems unlikely that it would. But when you're reading code and mistake a closure return for a function return, it wastes your time at least for a second until you realize the mistake in your thinking. The purpose of this lint is to encourage writing code that's easier (and faster) to read.

1 Like

Much as I like label-break-value, it's intended to be rare. I really don't think we'd have rustc recommend replacing a return with LBV.

The break is also potentially confusing to someone skimming as they might expect it to break the loop, like return ControlFlow::Break(()) does in try_for_each.

TBH, if I was going to lint that it would be a clippy lint for "why are you using .for_each here? use a normal for loop."

15 Likes

I'd argue that it doesn't matter whether a language feature was intended to be rare or common - once you add a language feature, it becomes rare, common or something in-between simply based on how much programmers use the feature. If the use of label-break-value is currently rare, then I'd say that's a plus for this lint because then the first thing people would think when they saw a labeled break statement is that it's probably exiting a closure with a return value.

A labeled break statement in a closure might confuse people into thinking that it's breaking out of the loop-like function that's calling the closure, but that's caused by the fact that thus far nobody has written early exits from closures using break statements. Labeled break statements in closures would stop being confusing when you saw them everywhere, and hopefully they would also stop feeling like a hack.

A return statement gives you no hint about "where" the value/unit is being returned to - you have to know which scope you're looking at (and closures being prevalent in Rust makes it not so easy to keep track of at all times). A labeled break tells you immediately that it's not directly returning from a function, and the label tells you what your eyes should be trying to find in order to understand the control flow.

Lol, obviously the lint should be:

replace the loop with `print!("0\n2\n4\n6\n8\n");`

But you can't do it "just as easily"; it's substantially more obtrusive and syntactically noisy.

This also breaks the property that functions and closures can generally write the same code the same way, which makes it easier to refactor part of a function into a closure.

And frankly, return isn't the most common case likely to cause confusion there; confusion around ? seems much more common. And we definitely don't want to make that noisier.

7 Likes

Here's my earlier statement rephrased: I don't see how returning things from closures changes anything because you can just as easily add a return value after the break 'label thingy as you can after the return keyword.

My point was that whether the closure returns () or something else is not relevant to this discussion.

I don't know what you mean by the labeled break syntax being obtrusive but if you mean that it feels hacky because a feature is being used not for its intended purpose, then I agree. The label before the block and after the break keyword are syntactical noise, but they also make the behavior of the code more obvious (especially if you've come to expect this pattern).

(Don't know how the following is relevant.) I think that you can pretty much understand what the code is doing even if you ignore the ? symbols, but you can't ignore return statements.

Large closures can lull the reader into thinking that he's reading regular function code because the closure code looks the same and it can access the enclosing function's local variables. The return statements in closures break that illusion, and yet they look just like regular function return statements (which is the crux of the problem). Replacing closure return statements with labeled break statements makes it obvious that the illusion is being broken.

I just realized that you cannot use this pattern in closures that rustc cannot infer the return type for, because then you have to explictly write the return type for the closure, and in that case rustc doesn't allow adding a label to the closure block. Not sure if this is how Rust is actually supposed to work though.

1 Like

It seems preferable so that something like || -> &i32 'foo: { &42 } isn't accepted. You can always just add braces: || -> &i32 { 'foo: { &42 } }.

With break or continue you need to find the innermost for, while, or loop. With return or ?, you look for the innermost fn or closure (and also try-blocks for ?). All of those could be rewritten using labelled breaks so that you would only ever have to look for the containing block with the same label. But given that these "shorthand" forms exist, it makes sense to use them. That is, replacing break 'this_closure with return just seems like using return for its intended purpose.

If it was idiomatic to always use labelled breaks instead of return within closures (but not fns), people new to the language would be wondering if that's because return would in fact return from the containing fn, and might even expect it to. That wouldn't be an issue in a new language that could forbid return within closures, but Rust can't.

3 Likes

The desugaring of ? includes a return.

If you ignore ?, you don't understand what the code is doing.

Otherwise, if you misunderstand where return goes, you misunderstand where ? goes, or you don't understand ?.

3 Likes

Please vote your preference (I created the poll such that the votes should be anonymous):

  • I think programmers should write early exits in closures using return
  • I think programmers should write early exits in closures using labeled break
0 voters

If we want to people to start using labeled break for clarity, I think a more fruitful (and possibly less controversial?) starting place would be to lint against unlabeled break in nested loops. Though based on existing practices, I’m not sure even that would reach consensus.

3 Likes

Currently with 37 votes in, the poll shows that 97% think that labeled break should not be used for early exits in closures. I want to point out to the small minority of us who thought that labeled break should be used for that purpose, that it's important that people write code in idiomatic manner because code consistency makes it easier for other people to read. This is why I introduced this as a warn-by-default lint, which would have forced everybody along the ride. Therefore we, the small minority, should not start using labeled breaks for early exists in closures, even though we thought that it was a good idea - code consistency is more important than our personal preferences.

18 Likes

A very humble and thoughtful response. Why the Rust community is great!

6 Likes

I think you may want to have this in clippy (official lint tool) instead?

Sure, if that's more appropriate category for this thread instead of "tools and infrastructure". I can't find the clippy category though.

UPDATE: I just realized you probably weren't talking about discussion thread categories. My answer (in case you were even asking me) is that I wouldn't want this lint added to any tool (anymore).

2 Likes

my bad, I meant your purposal may be accepted on clippy.

I think it could be a nice #[clippy::restriction] lint. There is certainly potential for confusion. I wouldn't want early returns to become labeled breaks (whose introduction to the language was itself very contentious, even if for no good reason), but I can totally see someonenot me enforcing the rule "no early returns in closures". After all, any early return can always be refactored to use a single return point at the end or, more idiomatically, to rely on the "block returns its last expression" rule to use explicit nested branching without any explicit return keyword. There is a #[clippy::restriction] lint which demands to unconditionally use return keyword for returning values from a function. Why shouldn't there be a somewhat opposite lint for explicit returns from closures?

3 Likes