Pre-RFC: Break with value in for/while loops

Surely you wouldn't explicitly write break Default::default() under this proposal unless you wanted to conflate the two cases, but it may be the case that you do break val where val happens to be the same as what Default::default() evaluates to, and later code mistakes that value for the case that the loop was exhausted. This is precisely why we have discriminated unions in the first place.
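To make the conflation concrete, here is a minimal sketch in today's Rust. The helper names find_sentinel and find_tagged are made up for illustration, and they use return from a helper function because break with a value is not available in for loops.

```rust
// Sentinel version: "not found" is reported as i32::default(), i.e. 0.
fn find_sentinel(data: &[i32], target: i32) -> i32 {
    for &x in data {
        if x == target {
            return x;
        }
    }
    i32::default()
}

// Tagged version: "not found" is a separate variant, so it cannot be
// confused with a legitimately found value.
fn find_tagged(data: &[i32], target: i32) -> Option<i32> {
    for &x in data {
        if x == target {
            return Some(x);
        }
    }
    None
}
```

Searching for 0 in [0, 1] and in [1, 2] gives the same answer from find_sentinel, while find_tagged tells the two cases apart.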

Or it could be resolved by making for loops always evaluate to an Option<_>. It will break compatibility, but that can be easily handled with an edition change.

I'd argue that such cases will be extremely rare, and they can be solved (or "worked around", it does not matter) well enough with a combinator that changes the return type of a generator. For example, we could have a combinator which maps the return type of a generator from T to Option<T>, so you would be able to terminate loops over an infinite generator with break None;. This solution would also work without any issues in generic contexts, when T is not known.

I guess my thought is that it is trivial to implement Into<()> for Option<()>, so how terrible would it be to automatically call into() when the expected result of the for loop is of type () and the for loop returns Option<()>?

I mean it seems like it would be backwards compatible, and if you want to preserve the difference between Some(()) and None, you probably wouldn't be converting it to unit.

Perhaps a terrible idea, and a suggestion somewhat out of character as I generally do care about preserving type information, but find it hard to care in this specific situation.

edit: I'm guessing that perhaps there is some case involving generics where this wouldn't actually work. Actually, since Into is reflexive it should work, but it seems we would always have to emit into() for for loops, which sounds less desirable to me. shrug

I've written this already but in a way that involved a lot of jargon, so I want to be clear about the consequences of a change like that.

Any time you write a function without a return type that ends in a for or while loop, it would now need to be followed by a semicolon. For example, this function would need to be written like this:

fn main() {
    let mut listener = TcpListener::bind(env!("SOCKET_ADDR")).unwrap();
    for stream in listener.incoming() {
        handle_connection(stream);
    }; // <- note the semicolon!
}

This is of course possible with an edition change, and even easy to autofix, but is it desirable? I really don't think so: it's a nice property of the way Rust's expression/semicolon system has worked out that block expressions normally do not need to be semicolon terminated.

I think that it goes a bit further than that.

Every time you have a for loop followed by another expression, the for loop would have to be terminated by a semicolon.

for x in iter {
    break 0
}; // <---

let x = 0; // any other expression 

Similar to how a match that yields a (edit: non-unit) value must be terminated by a semicolon:

match () {
    () => 0
}; // <---

let x = 0;
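
For reference, this is how the rule plays out in today's Rust. A small sketch with made-up function names: a block-like expression used as a statement must have type (), so the value-yielding match needs the semicolon and the unit one does not.

```rust
fn match_then_statement() -> i32 {
    match () {
        () => 0,
    }; // required: without it rustc reports mismatched types (expected `()`)
    let x = 1;
    x
}

fn unit_match_then_statement() -> i32 {
    match () {
        () => {}
    } // no semicolon needed, the match has type ()
    let x = 2;
    x
}
```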

Generator<Return=!> is quite a natural way to model an endless counter or an event source. Making it awkward to break a loop based on it will make generators a much less attractive feature.

You keep bringing up examples with generators returning Result<_, E>. But what if the loop body wants to raise an error not covered by E? (For example, the loop body receives a raw packet of data from the generator and wants to raise a ParseError, as opposed to an io::Error that the original generator can raise.) You'd have to keep using combinators just to extend the range of possible errors; in other words, you'd be fighting the type checker instead of working with it. This design doesn't work all that well even for your own use case, never mind others.

And like I said before, if you don't mind adding combinators, you might as well add an iteration combinator instead of extending the semantics of the for loop.

I think a more expressive language is worth the price of occasionally having to type a seemingly superfluous semicolon. It's just syntax after all. If I cannot convince you of this, then I believe there is no point in continuing this line of discussion.

(Pedantic point: every match yields a value, and every function has a return type. Sometimes the type is (), but that's still a type like any other.)

The point here is not really about the cost of typing a semicolon (which I agree is not a compelling point on its own), but about the cost of breaking changes: having to "fix" all the existing Rust code that today doesn't have these semicolons, updating all the tooling/docs in the ecosystem for these new rules, and having to re-teach all Rust users what the semicolon rules are.

Editions do make breaking changes possible, but they don't change the fact that the bar for breaking changes is (and should be) very, very high. The motivation for this change is just nowhere near strong enough.

I really don't think that you will have to reteach the semicolon rules, since they won't have changed.

I do, however, submit to the fact that forcing all for/while loops to have a semi-colon after them is a very large change.

Though I have not yet been convinced that having the type of a for/while loop change from () to Option<T> if there exists a break T expression in the loop body constitutes a drastic enough oddity to not warrant pursuing that direction.

Exactly, and I would even say it makes sense.

for x in data {
    ...
    if ... {
        break; // very clear that we don't care about the for-loop as an expression
    }
}
let val = for x in data {
    ...
    if ... {
        break found; // very clear that we want the value, from here and from the `let val`
    }
};

I voiced above that break; being interchangeable with break (); seems to have been taken as an axiom in past discussions of this feature, but it isn't actually valuable. I would much rather see break (); written explicitly if someone wants a loop that evaluates to Some(()).
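
For comparison, the Option-valued reading of the second loop can already be approximated on stable Rust with a labeled block. This is only a sketch of the intended semantics, not the proposed syntax; the name first_negative and the x < 0 condition are placeholders.

```rust
fn first_negative(data: &[i32]) -> Option<i32> {
    let val = 'search: {
        for &x in data {
            if x < 0 {
                // Plays the role of `break found` in the proposal.
                break 'search Some(x);
            }
        }
        // Reached only when the loop ran to exhaustion.
        None
    };
    val
}
```

The None at the end of the block is exactly the part the proposal would make implicit.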

I very strongly agree, since the empty tuple would then be the idiomatic way of getting else/then functionality without new keywords.

I disagree, it makes the language more consistent, which is valuable. It makes Rust easier to teach/learn, to write and audit unsafe code, and to write macros. The last point especially would be hurt if we change the meaning of break to be different from break (), as we would lose the correlation to return, where return is exactly the same as return () (it makes it harder to abstract over break and return). But I do note that this isn't an important point on its own.

Breaking consistency should have a high bar. I don't think that break changing meaning is sufficiently motivated. I would find it surprising if break value (where value: T) yielded an Option<T>.

I also find the use case of finding a value with a for loop to be poorly motivated, given that you could just use Iterator::find or, if you have strange control flow, Iterator::try_fold. If you have really strange control flow, then you are probably not going to benefit from extending for, or you are going to be using loop instead of for anyway to signal the strange control flow.
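
A short sketch of those two alternatives on stable Rust. The function names and the particular conditions are made up for illustration; Iterator::find, Iterator::try_fold and std::ops::ControlFlow are the real standard-library items.

```rust
use std::ops::ControlFlow;

// Plain search: Iterator::find already returns Option<Item>.
fn first_even(data: &[i32]) -> Option<i32> {
    data.iter().copied().find(|x| x % 2 == 0)
}

// Search with carried state: stop at the first running sum above `limit`.
fn first_prefix_sum_over(data: &[i32], limit: i32) -> Option<i32> {
    let flow = data.iter().try_fold(0, |sum, &x| {
        let sum = sum + x;
        if sum > limit {
            ControlFlow::Break(sum)
        } else {
            ControlFlow::Continue(sum)
        }
    });
    match flow {
        ControlFlow::Break(sum) => Some(sum),
        ControlFlow::Continue(_) => None,
    }
}
```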

I think we can special case ! here, so that we can just break with a value, and have it do the right thing. ! is already very special, so I don't think that this would be that big of an extension.

Okay, that at least is some argument. I'm not terribly convinced by it; the rustfix is next to trivial and the compiler should be able to provide a helpful and relevant error message, so even if people use outdated guides, eventually they'll get it right.

First of all, I think you're being inconsistent about your preference for consistency.

Second of all, ! isn't all that special; it's just a fancy empty enum. A couple of posts later I gave a more general example with Result of basically the same problem: the generator would be able to unreasonably constrain the type that the loop can return, which would have to be worked around with combinators (and in the motivating use case, no less). ! is just a particularly striking example of this. When every single use case brought up for the feature necessitates the addition of special cases and extra constructs like combinators, it should make you question whether it was well-designed in the first place.

(I realise that I will never convince some people by appealing to conceptual purity. But this is what ignoring conceptual purity looks like.)

Well, much like @RustyYato, I'd prefer if break worked consistently, but I might accept this solution as a concession to backwards compatibility.

A fancy enum? How about being the only "type" that can be directly coerced to any other type? It is already special. In fact did you know that ! isn't actually a type? Currently it can only be used in the return type of a function (ignoring stability bugs). That will be changing soon, in

So, the never type is quite special already. Once ! becomes just another uninhabited type, we can extend this special casing to all uninhabited types.

Actually, thinking about this more, the never type wouldn't need to be special cased.

That is a good example, and yes, in my model it would require a combinator. Another reason to keep combinators is that it makes Rust really easy to demystify: just look at how simple the for loop desugaring currently is. Adding generators will make it a bit more complex, but not by much. Whereas fundamentally changing break will create complications.

We need those combinators anyway, so they aren't just extra bits. That's like asking for Iterator without any of its combinators. It just wouldn't be powerful at all.

For me conceptual purity looks like making as few special cases as possible, while still retaining the power of other solutions. I think the generator solution using combinators works out quite well, because it is so simple in design.

Any other uninhabited enum can also be easily converted to any type; it's just a match x {}. There is vacuously only one way to perform such a conversion, so the question of whether to make it implicit is purely a matter of convenience, not correctness; it's just that Rust mostly avoids implicit conversions. The one from ! was added mostly for backwards-compatibility reasons, so that code continues to compile after the type of return changes from () to !. So ! is a bit special, but not enough to justify singling it out and introducing wildly non-uniform type system behaviour wherever it appears.
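
A minimal sketch of that conversion on stable Rust. Void, absurd and unwrap_infallible are illustrative names, not standard-library items.

```rust
// An enum with no variants: no value of this type can ever exist.
enum Void {}

// Explicit conversion from an uninhabited type to any T.
// The empty match is exhaustive because there are no variants to cover.
fn absurd<T>(x: Void) -> T {
    match x {}
}

// Typical use: discharging an error branch that can never be taken.
fn unwrap_infallible<T>(r: Result<T, Void>) -> T {
    match r {
        Ok(v) => v,
        Err(e) => absurd(e),
    }
}
```

The only difference from ! is that the conversion has to be written out by hand.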

In other words, yes, I am quite familiar with that RFC. I know it's not stable yet, we'll get there, don't worry.

Have you thought about how this is going to interact with generics and non-exhaustive empty enums? The whole question of whether a type is inhabited or not is going to significantly complicate reasoning about types wherever a for loop appears.

Decoupling the return type of the loop from the return type of the generator avoids this problem. This is what the Option<_> proposal does. There are other possibilities, of course; one may introduce for-else instead, or whatever keyword you want, I'm not particularly attached to the else keyword*. In the generators thread, I quite liked @canndrew's proposal, except the continue bit. Personally, I'd be more or less just as happy with that instead.

Consider then that the Option<_> proposal (in its pure form, at least) introduces fewer special cases and is much simpler than the one you champion, with inhabitedness checking. And the for-else proposal I linked above (fine, for-then) is just as expressive as both of ours.

What I meant by conceptual purity is making features orthogonal: useful independently of each other and interacting in simple, predictable ways. A well-designed for loop would be usable on its own, even if the language did not have any combinators at all. (Combinators aren't going anywhere, of course. But the for loop should still be usable as it stands.) The generator proposal hopelessly tangles together for loops and combinators even in its motivating use case (I cannot stress this enough), and you're suggesting to add reasoning about inhabitedness to the mix? No, just no.


* It does generalise well, though. Read especially the last bit. What a shame that issue got derailed by people who don't understand type theory.

3 posts were split to a new topic: How special is the ! type?

Wondering about the semicolon with a different approach: why not have something to avoid that?

for x in iter {
    dreak 0  // dreak could be anything or even break_raw
}  // <--

In the above case, dreak could make the for loop automatically get the inner value.

Or even using semicolon for control, nice idea but may be confusing to beginners at first.

Either way, I still think a match followed by a for loop looks ugly, and it could be yet another iterator-like option.

match for x in iter { if x == 4 { break x } } {
    None => /* then */,
    Some(_) => /* else */,
}

I'm not sure what you are proposing, could you show some examples?

While we're adding Option as a possible value of a block, should we also change if?

if true {1} else {2} evaluates to i32, but if true {1} is (). Should if true {1} evaluate to Option<i32> (Some(1)) instead?

That would definitely be more consistent, but it would then have to follow that if false { 1 } would evaluate to None.
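
For what it's worth, the standard library already offers this behaviour as a method rather than as syntax: bool::then returns Some of the closure's result when the condition holds and None otherwise. A tiny sketch, with if_as_option being a made-up name:

```rust
// What `if cond { 1 }` would evaluate to under the Option reading.
fn if_as_option(cond: bool) -> Option<i32> {
    cond.then(|| 1)
}
```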

Perhaps each kind of block should have an associated return type. A for loop could return a

enum ForReturn<T, I: Iterator> {
    Finished,
    Broken {
        value: T,
        remainder_of_the_iterator: I,
    },
}

and similar for other blocks. It could have more information associated rather than only the value, as the remainder of the iterator in the example. Then, the unit type () would need some type coercion from ForReturn<T> for backward compatibility. Not sure if that is possible; we certainly do not want such coercion from Option.
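
To see what such a value would carry, here is a rough library-level sketch in today's Rust. The helper for_with_remainder is hypothetical, and the loop body is modelled as a closure that returns Some(value) to break and None to continue.

```rust
enum ForReturn<T, I: Iterator> {
    Finished,
    Broken {
        value: T,
        remainder_of_the_iterator: I,
    },
}

// Drives `iter` until `body` asks to break, then hands back the break value
// together with whatever is left of the iterator.
fn for_with_remainder<I, T, F>(mut iter: I, mut body: F) -> ForReturn<T, I>
where
    I: Iterator,
    F: FnMut(I::Item) -> Option<T>,
{
    while let Some(item) = iter.next() {
        if let Some(value) = body(item) {
            return ForReturn::Broken {
                value,
                remainder_of_the_iterator: iter,
            };
        }
    }
    ForReturn::Finished
}
```

Breaking at 4 while iterating over 1..10 would leave 5 through 9 in remainder_of_the_iterator.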

The case of if feels weird. I would like if true {x} and if true {x} else {x} to return the same thing, but that seems impossible. So, proceeding as above, we could have IfReturn and perhaps IfElseReturn. Their difference could cause confusion. Also, IfElseReturn<T> would require coercion to T.

And of course, this helps nothing with the issue of the semicolons.