Pre-RFC: Break with value in for/while loops

Okay, that at least is some argument. I'm not terribly convinced by it; the rustfix is next to trivial and the compiler should be able to provide a helpful and relevant error message, so even if people use outdated guides, eventually they'll get it right.

First of all, I think you're being inconsistent about your preference for consistency.

Second of all, ! isn't all that special; it's just a fancy empty enum. A couple of posts later I gave a more general example with Result, of basically the same problem that the generator would be able to unreasonably constrain the type that the loop can return, which would have to be worked around with combinators (and in the motivating use case, no less). ! is just a particularly striking example of this. When every single use case brought up for the feature necessitates the addition of special cases and extra constructs like combinators, it should make you question whether it was well-designed in the first place.

(I realise that I will never convince some people by appealing to conceptual purity. But this is what ignoring conceptual purity looks like.)

Well, much like @RustyYato, I'd prefer if break worked consistently, but I might accept this solution as a concession to backwards compatibility.

A fancy enum? How about being the only "type" that can be directly coerced to any other type? It is already special. In fact did you know that ! isn't actually a type? Currently it can only be used in the return type of a function (ignoring stability bugs). That will be changing soon, in

So, the never type is quite special already. Once ! becomes just another uninhabited type, we can extend this special casing to all uninhabited types.

Actually, thinking about this more, the never type wouldn't need to be special cased.

That is a good example, and yes, in my model it would require a combinator. Another reason to keep combinators is that they make Rust really easy to demystify; just look at how simple the for loop desugaring currently is. Adding generators will make it a bit more complex, but not by much, whereas fundamentally changing break will create complications.
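For readers who have not seen it, the desugaring mentioned above can be sketched roughly as follows. This is a simplified hand-expansion, not the compiler's exact output, and the function name `sum_until_negative` is made up for illustration.

```rust
// Roughly what `for x in values { ... }` expands to today (simplified).
fn sum_until_negative(values: Vec<i32>) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(x) => {
                // This is the body of the original `for` loop.
                if x < 0 {
                    break; // a plain `break`: the loop evaluates to ()
                }
                total += x;
            }
            // Exhaustion also ends the loop with ().
            None => break,
        }
    }
    total
}
```

Both exits produce `()`, which is why the question of what a `break` with a value should mean only arises once one of them carries data.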

We need those combinators anyway, so they aren't just extra bits. That's like asking for Iterator without any of its combinators. It just wouldn't be powerful at all.

For me conceptual purity looks like making as few special cases as possible, while still retaining the power of other solutions. I think the generator solution using combinators works out quite well, because it is so simple in design.


Any other uninhabited enum can also be easily converted to any type; it's just a match x {}. There is vacuously only one way to perform such a conversion, so the question of whether to make it implicit is purely a matter of convenience, not correctness; it's just that Rust mostly avoids implicit conversions. The one from ! was added mostly for backwards-compatibility reasons, so that code continues to compile after the type of return changes from () to !. So ! is a bit special, but not enough to justify singling it out and introducing wildly non-uniform type system behaviour wherever it appears.
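The `match x {}` conversion can be written out on stable Rust with a user-defined empty enum. A minimal sketch; the names `Void`, `absurd` and `unwrap_infallible` are chosen for illustration and are not from the thread:

```rust
// A user-defined uninhabited type, standing in for `!`.
enum Void {}

// Converts a (nonexistent) `Void` value into any type `T`.
// The empty match is exhaustive because `Void` has no variants.
fn absurd<T>(v: Void) -> T {
    match v {}
}

// A result whose error type is uninhabited can be unwrapped
// without any panicking path.
fn unwrap_infallible<T>(r: Result<T, Void>) -> T {
    match r {
        Ok(value) => value,
        Err(never) => absurd(never),
    }
}
```

The only difference from `!` here is that the conversion has to be spelled out explicitly instead of happening by coercion.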

In other words, yes, I am quite familiar with that RFC. I know it's not stable yet, we'll get there, don't worry.

Have you thought about how this is going to interact with generics and non-exhaustive empty enums? The whole question of whether a type is inhabited or not is going to significantly complicate reasoning about types wherever a for loop appears.

Decoupling the return type of the loop from the return type of the generator avoids this problem. This is what the Option<_> proposal does. There are other possibilities, of course; one may introduce for-else instead, or whatever keyword you want, I'm not particularly attached to the else keyword*. In the generators thread, I quite liked @canndrew's proposal, except the continue bit. Personally, I'd be more or less just as happy with that instead.

Consider then that the Option<_> proposal (in its pure form, at least) introduces fewer special cases and is much simpler than the one you champion, with inhabitedness checking. And the for-else proposal I linked above (fine, for-then) is just as expressive as both of ours.

What I meant by conceptual purity is making features orthogonal: useful independently of each other and interacting in simple, predictable ways. A well-designed for loop would be usable on its own, even if the language did not have any combinators at all. (Combinators aren't going anywhere, of course. But the for loop should still be usable as it stands.) The generator proposal hopelessly tangles together for loops and combinators even in its motivating use case (I cannot stress this enough), and you're suggesting to add reasoning about inhabitedness to the mix? No, just no.


* It does generalise well, though. Read especially the last bit. What a shame that issue got derailed by people who don't understand type theory.


3 posts were split to a new topic: How special is the ! type?

Wondering about the semicolon with a different approach, why not have something to void that?

for x in iter {
    dreak 0  // dreak could be anything or even break_raw
}  // <--

In the above case, dreak could make the for loop automatically evaluate to the inner value.

Or even using semicolon for control, nice idea but may be confusing to beginners at first.

Either way, I still think a match followed by a for loop looks ugly, and it can be just another iterator-like option.

match for x in iter { if x == 4 { break x } } {
    None => /* then */,
    Some(_) => /* else */,
}
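For comparison, the same shape can already be written today with an iterator adapter in place of the loop. A small sketch, assuming the goal is to find the first element equal to 4; the function name `classify` and the returned strings are made up:

```rust
fn classify(values: &[i32]) -> &'static str {
    // `find` plays the role of the loop with `break x`:
    // Some(_) means the search stopped early, None means exhaustion.
    match values.iter().find(|&&x| x == 4) {
        None => "exhausted",
        Some(_) => "found 4",
    }
}
```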

I'm not sure what you are proposing, could you show some examples?

While we're adding Option as a possible value of a block, should we also change if?

if true {1} else {2} evaluates to i32, but if true {1} is (). Should if true {1} evaluate to Option<i32> (Some(1)) instead?

That would definitely be more consistent, but it would then have to follow that if false { 1 } evaluates to None.
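The Option-valued reading of an else-less if is already available as a library method, `bool::then`, which may help when judging whether a language change is needed. A minimal sketch with hypothetical function names:

```rust
// What `if cond {1}` would mean under the Option-returning proposal,
// expressed with the existing `bool::then` method.
fn maybe_one(cond: bool) -> Option<i32> {
    cond.then(|| 1)
}

// The same thing spelled out with today's if/else.
fn spelled_out(cond: bool) -> Option<i32> {
    if cond { Some(1) } else { None }
}
```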

Perhaps each kind of block should have an associated return type. A for loop could return a

enum ForReturn<T, I: Iterator> {
    Finished,
    Broken {
        value: T,
        remainder_of_the_iterator: I,
    },
}

and similar for other blocks. It could have more information associated rather than only the value, as the remainder of the iterator in the example. Then, the unit type () would need some type coercion from ForReturn<T> for backward compatibility. Not sure if that is possible; we certainly do not want such coercion from Option.
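To make the idea concrete, here is one way such a result could be produced by an ordinary function today. This is only a sketch: `run_for` is a hypothetical helper, and the closure returning `Option<T>` stands in for a loop body that may `break` with a value.

```rust
enum ForReturn<T, I: Iterator> {
    Finished,
    Broken {
        value: T,
        remainder_of_the_iterator: I,
    },
}

// Hypothetical helper: runs `body` on each item; when the body returns
// Some(value), stop and hand back the value plus the unconsumed iterator.
fn run_for<I, T, F>(iterable: I, mut body: F) -> ForReturn<T, I::IntoIter>
where
    I: IntoIterator,
    F: FnMut(I::Item) -> Option<T>,
{
    let mut iter = iterable.into_iter();
    while let Some(item) = iter.next() {
        if let Some(value) = body(item) {
            return ForReturn::Broken {
                value,
                remainder_of_the_iterator: iter,
            };
        }
    }
    ForReturn::Finished
}
```

Calling `run_for(1..6, |x| if x == 3 { Some(x * 10) } else { None })` yields `Broken` with the value 30 and an iterator that still produces 4 and 5.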

The case of if feels weird. I would like if true {x} and if true {x} else {x} to return the same thing, but that seems impossible. So, proceeding as above, we could have IfReturn and perhaps IfElseReturn. Their difference could cause confusion. Also, IfElseReturn<T> would require coercion to T.

And of course, this helps nothing with the issue of the semicolons.

The problem I see with this is twofold.

  1. A naming bikeshed: the values are not returned; they are what the expression evaluates to.
  2. This would really clutter up the resulting types, whereas using Option<T> everywhere is, in my opinion, cleaner and easier to think about. Also, it would be nice if if {} and if {} else {} resulted in the same type, but in general they cannot: the former might not take its branch, while the latter always takes one. That maps nicely onto Some / None.

Wouldn't it be good to assume

if ... {...}

is a shorthand for

if ... {...} else {()}

?

This allows if ... {1} else {2} but makes if ... {1} an error

Update: it seems this is already the case.

for x in iter {
    dreak 0  // dreak could be anything or even break_raw
}  // <--

The loop returns an Option in the case of break, but returns the inner value directly in the case of dreak (an arbitrary name; it could be any identifier).

In other words, a loop that uses dreak acts like a pre-unwrapped Option. I'm not sure whether adding this as a keyword would be good.

let v = 'l: {for i in expr {
   ... break 'l 24; ...
} 42}

?
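This labelled-block form compiles on recent stable Rust (break with a value from a labelled block was stabilised in 1.65). A self-contained sketch, with a made-up function name and a made-up condition standing in for the elided parts:

```rust
fn first_even_or_default(values: &[i32]) -> i32 {
    'l: {
        for &i in values {
            if i % 2 == 0 {
                break 'l i; // leave the labelled block with a value
            }
        }
        42 // reached only if the loop ran to exhaustion
    }
}
```

The trailing expression plays the role of the "else"/"then" block, and no Option is involved.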


So, you just want to use a new keyword instead of break?

That would be one possible solution to the semicolon issue, if it were available.

Yes, that is a nice way to make equivalent code (and not forget the semicolon).

The problem can also be solved with iterators by just putting the body of the loop as a function.

    let w = (0..10).find_map(|t| {
        println!("doing {}", t);
        if t == 5 {
            Some(0.6)
        } else {
            None
        }
    });

How many ways do we want to have to solve the problem?

Making all blocks able to return values sounds good for coherence. But that would entail other changes in a pretty stable part of Rust, such as some blocks requiring semicolons and others not. I can see that making sense in a new language, but not so much as a change to an existing one.

Anyway, if I see someone wanting the feature I look for the best way to make it fit.


Coming from https://github.com/rust-lang/rfcs/issues/1767#issuecomment-583691737 with my proposal dumped below. I think the dispute over whether or not the loop should return an Option is completely unnecessary given the proper desugaring rule.

I would really love to have this feature, so I try to make my proposal:

  • be useful in everyday situations
  • have a clear and intuitive formal semantics and typing rule
  • avoid the semantical confusion of else

Long story short, the while loop is followed by a then clause:

while BOOL {BLOCK1} then {BLOCK2}

This should be desugared to, and therefore have the same type and semantics as:

loop {if (BOOL) {BLOCK1} else {break {BLOCK2}}}

just as the usual while loop

while BOOL {BLOCK1} // then {}

has always been desugared to

loop {if (BOOL) {BLOCK1} else {break {}}}

It requires a bit more care for for but the story remains basically the same.

Note that the break in the then clause is harmless but redundant (just as the last expression in a function prefixed with return), since it will be desugared to break (break ...).
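For the for case, the corresponding hand-written form can be sketched on stable Rust as a loop over `next()`, with the None arm standing in for the proposed then block. The function name and the "none" fallback are illustrative assumptions, not part of the proposal:

```rust
fn first_long_word<'a>(words: &[&'a str]) -> &'a str {
    let mut iter = words.iter();
    loop {
        match iter.next() {
            // BLOCK1: the ordinary loop body, which may `break` with a value.
            Some(&word) => {
                if word.len() > 5 {
                    break word;
                }
            }
            // BLOCK2: the hypothetical `then` block, run on exhaustion.
            None => break "none",
        }
    }
}
```

Both `break` expressions must agree on a type, which is the typing rule the proposal relies on.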

The choice of then over else or final is explained in #961

I would suggest then instead of final, since in all currently popular languages where it exists, final(ly) means the exact opposite of being executed only when the loop was not exited by break: it means being executed no matter what. then would avoid the sort of naming tragedy that return became in the Haskell community.

then also avoids the semantic confusion brought by else, since it naturally has a sequential meaning (I eat, then I walk) in parallel with its role in the conditional combination (if/then). In places where it joins two blocks ({ ... } then { ... }) instead of a boolean and a block (x<y then { ... }), the sequential semantics prevails intuitively.

This syntax can be used wherever the loop is meant to find something rather than to do something. Without this feature, we usually do the finding and then put the result somewhere, which is a clumsy emulation of simply finding something.

For example:

while l<=r {
  let m = (l+r)/2;
  if a[m] < v {
    l = m+1
  } else if a[m] > v {
    r = m-1
  } else {
    break Some(m)
  }
} then {
  println!("Not found");
  None
}

which means:

loop {
  if (l<=r) {
    let m = (l+r)/2;
    if a[m] < v {
      l = m+1
    } else if a[m] > v {
      r = m-1
    } else {
      break Some(m)
    }
  } else {
    break {
      println!("Not found");
      None
    }
  }
}

Even this desugared version is cleaner than something like

{
  let mut result = None;
  while l<=r {
    let m = (l+r)/2;
    if a[m]<v {
      l = m+1
    } else if a[m]>v {
      r = m-1
    } else {
      result = Some(m);
      break
    }
  }
  if result==None {
    println!("Not found");
  }
  result
}
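The desugared form is already valid Rust, since loop supports break with a value. Below is a runnable sketch of it as a function. Two things are assumptions added here and not part of the post: the empty-slice check and the `m == 0` guard, both of which avoid usize underflow; the println! from the then block is omitted.

```rust
fn search(a: &[i32], v: i32) -> Option<usize> {
    if a.is_empty() {
        return None; // assumption: nothing to search, and `len() - 1` would underflow
    }
    let (mut l, mut r) = (0usize, a.len() - 1);
    loop {
        if l <= r {
            let m = l + (r - l) / 2; // same midpoint as (l+r)/2, without overflow
            if a[m] < v {
                l = m + 1;
            } else if a[m] > v {
                if m == 0 {
                    break None; // assumption: guard against `0 - 1` on usize
                }
                r = m - 1;
            } else {
                break Some(m);
            }
        } else {
            // This arm is where the proposed `then` block would go.
            break None;
        }
    }
}
```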

I really like this solution. However, having proposed that syntax before I know that you will have two problems.

  • then will need to become a new keyword, because that syntax would currently be accepted.
  • therefore this could only be done in the next edition.

In this case I would say that else remains an option since it is rather justified and explainable by the desugaring rule

loop {if (BOOL) {BLOCK1} else {break {BLOCK2}}}

So it's just for-else with a different keyword. This option has been brought up before, but it's unclear how widely it is supported. (Personally though, I find it satisfactory enough.)

So why not have a straw poll? The syntax can be bikeshedded later.

  • All loops directly return the value passed to break; if a for/while loop is exhausted, execute an extra block (which defaults to an empty block, evaluating to ())
  • for and while loops return an Option<_> (with Some wrapping the value passed to break and None representing an exhausted loop) in some or all circumstances
  • for loops evaluate to the value that the generator returns when it finishes, break can override the value; while loops are left as-is
  • Status quo: only allow loop loops to break with values
