Pre-RFC: Break with value in for/while loops

for x in iter {
    dreak 0  // dreak could be anything or even break_raw
}  // <--

With break, the loop evaluates to an Option; with dreak (an arbitrary placeholder name, any identifier would do), it evaluates to the inner value directly.

In other words, a loop that uses dreak behaves as if its Option result had already been unwrapped. I'm not sure whether adding this as a keyword would be a good idea.

let v = 'l: {for i in expr {
   ... break 'l 24; ...
} 42}

?


So, you just want to use a new keyword instead of break?

That would be one solution to the semicolon issue, if it were available.

Yes, that is a nice way to make equivalent code (and not forget the semicolon).

The problem can also be solved with iterators by just putting the body of the loop as a function.

    let w = (0..10).find_map(|t| {
        println!("doing {}", t);
        if t == 5 {
            Some(0.6)
        } else {
            None
        }
    });

How many ways do we want to have to solve the problem?

Making all blocks able to return values sounds good for coherence. But it would entail other changes in a pretty stable part of Rust, such as some blocks requiring semicolons and others not. I can see that making sense in a new language, but not so much as a change to an existing one.

Anyway, if I see someone wanting the feature I look for the best way to make it fit.


Coming from https://github.com/rust-lang/rfcs/issues/1767#issuecomment-583691737 with my proposal dumped below. I think the dispute over whether or not the loop should return an Option is completely unnecessary given the proper desugaring rule.

I would really love to have this feature, so I try to make my proposal:

  • be useful in everyday situations
  • have a clear and intuitive formal semantics and typing rule
  • avoid the semantical confusion of else

Long story short, the while loop is followed by a then clause:

while BOOL {BLOCK1} then {BLOCK2}

This should be desugared to, and therefore have the same type and semantics as:

loop {if (BOOL) {BLOCK1} else {break {BLOCK2}}}

just as the usual while loop

while BOOL {BLOCK1} // then {}

has always been desugared to

loop {if (BOOL) {BLOCK1} else {break {}}}
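
As a quick sanity check of the equivalence described above, here is a minimal sketch in today's Rust. The function names are made up for illustration, and `break ()` is written in place of `break {}` because an empty block evaluates to `()`.

```rust
// Collects n, n-1, ..., 1 using an ordinary `while` loop.
fn countdown_while(mut n: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    while n > 0 {
        seen.push(n);
        n -= 1;
    }
    seen
}

// The same loop written as `loop` + `if`/`else` + `break`,
// mirroring the desugaring rule quoted above.
fn countdown_loop(mut n: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    loop {
        if n > 0 {
            seen.push(n);
            n -= 1;
        } else {
            // Corresponds to the implicit, empty `then {}` block.
            break ();
        }
    }
    seen
}

fn main() {
    // Both forms visit the same values in the same order.
    assert_eq!(countdown_while(3), countdown_loop(3));
    println!("{:?}", countdown_loop(3));
}
```

Both functions have the same behavior; the proposal amounts to letting the user fill in the `else` branch of the second form.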

It requires a bit more care for for but the story remains basically the same.

Note that a break inside the then clause is harmless but redundant (just like prefixing the last expression of a function with return), since it will be desugared to break (break ...).

The choice of then over else or final is explained in #961

I would suggest then instead of final, since in all currently popular languages where it exists, final(ly) means the exact opposite of "executed only if the loop was not exited by break": it means "executed no matter what". then would avoid the sort of naming tragedy that return became in the Haskell community.

then also avoids the semantical confusion brought by else, since it naturally has a sequential meaning (I eat, then I walk) in parallel with its role in the conditional combination (if/then). In places where it joins two blocks ({ ... } then { ... }) instead of a boolean and a block (x<y then { ... }), the sequential semantics prevails intuitively.

This syntax can be used wherever the loop is meant to find something instead of to do something. Without this feature, we usually do the finding and then put the result somewhere, which is a clumsy emulation of simply finding something.

For example:

while l<=r {
  let m = (l+r)/2;
  if a[m] < v {
    l = m+1
  } else if a[m] > v {
    r = m-1
  } else {
    break Some(m)
  }
} then {
  println!("Not found");
  None
}

which means:

loop {
  if (l<=r) {
    let m = (l+r)/2;
    if a[m] < v {
      l = m+1
    } else if a[m] > v {
      r = m-1
    } else {
      break Some(m)
    }
  } else {
    break {
      println!("Not found");
      None
    }
  }
}

Even this desugared version is cleaner than something like

{
  let mut result = None;
  while l<=r {
    let m = (l+r)/2;
    if a[m]<v {
      l = m+1
    } else if a[m]>v {
      r = m-1
    } else {
      result = Some(m);
      break
    }
  }
  if result==None {
    println!("Not found");
  }
  result
}
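
For reference, the desugared loop form already works on stable Rust, since loop can break with a value. The following is only a sketch: the function name `find` and the signed index type are my own assumptions (signed indices keep `hi = mid - 1` from underflowing when `mid` is 0), not part of the proposal.

```rust
// Binary search over a sorted slice, written in the desugared `loop` form.
fn find(a: &[i32], v: i32) -> Option<usize> {
    let mut lo: isize = 0;
    let mut hi: isize = a.len() as isize - 1;
    loop {
        if lo <= hi {
            let mid = lo + (hi - lo) / 2;
            let x = a[mid as usize];
            if x < v {
                lo = mid + 1;
            } else if x > v {
                hi = mid - 1;
            } else {
                // Early exit: the loop evaluates to this value.
                break Some(mid as usize);
            }
        } else {
            // Plays the role of the proposed `then` block.
            break None;
        }
    }
}

fn main() {
    let a = [1, 3, 5, 7, 9];
    println!("{:?}", find(&a, 7));
    println!("{:?}", find(&a, 4));
}
```

No mutable `result` variable is needed, which is the point the comparison above is making.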

I really like this solution. However, having proposed that syntax before I know that you will have two problems.

  • then will need to be a new keyword, because that syntax would currently be accepted.
  • therefore it can only be introduced in the next edition.

In this case I would say that else remains an option since it is rather justified and explainable by the desugaring rule

loop {if (BOOL) {BLOCK1} else {break {BLOCK2}}}

So it's just for/else with a different keyword. This option has been brought up before, but it's unclear how widely it is supported. (Personally though, I find it satisfactory enough.)

So why not have a straw poll? The syntax can be bikeshedded later.

  • All loops directly return the value passed to break; if a for/while loop is exhausted, execute an extra block (which defaults to an empty block, evaluating to ())
  • for and while loops return an Option<_> (with Some wrapping the value passed to break and None representing an exhausted loop) in some or all circumstances
  • for loops evaluate to the value that the generator returns when it finishes, break can override the value; while loops are left as-is
  • Status quo: only allow loop loops to break with values


Just to advertise the extra-block proposal:

  • It is at least as good as the status quo because it is backward compatible and it is a natural generalization for a sound and intuitive reason (see here). In other words, it is not really an "extra block" but rather an indispensable component that has been hidden and left implicit.
  • It is better than the Option proposal because of
    • backward compatibility
    • more natural behavior in mutual simulation (see below) due to intrinsic generality
  • It is better than the for with return-by-generator proposal because of
    • backward compatibility
    • that it is impossible to simulate the extra-block behavior with the return-by-generator for without heavy engineering, because the loop's type is forced to coincide with the return type of the generator, which to me is an awkward byproduct of the design, especially when the loop is exited by a break inside it
    • on the other hand, it is seamless to simulate the return-by-generator behavior with an extra block, if it is ever worth simulating
    • that it is too ad hoc to force the for loop to evaluate to something as specific as what the generator left over, even though it might not be completely useless in specific cases
// Mutual simulation between `while` with extra-block and Option

// from extra-block to Option
while BOOL {
  ...
  break Some(x)
  ...
} else-or-whatever {
  None
}

// from Option to extra-block
match (while BOOL {
  ...
  break x
  ...
}) {
  None => {extra-block},
  Some(x) => x,
}
// Simulation of return-by-generator `for` with an extra block

for i in GENERATOR {
  ...
  break x
  ...
} else-or-whatever(result) {
  result
}

// Simulation of the extra block with a return-by-generator `for`
for i in GENERATOR-RETURNING-TYPE-I32 {
  ...
  break OOPS-IMPOSSIBLE-TO-RETURN-A-BOOL
  // unless you are willing to do some surgery on the generator
}
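
The "from Option to extra-block" direction above can be approximated in today's Rust with an iterator search followed by a match. This is only a sketch under my own assumptions: the function name `first_even_or` and the fallback parameter are hypothetical, and `Iterator::find` stands in for a loop that breaks with a value.

```rust
// Returns the first even element, or `fallback` if the search is exhausted.
fn first_even_or(xs: &[i32], fallback: i32) -> i32 {
    // Option-style result: Some(x) on an early exit, None when exhausted.
    let found: Option<i32> = xs.iter().copied().find(|x| x % 2 == 0);
    match found {
        // The value that `break x` would have produced.
        Some(x) => x,
        // This arm plays the role of the extra block.
        None => fallback,
    }
}

fn main() {
    println!("{}", first_even_or(&[1, 3, 4, 6], -1));
    println!("{}", first_even_or(&[1, 3, 5], -1));
}
```

Note that both match arms must have the same type, which is exactly the constraint the extra-block proposal places on break values and the trailing block.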

Python has this feature. In an informal survey of Python users less than 25% of respondents could select the correct answer for how it is evaluated. Most troublingly, more than 55%* selected an answer with the wrong semantics (as opposed to the other options - that they don't know or that they think it isn't valid).

I know that people will continue discussing how to add this feature indefinitely, but every variation on the design was considered and rejected by the lang team several years ago. We have very little bandwidth and we are unlikely to revisit this question without strong, new motivating arguments.

*There were three other options that were wrong semantics. The single most popular wrong semantics was chosen by more than one and a half times as many people as the correct answer.


This might be less of a problem for Rust, though. Given that loops in Rust are expressions, they always have to evaluate to something, so one can always ask ‘if the else block only runs if the loop doesn't run at all, what does the loop evaluate to otherwise?’ or ‘if a loop is broken with a value, where does the value of the else block go?’. When the user asks themself these questions, they'll realise they're probably missing something. In Python, loops are not expressions, so those questions are meaningless, and therefore it's easier to end up with a misconception. Given that this feature has been specifically requested so that all loops can return meaningful values, I expect usages where such questions are relevant and illuminating to be more common.

Not to mention users won't be misled by the keyword being else if we choose a different one. Perhaps finish?


I would expect if Python had used that keyword (or then or whatever), that the option "The block always executes after the loop runs" would have been the most popular - another wrong answer.


Not entirely inconceivable. But that would be redundant to just writing the code after the loop, without any extra keywords. Someone who stops for a second to think about it would realise this. And the other obvious interpretation is the correct one.

The point is that the poll is just one data point (a nine-year-old one, by the way, and by its own admission 'totally unscientific'). It's not so clear which factors influenced it the most, and in Rust's case some of them could be mitigated.


What does break with value provide that just shoving it inside a closure doesn't? This looks like the definition of "syntactic sprinkles": you don't need it and it doesn't really make a difference to the code anyway.

A very contrived but self-contained example (unlike the OP):

fn main() {
    let temp: i32 = (|| -> i32 {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                return q;
            }
        }
        return q;
    })();
    println!("example: {}", temp);
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a98e89ec9a6a1940e80d0997c8a1cbc3

Yeah I know it will only print out 6, but it is just an example. A closure is like 10 extra characters no matter the size of what is inside, and instead of breaks you can just use returns and it all still works.

Control flow cannot escape a closure, but it can escape a labelled block. Say, the ? operator can only bubble up to the closure call site; if you want to pass errors along out of the containing function, you’d have to use ? twice.

Can you provide an example that illustrates this deficiency in the closure approach?

Adapting your own example:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ba97c5977b528e792dc37efa2901e9cf
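
The linked playground is not reproduced here. As a rough sketch of the kind of problem meant (the function names and the string-parsing step are my own assumptions, not taken from the link): inside a closure, `?` only returns from the closure, so a second `?` is needed at the call site, whereas inside a labelled `loop` it returns from the enclosing function directly.

```rust
use std::num::ParseIntError;

// Closure version: `?` inside the closure only exits the closure.
fn sum_with_closure(items: &[&str]) -> Result<i32, ParseIntError> {
    let total = (|| -> Result<i32, ParseIntError> {
        let mut q = 0;
        for s in items {
            q += s.parse::<i32>()?; // returns from the closure only
            if q == 6 {
                return Ok(q);
            }
        }
        Ok(q)
    })()?; // a second `?` is needed to leave the function
    Ok(total)
}

// Labelled-loop version: `?` returns from the function itself.
fn sum_with_loop(items: &[&str]) -> Result<i32, ParseIntError> {
    let total = 'temp: loop {
        let mut q = 0;
        for s in items {
            q += s.parse::<i32>()?; // returns from sum_with_loop
            if q == 6 {
                break 'temp q;
            }
        }
        break q;
    };
    Ok(total)
}

fn main() {
    println!("{:?}", sum_with_closure(&["1", "2", "3", "4"]));
    println!("{:?}", sum_with_loop(&["1", "x"]).is_err());
}
```

The same asymmetry applies to `return`, `break` and `continue` aimed at an outer scope: none of them can cross the closure boundary.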

Okay I see your point

With labeled blocks, you can do

#![feature(label_break_value)]

fn main() {
    let temp: i32 = 'temp: {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                break 'temp q;
            }
        }

        q
    };
    println!("example: {}", temp);
}

Or on stable,

fn main() {
    // abusing loop, even though there is no actual loop
    let temp: i32 = 'temp: loop {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                break 'temp q;
            }
        }

        break q;
    };
    println!("example: {}", temp);
}

With the help of a macro you can almost get the first version on stable. This doesn't have any of the issues a closure presents.
