Pre-RFC: Break with value in for/while loops

exprosic · February 8, 2020, 5:46pm

Just to advertise the extra-block proposal:

It is at least as good as the status quo because it is backward compatible and it is a natural generalization for a sound and intuitive reason (see here). In other words, it is not really an "extra block" but rather an indispensable component that has been hidden and left implicit.
It is better than the Option proposal because of
- backward compatibility
- more natural behavior in mutual simulation (see below) due to intrinsic generality
It is better than the for with return-by-generator proposal because of
- backward compatibility
- that it is impossible to simulate the extra-block behavior with the return-by-generator for without heavy engineering, because the loop's type is forced to coincide with the return type of the generator, which to me is an awkward byproduct of the design, especially when the loop is exited through an inner break
- on the other hand, it is seamless to simulate the return-by-generator behavior with an extra block, if it is ever worth simulating
- that it is too ad hoc to force the for loop to evaluate to something as specific as what the generator left over, even though it might not be completely useless in specific cases

// Mutual simulation between `while` with extra-block and Option

// from extra-block to Option
while BOOL {
  ...
  break (Some x)
  ...
} else-or-whatever {
  None
}

// from Option to extra-block
match (while BOOL {
  ...
  break x
  ...
}) {
  None => {extra-block},
  Some(x) => x,
}

// Simulation of return-by-generator `for` with an extra block

for i in GENERATOR {
  ...
  break x
  ...
} else-or-whatever(result) {
  result
}

// Simulation of the extra block with a return-by-generator `for`
for i in GENERATOR-RETURNING-TYPE-I32 {
  ...
  break OOPS-IMPOSSIBLE-TO-RETURN-A-BOOL
  // unless you are willing to do some surgery on the generator
}

withoutboats · February 9, 2020, 5:30pm

Python has this feature. In an informal survey of Python users less than 25% of respondents could select the correct answer for how it is evaluated. Most troublingly, more than 55%* selected an answer with the wrong semantics (as opposed to the other options - that they don't know or that they think it isn't valid).

I know that people will continue discussing how to add this feature indefinitely, but every variation on the design was considered and rejected by the lang team several years ago. We have very little bandwidth and we are unlikely to revisit this question without strong, new motivating arguments.

*There were three other options that were wrong semantics. The single most popular wrong semantics was chosen by more than one and a half times as many people as the correct answer.

felix.s · February 10, 2020, 11:47am

This might be less of a problem for Rust, though. Given that loops in Rust are expressions, they always have to evaluate to something, so one can always ask ‘if the else block only runs if the loop doesn't run at all, what does the loop evaluate to otherwise?’ or ‘if a loop is broken with a value, where does the value of the else block go?’. When the user asks themself these questions, they'll realise they're probably missing something. In Python, loops are not expressions, so those questions are meaningless, and therefore it's easier to end up with a misconception. Given that this feature has been specifically requested so that all loops can return meaningful values, I expect usages where such questions are relevant and illuminating to be more common.

Not to mention users won't be misled by the keyword being else if we choose a different one. Perhaps finish?

withoutboats · February 10, 2020, 2:14pm

I would expect if Python had used that keyword (or then or whatever), that the option "The block always executes after the loop runs" would have been the most popular - another wrong answer.

felix.s · February 10, 2020, 3:10pm

Not entirely inconceivable. But that would be redundant to just writing the code after the loop, without any extra keywords. Someone who stops for a second to think about it would realise this. And the other obvious interpretation is the correct one.

The point is, that poll is just one data point (by the way, a nine-year-old one and by its own admission 'totally unscientific'); it's not so clear which factors influenced it the most, and in Rust's case some of them could be mitigated.

Cazadorro · February 10, 2020, 5:37pm

What does break with value provide that just shoving it inside a closure doesn't? This looks like the definition of "syntactic sprinkles": you don't need it and it doesn't really make a difference to code anyway.

A very contrived but self-contained example (unlike the OP):

fn main() {
    // Immediately invoked closure: `return` exits the closure, not `main`.
    let temp: i32 = (|| -> i32 {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                return q;
            }
        }
        return q;
    })();
    println!("example: {}", temp);
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a98e89ec9a6a1940e80d0997c8a1cbc3

Yeah I know it will only print out 6, but it is just an example. A closure is like 10 extra characters no matter the size of what is inside, and instead of breaks you can just use returns and it all still works.

felix.s · February 10, 2020, 6:12pm

Control flow cannot escape a closure, but it can escape a labelled block. Say, the ? operator can only bubble up to the closure call site; if you want to pass errors along out of the containing function, you’d have to use ? twice.

Cazadorro · February 10, 2020, 6:13pm

can you provide an example that illustrates this deficiency in the closure system?

felix.s · February 10, 2020, 6:19pm

Adapting your own example:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ba97c5977b528e792dc37efa2901e9cf
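
The linked playground is not reproduced here. As a minimal sketch of the same point, assuming two hypothetical helpers (`sum_until_closure` and `sum_until_labelled`, neither taken from the gist), the closure version needs its own `Result` return type and a second `?` at the call site, while the labelled-loop version lets `?` return from the enclosing function directly:

```rust
use std::num::ParseIntError;

// Hypothetical helper: sums parsed inputs, stopping once the total reaches `limit`.
// Closure version: the inner `?` only exits the closure.
fn sum_until_closure(inputs: &[&str], limit: i32) -> Result<i32, ParseIntError> {
    let total = (|| -> Result<i32, ParseIntError> {
        let mut q = 0_i32;
        for s in inputs {
            q += s.parse::<i32>()?; // leaves the closure, not the function
            if q >= limit {
                return Ok(q);
            }
        }
        Ok(q)
    })()?; // second `?`: forwards the error out of the function
    Ok(total)
}

// Hypothetical helper with the same behavior, using a labelled loop instead.
fn sum_until_labelled(inputs: &[&str], limit: i32) -> Result<i32, ParseIntError> {
    let total = 'outer: loop {
        let mut q = 0_i32;
        for s in inputs {
            q += s.parse::<i32>()?; // returns from the function directly
            if q >= limit {
                break 'outer q;
            }
        }
        break q;
    };
    Ok(total)
}
```

Both functions give the same results; the difference is only in how many layers the error has to be passed through by hand.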

Cazadorro · February 10, 2020, 6:20pm

Okay I see your point

RustyYato · February 10, 2020, 8:21pm

With labeled blocks, you can do

#![feature(label_break_value)]

fn main() {
    let temp: i32 = 'temp: {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                break 'temp q;
            }
        }

        q
    };
    println!("example: {}", temp);
}

Or on stable,

fn main() {
    // abusing loop, even though there is no actual loop
    let temp: i32 = 'temp: loop {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                break 'temp q;
            }
        }

        break q;
    };
    println!("example: {}", temp);
}

With the help of a macro you can almost get the first version on stable. This doesn't have any of the issues a closure presents.
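
The macro is not shown in the post. One possible shape, as a sketch only: `labelled_block!` and `first_prefix_sum_hit` are hypothetical names, and the macro simply wraps the body in a labelled `loop` that always breaks after one pass.

```rust
// Hypothetical macro: emulates `'label: { ... }` on stable with a labelled
// `loop` that runs its body once and then breaks with the body's value.
macro_rules! labelled_block {
    ($label:lifetime: $body:block) => {
        $label: loop {
            let value = $body;
            break value;
        }
    };
}

// Hypothetical helper: returns the first prefix sum of 0..10 equal to
// `target`, or the full sum if no prefix sum matches.
fn first_prefix_sum_hit(target: i32) -> i32 {
    labelled_block!('temp: {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == target {
                break 'temp q;
            }
        }
        q
    })
}
```

The "almost" matters: an unlabelled `break` or `continue` written directly in the body would bind to the hidden `loop`, which a real labelled block would not allow.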

kennytm · February 10, 2020, 8:52pm

IMO Pre-RFC: Break with value in for/while loops is the only sane solution in the whole thread, yet it is not included in the poll. So I voted status-quo.

generator is the next best solution, but it doesn't cover while and is too experimental to rely on.
for/else or variants of it are hard to understand, as we've seen from Python. No, renaming else to then or finally doesn't help a bit.
changing the return type of for and while from () to Option<()>, breaking all existing code just for a niche feature? The benefit doesn't justify the cost at all. This is the worst solution among everything IMO.
- if the type switches between () and Option<T> depending on whether break v; exists it is very ugly to explain.
- returning Option<T> instead of T also makes the behavior of break differ between for/while and loop.

The example written using the Default solution:

let result = while l<=r {
  let m = (l+r)/2;
  if a[m] < v {
    l = m+1
  } else if a[m] > v {
    r = m-1
  } else {
    break Some(m)
  }
}.or_else(|| {
  println!("Not found");
  None
});

(We could introduce a stricter trait such as ControlFlowDefault if Default is considered unsuitable.)

(And interestingly, with this feature, x.then(|| f()) may be written as if x { Some(f()) }.)
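
For comparison, the binary search above can already be written on stable with a plain `loop`, where the exhausted case is spelled out by hand. This is only a sketch: `binary_search` is a hypothetical helper, and it uses a half-open range instead of the closed `l<=r` range above so that `usize` indices cannot underflow.

```rust
// Hypothetical helper: stable-Rust rendering of the binary search example.
fn binary_search(a: &[i32], v: i32) -> Option<usize> {
    // Half-open search range [l, r); an assumption made here to avoid
    // computing `m - 1` on an unsigned index.
    let (mut l, mut r) = (0_usize, a.len());
    loop {
        if l >= r {
            // Under the Default proposal this arm would be implicit:
            // `Option::default()` is `None`.
            break None;
        }
        let m = l + (r - l) / 2;
        if a[m] < v {
            l = m + 1;
        } else if a[m] > v {
            r = m;
        } else {
            break Some(m);
        }
    }
}
```

The "Not found" message from the example would then be attached at the call site, for instance with `binary_search(&a, v).or_else(|| { println!("Not found"); None })`.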

Nokel81 · February 10, 2020, 9:09pm

I actually never read that comment until now.... I think it is a very interesting solution which does have some useful properties (which you mention).

exprosic · February 11, 2020, 12:44am

I don't think it ever makes sense to impose a general trait restriction (or even a type restriction, in the case of the for-return-generator proposal) just because of (the flaw of) the syntax construct. A less confusing implementation for that idea would probably be to restrict (not infer) the type of the loop to be either () or Option.

But first, as you have said, the Option seems like a patch. A Default that tries to 'unify' () and Option and all other kinds of types would just be a patch covering patches, effectively blocking the 'freedom of types' for those who don't want to implement Default, or for whom implementing it doesn't even make sense.

Second, the users will probably never write break None because when they do, it is unlikely to make sense.

I believe this sort of problem is intrinsic and will keep popping up if the solution is to break down the original design, put the functionality inside, and then glue it back together with however transparent a glue, instead of just discovering what is already there.

I would say the naming is the only problem for the extra block. Someone proposed !break, which seems pretty accurate to me, but was rejected for the 'unconventional syntax'. But anyway, the extra block is not extra, and it means more than just an implementation of a random functionality.

newpavlov · February 11, 2020, 1:01am

Personally as a proponent of the generator approach I don't care about value breaks that much. In practice I can't remember many cases of when I needed such feature. But I really hope that integration of generators into for loops will help with the following pattern which is annoyingly common:

for value in iter {
    let value = value?;
    // ...
}

It even causes ad-hoc proposals like for val? in iter { .. } instead of a more generic solution. Allowing value breaks is just a nice bonus.

Note that iteration over Iterator<Item=Result<T, E>> has different semantics from iteration over Generator<Yield=T, Return=Result<(), E>>. In the former case you may receive Ok values after an error, but in the latter one the iteration must be terminated on a first error. (Unfortunately we do not enforce this constraint using type system, but it's a different discussion...)
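
A small sketch of both points on stable Rust, using hypothetical helpers (`sum_results` and `oks_after_first_error`) and `String` as a stand-in error type: the first function is the `let value = value?;` pattern, the second shows that an iterator of `Result`s is free to keep yielding `Ok` items after an `Err`.

```rust
// Hypothetical helper: the `let value = value?;` pattern, stopping at the
// first error because `?` returns from the function.
fn sum_results<I>(iter: I) -> Result<i32, String>
where
    I: IntoIterator<Item = Result<i32, String>>,
{
    let mut total = 0;
    for value in iter {
        let value = value?;
        total += value;
    }
    Ok(total)
}

// Hypothetical helper: counts the `Ok` items that appear after the first
// `Err`. Nothing in the `Iterator` contract forces this to be zero.
fn oks_after_first_error(items: Vec<Result<i32, String>>) -> usize {
    items
        .into_iter()
        .skip_while(|item| item.is_ok()) // skip the leading `Ok` items
        .skip(1) // skip the first `Err` itself
        .filter(|item| item.is_ok())
        .count()
}
```

With a generator whose return value carries the error, the second situation could not arise, since returning ends the iteration.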

exprosic · February 11, 2020, 2:46am

I think if Python were granted another keyword, it would have no reason to use anything that deviates from its intention, namely: nobreak.

And as @felix.s has said, most of the unfortunate confusions from Python would be blocked simply by the type system of Rust. The kind of rhetorical questions like "why should I care about whether it is break-ed or not" is now legitimately answerable, since a type of a loop naturally implies a corresponding value at each possible exit of the loop, which is already part of the users' construction plan without any reminder from anyone, except perhaps from the compiler, in which case the error reflects not the flaw of the syntax but really the bug from the user.

Since the point is to return something from the loop, I would also propose the following syntax

for i in 0..n break None {
  if some_predicate(i) {break Some(i)}
}

// for generator:
for token in TOKEN_GENERATOR
  break |result| {
    println!("stopped scanning tokens");
    match result {
      Finished => println!("finished"),
      SyntaxError => println!("syntax error"),
    }
  }
{
  consume_token(token)
}

// backward compatibility
for i in 0..n /* break () */ {
  println!("{}", i)
}

Anyone who understands that the loop is not a statement but a value, which is already true, will intuitively get what it means even without referring to the documentation.

kennytm · February 11, 2020, 3:58am

There is a trait restriction in the ? operator (Try), return type of main() (Termination), input of for (IntoIterator), input of match (StructuralPartialEq) etc. I don't see why we can't have a trait restriction on the input of break.

Nobody wants to unify () and Option in the Default solution, and there's no need to manually write break None at the end of the block, and also you can't use break; if the loop returns an Option.

The Default solution means that the while loop

while f() {
    if g() {
        break h();
    }
}

is de-sugared to

loop {
    if !f() {
        break Default::default(); // the only change
    }
    if g() {
        break h();
    }
}

You do need a default value when the loop exits without break, and Default does make sense in this context. If Default does not make sense for the type, that type should probably not be used for breaking.

Furthermore, since Option<T> implements Default even for non-Default T, you could simply write

let result = for x in it {
    break Some(f(x));
}.unwrap_or_else(|| g());
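
The desugaring above can be tried on stable today by writing the `loop` by hand. In this sketch, `while_with_default` is a hypothetical function whose closure parameters `f`, `g` and `h` stand in for the calls of the same names in the example:

```rust
// Hypothetical helper: hand-written form of the proposed desugaring.
// `T: Default` supplies the value when the condition `f` fails before
// any `break h()` has run.
fn while_with_default<T: Default>(
    mut f: impl FnMut() -> bool,
    mut g: impl FnMut() -> bool,
    mut h: impl FnMut() -> T,
) -> T {
    loop {
        if !f() {
            break Default::default(); // the loop ran out without breaking
        }
        if g() {
            break h(); // explicit break with a value
        }
    }
}
```

Choosing `T = Option<i32>` gives `None` when the loop is exhausted, whereas `T = i32` gives `0`.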

exprosic · February 11, 2020, 4:17am

? requires a Try-able thing, main requires Termination status, for requires an Iterator (be it Into or not), match requires a StructuralPartialEq to find a slot for its argument. These restrictions are not really from the syntax but from the functionalities that they implement.

But, let's say for i32, why is its additive unit always more special than its multiplicative unit (or the other way around)? Whatever <i32 as Default>::default() you choose, how should you know whether the loop has been broken or not in all the cases where the loop returns an i32?

What the loop should return in the case of being exhausted should really be the business of the loop, not of the type. Even if Default does not make sense for the type, that type could still make perfect sense in each specific realization of the loop, as well as of the type. The Default for all loops would be required only because of the lack of the syntax to specify the defaults for each loop. That is, Default would be required in this case solely because of the syntax instead of the functionality. A type level Default is not really as legit as the trait restrictions you have mentioned.

Yes I could indeed write that, just as I could replace all the for and while by loop. Without small enough operational, logical and aesthetic overhead, the reducibility itself does not really form an argument. When mutually reducible, I would prefer a solution without redundant Some, redundant unwrap_or_else and redundant ||.
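
The i32 objection can be made concrete with a sketch. Both functions below are hypothetical: the first imitates a loop whose exhausted value comes from `i32::default()`, the second lets each call choose its own fallback, which is roughly what a per-loop extra block would express.

```rust
// Hypothetical helper: index of the first even element, where running out
// of elements yields `i32::default()`, i.e. 0.
fn first_even_index(values: &[i32]) -> i32 {
    let mut i = 0_i32;
    loop {
        if (i as usize) >= values.len() {
            break i32::default(); // exhausted: indistinguishable from index 0
        }
        if values[i as usize] % 2 == 0 {
            break i; // found: may legitimately be 0
        }
        i += 1;
    }
}

// Hypothetical helper: same search, but the exhausted value is chosen by
// the caller for this particular loop rather than by the type.
fn first_even_index_or(values: &[i32], exhausted: i32) -> i32 {
    let mut i = 0_i32;
    loop {
        if (i as usize) >= values.len() {
            break exhausted;
        }
        if values[i as usize] % 2 == 0 {
            break i;
        }
        i += 1;
    }
}
```

`first_even_index(&[2, 3])` and `first_even_index(&[1, 3])` both return 0, one by breaking and one by exhaustion, while `first_even_index_or(&[1, 3], -1)` keeps the two cases apart.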

nakacristo · February 11, 2020, 9:27am

RustyYato:

With labeled blocks, you can do

#![feature(label_break_value)]

fn main() {
    let temp: i32 = 'temp: {
        let mut q = 0_i32;
        for x in 0..10 {
            q += x;
            if q == 6_i32 {
                break 'temp q;
            }
        }

        q
    };
    println!("example: {}", temp);
}

I always find this pretty contrived. But perhaps if we could use the labels in a different way it would feel more natural. I am thinking on something such as

#![feature(yet_another_break_proposal)]
fn main() {
	let mut q = 0_i32;
	'myfor for x in 0..10 {
		q += x;
		if q == 6_i32{
			break q;
		}
	}
	arbitrary_middle_code();	
	if 'myfor broken as t{
		process(t);
	}else{
		something_else();
	}
}

This however would require a decent amount of new notation. And in contrast with Option or Default based solutions this averts the ending semicolon problem.

PoignardAzur · February 11, 2020, 9:45am

It is a pretty contrived use case in the first place.

Honestly, I find Yato's proposed workaround fairly elegant, given how niche a use case it covers. Its only drawback over the "break/else" version is to have an additional label, which isn't exactly the height of identifier pollution.
