Pre-RFC: Break with value in for/while loops

Huh, I never thought of that before. So fn foobar() is type equivalent to fn foobar() -> () or just for the purposes of throwing away the last expr vs just returning it?

I disagree with this. These "primitives" are not really primitives, as they are implemented in the library. Being able to use them uniformly (e.g. in generics), and potentially substitute them with custom types, is valuable, hence turning them into lang items shouldn't be treated lightly.

4 Likes

This feels overstated to me. For example, core::ops::Add isn't somehow non-uniform just because it's marked #[lang = "add"]. Similarly, saying that we shouldn't have #[lang = "unit"] struct Unit; just leads to things like () instead, which is less uniform. Letting the compiler desugar syntax to core things seems like a good thing, to me.
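A minimal sketch of the uniformity point, assuming a hypothetical `Meters` type and `double` helper (neither is from the thread): a generic bound on `Add` treats a built-in integer and a user-defined type identically, even though the trait is a lang item.

```rust
use std::ops::Add;

// Hypothetical user-defined type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

impl Add for Meters {
    type Output = Meters;

    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

// Generic code sees only the trait bound; it does not care that `+` on
// i32 is built in while `+` on Meters is ordinary library code.
fn double<T: Add<Output = T> + Copy>(x: T) -> T {
    x + x
}
```

Here `double(21)` and `double(Meters(2))` go through the same generic function.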

8 Likes

By that reasoning, we shouldn't have the Try trait, instead every std type that currently supports it should be baked into the compiler.

There is some necessity for lang items, but introducing not strictly necessary tight coupling (in the Try example, using many concrete lang items instead of a single general one) just for supporting syntactic sugar is ill-advised.

That does not at all follow.

3 Likes

I can't remember a single case in my practice where I would've used this feature, so I am very skeptical about it and I don't think it pulls its weight in the proposed form.

As for the return value of a for loop, I strongly prefer potential integration with generators to the proposed approach:

1 Like

That incompatibility can be perfectly solved with editions: current edition warns when the value of for/while is not immediately discarded, next edition enables the feature.

I dislike that proposal, precisely because it would not integrate well with the feature proposed here. There is something fishy about the loop body being free to choose the value it passes to break, but its type being dictated by another object.

Labeled blocks allow for such a for-then construction without additional branches:

fn for_then(k: i32) {
    'otherwise: {
        'then: {
            for i in 0..10 {
                if i==k {break 'then;}
            }
            println!("completed");
            break 'otherwise;
        }
        println!("left early");
    }
}

fn main() {
    for_then(4);
    for_then(10);
}

The restricted construction is more readable:

fn for_then(k: i32) {
    'then: {
        for i in 0..10 {
            if i==k {
                println!("left early");
                break 'then;
            }
        }
        println!("completed");
    }
}

Even the desired syntax is handled better by them:

enum Break {Then, Else}
use Break::{Then, Else};

fn for_then(k: i32) {
    match 'block: {
        for i in 0..10 {
            if i==k {break 'block Else;}
        }
        Then
    } {
        Then => println!("completed"),
        Else => println!("left early")
    }
}

2 Likes

Could you point out some of the downsides of this approach: having for and while evaluate to Option<T> if they contain break T; and evaluate to () if they contain break; or no break? From the perspective of reading and writing Rust code this seems perfectly natural to me, and it is not a breaking change because break T; is not currently allowed there. And in some ways it is consistent with break inside loop, in that loop evaluates to T if it contains break T; and evaluates to () if it contains break;.

break T; sufficiently clearly communicates from writer to reader that we care about the value of the enclosing loop expression. I think it is more valuable to be consistent about break T; breaking with a value than to be consistent about break (); being interchangeable with break;, which is theoretically attractive but not at all valuable to reading or writing code.

5 Likes

I think you misunderstood something. In the cited proposal break inside for loops will continue to work as it does today (i.e. you can't use break 1; inside them). There were suggestions to pass the break argument as a generator resume argument (e.g. ending iteration without break will be desugared into gen.resume(None) and break 1; to gen.resume(Some(1))), but I am not sure about the usefulness of such a feature.

Can you provide practical examples for which break arguments inside for loops could be useful? I personally can't recall any.

UPD: I've messed up the description of continue/break integration with generators, see the next message for a correct one.

That's not how I thought it would work. I expected that continue would pass an argument to resume, and that if you omit continue you get continue () at the end of the loop, although this does require generators with resume arguments. Then break would have to match the type of Generator::Return. But this shouldn't pose a problem, because we would presumably have a combinator for mapping the return of the generator, i.e. map_return, so we could force the return type to be whatever we want.

2 Likes

That is actually how I would have expected such things to work. continue for resume arguments makes more sense than break.

for item in haystack {
    if is_needle(item) {
        break item;
    }
}

is the go-to (ahem) example. You may argue it's silly given that find already exists, but then your own proposal can also surely be re-expressed with a combinator.
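For comparison, here is a sketch of how that example can be written in today's Rust, emulating the proposed "for evaluates to Option<T>" semantics with a labeled block (stable since Rust 1.65). The function name `find_needle` and the closure parameter are assumptions made for illustration.

```rust
// Emulates `for item in haystack { if ... { break item; } }` evaluating
// to Option<T>: Some(item) on an early break, None if the loop is exhausted.
fn find_needle(haystack: &[i32], is_needle: impl Fn(i32) -> bool) -> Option<i32> {
    'search: {
        for &item in haystack {
            if is_needle(item) {
                // Stands in for the proposed `break item;`.
                break 'search Some(item);
            }
        }
        // Reached only when the loop finishes without breaking.
        None
    }
}
```

Under the proposal, the labeled block, the `Some(..)` wrapper and the trailing `None` would all be implied by the for loop itself.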

That seems like a workaround, not a fix. If it were really the case that for loops fit generators that return values well, a natural syntax to handle them would fall out of that proposal, without resorting to combinators. If you're not above using combinators, why not express the loop itself with a combinator?

Consider: a loop can evaluate to any type the loop body chooses. An if-else expression can evaluate to any type its bodies choose. But the for loop's return type is supposed to be chosen by the generator, why? And the generator proposal's definition of how break is supposed to work is so confusing and unnatural, that @newpavlov who proposed it cannot remember it!

As an alternative to the proposal of evaluating for to either () or Some(T) depending on the break t, and noting that both types implement the Default trait:

  • If there is a break t in the for then it should evaluate to the type of t.
  • If the for can end without a break t statement (is this always true?), then the type must implement Default, whose default value is used.
  • If there are non-agreeing breaks then it is a syntax error.
  • If there is no break element we advise type inference to be (), but it could be any Default. E.g., with a for being assigned into a variable and then removing all its break statements.

This should be completely compatible with existing code. I have no idea if it would imply some notable change in the compiler. Does the compiler have a notion of the Default trait?
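The rules above can be emulated in current Rust with a labeled block; this is only a sketch of the intended semantics, and `first_over` is a hypothetical name chosen for the example.

```rust
// Emulates a `for` that evaluates to the type of `break t`, falling back
// to Default::default() when the loop ends without breaking.
fn first_over<T: Default + Copy + PartialOrd>(items: &[T], limit: T) -> T {
    'search: {
        for &item in items {
            if item > limit {
                // Stands in for the proposed `break item;`.
                break 'search item;
            }
        }
        // Loop exhausted: the proposal would insert this call implicitly.
        T::default()
    }
}
```

With integers, `first_over(&[1, 7, 9], 5)` yields 7, while `first_over(&[1, 2], 5)` yields the default value 0.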

6 Likes

It's mainly just that it's inconsistent and needs to be explained to everyone. It's not the same as loops evaluating to T or (), because break; is regarded as equivalent to break ();, just like return is. In order for it to be the same, normal loops would have to evaluate to Option<()>.
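To make the current behavior of loop concrete, a small sketch (the function name `loop_types` is made up for the example): a bare break and break () both give the loop the type (), with no Option involved.

```rust
// In today's Rust, `break;` inside `loop` is the same as `break ();`,
// so both loops have type (), and `break 7` gives the loop type i32.
fn loop_types() -> ((), (), i32) {
    let a: () = loop { break; };
    let b: () = loop { break (); };
    let c: i32 = loop { break 7; };
    (a, b, c)
}
```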

3 Likes

And that's the only "practical" example which you can provide? Then I am even more sure that this feature does not pull its own weight in the proposed else form.

Please tone down your attacks a bit. Firstly, I wasn't proposing this feature (re-read my previous message carefully); I just mentioned that I vaguely remember that people have proposed something like this in my proposal, which does not include any modifications of continue behavior compared to today. Secondly, it was late at night for me when I was writing that message. And thirdly, as I've written several times already, I don't think this feature pulls its weight in either of the discussed forms, although I think the generator one is much more natural. It fits nicely into the notion that an iterator is a coroutine with both return and resume argument types equal to (), and a generator is a coroutine with a resume argument type equal to (). And it's fully backwards compatible with how for loops work today, since a for loop always evaluates to () and the loop body has to evaluate to () as well.

In other words, with the generator/coroutine integration we will be able to write code like this:

// gen has type Generator<Yield=u8, Return=Result<(), &str>, Resume=&str>
// but I think coroutine will be a better name for a generator with
// resume arguments
let gen = ...;
let res = for byte in gen {
    match byte {
        0 => break Err("got zero"),
        1..10 => continue "less than 10",
        _ => (),
    }
    // do stuff
    "loop end"
};
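Since none of this exists in stable Rust, here is a rough hand-written desugaring of the loop above, as a sketch only. The `Coroutine` trait, the `Step` enum, the `Bytes` type and the "start" value passed to the first resume call are all assumptions made for illustration; they are not the real standard library API and not necessarily what the proposal specifies.

```rust
// Hypothetical stand-ins for a coroutine API with resume arguments.
enum Step<Y, R> {
    Yielded(Y),
    Complete(R),
}

trait Coroutine {
    type Yield;
    type Return;
    type Resume;
    fn resume(&mut self, arg: Self::Resume) -> Step<Self::Yield, Self::Return>;
}

// Toy coroutine: yields bytes from a buffer and records every resume argument.
struct Bytes {
    data: Vec<u8>,
    pos: usize,
    log: Vec<&'static str>,
}

impl Bytes {
    fn new(data: Vec<u8>) -> Self {
        Bytes { data, pos: 0, log: Vec::new() }
    }
}

impl Coroutine for Bytes {
    type Yield = u8;
    type Return = Result<(), &'static str>;
    type Resume = &'static str;

    fn resume(&mut self, arg: &'static str) -> Step<u8, Self::Return> {
        self.log.push(arg);
        match self.data.get(self.pos) {
            Some(&byte) => {
                self.pos += 1;
                Step::Yielded(byte)
            }
            None => Step::Complete(Ok(())),
        }
    }
}

// Manual desugaring of `for byte in co { ... }` from the example above.
fn run(mut co: Bytes) -> (Result<(), &'static str>, Vec<&'static str>) {
    // Assumption: the first resume needs some argument; "start" is arbitrary.
    let mut arg = "start";
    let res = loop {
        let byte = match co.resume(arg) {
            Step::Yielded(byte) => byte,
            // The coroutine's return value becomes the value of the loop.
            Step::Complete(ret) => break ret,
        };
        arg = match byte {
            0 => break Err("got zero"),  // `break` must match Return
            1..=9 => "less than 10",     // `continue "less than 10"`
            _ => "loop end",             // value of the loop body
        };
    };
    (res, co.log)
}
```

Running `run(Bytes::new(vec![5, 20, 0, 7]))` ends with Err("got zero") after the coroutine has been resumed with "start", "less than 10" and "loop end", in that order.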

Not a fan of this, for the following reasons:

  • Default is a bad idea on its own merits
  • Default::default() can potentially have side effects, which would not be apparent from the loop code
  • Your proposal fails to distinguish exhausting the loop from break Default::default()
  • Evaluating to Option<_> is no less expressive, as you can still do .unwrap_or_default()

Well, you did post it as a pre-RFC, and it did include break semantics (which you got confused with someone else's continue proposal). I don't like the continue proposal either, but for different reasons.

Also, consider that with the generator proposal, if you have a generator which returns values indefinitely, i.e. a Generator<Return=!>, then a for looping over it cannot meaningfully use break at all (unless it's out of an outer block). I'd argue that breaking out of an infinite loop is a quite important operation that shouldn't be too awkward to express.

1 Like

I do not think that is true. Perhaps it is easy to misuse. I do not think this is the place to discuss if it is correct to have a Default trait in rust.

A hidden call to default could certainly be problematic, and I had not thought of its implications. However, is there any real code in which a spurious call to default causes a problem?

That is the wrong perspective. You write break Default::default() only when you do not want to differentiate that situation from normally ending the loop. Perhaps it is a sorted list and you have already found a value greater than the one you are searching for, so you can stop the loop immediately without providing a value.

Yes. The improvement over Option is to never have confusion between returning () or Option<()>.

Consider the code

let x: Option<usize> = for i in 0..10 {
    if some_condition(i) { break i; }
    some_other_thing(i);
};

Then if we remove the break i the type would change from Option<usize> to (). I suppose you could also apply inference on the absence of break i, but it seems odd. In my opinion break Some(i) makes it more clear.

2 Likes

I also didn't propose the final expression being a resulting value of a for/while loop either (see "loop end").

Surely you wouldn't explicitly write break Default::default() under this proposal unless you wanted to conflate the two cases, but it may be the case that you do break val where val happens to be the same as what Default::default() evaluates to, and later code mistakes that value for the case that the loop was exhausted. This is precisely why we have discriminated unions in the first place.
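A sketch of that conflation, emulating both semantics with labeled blocks in today's Rust (`search_default` and `search_option` are hypothetical names for this example):

```rust
// Default-based semantics: an exhausted loop yields u32::default(), i.e. 0.
fn search_default(items: &[u32], target: u32) -> u32 {
    'search: {
        for &x in items {
            if x == target {
                break 'search x;
            }
        }
        u32::default()
    }
}

// Option-based semantics: an exhausted loop yields None.
fn search_option(items: &[u32], target: u32) -> Option<u32> {
    'search: {
        for &x in items {
            if x == target {
                break 'search Some(x);
            }
        }
        None
    }
}
```

Searching for 0 in [3, 0, 5] and in [3, 5] gives 0 in both cases with `search_default`, so the caller cannot tell a hit from a miss. `search_option` returns Some(0) and None respectively, and a caller who wants the conflated value can still ask for it with `unwrap_or_default()`.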

Or it could be resolved by making for loops always evaluate to an Option<_>. It will break compatibility, but that can be easily handled with an edition change.

1 Like