Idea: non-local control flow

I personally think this feature is great in the languages that have it. It's really essential to closure-based DSLs in languages like Kotlin and Ruby for example*; in both languages closure-based APIs are one of the main forms of metaprogramming, whereas in Rust closures are used in a much more limited way, and these kinds of use cases have to be handled with macros (which have much worse hazards IMO than the hazards with these sorts of closures).

But the time to ship this feature was before 1.0, and I think it would be very unlikely to be net positive to add a new kind of closure to Rust for this purpose at this point in the language's development.

*Both languages also provide some syntactic sugar for closures to make it easy for them to look like natural control flow, again making macros less necessary. I believe Rust experimented with this briefly before 1.0.

9 Likes

FWIW, I don't think internal iteration is the direction Rust is going in.

Rust strikes me as a language that is very caller-controlled.

Options and Results and the try operator are handled using state machines, not monad continuations. Similarly, async functions chain Futures together using state-machine-like generators, not by passing callbacks to .then() methods.
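To make the "caller-controlled" point concrete, here is a minimal sketch contrasting the callback style with the ? style. The function names (half, quarter_callback, and so on) are made up for illustration, and the match in the last function is only roughly what ? does for Option.

```rust
/// Halves an even number; odd input counts as failure.
fn half(n: i32) -> Option<i32> {
    if n % 2 == 0 { Some(n / 2) } else { None }
}

/// Callback style: the next step is handed to `and_then`,
/// and the callee decides whether to run it.
fn quarter_callback(n: i32) -> Option<i32> {
    half(n).and_then(half)
}

/// Caller-controlled style: `?` inspects the value right here
/// and returns early from this function on `None`.
fn quarter_question_mark(n: i32) -> Option<i32> {
    let h = half(n)?;
    half(h)
}

/// Roughly what the `?` above does for `Option`, written out by hand.
fn quarter_explicit(n: i32) -> Option<i32> {
    let h = match half(n) {
        Some(h) => h,
        None => return None,
    };
    half(h)
}
```

All three give the same answers; the difference is only in who drives the control flow.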

So I'm not sure how useful a feature to improve internal iterators and other callbacks would be?

In fact, "non-local control flow" could be implemented as syntax sugar around this pattern. For example, if the caller wrote something like this (based on the OP's example):

s.iter().try_fold(0, |acc, &x| -> i32 {
    if x == 0 {
        return 0;
    } else if x == 1 {
        break;
    } else {
        acc * x
    }
})

the compiler would automatically transform it to:

// (the compiler would make a new enum for each invocation,
//  opaque to the programmer)
enum Nonlocal123 { Return(i32), Break }
match (s.iter().try_fold(0, |acc, &x| -> Result<i32, Nonlocal123> {
    if x == 0 {
        Err(Nonlocal123::Return(0))
    } else if x == 1 {
        Err(Nonlocal123::Break)
    } else {
        Ok(acc * x)
    }
})) {
    Err(Nonlocal123::Return(val)) => return val,
    Err(Nonlocal123::Break) => break,
    Ok(val) => val,
}

From the callee's perspective, there would be no magic, none of the silent nonlocal control flow that we try to avoid (cf. exceptions). And this would be highly compatible with the existing ecosystem. Unsafe code wouldn't have to be updated to consider a new type of silent control flow in addition to panics. In fact, APIs wouldn't have to be updated at all to support the new kind of closure; you could just use existing Result-based methods like try_fold. (In cases where try_ variants don't exist, like this one I ran into recently, it would be a good idea to add them regardless of whether any new language features are added.)
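For reference, the desugared form can already be written by hand in today's Rust. The following is only a sketch under my own assumptions: the enum Nonlocal, the function process, and the surrounding for loop (so that break has something to break out of) are all invented for illustration, and the fold starts at 1 so the products are non-trivial.

```rust
/// Hand-written stand-in for the enum the compiler would generate
/// (hypothetical name).
enum Nonlocal {
    Return(i32),
    Break,
}

/// Sums the products of the rows, with two "non-local" exits:
/// a 0 makes the whole function return 0, a 1 stops the outer loop.
fn process(rows: &[&[i32]]) -> i32 {
    let mut total = 0;
    for row in rows {
        let folded = row
            .iter()
            .try_fold(1, |acc: i32, &x: &i32| -> Result<i32, Nonlocal> {
                if x == 0 {
                    Err(Nonlocal::Return(0)) // stands for `return 0;`
                } else if x == 1 {
                    Err(Nonlocal::Break) // stands for `break;`
                } else {
                    Ok(acc * x)
                }
            });
        // The caller, not try_fold, turns the marker back into control flow.
        match folded {
            Err(Nonlocal::Return(val)) => return val,
            Err(Nonlocal::Break) => break,
            Ok(product) => total += product,
        }
    }
    total
}
```

Note that try_fold itself only sees an ordinary Err and stops iterating; it never needs to know that the error is really a control-flow request.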

But yes, there would need to be some way for the caller to distinguish this new kind of closure from the existing kind.

One possible approach is based on observing that this isn't just syntax sugar for a closure; it's syntax sugar for a function call where (at least) one of the arguments is a closure literal. The syntax has to be aware of the function call, because that's where the match gets inserted; the whole scheme depends on the called function passing on Errs that it encounters.

So we might as well go the whole way and add the kind of syntax sugar that @withoutboats mentioned exists in other languages, and have the auto-Result-wrapping happen iff the sugar is used.

However, fitting that kind of sugar into Rust's syntax is a bit tricky. It would be nice to be able to write

foo() |x| {
    x + 1
}

but this is already legal syntax that does something else: the pipes are interpreted as bitwise-OR.
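To see the existing meaning, here is a small sketch that compiles today (foo and not_a_closure are placeholder names). The block is just the right-hand operand of the second OR, so no closure is created anywhere.

```rust
fn foo() -> i32 {
    0b1000
}

fn not_a_closure(x: i32) -> i32 {
    // Parsed as `(foo() | x) | { x + 1 }`: two bitwise ORs,
    // with the block evaluated as a plain expression.
    foo() |x| { x + 1 }
}
```

With x = 2 this evaluates 8 | 2 | 3.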

This syntax looks clean and isn't already taken:

foo() {
    // ...
}

...but if the closure has arguments, where would you write them? You could do it Ruby-or-Swift style by putting the arguments after the opening brace, but that would be weirdly inconsistent with existing closures.

My favored approach is to just use a keyword, reminiscent of Ruby:

foo() do |x| {
    x + 1
}

This would also give us a name: we could call the new feature "do closures".

In addition... while this kind of syntax sugar only makes sense when passing a single closure, the Result-wrapping scheme could potentially support passing multiple closures to the same function call. If we wanted to support that, we could reuse the same keyword with a more traditional syntax:

foo(do |x| 123, do |x| 456);
8 Likes

Just for kicks:

I happen to know because I was digging around in ancient Rust history a bit back, but as mentioned upthread, Rust did at one time have the "trailing closure syntax". When it did, arguments to the closure were indicated on the inside of the bracket, as so:

arr.iter().for_each() { |x|
    print(x);
}

Given that the desugaring of "closure-busting control flow" uses Try types, though, I think the most appropriate keyword, if we want sugar for it, would probably be something with try.

3 Likes

From the security point of view, I think this is a horrible idea.

People don’t need Rust to create spaghetti code.

2 Likes

Ouch. All this mess for not having to write a separate function and return. I appreciate you fleshing it out from a technical point of view, but it looks like a spectacularly confusing thing. Especially when it comes to unsafe and lifetimes.

2 Likes

Unless I'm missing something, these only make sense if the closure is used within a loop or iteration.

As such, this seems just a nicer (?) way of writing iterator adapters, using continue instead of .filter, .filter_map; and break instead of .find, .find_map, .take_while.
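A small sketch of that correspondence, using made-up function names: a loop with break and continue next to the same computation written with take_while and filter.

```rust
/// Sums the even numbers up to (not including) the first negative one,
/// using `break` and `continue` in a plain loop.
fn sum_loop(xs: &[i32]) -> i32 {
    let mut sum = 0;
    for &x in xs {
        if x < 0 {
            break; // stop at the first negative number
        }
        if x % 2 != 0 {
            continue; // skip odd numbers
        }
        sum += x;
    }
    sum
}

/// The same computation with adapters: `take_while` plays the role
/// of `break`, `filter` the role of `continue`.
fn sum_adapters(xs: &[i32]) -> i32 {
    xs.iter()
        .copied()
        .take_while(|&x| x >= 0)
        .filter(|x| x % 2 == 0)
        .sum()
}
```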

1 Like

Not sure exactly what you mean by a separate function. From my perspective, it's all this mess for not having to write a macro... either that, or avoid the abstraction altogether, which as usual has pros and cons.

For example, I recently wrote a parser for MPEG TS files. Due to the way the format works, there are a lot of instances of

if foo_present_flag {
    Some(/* read 'foo' field from file */)
} else { None }

In fact, a quick grep shows 24 copies of the text } else { None } in the file.

At one point I wondered if there was some combinator that would encapsulate the pattern if boolean { Some(expression) } else { None }. Turns out there is one available in the standard library on nightly: bool::then. So in theory I could write this, which is less noisy:

foo_present_flag.then(
    || /* read 'foo' field from file */)

Except the reading process can fail – the actual expression uses the ? operator – and that failure needs to propagate to the outer function. To do that with current lambdas, at minimum I'd have to add both Ok and ?:

foo_present_flag.then(
    || Ok(/* read 'foo' field from file */))?

This is already noisy enough to call into question its benefit over the non-combinator version. But it actually doesn't compile, because the type checker can't figure out what the return type of the lambda is supposed to be. (Here's a concrete example on the playground.) To make it work I'd have to write a type signature as well, which is clearly worse than the non-combinator version. I decided to just stick with the latter.
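A minimal sketch of the two versions side by side, not the actual parser code: read_u8 is a hypothetical stand-in for "read 'foo' field from file", and I use Option::transpose to turn the Option of a Result into a Result of an Option (an assumption on my part about how one would finish the combinator version).

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for reading one field from the file.
fn read_u8(r: &mut impl Read) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    r.read_exact(&mut buf)?;
    Ok(buf[0])
}

/// Non-combinator version: `?` propagates straight out of the function.
fn read_optional_plain(flag: bool, r: &mut impl Read) -> io::Result<Option<u8>> {
    Ok(if flag { Some(read_u8(r)?) } else { None })
}

/// Combinator version: the closure needs an explicit return type because
/// the `?` inside it leaves the error type open, and the result has to be
/// flipped with `transpose` before it can be returned.
fn read_optional_then(flag: bool, r: &mut impl Read) -> io::Result<Option<u8>> {
    flag.then(|| -> io::Result<u8> { Ok(read_u8(r)?) })
        .transpose()
}
```

Both behave the same; the second just carries more ceremony, which is the complaint.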

Admittedly, this has some flaws as a motivating use case for fancier lambdas.

On one side, it doesn't take much to make the combinator version worse than the non-combinator version, because the non-combinator version is fairly succinct to begin with.

On the other side, the fact that I need to write a type signature in that case seems like a flaw that could be fixed without adding a major new language feature. Maybe. (It happens because ? applies an into-style conversion (via From) to the error, and there's no way to tell the compiler to default, in lieu of type annotations, to a no-op conversion to the same type.)

Still, I've wanted TCP-preserving lambdas for a long time, so for me the experience was "yet another instance where I wish Rust had this feature". :slight_smile:

5 Likes

There's also return – including both manual return statements and returning via the ? operator (as in the example I just posted).

But return to where? AFAICT return only makes sense if such a closure is given directly in-line, where there is only a single outer scope.

OTOH, if such a closure is passed around, return IMHO no longer makes sense, as the closure has no control over the context and scope in which it will be executed.

? then also seems ambiguous: E.g., if you use it within .try_fold, then should ? signify failure of the try_fold with an Err, or leave the enclosing scope of the try_fold entirely (equating it with return)? I would have assumed the former, actually.
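For comparison, this is what ? inside a try_fold closure means in today's Rust, i.e. the "former" reading. It is a sketch with a made-up function name (sum_parsed): the ? only leaves the closure, try_fold stops and hands the Err back, and the enclosing function decides what to do with it.

```rust
use std::num::ParseIntError;

/// Parses and sums the inputs, stopping at the first parse failure.
fn sum_parsed(inputs: &[&str]) -> Result<i32, ParseIntError> {
    inputs
        .iter()
        .try_fold(0, |acc, s| -> Result<i32, ParseIntError> {
            // On failure this returns Err from the closure only;
            // try_fold then short-circuits with that Err.
            let n: i32 = s.parse()?;
            Ok(acc + n)
        })
}
```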

To be clear: this sort of feature would have no impact on unsafe code at all. Unsafe code already has to be "exception safe" because it cannot assume the closure does not panic. Good safe code should also be exception safe in this regard, but if it isn't it usually is low impact because the program will crash anyway.

TCP-preserving closures cannot be passed around as objects, but must be called directly in the caller (or just dropped). Kotlin for example has a syntax called inline which notes that the closure taken by a method must be called inline (and not moved around), and only when that annotation is used does the closure have this behavior.

(Again, I don't think Rust should grow this feature, but I think it's a great feature in other languages that deserves a fair appraisal instead of misinformation.)

10 Likes

There is unsafe code that uses std::thread::panicking in a drop impl of a guard value to detect the panic and, for example, abort. With TCP, std::thread::panicking would return false and the code would incorrectly continue executing with potentially messed-up invariants. Because of this, TCP does impact unsafe code.

I'm not understanding how this would happen and it seems pretty crucial, so could someone expand on this with example code?

1 Like

https://github.com/Sgeo/take_mut/blob/master/src/lib.rs#L31-L41 is the code I was referring to. I noticed it uses catch_unwind instead of std::thread::panicking, by the way.
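Since example code was asked for, here is a rough sketch of the std::thread::panicking guard pattern described above. This is not the take_mut code; Exit, Guard and run_guarded are names I made up. Rust has no TCP-style exit to demonstrate, so the sketch only shows which signal the guard relies on: it can tell a panic apart from normal completion, and nothing else.

```rust
use std::cell::Cell;
use std::thread;

/// What the guard observed when it was dropped.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Exit {
    Normal,
    Panicking,
}

/// Guard whose destructor assumes "not panicking" means the closure
/// ran to completion.
struct Guard<'a> {
    seen: &'a Cell<Option<Exit>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        let exit = if thread::panicking() {
            Exit::Panicking
        } else {
            Exit::Normal
        };
        self.seen.set(Some(exit));
    }
}

/// Runs `f` with the guard alive; the guard is dropped on every way out.
fn run_guarded<R>(seen: &Cell<Option<Exit>>, f: impl FnOnce() -> R) -> R {
    let _guard = Guard { seen };
    f()
}
```

The concern raised above is that a hypothetical non-local exit would drop the guard with panicking() returning false, so a destructor like this one would record it as a normal exit.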

I don't think a TCP-preserving closure could ever meet that trait bound, so I don't think there's a backwards compatibility hazard.

3 Likes

Oh, I should have probably elaborated on what I meant by the "separate function". I thought somebody already brought this up but maybe I remembered wrong. So, it's basically just my usual advice for solving the problem around one of the motivations for non-local control flow: a continue or break across function boundaries can be replaced by an additional appropriately-constructed function and an explicit return if necessary.
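A minimal sketch of that advice, with a made-up function name: the search that would otherwise need to break out of two levels at once is moved into its own function, and a plain return does the job.

```rust
/// Finds the (row, column) of the first negative entry.
/// The early `return` replaces a break out of both loops.
fn find_first_negative(rows: &[Vec<i32>]) -> Option<(usize, usize)> {
    for (r, row) in rows.iter().enumerate() {
        for (c, &value) in row.iter().enumerate() {
            if value < 0 {
                return Some((r, c));
            }
        }
    }
    None
}
```

The caller then decides what to do with the Option, so every jump stays local to one function.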

There are several variations on the same theme, and I know it does eventually come up in practice; I have encountered these myself too, although not frequently enough to find it annoying. Still, I absolutely prefer that break and continue operate on the innermost loop and return operate on the innermost function. For my brain, non-locality is one of the worst kinds of readability issue, and whenever I see some code intentionally using such constructs, I can't help but think that "this programmer wanted to take the ugly shortcut".

Incidentally, this is exactly why I don't consider this kind of language construct to be TCP-preserving. To me, a closure is a function, so it should not matter whether I write fn foo() { return bar; } or let foo = || { return bar; }.

As a matter of fact, I often make quick but reusable closures within an outer function, just for the sake of encapsulation and not having to write all the type signature/doc comment baggage, then refactor them into stand-alone functions once they grow in length beyond what's permitted by good taste. I would find it extremely painful if I had to re-think all the returns and breaks and continues in these functions whenever I decide to pull them out or inline them.
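To illustrate the current behavior being defended here, a small sketch with made-up names: the return inside the local closure leaves only the closure, so the body could be moved into a stand-alone fn without touching it.

```rust
fn first_even_doubled(xs: &[i32]) -> Option<i32> {
    // A quick local helper; `return` leaves the closure,
    // not `first_even_doubled`.
    let double_if_even = |x: i32| -> Option<i32> {
        if x % 2 != 0 {
            return None;
        }
        Some(x * 2)
    };
    xs.iter().find_map(|&x| double_if_even(x))
}

/// The same helper pulled out as a function, body unchanged.
fn double_if_even(x: i32) -> Option<i32> {
    if x % 2 != 0 {
        return None;
    }
    Some(x * 2)
}
```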

I appreciate that, however my comment was not about technical changes to language semantics. I'm worried about not being able to reason about code written in the non-local style, and making fatal mistakes. This is exacerbated by the fact that I would never write such code by myself, and consequently, if I ever needed to change it, I would have a hard time completely deciphering the original author's intention and underlying assumptions about non-locality-safety.

10 Likes

I don't think anyone is saying return $expr; should ever do anything but return from the innermost closure/function. Rather, just that return 'label $expr; could be permitted to work in some cases and return from the labeled closure/function. So intent would always be extremely clear.

And in terms of the TCP property, I think there's an important distinction between (using roughly Kotlin syntax)

foo {
   return x;
}
// and
foo({-> return x; })

The former, due to the lack of (), parses mentally closer to a block than a closure. So I'd almost argue the "best" TCP-preserving design would treat these two differently... but this is also surprising and not really applicable to Rust, as I don't think we should get tail position closure syntax sugar at all.

3 Likes

Even if intent is extremely clear, somehow I find return 'label to be in really bad taste, a trait it doesn't share with break 'label and continue 'label.

I think that's because return inherently has the semantics "go back from whence you came", whereas semantically speaking a labeled return is way more in the direction of a GOTO. It may not quite be "jump to an arbitrary code address" but it's getting mightily cozy with that idea.

I'm guessing that that's unavoidable with non-local control flow. If it is, then that's a strong argument against non-local control flow altogether IMO.

Also, somewhere in this thread someone mentioned wanting this to avoid having to write a macro. To me that's a lot like mowing the lawn with a pair of scissors because you don't want to reach for a brush cutter or lawn mower. Altering control flow is one of the main reasons to write a macro; it's part of their raison d'être. The other ones are boilerplate cutting (i.e. syntactic shortcuts with no alteration of semantics, e.g. derives or all macro_rules macros) and compile-time code generation, i.e. making things cheaper at runtime. So to grow new language features just to avoid using the right-but-presumably-scary tool for the job seems... excessive, from that point of view.

3 Likes

Non-local control flow is a subclass of functions with multiple return points. As also already observed, explicit breaks to outer loops are a very similar feature: they implicitly require that code is structured in nested scopes, and they do not permit arbitrary goto. Maybe we could unify them under the break keyword? It doesn't need to be a special feature of closures, but it would require expanding the notion of function types. However, making fold work would require reworking it, which may be difficult to do in a compatible manner. An additional version would pose no problem though.

// A function with two return points. As with a reference parameter,
// the caller must guarantee that the lifetime lives for the full
// function call.
fn mul_early<'early>(acc: i32, num: i32) -> i32, 'early: i32 {
    if num == 0 {
        break 'early 0;
    }
    return acc * num;
}

trait BreakFold {
    type Item;
    
    /// Fold over a function that may return the final value early.
    fn fold<A, F>(&mut self, init: A, f: F) -> A where
        F: (FnMut<'br>(A, Self::Item) -> A, 'br: A);
}

impl<I: Iterator> BreakFold for I {
    type Item = <I as Iterator>::Item;
    
    fn fold<A, F>(&mut self, init: A, mut f: F) -> A where
        F: (FnMut<'br>(A, Self::Item) -> A, 'br: A),
    {
        // The early return provides a value for this block directly.
        'early: {
            let mut acc = init;
            for item in self {
                acc = f::<'early>(acc, item);
            }
            acc
        }
    }
}

This explicit syntax choice would address problems named in the earlier thread:

  • The special meaning compared to normal return is immediately clear, in particular for compiler internals it becomes obvious that the tear-down of the function frame needs to be special cased and potentially performed in a manual way.
  • It's explicit and requires the caller of such a closure/function to explicitly opt in.
  • It introduces a name for the co-variables that model the code flow.
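For comparison, something close to this can be approximated in current Rust with std::ops::ControlFlow and a labeled block. This is only a sketch, not the proposed feature: the second return point becomes an ordinary enum variant that the fold matches on, and break_fold is a free function I made up rather than the BreakFold trait.

```rust
use std::ops::ControlFlow;

/// Counterpart of `mul_early`: `Break` carries the early final value.
fn mul_early(acc: i32, num: i32) -> ControlFlow<i32, i32> {
    if num == 0 {
        ControlFlow::Break(0)
    } else {
        ControlFlow::Continue(acc * num)
    }
}

/// Fold over a function that may supply the final value early.
fn break_fold<I, A, F>(iter: I, init: A, mut f: F) -> A
where
    I: Iterator,
    F: FnMut(A, I::Item) -> ControlFlow<A, A>,
{
    // The early value becomes the value of this labeled block.
    'early: {
        let mut acc = init;
        for item in iter {
            acc = match f(acc, item) {
                ControlFlow::Continue(next) => next,
                ControlFlow::Break(done) => break 'early done,
            };
        }
        acc
    }
}
```

For example, break_fold(vec![2, 0, 5].into_iter(), 1, mul_early) stops at the 0 and never looks at the 5.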
4 Likes

I am saying that. After all, if return returned from the closure rather than the outer function, ? wouldn't work, negating a significant benefit.

This is similar to Ruby, where in the following example, return returns from foo, not the block. It's probably also similar to Kotlin's inline thingy, although I've never used it.

def foo
  3.times do
    return
  end
end

I think we should have syntax sugar in part for this very reason, that the goal is to parse like a block.

Perhaps. But the standard library has a ton of combinator methods you're meant to pass closures to, and every one acts as a control flow construct of sorts. For just about any of those methods, you could probably come up with a plausible example where it would be useful to break out of an outer loop from inside the closure. Should they all be replaced by macros?

Also, macros don't have dot syntax, although that could always be fixed. :wink:

In addition, sometimes, "altering control flow" just means using ? to return early from the outer function in case of error. As I discussed in this post, you don't need non-local control flow for that (you can just explicitly return a Result from the closure and then use ? on the outer call), but it would be convenient.
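A minimal sketch of that pattern, with a made-up function name: the closure just returns a Result, collect gathers the results, and a single ? on the outer call does the early return from the enclosing function.

```rust
use std::num::ParseIntError;

/// Parses every input and doubles it, returning early on the first failure.
fn parse_and_double(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    // The closure returns a Result explicitly; no non-local control flow.
    let numbers = inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .collect::<Result<Vec<i32>, ParseIntError>>()?; // early return happens here
    Ok(numbers.into_iter().map(|n| n * 2).collect())
}
```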

1 Like