Proposal: Bounded continuations - non-local control flow without the mess

Yes, this doesn’t allow for the more esoteric uses of call_cc in languages like Scheme; it’s just a single-use, first-class non-local return feature.

As someone else mentions below, this is basically like setjmp/longjmp from C, except properly typed, safely unwound, and guaranteed only to unwind up the stack.

It wouldn’t require GC-ed stacks. In fact, a colleague “prototyped” an implementation of this based purely on panic!. But proper support for this feature, in a way that doesn’t intersect with panic!, would be good.

That was my intended interpretation. Not code size in terms of the size of the final executable, but code size in terms of the actual density of instructions emitted, which affects everything from instruction-cache pressure, to branch-predictor-table exhaustion, to the number of instructions actually executed.

I have to respectfully disagree. Well, maybe I shouldn't have used non-local returns for returning tokens in my example (I was just being a bit too clever there). But if we keep this restricted to non-local returns on end-of-incomplete-buffer, and bad-utf8-sequence, then these are rare enough that the perf gain of removing numerous branches on the common path, and increasing code density, should yield significant benefits.

In a typical parse, a bad-utf8-sequence should NEVER happen, and typically halts the entire parse - it's both an error condition and expected to happen at most once per parse. With the end-of-incomplete-buffer situation, that should only happen once per buffer "update", which can be tuned but is typically several KB in size. Typically with incremental parsing, re-entering the parser/tokenizer frequently is a bad idea, and we wait until we've reached end-of-stream or have several KB of additional data before trying again.

Performance quibbles aside, though, the main reason I want this is because it expresses the intent of the code far better in the circumstances where it applies. Conceptually, I'm thinking to myself "ok, I'm writing next_char(), I see a bad UTF-8 sequence, I want to break to the start of next_token() and return an error". Currently, I'm forced to implement this with disjoint-value wrappers and bubbling-up by adding wiring code at every call point leading up to the point-of-resume.
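The bubbling-up pattern being described can be sketched as follows; all names (`TokenErr`, `next_char`, `next_token`) are illustrative, not taken from any real tokenizer:

```rust
#[derive(Debug, PartialEq)]
enum TokenErr {
    BadUtf8,
    EndOfBuffer,
}

fn next_char(buf: &[u8], pos: usize) -> Result<(char, usize), TokenErr> {
    match buf.get(pos) {
        None => Err(TokenErr::EndOfBuffer),
        Some(&b) if b < 0x80 => Ok((b as char, pos + 1)),
        // Multi-byte sequences elided; treat anything else as bad UTF-8.
        Some(_) => Err(TokenErr::BadUtf8),
    }
}

fn next_token(buf: &[u8]) -> Result<String, TokenErr> {
    let mut tok = String::new();
    let mut pos = 0;
    while pos < buf.len() {
        // The `?` here is the "wiring code at every call point":
        // each frame must check for and re-raise the error.
        let (c, next) = next_char(buf, pos)?;
        tok.push(c);
        pos = next;
    }
    Ok(tok)
}
```

Every frame between the error site and the point-of-resume pays a branch on the common path, which is exactly the cost the non-local return would avoid.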

“stackless” is the implementation strategy aka “state machine transform”.

Well, I would say that we actually both agree here: since the unwinding performance is terrible, it is something that you would only use in exceptional circumstances.

As long as this is properly documented, I can certainly see a benefit in terms of terseness, and even in the absence of a performance gain that could be great.


I still find the point raised by @DanielKeep important though. Notably, I wonder about the interaction of this feature with coroutines or the crossbeam crate.

It was demonstrated not long ago that while coroutines or crossbeam were safe in isolation, there might be unsafety in combining them, and I am worried that another form of non-local return might have the same issues. That is on top of the explicitness issue: the jump is hidden inside a method, with no clue for the caller.


In terms of implementation, I also wonder whether this could be implemented as a library.

This would notably allow getting a few performance numbers to back up the claims, and would also allow experimenting with how it combines with the crossbeam/coroutines crates.

This proposal would be a bit easier to understand if you called it try or catch or unwind_target instead of call_cc. Since its functionality is strictly limited to non-local returns, the name call_cc is inappropriate.

It seems to me that this proposal doesn’t have any other interesting use cases (e.g. it doesn’t support coroutines), and it requires every function on the call stack through which an “exception” could pass to take an extra parameter - a parameter that knows the “exception” type, I might add. Therefore, it suffers from the same level of inconvenience as the existing try! paradigm (IIUC; not that I have ever actually programmed in Rust), and it also has a similar amount of overhead on the non-exceptional path (because passing that extra parameter around is not free).

The term call_cc is short for "call with current continuation". The concept as a whole doesn't demand that the continuation be able to escape the dynamic scope of the call. That particular property is a side effect of Scheme's garbage-collected object model, which presents the semantics of both values and call frames as heap-allocated. Within the confines of a systems language where the call stack is explicitly a stack, the terminology still applies exactly: call_cc accepts a closure and calls it with the current continuation; when invoked, that continuation resumes execution at the point of the call to call_cc. The only semantic restriction is that the continuation is not allowed to escape the dynamic scope of the call, and this restriction arises naturally out of Rust's borrow semantics and typechecking.
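Concretely, a surface for such a feature might look like the following signature sketch. This is hypothetical, not a real or proposed API; the bodies are elided:

```rust
// Hypothetical API sketch only. The lifetime 'scope ties the
// continuation to the dynamic extent of call_cc, so the borrow
// checker rejects any attempt to let it escape.
pub struct Continuation<'scope, T> { /* opaque */ }

impl<'scope, T> Continuation<'scope, T> {
    /// Unwinds to the enclosing call_cc, which then returns `value`.
    pub fn resume(&self, value: T) -> ! { /* ... */ }
}

pub fn call_cc<T, F>(f: F) -> T
where
    F: for<'scope> FnOnce(&Continuation<'scope, T>) -> T,
{ /* ... */ }
```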

Coroutines are a distinct concept that requires more runtime support, specifically the ability to save/restore segments of the call stack. The use cases for coroutines relate to suspended computations that may be resumed and suspended again as the programmer desires.

While full-fledged continuations in languages like Scheme are general enough to subsume coroutines as well as non-local returns, this proposal is purely about making non-local returns a first-class concept. It's basically the equivalent of a "labeled break" that is type-safe and able to cross function-call boundaries.
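For reference, the intra-function form of this already exists in Rust as a labeled break carrying a value; a small sketch (the function name is illustrative):

```rust
// The existing intra-function analogue: a labeled break with a value.
// The proposal would let an analogous jump cross function-call boundaries.
fn first_even(rows: &[Vec<i32>]) -> Option<i32> {
    'search: loop {
        for row in rows {
            for &x in row {
                if x % 2 == 0 {
                    // Non-local (within the function) exit with a value.
                    break 'search Some(x);
                }
            }
        }
        break 'search None;
    }
}
```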

The utility of this is independent of, and orthogonal to, the utility of coroutines, and the two ideas should be considered separately.

Additionally, being required to pass the continuation object around to places where it is used is an asset - it clearly documents to the programmer (and reader of the code) which calls can perform non-local returns, and constrains where those non-local returns can lead. General exceptions do not have those limitations, and consequently are more opaque in behaviour (one of the major arguments against exceptions in Rust).

I’d like to add a caveat to this proposal:

For call_cc to work, the compiler would need to emit unwind tables, even when compiling with panic=abort.

Is it an explicit goal of Rust’s compiler to never have any language features that require unwind tables? If so, that might be an impediment to implementing this.

If we do want Rust to never require unwind tables, we could possibly consider an “unwind_table=false” compile mode that would disallow usage of “call_cc” within the program.

It’s not just the overhead of unwind tables themselves, but the ability to make optimizations that assume exceptional returns never occur - this is a big part of the motivation for panic=abort. Some people also want to be able to make that assumption themselves when writing code, albeit at the cost of portability.

@kvijayan has mentioned previously that one of his colleagues had made a panic! based implementation of this. I’m said colleague :wink:. I’ve polished up and published the testing crate I made for him when he told me about the idea in May, if you want to take a look at a form this feature could take: https://crates.io/crates/breaktarget

I opted not to call it call/cc, because it has such different semantics from the other languages that use call/cc.

These optimizations were not an issue I had considered. That said, if, per my suggestion above, "panic=abort" implies "unwind_tables=false", which in turn implies "call_cc is disallowed in the program", the compiler would still be able to enable those optimizations, no?

Also, have there been measurements on the concrete perf benefit of those optimizations? The canonical example seems to be something along the lines of:

   let mut v1: u32 = ...;
   let mut v2: u32 = ...;
   some_func(&mut v1);
   v2 = v1;

Here, assuming some_func always completes allows the compiler to eliminate v1 entirely and reduce the code to:

  let mut v2: u32 = ...;
  some_func(&mut v2);

However, the example seems contrived, and I'm wondering how often it shows up and what concrete benefit (in terms of numbers) it provides in typical code bases.

That would split the ecosystem in half. Please don’t.

I actually think poisoning would be necessary, for the same reason that it is necessary for panics. Just consider

fn foo<F: FnOnce()>(m: &Mutex<Vec<i32>>, f: F) {
  let g = m.lock().unwrap();
  // break invariant
  f();
  // re-establish invariant
  drop(g);
}

If f does a non-local return, the invariant of the lock will not hold any more. This is exactly the kind of bug that poisoning is intended to prevent.
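To make the comparison with panics concrete, the existing behaviour that the poisoning argument relies on can be observed with only std (the function name is illustrative):

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::Mutex;

// If the closure panics while the lock is held, the guard's Drop runs
// during unwinding and std::sync::Mutex marks itself poisoned, so later
// users can detect a possibly broken invariant.
fn run_with_lock<F: FnOnce()>(m: &Mutex<Vec<i32>>, f: F) {
    let _ = catch_unwind(AssertUnwindSafe(|| {
        let _g = m.lock().unwrap();
        f(); // if this unwinds, the mutex is poisoned on the way out
    }));
}
```

A non-local return through the same frame would break the invariant in the same way, which is the argument for poisoning here too.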

One can implement panics using call_cc and a program transformation to always thread through the continuation. So all problems arising from panic also arise with call_cc.

On the other hand:

let result = call_cc(|cont| {
    let guard = mutex.lock().unwrap();
    ...
    if some_condition {
        // Equivalent to "return value;"
        cont.resume(value);
    }
    ...
    // Pass continuation to some helper function
    // which may call cont.resume(), which would
    // perform a non-local return from this closure.
    try_compute_result(cont);
    ...

    // Return from the closure without using "cont"
    value
});

The mutex should clearly not be poisoned, because this is just like a local break except it’s breaking out of a lambda.

The issue is the same one @DanielKeep raised. Currently functions can assume that calls of supplied Fns and similar things cannot short-circuit out of the caller (except for panics). With this change, that would no longer be true. Either functions would have to stop assuming that (clearly a breaking change), or we’d need some kind of effect typing or something so that it can be tracked in the type system and functions which assume it can also effectively require it. (Incidentally, do you happen to have any idea or know of any prior art about what that could look like? I’ve been wondering about it for a different project for a while now.)

Given that, for backwards compatibility, “my Fn arguments may short-circuit” would presumably have to be on an opt-in basis even in the second case, I’m not sure how well this feature could integrate with Rust 1.x.

Wouldn't the fact that such functions took the continuation as an argument be enough to tell you that they could non-locally return? I suppose that it could be buried in a data structure, where it wouldn't be as obvious, but generally the fact that a data structure contains a continuation would be a pretty important fact about the data structure that would be quite clear from how it's constructed and/or the type of the data structure.

Since the continuation is passed in by reference and can't be copied or cloned, you can't do anything like sticking it in a global or thread-local variable that would obscure the fact that a function could access it, so it would have to be contained in an explicit argument to any function which could invoke it.

I think that's the big advantage of continuations over panic!; it is explicit when you pass them in, so you can clearly see which functions could potentially non-locally return.
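The explicitness argument can be sketched with an ordinary non-Copy, non-Clone token type standing in for the continuation (all names hypothetical): any helper that might trigger the non-local return has to name it in its signature.

```rust
use std::marker::PhantomData;

// A stand-in for the proposed continuation: no Copy, no Clone, and an
// invariant lifetime, so it cannot be stashed in a 'static location.
struct Cont<'scope>(PhantomData<fn(&'scope ()) -> &'scope ()>);

// This signature advertises that the call may exit non-locally.
fn may_jump(_cont: &Cont<'_>, input: i32) -> i32 {
    // ... could invoke the continuation here ...
    input * 2
}

// No `Cont` parameter: the caller knows this cannot short-circuit.
fn cannot_jump(input: i32) -> i32 {
    input + 1
}
```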

That may be reasonable (it sounds like it at first blush), but existing Rust code was not written with that assumption in mind and I don't believe that breaking it would be considered a realistic option.

There's nothing in the type of the proposed call_cc preventing the continuation from being captured and used by an &Fn.

In that example, what would be the behaviour if the code was instead:

fn foo<F: FnOnce()>(m: &Mutex<Vec<i32>>, f: F) {
  let g = m.lock().unwrap();
  if some_cond {
    return;
  }
  drop(g);
}

If that’s the case, would g not get dropped, and the lock held past the lifetime of the function? I’d expect that the return would implicitly drop g and thus the lock.

Unwinding the stack would have the same effect.

Ah, that's a good point, that would be a little less obvious, and easier to use in a place where code isn't expecting the possibility of a non-local return that's not a panic.

I suppose one thing one could try is to have a marker trait expressing that a particular type does (not) contain continuations. That way, one could control whether a closure captures continuations, just like Send controls whether a closure captures thread-dependent data/permissions. Not sure how ergonomic that would end up being, though.
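Spelled out, that idea might look like the following sketch. Note that `auto trait` and negative impls are unstable (nightly-only), and `ContFree`/`Continuation` are hypothetical names:

```rust
// Hypothetical, by analogy with Send: every type is ContFree unless it
// (transitively) contains a continuation.
auto trait ContFree {}
impl<'scope, T> !ContFree for Continuation<'scope, T> {}

// An API that must not be short-circuited out of could then require it:
fn run_critical<F: FnOnce() + ContFree>(f: F) { f() }
```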

I'm not very familiar with the type-and-effect-system literature, but haven't heard of one specifically for tracking non-local control flow. I will try to remember to ask someone :wink:

If a function that can non-local-return has to accept a continuation as a parameter, and the continuation doesn’t work for anything except non-local-return up the call stack (not the full power of a continuation), what advantage does that have over adding a new item to the return value instead (along the lines of Result and try!)? Either way you have to change the function signature of everything in the call stack between the level to return to and the level that can call the non-local return. So, why not just return a Result for that non-local return instead?
