An idea for TCP closures and rust's effect system

I have to clarify what i meant here:

The problems of full non local control flow are already described in thread, but without basics of NLCF I can't go further.

I want to place a restriction on NLCF: a single function can both receive typed labels to return to and provide local typed labels to other function, however it cannot propagate received labels downstream.

This restriction is imposed to enforce handling of failures locally or, at least, more explicit propagation.

The case I find the hardest is multiplicity: particularly because the design of generators is itself not finalized yet.

So,I will make a few strong assumptions:

  • Magic mutation is chosen - personal preference
  • Signatures are following gen fn(initial_data)->resume_ty->yield_ty
    Note: closures doesn't have such full form: their start data is captured.
  • gen fn and gen closures are introduced - explicitness about the context "actions" take place in.

Example 4:

gen fn exmp4(data: impl Iterator<Item=char>) -> () -> [char;2] {
   data.map(
      gen |ch| loop {
         let ch1 = ch; yield;let ch2 = ch; yield; // here we accumulate 2 characters.
         yield 'fn [ch1,ch2]; //here we yield these right from original context.
      }
   )
}

Sorry, if this is cryptic.

Edit: Also syntax of giving the closure contexts labels is unclear.

That's not true, an await can cause the future to yield to the executor, suspending the computation, then if the executor never polls the future again you've effectively skipped the code that was required to run. You can even leak a future, making it even worse because now not even Drop implementations will run, which many rely on (I was bitten by this when I proposed a macro approach for scoped threads, it worked in sync functions but not in async ones, see these two comments (1, 2))

Thanks, I was just assuming that the futures are eventually run till the completion and then dropped.

However, I don't think such conditions are manageable without linear types at all.

Unfortunately, this is outside of current scope. Tbh, I was thinking about some super light weight form of linear types, but that's different topic.

Brief summary of idea\proposal:

Allow closures and functions to capture control flow environment, with support for following effects: fallibility,asynchrony & multiplicity.

  • fallibility is achieved through introduction of a return-to-label (diverging) expression: return 'label val. It is terminating effect;
  • asynchrony is done by allowing closure to capture async context's implicit values and use them to await a future in a closure: || await(some_captured_fut). Resulting closure's state is stored in and polled by parent future;
  • multiplicity is provided via capturing generator's environment and yield'ing from (another) generator context.

Pain points:

  1. light weight syntax for providing labels to their consumers;
  2. inconsistent syntax of asynchrony;
  3. giving a labels for closure and generators contexts.

Solutions:

  1. Second class labels.
    We introduce labels as second class citizens. In input positions, by functionality:

    • fn gimme_label(lab: label(u64)) - is for terminating return effect;
    • fn gimme_gen(parent_scope: label(u64 -> bool)) - is for consuming labels of generators' scopes.

    Because of labels being second class citizens we impose restrictions:

    • No renaming other than at function boundary;
    • As a consequence, there is no bindings of type "label".

    Example:

    fn takes_a_label(a: label(u64 -> bool)) -> !  {
       let mut acc = 0; loop { acc += yield 'a (acc % 2 == 0) };
    }
    

    And btw, here I had to use yield as an expression approach - I have no access to what is actually being mutated.

  2. Possible use of async closure's syntax.
    Instead of trying to justify inconsistent use of await outside of async function/block(/closure),
    I opt to reuse async closure syntax for creating such TPC closures:
    Example:

    async mapper(iter: impl Iterator<Item=Struct>) -> Vec<Struct2>{
       iter
         .map(async |it|{ process(it).await }) //note, this creates iterator tied to local scope
         .collect();
       ...
    }
    
  3. A syntax is like 'label: |arg| {...}.
    Example (using previously defined takes_a_label):

    ...
    // has type: `impl Generator<u64,Yield = bool,Return = !>`
    let gen_cl = 'lab: |num: u64| {
       takes_a_label('lab) // diverges
    };
    ...
    

HOF.

Section which they deserve. Such cryptic syntax is not for no reason:
The primal goal was to allow HOF to write signatures like this:

  1. async fn hof(f: async fn(label(u64)) -> Struct );
  2. or this:
/// it will "fold" a coroutine
fn folder<T>(
init: T,
f: fn(label(Struct -> Struct2)),
fld: fn(T,Struct2)->(T,Option<Struct>),
)

Drawbacks:

  1. label is a keyword now.

Unresolved questions:

  • exact rules for async closure syntax: when it creates a TPC async closure?; when a plain async closure?;
  • real cost of second class labels.

Things to note:

  1. Given all of these one might think that this is a coroutines proposal, but it's not really one because of imposed restrictions on how labels flow between functions.
  2. To avoid unnecessary monomorphic copies of label taking code, we can give them an actual runtime representation. Then effectful code will receive pointers to where to store next yield value and whom to callback on next yield.

... meanwhile I was thinking about making non-local returns a new sort of panic:

// reserves space on stack for NonLocalReturnType
// keeps track of current stack frame so that we can return to it
let catcher = PanicCatcher<RegularReturnType, NonLocalReturnType>::new();

// pass it into the new catch_unwind
catch_unwind_nonlocal(&mut catcher, |nonLocalReturn : &PanicInvoker<NonLocalReturnType>| {
        ....
        let value : NonLocalReturnType = ...;
        nonLocalReturn.panic(value); // can be in another method, just pass nonLocalReturn to it
        ....
    });

if (catcher.isOk()) { ...} else {..}
2 Likes

Non-local returns also call destructors, It should had been written.

It could be nice, typed catch_unwind, but the last one is stable. Also, with this approach we can have library level exceptions, for catching NPE, IndexOutOfBounds, and other common error at runtime, safely(?).

Any ideas on improving asynchrony syntax?

I was already assuming that since normal returns also call destructors. However this should not be the case for async/generators because they can be resumed.

While syntax is important I don't think it should be the primary focus here. There are important questions that I think should come before them, for example:

  • how should this be implemented? For example kotlin requires the function accepting the lambda with non-local control flow to be inlined into the caller. Should we do something like that? This would e.g. limit the ability to turn these types of closures into trait objects. What are the alternatives?

  • Does this need to be something special or could it be modeled with a trait? For example just like we have the Future trait that can be .awaited we could add another trait that does a more generalized thing. Although at that point I suspect we would pretty much get syntax sugar for monads.

  • Can we let functions limit the kinds of effects they allow, but still being somewhat generic? For example I could see code definitely being unsound with async/generators, but that could somehow manage if destructors where guaranteed to run, so it could still allow non-local returns.

  • Can this be integrated in a backward compatible way? E.g. being able to change old functions that take closures in a way such that they still work with old closures but would also accept new one? I suspect type inference might make this harder or more complicated than it seems though.

I imagine giving labels factual reperesentation:

  • (*mut T,*const fn()) - the first is pointer to place for value; second is pointer to instruction to jump;
    Passed as an argument to consumer function, this makes non local returns not to require inlining.

  • (*mut T,*const fn(*mut U)) - pointer to place where effect consumer yields; pointer to instruction to jump, provide a pointer to resume arg. place for resume argument to be placed there.
    Such use of pointer implies that use of such functions result in a consumer's future to be !Unpin;
    Alternatively, we could store offsets of places in the state; adds one more pointer op, but lifting the restriction.

I see inlining as an optimization, not mandatory thing.

Personally I can't imagine such a trait, albeit I haven't thought deeply.

But given the amount of nuances, I doubt that such trait will be feasible at all.
Moreover, the need for such a trait seems to be absent: We introduced special traits when we either wanted to add flexibility for an otherwise special constructs (see Try trait) or desired a syntax to be a sugar (e.g. async & generator features) for impl Trait types production.

Could you show an example of such code?

I especially designed feature that consumers of effectuful functions say what they can support and will provide explicitly, they pass appropriate labels and an async context inside a functions, closures can capture these on demand.

One mechanism of limiting what a provided closure could do is to use traits like NoReturnEffect, NoGeneratorEffect, NoAsyncEffect (nameing is bikeschedable). They are implemented for closures without any usage of provided effect. This way traits carry their function and provided labels can play role of phantom data.

This is why I asked for another variant of acynchrony syntax - it clashes with async closures and I couldn't come up with a better one.

My design goal is to allow functions to accept TPC closures without any changes. In price of closure always having a local lifetime.

In the examples I've demonstrated use of existing map combinators - they accept plain closures, without any notion of effects.

Addition: TPC closures shouldn't capture non-local labels.

Those don't support calling destructors though, and I can't see how they would be used to implement generators/async since they require completly rewriting the function as a state machine.

Also, what happens when you mix generators/async and non-local returns?

Isn't this also a kind of sugar? async rewrites a function to a state machine that implements Future, this feature also kind of rewrites a function.

The replace_with crate for example uses a destructor to run code needed for soundness. Only as long as destructors are always run it would be sound to accept closures with effects.

...could non-local returns not share machinery with panics?

Assuming An idea for TCP closures and rust's effect system - #23 by atagunov you initiate a non-local return by invoking a method on a special object passed down the call stack; this object records some kind of tag or address.

Panic machinery would go up the stack unwinding until it finds a handler with a matching tag/address. This object also provides storage space to put return value into - a reference into the stack frame of the non-local-return-aka-panic handler.

If a regular panic handler is met while unwinding the stack it would be necessary to unwind past it until correct non-local return handler is found I guess.

... yinz realize you are literally in the process of reinventing the C++ exception unwinder, including the never-actually-resolved-people-just-got-sick-of-arguing-about-it question of how it should interact with thread cancellation panic.

1 Like

Share the exact same machinery with panics? Probably not, that's optimized for the "doesn't unwind" case, which is the opposite of what would happen with non-local returns.

This also doesn't handle generators/async.

In particular this feels very non-zero cost.

Shouldn't cost more than catch_unwind? On each step you compare frame pointer to the one on which you want to stop or something to the same effect.

And you know on which you want to stop because the &PanicInvoker<T> you passed down the call stack has the address of the base frame on which you want to stop. A bit different from C++.

Certainly

It probably costs a bit more than catch_unwind, but the problem is that catch_unwind is already expensive.

Important detail to notice.

We simply forbid non-local returns from async and gen fn's - closures are allowed to return to local fn's scope producing a value. Then rest of code in scope can do something reasonable with labels fn was provided with..

I see, we'll need to store these states inside of their callers states. Representation for labels was only to allow to not inline most of code (not state) to not duplicate it in monomorphic copies.

Now I see that it's smaller optimization than I originally thought.

Fundamentally there should be nothing preventing you to have non-local returns from generators/async, so this sounds like an artificial limitation that creates inconsistency.

Closures are pretty much inlined anyway, so does it change anything if you don't require closures with non-local control flow to be inlineable?

1 Like

Indeed, they can be safely allowed.

My initial concern was like "what if a future is spawned on an executor, and then, after being moved to another thread returns to a non-async code from a runtime's code" that would break runtime as of one of its threads terminated to sync scope.

However, my concerns were just overstatement, as of these futures cannot be spawned: they have local lifetime.

Generators also share this property.

Another point: TPC closures aren't Send nor Sync.

Now, interactions between Asynchrony and Multiplicity.

My view is that asynchrony dominates multiplicity, so consumer of async fn(gctx: label(u64->bool)) will use it like:

...
while let GeneratorState::Yielded(it) = gctx.resume(64).await {
   ...
}
...

Pretty syntaxes are not there yet.

And this brings us to their "full" counterparts. Probably, together they should result in stream literals... But that's outside of current topic.

Edit: what's about denoting asynchrony effect by label(async)?

1 Like

dyn safety and fn pointers:

There are no place for state, produced by asynchrony or multiplicity in case of fn pointers => cannot allow.

Thereof, fn pointers only can have non local returns.

In case of explicit boxed dyn Trait, we allow Fn trait to accept only non local return effect (it does not mutate anything); FnMut is allowed to have rest of effects as long as their state is Unpin. AFAIK boxed things have stable location in memory even in face of allocators API, Unpin here is actually not necessary.

This is also the case for a representation of labels:

The destructors should be run before returning (setting the value, then jumping).

The information passed in the second repr. is indeed insufficient, my bad, but... - what if we do it like that:

(
   *const fn(*mut ResumeTy), /*a callback for setting where to place resume arg*/
   (
   *mut YieldTy /*place for a value*/,
   *mut *const fn()->bool /*next resume address, also tells whether parent coroutine should yield*/
   ) //`yield`info
)

Then:

  1. When a consumer invokes an effectful function, the first thing function does, is set up resume argument address via callback. A place for a next resume argument is set up by via this callback on each (effectful) yield, if the address has changed;

  2. When effectful function yields, it sets up yield value, and a pointer to code where to pass control (it returns).

Effectively, this is "runtime place resolution" like thing.

Lifetimes:

for now, because of the rules, labels always have either lifetime of the caller, or local stack frame lifetime.

In case of lifting the rules and allowing full non local control flow...

In this case we treat them as a structs with references (what they in fact are), any label has one lifetime parameter which can be specified like label<'a>(...) when it's necessary.

Lifetime in signatures are elided, if they are not specified (elision failed => annotation needed).

asynchrony effect syntax:

As of syntax for capturing async env., but not producing a future, I would suggest smth. like fn<async>(...)... as a notation. I don't think we should allow an async fn or an async closure to accept this effect.