Async and effect handlers

Since work has started on getting generators into Rust in order to support async/await, I think it’s worth thinking about how this stuff will eventually look in the language. I have some issues with the direction the design seems to be going and a vague outline of how it might be possible to do it better. Having said that, I’m not at all confident that I have any idea what I’m talking about, but I thought I’d brain-dump this anyway :slight_smile:

Problems with async/await-style programming

This example code from the generators RFC illustrates the first problem.

#[async]
fn print_lines() -> io::Result<()> {
    let addr = "127.0.0.1:8080".parse().unwrap();
    let tcp = await!(TcpStream::connect(&addr))?;
    let io = BufReader::new(tcp);

    #[async]
    for line in io.lines() {
        println!("{}", line);
    }

    Ok(())
}

Async/await infects everything with async/await notation. You can only call functions marked async from other functions marked async, every "blocking" call needs to be marked with await, and so on.

Additionally, introducing async into Rust like this will split the Rust ecosystem into two halves: blocking Rust and async Rust. Everything that isn’t implemented twice will be unavailable (without painful workarounds) in one paradigm or the other. Not only are these two halves incompatible, but they’ll cause runtime problems when accidentally mixed. Ideally, mixing them should either just work or raise a compilation error.

This is all a shame since, on some level, async code is no different to ordinary blocking code. Whether blocking/IO/multi-threading are being handled in-process or by the kernel is something we should be able to abstract away. Luckily, PL theorists know the tool to do this.

An effect system

For the sake of simplicity, suppose the only IO operations we care about are open and read. Consider the following pseudo-Rust code.

effect Io {
    fn open(&self, path: &Path) -> io::Result<RawFd>;
    unsafe fn read(&self, fd: RawFd, buf: &mut [u8]) -> io::Result<usize>;
}

struct NativeIo;

handler NativeIo for Io yields ! {
    fn open(&self, path: &Path) -> io::Result<RawFd> {
        libc::open(...)
    }

    unsafe fn read(&self, fd: RawFd, buf: &mut [u8]) -> io::Result<usize> {
        libc::read(...) // do a blocking read
    }
}

struct TokioIo {
    poll: mio::Poll,
}

enum Event {
    Read,
}

handler TokioIo for Io yields (RawFd, Event) {
    fn open(&self, path: &Path) -> io::Result<RawFd> {
        let fd = libc::open(...);
        set_non_blocking(fd);
        ...
    }

    unsafe fn read(&self, fd: RawFd, buf: &mut [u8]) -> io::Result<usize> {
        loop {
            match libc::read(...) {
                EWOULDBLOCK => yield (fd, Event::Read),
                x => return ... ,
            }
        }
    }
}

The above code introduces two new keywords: effect and handler. Effects and handlers are a lot like traits and impls, but with two major differences. The first is that handlers are allowed to yield. Yielding saves the state of execution (eg. by compiling the code as a state machine in the first place), and returns control back up the stack. The second difference is in how they’re used. An example of invoking this effect could look like this:

#[sticky(Io)] // can't be moved between Io effect contexts
struct File {
    fd: RawFd,
}

impl File {
    pub fn open(path: &Path) -> io::Result<File>
    effect Io {
        Ok(File { fd: Io::open(path)? })
    }

    pub fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>
    effect Io {
        unsafe { Io::read(self.fd, buf) }
    }
}

In order to use an effect, a function needs to be marked with that effect. Implementation-wise, this effect is like a trait bound on a generic type parameter, except the type is unnamed and is provided by the calling function. In order to call a function with an effect, the caller either needs to be marked with that effect as well, or it needs to use a do block.

fn example_fn() -> io::Result<u32>
effect Io
{
    let mut file = File::open(Path::new("blah.rs"))?;
    file.read(...)?;
    Ok(123)
}

fn run_event_loop() {
    let tokio_runtime = TokioIo::new(...);

    let mut generator: impl Generator<Yield=(RawFd, Event), ...> = do &tokio_runtime {
        example_fn()
    };

    loop {
        match generator.resume(()) {
            Yielded((fd, event)) => {
                // The coroutine needs to block. Register the IO event we're waiting for.
                // No tokens coz this is example pseudo-code with only one coroutine.
                tokio_runtime.poll.register(fd, event);
            }
            Complete(x) => {
                println!("returned with {:?}", x);
                return;
            },
        }
        // Block, waiting for the event.
        tokio_runtime.poll.poll();
    }
}

A do block is how we contain effectful code. It takes a handler (in the above case a TokioIo instance) and returns the code as a Generator implementor. If the handler yields anything other than ! then the contained code is compiled to a state machine so that it can be resumed. Any effects that occur inside the contained code are implemented through the effect handler, rather than through the effect handler of the containing code (if there was one). As such, we can use a single File type, and have it behave differently depending on whether it’s used inside our TokioIo event loop or not. Backwards-compatibility could be handled by making effect Io special and implicit (like Sized), with main being executed under the NativeIo handler.
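To make the backwards-compatibility story concrete, here is a minimal sketch (same pseudo-syntax as above) of what calling example_fn outside any do block might look like, assuming effect Io is implicit and main runs under the NativeIo handler:

// `effect Io` is implicit here and handled by NativeIo. Since NativeIo yields
// `!`, no state machine is generated; this is just ordinary blocking code.
fn main() {
    match example_fn() {
        Ok(n) => println!("example_fn returned {}", n),
        Err(e) => println!("io error: {:?}", e),
    }
}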

Conclusion

It’s been a while since I read any effect systems papers and I’m making all this up as I go along. But if something like this is at all possible then I think it’d be much nicer than keeping async and blocking code separate. It also generalises to other kinds of effects, eg. iterators and panic could be implemented on top of an effect system like this.

Thoughts?


eg. iterators and panic could be implemented on top of an effect system like this.

To be less vague: I'm imagining the Generator trait looking something like this:

enum GeneratorState<Y, C> {
    Yielded(Y),
    Complete(C),
}

trait Generator {
    type Yield;
    type Resume;
    type Return;

    fn start(&mut self) -> GeneratorState<Self::Yield, Self::Return>;
    fn resume(&mut self, resume: Self::Resume) -> GeneratorState<Self::Yield, Self::Return>;
}

If a generator has Yield == ! then it's just an ordinary function. It's allowed to use the stack and it doesn't need to be compiled to a state machine. Additionally, if a generator has Yield != ! but Resume == ! then it's also an ordinary function. We don't need to be able to resume it, so we also don't need to compile it to a state machine. It needs to be able to yield though, and that can be implemented as unwinding.
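For instance, a plain closure is the degenerate case with Yield = Resume = !. A rough sketch against the hypothetical trait above (assuming the unstable never type can be used in type position; PlainFn is a made-up wrapper):

// A closure that never yields is "just a function": start() runs it to
// completion on the caller's stack, and resume() can never be called
// because there is no value of type `!` to pass in.
struct PlainFn<F>(Option<F>);

impl<F: FnOnce() -> R, R> Generator for PlainFn<F> {
    type Yield = !;
    type Resume = !;
    type Return = R;

    fn start(&mut self) -> GeneratorState<!, R> {
        GeneratorState::Complete((self.0.take().expect("already started"))())
    }

    fn resume(&mut self, resume: !) -> GeneratorState<!, R> {
        match resume {} // unreachable
    }
}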

With effects implemented like this, panicking can be implemented as an effect:

effect Panic {
    fn panic<T: Any + Send + 'static>(val: T) -> !;
}

// The default handler we enter main with (unless built with panic = "abort")
struct UnwindPanic;

// yields with Box<Any + Send + 'static> and resumes with !
handler UnwindPanic for Panic yields Box<Any + Send + 'static> -> ! {
    fn panic<T: Any + Send + 'static>(val: T) -> ! {
        yield (box val)
    }
}

// The handler inside drop calls, during panics, and when built with panic = "abort".
struct AbortPanic;

handler AbortPanic for Panic yields ! {
    fn panic<T: Any + Send + 'static>(_val: T) -> ! {
        libc::exit(-1)
    }
}

fn catch_unwind<R, F: FnOnce() -> R>(f: F) -> thread::Result<R> {
    let mut g = do UnwindPanic { f() };
    match g.start() {
        Yielded(e) => Err(e),
        Complete(x) => Ok(x),
    }
}
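Usage would then look much like today's std::panic::catch_unwind (same pseudo-syntax; the Panic effect on the closure is assumed to be inferred):

// The closure panics through the effect; UnwindPanic turns that into a yield,
// which catch_unwind observes as Yielded and maps to Err.
let result = catch_unwind(|| Panic::panic("boom"));
assert!(result.is_err());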

For iterators, there could be an effect like this:

effect Iter<T> {
    fn item(t: T);
}

// A handler to drive the iterator.
struct For;

handler<T> For for Iter<T> yields T {
    fn item(t: T) {
        yield t;
    }
}

struct EffIterator<G> {
    gen: G,
    started: bool,
}

fn eff_iter<T, F>(f: F) -> EffIterator<impl Generator<Yield = T, Resume = (), Return = ()>>
where F: FnOnce() effect Iter<T> {
    let gen = do For { f() };
    EffIterator {
        gen: gen,
        started: false,
    }
}

impl<T, G> Iterator for EffIterator<G>
where G: Generator<Yield = T, Resume = (), Return = ()> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        let val = if !self.started {
            self.started = true;
            self.gen.start()
        } else {
            self.gen.resume(())
        };
        match val {
            Yielded(t) => Some(t),
            Complete(()) => None,
        }
    }
}

Which you could then use like this:

// make a function with the Iter effect
let count_to_ten = || {
    for i in 0..10 {
        Iter::item(i);
    }
};

// turn it into an iterator, use it to count to ten
for val in eff_iter(count_to_ten) {
    println!("got: {}", val);
}

@alexcrichton Thoughts on any of this (being the mastermind behind the generators work so far)?

It seems like you’re effectively proposing that all functions be compiled twice (or at least potentially be compiled twice), once as a state machine (if the function is used with an asynchronous IO handler) and once normally (if used with a synchronous one). That sounds tricky…

I’m effectively proposing that all functions have an extra, hidden type argument and that the type it’s instantiated with affects how the function is compiled.

eg. TcpStream::read becomes TcpStream::read<T: Io> (though it’s not written like this), and TcpStream::read::<NativeIo> gets compiled normally while TcpStream::read::<TokioIo> gets compiled as a state machine.

Although there wouldn’t just be one extra argument, there’d be as many as there are effects on the function (with effect Io and effect Panic presumably being implicit). If any of the handlers that those effects are instantiated with require resuming then that instantiation of the function is compiled as a state machine.
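As a rough approximation of this in today's Rust (ignoring the yielding/state-machine half; the names read_first_byte and the trait-ified Io are made up for illustration), the handler becomes an explicit generic parameter and each instantiation is monomorphized separately, so the NativeIo version compiles down to direct libc calls:

use std::io;
use std::path::Path;
use std::os::unix::io::RawFd;

// The Io "effect" modelled as an ordinary trait (no yielding here).
trait Io {
    fn open(&self, path: &Path) -> io::Result<RawFd>;
    fn read(&self, fd: RawFd, buf: &mut [u8]) -> io::Result<usize>;
}

struct NativeIo;

impl Io for NativeIo {
    fn open(&self, _path: &Path) -> io::Result<RawFd> {
        unimplemented!("would call libc::open")
    }

    fn read(&self, _fd: RawFd, _buf: &mut [u8]) -> io::Result<usize> {
        unimplemented!("would call libc::read")
    }
}

// Today the extra parameter has to be written out explicitly; the effect
// system would thread it through implicitly.
fn read_first_byte<H: Io>(handler: &H, fd: RawFd) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    handler.read(fd, &mut buf)?;
    Ok(buf[0])
}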

Would this allow for eventually fully annotating a Rust program with explicit effects? That’s one of the things I always wished Rust had, but the challenge is always integrating it in a way that is backwards compatible, and does not blow out the complexity budget too much more and still remains zero-cost. Here is a previous RFC - there might be more out there, but I seem to remember one as far back as 2012 or 2013. I think the async angle is an interesting one in terms of merit - will have to think more though. :thinking:


I like the generality of this proposal, but there's one thing I want to call out:

For a lot of people, having every potential suspension point explicitly marked is one of the advantages of async/await. In tutorials on the concept, you will see things like "it is a lot easier to reason about the locking requirements and other potential concurrency issues when every point where a complex function might 'block' is explicitly visible in the code." And I can personally attest that this is true — in Python, where all errors are runtime errors.

In Rust, especially if effects are understood by the compiler and not just sugar for a bunch of library calls, I wonder if the same safety net might be had from compile-time checks. I am not familiar enough with the effect-system literature to know if this is possible, but maybe you are? This proposal would be more compelling to me, anyway, if it addressed the question of verifying concurrency-correctness.

Would this allow for eventually fully annotating a Rust program with explicit effects?

For backwards-compatibility's sake, effects which already exist in code would have to be implicit and opt-out. I suppose there could be a module- or crate-wide attribute like #![no_implicit_effects] which would force you to write effect Io on things, but that would probably just be confusing.

... and still remains zero-cost.

The thing I'm proposing here is zero-cost, at least as far as I can tell. File::read for example would just get inlined as a wrapper around the ambient Io handler's read method.

Here is a previous RFC

Interesting! Allocation and garbage collection would also be well-suited to a system like this; that's a good point. That RFC, however, assumes that there is only one handler per effect and that every effect+handler combo is cooked into the language. I'm proposing user-defined effects and handlers where you can swap out different handlers to get things like async vs blocking code. Some effects would have to be lang items for backwards-compatibility (eg. Io), but you could still write your own handlers for them.

@zackw

I guess it’s nice being able to see where a function can block, but I don’t like that I have to do it for tokio::net::TcpStream::read and not std::net::TcpStream::read, and that they have to be two separate, incompatible types.

It would be great if it’s possible to make locking part of the Io effect as well; I don’t know how feasible that is, though. If it’s doable, it would mean tokio tasks / green threads / whatever could use locks the same way that native threads do. The NativeIo implementation of lock would call directly to pthread_mutex_lock or whatever Rust currently uses, but the TokioIo implementation would have to use something like eventfd on Linux, WaitForMultipleObjectsEx on Windows, something else on every other platform, tie it all together under one API, and do it without creating any non-trivial costs. That sounds hard, but the mio guys did something similar to get a working cross-platform Poll implementation, so maybe it’s not impossible.
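For what it's worth, a hand-wavy sketch of what that could look like, in the same pseudo-syntax as the earlier examples (the RawMutex type and mutex_lock operation are made up):

effect Io {
    // ... open, read, etc. as before ...
    fn mutex_lock(&self, mutex: &RawMutex);
}

handler NativeIo for Io yields ! {
    fn mutex_lock(&self, mutex: &RawMutex) {
        // Block the whole OS thread, e.g. via pthread_mutex_lock.
        ...
    }
}

handler TokioIo for Io yields (RawFd, Event) {
    fn mutex_lock(&self, mutex: &RawMutex) {
        // If the mutex is contended, don't block the thread; instead yield
        // the readiness event (e.g. an eventfd on Linux) that the event loop
        // should wait on before resuming this coroutine.
        ...
    }
}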

Can effect handlers be built without allocations? As far as I know all of the implementations use heap allocations to jump between the stack and the effect handler 1 2. Though, it's not clear to me if this is a problem with effect handlers or the coroutine implementation on top of them. It seems like stackless coroutines may get around this limitation with compiler support.


@bbatha Yeah, stackless coroutines are what’s being worked on at the moment. I’m envisioning this effects system as a way of getting them into the language. I’m not sure if stackless coroutines are as versatile as stackful ones though. I can’t see how you’d compile a naive implementation of factorial like the one shown below, for instance, because you don’t know how much memory it’s going to need.

fn factorial(x: u32) -> u32 {
    if x == 0 {
        1
    } else {
        factorial(x - 1) * x
    }
}

If it turns out that certain kinds of code can’t be compiled as state machines then we could have a way for a coroutine to explicitly opt in to using a heap-allocated stack. The mioco crate already has a Rust implementation of heap stacks which we could work into this somehow.

Not only are these two halves incompatible, but they’ll cause runtime problems when accidentally mixed.

I thought calling non-blocking code from blocking code without an explicit wait would definitely be a compile-time error. Is that not the case? (Preventing blocking code being called by non-blocking code is conceptually difficult since in practice any code can effectively "block".)

I'm writing lots of futures-based code and async/await would be incredibly helpful to us. Effect systems are relatively unproven. I'd hate to see async/await blocked on something experimental like this.

One big question for effect systems is how the annotation burden scales and it's hard to determine that without experience with large, real systems. For example in your iterator example wouldn't count_to_ten's closure need to be annotated with the Iter effect? It seems that your original complaint about the preponderance of async annotations is actually worse with your proposal because blocking code needs to be annotated with effects too.

I thought calling non-blocking code from blocking code without an explicit wait would definitely be a compile-time error. Is that not the case?

Yes actually, not using await! where you need it will be a compile-time error.

(Preventing blocking code being called by non-blocking code is conceptually difficult since in practice any code can effectively “block”.)

It's impossible to prevent all of it, but we could prevent a lot of it. Anything that's handled by the Io effect (file/socket IO, timers, joining threads, locking (maybe)) would be safe to mix.

Effect systems are relatively unproven. I’d hate to see async/await blocked on something experimental like this.

I'd hate to see Rust rush into a crummy solution if there are potentially much better options on the table.

One big question for effect systems is how the annotation burden scales and it’s hard to determine that without experience with large, real systems. For example in your iterator example wouldn’t count_to_ten's closure need to be annotated with the Iter effect?

I'm assuming that if Iter is in scope then the effect could be inferred there.

It seems that your original complaint about the preponderance of async annotations is actually worse with your proposal because blocking code needs to be annotated with effects too.

No, because the Io effect would be implicit. It has to be for backwards-compatibility, though I wrote it explicitly in some of the example code I gave.

That's not the case though. Calling blocking code from non-blocking code can just happen, any time. I.e., if you directly call println inside your async task, there is no warning, but this code may now block on IO -- and that's bad, of course, as it blocks the entire worker that this async task is currently running on. Rust doesn't have the effect system required to detect this. The system proposed in this thread could probably do that. It is calls to non-blocking functions that need extra annotations. (Or at least, that's what I gather from reading some examples and skimming the RFC.)

Hmm, is your own proposal stackful or stackless (the second sentence suggests stackless but it's not completely clear)? My recollection of the general consensus from the other threads is that while stackful coroutines are more expressive and/or pleasant to use, they are not performant and "zero-cost" enough for "a systems language like Rust", and in particular that they would not be appropriate to use for writing implementations of Iterator, which is one of the main use cases alongside async I/O.

Also, just by looking at your proposal without you telling me, from what could I determine for myself whether it's stackful or stackless? As far as I've managed to understand these things so far, "stackless" means that only the top-level function/coroutine may yield (= invoke effects, IINM), while if it's stackful, that function may call another, which may yield for both of them. (In the stackless proposals so far, they are clearly stackless because if one function which may yield wants to delegate to another function which may yield, the only way to do it is to invoke it in a loop with the outer function yielding every time the inner one does.) In your proposal, I believe (1) any function with a given effect may transparently call any other functions with that effect, which (2) implies that it's stackful? Or am I mistaken about either of those?

Do you remember them well enough to say how closely your proposal corresponds to algebraic effects systems "from the literature" (I'm familiar with Frank and am reading about Koka in particular, fwiw), or, if there are some important differences, what those are?

(Just a few days ago I was trying to google for whether "algebraic effects" correspond to "stackful" or "stackless" coroutines, with literally zero useful results (does nobody else on the internet care about how these things relate to each other, or what?), but based on the same logic as above concluded that they are almost surely stackful...)

Stackless is definitely what I'm going for here, although with the possible ability to explicitly opt in to having a stack where it's needed. In this proposal, functions can yield (= invoke effects :+1:) from as many layers deep as they want. This requires all the functions between the yielding function and the effect handler to be compiled into a great big state machine. This is the same as what happens when you compose a bunch of futures together into a great big future, and some deeply-nested future yields. I believe @alexcrichton's coroutines work will allow Rust to generate these state machines automatically from plain old, synchronous-looking Rust code. I don't know precisely what limitations this imposes on the code.

Do you remember them well enough to say how closely your proposal corresponds to algebraic effects systems “from the literature” (I’m familiar with Frank and am reading about Koka in particular, fwiw), or, if there are some important differences, what those are?

Uhh, I don't know. I'll need to go back to doing research. Frank is one of the inspirations for this though, and I'm pretty sure this is the same as what's in Frank except that here you explicitly wrap the effectful code up into a Generator which is used basically like an Iterator.


Ah okay, I see. This does seem less restrictive while being still-hopefully-implementable than restricting it to a single function.

Presumably "no virtual calls (to effecting functions)", for starters? Or no cross-crate calls, even... (or does monomorphization of generics save us)?

Cross-crate calls should be fine, since we can call generic functions cross-crate. You’re right about virtual calls though… that’s problematic :confused:

I think we will need some nice keywords for coroutine and generator as indicators just like fn is used for function.

I propose ‘co’ for coroutine and ‘gn’ for generator, alongside the existing ‘fn’ for function, so that each keyword consistently abbreviates its word: ‘fn’ -> function, ‘co’ -> coroutine, ‘gn’ -> generator.

e.g.: fn plusfive(x: u32) -> u32; gn yieldfive() -> u32; co awaitfive() -> u32;

Just Brain Rumbling for Discussion

THANKS.