I just read this article, and it made me wonder: why don't we replace async+sync with just async? The reasons for doing so are explained in the article.
I think a change like this would be backwards-compatible. One way to implement this is to keep the syntax of what is currently sync, but let it behave like async. If something works in sync, I think it will work in async as well, so if all sync source code gets interpreted by the compiler as async, it probably won't cause any issues. And the async source code that's currently out there can still remain supported, but lints could be introduced to tell the developer "hey, you can simplify the syntax and write it like this instead".
I haven't done any rigorous analysis/research on this but the above seems plausible.
I think right now we might not be ready for unifying async+sync into async since the async system is still under development. But perhaps sometime in the future we could unify them?
A quick counterexample: FFI. C cannot call an async Rust function (or function pointer) as if it were a normal sync function.
A more complex example: std::thread::scope, whose async equivalent is unsound because it can't guarantee that the spawned tasks are finished (and their borrows dropped) before the stack data they refer to becomes invalid.
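For reference, the sync version is sound precisely because `scope` blocks; a rough sketch:

```rust
fn main() {
    let data = vec![1, 2, 3];

    // The spawned thread can borrow `data` from this stack frame because
    // `scope` does not return until every spawned thread has joined, so
    // the borrow provably ends before `data` goes out of scope.
    std::thread::scope(|s| {
        s.spawn(|| println!("borrowed from the stack: {data:?}"));
    });

    // An async scope can't make the same promise: the future driving it
    // can be dropped (or leaked with mem::forget) before the spawned
    // tasks finish, leaving them with dangling borrows.
    println!("{data:?}"); // still valid here
}
```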
Another issue is how you would write the executor itself if all code is async (since you need a sync runtime behind the scenes to drive your async world, it isn't turtles all the way down).
And no "make it part of the language" doesn't work for Rust. How would you do that for embedded or kernel code?
If only it were that easy. I think what you're looking for is a kind of maybe_async (or perhaps keyword generics) that you don't have to specify, which I'd love to have, too. But I think there are some things you cannot or don't want to do across an await boundary, for example holding a std::sync::Mutex guard. So modifying your function's internals by adding a mutex could make it fail to compile when it is used as an async function, making it easy to accidentally introduce backwards-incompatible changes.
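Rough sketch of what I mean, assuming a work-stealing executor like tokio where spawned futures must be Send:

```rust
use std::sync::Mutex;

static COUNTER: Mutex<u64> = Mutex::new(0);

// Placeholder for any await point inside the function body.
async fn some_async_io() {}

async fn bump() {
    let mut guard = COUNTER.lock().unwrap();
    *guard += 1;
    // Holding a std::sync::MutexGuard across this await makes the whole
    // future !Send, so e.g. tokio::spawn(bump()) stops compiling even
    // though the function's signature never changed.
    some_async_io().await;
    *guard += 1;
}
```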
Personally, I think it's better if every function is generic over being async or not. That way the compiler can choose which one to use when needed instead of having to make everything async.
Regarding the FFI example: the gist is that C calls your function once, while async functions are called (polled) multiple times until the state machine the compiler generates finishes (or you drop the future). So at every function you want to be callable from C you'd need to involve an executor (e.g. tokio), unless it is possible for every function you ever call to be sync (which you may not want).
And in the other direction a C function is always sync/blocking and can (already) result in problems with the async executor due to blocking the executor thread.
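To make that concrete, the boundary ends up looking something like this (a sketch, assuming tokio; a real wrapper would reuse a global runtime instead of building one per call):

```rust
async fn fetch_value() -> i32 {
    // ... awaits on sockets, timers, etc. ...
    42
}

// Each entry point exported to C has to pull in an executor and block the
// calling thread until the whole state machine has run to completion.
// C just sees one ordinary, blocking call.
#[no_mangle]
pub extern "C" fn fetch_value_blocking() -> i32 {
    tokio::runtime::Runtime::new()
        .expect("failed to build runtime")
        .block_on(fetch_value())
}
```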
In the article (linked in the OP), it's mentioned that Java and Golang don't distinguish between sync and async (as I understood it). Do those two languages have problems with interoperability with other languages?
I like your suggestion for generics over sync/async. Are you referring to this, or something else?
Wanna know one that doesn’t? Java. I know right? How often do you get to say, “Yeah, Java is the one that really does this right.”? But there you go. In their defense, they are actively trying to correct this oversight by moving to futures and async IO. It’s like a race to the bottom.
Emphasis mine.
Java didn't have async then. You can't have color problems if you are blind.
Go and Java both, IME, have FFI problems. Java does not allow arbitrary FFI; if you use JNI, you have to write wrapper code in your FFI language that adapts from Java expectations to your language's expectations (and can't just call an arbitrary C function, for example).
Go has similar requirements, but instead of requiring the FFI function to be adapted to the needs of Go's runtime, CGo automatically generates the shims for you.
In both cases, this makes calling an FFI function less efficient than calling a language-native function, because of the need to do special things at the boundary. In contrast, Rust and C functions can call each other directly without some sort of shim or wrapper in the middle, because the Rust compiler can generate a Rust function with a psABI-compatible calling convention, rather than generating a psABI calling-convention function that translates arguments to a Rust calling convention, calls the Rust function, and translates return values back to psABI.
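As a small illustration, both directions are just plain calls:

```rust
// Compiles to an ordinary C-ABI symbol; C can call it directly as
// `int32_t add(int32_t, int32_t);` with no shim in between.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

// The other direction is equally direct: declare the C function and call
// it; the compiler emits a normal psABI call.
extern "C" {
    fn abs(x: i32) -> i32; // from the C standard library
}

pub fn abs_of(x: i32) -> i32 {
    unsafe { abs(x) }
}
```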
Because a lot of the operating system APIs are fundamentally sync. You call to the OS and if the OS has to wait on something it'll suspend the thread until the work is done.
Sure, io_uring is improving the situation, but it doesn't cover everything and it's Linux-specific.
So if you're doing systems programming (close to the OS, no runtime) but also want to do async (where waiting work is converted to telling the executor how to poll for readiness) you need to distinguish blocking (sync) and nonblocking/pollable/async APIs.
The "punt work to a thread pool" kludge that most async runtimes use for filesystem IO only works for things that aren't thread-bound (e.g. Send, not depending on thread locals, OS thread state etc.). It's not a universal solution. And it has overhead, the thread synchronization overhead can introduce additional latency and reduce throughput.
Another issue is long-running compute work. You don't want to declare that async and run it on the same executor, since it would reduce the concurrency and hurt the latency of other async tasks.
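Which is why runtimes make you opt into the blocking pool explicitly; a sketch with tokio that covers both the filesystem case and the compute case:

```rust
// Blocking or CPU-heavy work gets punted to a dedicated thread pool so it
// doesn't stall the async executor's worker threads. The closure must be
// Send + 'static, which is exactly the "not thread-bound" restriction
// mentioned above.
async fn checksum_file(path: std::path::PathBuf) -> std::io::Result<u64> {
    tokio::task::spawn_blocking(move || {
        let bytes = std::fs::read(&path)?;          // blocking filesystem IO
        Ok(bytes.iter().map(|&b| b as u64).sum())   // long-running compute
    })
    .await
    .expect("blocking task panicked")
}
```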
Rust is low-level enough that the difference matters (e.g. different types of locks must be used, lifetime of stack data is different).
The difference can't be completely abstracted away without making Rust a higher-level language that takes control away from the program and has a runtime that intercepts interactions with the operating system. But Rust is a low-level systems programming language, and such a change would make it unusable for its main purpose.
Removing the syntax without removing the semantic difference would only obscure what is happening, and make it harder to use sync- and async-compatible code in the right places.
Side note: I quite often wonder what a clean slate general purpose OS API design today would look like, one that doesn't have to offer compatibility with POSIX or C (like e.g. Redox is aiming for).
For this particular thread, one of those musings jumps to mind: would the hypothetical OS be focused on io_uring-style syscalls? You likely need a couple of sync syscalls still to set things up, exit the program, etc., but everything else could be async.
Not sure it would be a good idea though, without a language that integrates that approach well it would be hell to code for.
(Other musings include tag/query-based file systems instead of hierarchical ones, and many other things like that.)
I have also often thought about what clean-slate general purpose OS designs look like, and something I keep coming back to is, what would it take to have only one blocking system call? Obviously that one system call has to be "wait for the next external event, whatever it is". It is then fairly natural to say that all I/O is async, you kick it off with syscalls that return immediately and you get completion notifications via the event queue. (Not entirely unlike how NT works at its lowest levels.) A native, std-ful port of Rust to such an OS would probably want to use, or at least allow, async fn main().
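In very rough pseudo-Rust, with every name invented purely for illustration, that model looks something like this:

```rust
// Hypothetical ABI: every I/O submission returns immediately with a token,
// and the single blocking syscall hands back the next completion.
struct Completion {
    token: u64,
    result: i64,
}

fn sys_read_start(_fd: u32, _buf: &mut [u8]) -> u64 {
    // submit the read and return a token without blocking
    0
}

fn sys_wait_event() -> Completion {
    // the one and only blocking syscall: "wait for the next external event"
    Completion { token: 0, result: 0 }
}

fn event_loop() {
    let mut buf = [0u8; 4096];
    let token = sys_read_start(3, &mut buf);
    loop {
        let done = sys_wait_event(); // only place the process ever blocks
        if done.token == token {
            // in a real runtime this would wake the future suspended on
            // this token; `async fn main` would sit on top of this loop
            break;
        }
    }
}
```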
Things start getting hairy when I start thinking about memory allocation. In the ideal form of this design, memory allocation from the OS would be async as well, since allocation often needs to do substantial housekeeping work and I/O under the hood. But that would mean every invocation of a language-level allocator is an await point, which may be too painful to contemplate - not just because of the extra typing but because the whole point of preferring async functions to preemptive threading is that you cut the number of potential suspension points down to something a human can reason about. And it would also mean no page-granularity swapping; you could swap out an entire process if (and only if) it was sitting in event wait, but you couldn't allow individual memory access instructions to secretly be blocking system calls. Maybe we don't need that anymore? It might be possible to reflect page faults back to the application and have them turn into awaits, but it would need a whole lot of help from the compiler and runtime. And it would also interfere with my other key design goal for this thing, which is to abolish asynchronous signals entirely and have synchronous CPU exceptions be handled out of process.
Preemptive multitasking also becomes an issue - not in itself, but because it means the amount of time a thread can go without calling for the next external event becomes unbounded, and (because there are no signals) control-C and other "user wants the process to go away now" notifications are external events that only get acted upon when you return to your event loop. The path of least resistance here, I think, is to put a hard upper limit on the number of machine instructions that can be retired in between calls for the next event. It's the compiler's responsibility to get this right; a process that gets it wrong is killed. This pushes us toward "all functions are async" as originally suggested in this discussion, but at the price, again, of having possibly too many await points...
That is an interesting direction to take it (the extreme point). It's useful as an exercise to explore the design space, but I don't know that it would be practical to go that far, given the issues you listed.
In particular, some system calls probably make sense to offer as either async or sync. Such as memory allocation.
For efficient io_uring-style completion queues I suspect you need more than one syscall anyway (you need to tell the kernel that there are new items on the queue to process so it doesn't have to keep polling empty queues, as I understand it, though I have yet to actually code against the API in question).
But then again, for memory allocation, the program's allocator should request big chunks that it splits up (just like today). In an async world this could be done preemptively when it notices it's getting close to running out in a given pool, and thus there might not be any wait needed at all once the previous block actually runs out! That sounds pretty neat. There are probably a bunch of neat consequences like that once you start digging into it.
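A hand-wavy sketch of that pool idea; every name here (request_chunk, the watermark numbers) is invented, and real pointer handling is omitted:

```rust
use std::future::Future;
use std::pin::Pin;

// Invented sketch: a pool that owns one big chunk and, once it runs low,
// submits an async request for the next chunk *before* the current one is
// exhausted, so the common allocation path never has to await.
struct Pool {
    remaining: usize,
    pending_refill: Option<Pin<Box<dyn Future<Output = usize> + Send>>>,
}

// Stand-in for the hypothetical async "give me more memory" syscall.
async fn request_chunk(size: usize) -> usize {
    size
}

impl Pool {
    const CHUNK: usize = 1 << 20;      // ask the OS for 1 MiB at a time
    const WATERMARK: usize = 64 << 10; // start refilling below 64 KiB

    // Bookkeeping only: reserve `size` bytes out of the current chunk.
    async fn reserve(&mut self, size: usize) {
        if self.remaining < Self::WATERMARK && self.pending_refill.is_none() {
            // Kick off the refill without awaiting it yet.
            self.pending_refill = Some(Box::pin(request_chunk(Self::CHUNK)));
        }
        if self.remaining < size {
            // Unlucky path: the refill hasn't landed yet, so wait for it.
            if let Some(refill) = self.pending_refill.take() {
                self.remaining += refill.await;
            }
        }
        self.remaining -= size;
    }
}
```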
Preemptive multitasking still needs to be supported, between programs and between threads within a program. There is a whole bunch of things you cannot do without preemptive multitasking (speaking with my hard real-time developer hat on here, but even without that this is true). But since today this is handled by the kernel based on timer interrupts when a thread doesn't make syscalls for an extended period of time, I don't see why that mechanism wouldn't still work. Maybe I missed some step in your explanation?
Also io-uring already has futex support apparently, so that is promising.
Midori didn't have any blocking syscalls. Every syscall is an async function (see Joe Duffy - Asynchronous Everything). As its runtime is based on .NET, memory allocation is presumably done using a bump allocator, like in most GC-based systems. It doesn't have demand paging and everything runs in ring 0, so no page-table-change overhead either.