I fear this thread might be digressing from the point I originally wanted to make. The discussion may well be valuable, but personally I'm not all that interested in having a "C1M" server capable of handling millions of connections. What I and many others in the Rust community want instead is a stable library, endorsed by the Rust developers, which is simple to use and can easily be integrated into libraries and applications. Honestly, "c10k" would be enough at this point, as long as something stable comes out of it that we can all rely on.
The reason I chose stackful coroutines is that they can provide an interface which is mostly identical to the already existing synchronous I/O facilities. Furthermore, they are very likely to offer safety guarantees that no other solution can match.
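To make the "mostly identical to synchronous I/O" point concrete, here is a plain blocking echo server using only std with one thread per connection. The idea is that a stackful coroutine library could expose essentially this same code shape, with the thread spawn replaced by a coroutine spawn, instead of forcing callbacks or an explicit state machine. This is a sketch of the shape, not any particular library's API.

```rust
use std::io::{Read, Write};
use std::net::TcpListener;
use std::thread;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080")?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        // With stackful coroutines this would be a coroutine spawn instead,
        // but the handler body would stay exactly the same.
        thread::spawn(move || {
            let mut buf = [0u8; 4096];
            loop {
                match stream.read(&mut buf) {
                    Ok(0) | Err(_) => break, // connection closed or failed
                    Ok(n) => {
                        if stream.write_all(&buf[..n]).is_err() {
                            break;
                        }
                    }
                }
            }
        });
    }
    Ok(())
}
```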
If stackful coroutines are really that unacceptable, it would probably also be fine to use stackless ones (async/await), or simple Promises/Futures. Because if we don't get a good official/endorsed solution soon, some people will settle on library A while others choose B, and so on. And this will very likely ultimately create the same fractured library situation which C++ has - or "had", because even they seem to be adopting ASIO as the official solution soon enough. I like Rust and I'd do a lot to keep that from happening here.
I personally feel that asynchronous I/O is a highly underappreciated topic among the Rust developers, since progress on it seems far slower than it should be in the age of millions of connections and the IoT.
Now on to the off-topic part:
context-rs is actually a bit faster (~25%), while additionally being able to transfer a usize across the switch. It can also invoke a callback function after a switch (after which you could, for instance, safely take the coroutine handle out of the processor).
Movable stacks can only exist in languages that use a garbage collector (or at least a runtime that can precisely locate and rewrite pointers into the stack), so Rust won't have them. Go uses stack copying, which is a lot better than segmented stacks but likewise only possible with a garbage collector.
I was previously talking about "large" stacks with guard pages. Those stacks would allocate memory using mmap, of course, possibly with MAP_NORESERVE to prevent the OOM killer from killing your application. Thanks to virtual memory they in fact also start out at 4KB of physical memory. We could provide options to manually specify the size of a stack, and even whether it should have a guard page. I've observed my small HTTP server (similar to the ones used for C1M demos) to use about 16KB of physical memory per connection on average, which at a million connections would amount to 16GB of physical and, in my case, 128GB of virtual memory. I think it would be a good thing for Rust to have segmented stacks sooner or later anyway, and then you could choose between the two depending on your needs.
One could argue that state machines and the like use less memory - which is true - but then you have to fight the problem that every other connection receives no data because it happens to be treated unfairly. Your average software developer will then fiddle around with the code for a long time before figuring out that reading from a socket directly inside its completion callback makes other connections starve seemingly at random, because the OS just happens to always have data ready for the same few sockets.
Thankfully, people no longer underestimate the complexity of developing safe crypto functions. Issues like timing attacks are simply things developers normally don't think about, yet they are critical there. I recommend not making the same mistake with server systems by underestimating the effect of randomness. Making this safe to use for the average Joe, without forcing him to think about every little detail, should IMO have a higher importance here.