You know, reading this reminded me of this thread. It is a green threading library that implements a work-stealing executor. It seems it cannot soundly implement work stealing because of this same thread-local storage problem.
Maybe if it could be temporarily disabled for !Send types at runtime this could be sidestepped. Like, only being disabled in a runtime context.
The analogy to multi-threading isn't perfect. Unlike for that, for capsules, I think there's another sound alternative to the effects-marker approach for achieving backwards compatibility:
Simply put, all existing types could be CSend + CSync. This includes … unfortunately … all generic parameters, so you'd need a ?Sized-style opt-out thing here, too. Yes, marking these cases is a considerable effort, too, but… it might be less bad than introducing an effect for this. I'm not certain.
If all existing types are CSend + CSync, it isn't backwards-incompatible to require thread_local contents to be CSend.
How could all existing types be CSend + CSync, even though I'd proposed the opposite e.g. for Rc? The answer is following this argument:
All existing !Send or !Sync types could be considered still bound to physical threads, i.e. "TSync-limited", i.e. they would implement !TSend and/or !TSync, but CSend + CSync could be fulfilled. Only new versions of Rc and the like would be "CSync-limited". New versions of such types could probably be parametrized somehow, too, so instead of doubling all the API, it would "just" become more generic.
Alternative design: Only add TSend/TSync and make the current Sync and Send traits have the corresponding bound. This would be backwards compatible (because no existing trait would change its meaning) while still allowing developers to relax bounds on functions from Sync/Send to TSync/TSend, or to add TSync/TSend impls to existing structs, like Rc.
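To make the trait relationship in that alternative concrete, here is a rough sketch. It only models the shape of the bounds, not the actual semantics: TSend is a hypothetical marker trait, LegacySend is a made-up stand-in for today's Send (which user code cannot redefine), and Counted is a stand-in for Rc.

```rust
use std::rc::Rc;

/// Hypothetical new marker trait (the weaker of the two bounds).
unsafe trait TSend {}

/// Hypothetical stand-in for today's `Send`, with the proposed
/// supertrait bound added. Everything that is `LegacySend` is `TSend`.
unsafe trait LegacySend: TSend {}

/// An ordinary type that is `Send` today: it gets both impls.
struct Plain;
unsafe impl TSend for Plain {}
unsafe impl LegacySend for Plain {}

/// Stand-in for `Rc`: under the proposal it could opt in to `TSend`
/// without ever becoming `LegacySend`.
#[allow(dead_code)]
struct Counted(Rc<u32>);
unsafe impl TSend for Counted {}

/// An existing API with the old, stricter bound.
fn needs_legacy_send<T: LegacySend>(_value: &T) -> &'static str {
    "LegacySend"
}

/// The same API after relaxing its bound. Old callers keep compiling
/// because of the supertrait relationship.
fn needs_tsend<T: TSend>(_value: &T) -> &'static str {
    "TSend"
}

fn main() {
    let plain = Plain;
    let counted = Counted(Rc::new(1));

    println!("{}", needs_legacy_send(&plain));
    println!("{}", needs_tsend(&plain));
    println!("{}", needs_tsend(&counted));
    // needs_legacy_send(&counted); // would not compile: `Counted` is not `LegacySend`
}
```

The point of the sketch is just that relaxing a bound from the subtrait to the supertrait never rejects a type that was accepted before.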
Suppose we could somehow allow users to provide an allocation for TLS. Then async runtimes and stackful coroutine runtimes could provide an allocation of their own. TLS would become synonymous with the task_local! and coroutine_local! macros these libs provide. Would the problem go away like this?
I'd imagine that we would then introduce a more restrictive version of TLS that has guaranteed kernel threading semantics, and keep the old one as having user level threads semantics.
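For illustration, the "runtime provides the allocation" idea could look roughly like the sketch below. All names here (TaskStorage, CURRENT, run_step, bump) are made up for the example and do not come from any existing runtime. The storage belongs to the task, and the executor installs it into a real thread-local slot only while the task is running, so it follows the task if another thread steals it.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::thread;

/// Hypothetical per-task storage: the "allocation" a runtime would provide.
type TaskStorage = HashMap<&'static str, u64>;

thread_local! {
    // Holds the storage of whichever task is currently running on this thread.
    static CURRENT: RefCell<Option<TaskStorage>> = RefCell::new(None);
}

/// Runs one step of a task with its storage installed,
/// then hands the storage back to the caller (the "executor").
fn run_step(storage: TaskStorage, step: impl FnOnce()) -> TaskStorage {
    CURRENT.with(|slot| *slot.borrow_mut() = Some(storage));
    step();
    CURRENT.with(|slot| slot.borrow_mut().take().expect("storage was removed"))
}

/// Roughly what a task_local!-style accessor might boil down to:
/// it reads the installed storage instead of per-thread state.
fn bump(key: &'static str) -> u64 {
    CURRENT.with(|slot| {
        let mut slot = slot.borrow_mut();
        let storage = slot.as_mut().expect("called outside a task");
        let value = storage.entry(key).or_insert(0);
        *value += 1;
        *value
    })
}

/// Runs the first step on the current thread and the second on another one,
/// as if the task had been stolen in between.
fn demo() -> u64 {
    let storage = run_step(TaskStorage::new(), || {
        bump("counter");
    });
    let storage = thread::spawn(move || {
        run_step(storage, || {
            bump("counter");
        })
    })
    .join()
    .unwrap();
    storage["counter"]
}

fn main() {
    println!("counter after migrating threads: {}", demo());
}
```

Here the counter reaches 2 even though the two steps ran on different OS threads, because the state travels with the task. Note that this only works because TaskStorage itself is Send, which is pretty much the bound being discussed.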