Re-reading, and quoting lots of quotable people:
All of these posts got me thinking about the log crate, which explicitly describes itself as a 'lightweight logging facade'. Instead of pulling something into the standard library, could we create a facade for scoped threads, and also provide one (or more) submodules/subcrates (forgive me, I'm still learning Rust's terminology) that supply default implementations of the facade? The advantage is that if Rust starts targeting things like Vulkan/SPIR-V in the future, people can publish crates that implement the facade, and users can treat them as drop-in replacements. My assumption is that even if Rust gains the ability to compile code to SPIR-V, someone will still need to write a crate that understands the Vulkan compute APIs in order to move the compiled code onto the GPU. The same goes for CUDA, OpenCL, some kind of distributed architecture across a compute farm, or even combinations of all three.
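To make the facade idea a little more concrete, here is a rough sketch of the shape I have in mind. Every name in it (`ScopedExecutor`, `Job`, `StdThreads`, `Inline`, `sum_halves`) is made up for illustration and is not an existing API. The only real library call is `std::thread::scope`, which assumes a toolchain where that function is available (Rust 1.63 or later).

```rust
use std::thread;

/// Hypothetical job type: a boxed closure that may borrow from the caller's stack.
pub type Job<'a> = Box<dyn FnOnce() + Send + 'a>;

/// Hypothetical facade: run every job and return only once all have finished.
pub trait ScopedExecutor {
    fn run_all<'a>(&self, jobs: Vec<Job<'a>>);
}

/// One possible default backend, built on std's scoped threads.
pub struct StdThreads;

impl ScopedExecutor for StdThreads {
    fn run_all<'a>(&self, jobs: Vec<Job<'a>>) {
        // thread::scope joins every spawned thread before it returns,
        // which is what makes borrowing from the caller sound.
        thread::scope(|s| {
            for job in jobs {
                s.spawn(job);
            }
        });
    }
}

/// A second backend that runs jobs one after another on the calling thread.
/// It stands in for "some other crate implementing the facade".
pub struct Inline;

impl ScopedExecutor for Inline {
    fn run_all<'a>(&self, jobs: Vec<Job<'a>>) {
        for job in jobs {
            job();
        }
    }
}

/// User code depends only on the facade, never on a concrete backend.
pub fn sum_halves(exec: &dyn ScopedExecutor, data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    let (mut a, mut b) = (0u64, 0u64);
    let mut jobs: Vec<Job<'_>> = Vec::new();
    jobs.push(Box::new(|| { a = left.iter().sum(); }));
    jobs.push(Box::new(|| { b = right.iter().sum(); }));
    exec.run_all(jobs);
    a + b
}
```

Calling `sum_halves(&StdThreads, &data)` or `sum_halves(&Inline, &data)` gives the same answer; only the backend changes. A GPU or cluster backend would presumably need a richer job description than an opaque closure, so treat the trait as a placeholder for the idea rather than a proposal.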
I would also like to see an architecture where we can compose different concrete implementations to handle bigger problems. For example, Rust could have the equivalent of Python's SCOOP at a high level; then on each machine there might be a scheduler that understands how to feed jobs to the CPU and GPU (or GPUs), with even lower-level schedulers for each CPU or GPU core.
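Here is a toy sketch of that layering, just to show how the pieces might nest. All of the types (`Scheduler`, `Job`, `Device`, `Core`, `Machine`, `Cluster`) are hypothetical, nothing actually runs on a GPU or another machine, and the "work" is simply squaring a number. The point is only that every level implements the same trait, so levels can be stacked freely.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical tag saying which kind of device a job prefers.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Device { Cpu, Gpu }

/// Hypothetical job: a device preference plus some input.
pub struct Job { pub device: Device, pub input: u64 }

/// Every level of the hierarchy implements this same (made-up) trait.
pub trait Scheduler {
    /// Runs the job; returns its result and the name of the leaf that ran it.
    fn schedule(&self, job: Job) -> (u64, String);
}

/// Leaf level: stands in for a single CPU or GPU core.
pub struct Core { pub name: &'static str }

impl Scheduler for Core {
    fn schedule(&self, job: Job) -> (u64, String) {
        (job.input * job.input, self.name.to_string())
    }
}

/// Machine level: routes each job to the CPU or GPU scheduler below it.
pub struct Machine {
    pub cpu: Box<dyn Scheduler>,
    pub gpu: Box<dyn Scheduler>,
}

impl Scheduler for Machine {
    fn schedule(&self, job: Job) -> (u64, String) {
        let device = job.device;
        match device {
            Device::Cpu => self.cpu.schedule(job),
            Device::Gpu => self.gpu.schedule(job),
        }
    }
}

/// Top level: hands jobs to machines in round-robin order.
pub struct Cluster {
    machines: Vec<Box<dyn Scheduler>>,
    next: AtomicUsize,
}

impl Cluster {
    pub fn new(machines: Vec<Box<dyn Scheduler>>) -> Self {
        assert!(!machines.is_empty(), "a cluster needs at least one machine");
        Cluster { machines, next: AtomicUsize::new(0) }
    }
}

impl Scheduler for Cluster {
    fn schedule(&self, job: Job) -> (u64, String) {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.machines.len();
        self.machines[i].schedule(job)
    }
}

/// Convenience helper: a machine with one CPU core and one GPU core.
pub fn machine(cpu: &'static str, gpu: &'static str) -> Box<dyn Scheduler> {
    Box::new(Machine {
        cpu: Box::new(Core { name: cpu }),
        gpu: Box::new(Core { name: gpu }),
    })
}
```

With `Cluster::new(vec![machine("m0-cpu", "m0-gpu"), machine("m1-cpu", "m1-gpu")])`, the first job submitted goes to machine 0 and the second to machine 1, and within each machine the job's `device` field picks the core. Because `Cluster` is itself a `Scheduler`, a cluster of clusters would work the same way.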
OK, I've come up with the big idea, now someone go make it happen! (please take that as a tongue-in-cheek joke, and not as an insult...)