I wonder if your example could get away with defining `rng` even more generically: `impl Iterator<Item = u8>` (or some other numeric type).
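For instance, a hypothetical consumer could look like this (`shuffle` is just an illustration of the shape; the modulo is biased for slices longer than 256 elements, but it's fine as a sketch):

```rust
/// A hypothetical consumer: any byte source works as the "rng".
fn shuffle<T>(items: &mut [T], mut rng: impl Iterator<Item = u8>) {
    // Fisher-Yates, drawing one byte of entropy per swap.
    for i in (1..items.len()).rev() {
        let r = rng.next().expect("entropy source ran dry") as usize % (i + 1);
        items.swap(i, r);
    }
}

// Any iterator of bytes can be plugged in, e.g. a fixed seed for tests:
// shuffle(&mut data, [7u8, 42, 13, 99].into_iter().cycle());
```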
We have `cfg(target_has_atomic)`; perhaps we want a `cfg(target_has_process)`, `cfg(target_has_thread)`, and so on, for high-level groups of functionality that commonly doesn't exist?
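A sketch of how that might read (only `target_has_atomic` exists today; the others are the hypothetical additions):

```rust
// This cfg exists today:
#[cfg(target_has_atomic = "64")]
use std::sync::atomic::AtomicU64;

// These are the hypothetical ones suggested above; they do not exist yet:
#[cfg(target_has_thread)]
fn spawn_worker() {
    std::thread::spawn(|| { /* ... */ });
}

#[cfg(not(target_has_process))]
fn run_subcommand() {
    // fall back to an in-process implementation on targets without processes
}
```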
One thing that bugs me is that the stdlib doesn't come with a dynamically sized, rectangular 2D matrix whose elements are laid out contiguously in memory, so this has been reinvented half a dozen times across the ecosystem. And it's not necessarily a matrix of "numbers" - you could, for example, have a matrix of pixels (that is, an image), among other things. The trouble is that if you are writing a crate that accepts such a type, you will probably not readily interoperate with the other definitions of (abstractly) the very same type defined elsewhere.
I remember that somewhere people talked about how having custom metadata on dynamically sized types (rather than always being a length, or a vtable pointer, or something else special-cased in the compiler) would enable ergonomic 2D slices - I'm not sure if that's still a design goal, but it means very little if such a type isn't added to the stdlib in the end.
(Fortunately, this specific kind of matrix has basically little leeway in how you define it - the only real design choices are whether elements are laid out in column-major or row-major order, and maybe the stride - so crates that offer low-level access to the buffer give enough building blocks to manually convert between types, at the cost of some friction and boilerplate, which in practice discourages interoperability but doesn't really prevent it.)
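Concretely, the boilerplate looks something like this (both types are made-up stand-ins for the "same" matrix as defined by two different crates):

```rust
// Imagine e.g. an image crate and a math crate, both row-major:
struct MatrixA<T> { rows: usize, cols: usize, data: Vec<T> }
struct MatrixB<T> { rows: usize, cols: usize, data: Vec<T> }

// The conversion shim that low-level buffer access makes possible -
// every pair of crates needs one of these, hence the friction:
fn a_to_b<T: Clone>(a: &MatrixA<T>) -> MatrixB<T> {
    MatrixB { rows: a.rows, cols: a.cols, data: a.data.clone() }
}
```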
Another, more serious failure of interoperability was the proliferation of async crates that depend on a specific executor. Nowadays it's mostly tokio, but in its time you had crates made to work with async_std as well, or, worst of all worlds, crates that special-cased multiple executors in such a way that adding support for a new executor exacerbated the maintenance burden. (I was just reading this blog post that hints at an alternate path I wish Rust had taken from the beginning.)
Of course, that's not without upsides - having all crates depend on tokio means that all crates within the tokio ecosystem interoperate greatly. But I'm sure that's not the vision people had for async Rust when it was being devised (well, at least not all people). And at this point, network effects have made tokio so fundamental to the ecosystem, with so many crates depending on it, that one wonders why parts of it - the most stable ones, if any - weren't upstreamed to the stdlib already.
Another thing that happens is that many, many crates define serde derives (usually behind a feature flag), which is great - serde is a great interoperability success story for Rust - but then other deserialization frameworks like rkyv must fight an uphill battle (they have their own derives, and I'm not seeing everybody adding a new feature flag for rkyv derives to each crate). Here, something like Haskell's `Generic` would be nice: rather than littering types with multiple derives that basically do the same thing, have a single "mother derive" defined in the compiler, and have things like serialization frameworks just consume it. (My understanding is that this was what JeanHeyd's "a mirror for Rust" was all about.)
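The pattern in question looks like this in practice - one derive line (and one Cargo feature) per serialization framework:

```rust
// The feature names follow the usual convention of matching the crate:
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[cfg_attr(feature = "rkyv", derive(rkyv::Archive, rkyv::Serialize, rkyv::Deserialize))]
pub struct Point {
    pub x: f64,
    pub y: f64,
}
```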
So, I think the stdlib should focus on vocabulary types and traits that promote interoperability, both to avoid fragmenting the ecosystem and to avoid favoring winner-takes-all crates that become de facto standards through network effects (like serde and tokio), to the point that they could hardly be displaced or coexist on equal footing with alternative crates.
And this is really promoted by having a single, unified stdlib facade that is implicitly depended upon, rather than tons of small, standard-ish crates: because people spend no effort depending on the stdlib, it's a common denominator that only a few crates specifically opt out of.
That is the goal of the stdlib, in fact. The problem is that you want to be careful about establishing a base type for a particular domain until the crates ecosystem has had time to work out the best way of doing it.
Think of it this way: Tokio and async-std, for example, are two different implementations that don't completely interoperate. What should that interoperation look like? That is still being worked through and worked out. Once it is clear what interoperation is needed - then, and only then - should it be added to the stdlib.
I think the problem is that there are too many reasonable choices for this to be a good fit for std. How would it pick between Z-ordered, row-major, or column-major, for example? There are plausible reasons to want all of those.
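Concretely, the three layouts index the same logical element completely differently, which is what makes the choice hard to paper over later (the Morton code below is the standard bit-interleaving trick):

```rust
// Index of logical element (r, c) under three layouts:
fn row_major(r: usize, c: usize, cols: usize) -> usize { r * cols + c }
fn col_major(r: usize, c: usize, rows: usize) -> usize { c * rows + r }

// Z-order (Morton): interleave the bits of r and c.
fn z_order(r: u16, c: u16) -> u32 {
    fn spread(x: u16) -> u32 {
        let mut x = x as u32;
        x = (x | (x << 8)) & 0x00FF_00FF;
        x = (x | (x << 4)) & 0x0F0F_0F0F;
        x = (x | (x << 2)) & 0x3333_3333;
        x = (x | (x << 1)) & 0x5555_5555;
        x
    }
    (spread(r) << 1) | spread(c)
}
```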
You also want to avoid the C++ `std::valarray` situation: it was added to the standard but AFAIK is generally not actually used.
There would need to be a very definitely dominant multidimensional array to warrant inclusion into std, and it would need an almost future-proof interface. That said, I think it would be very reasonable for basic slices to be able to be written `[i32; (3, 3)]`, and this would find its way into a lot of my code.
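In the meantime, const generics can approximate the fixed-size flavor of that today (not the proposed syntax itself, just the shape):

```rust
// Roughly what `[i32; (R, C)]` would give for the fixed-size case:
struct Matrix<T, const R: usize, const C: usize> {
    data: [[T; C]; R], // contiguous, row-major
}

impl<T: Copy, const R: usize, const C: usize> Matrix<T, R, C> {
    fn get(&self, r: usize, c: usize) -> T {
        self.data[r][c]
    }
}
```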
So to kind of summarize...
Std must contain:

- Magic things that can't be implemented outside of std (intrinsics, primitives, `UnsafeCell`, &c).
- Syntactic things that form the basis of various desugarings, like `Iterator`, `Future`, `[T]`, &c.
Std should contain:

- Vocabulary types that many/most programs need to talk to each other, like `String` and `[T]` and `i32`.
- Vocabulary interfaces like `Read`/`Write`.
- CS 101 things with ~1 obvious way of doing the right thing. I would be frustrated if using `dbg!` required adding a crate, importing it, using `dbg!`, then removing the crate/`use` afterwards.
Std should not contain:

- Domain-specific libraries.
- Libraries for which there is no obvious 90%-correct use case.
- Many other things.
The "must" category makes sense, but I can imagine shrinking parts over time. A more generic way of calling into llvm intrinsics could make defining some "magic" types more of a library exercise, for instance. Parts of this category should be permanently unstable anyway.
The "should" category functions very much like a well trusted library/set of libraries that simply never increments its major version. I don't think this would have worked as well as "just a crate" when rust and cargo were younger, and it's a bit baked in now, but probably worth cleaning up over time. I'd love to see a kind of unified vision of what that looks like, but it sounds like that all depends on -Z build-std
anyway. Balkanization of standard libraries is a very delicate issue whose damage will not be immediately apparent; it is worth being extremely conservative.
> There would need to be a very definitely dominant multidimensional array to warrant inclusion into std, and it would need an almost future-proof interface. That said, I think it would be very reasonable for basic slices to be able to be written `[i32; (3, 3)]`, and this would find its way into a lot of my code.
This goes both ways. Becoming dominant requires everyone to agree and not reinvent bike sheds full of wheels to fit their personal syntactic preference.
For the multi-dimensional slices, there's this relatively recent discussion: Pre-RFC: `&[T: Trait] -> &dyn [Trait]`
It is nominally only about trait object slices, but naturally generalizes to any DST as the slice element. Well, as long as the metadata is sufficient to know the element layout.
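For reference, today's slice fat pointer carries a single `usize` of metadata; the idea is that custom DST metadata could carry, say, `(rows, cols)` instead. A sketch of the 1D case that exists today:

```rust
use std::ptr;

fn main() {
    let v = [1i32, 2, 3, 4, 5, 6];
    // A *const [i32] fat pointer is (data pointer, len). With custom DST
    // metadata, a hypothetical 2D slice could carry (rows, cols) instead.
    let p: *const [i32] = ptr::slice_from_raw_parts(v.as_ptr(), v.len());
    let s: &[i32] = unsafe { &*p };
    assert_eq!(s.len(), 6);
}
```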
That's true. And what's worse, the differences between tokio and async_std might not be that large when you compare them with io_uring executors and things like embassy (though I'm not sure on that). So this problem isn't easy to solve, and of course a bad solution in the stdlib might be worse than no solution at all. But since it's still unsolved, the ecosystem has collapsed into depending directly on tokio. So waiting too long before moving things into the stdlib also has a cost.
I think that if there was any sort of agreement on what the interoperation should look like it would already be in the stdlib. Instead we (the Rust community) are shaking it out in crates until it becomes clear what the solution should look like. Rushing something into the stdlib just to have something wouldn't solve anything.
While the problem isn't gone, Rust's trait system and crate features mean there are far fewer issues with a community being able to align on vocabulary types and traits.
For example, if you wrote a cool math library that works on std integers, and someone else wrote a cool big-int library, then it's easier and safer to create a PR adding a `mint` feature to both so they work together than in most other ecosystems, where that would require everyone to pull in a common vocabulary package even if they're not using it, configure its build, and likely make breaking API changes or use the library differently to make them compatible.
In Rust it's just adding an optional dep and a trait impl guarded by the implicit feature in both.
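Concretely, the whole integration can be as small as this (`Vec3` is an illustrative name for the library's own type; `mint` is the real interop crate):

```rust
// Cargo.toml:
//   [dependencies]
//   mint = { version = "0.5", optional = true }   # implies a `mint` feature

pub struct Vec3 { pub x: f32, pub y: f32, pub z: f32 }

// Compiled only when a downstream user enables the `mint` feature:
#[cfg(feature = "mint")]
impl From<mint::Vector3<f32>> for Vec3 {
    fn from(v: mint::Vector3<f32>) -> Self {
        Vec3 { x: v.x, y: v.y, z: v.z }
    }
}
```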
In my opinion, this feels too hacky. This approach will quickly devolve into `target_has_time`, `target_has_randomness`, `target_has_file_system`, `target_has_network`, etc. And it also makes it hard to create partial implementations of "standard" functionality. For example, what if I want to use `std::net::TcpStream` on my bare-metal target? Should I define a whole new custom target just for it?
The sysroot-based approach looks more natural to me (see the platform abstraction layer crates I linked above). We definitely need a generalization of `#[global_allocator]`, hopefully usable outside of `std` as well.
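For context, the existing hook looks like this; the suggestion is analogous injection points for other platform services:

```rust
use std::alloc::System;

// Today's precedent: the program (or a PAL crate) picks the allocator
// that everything else, std included, will use.
#[global_allocator]
static GLOBAL: System = System;
```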
Yes, and? Why would that be bad?
(In my opinion that would be an improvement on the long lists of specific `target_os` values people tend to use now.)
This has failed spectacularly in the async ecosystem, though. Only tokio and embassy work, each in its own domain. Using anything other than tokio on desktop is utterly painful.
Similarly, good luck serialising most types with anything except serde. And serde is not really usable for zero-copy deserialisation. I'm using rkyv in a project currently, and this creates issues with library types, because it isn't nearly as popular as serde.
So while in theory you have a point, in practice it doesn't work well.
smol works really well, and there are adapters like `async-compat` for things that speak tokio.
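For example, async-compat's `Compat` wrapper sets up a tokio context so tokio-dependent code can run on another executor (the version numbers in the comment are assumptions):

```rust
// Cargo.toml (assumed versions):
//   smol = "2"
//   async-compat = "0.2"
use async_compat::Compat;

fn main() {
    // Compat::new provides a tokio context for the wrapped future, so
    // tokio-dependent libraries work under smol's executor.
    smol::block_on(Compat::new(async {
        // ... await tokio-based futures here ...
    }));
}
```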
I do still expect that we will end up shipping async I/O traits in the standard library, as well as an executor trait and functions like `spawn` and `spawn_blocking` and `block_on`.
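To make the shape concrete - purely a speculative sketch, with every name and signature hypothetical, not a claim about what std will actually adopt:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// A hypothetical std async I/O trait, poll-based like today's ecosystem ones:
pub trait AsyncRead {
    fn poll_read(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut [u8],
    ) -> Poll<std::io::Result<usize>>;
}

// A hypothetical executor interface that `spawn`/`block_on` could sit on:
pub trait Executor {
    fn spawn(&self, fut: Pin<Box<dyn Future<Output = ()> + Send + 'static>>);
}
```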
It is unfortunate (from the perspective of a maintainer of several network protocol libraries) that there seems to be little appetite to prioritize this work (`AsyncRead`/`AsyncWrite` and the buf wrappers that they depend on).
The various `futures-` crates seem to be a start at aligning on some executor-independent traits too?
In particular, it turns out there are good reasons to not just put the current tokio/futures-io traits in std - the point of crates is to iterate and find what works. Off the top of my head, async cancellation, zero-copy buffers, and completion-based APIs all seem to be mutually incompatible, and lots of people have been cooking on this for a long time now.
While I haven't looked into smol, I did look at monoio and glommio (I'm interested in io_uring). Unfortunately, if you have any dependencies that use tokio, you are going to end up with two async runtimes in your process, defeating the point.
And the traits of tokio are (as I understand it) fundamentally incompatible with completion-based IO like uring. Hopefully whatever std settles on is compatible with completion-based IO.
For me this is especially important for file IO, as that is a joke in tokio (it is spawned on a separate blocking thread pool).
I don't think a lot of actual cooking has been happening recently? There is the `borrowed_buf` feature, but AIUI it has been stagnant for a while now. The futures traits are suboptimal too because they don't support reading into uninitialized memory, which was shown to be a decent performance hit.
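The cost being referred to: with the stable `Read` API, the buffer must be initialized before the call, so you pay for zeroing it even when only a few bytes arrive. A minimal illustration:

```rust
use std::io::Read;

fn read_chunk(r: &mut impl Read) -> std::io::Result<Vec<u8>> {
    // `read` takes `&mut [u8]`, so the 64 KiB must be zeroed up front
    // even if the reader only produces a handful of bytes.
    let mut buf = vec![0u8; 64 * 1024];
    let n = r.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```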