What is the purpose of std?

I have wondered: why is there an std at all?

Certain things in core are there because they need special support from the compiler; that's understandable. But what about other extra "utilities" in std that aren't special?

I can imagine a world where things like Vec and HashMap live in different crates and can have major version upgrades or several competing implementations.

4 Likes

Surely it would be terrible developer experience if basic vocabulary types required adding external dependencies (or often, writing something buggy and inefficient yourself, like in C), never mind having to figure out which one out of several you should use. There's value in having a reasonably compact standard library, but that doesn't mean the global optimum is "nothing but the compiler magic stuff". Certainly the vast majority of Rust programmers would rather it be larger than even smaller.

26 Likes

It's better to have fantastic first-party modules and libraries than multiple third-party libraries that do the same thing, may get deprecated, and make you waste time figuring out which one is best (JavaScript front-end frameworks and libraries suffer from this a lot).

First party libraries are usually well optimized. Rust's std is great.

2 Likes

Both of these problems could be solved by having cargo new add a set of default dependencies.

2 Likes

There's also the problem of ecosystem splits, where you may end up needing to employ bridges between several options because some of your dependencies made one choice and others picked something different.

13 Likes

Some of the types only work well if everyone shares their definition:

Option and Result would be painful if not everyone used them, even though they aren't really special in the "compiler magic" sense of special. Look at C++, for example, where there is nowadays an optional in std. But because it arrived late and is not used universally, the developer experience isn't great.

Vec and HashMap are not quite as central, but try using something else (e.g. DashMap, or even just a normal HashMap with a different hasher). Many crates are made to work with the bog-standard Vec or HashMap, and you end up having to convert data back and forth, often nullifying the performance boost from the more optimised types in the first place. I have run into this a lot: since I don't need DoS resistance in my projects, the default hasher in std is simply bad for my use cases. Stable build-std with configuration flags would be the ideal solution here.
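To make the "different hasher" friction concrete, here is a minimal sketch using only std. `FixedSeedMap`, `word_counts`, and `total` are hypothetical names for illustration; `total` stands in for a third-party function written against the default `HashMap`. The hasher is a type parameter of `HashMap`, so a map with another hasher is a different type and has to be rebuilt before it can be passed along.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Hypothetical alias: a map whose hasher is built with fixed keys
/// instead of the per-map random seed of the default `RandomState`.
type FixedSeedMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

/// Stand-in for a dependency's API that only accepts the default `HashMap`.
fn total(counts: &HashMap<String, u32>) -> u32 {
    counts.values().sum()
}

/// Counts words into the map type with the non-default hasher.
fn word_counts(words: &[&str]) -> FixedSeedMap<String, u32> {
    let mut map = FixedSeedMap::default();
    for w in words {
        *map.entry(w.to_string()).or_insert(0) += 1;
    }
    map
}

fn main() {
    let mine = word_counts(&["a", "b", "a"]);
    // `total(&mine)` does not type-check: the hasher is part of the map's type.
    // Every entry has to be moved and rehashed into a default-hasher map first.
    let converted: HashMap<String, u32> = mine.into_iter().collect();
    println!("total = {}", total(&converted));
}
```

A library can avoid forcing this conversion by being generic over the hasher (`fn total<S: BuildHasher>(counts: &HashMap<String, u32, S>)`), but many existing APIs are not written that way.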

Then there is the question of wide platform support: I might use rustix to do file I/O, but then I can't run on Windows. std basically supports and abstracts over every platform that Rust supports (well, ignoring no-std embedded for obvious reasons). It is a common baseline across operating systems.

There are probably some bits that could be argued aren't needed in std, but I find Rust std a lot better than most languages.

11 Likes

I also would prefer for Rust to have a more "modular" std. Not necessarily on the level of Vec being implemented in a crates.io crate, but at least as additional "sysroot" crates. std could then play the role of a convenience facade crate which simply re-exports items from other crates.

It would make it easier to see what is required by which crate, e.g. I could easily see that foo uses HashMap, while bar does something with the file system. Obviously, this does not provide any guarantees, but it can still be useful. And it would make it easier to introduce other sysroot crates and implement partial std support for new targets.

Unfortunately, I don't have high hopes for this to be implemented in Rust 1. Similar ideas were proposed in the past with disappointingly low traction.

5 Likes

Vec is, of course, in alloc not std. And TBH HashMap will move there as soon as we can figure out how to have the "default to a randomized hash seed" part still be in std while the rest is in alloc.
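A small sketch of what "Vec is in alloc" means in practice: `std::vec::Vec` is a re-export of `alloc::vec::Vec`, so the two paths name the same type. The function name `sum` and the alias `AllocVec` are only for illustration.

```rust
// `alloc` is part of the sysroot but has to be named explicitly.
extern crate alloc;

use alloc::vec::Vec as AllocVec;

/// Written against the `alloc` path, as a `no_std` + `alloc` crate would be.
fn sum(values: &AllocVec<u32>) -> u32 {
    values.iter().sum()
}

fn main() {
    // Created through the `std` path; accepted as-is because it is the same type.
    let v: std::vec::Vec<u32> = vec![1, 2, 3];
    println!("sum = {}", sum(&v));
}
```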

As for things like mpsc, though? Yeah, Rust should just have never provided that in the first place. It's a legacy of very-old "has a GC, a runtime, and green threads" Rust.

That said, I also think it's important for the sort of "CS 101 basics" things to be available. println! isn't something you should ever use in "real" code, but it's important that it's there without extra ceremony to keep new people from bouncing off early.

14 Likes

For Vec it bugs me that it's asymmetric with memory handling: it doesn't release memory automatically when shrinking which makes it a poor default design for lists because you're risking allocating a lot more memory than you're using (e.g. O(n^2) vs O(n)). And now that design choice is stuck in std forever.
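The asymmetry can be observed directly. This is a minimal sketch; `capacity_trace` is a hypothetical helper name, and the exact capacity after `shrink_to_fit` depends on the allocator, so the code only compares it against the earlier value.

```rust
/// Returns (capacity when full, capacity after `truncate`, capacity after `shrink_to_fit`).
fn capacity_trace(n: usize, keep: usize) -> (usize, usize, usize) {
    let mut v: Vec<u64> = (0..n as u64).collect();
    let full = v.capacity();

    // Dropping elements does not give any memory back.
    v.truncate(keep);
    let after_truncate = v.capacity();

    // Releasing the excess has to be requested explicitly.
    v.shrink_to_fit();
    let after_shrink = v.capacity();

    (full, after_truncate, after_shrink)
}

fn main() {
    let (full, after_truncate, after_shrink) = capacity_trace(100_000, 10);
    println!("full: {full}, after truncate: {after_truncate}, after shrink_to_fit: {after_shrink}");
}
```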

Sure, but that could be done in a less intrusive way: cargo new could simply put in a dependency on some stdio crate by default.

2 Likes

16 posts were moved into a new topic: Vec is asymmetric with memory handling

One aspect of this is that being in std (alloc, core) means there'll only be a version 2 if there's a Rust 2.0, which isn't a typical crate guarantee.

2 Likes

I think it would be nice if every module of std::something was its own separate crate, assuming rustc/Cargo made it hassle-free to import it.

std has a problem of being tied to the language and to its stability guarantees. It has a problem of having to make sense on every Rust platform (it already doesn't in browser WASM where there's no std::fs and std::time needs special care).

7 Likes

There already is the concept of the sysroot, where the std, alloc, core, proc_macro, and test crates live. In theory, more crates could live there.

With the (slow but present) work on std-as-a-crate and the accepted RFC for crates-as-namespaces, I think it is a possible future to see some of the std modules lifted into “std crates,” though I doubt that we'll see any of the current modules removed from the “1.0 forever” guarantee.

It would be an informative project to map out the module usage edges in std to see which modules could be split into separate crates. Just looking at std and not core or alloc, I think the only ones with strong enough identity to potentially be leaf crates are env, fs, net, process, time, and thread.

There's an interesting fundamental conflict between the desire to split functionality between boundaries that make sense for the implementation and ones for discovery, since those are often at odds.

2 Likes

Rather than having a ton of separate crates, why not just have a way to check if a given path is reachable? E.g. if std::process is unreachable then processes aren't supported (or at least aren't implemented) for this target.

That works, and cfg(accessible) is RFC-accepted, but it adds further incidental complexity to name resolution (which is already fixpoint with parsing due to namespaced macros, and could introduce actual paradoxes if checking non-extern-crate-rooted paths is allowed), and having static metadata declaring what a given crate is dependent on is beneficial to build orchestration.

Plus, crate separation enables major-version bumps, even if they're unlikely enough to treat as impossible[1].


  1. There's one way I see it happening: major API versioning which is just a facade over what symbols are visible, similar to the IntoIterator edition hacks. It's still unlikely and just deprecation would always be preferred, but I could see it happening, unlike a hard split.

2 Likes

There are certain historical reasons: std was introduced before core and alloc. However, std is still needed, as it is precompiled, which improves compilation speed.

Under certain extreme conditions, it may be necessary to recompile std. For example, the LoongArch target now enables LSX (128-bit SIMD) by default. However, some embedded LoongArch SoCs do not support LSX, so it needs to be excluded: How to remove default target feature LSX for LoongArch64

Can you show an example of such a problem?

In theory this could be said about almost any crate: if somebody depends on Rng from rand 0.8 and I use rand 0.9, we have a problem using the same random generator. But somehow people seem to deal fine with major version updates for the most part.
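For illustration, this is roughly what dealing with such a mismatch looks like when it does surface in an API. Everything here is hypothetical: `rng_v1` and `rng_v2` are stand-in modules for two semver-incompatible releases of one crate and do not mirror the real rand API; `Counter`, `V2AsV1`, and `roll_die` are made-up names.

```rust
// Two traits with the same name from incompatible releases are unrelated types.
mod rng_v1 {
    pub trait Rng {
        fn next_u32(&mut self) -> u32;
    }
}

mod rng_v2 {
    pub trait Rng {
        fn next_u64(&mut self) -> u64;
    }
}

/// A dependency written against the old release.
fn roll_die<R: rng_v1::Rng>(rng: &mut R) -> u32 {
    rng.next_u32() % 6 + 1
}

/// Your own generator, implemented against the new release.
/// A counter is used only so the output is easy to predict.
struct Counter(u64);

impl rng_v2::Rng for Counter {
    fn next_u64(&mut self) -> u64 {
        self.0 += 1;
        self.0
    }
}

/// The bridge: wraps a v2 generator so it can be handed to v1 code.
struct V2AsV1<'a, R: rng_v2::Rng>(&'a mut R);

impl<R: rng_v2::Rng> rng_v1::Rng for V2AsV1<'_, R> {
    fn next_u32(&mut self) -> u32 {
        // Keep the low 32 bits of the 64-bit output.
        self.0.next_u64() as u32
    }
}

fn main() {
    let mut rng = Counter(0);
    // `roll_die(&mut rng)` would not compile; the adapter is required.
    println!("rolled {}", roll_die(&mut V2AsV1(&mut rng)));
}
```

The adapter is cheap for a trait like this. The cost grows when the mismatched type is a data structure that has to be copied or converted rather than wrapped.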

Your random number generator is generally not part of your public API. HashMaps with a fixed hasher are often part of the public API.

4 Likes

Say I want to build a crate that implements some randomized algorithm. What's the right way to do it? I need a random number generator, so optimally I'd want the user to be able to pass the generator they want to use:

pub fn rabin_miller_primality_test<R: Rng>(n: u128, rng: &mut R) -> bool

If I use my own generator, then I have to worry about choosing a generator, which may depend on the use case, and about initializing it somehow, which duplicates work if the user already has one. So it seems better to expose this in the API. Exposing it also allows the user to seed it so that they have repeatability, perhaps as part of some larger randomized algorithm.
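A self-contained sketch of that API shape, under stated assumptions: to avoid an external dependency, `RandomSource` is a hypothetical stand-in for the rand trait in the signature above, `XorShift64` is a toy generator for illustration only (not suitable for cryptographic use), and the test works on `u64` rather than `u128` so that modular multiplication fits in `u128`.

```rust
/// Hypothetical stand-in for an RNG trait such as the one in the signature above.
pub trait RandomSource {
    fn next_u64(&mut self) -> u64;
}

/// Toy xorshift generator; the caller picks the seed, so runs are repeatable.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }
}

impl RandomSource for XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

fn pow_mod(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut acc = 1 % m;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base, m);
        }
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    acc
}

/// Miller-Rabin test with `rounds` random bases drawn from the caller's generator.
/// Primes are always accepted; a composite may slip through with small probability.
pub fn is_probable_prime<R: RandomSource>(n: u64, rounds: u32, rng: &mut R) -> bool {
    if n < 2 {
        return false;
    }
    if n < 4 {
        return true; // 2 and 3
    }
    if n % 2 == 0 {
        return false;
    }
    // Write n - 1 as d * 2^s with d odd.
    let s = (n - 1).trailing_zeros();
    let d = (n - 1) >> s;
    'witness: for _ in 0..rounds {
        // Random base in [2, n - 2]; the modulo bias is ignored in this sketch.
        let a = 2 + rng.next_u64() % (n - 3);
        let mut x = pow_mod(a, d, n);
        if x == 1 || x == n - 1 {
            continue;
        }
        for _ in 1..s {
            x = mul_mod(x, x, n);
            if x == n - 1 {
                continue 'witness;
            }
        }
        return false; // `a` proves that n is composite
    }
    true
}

fn main() {
    let mut rng = XorShift64::new(42);
    println!("{}", is_probable_prime(1_000_000_007, 16, &mut rng));
}
```

Because the generator is a parameter, the caller decides which generator to use and how to seed it, and the function itself never has to construct one.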

1 Like

For rand, you’d probably want to write R: RngCore instead of R: Rng. The rand_core crate didn’t change between rand versions 0.6, 0.7, 0.8. It did change towards 0.9 though – but at least there’s less frequent breakage.

3 Likes