Build-std and the standard library

@alexcrichton and I have been working on the build-std feature of Cargo that allows it to build the standard library from source for any project. One of the tricky issues we've been wrestling with is determining which crates of the standard library to build for each project, particularly for no_std targets. Currently you can pass them on the command line (like -Zbuild-std=core,alloc).

After discussing some different approaches we've settled on a somewhat radical proposal: change the standard library so that all of its crates build on all targets, no questions asked. We wanted to solicit feedback from a wider audience on this proposal.

Why build std on all targets?

The current strategy of specifying on the command line what crates to build is not something we ever intend to stabilize long-term. The set of crates to build is something that the end-user shouldn't be configuring, but rather it's inherent to each crate itself (more or less), so the current flag is a bit backwards.

If std simply builds on all targets, we think this could significantly simplify the user experience. Enabling build-std mode would be a simple toggle (like setting CARGO_BUILD_STD=true) which should work for any project.

Getting std building for all targets

At a high level, what I mean when I say "get std building for all targets" is simply "no compile errors". There are a number of questions about what APIs are available, but the bare minimum intention here is simply that the crate compiles with zero compile-time errors.

In practice this means that we'll change the source of the standard library to some combination of adding runtime errors (like wasm, wasi, sgx, hermit, and cloudabi already do) and cfg attributes to prune things that are not possible to express (for example a target which doesn't define the types for the std::os::raw module).
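As a rough illustration of those two techniques (a sketch only, not actual standard library source), the snippet below keeps a function compiling on every target by returning a runtime error where the operation can't be supported, and uses a cfg attribute to prune an item that can't be expressed at all. The names `run_on_thread` and `RawHandle`, and the choice of `any(unix, windows)` as the "supported" condition, are assumptions made up for this example.

```rust
use std::io;

/// Hypothetical shim: on targets assumed to have threads, run the closure
/// on a new thread and return its result.
#[cfg(any(unix, windows))]
pub fn run_on_thread<F>(f: F) -> io::Result<i32>
where
    F: FnOnce() -> i32 + Send + 'static,
{
    std::thread::Builder::new()
        .spawn(f)?
        .join()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "worker panicked"))
}

/// On every other target the function still exists and compiles,
/// but it reports failure at run time instead of at compile time.
#[cfg(not(any(unix, windows)))]
pub fn run_on_thread<F>(_f: F) -> io::Result<i32>
where
    F: FnOnce() -> i32 + Send + 'static,
{
    Err(io::Error::new(
        io::ErrorKind::Unsupported,
        "threads are not available on this target",
    ))
}

/// An item with no sensible definition elsewhere is pruned with cfg,
/// so it simply does not exist on other targets.
#[cfg(unix)]
pub type RawHandle = std::os::raw::c_int;
```

Callers of `run_on_thread` compile everywhere and handle the `Unsupported` error at run time, while code that names `RawHandle` only compiles where the item exists.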

Does this affect stable?

The intent is that initially this shouldn't have any user-visible changes on stable. For all targets that rustc supports, std either builds today or doesn't:

  • If std builds it's already stable and nothing changes
  • If std doesn't build, then this proposal would mark std as #[unstable] for that target for the time being.

The longer-term future of this proposal is somewhat unclear. From the Cargo build-std perspective there's no need to get a stable build of std on all platforms, just the guarantee that it simply builds on all platforms.

It's hoped that this organization is something we'd experiment with over time. For example, maybe it's best to return runtime errors everywhere. Or maybe it's best to simply omit APIs and have a libstd that looks different on each target. We're not sure!

I recognize that it won't be easy, and I've been thinking about some of the trickier bits (handling libc, proc_macro, the panic runtimes, the global allocator, the sys module, etc.). I would be happy to get some feedback on the general concept! I'm hoping that we don't get too hung up on the "what would a stable libstd look like for all platforms" question just yet. Instead, I'd like feedback to focus on the idea of getting libstd building everywhere, leaving us runway to answer that question later.

Hmm...

So this is necessary because Cargo can't tell which standard library crates a given crate needs? In particular, no_std crates are marked as such only in the .rs file, not in Cargo.toml.

In the long run, I would like to see std and especially core start using Cargo features, and move almost all functionality to being dependent on features, so that you can get a minimalist standard library if you want. (The default set of features would still enable everything.) For this to work well, crates should be allowed to specify standard library crates as Cargo.toml dependencies, since that would be the place to specify which features they need. Incidentally, it would also effectively create a way to specify a minimum supported Rust version:

[dependencies.core]
version = "1.39"
features = ["fmt"]

In theory, this could also provide an alternate path to solving the issue at hand. All existing crates would have to be treated as potentially dependent on std. But Cargo could adopt a rule that if a crate's Cargo.toml has core in its dependencies list but not std, the crate is assumed to be no_std (and ideally this would be enforced). A future edition might then make it mandatory to explicitly list std or core as a dependency.
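To make the suggested rule concrete, here is a minimal sketch of the decision it describes. This is not Cargo code; the `StdRoot` enum and `infer_std_root` function are hypothetical names, and the input is assumed to be the list of dependency names from a crate's Cargo.toml.

```rust
/// Which standard library crate a package is assumed to need (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum StdRoot {
    Core,
    Std,
}

/// Sketch of the rule: a crate that lists `core` but not `std` is assumed
/// to be no_std; every other crate is treated as potentially using std.
pub fn infer_std_root(declared_deps: &[&str]) -> StdRoot {
    let has_core = declared_deps.contains(&"core");
    let has_std = declared_deps.contains(&"std");
    if has_core && !has_std {
        StdRoot::Core
    } else {
        StdRoot::Std
    }
}
```

Under this rule an existing crate that lists neither `core` nor `std` still maps to `StdRoot::Std`, which is what keeps the scheme backwards compatible.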

On the other hand, that's admittedly a somewhat long-term approach, in that it requires all no_std crates to update their Cargo.toml in order to take advantage of the new functionality. I'm not against the idea of making all standard library crates build everywhere, either as a temporary workaround or indefinitely. But I don't fully understand the use case. If a given target supports barely any of libstd, which would presumably be the case at first for any target without an allocator... then what's the benefit of building an empty libstd over just not building it? Crates that use std will break either way.

In the future, "a libstd that looks different on each target" could be implemented at least in part based on Cargo features.

I like your proposal overall much better than stubbing out everything everywhere. But rather than implicitly adding an edge to std, why not just let the feature only work if it can figure out the dependencies, and prod people trying to use the feature into adding them...

I'm not a diplomat, but I guess I just don't see the point of kicking this implicit dependency further down the road.

I don't have a strong preference one way or the other. But as I said, I think it might be nice if a future edition required all crates to list std in their Cargo.toml. This would both encode the minimum supported Rust version, and more generally make std feel less special compared to other libraries.

Admittedly, it would also create churn in literally every crate, which should always weigh very, very strongly against making any change. That said, changing editions already requires editing Cargo.toml (just to specify which edition you want), and it would be very easy for cargo fix to add the std dependency as well. (But not everyone will use cargo fix when updating editions...)

Anyway, if we do decide to do that, then it would make sense for the edition to be the one and only determinant of whether you need to explicitly depend on std. Code on existing editions shouldn't need to do so, even if it wants to take advantage of new features such as std-aware Cargo. Though the unavailability of std on some platforms does complicate things. shrug

I was under the impression that this was the goal of the "portability lint std" anyway: while functionality implementation might be strewn all around multiple standard library crates, std would be the facade through which all of it is accessed, always.

That said, I do currently think that the ideal end position for "std aware cargo" would see an entry in Cargo.toml for std. It makes some sense to assume a std=latest if neither std nor core is specified, and not if either is explicitly specified. (Then have some sort of escape hatch for no_core inasmuch as that's ever a supported use case.)

But short term, I think making std "usable" on more platforms (at least those that have more than just core) is still worthwhile. Cargo could also short-term grow an unstable configuration to control which crates are implicitly available; this would be helpful now for making sure crates that purport to be no_std don't accidentally include std at some point in the hierarchy.

I guess I should also at least state where my position stems from in that it is atypical and not really supported afaik...

In our case we want to have the native syscall just using core, and then provide alloc + posix emulation as opt-in crates/features. With these stubs it seems like it entails distributing/linking to separate compilations of std, rather than linking in the new crates providing new symbols.

Anyhow... I'm guessing weak symbols would work, but from my perspective it'd be nice to eventually have something which deals with stubs that may or may not be implemented depending upon compilation.

So.. how many times/how often would core/std get compiled then..?

This seems like a good reason to implement the portability lint.

Once within a dependency tree, if core/std participate in Cargo's dependency system they should have features unioned like a normal dependency.
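A small sketch of what "unioned" means here, assuming each crate in the tree simply lists the feature names it requests from the shared dependency. The function `unify_features` is a hypothetical helper written for this example, not part of Cargo.

```rust
use std::collections::BTreeSet;

/// Union the feature lists that every crate in the tree requests for one
/// shared dependency, so the dependency is built once with all of them.
pub fn unify_features<'a>(requests: &[&[&'a str]]) -> BTreeSet<&'a str> {
    requests
        .iter()
        .flat_map(|features| features.iter().copied())
        .collect()
}
```

If one crate asks for `fmt` and another for `fmt` and `alloc`, the single build of the dependency gets both `alloc` and `fmt` enabled.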

So there'd be a build once for each "top-level" workspace you work on, right? How about crater? Would that get a lot slower too?

I think this seems like a necessary step toward a separate long-standing goal: move from a multiple crate model of core/std to a cfg-flag model of core/std.

There are core-compatible APIs we cannot move to core because of coherence and so on, which would be totally irrelevant if core were just std with no default features turned on. It would be much more flexible if we were able to treat alloc, std, etc. as features on a single library instead of multiple libraries. This is completely separate from whether we change what the features actually are (that is, it doesn't mean we would need to be more fine-grained than we are now), just how we build them.

A good example is making the Error trait available in core.
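A hedged sketch of what a single-library, cfg-flag layout could look like for that example. Everything here is hypothetical: the feature name `alloc`, the local `Error` trait (a stand-in, not the real `std::error::Error`), and the `ParseFailure` and `boxed` items are all invented for illustration.

```rust
use core::fmt;

/// Stand-in for an Error trait that lives in the always-available layer.
pub trait Error: fmt::Debug + fmt::Display {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        None
    }
}

/// A small error type that needs nothing beyond the core layer.
#[derive(Debug)]
pub struct ParseFailure;

impl fmt::Display for ParseFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("parse failure")
    }
}

impl Error for ParseFailure {}

/// Only compiled when the assumed `alloc` feature is enabled. Because the
/// trait and this helper share one crate, no cross-crate coherence issue
/// arises. (In a real no_std crate, Box would come from `alloc::boxed`.)
#[cfg(feature = "alloc")]
pub fn boxed<E: Error + 'static>(err: E) -> Box<dyn Error> {
    Box::new(err)
}
```

With the feature off, the trait and `ParseFailure` are still there and only the allocation-dependent helper disappears.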

I am a big fan of doing this.

[Warning: non-authoritative source. Not sure what @ehuss and @alexcrichton are planning.]

In the short term, I'd think we could keep including libcore in the sysroot and using it by default; Cargo would only build its own copy if you either request non-default features or build for a custom target.

Long term, I'd really love to see a system where any crate could get the benefit of being precompiled rather than just the standard library. That way, the standard library could stop being special, while vastly improving build time instead of regressing it. Start with a system-global cache for compiled crates, ideally supplement with an online service that automatically builds things from crates.io. Yes, there would be a lot of obstacles to implementing this, but it's still worth it IMO. In any case, it's a subject for another thread.

Using explicit std dependencies is definitely an option, and one of our initial considerations (in fact, it is already implemented). However, it incurs some drawbacks that I think can be removed completely by just making std build everywhere. Some of those drawbacks are:

  • For the vast majority of projects (using std), it provides no benefit over the current implicit system. I'd prefer to avoid adding needless ceremony.
  • For backwards compatibility reasons, Cargo needs to pick some default for projects that do not declare their dependencies. One approach is to infer what is needed from the rest of the build graph, falling back to some global default if none of them do. This causes some potentially strange non-local reasoning (the crates available depend on what's in the graph).

The benefit is that there is no need to declare which standard library crates are needed.

I don't want to put any words in his mouth, but Alex is of the opinion that the portability lint is not viable (at least as originally defined). I'm starting to come around to that perspective. I still think there is some value in checking and enforcing compatibility constraints. But this is somewhat orthogonal, as some linting mechanism can be added at any time (maybe not necessarily cfg based).

Why can't Cargo just not supply std as an implicit dependency on targets that don't support it? That will cause crates that use std to fail to build, but they would fail to build anyway if std exists but is lacking most or all of its functionality.

I suppose making std build everywhere would simplify the implementation a bit, since Cargo wouldn't have to check whether or not std is available. Implementation convenience is not a bad thing; it may be a good enough justification for making std build on all targets, if there's no downside in doing so. But you're portraying it as a necessity rather than a convenience, and I don't see why it would be.

Do you mean that explicit std dependencies would be fully optional (and not affect defaults), or that they wouldn't be supported at all?

I think they should at least be supported, because they're a natural way to specify features, as well as a minimum version. If they're not supported, feature-ifying std would require some bespoke Cargo.toml key to specify which std features you want, which seems unnecessarily complex.

Cargo doesn't know which targets support std.

This isn't about implementation convenience (modifying std will be substantially more work). It's about user convenience. I'm not sure where I implied a necessity. This is about making the user experience smoother.

By an "option", I mean it is on the table as an alternative to a std that builds everywhere.

I don't agree about it being a good way to express a minimum version, particularly if there are multiple crates (which would require updating many places). The way the rust version works is not the same as how crate resolution works, so I think it could be confusing by conflating the concepts.

As for features, we haven't made any progress on how that could potentially work. Cargo features as-is probably won't work due to compatibility guarantees. It would be helpful to me to gather concrete use cases for std features, since I haven't seen many. If anyone has ideas for things they would like to see, I would appreciate it if you could leave a comment on issue #4 of rust-lang/wg-cargo-std-aware ("Build the standard library with different cargo features"). Until we have an idea of what features would actually look like, it may be premature to assume how they should be expressed.

It could learn.

How so? As long as the default set of features includes everything currently in std, existing crates would continue to work.

This seems bad UX-wise: the majority of Cargo.toml files would have these lines, which would dilute their value to the human reader.

I think it might be right to conceptually treat std deps as just usual deps, but in the surface syntax Cargo should implicitly add [dependencies.std] sections, and have an opt-out syntax (dependencies.std = false, or some such).

If std is available on all targets and there is nothing like the portability lint, how would we set up CI to test that our crates are no_std compatible?

Currently the only easy valid test I know of is to build with something like thumbv7m-none-eabi, which doesn't ship std; then if any of your dependencies happen to import std you will receive error[E0463]: can't find crate for `std`. If std was available but stubbed out for thumbv7m-none-eabi, similar to the wasm setup, then this would presumably build successfully and you would have to actually attempt running tests to see the unimplemented! error messages, which is a much more difficult CI setup.

Could we get more detail on this? Honestly, this thread's proposal sounded to me exactly like "the portability lint design was basically right, so we're gonna make std-aware cargo assume that's the future we're headed toward". So if the portability lint is no longer on the table, yet this thread exists, we must have completely different understandings of what it was or what build-std is doing.
