Discussion: changing the sysroot for cross-compilation


#1

One of the last remaining things that prevents me from using stable rust for some of my no_std projects is that I still need to cross-compile libcore, liballoc, etc. This leads me to use tools like cargo xbuild, which are forced to be unstable because they involve changing the sysroot.

I wanted to start a discussion on how we might stabilize this ability while also being forward-compatible with a future in which cargo subsumes xargo. Specifically, I really don’t know much about how sysroots work. I would like this thread to be a place where we can flesh out

  1. What are the challenges to stabilizing a mechanism for changing sysroot?
  2. What might a possible design be that is forward-compatible with merging cargo and xargo?

#2

So… to seed the discussion, let me propose the following:

  • A “sysroot” directory is defined as a directory whose contents are a bunch of subdirectories, each named after a target triple. Each triple subdirectory contains the compiled form of central rust libraries compiled for that target triple. In particular, they must contain libcore and libcompiler_builtins and may optionally contain liballoc and/or libstd.
  • cargo and rustc gain a --sysroot=/path/to/sysroot flag, which specifies the path to the sysroot to use for whatever operation you are doing.
    • If the sysroot specified does not contain the required target triple, an error is reported.
    • The default value for this flag is wherever rustup normally puts stuff. I do not propose stabilizing this default location for now.
    • The --sysroot flag is available on stable rust.
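
To make the proposed layout concrete, here is a minimal sketch of the lookup it implies. Everything in it is an assumption for illustration: the helper name find_target_dir is hypothetical, and I am assuming an artifact for a crate is any file whose name starts with the crate name (e.g. libcore-<hash>.rlib). It is not how rustc resolves a sysroot today.

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Crates the proposal requires in every triple directory.
const REQUIRED: [&str; 2] = ["libcore", "libcompiler_builtins"];

/// Hypothetical helper: returns the directory for `triple` inside `sysroot`,
/// or an error if the directory or a required crate is missing.
fn find_target_dir(sysroot: &Path, triple: &str) -> Result<PathBuf, String> {
    let dir = sysroot.join(triple);
    if !dir.is_dir() {
        return Err(format!(
            "sysroot `{}` does not contain target `{}`",
            sysroot.display(),
            triple
        ));
    }
    // File names present in the triple directory.
    let names: Vec<String> = fs::read_dir(&dir)
        .map_err(|e| e.to_string())?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.file_name().to_string_lossy().into_owned())
        .collect();
    // Assumption: an artifact for `libfoo` is any file whose name starts with `libfoo`.
    for &krate in REQUIRED.iter() {
        if !names.iter().any(|name| name.starts_with(krate)) {
            return Err(format!("target `{}` is missing `{}`", triple, krate));
        }
    }
    Ok(dir)
}

fn main() {
    // The sysroot path and triple come from the command line of this sketch.
    let mut args = std::env::args().skip(1);
    match (args.next(), args.next()) {
        (Some(sysroot), Some(triple)) => match find_target_dir(Path::new(&sysroot), &triple) {
            Ok(dir) => println!("using {}", dir.display()),
            Err(err) => eprintln!("error: {}", err),
        },
        _ => eprintln!("usage: sysroot-check <sysroot> <triple>"),
    }
}
```

The optional crates (liballoc, libstd) are deliberately not checked, since the proposal only requires libcore and libcompiler_builtins.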

#3

+1 for rustc gaining a stable --sysroot flag - GCC can be built with support for the same option and it is very convenient for building freestanding C projects.

However, I think Cargo could automatically choose the correct sysroot (based on the triple passed to it with --target), and for us to stabilise a new sysroot location along with this - ~/.cargo/toolchains/{triple}/. In the future, Cargo could build custom toolchains for cross-compilation in this location, and rustup (and eventually Cargo itself if they’re merged, which I think there’s a plan for?) could move to installing the downloaded toolchains here (instead of in ~/.rustup/toolchains/).

This enables:

  • Existing tools to build a sysroot wherever they please, and pass it to rustc with --sysroot
  • Future Cargo to build toolchains for cross-compilation and pass them in the future, in a location that makes sense

One question for the discussion: if we centralise the sysroot for all targets, what should two targets with the same triple defined with .json files be stored under? I only have knowledge of the internals of cargo-xbuild, for which this is not a problem because we store the sysroot under the /target/ directory for that crate. I don’t know how xargo does this.


#4

What you described sounds like a good long-term goal and agrees with what I imagine for the future too. However, my impression was that the cargo team is avoiding anything major for the time being, so I was trying to find a minimalist approach. In other words, the minimal set of stabilizations that are future-compatible and also allow cargo xbuild to run on stable.

I think my proposal also accomplishes this, no?

The solution that comes to my mind is to simply check if the targets are actually the same (e.g. compare hashes of the .json files) and just throw an error if they don’t match.
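
A rough sketch of what I mean, with hypothetical names (SysrootRegistry, register) and assuming we hash the raw text of the .json file. Hashing the raw text means that a whitespace-only change counts as a different target, and DefaultHasher is not guaranteed to be stable across Rust releases, so a real tool would probably want a fixed hash function instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hashes the raw contents of a target specification file.
fn spec_hash(spec_json: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    spec_json.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical registry: remembers which spec each triple was first built with.
struct SysrootRegistry {
    seen: HashMap<String, u64>,
}

impl SysrootRegistry {
    fn new() -> Self {
        SysrootRegistry { seen: HashMap::new() }
    }

    /// Accepts a triple if it is new or if its spec matches the one seen before;
    /// returns an error if the same triple shows up with a different spec.
    fn register(&mut self, triple: &str, spec_json: &str) -> Result<(), String> {
        let hash = spec_hash(spec_json);
        let existing = self.seen.get(triple).copied();
        match existing {
            Some(old) if old != hash => Err(format!(
                "target `{}` is already registered with a different specification",
                triple
            )),
            Some(_) => Ok(()),
            None => {
                self.seen.insert(triple.to_string(), hash);
                Ok(())
            }
        }
    }
}

fn main() {
    let mut registry = SysrootRegistry::new();
    // The triple and JSON contents below are made up for the example.
    let first = r#"{"llvm-target": "x86_64-unknown-none", "panic-strategy": "abort"}"#;
    let second = r#"{"llvm-target": "x86_64-unknown-none", "panic-strategy": "unwind"}"#;
    println!("{:?}", registry.register("x86_64-my_os", first));
    println!("{:?}", registry.register("x86_64-my_os", first));
    println!("{:?}", registry.register("x86_64-my_os", second));
}
```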


#5

In fact, now that you mention it, I bet this could be done without adding a new flag to cargo at all. We only add the --sysroot flag to rustc for now, and tools should add RUSTFLAGS=... to the environment before invoking cargo.


#6

Yes, sorry if I was not clear. I believe the minimal set of stabilizations needed atm would just be the addition of --sysroot to rustc, and no changes to cargo. This should allow tools such as cargo-xbuild to work with stable versions of Rust, and is also forward-compatible with the long-term goal I laid out, and wouldn’t create useless-in-the-future-but-stable features of Cargo that we have to keep around.


#7

Perhaps I’ll write up a Pre-RFC soon…


#8

Note that the sysroot concept could be removed altogether, because Rust doesn’t really need it anyway: in theory, std and the rest could be just a usual dependency in Cargo.toml. Such a “std is a usual dependency” situation would be beneficial for several reasons: for example, one would be able to select optimization flags for std oneself. This is in contrast to the current situation, where std has to come in one size that fits all and, for example, contains a significant amount of debug info, which blows up the hello-world binary size.

I think this is the latest summary of the idea: https://github.com/rust-lang/cargo/issues/4959#issuecomment-374015022.

The main non-technical problem with this idea seems to be that it is a rather significant change of how things work, so it’s much harder to pull off (you need a dedicated person/team) than to incrementally paper over drawbacks of the existing implementation. So, we’ve been talking over it for years, but it never came to fruition, although, as I understand it, there are no blockers except for design/implementation work.


#9

The problem with RUSTFLAGS is that it’s just a single string, not an array of arguments. An example where this leads to problems is https://github.com/rust-lang/cargo/issues/6139.


#10

@phil_opp I’m not that familiar with Windows, but on Linux that can just be solved by escaping the spaces. Either way, I recognize that it’s not ideal, but it is minimal and doable. Better interfaces can be constructed as they become needed, but my concern for now is just to be able to use stable rust.

@matklad Thanks for the info! I’m curious if you see a compromise between that thread and this one. I would really like a std-aware cargo too, but that seems so distant.

Also, I’m not convinced that libcore/std/etc will ever be fully normal libraries. They are inherently a bit special by virtue of always being included in the prelude. It seems like there will always be a need for a stable mechanism for specifying where these crates are. If not sysroot, what would you recommend?


#11

Unfortunately escaping doesn’t work, neither on Windows nor on Linux.

I understand that we should keep the changes minimal, but stabilizing something that doesn’t work if your username contains a space seems like a bad idea. I would prefer to also add a --sysroot argument to cargo or alternatively some --rustc-argument flags that can be used instead of RUSTFLAGS.
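
To illustrate the problem, here is a small sketch. It assumes the flags string is simply split on whitespace, which is my understanding of the behavior behind the linked issue; split_flags is a hypothetical stand-in and not cargo’s actual code, and the path is made up.

```rust
/// Hypothetical stand-in for a tool that receives all flags as one string
/// and splits them on whitespace.
fn split_flags(flags: &str) -> Vec<String> {
    flags.split_whitespace().map(String::from).collect()
}

fn main() {
    // Made-up sysroot path under a home directory that contains a space.
    let sysroot = "/home/Jane Doe/sysroot";

    // Single-string interface: the path is torn apart at the space.
    let from_string = split_flags(&format!("--sysroot {}", sysroot));
    println!("{:?}", from_string);

    // Backslash-escaping does not help, because the splitter knows nothing about escapes.
    let escaped = split_flags("--sysroot /home/Jane\\ Doe/sysroot");
    println!("{:?}", escaped);

    // Array interface: each argument stays intact, whatever characters it contains.
    let as_array = vec!["--sysroot".to_string(), sysroot.to_string()];
    println!("{:?}", as_array);
}
```

With an array of arguments there is nothing to escape in the first place, which is why I would prefer a dedicated flag over a single string.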


#12

Let me write up my thoughts about the issue. I am probably not the right person to talk to about sysroot though, I’ve never cross-compiled a rust crate!

Let’s define what a sysroot is, first. This boils down to how the compiler finds a foo crate for an extern crate foo declaration. There are two mechanisms for this:

  1. you can pass rustc --extern foo=/path/to/foo.rlib on the command line
  2. the compiler has a special directory where it tries to look up crates by name (not unlike C’s include/library paths)

Everything besides std uses the first mechanism. std and its dependencies use the second mechanism. Here’s a simple program which demonstrates these lookup rules:

λ cat main.rs
extern crate rand;
fn main() {}

~/tmp
λ rustc main.rs
error[E0658]: use of unstable library feature 'rustc_private': this crate is being loaded from the sysroot, an unstable location; did you mean to load this crate from crates.io via `Cargo.toml` instead? (see issue #27812)
 --> main.rs:1:1
  |
1 | extern crate rand;
  | ^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0658`.

What happens here is that rand is used in a sysroot, as a private implementation detail of libstd. However, because it is in sysroot, this impl detail is observable, and so the user gets a confusing error message about an unstable feature, instead of the usual “can’t find crate”.

It seems to me that this difference between std and other libraries is not essential. Std is definitely special because of the prelude and lang items. However, it doesn’t need to be special in how rustc discovers std crates: they also can be passed as arguments. Of course, we should assume that, unless overridden, --extern std=$(rustc --print sysroot)/lib/rustlib/... is passed by default. Otherwise, running rustc main.rs by hand would be a nightmare.
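
As a toy model of that rule (explicit arguments win, the sysroot is only a default), consider the sketch below. The function resolve and its parameters are hypothetical, the file name lib<name>.rlib is a simplification, and none of this reflects rustc’s real crate loader.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Toy model: resolve a crate name to an rlib path.
/// `externs` stands for explicit `--extern name=path` arguments,
/// `sysroot_crates` for the crates that would get an implicit default.
fn resolve(
    name: &str,
    externs: &HashMap<String, PathBuf>,
    sysroot_lib: &Path,
    sysroot_crates: &[&str],
) -> Option<PathBuf> {
    // 1. An explicit argument always wins.
    if let Some(path) = externs.get(name) {
        return Some(path.clone());
    }
    // 2. Otherwise fall back to the default location, but only for sysroot crates.
    if sysroot_crates.contains(&name) {
        return Some(sysroot_lib.join(format!("lib{}.rlib", name)));
    }
    // 3. Anything else is the usual "can't find crate" case.
    None
}

fn main() {
    let sysroot_lib = Path::new("/made/up/sysroot/lib");
    let sysroot_crates = ["core", "alloc", "std"];

    let mut externs = HashMap::new();
    externs.insert("core".to_string(), PathBuf::from("/my/custom/libcore.rlib"));

    for name in ["core", "std", "rand"].iter() {
        println!("{} -> {:?}", name, resolve(name, &externs, sysroot_lib, &sysroot_crates));
    }
}
```

In this model a private dependency of std such as rand is simply not found unless it is passed explicitly, which is the behavior one would expect.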

Another accidental difference between std and usual crates is that std is always prebuilt (the Rust distribution includes rlibs), while crates.io dependencies are built from source. As xargo demonstrates, it is possible to build std locally. However, it is an interesting question stability-wise.

Currently we guarantee that for each tier-one platform we will supply a prebuilt standard library with each compiler. If we stabilize the ability to build your own sysroot, we, by default, guarantee that it is always possible to build the standard library locally. I think this is an unreasonable guarantee: for example, if stdlib depends on some native code, you’ll need to have a C compiler to build it. Even if we remove all C from stdlib (like jemalloc), we can’t guarantee that we won’t need C in the future. So, we need to be very careful with specifying what “you can build sysroot from sources” means.

Also, while writing it down, I’ve got the idea of an incremental step to make sysroot situation less special. Will write it in the next comment.


#13

Hmm… that’s deeply unfortunate. Is there a fundamental reason why that can’t be fixed? It seems more like a bug in rustc…

Also, on a slightly unrelated note, why do rust and cargo take so many environment variables? I would much rather use flags.


Ah, this is an interesting point.

I don’t think this is accidental. Not having to rebuild large crates over and over seems essential. I would be pretty annoyed if I had to keep rebuilding libstd. In fact, I have often wanted the ability to reuse built crates from builds of different projects.

More generally, from the point of view of rustc, everything is either (1) the current crate being compiled from source or (2) a dependency in the form of an rlib, right? The difference you pointed out only matters to cargo.

I think there is a subtlety here: I’m not proposing that we guarantee you can build a sysroot. I am proposing that we guarantee you can specify a sysroot that is already built. It also happens to be true that you can build a sysroot (and it’s likely to stay that way on any platform that rustc developers work on for obvious reasons), which is good enough for me, but it also allows flexibility to change the way things are done in the future.


All of this seems to point to another reasonable (maybe even preferable) alternative, as @matklad pointed out: we just make it possible to pass --extern core=/path/to/core.rlib and have the compiler actually respect it, while defaulting to wherever the unstable location is. This seems to have a few benefits:

  • It can be implemented purely as a change to rustc, not cargo.
  • It does not require stabilizing sysroots or any currently unstable locations, which by extension, means we don’t have to deal with stability guarantees.

However, it would be hard to use unless the behavior @phil_opp pointed out is fixed.


One other thing I thought of: even if we can specify --extern core=/path/to/core.rlib, is it even possible to build core on stable? Does anyone know what work needs to be done on that front?


#14

So, my grand idea was to use a plain old Cargo workspace to build a sysroot, but then to just copy the artifacts to their usual location. That way, we dodge the implicit dependencies issues, while preserving the current sysroot interface (no dependencies, yay!)

However, we already use Cargo to build stdlib, and we already specify proper dependencies in Cargo.toml: https://github.com/rust-lang/rust/blob/b1ca3907e00211b2f645133af3574ca22e4f4f4d/src/libstd/Cargo.toml#L15-L25. We still need a multi-stage build for crates like libtest, which are part of the sysroot themselves, but need to link with std. So, there’s comparatively little to clean up:

  • allow adding std = {path = "..."} as an unstable feature for Cargo, solely for the purpose of building the sysroot itself (it will be a stepping-stone for treating std just as a library, but we can punt on the design at the moment).
  • move all sysroot crates (libcore, libstd, libtest, etc) to a separate workspace/folder in the rust-lang/rust repository (I think this has a nice side-effect: new contributors can clearly see which parts are compiler, and which parts are runtime)
  • instead of handling sysroot staging in the compiler’s rustbuild system, handle it via Cargo.

#15

It seems so:

~/projects/rust/src/libcore tags/1.30.0*
λ git log -n 1 --oneline
da5f414c2c (HEAD, tag: 1.30.0, upstream/stable) Auto merge of #55315 - pietroalbini:release-1.30.0, r=Mark-Simulacrum

~/projects/rust/src/libcore tags/1.30.0*
λ rustc --version
rustc 1.30.0 (da5f414c2 2018-10-24)

~/projects/rust/src/libcore tags/1.30.0*
λ RUSTC_BOOTSTRAP=1 cargo build
warning: the cargo feature `edition` is now stable and is no longer necessary to be listed in the manifest
warning: the cargo feature `edition` is now stable and is no longer necessary to be listed in the manifest
   Compiling core v0.0.0 (/home/matklad/projects/rust/src/libcore)
    Building [                                                           ] 0/1: core                
    Finished dev [unoptimized] target(s) in 23.25s  

#16

That’s just a hack to support bootstrapping rust using a beta (or using stable for the stable and beta channels).


#17

So if I understand you correctly, to use a custom target triple, one would just use that triple, and cargo will do all of the building/moving of artifacts in the background?

Also, the std = {path = "..."} mechanism would only be used for the compiler itself, right? i.e. I wouldn’t need to use it when cross-compiling?


#18

Not sure: I think it should remove the user-specified staging from this proposal and simplify building the sysroot to just cargo build. You’ll still need an unstable compiler to build and use a sysroot (you can build a sysroot using just a stable compiler, but that ability itself is not stable).