Perfecting Rust Packaging

Hey, Rustland.

You want to have a perfect Rust experience on your platform, and so do I.

But Rust needs your help to understand how to package Rust well and consistently across a variety of package managers.

This is a complicated topic. I’m going to describe some simple objectives, ask a bunch of questions, then try to organize the resulting information into a plan with discrete steps that we can take to harmonize the Rust packaging story.

I’ve already pinged the following individuals that I know are involved in packaging Rust to alert them to this thread.

  • Debian: Angus Lee, Luca Bruno, Sylvestre Ledru
  • Fedora: Fabian Deutsch, Radek Vokal
  • Ubuntu: Hans Hoel
  • Chocolatey: Mike Chaliy
  • Gentoo: Jauhien Piatlicki
  • SUSE: Kristoffer Gronlund
  • Arch: Alexander Rødseth
  • Windows: Vadim Chuganov
  • Homebrew: Nobody (Who will represent Macs?!)

Objectives

I’ll refer to the Rust project and its artifacts as ‘upstream’ and everything else as ‘downstream’.

Here are some of the goals I think Rust has:

  • Getting Rust and Cargo into the main package archives of a major Linux distribution.
  • Packages for Cargo always available where Rust packages are.
  • Downstream packages organized similarly to upstream packages.
    • e.g. we currently have distinct rustc, cargo, rust-docs packages, though they are almost always installed as a combined rust package.
    • In Debian I might expect the ‘rust’ package therefore to pull in the same sub packages.

That’s really the important stuff. We want Rust to be available, and with consistent quality, where people expect. Let’s clear the way.

Some of the topics I’m imagining worth working out are bootstrapping, package organization, trademark matters, maintenance responsibilities, interactions with cargo’s own package management, multirust, side-by-side installs, cross-compilation, std packaging, cargo versioning.

I’m sure you have others.

I’d like to figure out the basic requirements here, draft some Rust packaging guidelines and a plan to fully implement them both upstream and in one major downstream Linux distro.

Discussion Topics

  • What is the division of packages that make up the Rust system? I would like this to be consistent(ish) across distributions. Upstream Rust, for example, currently has ‘rustc’, ‘cargo’, and ‘rust-docs’, and will have ‘rust-std’ (potentially many for arbitrary targets) soon. For distribution we package all these into a single ‘rust’ package so most people never see the subpackages. Debian already has discrete rust-lldb and rust-gdb packages so they can avoid making rust itself dependent on gdb/lldb. Should upstream make this distinction as well? What other considerations does downstream have when deciding the division of packages?

  • What are the unique documentation requirements downstream? The upstream doc package puts the license in /usr/local/share/doc/rust and the HTML docs in the ‘html’ subfolder thereof. It seems likely to me that some distros want to handle their docs in a consistent, distro-specific way.

  • How should multirust be packaged? Toolchain switchers for Ruby are packaged, but multirust in particular is specifically designed to install from the upstream binaries. What is the role of such a tool in a distribution? Is, e.g. RVM modified to work with the distribution package system?

  • Similarly, what are the requirements for side-by-side (SxS) installation in distros? It’s common to see many packages for gcc, with various names, often installing binaries with names like gcc-4.9 to distinguish from the ‘default’ gcc. Does Rust need to be able to produce rust-1.1 and rust-1.2 compilers that can be installed next to each other without interfering with the others’ operation? We’ve put no thought into this upstream.

  • How do we need to improve Cargo releases? Right now new Cargo releases are tagged every release cycle, but crucially we don’t actually pair those Cargo commits with the Rust releases, so the Cargo that comes with Rust will have a version number different from the previous Rust release, but it is not the same Cargo commit that was tagged as a release. We also don’t distribute source tarballs for Cargo releases. I assume packagers would be much happier if we bumped Cargo versions in lock-step with Rust and made sure the correct builds of Cargo get released with Rust.

  • Bootstrapping Rust. Bootstrapping Rust requires Rust. This is not uncommon for compilers, but being a new compiler, getting every distro that wants to provide Rust doing their own self-sufficient bootstrapping is hard. It doesn’t help that upstream bootstraps off of arbitrary snapshots, not stable releases. Debian currently bundles a stage0 blob with the source. This is apparently acceptable in a broad sense, but they do not like it. Don’t know what other distros are doing here.

    It seems to me like we might be at a stage where we could insist on bootstrapping from the previous release (and only the previous release), which is only 6 weeks in the past. I used to be opposed to this as too restrictive, but this is a major downstream pain point and we can make some concessions here. My concern with this strategy though is that distros would still need to keep up with our upgrades or risk missing the next snapshot and falling off the bootstrapping wagon, and many distros are not in the habit of updating their software every 6 weeks.

    A slight alternative to this might be to provide a little script that bootstraps across multiple snapshots, a chain of snapshots, and we could guarantee that there’s an unbroken chain of bootstraps from, say version N-4 (6 months). That way we could continue doing snapshots at arbitrary points in time, as long as we ensured that there is a snapshot that corresponds to every stable release. This would give distros a grace period wherein they aren’t required to stick to the Rust upgrade schedule.

  • Bootstrapping Cargo. This often seems to be the hardest part of porting Rust to new systems. Both Debian and *BSD are using cargo-bootstrap.py to build Cargo without a pre-existing Cargo. Is this the right solution? Should we officially support building Cargo without Cargo to make this easier?

  • Packaging Cargo-built Rust crates. Cargo is a package manager. How do we make it interoperate with the native package manager? I know this is a problem that most modern language ecosystems face, and solve to greater and lesser degrees. Which systems are doing this right and why? I know almost nothing of the problems here.

  • The standard library is going to soon be packaged independently from the compiler, in order to allow install of arbitrary cross-compilation targets. After this we’ll likely have a package called ‘rust-std-{version}-x86_64-unknown-linux-gnu’. An important change after this is that a single rust installation may be installing packages with more than one target triple (i.e. an install will have ‘rustc-{version}-x86_64-unknown-linux-gnu’, ‘cargo-{version}-x86_64-unknown-linux-gnu’, etc. but also ‘rust-std-i686-unknown-linux-gnu’). I’m not sure how such a scheme maps to downstream package managers. Would, for example, Debian package a cross-std as something like ‘rust-std-i686’?

  • LLVM packaging. It’s long been a concern that Rust uses its own fork of LLVM. Recently though Rust’s fork diverges little from upstream LLVM, only doing a few optimizations that upstream doesn’t want. Rust is, in theory, compatible with stock LLVM 3.7 (and even earlier), though upstream never tests this. Which distros care about being able to use the system LLVM? A relatively low-maintenance thing upstream could do to make this easier might be to set up automation to guarantee that Rust builds with some stable release of LLVM.

  • Default CPU features. @gus pointed out recently that rustc’s default code generation is for i686 while Debian’s 32-bit x86 distro targets i586-class machines. Should we be adjusting the target specs for these systems? How? Any insight into how other compilers deal with this?

  • Distro-specific linker flags. @gus (Debian) also pointed out that there may be need for distros to always pass custom flags to the linker. I don’t have any further information about this. What’s the use case?

  • Upstream we use the custom rust-installer as our packaging system. Part of the intent here, as seen in rust-packaging, is that this format is simple enough to be used by downstreams to derive their own packages. I’m interested in whether packagers are using the rust-installer artifacts as intermediates in their own packaging, and if not, how they are selecting the correct set of artifacts to package. If deriving downstream packages from our own packages isn’t a viable solution, what else might we do to ensure that the contents of packages are consistent across distributions?

  • What sort of packaging guidelines can the Rust project provide, and will downstream care? I’m imagining specifying the package division, package names, the preferred scheme for packaging cross-std, bootstrapping strategies, branding and trademark guidelines.

  • How can we help downstream stay up to date? Is there anything we can do in rust-packaging, like maintain and test packaging rules?

  • Many proprietary projects (think Chrome) produce ‘universal’ debs/rpms, where they are compiled (like our upstream bins) to be maximally compatible, at the expense of distro packaging guidelines, and distributed by upstream. This seems like a good idea for us to do, particularly before Debian and Fedora have gotten their packaging stable and they trickle down to their own downstreams. Is there any desire for upstream Rust to produce such packages?

(I suspect this discussion is going to stretch the forum tool)

Just to be clear here, I suggest the "rustc" package should also include the rustdoc executable.

Debian breaks out the rust-std libraries into a "runtime" package (runtime dylibs only) and a "-dev" package (link-time dylibs and rlibs). This is done mostly for cross-compilation technicalities and should be invisible to the user (who just apt-get installs rustc). Specifically, when cross-compiling, you need the -dev package for the host/target arch, and the runtime package for the build arch (because rustc itself needs it).

Note that we replace the link-time dylibs with symlinks to the run-time dylibs in the packaging. They aren't byte-for-byte identical because they come from different stages of the rust build process, but they're meant to be equivalent. This saves 70-ish MB on disk, from memory. I think it would be reasonable to make the upstream rustc install process always do this too.

Also note that distros typically include "source" packages, and we should probably be consistent here too (where possible). I suggest in particular using "rustc" and "cargo" as the source package names (and not "rust" and "cargo", since upstream seems to have also moved to "rustc" for the rustc source archive).

Yes, we move the files all around to fit Debian expectations. Realistically I don't think upstream can predict where that is going to be (particularly across non-Linux platforms) and you may as well just pick a standard GNU-ish layout (like you currently do).

Note Debian also has a policy of removing "web bugs" from HTML documentation for privacy reasons. Eg: we post-process the Rust docs to point to local copies of the Rust logo, etc. Again, just FYI and I don't expect upstream to do any more to accommodate this use case.

This is highly relevant to an active Debian bug currently under discussion. Should distros package a "nightly" rustc (or some version built to allow access to "unstable" features)? I definitely think this needs some careful discussion and we should have a common policy on how/where/if we expose nightly packages.

This seems attractive at first glance, but consider that cargo, etc need to be taught to use an appropriate versioned executable before it is useful - so I suggest we don't over-engineer this without a clear need/expectation that this will be useful.

It's straightforward enough to do one or the other approach - and an earlier version of the Debian packaging had versioned executables like this. I removed it during a big package re-org because I was trying to simplify things and didn't have a clear use-case in front of me. Happy to follow the community's lead here.

Yes, all these things! Cargo releases should be clearly tagged and build with the most recent stable rust release. I don't care whether the pre-built binaries that ship in the combined upstream binary blob are built from each other, since I won't be trying to rebuild exactly them. I would prefer formal source tarball releases to just git tags (but can live with git tags).

Embedding the upstream stage0 works, and we have argued successfully to have this allowed into the Debian archive. Our current approach can't scale to all the Debian architectures, however, and is the primary reason we only support amd64+i386 architectures currently.

What I'd like to do is ship the stage0 blob for just one architecture, and use that to cross-compile all the other architectures. Complications are that you can't "just" rebuild any rustc release with itself (eg: for bootstrapping other archs by cross-compiling), although I believe this is possible by jumping into the build after stage1 (just haven't tried yet). I'd also really like to avoid re-doing this for every subsequent rustc release, which requires building subsequent rustc releases with something already in the Debian archive (ie: the immediately prior stable rustc).

I haven't thought this through much, but a hypothetical alternative to cargo-bootstrap.py might be a cargo subcommand that dumps out a list of source URLs and a shell script of rustc commands that can be run somewhere else without cargo.
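
To make the "dump a script of rustc commands" idea a bit more concrete, here is a rough Python sketch. It is only an illustration: the dependency graph format, the vendor/ source layout, the out/ directory and the function name are all assumptions, and no such cargo subcommand exists. It orders crates so that dependencies are built first and emits one plain rustc invocation per crate.

```python
# Hypothetical sketch: the graph format, the vendor/ layout and the out/
# directory are assumptions for illustration, not anything Cargo emits.
from graphlib import TopologicalSorter


def rustc_commands(deps, root, src_root="vendor"):
    """Return plain rustc invocations in an order that respects `deps`.

    `deps` maps a crate name to the set of crate names it depends on;
    `root` is the one crate that is built as an executable.
    """
    commands = []
    # static_order() yields every crate after all of its dependencies.
    for crate in TopologicalSorter(deps).static_order():
        externs = [
            f"--extern {dep}=out/lib{dep}.rlib"
            for dep in sorted(deps.get(crate, ()))
        ]
        if crate == root:
            kind, entry = "bin", "main.rs"
        else:
            kind, entry = "rlib", "lib.rs"
        commands.append(" ".join([
            "rustc",
            f"--crate-name {crate}",
            f"--crate-type {kind}",
            "-L dependency=out",
            "--out-dir out",
            *externs,
            f"{src_root}/{crate}/src/{entry}",
        ]))
    return commands


if __name__ == "__main__":
    graph = {"app": {"util", "log"}, "util": {"log"}, "log": set()}
    print("\n".join(rustc_commands(graph, root="app")))
```

The printed lines could be saved as a shell script and run on a machine that has rustc and the unpacked sources but no cargo. A real tool would also have to handle build scripts, features and crate names containing hyphens, which this sketch ignores.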

I don't think we need to put much work into inter-operating with the native package manager (or phrased differently, if we try that we're going to get it wrong and it will end up awkward and a second-class-citizen anyway).

From a distro point-of-view, to ship a Rust-built executable (let's call it "mozilla") we want:

  • Has to build from things (rustc, cargo) already in the archive.
  • Want to have the dependencies exposed in the regular packaging metadata in some way.

Both of these are for potential security releases. If there's a security fix required to (eg) libpng-rust, then we want to be able to easily find all packages that include libpng-rust and recompile them.
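
As a sketch of why exposing dependencies in the packaging metadata matters, the Python below walks such metadata backwards to find everything that needs a rebuild after a fix. The `built_from` mapping and the package names other than libpng-rust and mozilla are made up for illustration; a real distro would query its own archive metadata instead.

```python
from collections import deque


def packages_to_rebuild(built_from, fixed):
    """Return every package that directly or indirectly includes `fixed`.

    `built_from` maps a package name to the set of packages whose code is
    compiled into it (an assumed, simplified form of archive metadata).
    """
    # Invert the mapping: for each dependency, who includes it?
    users = {}
    for package, deps in built_from.items():
        for dep in deps:
            users.setdefault(dep, set()).add(package)

    # Breadth-first walk from the fixed package up through its users.
    affected, queue = set(), deque([fixed])
    while queue:
        current = queue.popleft()
        for user in users.get(current, ()):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return sorted(affected)


if __name__ == "__main__":
    archive = {
        "mozilla": {"image-rust"},
        "image-rust": {"libpng-rust"},
        "other-tool": {"regex-rust"},
    }
    print(packages_to_rebuild(archive, "libpng-rust"))
```

Because Rust crates are linked statically, the walk has to follow indirect users too: here mozilla needs a rebuild even though it only names image-rust.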

I think that provided there's a way to tell cargo "build this source using these other sources" and have all that provided locally on disk somehow, then I think we're good. It doesn't really matter how easy it is to put those sources there - provided there is some automated way to do it.

I think it would be ok to keep the package-managed and user-cargo-managed software separate.

Huh. Does this mean the standard library will build without using unstable compiler features?

The Debian packages are already split out like this (for this exact reason). The packages are called (effectively) libstd-rust-7d23ff90:amd64 and similar. I don't think we need any further help from upstream for this, which is why I'm a bit surprised/unsure about what changes you're suggesting.

We don't currently have any non-Debian target architectures packaged, but I plan to introduce an architecture independent (in the Debian sense) libstd-rust-xxx-$triple for any standard Rust architectures where we also have a cross-compiler packaged. At the moment I think this is only win32/mingw, but I expect to see an Android cross compiler soon too.

In particular, I was surprised to see Rust recently jump to an unreleased version of LLVM. I thought those days were behind us - and the commit in question didn't even seem to have a discussion of why it was necessary and what breakage we should look out for (I've since learned that it was for the native-win32 target, which is probably why we didn't notice any downside).

Debian has built the last few releases against the system LLVM. rustc-1.2 required a few test patches to compile against LLVM 3.6, and we've since moved to LLVM 3.7 now that it has been released. The Rust test coverage is pretty good and so far we've relied on that to detect LLVM breakage - I'd hope that this continues to be a sufficiently thorough test.

Testing against an LLVM release would be good - but only if we also act on it when it breaks. If we're doing that then I'd question the value in vendoring LLVM at all...

In general, I think we're going to continue to define new target triples for various additional platforms and variants. Some sort of wiki/doc for tracking them would be good, but perhaps just using comments in the existing mk/cfg/* files is the right place.

In a sense, the upstream library author "controls" the API but the cargo user (or distro builder) is the one who should define the ABI. Basically all the codegen options (rustc -C help) might need to be tweaked by the person building the software to fit local conditions/policies/architectures.

A concrete example is the Debian hardening options. Most of these are designed for C and don't really apply to Rust, but some do - and we currently patch the rustc build process to allow us to pass -Wl,-z,relro to the link step, for example.

In particular, cargo currently makes tweaking compiler options hard, in an (overly strong imo) attempt to ensure reproducible builds.

rust-installer is just an implementation detail of "make install", as far as I'm concerned. In particular, I move some files around to different places, and don't include the "metadata" bits that rust-installer creates.

Debian has standard tools to notice new upstream releases, so continuing to do exactly what you're doing is good. Improving the bootstrap story directly improves this, for non-amd64 architectures.

It would be slightly easier if Cargo had regular upstream tarball releases published as github releases (or on any other webpage) since the tools for polling git repos are not quite as mature/featureful.

I hold a biased position here so this is awkward to answer :wink: From my point of view, I'd hate that because then I have to also work around whatever you do in your package, and I'd rather you just contributed whatever efforts to Debian proper instead. I don't think I can fairly weigh the benefit of such a thing to an end user.

It would seem like the simplest thing to do would be to have cargo additionally look in, and prefer, /usr/lib/$ARCH/cargo/ or moral equivalent for its registry. I've not dived into the source, but looking around ~/.cargo this would seem to not require a huge rework.

This is good for the Debian builders, which will just fail if everything isn't available, and is convenient for developers who can use the pre-built Debian packages if they match their requirements or seamlessly pull and build if they don't.
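
The lookup order described here can be sketched in a few lines of Python. This is not how Cargo is implemented: the `name-version` directory layout, the `offline` switch and the function name are assumptions chosen only to show the "prefer the system location, otherwise fall back" behaviour.

```python
import os
import posixpath


def resolve_crate(name, version, system_dirs, offline, is_dir=os.path.isdir):
    """Decide where a crate's source should come from.

    System directories are tried first, in order. Only when none of them
    holds the crate, and the build is not offline, is a download chosen.
    `is_dir` is injectable so the logic can be exercised without a real
    filesystem.
    """
    wanted = f"{name}-{version}"
    for base in system_dirs:
        candidate = posixpath.join(base, wanted)
        if is_dir(candidate):
            return ("system", candidate)
    if offline:
        # A distro builder without network access should fail loudly.
        raise LookupError(f"{wanted} is not packaged and downloads are disabled")
    return ("download", wanted)
```

A distro build would call this with `offline=True` and fail if anything is missing, while a developer's build would pass `offline=False` and quietly pull whatever the distro does not ship.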

Re: cross compiler. Debian recently (as in this year) started to provide GCC cross compiler packages, so there is a good precedent now, e.g. gcc-arm-linux-gnueabihf.

This, along with all the usual headaches induced by having a build environment that doesn't have internet access, has been the main pain point for me when packaging cargo for openSUSE. It'd be great if cargo itself only depended on rustc, or possibly if there was a minimal bootstrap cargo build without the cargo dependency, which could then be used to build a proper cargo binary.

I haven't even started thinking about packaging crates yet. However, as a start, there's the document which describes how rubygems are packaged in openSUSE today here: openSUSE:Packaging Ruby - openSUSE Wiki

Using the gem2rpm tool, making a .rpm for a rubygem in openSUSE is pretty much trivial. I would hope that it would be possible to create something similar for cargo.

This is not a major concern for me, at least.

I know that FreeBSD currently does build against their system LLVM and ran into issues with --release builds on that platform. I pinged @koobs on Twitter.

I'd prefer Rust to do that by default. Offline documentation should be offline. The current state of the documentation is that it is problematic without internet, gets worse with a flaky internet connection like trains ("let's try to load the font for 2 minutes") and is only acceptable on a standard-grade local internet connection.

It would be nice if the packages for the host-targeted rustc/cargo/librust-std would not have target triple suffixes in their names, or these names would be given to umbrella packages that pull in their correct counterpart for the host platform.

I don't care much about Homebrew, but I pinged the macports maintainers.

Please count me among the Fedora interested parties. For reference, here’s the packaging bug, but there hasn’t been much activity lately.

I’ll try to give a fuller reply later, but one thing I don’t see mentioned at all here is ABI. Fedora usually avoids static linking in packages, but I don’t see how we can ship any dylibs if there’s no way to update them. Even just std is questionable, it seems.

I think Fedora would even be fine rolling along with 6-week rust updates, IF we could do it without rebuilding every other rust-built package each time, not to mention whatever the users built themselves.

At a minimum, I think we need some clear guidelines of what is and isn’t expected to be compatible in ABI.

@brson: Would be nice to ping the FreeBSD port maintainer too (you can find him here).

It would definitely help a lot if there would exist some official "cargo-bootstrap.py" which could be used to build cargo without the need of a snapshot. Or maybe the better way of doing this would be to teach cargo to generate some kind of script that contains all actions necessary to compile a package (including all its dependencies).

rustc statically links in LLVM libraries. When LLVM is installed system-wide, this is maybe not the right choice. In case of FreeBSD, beginning with LLVM 3.7, they stopped building static libraries. So it's no longer possible to use the system-wide LLVM anymore and one has to use the LLVM shipped with Rust, which costs a lot of build time (~1 hour for LLVM, 30 min for rustc). Optional dynamic linking to the rescue?

Global caveat, I don't pretend to represent all of Fedora. I'm just answering to my best understanding, and it's a bit theoretical since real Fedora packages don't exist yet. This also went very long, so I hope I didn't cross my wires anywhere...

I don't think it's so important to align upstream and downstream packaging. Certainly each distro will have different policies about the nitty gritty subpackaging.

But as a general guideline, I do agree it's a good idea to suggest a meta package to pull in the basics. Sort of like the haskell-platform package, which is nearly empty but depends on all the other packages to complete the platform.

Currently Fedora uses /usr/share/doc/$NAME, but it used to have $NAME-$VERSION-$RELEASE. The best practice is to use the %doc macro right in the build directory, without otherwise installing them at all. There's a similar %license macro too. An html subfolder for docs is fine too.

I'm not sure if Fedora really deals in such things. There's the "alternatives" system, but I think that's usually for completely different implementations, like java runtimes or the mta. My guess is if multirust were to be packaged at all, it would only be useful to manage user-installed rust versions, not the distro package.

Ideally, if stability and eventually ABI is done right, this won't be necessary. Fedora's only alternate gcc is compat-gcc-34, so we're reaching way back here. Those executables are installed with simple suffixes, gcc34, g++34, et al, and the rest of gcc's runtime is already taken care of, isolated like /usr/lib/gcc/$target/$version/.

I would like Cargo to have its own proper release and source tarball, yes. Especially since the versions don't match, I think it should be built as its own srpm distinct from rust itself. I don't really care if it's released in lockstep, but I suspect it will be anyway since they're practically developed together. Just need to know which rust-x.y is required for that cargo release, even if that's always the latest.

Bootstrapping from the previous release would be fantastic. Then stage0 will only be needed once, and we can self-build from then on. That exactly matches Fedora's exceptions to the ban on prebuilt binaries.

I guess for rebuilding the current version, say for a minor bug or security fix, there's a way to skip the initial stage? I haven't tried in a while. Point is just that this is a case where the existing rustc is already the same version as what we're building, so it might need to skip over some assumptions about what the "snapshot" contains.

As for the 6-week release train, I fully expect Fedora Rawhide to keep up with this. If the package maintainers slack off, then it will be justified penance to make them step through 6-week increments, no problem. I don't see the value of an upstream build script doing this, because then you'd need all those intermediate sources in the srpm.

Then in released Fedora, either we'll have answered the questions like ABI sufficiently that they should keep up every 6 weeks too, or else they'll be locked into some specific version and never need to worry about bootstrapping again. The next Fedora release will just have inherited from Rawhide which will be keeping up.

A blessed Cargo bootstrapping script would also be great. If not, then it'd be nice to follow the same suggested model as Rust, building with the prior Cargo release, with consideration for rebuilding on the current release too.

I personally don't like rpm-generators and the like, if that's what you mean. It just needs to have the right knobs to control installation paths, bindir, libdir, etc., since distros have different ideas about this. A DESTDIR control on the install command is also needed, e.g. even though I have prefix=/usr, I'll want to install to $DESTDIR/usr/... during the package build.
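
For anyone unfamiliar with the DESTDIR convention, this small Python sketch shows the path arithmetic involved. The function name and example paths are illustrative assumptions; the point is only that the staging directory is prepended to the final path, while the prefix baked into the software stays unchanged.

```python
from pathlib import PurePosixPath


def staged_path(destdir, prefix, relative):
    """Return where a file is written during a packaging build.

    `prefix` (which must be absolute) is where the file will live on the
    user's system; `destdir` is the temporary staging root that the
    package is later assembled from.
    """
    final = PurePosixPath(prefix) / relative
    # Drop the leading "/" so that joining keeps the staging root.
    return PurePosixPath(destdir) / final.relative_to("/")


if __name__ == "__main__":
    print(staged_path("/tmp/buildroot", "/usr", "bin/rustc"))
```

With prefix=/usr and a staging root of /tmp/buildroot, the compiler binary lands in /tmp/buildroot/usr/bin/rustc during the build, and the package manager later installs it as /usr/bin/rustc.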

Are you talking about source packages, or just the binary packages you distribute? I suspect the latter. The source of the standard library seems inseparable from the rest of the compiler, and should be part of the distribution - much like how libstdc++ is always part of gcc. (at least these days; not sure if that's historically always been true.)

For just binaries, meh. I think distros will vary, and upstream doesn't need to try to align how packages are named. It's good enough to just specify how rustlib/$triple should be found. For Fedora, I expect we'll just build everything natively, with those libs in their own subpackage, and then an x86_64 host can just install that rustc-libs.i686 compat package like all other compat libraries are handled.

Well, if it's so close already, why not take the goal of switching to stock LLVM altogether? Those few unwanted optimizations -- can you provide references to these discussions? Maybe they might be accepted with some rework? (but I don't know what these are...)

Anyway, yes, Fedora is one that would prefer not to use bundled LLVM. It's possible to get exceptions, but better if we don't need to. And this really ought to be dynamically linked too.

We usually deal with distro CFLAGS or equivalent in an rpm macro like %optflags. For example, Fedora's 32-bit x86 target currently gets "optflags: %{__global_cflags} -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables". I imagine we'll add something similar for rust packages, and in their spec they can just run something like cargo build %{rustflags}, or maybe even just %{cargo_build}.

Right now Fedora's %__global_ldflags just has hardening options.

I haven't looked at those before, but at a glance, no I don't think Fedora needs that. That is, we need to run something like cargo install or whatever at rpmbuild time, but in the user's hands install/uninstall task should be fully handled by rpm itself. Maybe it could be useful to produce a manifest to aid the rpm %files section. But since things are usually installed in a clean DESTDIR, you can usually just wildcard entire paths.

As for consistency across distributions -- what are you hoping for specifically? What problems are you trying to avoid? Distros all have their own policies, so I don't think you'll get a complete homogeneity here...

Package names, division, cross-std, are all going to vary on distro policy. For bootstrapping, I think we just need clear instructions how to do it, especially offline, whether from stage0, the prior release, or rebuilding the current release. Certainly branding and trademark guidelines need to be stated if you expect any control there.

I'm not sure. There's enough interest that a few people have uploaded binary packages on copr, and I'm sure those users would prefer an upstream repo. Even after it is in Fedora proper, your own repo still might be useful if Fedora ends up staying conservative on updates. Beta and nightly repos might be cool too. BUT - if you're messing with all that, you probably want multirust with user-installed rustc, so rpm repos are beside the point.

This is an additional tangle. In an ideal Fedora-packaging world, every rpm-packaged crate would be built and installed separately from all its dependencies, and all dynamically linked together as needed. Those dependencies should be found and used in library form, not from full source. (I guess there's no header equivalent needed.) But I fear that this world of Cargo.lock fixed dependencies and vague ABI promises is really incompatible with that ideal.

The hand-wavey planning in my head right now says that we should really only try to package rustc and cargo for now, and their docs, and leave everything else up to users building from crates.io. But that would mean nothing else of substance can be built with Rust into the distro, which isn't so useful...

EDIT: It may be of note that fedora-devel has a large thread on anti-bundling requirements right now. I like Adam Williamson's recent reply comparing how python and perl are good with distro packaging, and how PHP and JS are horrible -- I want Rust to be on the good side! But all of those are interpreted languages, so at least they don't have to worry about ABI. Lars Seipel mentions work happening in Go, which may be a better comparison, but I know little about that ecosystem.

Note that Cargo is not a package manager in the sense of RPM, dpkg, etc. By and large, it manages dependencies within a package build, and all files it modifies are expected to be in the home directory of the invoking user. Even the install command is designed to avoid stepping on the system package manager's toes. Cargo's default way to resolve dependencies is to pull their sources off crates.io or the source repositories as specified in Cargo.toml. AFAIK reachability of internet resources is not a guarantee in distros' package building environments and in any case should not be relied upon: normally all input sources are extracted from the source package or come from installed dependency packages.

In my opinion, it makes more sense to see Cargo as a tool invoked in downstream package's build, like, say, Ant for some Java projects (Maven might be a closer analogy, but I don't have actual experience with packaging Maven-based projects). There might be helper tools to convert metadata from Cargo build files to downstream package metadata, but this should mainly be a concern of the downstream's developers.

To satisfy the no-downloads restriction, there should be a way for Cargo build to override dependency resolution with globally installed packages. This also means committing to some format of installed library packages. Should they install .rlib artefacts plus some global equivalent of .cargo overrides? Or package the source tree?

Thanks for the thorough replies, @gus.

Yes, agreed, but mainly because it always has, and I don't see being able to divorce the two any time soon.

This sounds like it may be the same as a rustc/std package split. I would expect that - given our current file structure - a rustc package contains the stuff we drop into /usr/local/{bin,lib} today, and the std packages to contain the stuff in rustlib.

I agree in principle. I've been considering that the host std might live in the 'normal' place with the other libs (/usr/lib) and all cross-libs might live in rustlib.

Yes, agreed. Upstream wants to distribute source packages too and this would be our division.

Great. Noted.

Aha. The 'build with the most recent stable rust' requirement I hadn't considered. To be clear, the paired Cargo needs to be able to build with the rustc it's paired with, but not the previous stable rustc?

Do you take the Debian source for rustc from the source tarballs or from git?

Does the approach I outlined, where we guarantee a bootstrap chain from some previous stable release, work? You might have to run a script to rebuild lots of rustc's to catch up from whatever Debian's prior stable release was. How long is the window in which Debian might not upgrade Rust? Will Debian testing always stay up to date? Skip one release, two?

To be clear, you want Cargo to report the versions of native deps, so that e.g. Debian can override them with a different native dep for bugfixes?

Are these other sources just the native deps, or do you want to redirect cargo so that it gets Rust crates from another place? Like security fixes in native deps, would Debian identify security bugs in Rust crates, and then forcibly override the deps of that crate's downstreams?

Haha! No. We just want to decouple the distribution so we can distribute arbitrary numbers of stds and not force people to download them.

The changes I'm suggesting (and we're working on now) are for upstream's own purposes (described above) - we need to be able to distribute cross-std in discrete units.

Yes, definitely. We would gate on it and force ourselves to be compatible with LLVM stable.

Great. Thanks for the clarifications. So as I understand, for now at least, it's the C bits that you want to customize. It looks like incorporating something like your patch will fix that for Rust's own build system.

Nice link! Thanks.

Great. Thanks for the data point.

AFAIK this would just mean that we output copies of all the scripts, fonts, etc we use during the build, and don't rely on any CDNs for that stuff?

Can you expand on that? If I understand correctly you are suggesting that the complete std package be called 'rust-std-nightly.tar.gz' (as opposed to 'rust-std-nightly-x86_64-unknown-linux-gnu.tar.gz'). Note that all our existing packages contain the host triple in their name. What is the difficulty you are foreseeing?

Will do! You are on my contact list.

Yes, this is the case. Using dylibs for the most common thing dylibs are used for (providing ABI-compatible updates) is not viable in Rust right now. Static linking is strongly preferred by the entire system atm. Solving this is far-future territory :frowning:

Since Rust much prefers static linking, will Fedora use it, or will Fedora want to dylib crates anyway?

We may be able to make some quick fixes here on native dylibs. I suspect that a lot of crates are 'vendoring' their native libs and not providing an option to dynamically link them. Not sure. cc @alexcrichton.

Ouch... I don't think there will be any hope of this any time soon. Presently in Rust, when anything upstream changes, including the compiler, all downstream must be rebuilt.

Done. Thanks. I can't believe I completely excluded BSD. I've also pinged @dhuseby

Yes, I think this is or will be an option. There's a PR open now trying to improve support. I was not aware that 3.7 didn't support static linking. Wonder what we're doing in-tree...

There isn't currently, no. Somebody earlier in the thread mentioned a desire for this as well. I hadn't previously considered the case for point releases. Do you think that generally people will expect to build point releases with the previous point-release and not the previous major/minor release? Since we haven't done a point release (in years at least), if we don't plan for that it may bite us when the time comes.

Yes, just talking about the binaries.

I've only heard it mentioned informally that they aren't suitable for upstream LLVM. Perhaps @alexcrichton remembers the actual discussions with upstream.

I think neither cargo nor rustc will obey any such environment variables by default. The build of Rust itself is flexible enough to override this in the rpm definition, but subsequent uses of rustc don't have a way to obey e.g. %optflags. This would be a concern for packaging cargo crates and for simply using rustc on the system.

Somewhat of a follow up on my previous comment. It sounds like there is a difference between what rpm should do and what the compiler itself should do by default. Is it ok not to use the global ldflags if e.g. Bob is just running rustc hello.rs?

Forgetting to include files, putting files in strange places, keeping up with upstream changes in filesystem layout, removing maintenance burden from downstream.

Thanks for those links! I haven't reviewed yet, but I will.

Thanks for the amazing response folks.

The cargo version that we include in a Debian release needs to be buildable with the rustc also in that release. So if you want us to match the rustc/cargo pairing that you're using for your pre-built bundles, then yes we'll need it to be able to build with the paired stable rustc.

We take source tarballs (and then import them into our own git tree, but that's mostly a coordination choice on our side). Currently we also repack the source tarballs to remove src/llvm/* (unneeded, and big) and src/etc/snapshot.pyc (shouldn't be in the upstream tarball anyway).
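
For anyone wanting to reproduce the repack, a rough sketch in Python using the standard `tarfile` module could look like the following. This is not the actual Debian tooling; it assumes the tarball has a single top-level directory (e.g. `rustc-<version>/`), and the function names are made up for illustration.

```python
import tarfile

# Paths dropped during the repack, relative to the top-level directory.
EXCLUDED_PREFIXES = ("src/llvm/",)
EXCLUDED_FILES = ("src/etc/snapshot.pyc",)


def keep_member(member_name: str) -> bool:
    """Return True if a tarball member should survive the repack."""
    # Strip the assumed top-level directory before matching.
    _, _, relative = member_name.partition("/")
    if relative in EXCLUDED_FILES:
        return False
    return not any(relative.startswith(p) for p in EXCLUDED_PREFIXES)


def repack(src_path: str, dst_path: str) -> None:
    """Copy the tarball at src_path to dst_path, skipping excluded members."""
    with tarfile.open(src_path, "r:*") as src, tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            if not keep_member(member.name):
                continue
            # Only regular files carry data; other entries are header-only.
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
```

Note that this keeps the (now empty) `src/llvm` directory entry itself and only drops what is underneath it.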

Note that even when we kept the full src/llvm/ source, we still had to remove src/llvm/cmake/modules/LLVMParseArguments.cmake because CC-BY 2.5 isn't DFSG-free. We also have to find/include the appropriate jquery source for the minified version shipped in upstream git - again due to Debian's interpretation of "compiled" for javascript. It would be helpful if the latter was already included in upstream source, and perhaps you could work with LLVM to address the former? (but that's just me being lazy :wink: )

Yes, I think it would be fine to say that we need to hop through each stable rustc release - or re-bootstrap everything. We intend to upload every rustc stable release into Debian unstable, and have that trickle down into testing in the usual manner (~2 week delay); Debian stable is then a snapshot of whatever testing was at that point in time (so Debian oldstable->stable will almost certainly skip rustc releases).
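
To spell out what "hop through each stable rustc release" means in practice, here is a small Python sketch. It assumes each stable release can only be built with the one immediately before it, and `bootstrap_chain` is a hypothetical helper name, not an existing tool.

```python
def bootstrap_chain(stable_releases, have, want):
    """List the releases to build, in order, to get from `have` to `want`.

    `stable_releases` is every stable version in release order; each entry
    is assumed to be buildable only with the entry immediately before it.
    """
    start = stable_releases.index(have)
    end = stable_releases.index(want)
    if end < start:
        raise ValueError(f"{want} is older than {have}")
    # Everything after `have`, up to and including `want`.
    return stable_releases[start + 1 : end + 1]
```

So a Debian oldstable->stable upgrade that skipped releases would mean building every entry in that list in turn, which is the "catch up" script mentioned in the question.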

A Debian (entire distro) testing/stable release needs to be able to rebuild whatever rustc version is included in it using only other packages also in that release (so using the same rustc version), and each new rustc release that we upload into unstable needs to be buildable with the immediately prior rustc release already in unstable (to avoid re-bootstrapping). I think this means we'll need to improve the Debian build to detect version and jump straight into stage2 for the former case, since stage1 won't build with its own release (until upstream moves away from using #[cfg(stage0)] as a versioning mechanism) - this doesn't sound too bad, but I haven't actually done it yet so I don't know what traps await..

EDIT: once again, I've forgotten that jumping straight into stage2 isn't possible because unstable Rust features won't be available. Sigh, need to think of something else (perhaps package rustc twice with different --release-channel choices)

Reporting versions is the easy bit (and could be done manually, although tooling is always nice). By "want to expose dependencies" I meant "vendoring sources" is technically workable, but not a great solution.

In a hypothetical future Debian stable release that included rustc/cargo and (eg) rust-mozilla, the Debian security team might need to patch the (eg) rust-ssl library used by rust-mozilla. We need to be able to tell cargo to rebuild the same rust-mozilla source using the now-modified rust-ssl library (so can't pull from network). We also want it to be easy to find and repeat this for every package that uses rust-ssl (so "vendoring" a new copy of rust-ssl source in every application that uses it is bad).

I sketched out a possible solution elsewhere, that follows what golang currently does in Debian. In this proposal, Debian Rust "library packages" are just source archives that get dumped somewhere centrally on disk. We then need some way to tell cargo to build using the libraries in this central location rather than going out to crates.io or wherever it would normally go. I think we can generate whatever version metadata/index cargo wants, if such a thing is required.

With this proposal, the rust-mozilla Debian package would have a regular Debian build-dependency on the rust-ssl-dev Debian package. The security fix would be as simple and routine as "release a patched rust-ssl-dev Debian package, and rebuild/release all Debian packages that build-depend on it". rust-mozilla and any others would pick up the modified rust-ssl source from the central on-disk location and it would all just-work.
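
The "rebuild everything that build-depends on it" step is easy to mechanise. A minimal Python sketch, assuming the build-dependencies are available as a plain mapping (the mapping format and the `packages_to_rebuild` name are hypothetical), and assuming indirect build-dependencies should be followed too, which goes slightly beyond what I wrote above:

```python
def packages_to_rebuild(build_depends, patched):
    """Return every package that build-depends on `patched`, directly or indirectly.

    `build_depends` maps a package name to the set of packages it build-depends on.
    """
    to_rebuild = set()
    frontier = [patched]
    while frontier:
        current = frontier.pop()
        for package, deps in build_depends.items():
            if current in deps and package not in to_rebuild:
                to_rebuild.add(package)
                # A rebuilt package may itself be a build-dependency of others.
                frontier.append(package)
    return sorted(to_rebuild)
```

With source-only library packages, the indirect case matters because an executable may pick up rust-ssl only through some other library package.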

As a practical matter, we would only assemble and upload libraries into Debian that were actually required by Rust executables also in Debian - this isn't attempting to be a viable alternative to cargo and crates.io for regular upstream software development.

The latter. If we want to have an executable built using cargo in Debian (this includes cargo itself), we need to be able to tell cargo to not use the network. As explained in the security example above, we'd also like to be able to point cargo at a central location so we can share those crates (source, not proposing rlibs/dylibs) across multiple cargo-using executables.

FYI: Luca's excellent work on packaging cargo might give you some idea of the blunt tools required to get current cargo to not use the network.

Ah I see :slight_smile: It would be helpful to distinguish between "source" and "binary" packages, since I thought you were talking about splitting out the rustc and std source packaging (and I've had to resolve this ambiguity in some of the other discussion).

Well no, the link step isn't specific to C. To make that clearer, in my example above I want to pass -C link-args="-Wl,-z,relro" to rustc. My other examples were about enabling/disabling cpu features and ABI details, which obviously need to influence the code that the rustc compiler itself produces.
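
As a sketch of what a packaging helper has to do here, this Python function builds a rustc command line that forwards distro linker flags through `-C link-args`. The helper name `rustc_command` is hypothetical, and where the flags come from (dpkg-buildflags, %optflags, etc.) is left out on purpose.

```python
def rustc_command(source, link_args):
    """Build a rustc argv that forwards distro linker flags via -C link-args."""
    argv = ["rustc"]
    if link_args:
        # rustc takes the linker arguments as one space-separated string.
        argv += ["-C", "link-args=" + " ".join(link_args)]
    argv.append(source)
    return argv
```

Calling `rustc_command("hello.rs", ["-Wl,-z,relro"])` gives an argv equivalent to the command line in my example, ready to hand to something like `subprocess.run`.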

We can (and do) patch this for the rustc build system easily enough (since the rustc build system exposes these details quite openly), but this is harder for cargo and other Rust applications (built using cargo) since cargo intentionally doesn't offer this flexibility (not saying whether that's good or bad, just highlighting there's a clash in use cases here).

Yes. That would also address people's privacy concerns. We could also implement a special mode (--offline) that is used in that case.

Hi,

I didn’t read the whole thing yet (I plan on doing it later today), but as the rustc packager for Exherbo, the main critical blocker for us atm is https://github.com/rust-lang/rust/issues/16402 (which doesn’t seem to have been mentioned here yet)

Without this fixed, each time a user wants to reinstall/update rust, she has to uninstall the installed one first.