Perfecting Rust Packaging - The Plan

Just to update everyone here, I tried to put together a quick proof-of-concept of getting cargo to use local packages. It doesn’t work yet, but I’ve learnt a few things along the way.

In no particular order (or difficulty):

  • Even something “simple” like cargo requires a lot of crates. I think I’ve built 37 “library” packages in order to build cargo, and that’s with me cheating a bit: where dependency chains specified supposedly incompatible versions of the same crate, I collapsed them by hacking the Cargo.toml dependency and reusing one version for both.
  • Lots of over-specified version requirements across crates.io
  • Several crates (cargo itself, for example) declare blanket dependencies on winapi/kernel32 that aren’t actually required unless you’re building for a Windows target
  • Often the only indication of copyright is the license keyword in the Cargo.toml entry (no explicit LICENSE text, and no copyright comments in the readme/source)
  • Many upstream git repos don’t actually make releases (or declare git tags, etc) other than the snapshot that happens to get uploaded to crates.io. This effectively means we need to build distro packages from what’s in crates.io or else we have no hope of matching the semver dependencies used between packages.
  • A number of *-sys crates ship a full upstream C library source, which they only use as a fallback for when the local platform version couldn’t be found. I hadn’t actually noticed that before, and it makes packaging (only slightly) interesting because now we have a bunch of extra files to either strip out, or audit licenses.
  • The good news with distro packaging is that we can use the cross-language distro package dependencies to make sure we always have pkg-config, the required C library, headers, etc. installed, giving us a much simpler and more predictable result. In just about all cases, we hit that first pkg-config line in build.rs and the rest is skipped entirely.
  • Repackaging from crates.io rather than upstream github isn’t great because:
      • Can’t share work between upstream repos that contain multiple crates. Probably can’t do this anyway (easily), since the “sub crates” often declare wildly different crate versions so we wouldn’t be able to do a 1:1 crate version and distro package version.
      • A number of files are often missing from what gets uploaded to crates.io. In particular, documentation, examples, license files - things that aren’t actually cargo buildables.
  • Several “sub crates” assume the original source subdirectory layout, and derive their crate name from the directory. I needed to patch several of these to add explicit Cargo.toml name=... directives once I started shipping them in my own (differently named) $name-$version directories.
  • crates.io doesn’t make it easy to verify sources. There are checksums buried away in the metadata (retrieved via a git checkout), but ideally there’d be a simple http-accessible signature alongside the source download (like there is for rustc source itself).
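To make the checksum point concrete, here is a rough sketch of what verifying a downloaded .crate file against the registry metadata could look like. It assumes the index file for a crate is one JSON object per line with "vers" and "cksum" fields, and that "cksum" is the SHA-256 of the .crate tarball. The function names are made up for illustration and are not from my actual script.

```python
import hashlib
import json


def crate_sha256(crate_bytes: bytes) -> str:
    """Hex SHA-256 digest of the raw .crate tarball bytes."""
    return hashlib.sha256(crate_bytes).hexdigest()


def expected_cksum(index_text: str, version: str) -> str:
    """Look up the recorded checksum for one version in an index file.

    Assumes one JSON object per line, each with "vers" and "cksum" keys.
    """
    for line in index_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry["vers"] == version:
            return entry["cksum"]
    raise KeyError(f"version {version} not found in index")


def verify_crate(crate_bytes: bytes, index_text: str, version: str) -> bool:
    """True if the tarball matches the checksum recorded for that version."""
    return crate_sha256(crate_bytes) == expected_cksum(index_text, version)
```

This only checks integrity against whatever the index says, so it is no substitute for a real detached signature.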

Some implementation details (all open for discussion, this is mostly just a strawman POC):

I hacked up a quick/horrible python script to fetch source from crates.io and it autogenerates most of the bare minimum debian/* required to package up a library. The end result is (currently) a debian package named “librust-$crate-dev” that contains the crate source, with a patched Cargo.toml. The astute here will notice this package naming scheme implies we can only support a single version of the crate at once, without spawning off more explicitly versioned Debian package names (this is probably something that needs to be changed). The patched Cargo.toml has all dependencies rewritten from libc = "0.1" to libc = { path = "/usr/share/rustsrc/libc-0.1" }. I also rudely truncate all more explicit semver dependencies to 2-digit x.y (to prevent having to update all the Cargo.toml paths all the time), and use the Debian package metadata to preserve the original more-specific version requirement.
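For illustration, the dependency rewriting looks roughly like the sketch below. This is a simplified stand-in rather than the actual script: it only handles the plain `name = "version"` form inside the dependency tables, works line by line instead of using a real TOML parser, and the helper names are hypothetical. The /usr/share/rustsrc root is the one described above.

```python
import re

SRC_ROOT = "/usr/share/rustsrc"
DEP_SECTIONS = {"[dependencies]", "[dev-dependencies]", "[build-dependencies]"}

# Matches simple string-form dependencies such as: libc = "0.1"
_DEP_LINE = re.compile(r'^(?P<name>[A-Za-z0-9_-]+)\s*=\s*"(?P<req>[^"]*)"\s*$')


def truncate_version(req: str):
    """Reduce a requirement like "^0.1.12" or "0.1" to "x.y" (None if no digits)."""
    m = re.search(r"(\d+)(?:\.(\d+))?", req)
    if m is None:
        return None  # e.g. a bare "*" wildcard
    return f"{m.group(1)}.{m.group(2) or '0'}"


def rewrite_dep_line(line: str) -> str:
    """Turn `libc = "0.1"` into a path dependency; leave other lines alone."""
    m = _DEP_LINE.match(line)
    if m is None:
        return line
    xy = truncate_version(m.group("req"))
    if xy is None:
        return line
    name = m.group("name")
    return f'{name} = {{ path = "{SRC_ROOT}/{name}-{xy}" }}'


def rewrite_manifest(text: str) -> str:
    """Rewrite only the lines that sit inside a dependency table."""
    out = []
    in_deps = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_deps = stripped in DEP_SECTIONS
        elif in_deps:
            line = rewrite_dep_line(line)
        out.append(line)
    return "\n".join(out) + "\n"
```

Table-form dependencies, target-specific sections, and wildcard requirements pass through untouched in this sketch, so a real tool would need more care there.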

When you install the package, the crate source gets dumped into (rather arbitrary) /usr/share/rustsrc/$crate-$x.$y.$z with a symlink from $crate-$x.$y for Cargo.toml path convenience. I made the guess that x.y would be good enough to at least verify the approach, and so far I’ve hit other problems first. Note that I completely ignore Cargo.lock - I’m sort of unclear on whether that’s a bad thing, or expected.
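As a sketch of that layout (not the real packaging code, which does this via the generated debian/* files), the install step amounts to something like the following. The function name is hypothetical, and it assumes a plain three-part x.y.z version.

```python
import tempfile
from pathlib import Path


def install_crate(src_root: Path, crate: str, version: str) -> Path:
    """Create $crate-$x.$y.$z under src_root plus a $crate-$x.$y symlink to it.

    Returns the path of the symlink. Assumes version has at least x.y.z.
    """
    x, y, _rest = version.split(".", 2)
    full = src_root / f"{crate}-{version}"
    full.mkdir(parents=True, exist_ok=True)

    link = src_root / f"{crate}-{x}.{y}"
    if link.is_symlink():
        link.unlink()  # repoint the x.y alias at the newer x.y.z
    link.symlink_to(full.name)  # relative target, so the tree stays relocatable
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        alias = install_crate(Path(tmp), "libc", "0.1.12")
        print(alias.name, "->", alias.resolve().name)
```

Note that installing a second x.y.z under the same x.y simply repoints the symlink, which is exactly the "one version at a time" limitation mentioned earlier.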

So! I eventually have an example non-library Rust thing that I’m trying to build in this environment (in my case “cargo” itself). The debian/rules packaging script currently tries to build this with CARGO_HOME set to a temporary (empty) local directory to isolate from the local user’s settings, and I run ./configure --prefix=/usr (works fine), then make (and cargo build --release fails).

Some issues:

  • As far as I can see, cargo insists on updating the crates.io registry, even though there’s no dependency (afaics) which is using crates.io. Should I create a dummy registry checkout in order to prevent this?
  • Cargo goes ahead and ignores all my Cargo.toml path-rewriting work, and downloads all the crates from crates.io again, including things like winapi that shouldn’t even be in the dependency chain after my edits! Any suggestions on where cargo is getting this idea of the dependency graph from?

Note that all of this is entirely unrelated to the cargo package recently added to Debian unstable. That package uses a simpler (working!) build approach - I’m trying to explore the more general challenge of building cargo-using apps without vendoring all the dependent crates.
