Perfecting Rust Packaging - The Plan

Can you explain this? I don't believe I've ever heard you say this before and I don't know what you mean.

Is this something we are doing wrong? Does bundling an i586-unknown-linux-gnu target not make sense in your understanding?

The linked issue describes the use case from @jauhien, and passing rustc -L /usr/lib is exactly what they are doing. A point was also made on that thread that rustc can't even be compiled when it's already installed (though that is surprising to me...). This is apparently blocking Gentoo though I don't fully understand their installation model.

Edit: I've modified the text to state this is a rustc problem, not cargo.

There are two ways you can look at it. One is that our current "i686" target is buggy because it doesn't actually work on all i686-class machines. The other is that using target triples to specify CPU features is a bad idea because we try to attach too many different meanings to the triple, so the right solution is not to add an "i586" triple but to add some other mechanism to distinguish CPU features.

Either way, adding an "i586" target to rustc without any other changes would be extremely confusing.

I’ve filed issues on everything but the i586 triple, which @eefriedman has concerns about, and CI for building Cargo with its own release of Rust (since @alexcrichton is cool on the idea and it’s a lot of work for marginal gain).

@gus What do you think of @eefriedman’s concerns that encoding the i586-ness of the platform in the target triple is inconsistent with gcc and not the right way to encode CPU features?

My understanding of the problem is that there are three cases for linking LLVM: static bundled copy, static system copy, and dynamic system copy. The first is easy to handle because it’s decided by the configure script so we know when we’re in that case and can do whatever we need to, so the only difficulty is handling the two system library cases.

The current solution involves using #[link(..., kind = "static")] for static libraries and #[link(..., kind = "dylib")] (or usually leaving it off because it’s the default) for dynamic ones, which is hard because pkg-config and similar tools don’t tell you whether the library you’re linking against is static or dynamic. The reason they don’t tell you is because in the C world you basically don’t care: in either case you just pass -lfoo when running the linker (there are edge cases that don’t work, but those don’t work with Rust’s method either). As far as I can tell there’s nothing preventing Rust from doing the same thing.

That behavior is what kind = "dylib" produces. kind = "static" searches the system for the archive file, unpacks it, and then bundles all the objects into the resulting rlib (or whatever you’re producing). This is useful if the archive is some bundled library that you don’t want to install separately (like the bundled LLVM case), but it’s a weird thing to do with a system library. Just using the “dylib” behavior for any system library should just work because it matches the C behavior that linking was built up around.
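To make the pkg-config point concrete, here is a minimal Python sketch that splits the kind of string `pkg-config --libs` prints into search directories and library names. The function name and the sample flags are hypothetical, chosen only for illustration. Nothing in such output says whether a library is static or dynamic, which is why a build script cannot choose between kind = "static" and kind = "dylib" from it alone.

```python
import shlex


def parse_link_flags(flags):
    """Split a `pkg-config --libs` style string into (search_dirs, libs, other)."""
    search_dirs, libs, other = [], [], []
    for token in shlex.split(flags):
        if token.startswith("-L") and len(token) > 2:
            search_dirs.append(token[2:])   # -L/dir  -> library search path
        elif token.startswith("-l") and len(token) > 2:
            libs.append(token[2:])          # -lfoo   -> library name only
        else:
            other.append(token)             # anything else is passed through
    return search_dirs, libs, other


# Hypothetical output for a library "foo"; real output depends on the system.
dirs, libs, other = parse_link_flags("-L/usr/lib/foo -lfoo -lz -pthread")
```

The linker receives only names such as `foo` and `z`, and resolves each one to a shared object or an archive itself.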

(As a disclaimer I can only comment from a Linux point of view. As I understand it things are more complicated in Windows. There are some more details and discussion of this whole thing (including some Windows stuff) in a previous internals thread.)

@wthrowe thanks for the explanation and link to the other thread. I think I basically understand your critiques of how Rust handles linkage.

I'm afraid I don't really understand the concern. As far as I understand it, triples are just a handy way to encode a bunch of cpu architecture / ABI / platform options into a single string. My understanding (please correct me) was that gcc can indeed be configured with separate i386/i486/i586/i686-linux-gnu triples, and yes they just map to different default values for -march, -msse, etc. I'm not sure if @eefriedman is arguing that adding more triples is bad, or that adding triples is not sufficient/scalable and we need to add cargo support for passing arbitrary codegen compiler flags. I'd like a triple for my platform, and I'd like the ability to pass arbitrary codegen flags :wink:
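As a rough illustration of "a triple is a string that encodes a set of defaults", the Python sketch below splits a triple into its components and looks up a default `-march` value. The table is an assumption made up for this example and does not reflect what gcc, LLVM, or rustc actually ship. The parser only handles the two shapes that appear in this thread (arch-vendor-os-env and arch-os-env).

```python
# Illustrative table only: these defaults are assumptions for the sketch,
# not the values any real toolchain uses.
ASSUMED_DEFAULT_MARCH = {
    "i386": "i386",
    "i486": "i486",
    "i586": "i586",
    "i686": "i686",
}


def split_triple(triple):
    """Split 'arch-vendor-os-env' or 'arch-os-env' into named parts."""
    parts = triple.split("-")
    if len(parts) == 4:
        arch, vendor, os_name, env = parts
    elif len(parts) == 3:
        arch, os_name, env = parts
        vendor = "unknown"  # assumption: a missing vendor is treated as "unknown"
    else:
        raise ValueError("unsupported triple shape: %r" % triple)
    return {"arch": arch, "vendor": vendor, "os": os_name, "env": env}


def default_codegen_flags(triple):
    """Return the (assumed) default codegen flags implied by the triple."""
    arch = split_triple(triple)["arch"]
    return ["-march=" + ASSUMED_DEFAULT_MARCH[arch]]
```

Under this view, adding a triple and passing explicit flags are two spellings of the same information, which is the trade-off being debated here.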

I'm fine with adding a more specific i586-debian-linux-gnu, if we want to make it clear that it means only "the abi/cpu features that Debian assumes". I'm also fine with adding that just to the Debian rustc.deb if there's resistance to carrying it upstream. Really, I just need something I can pass to rustc/LLVM to get it to produce output that fits the Debian "i386" architecture definition (basically gcc's i586-linux-gnu), and doesn't assume the existence of pentium2 instructions. I can pass a bunch of compiler flags around, or I can create the triple that represents those same flags - either way I need enough support/hooks in my cargo executable to be able to do that, and right now that means a new triple.

My primary point is just that a system where “i686-pc-linux-gnu” means pentium4 and “i586-pc-linux-gnu” means pentium1 is really confusing. Also, if Debian retires i586 in favor of i686 sometime in the next few years, you’re going to run into trouble because i686 won’t mean what you need it to mean.

On a side note, I’m pretty sure the testsuite will fail on an “i586” target because of differences between SSE and x87 math.

i686 should actually be Pentium Pro?

I’ve filed an issue against Rust to apply -Wl,-z,relro by default. This is one of the reasons for wanting to apply custom command line arguments to rustc (which we still plan to do). I don’t know how feasible it is but it seems like the sort of option Rust might like.

Just to update everyone here, I tried to put together a quick proof-of-concept of getting cargo to use local packages. It doesn’t work yet, but I’ve learnt a few things along the way.

In no particular order (or difficulty):

  • Even something “simple” like cargo requires a lot of crates :stuck_out_tongue: I think I’ve built 37 “library” packages in order to build cargo, and that’s with me cheating a bit and collapsing some dependency chains that specified supposedly incompatible versions of the same crate (I hacked the Cargo.toml dependency and reused the one version for both).
  • Lots of over-specified version requirements across crates.io
  • Several crates declare blanket dependencies on winapi/kernel32 that aren’t actually required unless you’re building for a windows target (for example, cargo itself)
  • Often the only indication of copyright is the keyword in Cargo.toml entry (no explicit LICENSE text nor copyright comments in readme/source)
  • Many upstream git repos don’t actually make releases (or declare git tags, etc) other than the snapshot that happens to get uploaded to crates.io. This effectively means we need to build distro packages from what’s in crates.io or else we have no hope of matching the semver dependencies used between packages.
  • A number of *-sys crates ship a full upstream C library source, which they only use as a fallback for when the local platform version couldn’t be found. I hadn’t actually noticed that before, and it makes packaging (only slightly) interesting because now we have a bunch of extra files to either strip out, or audit licenses.
  • The good news with distro packaging is that we can use the cross-language distro package dependencies to just make sure we always have pkg-config, the required C library, headers, etc installed giving us a much simpler and more predictable result. In just about all cases, we hit that first pkg-config line in build.rs and the rest is skipped entirely.
  • Repackaging from crates.io rather than upstream github isn’t great because:
      • Can’t share work between upstream repos that contain multiple crates. Probably can’t do this anyway (easily), since the “sub crates” often declare wildly different crate versions so we wouldn’t be able to do a 1:1 crate version and distro package version.
      • A number of files are often missing from what gets uploaded to crates.io. In particular, documentation, examples, license files - things that aren’t actually cargo buildables.
      • Several “sub crates” assume the original source subdirectory layout, and derive their crate name from the directory. I needed to patch several of these to add explicit Cargo.toml name=... directives once I started shipping them in my own (differently named) $name-$version directories.
  • crates.io doesn’t make it easy to verify sources. There are checksums buried away in the metadata (retrieved via a git checkout), but ideally there’d be a simple http-accessible signature alongside the source download (like there is for rustc source itself).
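For the source-verification point, a minimal Python sketch of the check a packaging script would run once it has both the downloaded archive and the recorded checksum. It assumes the recorded checksum is a SHA-256 hex digest. The function name and the sample bytes are hypothetical, and fetching the checksum from the registry metadata is left out.

```python
import hashlib


def sha256_matches(data, expected_hex):
    """Return True if the SHA-256 of `data` equals the recorded hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_hex.strip().lower()


# Stand-in for a downloaded crate archive and its recorded checksum.
payload = b"example crate archive bytes"
recorded = hashlib.sha256(payload).hexdigest()
```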

Some implementation details (all open for discussion, this is mostly just a strawman POC):

I hacked up a quick/horrible python script to fetch source from crates.io and it autogenerates most of the bare minimum debian/* required to package up a library. The end result is (currently) a debian package named “librust-$crate-dev” that contains the crate source, with a patched Cargo.toml. The astute here will notice this package naming scheme implies we can only support a single version of the crate at once, without spawning off more explicitly versioned Debian package names (this is probably something that needs to be changed). The patched Cargo.toml has all dependencies rewritten from libc = "0.1" to libc = { path = "/usr/share/rustsrc/libc-0.1" }. I also rudely truncate all more explicit semver dependencies to 2-digit x.y (to prevent having to update all the Cargo.toml paths all the time), and use the Debian package metadata to preserve the original more-specific version requirement.
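The dependency rewrite described above can be sketched in a few lines of Python. This is not the actual script. The function names are hypothetical, and it only handles simple requirements such as "0.1", "0.1.8" or "^1.2.3". The /usr/share/rustsrc root and the truncation to x.y are taken from the description.

```python
import re

SRC_ROOT = "/usr/share/rustsrc"  # install location named in the post


def truncate_version(req):
    """Reduce a simple version requirement to its 'x.y' prefix."""
    match = re.match(r"\D*(\d+)(?:\.(\d+))?", req)
    if match is None:
        raise ValueError("no version number in %r" % req)
    major = match.group(1)
    minor = match.group(2) or "0"  # assumption: a bare major "1" becomes "1.0"
    return "%s.%s" % (major, minor)


def rewrite_dependency(name, req):
    """Turn `name = "req"` into a path dependency on the installed source."""
    path = "%s/%s-%s" % (SRC_ROOT, name, truncate_version(req))
    return '%s = { path = "%s" }' % (name, path)
```

For instance, `rewrite_dependency("libc", "0.1")` produces the `libc = { path = "/usr/share/rustsrc/libc-0.1" }` line quoted above. The original, more specific requirement would have to be carried elsewhere, which is what the Debian package metadata is used for.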

When you install the package, the crate source gets dumped into (rather arbitrary) /usr/share/rustsrc/$crate-$x.$y.$z with a symlink from $crate-$x.$y for Cargo.toml path convenience. I made the guess that x.y would be good enough to at least verify the approach, and so far I’ve hit other problems first. Note that I completely ignore Cargo.lock - I’m sort of unclear on whether that’s a bad thing, or expected.
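A small Python sketch of that on-disk layout, run against a temporary directory instead of /usr/share/rustsrc. The function name is hypothetical, and the real package would create these entries through the Debian packaging machinery, not a script like this.

```python
import os
import tempfile


def install_crate_source(root, crate, version):
    """Create root/<crate>-<x.y.z> plus a <crate>-<x.y> symlink pointing at it."""
    x, y, _rest = version.split(".", 2)
    full = "%s-%s" % (crate, version)
    short = "%s-%s.%s" % (crate, x, y)
    os.makedirs(os.path.join(root, full))
    # Relative link, so the tree still resolves if the root is relocated.
    os.symlink(full, os.path.join(root, short))
    return os.path.join(root, short)


root = tempfile.mkdtemp()  # stand-in for /usr/share/rustsrc
link = install_crate_source(root, "libc", "0.1.8")
```

A rewritten Cargo.toml path ending in libc-0.1 then resolves to whichever 0.1.z source is installed, which is also why only one z per x.y can be present at a time.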

So! I eventually have an example non-library Rust thing that I’m trying to build in this environment (in my case “cargo” itself). The debian/rules packaging script currently tries to build this with CARGO_HOME set to a temporary (empty) local directory to isolate from the local user’s settings, and I run ./configure --prefix=/usr (works fine), then make (and cargo build --release fails).

Some issues:

  • As far as I can see, cargo insists on updating the crates.io registry, even though there’s no dependency (afaics) which is using crates.io. Should I create a dummy registry checkout in order to prevent this?
  • Cargo goes ahead and ignores all my Cargo.toml path-rewriting work, and downloads all the crates from crates.io again, including things like winapi that shouldn’t even be in the dependency chain after my edits! Any suggestions on where cargo is getting this idea of the dependency graph from?

Note that all of this is entirely unrelated to the cargo package recently added to Debian unstable. That package uses a simpler (working!) build approach - I’m trying to explore the more general challenge of building cargo-using apps without vendoring all the dependent crates.


I'd be interested in hearing more about this.... if anything, we've had an under specified problem, I'd think.

Just wanted to say thanks for all the work you've put into blazing the trail here!

This is being worked on in rust-lang/cargo issue #1007 (“Need ability to add dependencies based on `#[cfg()]`”) and https://github.com/rust-lang/rfcs/pull/1361

This was a reason I suggested the possibility of adding a make dist style cargo command, especially if the other option on the table is each distro manipulating the Cargo.toml which I have no interest in managing.

I use -Wl,-z,relro,-z,now in my build of Rust for Yocto. It’s a fork from a previous repo and is a bit of a mess but I’m working to clean it up. The repo is called meta-rust.

I’ve actually landed a patch in Rust master recently to handle i386, i486, i586, and i686 in mk/platforms.mk. It had already handled i386 and i686. Yocto also targets i586, and the meta-rust I cloned my repo from had been patching that spot for some time.

Sorry for interfering. Any plans for two specific things:

  1. Binding to custom crate repositories?
  2. Allowing to release pre-built artifacts for crates?

Thanks

@gus Thanks for working through all that and giving us the details. Next week the Rust team is meeting in person and we’ll try to regroup to understand the problems you are dealing with and how Cargo can ease them.

@target_san

  1. It is possible to use other repos than crates.io, though not super tested in the wild. See this unit test, which frobs registry.index in .cargo/config. @alexcrichton says you might also want to look at the implementation of cargo-vendor.

  2. I don’t think there are specific plans for releasing binaries, though it’s known to be a desirable feature.