Perfecting Rust Packaging


#54

Exact naming is probably just a bikeshed color. I would generally expect a subpackage that’s only used at build time to be called -dev on Debian or -devel on Fedora. If the crate has any kind of [[bin]] of its own I’d put that in the base package, otherwise there doesn’t have to be an installable base package at all.


#55

@jauhien I’ve been thinking more about side-by-side install and I have one more question for you.

So I understand how having control over the ‘filename extra’ helps you keep multiple copies of Rust’s libraries in /usr/lib, but I’m wondering about the contents of /usr/lib/rustlib. This directory contains artifacts that are not Rust crates, and so do not contain the hash, most notably compiler-rt.a. Without special care these will clobber each other if multiple instances of Rust are overlaid on each other. Are you dealing with that in any way? Should we?
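To make the clobbering concern concrete, here is a minimal sketch that checks two install manifests for paths they both claim. The toolchain names, hashes, and file paths are made up for illustration; only the idea that hashed crate filenames stay distinct while unhashed files collide comes from the discussion above.

```python
from collections import defaultdict

# Hypothetical install manifests for two Rust versions overlaid on one prefix.
# The hashed library names differ, but the unhashed runtime archive does not.
DEMO_MANIFESTS = {
    "rust-1.4": [
        "/usr/lib/libstd-aaaa1111.so",
        "/usr/lib/rustlib/compiler-rt.a",
    ],
    "rust-1.5": [
        "/usr/lib/libstd-bbbb2222.so",
        "/usr/lib/rustlib/compiler-rt.a",
    ],
}


def find_clobbered_paths(manifests):
    """Return {path: [toolchains]} for every path claimed by more than one toolchain."""
    owners = defaultdict(list)
    for toolchain, paths in manifests.items():
        for path in paths:
            owners[path].append(toolchain)
    return {path: sorted(names) for path, names in owners.items() if len(names) > 1}


print(find_clobbered_paths(DEMO_MANIFESTS))
```

A packager could run a check like this over the file lists of two candidate packages before declaring them co-installable.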


#56

FWIW, I noted in bugzilla that Fedora’s clang already ships the equivalent to compiler-rt.a as libclang_rt.builtins-x86_64.a. I hope to configure Rust to just use that, but I haven’t researched how yet.

(Sorry if I’m replying too much. I even got a popup warning not to monopolize the discussion. I’ll try to step back for a while…)


#57

There is no automatic way to deduce this currently, but I think this is obvious enough for humans and that’s ok. Eg: for a “source based” library you just copy the entire source tree; for some executable app you just copy the (few) output binaries; for some hypothetical “binary rlib/dylib library” package you would just copy the (few) rlibs/dylibs.

If/when packages start including complex application data as well, then yeah we’ll need to make a more powerful “cargo install” to work out what goes where - but right now I think this is a low priority.

Yep, this is technically easy to do in the packaging metadata. As you go on to discuss, the implications of this are that we’d need to rebuild every Rust library package when a new rustc was released.

Just to complete this line of thought, we could in fact package and reuse dylibs by using similar tight restrictions at the distro packaging metadata level and it would work just fine with the same caveat that we have to rebuild everything (this time applications too) with each rustc release.

My proposal (and current plan) is to package libraries as source and not distribute rlibs/dylibs at all, for the sole reason that this format is more portable across compiler revisions. The downsides are additional CPU cycles at (application package) build time, the need to rebuild all affected application packages whenever a library is updated, and the fact that there will be some library out there somewhere that has a license that won’t let us ship source (but I don’t care about that right now).

I haven’t thought too much about “plugins” yet (both rustc compiler plugins and any project where the “deliverable” is a .so library, like a hypothetical pam module written in Rust). I have a suspicion they might require a tight version requirement on the compiler or std dylibs and perhaps need to be rebuilt on every compiler release. Provided the number of such packages is small, we can deal with that.

Does Debian really want to package the source code? No. We’re just looking for what looks like the best tradeoff within the current limitations of the Rust toolchain and ecosystem. I expect/hope this will evolve quite a bit as we get more Rust applications “in production” and the ABI stability story matures. (My beard is showing, but yes I remember the a.out -> ELF transition that C-on-Linux went through for basically all the same reasons :wink:)

As @cuviper also clarified, no, this isn’t correct. Debian (and just about every other distro, notably not Gentoo) has a clear distinction between “source” packages and “binary” packages (Debian nomenclature, but the idea is the same in Red Hat, etc).

Note in particular that “binary packages” often include libraries - it’s anything that is the “output” of the package build process. The separation of pre- and post- build also results in a sharp distinction of “build-time” (“build-deps” in Debian speak) and “run-time” dependency relationships between packages. I expect the “binary package” jargon and the fact that they might include “libraries” is confusing, and I wish I had a whiteboard handy to draw boxes with arrows.

Each upstream project is bundled up as a “source” package (which typically contains source, duh), and the distro machinery centrally compiles that into (possibly multiple) “binary” packages (which typically contain binary executables, shared libraries, or data/config files of some sort). Regular end users download and install binary packages only. This is good because it doesn’t require CPU on the user end, and the run-time dependencies are typically much fewer/simpler than the build-time dependencies.

In “source based distros” (Gentoo is the major example, but also OpenEmbedded/buildroot/etc), users download the source packages and do the compile locally - with the help of the packaging tools. Requires lots of CPU, but allows them to have enormous flexibility in exactly how that gets built. The embedded folks like this because they can get the ultimate size and flexibility in their output. I’ve never understood why Gentoo users do it :wink:

So: my “source-based Rust libraries” plan is to have:

  • Debian “source” packages that include whatever library/application Rust source.
  • The Debian “binary” package for a Rust library will just be the source, installed in a known directory somewhere (ie: the Debian package build step is basically a no-op).
  • The Debian package for a Rust application will build-depend on (probably several) Rust library packages. The build-deps will ensure the Rust library packages (sources) are installed before building.
  • The application package build step will run rustc (via cargo) to compile the application and all the relevant library sources. The resulting (statically linked) executable goes into the application’s Debian binary package.
  • Note the application package has no run-time dependency on the libraries. When the end user “apt-get installs” the application package, they get just the statically linked executables, with no need to ever know about the library packages.
  • A security fix in one of the library packages requires a rebuild of any application packages that build-depend (at whatever depth) on the library package (this is visible in the Debian metadata).
  • A new rustc doesn’t require anything to be rebuilt, but we do need to ensure that all applications are able to be rebuilt.

This means: The rust-library “binary” packages will be basically identical to the rust-library “source” packages, except with different path prefix, and probably patches to Cargo.toml applied, etc. rlibs won’t be used anywhere, except libstd.
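The “rebuild on security fix” rule above can be sketched as a walk over reverse dependencies. This is only an illustration: the package names and the dependency table are hypothetical, and real distro tooling would read this information from the archive metadata rather than a hard-coded dict.

```python
from collections import defaultdict, deque

# Hypothetical metadata: package -> packages whose sources it needs at build time.
BUILD_DEPENDS = {
    "app-a": ["librust-foo-dev"],
    "app-b": ["librust-bar-dev"],
    "librust-foo-dev": ["librust-baz-dev"],
    "librust-bar-dev": [],
    "librust-baz-dev": [],
}
APPLICATIONS = {"app-a", "app-b"}


def packages_to_rebuild(fixed_library, build_depends, applications):
    """Return the applications that depend, at any depth, on fixed_library."""
    # Invert the edges so we can walk from a library to its dependents.
    reverse = defaultdict(set)
    for package, deps in build_depends.items():
        for dep in deps:
            reverse[dep].add(package)

    # Breadth-first search over the reverse edges.
    seen = set()
    queue = deque([fixed_library])
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)

    # Library packages are just source, so only applications need a rebuild.
    return sorted(seen & applications)


print(packages_to_rebuild("librust-baz-dev", BUILD_DEPENDS, APPLICATIONS))
```

In this toy data a fix in librust-baz-dev reaches app-a through librust-foo-dev, while app-b is untouched.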

(Apologies for my posts being so lengthy, it’s hard to gauge how much background is already understood without body language.)


#58

Re: shipping source. Others already commented this is how Go is packaged. This is also how Common Lisp is packaged in Debian, because there are too many Common Lisp implementations to make binary packages for all implementations. Common Lisp libraries are packaged in source form, and compiled to binary form (Common Lisp has many native code compilers) locally at install time in post-installation script.


#59

An alternative is to have a naming scheme where the ABI hash has to figure in library package names (as is already the case for the dylibs themselves) to try to enable gradual updates and side-by-side installation. Debian’s library package naming policy actually requires that. But it will be ugly. The source package name does not have to be mangled, though, and simply-named package aliases can be provided to refer to the currently mainstream build variant.


#60

It could be what cargo package wraps up for uploading. In fact the build step could just run cargo package with yet-nonexistent options to leave the files unpacked in a specified directory, and the install step would copy the contents of that directory into the buildroot location for the system-wide Cargo package source directory, according to an agreed-upon subdirectory naming scheme per package. If that scheme is $(name)-$(version), then you can just untar the crate tarball as it is produced now. The source package will most likely need the same content in the tarball form (AKA .crate artefact of Cargo), plus files for the downstream packaging system, the usual way.
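A rough sketch of that install step, assuming the $(name)-$(version) scheme and relying on a .crate file being a gzipped tarball with a single top-level $(name)-$(version) directory. The function names and the demo crate are hypothetical, and the system-wide source directory is stood in for by a temporary directory.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def crate_source_dir(root, name, version):
    """Directory a crate's sources would occupy under the shared source root."""
    return Path(root) / f"{name}-{version}"


def install_crate(crate_bytes, root):
    """Unpack a .crate archive into root, keeping its name-version top directory."""
    with tarfile.open(fileobj=io.BytesIO(crate_bytes), mode="r:gz") as archive:
        archive.extractall(root)


def make_demo_crate(name, version):
    """Build an in-memory stand-in for a .crate file, for demonstration only."""
    manifest = f'[package]\nname = "{name}"\nversion = "{version}"\n'.encode()
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(f"{name}-{version}/Cargo.toml")
        info.size = len(manifest)
        archive.addfile(info, io.BytesIO(manifest))
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as demo_root:
    install_crate(make_demo_crate("demo", "0.1.0"), demo_root)
    print((crate_source_dir(demo_root, "demo", "0.1.0") / "Cargo.toml").is_file())
```

Because the archive already carries the name-version directory, the install step needs no renaming, which is the attraction of that scheme.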


#61

Thanks for the clarifications! I think I have a decent grasp on it now.


#62

@gus If we provided an out-of-the-box i586-unknown-linux-gnu triple that Debian could configure with, and didn’t use i686 features, would that work?


#63

Thanks for everybody’s patience. I’ve posted a new thread with a bunch of tasks taken from this thread, and naturally I’d appreciate review.


#64

It would certainly save me the effort! :wink:


#65

I would have thought that the -dev package would contain the source and the binary package would be effectively empty, while having the application package depend on the empty library package. This would allow for -sys packages to have native dependencies which are dynamically linked, like openssl-sys depending on openssl and therefore, applications like cargo would depend on openssl-sys, which would be empty but would ensure that libopenssl is installed?


#66

I don’t think there is a need for empty non-dev packages. When the application is built, it gets the dynamic linkage parameters from the -sys crates which are built from -dev packages, and then the package build tools turn the dynamic library dependencies which are baked into the application binary into package dependencies automatically. You may even shed some unnecessary dependencies this way, as IIRC rustc passes the --as-needed option to the linker.


#67

My nomenclature might be a bit confusing. I’ve been somewhat carefully trying to use these terms without ever actually defining them anywhere:

  • “source package” - the input to the (Debian) packaging build process. In Debian this is a *.dsc and some tarballs, in Red Hat (I think) this is a srpm. No-one other than the distro packager(s) ever sees these.
  • “binary package” - the output from the (Debian) packaging build process. This is a *.deb or *.rpm. This is what you apt-get/yum/whatever install.
  • “library package” - a binary package that contains a library in some way. I see above I haven’t used better terms than “-dev” and “not -dev” to describe the way regular C dynamically linked libraries are typically split into multiple binary packages :confused:
  • “application package” - a binary package that contains an end-user (Rust) application. It most likely depends on several “not -dev” library packages.

So following the plan described earlier, a Rust library would be distributed in a “binary package” that bundles the Rust source. Since it is only a build-time dependency, it would be called “foo-dev” (or “foo-devel”) and there would be no “foo” (not -dev) unless there are some additional data files, etc required at run-time to use the library.

To continue your example, a hypothetical librust-openssl-sys-dev.deb package would include the Rust source and depend on (non-Rust) libopenssl-dev.deb for the unversioned C *.so used at link time. (Non-Rust) libopenssl-dev.deb depends on libopenssl.deb and perhaps other libraries, but that isn’t really any of our business. There would be no librust-openssl-sys.deb package. A hypothetical cargo.dsc source package might build-depend on librust-openssl-sys-dev.deb, which would also pull in the (non-Rust) libopenssl-dev.deb. Building the package would produce cargo.deb, which would have a run-time dependency on (non-Rust) libopenssl.deb to pick up the real openssl shared library and no run-time dependency on (Rust) openssl-sys at all because it has been statically linked. As @mzabaluev points out, the packaging build tools can usually deduce the run-time dependencies automatically (basically by running ldd over everything and seeing what dynamic libraries were actually linked).
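As a toy illustration of how run-time dependencies can be deduced from the linked binary, the sketch below parses ldd-style output and maps each resolved library file to the package that owns it. The sample output, the file paths, and the file-to-package table are all invented for the example; real packaging tools are more careful than this.

```python
import re

# Hypothetical ldd output for a built cargo binary.
SAMPLE_LDD_OUTPUT = """\
    linux-vdso.so.1 (0x00007ffd4a5f2000)
    libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007f3c1a2b0000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3c19ee0000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f3c1a720000)
"""

# Hypothetical table of which binary package ships which shared library file.
LIBRARY_OWNERS = {
    "/usr/lib/x86_64-linux-gnu/libssl.so.1.0.0": "libopenssl",
    "/lib/x86_64-linux-gnu/libc.so.6": "libc6",
}

# Matches lines of the form "name => /resolved/path (0xaddress)".
LDD_LINE = re.compile(r"^\s*(\S+) => (\S+) \(0x[0-9a-f]+\)$")


def runtime_dependencies(ldd_output, owners):
    """Return the sorted packages owning the libraries the binary links against."""
    packages = set()
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and match.group(2) in owners:
            packages.add(owners[match.group(2)])
    return sorted(packages)


print(runtime_dependencies(SAMPLE_LDD_OUTPUT, LIBRARY_OWNERS))
```

Note that nothing Rust-specific appears in the result: the statically linked openssl-sys crate leaves no trace, only the C library it links to.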


#68

Thanks for the explanation, I didn’t know this was how it worked before. That’s cool.


#69

Hi, regardless of whether Linux distros package rustc and cargo or not, I expect there will be a strong need to be able to easily install newer versions of rustc and cargo without relying on the distro. I hope, in particular, that the Rust project will distribute, at least, properly-signed and maintained RPM and DEB packages of new versions of rust and cargo from its own repository, that can be installed on a variety of Linux distros. See https://launchpad.net/~terry.guo/+archive/ubuntu/gcc-arm-embedded for an example of this which has worked great for me and others.

Debian Stable and Red Hat are notorious for maintaining old versions of packages way longer than anybody wants to support. With the great amount of improvement to rustc and Cargo, I think we’re actually 1 or 2 years away from being able to expect any Rust library or application author to support any version of Rust older than the latest stable release. I personally don’t want to be bothered by Linux distros asking me for free help in backporting changes or adding compatibility hacks to support old versions of rustc and Cargo that they ship. I imagine other people will feel similarly. Having an official PPA operated by rust-lang.org would help ease this burden so that we can spend more time focusing on creating the future instead of maintaining the past.


#70

Here’s a heads-up on a PR that affects downstream packaging: https://github.com/rust-lang/rust/pull/30353

This PR turns on rpath for the compiler by default in order to make the official binaries ‘just work’ without setting LD_LIBRARY_PATH. Distributions that do not want that behavior would now need to pass --disable-rpath.


#71

If distros 1) ship Firefox every six weeks (Ubuntu and Fedora do and Debian intends to AFAICT) and 2) have a policy that packages they ship have to be buildable with tools available from the distro repos, then those distros need to ship rustc often (ideally every six weeks) or there will be trouble first when Rust code in Firefox makes it to the release version and later when Rust code depending on a newer rustc makes it to subsequent release versions of Firefox.

I posted to dev-platform about this today.

What’s the current status of being able to build rustc stable release and std stable release with something more predictable than a particular nightly along the way?


#72

Some of us replied on the linked thread, but for reference: there are no known technical obstacles to adjusting the snapshot process to use the previous stable release, and the core team will formulate a plan at our next meeting (on Wednesday).


#73

As a +1 for this before Wednesday: if each release is buildable from the previous release, that would at least be enough for us to be able to submit rustc for inclusion in openSUSE Tumbleweed (and then potentially in the next stable releases of both openSUSE and SLE). There’s more work needed to get cargo and crates packaged, but having the compiler itself available would be great.