Perfecting Rust Packaging

Global caveat: I don't pretend to represent all of Fedora. I'm just answering to the best of my understanding, and it's a bit theoretical since real Fedora packages don't exist yet. This also ran very long, so I hope I didn't cross my wires anywhere...

I don't think it's so important to align upstream and downstream packaging. Certainly each distro will have different policies about the nitty-gritty of subpackaging.

But as a general guideline, I do agree it's a good idea to suggest a meta package to pull in the basics. Sort of like the haskell-platform package, which is nearly empty but depends on all the other packages to complete the platform.

Currently Fedora uses /usr/share/doc/$NAME, but it used to have $NAME-$VERSION-$RELEASE. The best practice is to use the %doc macro right in the build directory, without otherwise installing them at all. There's a similar %license macro too. An html subfolder for docs is fine too.
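A hedged sketch of how that usually looks in a spec's %files section (the file names here are purely illustrative, not Rust's actual file list):

```spec
%files
# %license and %doc pick these up straight from the build directory;
# nothing needs to be installed into the buildroot by hand.
%license LICENSE-APACHE LICENSE-MIT
%doc README.md RELEASES.md
# A subdirectory works too, e.g. generated HTML documentation.
%doc html/
```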

I'm not sure if Fedora really deals in such things. There's the "alternatives" system, but I think that's usually for completely different implementations, like java runtimes or the mta. My guess is if multirust were to be packaged at all, it would only be useful to manage user-installed rust versions, not the distro package.

Ideally, if stability and eventually ABI is done right, this won't be necessary. Fedora's only alternate gcc is compat-gcc-34, so we're reaching way back here. Those executables are installed with simple suffixes, gcc34, g++34, et al, and the rest of gcc's runtime is already taken care of, isolated like /usr/lib/gcc/$target/$version/.

I would like Cargo to have its own proper release and source tarball, yes. Especially since the versions don't match, I think it should be built as its own srpm distinct from rust itself. I don't really care if it's released in lockstep, but I suspect it will be anyway since they're practically developed together. Just need to know which rust-x.y is required for that cargo release, even if that's always the latest.

Bootstrapping from the previous release would be fantastic. Then stage0 will only be needed once, and we can self-build from then on. That exactly matches Fedora's exceptions to the ban on prebuilt binaries.

I guess for rebuilding the current version, say for a minor bug or security fix, there's a way to skip the initial stage? I haven't tried in a while. Point is just that this is a case where the existing rustc is already the same version as what we're building, so it might need to skip over some assumptions about what the "snapshot" contains.

As for the 6-week release train, I fully expect Fedora Rawhide to keep up with this. If the package maintainers slack off, then it will be justified penance to make them step through 6-week increments, no problem. I don't see the value of an upstream build script doing this, because then you'd need all those intermediate sources in the srpm.

Then in released Fedora, either we'll have answered the questions like ABI sufficiently that they should keep up every 6 weeks too, or else they'll be locked into some specific version and never need to worry about bootstrapping again. The next Fedora release will just have inherited from Rawhide which will be keeping up.

A blessed Cargo bootstrapping script would also be great. If not, then it'd be nice to follow the same suggested model as Rust, building with the prior Cargo release, with consideration for rebuilding on the current release too.

I personally don't like rpm-generators and the like, if that's what you mean. It just needs to have the right knobs to control installation paths, bindir, libdir, etc., since distros have different ideas about this. A DESTDIR control on the install command is also needed, e.g. even though I have prefix=/usr, I'll want to install to $DESTDIR/usr/... during the package build.
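A minimal shell sketch of the DESTDIR convention being asked for, assuming a make-install-style step; the staged file and paths are illustrative only:

```shell
# Configure-time prefix is /usr, but at rpmbuild time the install step
# must stage everything under $DESTDIR instead of the live filesystem.
DESTDIR="$PWD/buildroot"
mkdir -p "$DESTDIR/usr/bin"

# Stand-in for `make install DESTDIR=$DESTDIR`: one illustrative file.
install -m 755 /bin/true "$DESTDIR/usr/bin/demo-tool"

# rpm then packages the tree rooted at $DESTDIR as if it were /.
find "$DESTDIR" -type f
```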

Are you talking about source packages, or just the binary packages you distribute? I suspect the latter. The source of the standard library seems inseparable from the rest of the compiler, and should be part of the distribution - much like how libstdc++ is always part of gcc. (at least these days; not sure if that's historically always been true.)

For just binaries, meh. I think distros will vary, and upstream doesn't need to try to align how packages are named. It's good enough to just specify how rustlib/$triple should be found. For Fedora, I expect we'll just build everything natively, with those libs in their own subpackage, and then an x86_64 host can just install that rustc-libs.i686 compat package like all other compat libraries are handled.

Well, if it's so close already, why not take the goal of switching to stock LLVM altogether? Those few unwanted optimizations -- can you provide references to these discussions? Maybe they might be accepted with some rework? (but I don't know what these are...)

Anyway, yes, Fedora is one that would prefer not to use bundled LLVM. It's possible to get exceptions, but better if we don't need to. And this really ought to be dynamically linked too.

We usually deal with distro CFLAGS or equivalent in an rpm macro like %optflags. For example, Fedora's 32-bit x86 target currently gets "optflags: %{__global_cflags} -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables". I imagine we'll add something similar for rust packages, and in their spec they can then just run something like cargo build %{rustflags}, or maybe even just %{cargo_build}.

Right now Fedora's %__global_ldflags just has hardening options.

I haven't looked at those before, but at a glance, no I don't think Fedora needs that. That is, we need to run something like cargo install or whatever at rpmbuild time, but in the user's hands the install/uninstall task should be fully handled by rpm itself. Maybe it could be useful to produce a manifest to aid the rpm %files section. But since things are usually installed in a clean DESTDIR, you can usually just wildcard entire paths.

As for consistency across distributions -- what are you hoping for specifically? What problems are you trying to avoid? Distros all have their own policies, so I don't think you'll get a complete homogeneity here...

Package names, division, cross-std, are all going to vary by distro policy. For bootstrapping, I think we just need clear instructions on how to do it, especially offline, whether from stage0, the prior release, or rebuilding the current release. Certainly branding and trademark guidelines need to be stated if you expect any control there.

I'm not sure. There's enough interest that a few people have uploaded binary packages on copr, and I'm sure those users would prefer an upstream repo. Even after it is in Fedora proper, your own repo still might be useful if Fedora ends up staying conservative on updates. Beta and nightly repos might be cool too. BUT - if you're messing with all that, you probably want multirust with user-installed rustc, so rpm repos are beside the point.

This is an additional tangle. In an ideal Fedora-packaging world, every rpm-packaged crate would be built and installed separately from all its dependencies, and all dynamically linked together as needed. Those dependencies should be found and used in library form, not from full source. (I guess there's no header equivalent needed.) But I fear that this world of Cargo.lock fixed dependencies and vague ABI promises is really incompatible with that ideal.

The hand-wavey planning in my head right now says that we should really only try to package rustc and cargo for now, and their docs, and leave everything else up to users building from crates.io. But that would mean nothing else of substance can be built with Rust into the distro, which isn't so useful...

EDIT: It may be of note that fedora-devel has a large thread on anti-bundling requirements right now. I like Adam Williamson's recent reply comparing how python and perl are good with distro packaging, and how PHP and JS are horrible -- I want Rust to be on the good side! But all of those are interpreted languages, so at least they don't have to worry about ABI. Lars Seipel mentions work happening in Go, which may be a better comparison, but I know little about that ecosystem.

Note that Cargo is not a package manager in the sense of RPM, dpkg, etc. By and large, it manages dependencies within a package build, and all files it modifies are expected to be in the home directory of the invoking user. Even the install command is designed to avoid stepping on the system package manager's toes. Cargo's default way to resolve dependencies is to pull their sources off crates.io or the source repositories as specified in Cargo.toml. AFAIK reachability of internet resources is not a guarantee in distros' package building environments and in any case should not be relied upon: normally all input sources are extracted from the source package or come from installed dependency packages.

In my opinion, it makes more sense to see Cargo as a tool invoked in a downstream package's build, like, say, Ant for some Java projects (Maven might be a closer analogy, but I don't have actual experience with packaging Maven-based projects). There might be helper tools to convert metadata from Cargo build files to downstream package metadata, but this should mainly be a concern of the downstream's developers.

To satisfy the no-downloads restriction, there should be a way for Cargo build to override dependency resolution with globally installed packages. This also means committing to some format of installed library packages. Should they install .rlib artefacts plus some global equivalent of .cargo overrides? Or package the source tree?

Thanks for the thorough replies, @gus.

Yes, agreed, but mainly because it always has, and I don't see being able to divorce the two any time soon.

This sounds like it may be the same as a rustc/std package split. I would expect that - given our current file structure - a rustc package contains the stuff we drop into /usr/local/{bin,lib} today, and the std packages to contain the stuff in rustlib.

I agree in principle. I've been considering that the host std might live in the 'normal' place with the other libs (/usr/lib) and all cross-libs might live in rustlib.

Yes, agreed. Upstream wants to distribute source packages too and this would be our division.

Great. Noted.

Aha. The 'build with the most recent stable rust' requirement I hadn't considered. To be clear, the paired Cargo needs to be able to build with the rustc it's paired with, but not the previous stable rustc?

Do you take the Debian source for rustc from the source tarballs or from git?

Does the approach I outlined, where we guarantee a bootstrap chain from some previous stable release, work? You might have to run a script to rebuild lots of rustc's to catch up from whatever Debian's prior stable release was. How long is the window in which Debian might not upgrade Rust? Will Debian testing always stay up to date? Skip one release, two?

To be clear, you want Cargo to report the versions of native deps, so that e.g. Debian can override them with a different native dep for bugfixes?

Are these other sources just the native deps, or do you want to redirect cargo so that it gets Rust crates from another place? Like security fixes in native deps, would Debian identify security bugs in Rust crates, and then forcibly override the deps of that crates' downstreams?

Haha! No. We just want to decouple the distribution so we can distribute arbitrary numbers of stds and not force people to download them.

The changes I'm suggesting (and we're working on now) are for upstream's own purposes (described above) - we need to be able to distribute cross-std in discrete units.

Yes, definitely. We would gate on it and force ourselves to be compatible with LLVM stable.

Great. Thanks for the clarifications. So as I understand, for now at least, it's the C bits that you want to customize. It looks like incorporating something like your patch will fix that for Rust's own build system.

Nice link! Thanks.

Great. Thanks for the data point.

AFAIK this would just mean that we output copies of all the scripts, fonts, etc we use during the build, and don't rely on any CDNs for that stuff?

Can you expand on that? If I understand correctly you are suggesting that the complete std package be called 'rust-std-nightly.tar.gz' (as opposed to 'rust-std-nightly-x86_64-unknown-linux-gnu.tar.gz'). Note that all our existing packages contain the host triple in their name. What is the difficulty you are foreseeing?

Will do! You are on my contact list.

Yes, this is the case. Using dylibs for the most common thing dylibs are used for (providing ABI-compatible updates) is not viable in Rust right now. Static linking is strongly preferred by the entire system atm. Solving this is far-future territory :frowning:

Since Rust much prefers static linking, will Fedora use it, or will Fedora want to dylib crates anyway?

We may be able to make some quick fixes here on native dylibs. I suspect that a lot of crates are 'vendoring' their native libs and not providing an option to dynamically link them. Not sure. cc @alexcrichton.

Ouch... I don't think there will be any hope of this any time soon. Presently in Rust, when anything upstream changes, including the compiler, all downstream must be rebuilt.

Done. Thanks. I can't believe I completely excluded BSD. I've also pinged @dhuseby

Yes, I think this is or will be an option. There's a PR open now trying to improve support. I was not aware that 3.7 didn't support static linking. Wonder what we're doing in-tree...

There isn't currently, no. Somebody earlier in the thread mentioned a desire for this as well. I hadn't previously considered the case for point releases. Do you think that generally people will expect to build point releases with the previous point-release and not the previous major/minor release? Since we haven't done a point release (in years at least), if we don't plan for that it may bite us when the time comes.

Yes, just talking about the binaries.

I've only heard it mentioned informally that they aren't suitable for upstream LLVM. Perhaps @alexcrichton remembers the actual discussions with upstream.

I think neither cargo nor rustc will obey any such environment variables by default. The build of Rust itself is flexible enough to override this in the rpm definition, but subsequent uses of rustc don't have a way to obey e.g. %optflags. This would be a concern for packaging cargo crates and for simply using rustc on the system.

Somewhat of a follow up on my previous comment. It sounds like there is a difference between what rpm should do and what the compiler itself should do by default. Is it ok not to use the global ldflags if e.g. Bob is just running rustc hello.rs?

Forgetting to include files, putting files in strange places, keeping up with upstream changes in filesystem layout, removing maintenance burden from downstream.

Thanks for those links! I haven't reviewed yet, but I will.

Thanks for the amazing response folks.

The cargo version that we include in a Debian release needs to be buildable with the rustc also in that release. So if you want us to match the rustc/cargo pairing that you're using for your pre-built bundles, then yes we'll need it to be able to build with the paired stable rustc.

We take source tarballs (and then import them into our own git tree, but that's mostly a coordination choice on our side). Currently we also repack the source tarballs to remove src/llvm/* (unneeded, and big) and src/etc/snapshot.pyc (shouldn't be in the upstream tarball anyway).

Note that even when we kept the full src/llvm/ source, we still had to remove src/llvm/cmake/modules/LLVMParseArguments.cmake because CC-BY 2.5 isn't DFSG-free. We also have to find/include the appropriate jquery source for the minified version shipped in upstream git - again due to Debian's interpretation of "compiled" for javascript. It would be helpful if the latter was already included in upstream source, and perhaps you could work with LLVM to address the former? (but that's just me being lazy :wink: )

Yes, I think it would be fine to say that we need to hop through each stable rustc release - or re-bootstrap everything. We intend to upload every rustc stable release into Debian unstable, and have that trickle down into testing in the usual manner (~2 week delay); Debian stable is then a snapshot of whatever testing was at that point in time (so Debian oldstable->stable will almost certainly skip rustc releases).

A Debian (entire distro) testing/stable release needs to be able to rebuild whatever rustc version is included in it using only other packages also in that release (so using the same rustc version), and each new rustc release that we upload into unstable needs to be buildable with the immediately prior rustc release already in unstable (to avoid re-bootstrapping). I think this means we'll need to improve the Debian build to detect version and jump straight into stage2 for the former case, since stage1 won't build with its own release (until upstream moves away from using #[cfg(stage0)] as a versioning mechanism) - this doesn't sound too bad, but I haven't actually done it yet so I don't know what traps await..

EDIT: once again, I've forgotten that jumping straight into stage2 isn't possible because unstable Rust features won't be available. Sigh, need to think of something else (perhaps package rustc twice with different --release-channel choices)

Reporting versions is the easy bit (and could be done manually, although tooling is always nice). By "want to expose dependencies" I meant "vendoring sources" is technically workable, but not a great solution.

In a hypothetical future Debian stable release that included rustc/cargo and (eg) rust-mozilla, the Debian security team might need to patch the (eg) rust-ssl library used by rust-mozilla. We need to be able to tell cargo to rebuild the same rust-mozilla source using the now-modified rust-ssl library (so can't pull from network). We also want it to be easy to find and repeat this for every package that uses rust-ssl (so "vendoring" a new copy of rust-ssl source in every application that uses it is bad).

I sketched out a possible solution elsewhere, that follows what golang currently does in Debian. In this proposal, Debian Rust "library packages" are just source archives that get dumped somewhere centrally on disk. We then need some way to tell cargo to build using the libraries in this central location rather than going out to crates.io or wherever it would normally go. I think we can generate whatever version metadata/index cargo wants, if such a thing is required.

With this proposal, the rust-mozilla Debian package would have a regular Debian build-dependency on the rust-ssl-dev Debian package. The security fix would be as simple and routine as "release a patched rust-ssl-dev Debian package, and rebuild/release all Debian packages that build-depend on it". rust-mozilla and any others would pick up the modified rust-ssl source from the central on-disk location and it would all just-work.

As a practical matter, we would only assemble and upload libraries into Debian that were actually required by Rust executables also in Debian - this isn't attempting to be a viable alternative to cargo and crates.io for regular upstream software development.

The latter. If we want to have an executable built using cargo in Debian (this includes cargo itself), we need to be able to tell cargo to not use the network. As explained in the security example above, we'd also like to be able to point cargo at a central location so we can share those crates (source, not proposing rlibs/dylibs) across multiple cargo-using executables.

FYI: Luca's excellent work on packaging cargo might give you some idea of the blunt tools required to get current cargo to not use the network.

Ah I see :slight_smile: It would be helpful to distinguish between "source" and "binary" packages, since I thought you were talking about splitting out the rustc and std source packaging (and I've had to resolve this ambiguity in some of the other discussion).

Well no, the link step isn't specific to C. To make that more clear, in my example above I want to pass -C link-args="-Wl,-z,relro" to rustc. My other examples were about enabling/disabling cpu features and ABI details, which obviously need to influence the code that rustc itself produces.

We can (and do) patch this for the rustc build system easily enough (since the rustc build system exposes these details quite openly), but this is harder for cargo and other Rust applications (built using cargo) since cargo intentionally doesn't offer this flexibility (not saying whether that's good or bad, just highlighting there's a clash in use cases here).

Yes. That would also address people's privacy concerns. We could also implement a special mode (--offline) to be used in that case.

Hi,

I didn’t read the whole thing yet (I plan on doing it later today), but as the rustc packager for Exherbo, the main critical blocker for us atm is https://github.com/rust-lang/rust/issues/16402 (which doesn’t seem to have been mentioned here yet)

Without this fixed, each time a user wants to reinstall/update rust, she has to uninstall the installed one first.

No, I meant that the binary package naming/dependency metadata should be such that you can install librust-std for the host platform using just its short name, without the target triple (using Fedora as an example):

dnf install rust-std

Or, if side-by-side installation is supported:

dnf install rust-1.4-std

rust-std may then be a virtual package name to pull in the package for the current stable version (and the appropriate target triple suffix, if the actual binary package name is qualified with it).

Same for host-targeting rustc and cargo, if separate tool binaries are needed for different targets. The general principle is, packages for host tools and libraries are available by short names, but naming of cross-compilation tools has to designate the target triple.

It appears there is precedent for this. There's an exception noted under Fedora's guidelines for static linking for all OCaml programs:

Programs written in OCaml do not normally link dynamically to OCaml libraries. Because of that this requirement is waived. (OCaml code that calls out to libraries written in C should still link dynamically to the C libraries, however.)

It appears golang packages are also static linking everything, though I don't see an explicit exception for that. I'll have to look closer at how those languages are dealing with this. At a quick glance, it looks like golang libraries are just shipping sources, like @gus suggests here. Maybe it's not as strict as I thought.

If there's a system version of a library, those crates should definitely be dynamically linking that, not bundling their own. Those crates will have to be fixed as part of their initial packaging process.

It's not just about new point releases, but also in case rust needs to be rebuilt for any other reason. It could be a bugfix or security patch we want to apply, or it could be a simple rebuild due to other changed dependencies, say a new LLVM soname.

Once we get bootstrapped, the Fedora build root will have whatever rustc package is currently stable. So if we're on rustc-1.4.0-1, and we want to build rustc-1.5.0-1, that's fine, it's the previous release. Then a point release appears, we need to build rustc-1.5.1-1 using the buildroot's 1.5.0 we made earlier. Then if we need to rebuild for any reason, now rustc-1.5.1-2 will be built from 1.5.1-1.

There are ways to override the buildroot, but I don't think you can tag obsolete packages that way. Usually overrides are used to bring in some new package to chain build others, before sending them all to stable.

It sounds like @gus is also describing the same thing about jumping to stage2, but I hadn't even considered the problem of feature use. I don't think multiple --release-channel builds should be necessary -- there ought to be an option for even stable builds to use features. There's no need to be draconian, just let advanced use cases be advanced. IMHO :smile:

It doesn't need to be an environment variable, but we really do need control of compiler flags for cargo build of crates. Even just for simple stuff, like we usually want distro packages built with optimization and debuginfo. The crate author can set their profile preferences in Cargo.toml already, but IMO this shouldn't be treated as final. (Nevermind any -C options we might also want...)

Right, this is just policy for distro packages. Alice and Bob are free to make their own choices of compiler options. (But we still need Cargo to give them that control!)

For bootstrapping (or maybe even for regular builds), one could throw known-good stage0 compiler binaries for all supported build architectures into the source package.

We can bundle stage0 for bootstrapping, yes, but not after. See the policy here: https://fedoraproject.org/wiki/Packaging:Guidelines#Exceptions

Some software (usually related to compilers or cross-compiler environments) cannot be built without the use of a previous toolchain or development environment (open source). If you have a package which meets this criteria, contact the Fedora Packaging Committee for approval. Please note that this exception, if granted, is limited to only the initial build of the package. You may bootstrap this build with a "bootstrap" pre-built binary, but after this is complete, you must immediately increment Release, drop the "bootstrap" pre-built binary, and build completely from source. Bootstrapped packages containing pre-built "bootstrap" binaries must not be pushed as release packages or updates under any circumstances.

Note that Cargo has automation to ensure it builds on stable now, and I plan to have this continue to work relatively far back (~1.1 right now I believe) into the foreseeable future. Almost all of Cargo's deps compile on 1.0.0 (and have automation to ensure that) and only a few require 1.1.0.

This was the purpose of the links attribute in manifests as it allows packagers to override build scripts with whatever they like. I do imagine, however, that some crates will need modification to work in all cases, but in principle we have all the machinery in place, just gotta make sure it's in use.

We actually haven't discussed the current meatiest patch (NullCheckEliminationPass) with upstream, but there's definitely nothing blocking us from using stock LLVM. I compile and test with it from time to time (especially whenever I upgrade LLVM), and I would be surprised if stock LLVM 3.5 through 3.7 didn't work. @brson is right in that we need automation for this, but this should all have worked for some time now actually!

Note that build scripts are designed with this kind of use case in mind. For example the main OpenSSL bindings in Rust specify a links which allow you to completely override the build script entirely and it also by default uses pkg-config which allows even further customization of the build process while still assembling some custom shims rust-openssl uses. This sort of control should allow you to have any of the hooks you need to globally tweak how native libraries are linked into Rust programs through Cargo.
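For reference, such an override goes in `.cargo/config`: Cargo then skips the build script of any crate whose manifest declares `links = "openssl"` and uses the given values directly. The target triple and library paths below are examples, not a recommendation:

```toml
# Replaces the openssl-sys build script output wholesale for this target.
[target.x86_64-unknown-linux-gnu.openssl]
rustc-link-search = ["/usr/lib64"]
rustc-link-lib = ["ssl", "crypto"]
```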

I'm a little confused by this: are you looking to turn crates into Debian packages? I would be curious to hear why this would be necessary; otherwise I think I may be missing the motivation here for why this is happening.

One thing I've always been confused about with points like this is that at some point there has to be network activity, right? For example Cargo (and other Rust projects) fundamentally has dependencies which need to be fetched from the network at some point. When is the best time that this is normally done for packages in Debian? Is there a point where a "source tarball" is built and that source tarball is intended to contain the entire state of the world? It's pretty plausible to add a command to Cargo to do something like this!

In general an "offline mode" doesn't make a whole lot of sense in Cargo because Cargo already works totally fine offline, so long as all the dependencies are downloaded. You just have to arrange for Cargo at some point to have already downloaded the dependencies and placed them somewhere. In that sense Cargo only ever talks to the network when absolutely necessary, and the problem is timing exactly when this happens to be when you expect it to happen.

Ignoring build scripts, I don't see where this is the case. All of them can be overridden by using a .cargo/config with the appropriate paths.
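A sketch of that kind of `.cargo/config`, using hypothetical directories for system-installed crate sources:

```toml
# Path overrides: each entry is a local source tree (containing its own
# Cargo.toml) that cargo uses instead of fetching the same crate from
# crates.io. The /usr/share/cargo/registry layout is made up here.
paths = [
    "/usr/share/cargo/registry/libc-0.1.10",
    "/usr/share/cargo/registry/openssl-0.6.4",
]
```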

By "libraries" I meant to also include Rust libraries, not just native libraries: We may need to patch a Rust library and rebuild downstream Rust packages using the patched version. Actually, since almost all native libraries will be dynamically linked (and so won't need applications to be rebuilt), I guess most of these examples are going to be Rust libraries.

So: I don't think links or build scripts help here - unless I'm misunderstanding something.

I attempted to describe the motivation in the very section you replied to, so clearly something didn't work :stuck_out_tongue:

I'm not sure how to describe the rationale more clearly - after you've reread that post, perhaps you could describe how you see the distro security workflow proceeding for a fix to a Rust package?

There's a number of concerns driving such a simply worded policy:

  • Avoiding network at build time reduces a significant source of flakiness.
  • Allows offline development of packages.
  • Maintaining our own copy of upstream sources means we don't have to worry about upstream going away, or moving.
  • Some licenses (GPL, under an earlier interpretation) require the person giving you the binary to also make the source available.
  • Some upstream sources contain unredistributable pieces (requiring repacking), or aren't easily available at all from a well-connected upstream location.
  • Just making the sources available on disk in a standard location/format transcends whatever favourite build tool/language/version-control/archive/verification of the day is in use, allowing greater experimentation and freedom with such tools. For example, rustc upstream releases tarballs and make install whereas cargo uses git and a cargo-based build, yet both Debian packages can be processed with the exact same build machinery.

So the way this works for Debian, and most other distros, is that someone obtains, verifies, and possibly modifies the upstream source, adds whatever packaging details, and then uploads that to the distro archive. From that point on, the distro uses that copy of the upstream source and never contacts the original upstream location.

Re "entire state of the world": the packaging metadata describes versioned build-dependencies between packages, so the build environment for a particular package is assembled on-demand from individual packages prior to building that package. There is never a "whole world" source assembled because it would be too big, unnecessary with incremental package builds, couldn't be shared effectively between similar-but-different environments, and more importantly not everything can coexist all at once (llvm and gcc both want to provide "cc", for example).

Yeah, I have a feeling that cargo can already do quite a bit (perhaps all!) of what we need - and I just need to work out what files to construct and place where. I (or someone else) should probably start putting together a straw-man so we can talk about the non-obvious bits... It would be useful to have a cargo --offline flag that threw errors so we knew when we had failed to construct the right environment, but I guess we can fake that up with some iptables rules or other acts-of-sysadmin initially.

The override is a neat start, especially if we could specify the rustc-flags we want at any time (whether or not the crate declares `links`). That would let us use a global %optflags-like behavior after all! But if a build script is meant for other tasks too, say code generation, then we don't really want to inhibit that!
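For reference, the override being discussed is the `[target.<triple>.<name>]` table in `.cargo/config`, which (as I understand it) only takes effect for crates declaring a matching `links` key; a sketch, with illustrative triple and library names:

```toml
# .cargo/config — sketch: replace the build script of any crate that
# declares `links = "openssl"` with these fixed results
[target.x86_64-unknown-linux-gnu.openssl]
rustc-link-search = ["/usr/lib/x86_64-linux-gnu"]
rustc-link-lib = ["ssl", "crypto"]
```

The wish here is for something like this that applies globally, `links` key or not, without also suppressing code-generation build scripts.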

Aside, I find it odd to refer to foreign (non-rust) libraries as "native". It only serves to present rust as the outsider. If rust is to be a systems language, pervasive throughout the distro, then it has to "go native" too, no?

We don't want the whole world of crate dependencies bundled together, for the same reason we try to avoid "native" library bundling. Each crate that ends up in the distro should be maintained in one place, so they can be easily tracked as necessary. Realistically this means each is its own package.

For library crates, that means the "binary" package (the unit of package installation in the system) has to provide the library in a form that can be used to build dependent crates. For now, the only format that can be expected to work is installing the sources in some system-wide location and having a way to tell Cargo to use those sources instead of fetching from the network. A more efficient form of distribution would be to provide .rlib archives, but that means either maintaining backward compatibility of the static library metadata and ABI between Rust releases, ideally up to the next major version of Rust, or recompiling and updating all Rust packages at once whenever the compiler and the standard library are updated. Actually, once ABI backward compatibility is maintained, it should be a small change to switch to packaging dylib crates by default, resolving all concerns with bundling statically linked code.
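One mechanism Cargo already has for pointing at system-installed sources is the `paths` override in `.cargo/config`: each listed directory replaces the same-named dependency wherever it appears in the graph. A sketch, with hypothetical install locations:

```toml
# .cargo/config — sketch: use locally installed crate sources in place
# of the crates.io copies (paths and versions are hypothetical)
paths = [
    "/usr/share/cargo/registry/libc-0.2.7",
    "/usr/share/cargo/registry/bitflags-0.3.3",
]
```

If I understand it correctly, a `paths` entry only overrides crates that are already dependencies of the project, so it degrades safely when a package doesn't use a given library.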

Ok, on first reading it wasn't clear whether the references were to C or Rust libraries, but with both in mind this makes more sense. Note that I wouldn't consider myself an expert at all in packaging; I'm just trying to understand the problem space!

So with all that in mind, I'm still a little confused about how you might be expecting things to work out. It sounds like you want to mirror Cargo.toml as normal Debian build dependencies and not use Cargo for dependency management at all. This would allow you to understand the structure, have each source tree be independent, and patch any crate in the ecosystem if you need to. On the other hand, it also sounds like you want to use Cargo to build everything without needing to modify any sources, having it just pick up all the dependencies that happen to already be on the system. Does that sound right? Reconciling these two desires will be difficult, but it may indeed be possible.

We may also want to continue this discussion on a new thread (as I believe @brson wants to focus this primarily on packaging Rust/Cargo itself at least from the start).

True! I just use it to colloquially refer to "things normally found in a package manager" vs "normal rust crates on crates.io" kinda

While I think this makes sense, I think that something may need to budge somewhere on this. Unless all package managers are explicitly willing to entirely duplicate everything Cargo does for dependency management, we should probably start assuming that won't happen and strive to find other solutions.

For example, why do distros want to duplicate Cargo's dependency management? Or is it fair to say that distros want to do this? Is this only for security updates? If that's the only reason, then we can probably reach a more targeted solution, but it'd be good to explore this space first. (although like above we may want to continue off-thread to avoid getting too far in the weeds!).

The reason I mention perhaps finding another solution is that I'm not sure that distros really want a package-per-crate. Projects like servo have hundreds of dependencies, many of which are tiny, and expecting distros to keep up with the rate of change of the entire Rust ecosystem seems... unlikely? The other reason I feel distros don't want to do this is that it doesn't currently really make sense to "install a Rust library". Cargo explicitly will not look at the system for dependencies (this is, e.g., how it guarantees reproducible builds), and managing dependencies for a Rust project is idiomatically done through Cargo, not the system package manager.