Pre-RFC: Support package sets in cargo

I personally think package sets are one promising way of solving some of the problems with package repositories like Cargo. Please comment!

Summary

Support the use of package sets in cargo. Package sets are curated lists of (package-version) pairs that are known to work together.

Motivation

There have been many discussions about the downsides of Cargo's model for crate ownership. As we have seen in the npm ecosystem, bad actors use a number of tricks to get malicious code onto users' systems. These include persuading the owner of a much-used package to transfer ownership; creating a crate, getting it included as a dependency of a much-used crate, and then introducing malicious code in a patch release; and releasing a crate whose name is a common misspelling of a popular crate's name, containing that crate's code with malicious, hard-to-notice changes.

A separate issue with package repositories is that, during dependency resolution, multiple incompatible versions of a crate may be required to satisfy the version constraints of the dependencies. Situations where it may not be possible to proceed with compilation include, amongst others, crates that reference foreign code built using a build.rs.

One approach that addresses both of these problems is package sets. As mentioned in the Summary, these are curated lists of (package-version) pairs that are known to work together. Deconstructing this sentence:

  • curated - meaning that a person or group of people has selected the (package-version) pairs. This allows the curators to vouch for the contents of a particular version of a package, even if the package author is not trusted. The package author cannot introduce malicious code in the future, as only a specific version of the package is ever included in the package set.
  • (package-version) pairs - the package set is simply a list of pairs of packages and versions. If the package set is used, then only packages in the list can be installed, and each package is pinned to one specific version. The problems associated with multiple versions of the same package cannot occur.
  • known to work together - it is up to the curator of the package set to ensure that all packages in the set work together.

There have been previous discussions about a batteries-included version of libstd, and package sets are a way to facilitate this in Cargo without blessing any particular list of packages.

Guide-level explanation

In Cargo.toml, there is a top-level key package-set that can optionally be set to a URL. For example:

# Proposed top-level key; it must appear before any table header to stay top-level.
package-set = "https://domain.tld/path/to/package/set"

[package]
name = "my-crate"

[dependencies]
rand = "*"
hyper = "*"
# ...

In this example the versions have been left unspecified since the package set selects a specific version that we already know. We can specify a more specific version constraint if we like, and Cargo will reject the dependency if our constraint does not include the version in the package set. Cargo will also reject the dependency if it is not in the package set at all.
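
The acceptance rule described above can be sketched in Rust. This is only an illustration under stated assumptions, not Cargo's resolver: the names `PackageSet`, `Requirement`, `Rejection`, and `resolve` are hypothetical, versions are plain (major, minor, patch) triples, and only two kinds of constraint ("*" and an exact version) are modelled.

```rust
use std::collections::HashMap;

/// Simplified version triple (major, minor, patch); an assumption for this sketch.
type Version = (u64, u64, u64);

/// Hypothetical in-memory package set: each package is pinned to exactly one version.
type PackageSet = HashMap<String, Version>;

/// Tiny stand-in for a version constraint written in Cargo.toml.
#[derive(Debug, Clone, PartialEq)]
enum Requirement {
    /// `"*"`: accept whatever version the package set pins.
    Any,
    /// An exact version: accept only if the set pins this same version.
    Exact(Version),
}

/// Reasons the dependency is rejected under the proposed rule.
#[derive(Debug, PartialEq)]
enum Rejection {
    /// The package does not appear in the package set at all.
    NotInSet,
    /// The package is in the set, but our constraint excludes the pinned version.
    PinnedVersionExcluded { pinned: Version },
}

/// Returns the pinned version if the dependency is acceptable, otherwise the reason it is not.
fn resolve(set: &PackageSet, name: &str, req: &Requirement) -> Result<Version, Rejection> {
    let pinned = *set.get(name).ok_or(Rejection::NotInSet)?;
    match req {
        Requirement::Any => Ok(pinned),
        Requirement::Exact(v) if *v == pinned => Ok(pinned),
        Requirement::Exact(_) => Err(Rejection::PinnedVersionExcluded { pinned }),
    }
}
```

With a set that pins only rand, `resolve(&set, "rand", &Requirement::Any)` yields the pinned version, while asking for hyper, or for a different exact version of rand, is rejected.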

Reference-level explanation

TODO

Drawbacks

TODO

Rationale and alternatives

TODO

Prior art

This design is inspired by Spago, the PureScript package manager and build tool, which uses package sets.

Unresolved questions

TODO

Future possibilities

In the future Cargo could host package sets itself.

As some prior art, Cargo already supports a separation of crates and packages (i.e. a package can contain many crates). This is already leveraged if you e.g. have a library with a src/bin directory.

Here is a more concrete proposal for allowing multiple crates in a single package:

There's a lot of prior discussion on this one, so lemme add what are imo the best recent threads:

The curation part seems to overlap with cargo-crev. Crev also allows a user to vouch for a particular version of a particular crate. On top of that it adds management of these reviews and a web of trust for reviewers, so you're not limited to a particular set, but can combine all reviews into an assessment of your project.

Limiting the set to specific versions makes me wonder how to deal with adding new crates to the project that aren't in the set, and with combining sets. If one set says libc = "0.2.50" is OK, and the other set says libc = "0.2.51" is OK, Cargo will have to choose one or report a conflict. If it reports a conflict, that will be super painful to work with. If it relaxes version requirements, then the "known to work together" part won't be satisfied.
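
To make the conflict case concrete, here is a rough Rust sketch of the strict option, where combining two sets fails as soon as they pin different versions of the same package. Nothing here is Cargo or proposal API: `PackageSet`, `Conflict`, and `merge` are hypothetical names, and versions are simplified to (major, minor, patch) triples.

```rust
use std::collections::BTreeMap;

/// Simplified version triple (major, minor, patch); an assumption for this sketch.
type Version = (u64, u64, u64);

/// Hypothetical package set: each package name maps to its single pinned version.
type PackageSet = BTreeMap<String, Version>;

/// Two sets pin the same package to different versions.
#[derive(Debug, PartialEq)]
struct Conflict {
    package: String,
    left: Version,
    right: Version,
}

/// Strict merge: succeeds only if every package shared by both sets is pinned
/// to the same version; otherwise returns every conflict found.
fn merge(left: &PackageSet, right: &PackageSet) -> Result<PackageSet, Vec<Conflict>> {
    let mut merged = left.clone();
    let mut conflicts = Vec::new();
    for (name, &version) in right {
        match merged.get(name).copied() {
            // Same package, different pin: record the disagreement.
            Some(existing) if existing != version => conflicts.push(Conflict {
                package: name.clone(),
                left: existing,
                right: version,
            }),
            // Same package, same pin: nothing to do.
            Some(_) => {}
            // Package only in the right-hand set: carry it over.
            None => {
                merged.insert(name.clone(), version);
            }
        }
    }
    if conflicts.is_empty() {
        Ok(merged)
    } else {
        Err(conflicts)
    }
}
```

Merging a set that pins libc at (0, 2, 50) with one that pins it at (0, 2, 51) returns a single `Conflict` for libc. The relaxed alternative would have to pick one of the two versions, at which point neither curator has vouched for the combined result.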

The crates ecosystem uses semver pretty well. Some breaking changes slip through from time to time, but these problems get fixed quickly. In my experience it's enough to pin the crate version and report the problem to the crate author. So overall, crate breakage hasn't been enough of an issue for me to need someone to verify whether some crates and crate versions will work together. They do.


Apart from security and compatibility, the sets also overlap with curation of a "wider standard library". That has been an issue for some users who expect libstd to have everything they need built-in. In this case curation of some bigger de-facto-standard library would be very useful. Maybe https://lib.rs/stdx should be revived?
