Introduction
Federation is a mechanism by which different registries can stay consistent with each other.
Motivation
We want separate, independent registries. We want projects to have their own, official registries. We want them to go down without causing everyone else issues. We want to be able to audit them. We want to be able to restrict them. We want to be able to adjust them as we see fit.
Goals
- Less dependence on crates.io mod team to do what we want.
- Consistency/deduplication across registries.
- “Local” (same city/country/continent/ocean/etc.) registries for improved caching (especially behind e.g. a satellite connection). Since crates.io (currently) never uses plain HTTP for anything, it’s impossible to use a transparent proxy here.
- Official registries, e.g. “the Diesel project registry”/“the Tokio project registry”.
- Consensual redundancy. (user must manually select the registries that are still alive in order to use them - no MITM potential.)
- Allowing crates that depend on example.org/foo to be published on crates.io.
Non-goals
- Allowing crates to be published on any registry and later transferred to the proper registry based on namespace and signature. (this could be left to another RFC, as it’s not incompatible with this design.)
- Mastodon compatibility.
- ??? [WIP]
Outline
- This RFC defines a namespacing and registry management technique/system called “federation”. The namespaces are the domain names of the instances/registries hosting crates.
- Additionally, we bring forth some important changes to Cargo and crates.io:
  - Dependencies of the form depname = "version" will generate a deprecation warning, unless the crate explicitly sets a default registry. If no default registry is set, crates.io is used.
E.g., the following produces no warning, because default_registry was specified:
[package]
name = "example"
version = "0.0.0"
authors = ["foo@bar.example.org"]
default_registry = "crates.io"
[dependencies]
foo = "0.3"
Had default_registry not been specified, it would’ve produced a deprecation warning. crates.io should thus accept no new packages that don’t specify a default_registry. It’s also possible to not use default_registry at all, and instead specify foo = {registry = "crates.io", version = "0.3"}. [TODO: either figure out how to specify the registry as part of the name, e.g. foo@crates.io = "0.3", or mark this as an unresolved question]
  - Cargo gets a new config option: the user’s instance. This is the instance to be used for fetching all crates, as well as publishing crates. Additionally, the publishing instance may be overridden by Cargo.toml or a command-line option. This will default to crates.io.
  - As mentioned previously, all crates are fetched through the user’s instance, which defaults to crates.io. As such, crates.io will (permanently) cache remote crates, and there’ll be an API for remote registries to push their updates to crates.io. (This makes security updates propagate faster.)
  - We also expand the current policy of DMCA takedowns - take down the crate and all crates that depend on it - to also include cached crates. Additionally, to keep such crates from resurfacing, they also get added to a blacklist, which is checked whenever new remote crates are pulled in.
  - [??? I may be forgetting something here but I can’t remember what it was]
- This RFC defines a limited API between registries. It also defines some things that need to be registered with various other things - MIME types, .well-known’s, etc.
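The default_registry rule from the outline above can be sketched roughly as follows. This is only an illustration, not a proposed Cargo implementation: the function name `resolve_dependencies` and the warning text are made up, the manifest is assumed to be already parsed into a dict, and treating a table-form dependency without a registry key the same as the short form is an assumption this Pre-RFC does not settle.

```python
# Hypothetical sketch of the proposed default_registry rule.
# Key names follow the Cargo.toml example above; everything else
# (function name, warning text, table-form handling) is an assumption.
FALLBACK_REGISTRY = "crates.io"


def resolve_dependencies(manifest):
    """Return (resolved, warnings) for a parsed Cargo.toml-like dict.

    resolved maps each dependency name to a (registry, version) pair.
    """
    package = manifest.get("package", {})
    default = package.get("default_registry")
    resolved = {}
    warnings = []
    for name, spec in manifest.get("dependencies", {}).items():
        # Normalise the short form `foo = "0.3"` to a table.
        if isinstance(spec, str):
            spec = {"version": spec}
        registry = spec.get("registry") or default
        if registry is None:
            # Neither the dependency nor the package names a registry:
            # fall back to crates.io and emit the deprecation warning.
            registry = FALLBACK_REGISTRY
            warnings.append(
                f"dependency `{name}` has no registry; set default_registry"
            )
        resolved[name] = (registry, spec["version"])
    return resolved, warnings
```

With the example manifest above (default_registry = "crates.io", foo = "0.3"), this resolves foo to crates.io with no warning. Dropping default_registry keeps the same resolution but adds one warning for foo.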
Flowchart
[TODO: explain this in English]
cargo update
|
v
GET https://$USERREGISTRY/.well-known/cargo.txt
|
v
GET https://$USERREGISTRY/$APIPATH/index.whatever
|
v
GET https://$USERREGISTRY/$APIPATH/crate/$NAMESPACE/$NAME --> [server] GET https://$NAMESPACE/.well-known/cargo.txt --> etc, same process as above --> add it to local index
|
v
build local index
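As a rough illustration of the cargo update flow above, the sequence of client-side GET requests could be built like this. The contents of cargo.txt are not defined by this Pre-RFC, so the sketch assumes it is a single line holding $APIPATH. The helper names are hypothetical, and index.whatever is kept as the placeholder it is in the flowchart.

```python
# Sketch of the URLs a client would request during `cargo update`.
# ASSUMPTION: cargo.txt is a single line containing $APIPATH; the real
# format is not specified by this Pre-RFC. All function names are
# hypothetical.
def well_known_url(registry):
    return f"https://{registry}/.well-known/cargo.txt"


def parse_api_path(cargo_txt):
    # Strip whitespace and surrounding slashes so URL joins stay clean.
    return cargo_txt.strip().strip("/")


def index_url(registry, api_path):
    # "index.whatever" is the placeholder name used in the flowchart.
    return f"https://{registry}/{api_path}/index.whatever"


def crate_url(registry, api_path, namespace, name):
    return f"https://{registry}/{api_path}/crate/{namespace}/{name}"


def update_requests(registry, cargo_txt, crates):
    """List the GET URLs in order, for crates = [(namespace, name), ...]."""
    api_path = parse_api_path(cargo_txt)
    urls = [well_known_url(registry), index_url(registry, api_path)]
    urls += [crate_url(registry, api_path, ns, n) for ns, n in crates]
    return urls
```

Note that every crate request goes to the user’s registry, even for crates in a remote namespace. The fetch from $NAMESPACE happens on the server side and is not modelled here.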
cargo publish
|
v
GET https://$USERREGISTRY/.well-known/cargo.txt
|
v
POST https://$USERREGISTRY/$APIPATH/publish
|
v
[server] for each $REGISTRY in $REMOTE_REGISTRIES do < GET https://$REGISTRY/.well-known/cargo.txt --> POST https://$REGISTRY/$APIPATH/publish >
(The server-side push in the last step is there so that remote registries don’t need to keep spamming GET requests. It increases overall fediverse stability and improves security, as security updates are pushed rather than pulled.)
Notes: Both servers can decide which servers they will or won’t talk to. They can also decide what kind of operations they’re gonna accept or disallow. For example, an audited registry would not accept remote publishes.
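The per-server policy in the notes above could look something like the following sketch of the publish fan-out. The `RegistryPolicy` type, its fields, and `plan_fanout` are all hypothetical names. In a real deployment the remote side would simply reject the request, whereas this sketch models both sides in one process for clarity.

```python
from dataclasses import dataclass, field


# Hypothetical policy record; the field names are assumptions, not part
# of any existing registry API.
@dataclass
class RegistryPolicy:
    domain: str
    blocked_peers: set = field(default_factory=set)
    accepts_remote_publish: bool = True


def plan_fanout(origin, remotes):
    """Return the domains that `origin` would successfully push a publish to."""
    targets = []
    for remote in remotes:
        if remote.domain in origin.blocked_peers:
            continue  # origin refuses to talk to this peer
        if origin.domain in remote.blocked_peers:
            continue  # peer refuses to talk to origin
        if not remote.accepts_remote_publish:
            continue  # e.g. an audited registry
        targets.append(remote.domain)
    return targets
```

For example, if crates.io blocks spam.example and audited.example does not accept remote publishes, a publish on crates.io would only be forwarded to the remaining registries.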
When publishing, the server needs to specify some things about itself, like its domain, signature/certificate, etc. The signature/certificate is there to prevent impersonation. The actual implementation details of this are [WIP/Pre-RFC]. The same mechanism could be used to authenticate clients. (No OAuth tokens - use public keys.)
Unresolved Questions
???
Prior Work?
The most widespread federated protocol is plain old email. It’s not the most suitable source of inspiration for a federated package manager, however.
The second most widespread federated protocol is probably ActivityPub. It definitely has had more success than XMPP in the “federated” (server-to-server communications) part. This is what this RFC is based on.
???
???