Use petnames for crates

This proposal aims to solve the problem that names on crates.io cannot be reclaimed if they are blocked by an abandoned package. See also the thread "Reclaim Inactive Package Names through Community Voting" (post #18 by 2e71828) on The Rust Programming Language Forum for an alternative proposal.

As a side effect this proposal could also establish a cryptographic trust path to individuals signing releases.

I'm aware that this proposal still lacks a lot of details and might in parts be wrong.

Please read about Petname Systems. I also recommend "An Introduction to Petname Systems" linked from there.

TL;DR: In a petname system the user of an entity is responsible for naming the entity. The entity or a registry can only recommend a name for an entity. The identity of an entity is provided by a public key. Thus only a possessor of the corresponding private key can publish updates to an entity.

For any software package system (not just Cargo / crates.io) this would mean:

  • A new crate receives a tuple of (UUID, Nickname, [public key, ...]) as identity. The public keys correspond to the private keys of any person authorized to sign a release of the crate. The set of public keys can change over the lifetime of a crate.
  • A crate release gets signed by at least one of its private keys (e.g. via Git tag or a signed tarball) and published to any number of repositories for discoverability.
  • A crate dependency in Cargo.toml gets expressed as (UUID, Petname, [public key, ...]). The Petname can be the same as the Nickname and most of the time will be. A downloaded crate gets accepted for this dependency if it was signed with at least one of the keys.
  • The Petname is the one to be used in code.
  • A crate repository (like crates.io) should analyze the dependency graph of crates and use a pagerank-like algorithm to list crates by their Nicknames in search results. Popular crates thus "earn" their Nicknames over time but can also fall into oblivion again.
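
To make the tuples in the list above a bit more concrete, here is a minimal Rust sketch of the two records and the acceptance rule. It is only an illustration under stated assumptions: the names `CrateIdentity`, `Dependency`, `Release`, `verified_signers` and `key_set` are made up for this sketch, keys are plain strings instead of real public keys, and the actual signature check is assumed to have happened before `accepts` is called.

```rust
use std::collections::BTreeSet;

/// ASSUMPTION: a key is modelled as an opaque string; a real system would
/// store public key bytes and verify signatures cryptographically.
pub type KeyId = String;

/// What a crate announces about itself: (UUID, Nickname, [public key, ...]).
#[derive(Debug, Clone)]
pub struct CrateIdentity {
    pub uuid: String,
    pub nickname: String,
    pub keys: BTreeSet<KeyId>,
}

/// What a user records for a dependency: (UUID, Petname, [public key, ...]).
#[derive(Debug, Clone)]
pub struct Dependency {
    pub uuid: String,
    pub petname: String,
    pub trusted_keys: BTreeSet<KeyId>,
}

/// A downloaded release together with the keys whose signatures on it
/// were already verified elsewhere (hypothetical field).
#[derive(Debug, Clone)]
pub struct Release {
    pub uuid: String,
    pub verified_signers: BTreeSet<KeyId>,
}

/// Small helper to build a key set from string literals.
pub fn key_set(keys: &[&str]) -> BTreeSet<KeyId> {
    keys.iter().map(|k| k.to_string()).collect()
}

impl Dependency {
    /// The common case: adopt the crate's suggested Nickname as the Petname.
    pub fn from_identity(id: &CrateIdentity) -> Self {
        Dependency {
            uuid: id.uuid.clone(),
            petname: id.nickname.clone(),
            trusted_keys: id.keys.clone(),
        }
    }

    /// A release is accepted if it is for the same UUID and carries a
    /// verified signature from at least one trusted key.
    pub fn accepts(&self, release: &Release) -> bool {
        self.uuid == release.uuid
            && !self.trusted_keys.is_disjoint(&release.verified_signers)
    }
}
```

The user stays free to overwrite `petname` after `from_identity`; the acceptance rule only looks at the UUID and the keys, never at either name.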

As a side effect, such a system would provide cryptographic signatures of code authors and independence from any single repository. Repositories would just be search engines for crates that could link to any download location, torrent, IPFS, git repository, ...
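
The "pagerank-like" ranking of Nicknames by such a search engine could look roughly like the following sketch. This is plain PageRank power iteration over the dependency graph, which is only one possible reading of "pagerank-like"; the function names, the index-based graph representation and the damping value used in the example are assumptions of this sketch.

```rust
/// Plain PageRank by power iteration (a sketch, not a registry design).
/// `deps[i]` lists the indices of the crates that crate `i` depends on;
/// each dependency edge passes score from the dependent to the dependency.
pub fn rank_crates(deps: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = deps.len();
    if n == 0 {
        return Vec::new();
    }
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Every crate gets a small base score regardless of dependents.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        for (i, targets) in deps.iter().enumerate() {
            if targets.is_empty() {
                // A crate without dependencies spreads its score evenly,
                // so the total score stays constant.
                let share = damping * rank[i] / n as f64;
                for score in next.iter_mut() {
                    *score += share;
                }
            } else {
                let share = damping * rank[i] / targets.len() as f64;
                for &t in targets {
                    next[t] += share;
                }
            }
        }
        rank = next;
    }
    rank
}

/// Among all crates that suggest the Nickname `wanted`, return the index
/// of the one with the highest score, if any.
pub fn best_for_nickname(nicknames: &[&str], rank: &[f64], wanted: &str) -> Option<usize> {
    nicknames
        .iter()
        .enumerate()
        .filter(|(_, n)| **n == wanted)
        .max_by(|a, b| rank[a.0].total_cmp(&rank[b.0]))
        .map(|(i, _)| i)
}
```

With two crates both nicknamed `json`, the one that other crates actually depend on ends up first in the search results, and it loses that position again if its dependents move elsewhere.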

Cargo should manage a local registry of Petnames to (UUID, Nickname, [key, ...]) tuples. A user that uses crate A in one project with Petname B probably wants to use it with the same Petname in all projects.
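
A local registry like this could be as simple as a map from Petname to the crate tuple. The sketch below is hypothetical: `PetnameRegistry`, `CrateRef` and `BindError` are invented names, and the policy of refusing to rebind a Petname to a different UUID is an assumption, not something the proposal specifies.

```rust
use std::collections::HashMap;

/// The tuple a Petname points at: (UUID, Nickname, [key, ...]).
/// ASSUMPTION: keys are opaque strings in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct CrateRef {
    pub uuid: String,
    pub nickname: String,
    pub keys: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum BindError {
    /// The Petname already refers to a different crate.
    PetnameTaken { existing_uuid: String },
}

/// User-local mapping from Petname to crate, shared across projects.
#[derive(Debug, Default)]
pub struct PetnameRegistry {
    by_petname: HashMap<String, CrateRef>,
}

impl PetnameRegistry {
    /// Bind a Petname to a crate. Rebinding to the same UUID updates the
    /// entry (e.g. a changed key set); a different UUID is rejected.
    pub fn bind(&mut self, petname: &str, target: CrateRef) -> Result<(), BindError> {
        if let Some(existing) = self.by_petname.get(petname) {
            if existing.uuid != target.uuid {
                return Err(BindError::PetnameTaken {
                    existing_uuid: existing.uuid.clone(),
                });
            }
        }
        self.by_petname.insert(petname.to_string(), target);
        Ok(())
    }

    /// Look up the crate a Petname refers to.
    pub fn resolve(&self, petname: &str) -> Option<&CrateRef> {
        self.by_petname.get(petname)
    }

    /// Reverse lookup: which Petname did the user give to this UUID?
    pub fn petname_for(&self, uuid: &str) -> Option<&str> {
        self.by_petname
            .iter()
            .find(|(_, c)| c.uuid == uuid)
            .map(|(p, _)| p.as_str())
    }
}
```

Two crates that both suggest the Nickname `json` then simply end up under two different Petnames chosen by the user, and the reverse lookup lets a new project reuse the Petname already chosen elsewhere.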

Any signature on a crate release is also a public log of the public keys of any dependencies of this crate. This way public keys build up reputation over time without the need for key-signing parties at Rust conferences. (Although these would still be beneficial.)

I’m reading suspiciously little about backwards compatibility.

You write

but more than 100,000 crates already exist on crates.io, and even more crates inside and outside of crates.io specify dependencies on crates from crates.io; so all that could be added is some new, optional mechanism on top of the existing one, right?


One property I keep reading there is “decentralized”. Would this really be a good fit for a centralized system like crates.io?


Do note that package names on crates.io already are “only” a suggested/default name. Actually, it’s even less strict: the package name can even be different from the name of the library crate, and then, any user is still free to choose a different name to use their dependency under, anyway.

It could be a foundation for building an ecosystem of alternate registries, allowing the same crate uploaded to different registries to be successfully unified by Cargo and/or securely downloaded from an alternate source.

For me, one of the fundamental design constraints of the existing registry system is that a dependency won't disappear on you. That goes away with a decentralized system.

  • Backwards compatibility: Yes, I was assuming that a petname system would be an optional addition to the already existing infrastructure. Time would tell which one would become more popular.
  • Renaming dependencies: Yes, it is already possible to rename dependencies and thus the problem of names taken on crates.io is more of a nuisance than a blocker for anything. Still, humans tend to really like naming things and to give much importance to names. There are even international conflicts over names of some geographic entities. My proposal is: Name the thing any way you like and we will see which name becomes popular for what entity.

Yes, thank you. I hadn't thought about that. Thus it would not be that much of a problem if crates.io went away, since it would only be a search engine with an archive. The same dependency could still be downloaded and validated from any other place on the internet.

Au contraire! Just include a torrent client with Cargo and every Rust developer can volunteer to serve dependencies from their local cache. More popular and recent crates will automatically be served by more people. Availability also automatically follows the sun.

Any less popular crate will disappear over time. The same goes for older versions of popular crates. That can make it impossible to build an old version of a project when you need to.

Nothing is going to disappear with this proposal. Right now you depend on exactly one foundation from one country, supported by a few tech giants, to run crates.io and keep your dependencies available. With this proposal anybody can set up a mirror and each mirror is as valid as any other.

Anybody can start operating a Debian mirror because the files are signed by Debian. Anybody could start operating a mirror of crates if these were signed with trustworthy signatures and did not depend on the crates.io DNS entry.

Right now the security of crates.io relies on (my interpretation):

  • The crates.io DNS entry
  • The security of the index Git repo
  • The GitHub login credentials of any crate maintainer
  • The security of the ~/.cargo/credentials.toml file on the machines of crate maintainers

So not only would this proposal allow the crates ecosystem to remain available independently of any single foundation; it would also increase the security of the system.

Petnames don't imply mirrors and mirrors don't need petnames (https://github.com/rust-lang/rfcs/pull/3724, if accepted, would allow untrusted mirrors of crates.io too). In any case, I was talking about your suggestion to use BitTorrent.

This seems like a rather significant step backwards in terms of ease of use and simplicity in reasoning about how things work. I believe a typical user has no problem with crates.io being a centralized repository and just wants to find and get crates quickly and easily. If anything, that centralization is a benefit.

I believe your idea here does too many things at once, and I believe you could/should try to split it up into more minimal ideas for changes.

Some ideas/thoughts for illustration of possible splitting points include:

For example, cargo already supports decentralization in the form of git dependencies. Your proposal reads like you’re suggesting crates.io to become part of a larger decentralized ecosystem, which could mean a crate on there might be able to depend on another crate that isn’t part of crates.io itself.


Currently crates.io does not support hosting crates with git dependencies, or dependencies on crates in other registries. The reasoning behind this is what @epage called “one of the fundamental design constraints of the existing registry system”. If your proposal is to change this property, this can be a feature/discussion on its own, which absolutely doesn’t need to include e.g. any petnames functionality; and conversely petnames could work without the need for changing this design aspect of registries, or crates.io policies.


Another thing that this seems to propose is a more unified way of specifying dependencies. Your proposal reads as if it wants to give crate registries (you sometimes call them “repositories”) a more equal position/rank, next to other sources for dependencies. Currently one important thing that only registries do is to offer versioned dependencies, with semver-based versioning and version resolution, and corresponding tooling. Ways of extending these mechanisms to other forms of alternate sources can be its own feature, too[1], and would be a necessary precondition (as far as I can tell) for any “Repositories would just be search engines for crates” vision that you might have.

You do give a few design ideas on how to make this work involving signed releases and meta-information shared through custom registries/“repositories”. I believe it would be important for such a proposal to explore the design space a bit more broadly; e.g. one could consider more minimal approaches; for instance:

  • some way of including versioning within the git repository itself without a full additional mechanism of custom decentralized-crates registries?
  • taking a closer look at the existing mechanism for alternate registries and identifying shortcomings in the existing features/mechanisms when trying to use it for this kind of purpose… as a starting point, one could read through previous discussions e.g. around the RFC that introduced it

You do list a lot of technologies (even open-ended, with a trailing “…”) with “link to any download location, torrent, IPFS”, and of course any new kind of technology (besides the existing ones of git dependencies, and alternative registries). Of course, support for any such technology could be its own feature idea.

IMHO, to get towards any actually useful/realistic/successful proposal, it is pretty much necessary to split this up into multiple parts/steps. Then you could reflect on which part/step seems most relevant/appropriate/useful to target first, and focus on making a proposal out of only that part.[2] (For some more thoughts on paths forward, also check out my final 2 paragraphs in this reply.)

I’m also noticing increasingly, the longer I consider the ideas you bring up here, that there’s a lack of motivation; you only cite a “problem that names on crates.io could not be reclaimed if they are blocked by an abandoned package” (some people might debate whether this is even a problem at all, and “abandoned package” is awfully vague, anyway). Even now, I haven’t really gained any good understanding of which aspect of your ideas is actually involved in solving this problem, and in what way it is solved through petnames in the first place (or through whichever other aspects of your ideas here are relevant).


Even for a not-yet-very-detailed kind of idea, it’d be great if you tried to list some drawbacks and alternatives already, and/or identify unresolved design questions. Searching for drawbacks naturally comes with some more research, e.g. into prior discussions, to understand the values & design decisions behind the status quo. It is a bit more effort, but readers appreciate seeing that you have already put in this effort. Being critical of one’s own ideas (by listing drawbacks and open questions) also sets up for a more productive discussion, anyway:

If someone else comes up with additional shortcomings, you don’t need to feel obliged to address those immediately. If they seem like a reasonable observation on points where a trade-off ends up being made, you can just (mentally or actually) include them in the list of drawbacks you’ve already identified yourself; if they seem to be based on a misunderstanding of your proposal, they could instead help you identify parts of the proposal that aren’t written down sufficiently clearly yet; and if the drawbacks apply only to some potential ways of fleshing out the ideas more concretely, you can note them as open questions.

Looking at some prior discussions, and understanding the status quo, is also very relevant to identifying alternatives. One alternative is often the status quo itself. Other alternatives might already exist in documented “future possibilities” of previous RFCs. Alternatives also help channel your own creativity into a direction that doesn’t just make the proposal larger and larger[3]. And last but not least, when you try to lay out alternative solutions to a problem, you will probably notice if you haven’t identified your motivation / motivating problems in sufficient detail.


And perhaps try finding a bit more on prior art as well. You link to a Wikipedia page, but I don’t see any mention of how & where "petnames" are already used for package managers for software libraries. Really proper prior art saves you a lot of work.[4] You don’t need to come up with all the points of motivation, drawbacks, alternatives, etc… yourself, if you aren’t proposing a novel concept, or a novel application of a concept in a different context.


Your choice of “Pre-RFC” labeling for this thread does suggest this might be a serious proposal that you hope could lead somewhere real, which is why I’m giving pointers involving in particular these keywords above that come from section headers of the template for actual RFCs. You already noted that “this proposal still lacks a lot of details”, which is a good observation.[5]

In my experience, a typical “Pre-RFC” post also tends to either (roughly) follow the structure of real RFCs already, or at least contain more of the relevant contents for turning it into an actual RFC eventually. In this thread, I see an unrealistically large scope, a lack of sufficient real details, and a lack of consideration regarding the abovementioned keywords/headers (i.e.: motivation [in sufficient detail], drawbacks, alternatives, prior art), so writing an actual RFC doesn’t seem like the best next step to take.

So I hope it makes sense that I’m changing this topic’s title to remove the “[Pre-RFC]” prefix :wink:


  1. like, again, really its own point, as this already makes sense to discuss for existing alternative (non-registry) sources, like git dependencies, or path-based ↩︎

  2. You can keep the whole context of the idea, and/or at least some of the imagined subsequent parts/steps, sketched out as “future possibilities”. But of course, when minimizing an idea like this, each step (especially the first) needs to make sense and be useful on its own, even if nothing else from the overall big idea ever follows (possibly not at all, or at least not in the concrete form imagined) ↩︎

  3. too much growth in the direction of “more features/capabilities” makes it less likely to lead to concrete actual results; listing out more alternatives instead should tend to make it more likely to result in anything real ↩︎

  4. That’s a benefit on top of other ones like making implementation much easier, helping to flesh out more minor design details later, improving teachability to users (especially if the prior art has a significant user-base itself), etc… ↩︎

  5. My suggestion, in case you were trying/planning to fill in all the details yourself, is to really consider limiting the scope first; otherwise you might be spending much effort on something that seems really unlikely to be accepted.

    Or perhaps you didn’t even plan to pursue this further than sharing your initial ideas anyway, I can’t really know that. ↩︎

You can already publish crates using UUIDs as package names. The only limitation is that they have to start with a letter. And users of these crates can rename them however they want. What exactly does this proposal achieve that isn't possible at the moment?
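
As a rough illustration of the "has to start with a letter" limitation: the check below is only an approximation of the naming rules (ASCII letter first, then ASCII alphanumerics, `-` or `_`, and a length limit of 64 that is assumed here), and the `u-` prefix is an arbitrary choice. Both function names are made up for this sketch.

```rust
/// Approximate package-name check. ASSUMPTION: this mirrors the rules only
/// roughly (first character an ASCII letter, then ASCII alphanumerics,
/// '-' or '_', at most 64 characters); consult the Cargo and crates.io
/// documentation for the authoritative rules.
pub fn looks_like_valid_package_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(first) if first.is_ascii_alphabetic() => {}
        _ => return false,
    }
    name.len() <= 64 && chars.all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Turn a textual UUID into a name that passes the check above by adding
/// a letter prefix (the "u-" prefix is a hypothetical convention).
pub fn uuid_package_name(uuid: &str) -> String {
    format!("u-{}", uuid.to_ascii_lowercase())
}
```

A UUID whose text form begins with a digit fails the check as-is, while the prefixed form passes; users could then rename the dependency to whatever they like in their own manifest.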

Besides, you should consider why nobody is using UUIDs for package names: Short, memorable, unique names have a lot of benefits. If I understand the proposal correctly, there could be multiple packages with the same nickname but different UUIDs. This means that, in order to know which packages a project depends on, it's not enough to look at their names in Cargo.toml; I have to look up their UUIDs. And when someone recommends a crate to me, they have to send me the UUID to ensure I don't end up with the wrong one.

The English language has enough words; package names aren't a scarce resource. You can argue that "good" package names are scarce, but less than perfect names (e.g. serde2, serde3, ...) are still better than not having unique names at all.

I guess petnames make sense for decentralized registries, but there are good reasons why crates.io is centralized, and changing that would require a really strong motivation.

Petname systems make a lot of sense for identifying people, because people know each other, have communities, and we still can tell people apart, so it's relatively easy to agree "yes, we all mean that person".

Crates are more global. We rarely discover them through some crate-by-crate network, and often may need to find an arbitrary dependency for any given task. It's also harder to tell dependencies apart when they're just code that can be cloned and forked, so merging and disambiguating different sources is much trickier. We still need global names to unambiguously add crates to Cargo.toml/lock, deduplicate them in the build, report metadata about them (like vulnerability reports), etc.

We also have a catch-22: we don't have decentralized identities, and we don't have decentralized registries. When people want to establish what the "real" name of a crate is, it won't be the UUID or the public key, it will be the crates.io name and the owner's GitHub account. Having just one aspect decentralized won't have the power to shift everyone away from the existing centralized solutions, and instead of creating an actually decentralized system, it will just create a confusing layer around the centralized one.

There's another aspect where the scarcity of good names has a positive side: it encourages contributing to the crates with good names, instead of creating forks.
