Crates.io "first-come, first-served" for plain crate names - why and how should it be changed?

Cargo, as a package manager, needs a way to identify published packages. Currently it uses the crate name registered on crates.io as the identifier, which is probably a time bomb, IMO.

Story of online-naming

The registration policies of online communities have evolved over time. Before 2010, I would often register a unique username in a community, and that username was displayed to identify who I was.

Later on, almost all modern platforms adopted the concept of a "display name", which doesn't need to be unique. Of course, people still need to check your unique username if they want to identify you, but the display name shows what you would normally like to be called, instead of being a unique identifier in said community.

For popular platforms, unique usernames are often not intuitive. A person with a display name John Smith might choose a username john-smith-7985 or name-too-common-haha. In practice, the username is hardly ever typed by anyone else, and is typed by the owner himself only when logging in.

In short, it is common for entities to be labeled with a long-and-ugly unique name (identifier) together with a short-and-nice display name.

Short identifiers are, of course, possible, but often not available for those who want to name themselves with a popular name. It's not his fault if someone wants to be called "John Smith" even if he knows there are lots of John Smiths out there. Everything is ok as long as people can still identify him by the unique username.

Back to crates.io

I've seen a lot of threads related to crates.io naming, namespacing, squatting, etc. I think a lot of people have missed a point.

Namespacing is just a means, not the goal itself. As for me, I just want to be able to tell users how I want my library to be called in their code by default, even if there're crates with the same name registered on crates.io before.

Actually, when people are forced to rename their crates simply because of a crates.io name conflict, they don't often come up with new names that better suit the nature of those crates. More often than not, a prefix or suffix is used, which is just an alternative to namespacing.

Proposal

Instead of introducing namespaces which, in effect, make the fully qualified name of every crate "long-and-ugly", we can introduce a new attribute equivalent to "display name" of a crate, which in real life works the same way as a "default alias".

Pros:

  • Crates are displayed and used with their desired names from their creators, instead of being renamed solely to avoid name conflicts, which might lead to less intuitive names appearing repeatedly in code if their users don't manually alias them.
  • The search box on crates.io would give the display name the same weight as the crate name, and a keyword match on both names should not be counted twice (only the larger score should be taken into account). There's no reason a crate named foo should look better than another named myorg-foo and be placed in front of it simply because the former is "first-come, first-served", even if the latter was intended to be called foo when it was designed and is better in quality.
  • Lib developers wouldn't need to register names in advance for reservation. They wouldn't need to do a last minute crate-renaming either. Whether to remove empty placeholder crates on crates.io is another issue, but as squatting wouldn't do much benefit any longer, fewer people would be doing this or worrying about it naturally.
  • Backward compatibility is good, as the new "display name" attribute is completely incremental, and it defaults to the original name of the crate. If we implement it as a "default alias" for Cargo.toml, as we can now alias a crate with the rename-dependency Cargo feature, then legacy code should not be broken by this change. In fact, the change is just automating what is regarded as a solution / workaround in this issue

Cons:

  • Developers who are familiar with old crates.io might accidentally import wrong crates, but only if they manually type cargo add displayname / displayname = "<version>" instead of copying the command (which would correctly use cratename/rename-syntax over displayname) from the page.
  • Similarly, an experienced developer might come across code depending on a crate whose name is familiar to him, but not the actual crate that he knew - but this is something that might happen even now with dependency-renaming
  • Some people don't like the idea of two names for one crate, and the possibility of multiple crates sharing the same display name, even if it's a simplified version of the namespacing approach.

Things affected

  • Cargo can no longer assert that the crate name on crates.io (the identifier) equals the crate name used in its source code. Some extra mapping would have to be done. Or, if this feature is not implemented, then the crate developer would still have to manually rename the crate in his code. Hopefully with refactoring tools this is not a big issue.
  • crates.io display name & install command should be changed accordingly.
  • crates.io search mechanism should be adjusted (see above).

This proposal aims at splitting the crates.io crate name's role as 1) a unique identifier for Cargo and 2) a way that the crate should be referred to in source code / doc / online contents, into two names. Hope it helps.

Note that Cargo/crates.io already support two different names of a crate:

[package]
name = "foo"
version = "0.1.0"
edition = "2021"

[lib]
name = "bar"

Package name is a unique name on crates.io, while lib.name is what one uses in the source code.

I suggest a crate identifier be a random ID, something like ION/DID https://blog.ipfs.tech/ion-a-path-to-decentralized-identity/. AFAICT the name used by people has no requirement to be unique, it's just a search term in normal usage.

Given the fact that the library/package names don't have to be the same already, afaict this boils down to:

Along with the unmentioned:

  • Change community norms to publish crates with differing library/package names

That seems like the hard step to me. There are very few packages that override their library name currently; the only one I can think of is futures-await, which overrides it to futures as it was a temporary fork. I don't know how you would even start trying to move the community on this point.

Package name is a unique name on crates.io, while lib.name is what one uses in the source code.

I checked a few random crates on crates.io and found no one using this feature, probably because:

  1. If one has a name conflict when publishing to crates.io and searches for solutions / workarounds, nothing turns up that points to this approach.
  2. Popular guidebooks don't give any details on what happens when these 2 names differ. So without proper docs or online posts referring to it, it is practically a dead feature. Try googling cargo.toml lib name and you'll see what I mean.

If I have a package named `foo` whose lib name is `bar`, others would import it with the name `foo`, but:
  1. do I, the creator of the crate, refer to the crate as foo or bar in my source code?
  2. do my users refer to the crate as foo or bar in their source code, if they simply copy the install command from crates.io?

I can find nothing to answer these questions with an intuitive keyword via google, which is why people don't use the feature imo.

Forget about the 1st question. The source code of the crate refers to itself simply as crate. The 2nd question is answered by CAD97's reply.

AFAICT the name used by people has no requirement to be unique, it's just a search term in normal usage.

That's my point.

Try searching for keyboard on crates.io: you will find a placeholder crate always on top when sorted by relevance (the default option).

But it shouldn't be.

Every crate that wants its display name to be keyboard should have the same relevance when the keyword is keyboard, in which case other factors kick in, so a placeholder crate could never end up on top of the others.
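
To make the ranking idea concrete, here is a rough sketch of "count the better of the two names once, then let other factors decide". Everything in it is an assumption for illustration: the `CrateEntry` struct, the toy `name_score` values, the made-up download counts, and using downloads as the tie-breaker. It is not how crates.io search actually works.

```rust
// Hypothetical model of a registry entry; `display_name` is the proposed field.
struct CrateEntry {
    name: &'static str,
    display_name: &'static str,
    downloads: u64, // stand-in for the "other factors" (made-up numbers below)
}

// Toy scoring: exact match beats substring match beats no match.
fn name_score(candidate: &str, query: &str) -> u32 {
    if candidate == query {
        2
    } else if candidate.contains(query) {
        1
    } else {
        0
    }
}

// Take the larger of the two name scores, never their sum.
fn relevance(entry: &CrateEntry, query: &str) -> u32 {
    name_score(entry.name, query).max(name_score(entry.display_name, query))
}

// Sort by relevance first, then by the tie-breaker.
fn rank<'a>(entries: &'a [CrateEntry], query: &str) -> Vec<&'a CrateEntry> {
    let mut hits: Vec<&CrateEntry> = entries
        .iter()
        .filter(|e| relevance(e, query) > 0)
        .collect();
    hits.sort_by(|a, b| {
        relevance(b, query)
            .cmp(&relevance(a, query))
            .then(b.downloads.cmp(&a.downloads))
    });
    hits
}

fn main() {
    let entries = [
        // An empty placeholder that grabbed the plain name first.
        CrateEntry { name: "foo", display_name: "foo", downloads: 10 },
        // A maintained crate that wants to be known as "foo".
        CrateEntry { name: "myorg-foo", display_name: "foo", downloads: 5_000 },
    ];
    for e in rank(&entries, "foo") {
        println!("{} ({})", e.display_name, e.name);
    }
}
```

With equal relevance for both entries, the placeholder no longer wins just because it owns the plain identifier.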

Since the crate keyboard is empty, no one can actually use it, so we have to ignore the first result. But what about other popular names, for example bitfield? Guess how much of the popularity of the crate with the exact name bitfield comes from it being the first search result and the only one with a matching name, while other decent alternatives might not even appear on the first page?

How much effort does it take to guide search engines to an alternative bitfield crate, if a creator can never place the exact name "bitfield" in the title of the crate's homepage on crates.io? How does the crate look if it's called "blablabla-bitfield" or something totally unintuitive like "bilge", compared to plain "bitfield"?

This is just unfair and misleading for someone who wants to search for the crate that best meets the requirements, and it diverts developers from creating high-quality crates to thinking of good names and squatting them, or even discourages them from starting an alternative at all, since it could not get enough attention with the current search mechanism.

That seems like the hard step to me

It's the developers of Cargo and crates.io who decided that when people want to publish crates with existing names, no help or guide is provided to them other than telling them to rename the crate.

It's the administrators of the community who imposed this rule. And now people think it is hard to change because developers are used to it? I don't think so. At least newcomers to the Rust community are not used to anything, and they may well want to create their own crates without regarding a rename as the default option.

As for experienced developers, new features always need a guide. Without a proper guide, even a switch from try! to ? cannot be made popular.

My feeling is that "name" should be deprecated and replaced by a random unique "id". At the same time, add a "short_descriptor" (X characters long) which initially would default to the current "name" value. As time passes, crates would update/add the short_descriptor field to something better than the old "name".

The place where the package name actually matters is when specifying dependencies. If we use a UUID for packages, or full URLs (like git dependencies), or whatever, you have “solved” the collision problem…at the cost of making Cargo.toml harder to read and edit. That’s why people are looking at namespace-like solutions: they don’t require giving up the “short” syntax for a dependency in Cargo.toml.

That doesn’t forbid you from discussing a change that would remove that, but you should call it out. Even assuming everybody switches to using cargo add for adding dependencies (I don’t, currently), there are still reasons why you’d want to manually edit Cargo.toml, and this would make that experience worse.

Whether to deprecate the current unique name is an individual issue.

I would say the current name attribute / field should be kept, serving as a descriptor that is still unique across crates.io, but the page title and search should support a separate field display_name that does not need to be unique.

Whether the unique name is called "short_descriptor" or "crate_name" is just a change of wording. And I don't think an auto-generated unique id is needed, unless future crates can be published without deciding a unique name. (There must be a unique id for each crate in crates.io database. They just didn't expose those ids)

they don’t require giving up the “short” syntax for a dependency in Cargo.toml.

Exactly. That's why I propose keeping the unique name, while just adding a display name on top of it.

In this way, the experience of specifying dependencies in Cargo.toml remains practically the same.

So ideally, a package like c2rust-bitfields (assuming the owner specified the display name to be bitfield) would be displayed as:

bitfield (c2rust-bitfields)

and:

  • be searched by either the name bitfield (being an exact match) or c2rust-bitfields on crates.io
  • be called c2rust-bitfields or bitfield in documentation according to the creator's preference
  • be imported by specifying the dependency to c2rust-bitfields
  • be used as bitfield in code by default
  • if the developers created the crate as "bitfield", there should be tools to help them keep that name in the source code, and the only thing they have to do is to come up with a unique name "c2rust-bitfields" as an identifier to be used by crates.io and Cargo.

I'd say that as I switch from Go to Rust, the nastiness of package naming (for published crates) is one of the few things I don't like about the Rust ecosystem, and I failed to find enough reason to defend that design. I grew even more confident in this view after reading the replies.

pkg.go.dev allows colliding package names, while pypi.org does not. But the administrators of PyPI are more active about handling name disputes (see PEP 541). You can't just provide fewer methods to solve a certain problem (edit: or keep some needed features hidden in practice), devote less manual work to help the situation, and sit back expecting the outcomes to be good.

I see no reason the developer should be burdened at all with creating a unique name.

I assumed the current name had to be a valid crate name (limited character set, i.e. no spaces, ..). If that can be relaxed, great, but if the user has to come up with a unique name, that seems unnecessary.

Well, what about doing these?

  • advise the developer to come up with a unique name for crate foo which would be used as an identifier by Cargo and crates.io, only when the name foo has already been taken
  • show the developer that the crate would be auto-named foo-5678 if no name is specified
  • warn the developer that the name cannot be changed later
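
As a minimal sketch of the fallback described in the list above, assuming a hypothetical `suggest_identifier` helper and a caller-supplied source of numeric suffixes (neither exists in Cargo or crates.io today):

```rust
use std::collections::HashSet;

// Hypothetical helper: pick the identifier a new crate would be registered under.
// `suffixes` stands in for whatever source of numbers the registry would use.
fn suggest_identifier(
    desired: &str,
    taken: &HashSet<String>,
    suffixes: impl IntoIterator<Item = u32>,
) -> Option<String> {
    // The plain name is still free: no extra identifier is needed.
    if !taken.contains(desired) {
        return Some(desired.to_string());
    }
    // Otherwise propose the first `<desired>-<n>` that is not taken yet.
    suffixes
        .into_iter()
        .map(|n| format!("{}-{}", desired, n))
        .find(|candidate| !taken.contains(candidate))
}

fn main() {
    let taken: HashSet<String> = ["foo", "foo-1234"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("{:?}", suggest_identifier("foo", &taken, vec![1234, 5678]));
    println!("{:?}", suggest_identifier("bar", &taken, vec![1234, 5678]));
}
```

The developer would only be shown the suggested identifier when the plain name is taken, and could still override it with a name of their own before publishing.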

I’m sorry, I still don’t understand what my Cargo.toml’s [dependencies] section will look like under this design. Can you show an example?

Do they say to rename the crate anywhere? I don't recall seeing any help or guide provided at all; the error message simply says you're not an owner:

error: failed to publish to registry at https://crates.io

Caused by:
  the remote server responded with an error: this crate exists but you don't
  seem to be an owner. If you believe this is a mistake, perhaps you need to
  accept an invitation to be an owner before publishing.

It is true that there are no guides (at least none that I've seen) on using a separate library name, but it's not necessarily up to the project to produce one; anyone could write such a guide and try to popularize this alternative.


Points 2-5 are already how setting package.name = "c2rust-bitfields" + lib.name = "bitfield" works today.

Other than UI support for crates.io to search/display the library name, this has worked since Rust 1.0 was released.

When a crate is created as foo and published with the name myorg-foo and the display_name foo, the title of its page on crates.io would be:

foo (myorg-foo)

and install commands would be:

cargo add myorg-foo

or

foo = { package = "myorg-foo", version = "1.0" }

The latter one would be used in Cargo.toml by a user of the crate. See here for more info.
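
A small sketch of how the page title and the Cargo.toml line above could be derived from the two names. `page_title` and `dependency_line` are hypothetical helpers made up for illustration, not part of crates.io; only the rename syntax they emit is existing Cargo syntax.

```rust
// Hypothetical: the title shown on a crate's page.
fn page_title(identifier: &str, display_name: &str) -> String {
    if identifier == display_name {
        identifier.to_string()
    } else {
        format!("{} ({})", display_name, identifier)
    }
}

// Hypothetical: the line a user would paste under [dependencies].
fn dependency_line(identifier: &str, display_name: &str, version: &str) -> String {
    if identifier == display_name {
        // Names agree: the ordinary short syntax is enough.
        format!("{} = \"{}\"", identifier, version)
    } else {
        // Names differ: use Cargo's existing rename syntax.
        format!(
            "{} = {{ package = \"{}\", version = \"{}\" }}",
            display_name, identifier, version
        )
    }
}

fn main() {
    println!("{}", page_title("myorg-foo", "foo"));
    println!("{}", dependency_line("myorg-foo", "foo", "1.0"));
}
```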


As for what Cargo.toml should look like for crate foo (identified as myorg-foo on crates.io), I don't think I should be the one to decide.


Edit:

It seems you don't even have to change how you install / import the crate, according to CAD97's reply.

Good to know. It is just painful if one cannot get such info by searching for content published online and has to figure it out by trial and error or by asking for an answer. SEO does matter.

Well, I don't think it's some random user's job to promote a feature if:

  • hardly anyone has given it a try
  • the feature isn't adequately documented

If the creator of the feature doesn't promote it with docs and user interface, who else will?

So the only two changes being proposed here are

  • Display/treat lib.name more prominently on crates.io, and
  • Nudge package culture to be more accepting of package.name != lib.name,

yeah? Because everything else already functions; myorg-foo = "1.0" already gives you ::foo as the extern crate name when the dependency renaming syntax ([dependencies].foo.package) is not used.
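
For anyone else who was confused about which name ends up in `use` paths, here is a simplified model of the rules described above. It is a sketch, not Cargo's actual code: `extern_crate_name` and its parameters are invented for illustration, and it only covers the three cases discussed in this thread.

```rust
// Simplified model, not Cargo's implementation.
// `dep_key`:  the key written under [dependencies].
// `renamed`:  true when the entry uses `package = "..."`.
// `lib_name`: the dependency's own `[lib] name`, if it sets one.
fn extern_crate_name(dep_key: &str, renamed: bool, lib_name: Option<&str>) -> String {
    if renamed {
        // Dependency renaming wins: the key becomes the name used in code.
        return dep_key.replace('-', "_");
    }
    match lib_name {
        // The crate author chose a library name different from the package name.
        Some(name) => name.to_string(),
        // Default: package name with hyphens turned into underscores.
        None => dep_key.replace('-', "_"),
    }
}

fn main() {
    // myorg-foo = "1.0", where the package sets [lib] name = "foo"
    println!("{}", extern_crate_name("myorg-foo", false, Some("foo")));
    // myorg-foo = "1.0", with no [lib] name override
    println!("{}", extern_crate_name("myorg-foo", false, None));
    // foo = { package = "myorg-foo", version = "1.0" }
    println!("{}", extern_crate_name("foo", true, None));
}
```

So the "default alias" from the proposal is roughly what the crate author's `[lib] name` already provides, and the user's rename syntax stays available as an override.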

I think you're right.

This part is where new guides / tutorials would be needed.

And for error messages like:

error: failed to publish to registry at https://crates.io

Caused by:
  the remote server responded with an error: this crate exists but you don't
  seem to be an owner. If you believe this is a mistake, perhaps you need to
  accept an invitation to be an owner before publishing.

Maybe it should hint at what you can do if you want to publish a lib with an existing name, how to keep unnecessary changes to a minimum, and what the result would look like.

Another valuable tip to know. Makes me wonder why I see ugly-named crates everywhere that don't make use of this feature, if people know about its existence / behavior.