Pre-RFC: User namespaces on crates.io

I glanced at the previous posts and linked threads – well, it's a significant number of posts, so forgive me if I repeat or miss something.

I generally like the idea of crate namespaces that could group related crates. They make related crates easier to discover, and when a crate belongs to a group I already know and trust, I know I can expect certain standards. A frequent concern, related to trust, was how group identifiers would be acquired and whether they should be bound to domain names or something similar. Maven was mentioned several times as both a good and a bad example. An interesting post mentioned the problem of gatekeepers. I'll try to share some thoughts that might be inspiring.

I think it is a good thing to have the possibility of independent registries. It allows alternative sources to be created for various reasons, for instance as a caching mirror of crates.io. Moreover, there is no reason why registries could not have different policies: for example, someone might maintain a curated registry that hosts only high-quality, proven crates. IMO, the possibility of independent registries might solve the gatekeeper problem.

Maven, for comparison, works like that. Although Maven Central is the default repository, it is possible to configure other repositories for Maven to use and – here it becomes interesting – even to filter which (artifact) groups should or should not be provided by individual repositories. Companies with their own Maven repository can use this, for example, to mirror and trust only some sources. Btw., the Apache and Eclipse foundations have their own Maven repositories, where even snapshot builds are available, and only releases are mirrored to Maven Central.
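To make the filtering idea a bit more concrete, here is a minimal sketch of how a client could pick a registry per group. Everything in it (the `Registry` struct, the `allowed_groups`/`denied_groups` fields, the registry names) is my own invention for illustration; it is neither Cargo's nor Maven's actual configuration model.

```rust
/// A hypothetical registry entry with optional group filters,
/// loosely modelled on the per-repository filtering described above.
struct Registry {
    name: &'static str,
    /// If non-empty, only these groups are served from this registry.
    allowed_groups: Vec<&'static str>,
    /// Groups that are never served from this registry.
    denied_groups: Vec<&'static str>,
}

impl Registry {
    /// Returns true if this registry may provide crates of `group`.
    fn serves(&self, group: &str) -> bool {
        if self.denied_groups.iter().any(|g| *g == group) {
            return false;
        }
        self.allowed_groups.is_empty() || self.allowed_groups.iter().any(|g| *g == group)
    }
}

/// Returns the first configured registry that is allowed to provide `group`.
/// The order of `registries` acts as the priority order.
fn pick_registry<'a>(registries: &'a [Registry], group: &str) -> Option<&'a Registry> {
    registries.iter().find(|r| r.serves(group))
}
```

With a company mirror listed first and restricted to a few trusted groups, everything else would fall through to the next registry in the list, or to nothing at all if every registry filters the group out.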

Assigning group identifiers depends on the registry's policy. This leverages the freedom provided by independent registries… and the freedom to choose. One could argue that it means that the group/crate does not have to be unique, because it depends on the particular registry. This is true… and it means: use reasonable registries to work with.

As noted above, Maven provides such freedom (nothing stops anyone from taking an artifact and re-publishing it to a different repository), but I guess this causes no problems in practice. Most people are happy with Maven Central… well, the cost of that trust is paid by the registration process and by requiring the group to be a domain name belonging to the account owner.

I guess this point therefore leads to the question of how to set the policy for crates.io. The following answers are likely flawed, but here they are anyway to kick off the discussion:

  • First come, first served might not be that bad. How are account names on GitHub taken, anyway?
  • Existing account names could be used as the pre-reserved group identifiers.
  • Suitable group identifiers could be offered to existing accounts for their projects (I guess that the offered identifiers might be proposed automatically based on the project names.)
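The first two points can be sketched as a tiny claim check. This is only an illustration under my own assumptions: `GroupPolicy`, `ClaimError` and the rule that an account may claim the group matching its own name are hypothetical, and the automatic proposal of identifiers from the third point is not modelled.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq, Eq)]
enum ClaimError {
    /// Someone else claimed the group earlier (first come, first served).
    AlreadyTaken,
    /// The group matches an existing account name and is reserved for it.
    ReservedForAccount,
}

#[derive(Default)]
struct GroupPolicy {
    /// Existing account names; each one pre-reserves the same-named group.
    accounts: HashSet<String>,
    /// Claimed groups, mapped to the owning account.
    owners: HashMap<String, String>,
}

impl GroupPolicy {
    /// Tries to claim `group` for `account`.
    fn claim(&mut self, account: &str, group: &str) -> Result<(), ClaimError> {
        if self.owners.contains_key(group) {
            return Err(ClaimError::AlreadyTaken);
        }
        if self.accounts.contains(group) && account != group {
            return Err(ClaimError::ReservedForAccount);
        }
        self.owners.insert(group.to_string(), account.to_string());
        Ok(())
    }
}
```

Here the account `alice` could claim the group `alice` and any free name, while `bob` could not take `alice` even if nobody has claimed it yet.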

How to handle crates published so far without any group? Maybe the registry could have a default public group and all existing crates would automatically belong to this public group. Referring to a crate without a group then defaults naturally to the registry's default group.
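A rough sketch of that defaulting rule, assuming the group/crate syntax I use in this post and a default group called `public` (both are placeholders, not a proposal for the final names):

```rust
/// Hypothetical name of the registry's default group.
const DEFAULT_GROUP: &str = "public";

#[derive(Debug, PartialEq, Eq)]
struct CrateRef {
    group: String,
    name: String,
}

/// Parses `group/crate` or a bare `crate`.
/// A bare crate name falls back to the registry's default group.
/// Returns `None` for malformed input such as `a/b/c` or an empty part.
fn parse_crate_ref(input: &str) -> Option<CrateRef> {
    let (group, name) = match input.split_once('/') {
        Some((group, name)) => (group, name),
        None => (DEFAULT_GROUP, input),
    };
    if group.is_empty() || name.is_empty() || name.contains('/') {
        return None;
    }
    Some(CrateRef {
        group: group.to_string(),
        name: name.to_string(),
    })
}
```

Under this rule `serde_json` and `public/serde_json` resolve to the same crate, so existing manifests would keep working unchanged.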

Groups serve for grouping and don't have to be linked to crate names. By this I mean that in a group like serde, the crate would still be named serde_json (referred to as serde/serde_json), and thus nothing changes for the compiler. I guess this point might raise the question of whether groups are really good for anything. But yes, they are: there is the relationship to maintainers, there is filtering when multiple registries are used, and a group may create a crate without worrying whether the name was already taken.

If I use Maven again as an analogy, the group identifier never appears in the source. Just by convention, the namespaces in the artifact usually begin with the group identifier and sometimes (especially in the OSGi world) artifact identifiers include the group identifier too.

What happens if I meet foo/random and bar/random? Not much: in such a case I'm forced to rename one random to something else. But how often does such a name clash occur? I guess clashes over such generic names are more often a problem for crate authors than for consumers. Even with group identifiers it remains good to offer a distinctive name, but the pressure is not so high, especially when the nature of the crate makes it less likely to be used together with a same-named crate. It opens the opportunity for someone to implement my/serde_bson and, after a great success, to offer it to serde to adopt as serde/serde_bson… not much changes for the crate consumers.
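Detecting when a rename is needed is simple. As a sketch (the function and its input shape are made up for illustration; a real tool would work on the resolved dependency graph):

```rust
use std::collections::BTreeMap;

/// Given dependencies written as (group, crate name), returns every crate name
/// that is provided by more than one group, together with those groups.
/// These are the dependencies that would have to be renamed in the source.
fn find_clashes<'a>(deps: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut by_name: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for &(group, name) in deps {
        let groups = by_name.entry(name).or_default();
        // Listing the same group/crate twice is not a clash.
        if !groups.contains(&group) {
            groups.push(group);
        }
    }
    // Keep only names that appear in at least two different groups.
    by_name.retain(|_, groups| groups.len() > 1);
    by_name
}
```

For foo/random, bar/random and serde/serde_json this would report only random, with the groups foo and bar.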


The last thing to mention is the checklist:

The proposal should be backwards compatible with the current situation.

Defaults should do the trick: the default registry crates.io with the default public group. Users must explicitly say otherwise.

The namespaces should be stable and immutable like the rest of the registry, and it should not be possible to rename them. Once a namespace is created it should not be removed.

This I would leave to the registry's policy. But IMO that's no problem; crates.io, of course, would never remove a group.

Implementing namespaces shouldn't hurt the usability and ergonomics of crates.io, for example by having random identifiers or identifiers users aren't expected to remember.

Although I used the group/crate syntax in my examples, which I personally like, this point remains open :wink:

Namespaces should not restrict others' ability to publish packages without the namespace (for example, by allowing people to reserve a foo-* prefix).

If the registry's policy allows a default group… no problem. But using an explicit group (even the default one for the given registry) should probably be encouraged and preferred over relying on the default.

Namespaced crates should be usable inside the source code without any renaming, and that should not create collisions.

Almost, see the notes above.

Well, that's all I have. Good night.
