Namespacing on Crates.io


#70

Maven has a high barrier to publishing. (I have no idea how to publish there, despite using it for multiple years!)

Cargo doesn’t. That’s a benefit.

Assuming you require domain ownership to publish (a large, new barrier), what happens, for example, if I own tehcodez.fake, publish some package that gets used widely, then accidentally let the domain lapse and someone else grabs it?

If you don’t require ownership, how is it any better?


#71

Hmm…this might be mostly backwards compatible if we allowed the following:

  • if there is only one crate with that name on crates.io, then the organization is not needed in Cargo.toml to reference it (but a warning is issued; the warning can be turned off)
  • if there is more than one crate on crates.io with that name under different organizations, then the organization is required and it would be a build error if it were missing
  • after the cut-over to the new rules, and once the updated cargo/compiler support is stabilized, the clock would start on, say, a 90 or 180 day grace period during which no one can create packages under their org with the same name as a crate under another org. This gives everyone time to update their Cargo.toml files to eliminate the warnings.
  • after the grace period, people can begin registering crate names under their org with the same name as a crate under another org.
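
A minimal sketch of the first two rules, to make the lookup behaviour concrete. The index shape (a map from crate name to the orgs publishing it) and the names `Resolution` and `resolve` are made up for illustration; nothing here is how cargo or crates.io actually works.

```rust
use std::collections::BTreeMap;

/// Outcome of looking up a dependency that may or may not name its organization.
#[derive(Debug, PartialEq)]
enum Resolution {
    /// Exactly one match; `warn` is true when the org was omitted.
    Resolved { org: String, warn: bool },
    /// Several orgs publish this name, so the org must be spelled out.
    Ambiguous(Vec<String>),
    /// No crate with this name (or none under the requested org).
    NotFound,
}

/// `registry` maps a crate name to the orgs that publish it (hypothetical index shape).
fn resolve(
    registry: &BTreeMap<String, Vec<String>>,
    name: &str,
    org: Option<&str>,
) -> Resolution {
    let orgs = match registry.get(name) {
        Some(orgs) if !orgs.is_empty() => orgs,
        _ => return Resolution::NotFound,
    };
    match org {
        // Org given explicitly: it must be one of the publishers.
        Some(o) if orgs.iter().any(|x| x.as_str() == o) => Resolution::Resolved {
            org: o.to_string(),
            warn: false,
        },
        Some(_) => Resolution::NotFound,
        // Org omitted and the name is unique: accept, but warn.
        None if orgs.len() == 1 => Resolution::Resolved {
            org: orgs[0].clone(),
            warn: true,
        },
        // Org omitted and the name is shared: this would be the build error.
        None => Resolution::Ambiguous(orgs.clone()),
    }
}
```

The grace period in the third rule would simply keep every name in the single-org case until the deadline passes, so existing manifests only ever see the warning branch during that time.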

Questions:

  • Would everyone get a starting “org” that corresponds to their VCS (github only for now) and user-name from there?
  • Would people be able to request/register other “Org” names? What would be required? Proving ownership of a DNS domain? How?
  • Would we rate-limit “orgs” if we allowed requesting additional orgs?
  • How does this solve the problem of wanting to switch to a new, maintained version of an existing, unmaintained crate?
  • Could adding the correct “org” to Cargo.toml be automated through rustfix (or something similar) where there is no ambiguity?
  • What if you want to use crates of the same name (but different purposes) from different orgs as dependencies? Does crate-rename in Cargo.toml suffice?

Downsides:

  • requires changes to both crates.io and cargo (and possibly the compiler) plus ideally rustfix (or something similar)
  • allows duplicate crate names going forward, which would more likely require crate-rename, whereas the prefix option never allows duplicate crate names (the crate name is always unique because it includes the namespace prefix)
  • requires tying to outside “registries” to validate ownership of an “org”

Upsides:

  • pretty much what “maven” and similar systems do, so we know it works
  • would still work well if there is an eventual “federation” proposal
  • others?

#72

You bring up a lot of good points!

I would go for an approach which is more driven by crate authors and avoids putting any new requirements on users:

  • the first time a namespaced crate is published
    • crates.io checks whether the author also owns the name in the legacy global namespace
    • if true, crates.io cross-publishes the crate to both locations, with the legacy location forwarding to the new one
  • afterwards, when a user of a global crate runs cargo update, cargo will upgrade the Cargo.lock file to the new namespaced crate, using the forwarders automatically generated by the crate author

This means that other authors can publish the same crate name under their organization without any restriction, because no forwarding would happen in that case.
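
To illustrate the cross-publishing idea, here is a rough sketch of the registry-side check, assuming a toy in-memory registry. `Registry`, `publish_namespaced`, `resolve` and the `ns/name` spelling are all hypothetical, not anything crates.io implements.

```rust
use std::collections::HashMap;

/// Toy registry: legacy (global) names with their owner, plus forwarders
/// from a legacy name to a namespaced name. All names here are hypothetical.
#[derive(Default)]
struct Registry {
    legacy_owner: HashMap<String, String>,
    forward: HashMap<String, String>,
    namespaced: Vec<String>,
}

impl Registry {
    /// Publish `ns/name`; returns true if a forwarder from the legacy name was created.
    fn publish_namespaced(&mut self, author: &str, ns: &str, name: &str) -> bool {
        let full = format!("{}/{}", ns, name);
        let first_time = !self.namespaced.contains(&full);
        if first_time {
            self.namespaced.push(full.clone());
        }
        // Forward only on first publish, and only if the author owns the legacy name.
        let owns_legacy = self.legacy_owner.get(name).map(String::as_str) == Some(author);
        if first_time && owns_legacy && !self.forward.contains_key(name) {
            self.forward.insert(name.to_string(), full);
            return true;
        }
        false
    }

    /// What an updated cargo would resolve a dependency name to.
    fn resolve<'a>(&'a self, dep: &'a str) -> &'a str {
        self.forward.get(dep).map(String::as_str).unwrap_or(dep)
    }
}
```

If the owner of the legacy `serde` publishes it under their own namespace, `resolve("serde")` starts returning the namespaced name; anyone else publishing a `serde` under a different namespace leaves the forwarder untouched.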

I would require that a file with the random string is placed at a well-known location on the requested domain, similar to how cargo login works already:

Instead of appending the crates.io-generated string to cargo login, you would just place it on the domain you want to use, and crates.io would verify it. As long as the string stays there, you own the domain.
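
A sketch of what that check might look like on the crates.io side. The well-known path below is invented (nobody in this thread has proposed one), and `fetch` is a stand-in for an HTTP GET so the example stays dependency-free.

```rust
/// Hypothetical well-known path; no actual location has been proposed.
const WELL_KNOWN_PATH: &str = "/.well-known/crates-io-verification.txt";

/// Returns true if the file served at the well-known location holds the token.
/// `fetch` stands in for an HTTP GET and returns the body, or None on failure.
fn domain_is_verified<F>(domain: &str, token: &str, fetch: F) -> bool
where
    F: Fn(&str) -> Option<String>,
{
    let url = format!("https://{}{}", domain, WELL_KNOWN_PATH);
    match fetch(&url) {
        // Ignore surrounding whitespace such as a trailing newline.
        Some(body) => body.trim() == token,
        None => false,
    }
}
```

Re-running the check periodically is what would make “as long as the string stays there” enforceable: once a domain lapses and the file disappears, the check fails and publishing rights could be suspended.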

Maven uses some notification on their website like this: https://mvnrepository.com/artifact/commons-lang/commons-lang

I imagine that more automated solutions could be found too, but I believe this would often be a manual process, especially if the original author cannot be contacted anymore.

As outlined above, this would absolutely be possible and could be automated.

With the forwarding approach I described, there would be no ambiguities at all, because it would always be clear which namespaced crate is the replacement for the global crate.

Yes, I think this would be a valid use case for crate-rename.

I think this approach would be better than the hyphen approach. Some people even suggested that crate authors should always redefine their lib.name to strip out the namespace with the hyphen approach.

So with hyphens, you would have renames everywhere and by default; with domain namespaces you would only need renames in the unlikely event of an actual clash.


#73

I’m liking this idea more and more, but, I’m still not sold on the idea of tying it to DNS domains. Perhaps a hybrid approach?

Here is what I have in mind:

  • As of a certain date, no new top-level/legacy crates may be uploaded. ALL existing crates are considered to live in the “Legacy” namespace.
  • At the same time, anyone may request a new “Org/Namespace” that has the same naming rules as crates do (as far as allowed characters) with the following restrictions:
  1. You may not claim an “Org/Namespace” name that is identical to the name of an existing top-level crate, unless you own that crate. If you own that crate and request the corresponding namespace, the top-level crate is moved into the namespace with a “forward” rule from the existing top-level name (as you’ve defined). So, if you own the top-level crate “serde”, then you may lay claim to that namespace prefix and “serde” would become “serde/serde”. NOTE: You do not have to do this. You just can. Existing crates owned by you that start with “serde-” would likewise move to the “serde” namespace automatically (with forwarding). Existing crates starting with “serde-” not owned by you would remain legacy, top-level crates. Existing “serde-*” crate names not owned by you would also be separate namespaces that you could not claim or create crates within. So, you would own the “serde-” namespace, but all legacy top-level crates that start with “serde” would cut holes out of your namespace that could be owned by others at some point (or would remain unused). So, if someone else owns “serde-pink-pony”, then you would not own “serde-pink-pony” as a namespace and could not create anything beneath it (whether or not the owner of “serde-pink-pony” ever claims the “serde-pink-pony” namespace). You would, though, own “serde-pink-*” excluding the “serde-pink-pony” and “serde-pink-pony-*” subsets (unless someone else owned a “serde-pink” crate).
  2. Similarly, you may not get an “Org/Namespace” name consisting only of a left-prefix of any existing crate unless you are the sole or majority (not plurality) owner of crates with that prefix AND claiming that prefix would not conflict with rule #1. Here a prefix is defined as 1 or more full segments, a segment is any run of allowable characters other than “-” or “_”, and “-” and “_” are the segment separators. So, if you own more crates that start with “foo-bar-” and have dependent crates not owned by you than all others combined, then you can lay claim to the “foo-bar-” namespace. If you do, all your crates with “foo-bar-” are automatically moved into that namespace with forwarding. The same rules as #1 regarding sub-namespaces and existing top-level crates apply. So, if under these rules you could claim the “foo-bar-” namespace, but there was an existing crate called “foo-bar-baz”, then the “foo-bar-baz-” namespace would not be available to you (again, whether or not the owner of the “foo-bar-baz” crate ever claims that namespace).
  3. If no-one has a majority of crates for a particular prefix and more than 1 owner has crates with that prefix, that prefix may not be claimed as a namespace (see exception later).
  4. Once a “namespace” is claimed, no one other than the owner(s) of that namespace may publish crates under that namespace. Crate names do not need to include the name of their namespace in their name, but, they can. Legacy crates moved into namespaces would maintain their full-name which would be redundant, but, new crates created under the namespace need not. There would be an ability to rename crates under your namespace with automatic forwarding (you could never use the existing name for something else though).
  5. Any namespace not covered by rules 1 through 3 could be claimed by anyone at any time; however, there would be rate limiting on getting new namespaces, something like: no more than 1 new namespace per day, 5 per week, 15 per month, 30 per year, and 100 overall (without some intervention). No namespace covered by rules 1 through 3 could ever be claimed by anyone except through those rules, FOREVER.
  6. If you claim a name-space under rules 1 or 2, you may move any top-level crate that you own that wasn’t automagically moved into a name-space you claimed under rules 1 or 2 into that namespace (with automatic forwarding).
  7. You may move crates you own from one namespace you own to another namespace you own (with forwarding) at any time.
  8. Forwarded crates remain forwarded forever. Forwarding may be transitively forwarded (this can be logical or actual and would be an implementation detail on crates.io). Forwarded names may never be used to publish new crates. Namespaces may not be forwarded, but, all the crates in the namespace may be forwarded to another namespace.
  9. The owner of a name-space may permit a non-owner to publish a crate under the namespace upon request and approval.
  10. The owner of a namespace may delegate a sub-name-space to another owner. Once they do so, they lose control of that sub-name-space.
  11. If you are the owner of 1 or more crates and would like to claim a prefix that you are not eligible for under rules 1 through 3 because you are not the “Majority Owner” of that prefix, you may ask other crate owners with that prefix to cede rights to that prefix, if they ALL agree, you get the namespace, but the sub-name-space carve-outs as described in rule #2 still apply. This can be automated.
  12. If you cannot get all owners to agree under 11, you may, after some period of time, make a request at large to all the owners of all the crates on crates.io where everyone votes and if you get a super-majority (66%) of a quorum (at least 60% votes cast or a defined voting time-period elapsed, say 180 days), you get the namespace.
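
As a rough illustration of the rate limits in rule 5, here is a sketch that checks a new claim against every window at once. Time is modelled as plain day numbers, and a month and a year are simplified to 30 and 365 days; both are assumptions for the example, as are the names `WINDOWS`, `LIFETIME_MAX` and `may_claim_namespace`.

```rust
/// Limits from rule 5: (window length in days, max claims in that window).
/// Month and year lengths are simplified to 30 and 365 days (an assumption).
const WINDOWS: [(u64, usize); 4] = [(1, 1), (7, 5), (30, 15), (365, 30)];

/// Overall cap per account, absent some manual intervention.
const LIFETIME_MAX: usize = 100;

/// `past_claims` holds the day numbers of earlier namespace claims by one account.
/// Returns true if a new claim on day `today` stays within every limit.
fn may_claim_namespace(past_claims: &[u64], today: u64) -> bool {
    if past_claims.len() >= LIFETIME_MAX {
        return false;
    }
    WINDOWS.iter().all(|&(days, max)| {
        // Count claims that fall in the window ending today (inclusive).
        let recent = past_claims
            .iter()
            .filter(|&&d| d <= today && today - d < days)
            .count();
        recent < max
    })
}
```

An account that claimed namespaces on days 5 through 9 would be refused on day 10 by the weekly limit, while one that claimed on days 1 through 4 and day 9 would be allowed.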

I believe this proposal gets all the good properties you wanted out of using DNS as ORG Name, but doesn’t tie to external things and allows for more flexibility in namespace names AND allows existing crate collections the option of claiming a namespace associated with that already established “brand” (think “tokio-”, “serde-”, “diesel-”, etc.). Also, it is 100% backwards compatible if we allow the crate to be referenced as crate=<crate>, ns=<namespace> (for new cargo) or crate=<namespace>-<crate> for existing cargo (with crate renaming). So even old cargo could use new crates published in namespaces (besides the legacy ones that are forwarded). NOTE: Forwarding is a “server-side” thing and old cargo need not know about or understand it, but the new cargo could know about it and update Cargo.toml.
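
The two spellings above could be related by something as small as the following sketch; `legacy_name` is a made-up helper, not a cargo API.

```rust
/// Flat name an existing cargo would request for a namespaced crate:
/// `<namespace>-<crate>`. Hypothetical helper for illustration only.
fn legacy_name(ns: &str, krate: &str) -> String {
    format!("{}-{}", ns, krate)
}
```

So a crate `timer` in the `tokio` namespace would be visible to old cargo as `tokio-timer`. Going the other way is not unique from the string alone (`foo-bar-baz` could be `foo` + `bar-baz` or `foo-bar` + `baz`), which is why the server would have to do that lookup.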

Some things to note about this:

  • New namespaces are not automatically created. Someone must request them, and they are granted subject to the described rules.
  • Once namespaces are permitted, no new crates are allowed to be uploaded to the root/legacy namespace.
  • Namespaces do not have to follow the names of any existing crates, but non-owners of crates can never create namespaces that conflict with existing top-level crates. Existing top-level crates can continue to be published and updated by their authors forever (unless they choose to move them to a namespace as described under the rules).
  • Crate names themselves can remain short (and even become shorter) because you will no longer have to do things like “serde-*” to create a bunch of related crates. You can simply create a namespace and then create related crates in that namespace (including sub-name-spaces).

So, how about it? Tear it apart? Like? Don’t Like? Problems? Down-sides?


#74

This could be simpler:

  • - is not allowed in the prefix, so you can have foo-*, but not foo-bar-* as a prefix. This eliminates complications from overlap.
  • Unprefixed crates can be registered indefinitely. This eliminates complications from switchover and legacy status.
  • Prefixes are given on first-come-first-served basis, with no connection to existing crates. This eliminates need to design algorithms for giving out prefixes based on legacy crates.
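
A small sketch of how the simplified rules might be checked at publish time. The storage (a map from registered prefix to owner) and the names `prefix_of` and `may_publish` are hypothetical, and the sketch assumes that a crate whose name is exactly the prefix (plain `foo`) counts as unprefixed.

```rust
use std::collections::HashMap;

/// Prefix of a crate name under the simplified rule: the text before the first `-`.
/// Names without a `-` have no prefix.
fn prefix_of(name: &str) -> Option<&str> {
    name.split_once('-')
        .map(|(prefix, _)| prefix)
        .filter(|p| !p.is_empty())
}

/// `prefix_owner` maps each registered prefix to its owner (hypothetical storage).
fn may_publish(prefix_owner: &HashMap<String, String>, user: &str, name: &str) -> bool {
    match prefix_of(name).and_then(|p| prefix_owner.get(p)) {
        // Registered prefix: only its owner may publish under it.
        Some(owner) => owner.as_str() == user,
        // Unprefixed name or unregistered prefix: open to anyone (an assumption).
        None => true,
    }
}
```

Because a prefix cannot contain `-`, `foo-bar-baz` always belongs to `foo` and never to `foo-bar`, which is where the overlap problem goes away.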

#75

I’ll once again throw out the idea of using GitHub organizations:

  • outsources name registry to another system
  • outsources antispam to another system
  • outsources organization membership to the system which is already the IdP. In for a penny, in for a pound

Domain names buy you the first two, but given the existing GitHub IdP integration, the third issue is the big one for me. Using anything but GitHub for this purpose will involve building a complicated organization membership management system into crates.io. GitHub nicely avoids that.


#76

Do you mean you can add new top-level crates with new top-level names indefinitely? I personally think it would be better to end that once namespaces are allowed; otherwise, there is still a huge incentive to squat top-level names.

How so? Once you start having name-spaced crates, you would have to have some way for existing cargo to use them or you’d have a backwards compatibility issue. So, whether or not you continue to allow top-level crate registration, you’d still have issues with crates created in namespaces being backwards compatible with existing cargo (unless I’m misunderstanding what you are proposing).

Personally, I would prefer that existing popular crate collections don’t get a namespace granted to a non-owner that uses that “name”. It seems like “brand/reputation” here can be an important thing to preserve.


#77

If by name, you mean user name, we’d still have that, no?

Does it really? If we rate-limit new namespaces, then the only “spamming” possible is uploading crates that are “spam” to a namespace you own. Which would be the same as under what you propose.

I guess I just don’t see that as a positive thing. Why make crates.io dependent on other commercial services to that degree? I’m having trouble understanding why that is a good thing long-term?


#78

Note that GitHub allows renaming of orgs/usernames. To make things worse, they make the old name free for the taking.


#79

To me it doesn’t matter if the top-level name refers to a crate or a namespace — it’s still a thing to squat. If foo is precious, so is foo-* (or foo/).

The value of namespaces for projects (i.e. the ability to publish project-foo without worrying someone may grab it first) remains the same regardless of whether non-namespaced names are allowed.

So I’m proposing to add namespaces as an optional feature, not as a mandatory thing.

Indeed. There are only a handful of such collections (rust, serde, tokio?), so these could be granted manually.


#80

Yes, but, it is easier to rate-limit new top-level name-spaces without inconveniencing legitimate users than it is to rate-limit top-level crate names. That’s where I think the win comes from (partially).


#81

Registration of prefixes is a different operation than publishing of a crate, so they could still have different rate limits.


#82

Organization name. Re GitHub antispam:

Yes, beyond mere rate limiting, GitHub actively monitors for all sorts of suspicious behavior including spam, and will flag/lock accounts which appear to be behaving in spammy ways. While crates.io could probably benefit from such a system (or failing that, exponential backoff rate limiting), GitHub gives it to you for free today, along with many other features.

GitHub already provides all of the functionality for organization membership management, which is ultimately an access control function. From a security perspective, outsourcing this functionality to a “tried and tested” system is much less risky than trying to greenfield it all.

The set of members of any organization is sourced as OAuth users from GitHub, so any system that manages organization membership for crates.io is ultimately building on top of GitHub’s user model anyway.

I understand and sympathize with concerns about centralized systems, but crates.io has already gone down that road, and that’s unlikely to change any time soon.


#83

Use of GitHub login and ability to give crate ownership to a GitHub Org already gives crates.io this spam protection and user management.

But GitHub is kept one level of abstraction away from crates and their dependencies, so crates.io can add support for other account types, and even migrate off GitHub if that turned out to be necessary.


#84

I think it’s possible to retain this property for organizations sourced from GitHub as well.


#85

I’d consider that a deal-breaker with respect to immutability of crates/name-spaces.


#86

Renaming a GitHub organization does not necessarily require a corresponding change on the crates.io side.


#87

A general suggestion to everyone working out the details of a namespacing proposal: you should create an FAQ in whatever RFC becomes of this. This discussion here is quite long and high-velocity, making it difficult to keep up with all the changing details.

I would guess the most commonly asked questions would involve existing crates with hyphens in their name.


#88

The minute you do this, people will start complaining and (rightfully) demand that other IdPs are also accepted.

So your system would turn from

cratename // global namespace

to

somegithuborg-cratename // your proposal

to

someotheridp-someorgatthatidp-cratename

in no time.

When we have arrived at that … what exactly is the difference from doing the simple, obvious thing … and using domain names from the start?

I think it’s not that onerous to send an HTTP GET request to some host and check the reply.


#89

A similar idea that would allow other users to continue publishing using any name they can today is [Pre-RFC]: Packages as Namespaces.