Namespacing on Crates.io

The huge difference is that domain names do not rely on a single company's VCS, yet they are still an accepted common reference all over the internet.

Git dependencies do not rely on a single company either: git-appraise = { git = "https://git.nemo157.com/git-appraise-rs" } would work today if I could be bothered to deploy my git-smart-http protocol handler there.

But would crates.io cache those dependencies?

Would it be censorship-resistant?

You are right; I read too fast and thought Centril was talking about GitHub.

But my comment is still valid if you replace the word “company” with “VCS”. It’s good to be able to use git, which is pretty common nowadays, but I don’t want package management to be dependent on a particular version control system.

This is the tricky question, and one where my ignorance of the Rust crate/package/module system makes it difficult to field an answer.

A strawman option, which probably won't work but can perhaps serve as food for thought, would be to always require an as directive when importing these crates, e.g.

extern crate '@foo/bar' as bar;

Regarding the source of authority for organization names, since crates.io already relies on GitHub as its identity provider, I think the simplest and most straightforward option for crates.io organization names is mapping them directly to GitHub organizations. This would also allow for outsourcing organization membership to GitHub via the existing OAuth integration.

Domain names are an alternative to consider but personally I’d prefer a single source of truth for names. If people are against GitHub for this purpose, the next best option IMO would be for crates.io to have its own registry of organization names.

There were several proposals to remove extern crate from the language completely, as the new module system with absolute paths doesn’t really need it. If anything, it’d probably have to be handled in Cargo.toml.

Handling it through Cargo.toml seems better to me. I can imagine something like this:

[dependencies]
bar = { version = "1", org = "foo" }

for a @foo/bar crate

Aren't the package name and the crate(s) it contains separate? Today, the package name and crate name are the same, but does it need to be that way? My understanding was that it didn't. In fact, couldn't a "package" at some point contain multiple "crates"?

Thanks everybody for all the replies, obviously this is a complicated issue and I appreciate everybody’s ideas around this topic.

What I’d probably advocate for, most likely naively:

  • Un-namespaced crates would use regular crates.io (rand would be rand).
  • Namespaced crates would assume a namespace on crates.io (naftulikay/rand) which could be backed by the VCS of your choice.
  • Fully-qualified crates would still be possible (rand = github.com/naftulikay/rand).
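
The three forms above could be told apart by how many `/`-separated segments a name has. The following is only a sketch under that assumption; `CrateRef` and `parse_crate_ref` are hypothetical names for illustration, not anything Cargo or crates.io provides.

```rust
/// Hypothetical classification of the three name forms listed above.
#[derive(Debug, PartialEq)]
pub enum CrateRef<'a> {
    /// Un-namespaced, e.g. `rand`.
    Plain(&'a str),
    /// Namespaced on crates.io, e.g. `naftulikay/rand`.
    Namespaced { namespace: &'a str, name: &'a str },
    /// Fully qualified, e.g. `github.com/naftulikay/rand`.
    FullyQualified { host: &'a str, namespace: &'a str, name: &'a str },
}

/// Classify a reference by its number of `/`-separated segments.
/// Returns `None` for empty segments or more than three segments.
pub fn parse_crate_ref(s: &str) -> Option<CrateRef<'_>> {
    let parts: Vec<&str> = s.split('/').collect();
    if parts.iter().any(|p| p.is_empty()) {
        return None;
    }
    match parts.as_slice() {
        &[name] => Some(CrateRef::Plain(name)),
        &[namespace, name] => Some(CrateRef::Namespaced { namespace, name }),
        &[host, namespace, name] => Some(CrateRef::FullyQualified { host, namespace, name }),
        _ => None,
    }
}
```

Under this sketch `rand` is a plain name, `naftulikay/rand` a namespaced one, and `github.com/naftulikay/rand` a fully-qualified one; a real design would also need rules for which characters are legal in each segment.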

This is not a panacea, it’s obviously more complicated than this.

The main underlying issue of crate squatting won’t go away on its own. Moderation will become increasingly difficult. Some of this could be automated, e.g. by having bors or another bot scan crates.io for empty repositories with zero pulls, but somebody would still have to review things manually.

An anecdote: one of the many reasons I stopped using PyPI was because of the namespacing issue. There’s a lot of garbage up there and names that I couldn’t use, so I just made users go to GitHub (git+ssh://...). crates.io is rad though, and it would be unfortunate if we started driving users away from it.

The issue becomes what dependencies are allowed on different repositories, e.g., can a crate published on crates.io depend on a crate on GitHub or elsewhere? If so, how would one avoid brittleness, disappearing crates, etc.?

Another proposal: minimalist approach.

A namespaced crate name like naftulikay/rand is only one character different from the current pseudo-namespacing done on crates.io like naftulikay-rand.

So we could have a backwards-compatible namespace approach by allowing users to register a prefix like naftulikay-*, so that only the owner of the prefix can publish new crates with that prefix.

While this doesn’t address squatting, it does add namespaces in a backwards-compatible way following existing naming conventions. It works with Cargo and the Rust language without any changes, and the implementation only needs relatively small changes to crates.io.
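
To make the publish-time check concrete, here is a minimal sketch of what the crates.io side might do. Everything in it is an assumption for illustration: `PrefixRegistry`, `reserve`, and `can_publish` are made-up names, and users are modelled as plain strings.

```rust
use std::collections::HashMap;

/// Hypothetical registry of reserved prefixes: prefix -> owning user.
/// Nothing like this exists in crates.io today; it only illustrates the idea.
#[derive(Default)]
pub struct PrefixRegistry {
    owners: HashMap<String, String>,
}

impl PrefixRegistry {
    /// Reserve `prefix` (written without the trailing `-`) for `user`.
    /// Returns true if `user` holds the prefix afterwards, false if
    /// somebody else reserved it first.
    pub fn reserve(&mut self, prefix: &str, user: &str) -> bool {
        let owner = self
            .owners
            .entry(prefix.to_string())
            .or_insert_with(|| user.to_string());
        owner.as_str() == user
    }

    /// A crate may be published unless the text before one of its `-`
    /// separators is a prefix reserved by a different user.
    pub fn can_publish(&self, crate_name: &str, user: &str) -> bool {
        crate_name
            .match_indices('-')
            .all(|(idx, _)| match self.owners.get(&crate_name[..idx]) {
                Some(owner) => owner.as_str() == user,
                None => true,
            })
    }
}
```

With naftulikay reserved by its owner, naftulikay-rand can only be published by that owner, while names without a reserved prefix (such as rand) remain open to everyone, which is what keeps the scheme backwards compatible.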

The truth of the matter is that if Cargo ever supports more than GitHub or wants to not support GitHub, repo squatting would still be a thing, but it would be the VCS’s problem, not mine per se. A user could go on a new VCS and grab microsoft as a username and then we’re right back to the same problem.

I’m okay with Cargo only implicitly using GitHub for now, but if this ever expands we have collision again.

This indeed sounds like it would address the problem without placing undue burden on cargo.

I think it stands to reason that a crate-collection like winapi would like "winapi" to appear as a prefix in the Rust code when referring to one of those crates.

We are moving away from users having to write extern crate; the new way of doing things is just:

use mycrate::bar;

Yes, as @Nemo157 noted, but you still need a way to refer to the crates in paths.

This indeed adds more weight to the idea of simply reserving prefixes as the way of namespacing:

  • tokio-* (prefix is tokio and once reserved, only the owner of that prefix can create more packages with that particular prefix)
  • serde-*

No new support in cargo, the compiler, or the language seems to be required (at first blush). The only thing needed is changes to crates.io:

  • How to reserve a prefix (first come, first served)?
  • Limit the number of prefixes a single user/org can reserve without intervention
  • Do not allow reserving a prefix that already has existing crates with that prefix unless the owner of those crates is the one asking for the reservation. If multiple crates with independent owners are eligible for the same prefix, the owner with the most crates starting with that prefix is granted it. Existing crates not owned by the prefix holder are either grandfathered in or (better) redirected to a new name, with the old name becoming a reserved/dead name until the original owner surrenders it.
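
The "owner with the most crates" rule from the last bullet could be sketched roughly as follows. The function name `prefix_claimant` and the `(crate name, owner)` input shape are assumptions for illustration. The list above does not say what happens on a tie, so this sketch returns `None` in that case and leaves it to manual intervention.

```rust
use std::collections::HashMap;

/// Given `(crate name, owner)` pairs, decide who would be granted `prefix`:
/// the owner with the most existing crates named `<prefix>-*`.
/// Returns `None` if no crate uses the prefix or if the top count is tied.
pub fn prefix_claimant<'a>(crates: &[(&str, &'a str)], prefix: &str) -> Option<&'a str> {
    let wanted = format!("{}-", prefix);

    // Count crates carrying the prefix, per owner.
    let mut counts: HashMap<&'a str, usize> = HashMap::new();
    for (name, owner) in crates {
        if name.starts_with(&wanted) {
            *counts.entry(*owner).or_insert(0) += 1;
        }
    }

    // Find the highest count, then make sure exactly one owner has it.
    let best = counts.values().copied().max()?;
    let mut leaders = counts
        .into_iter()
        .filter(|&(_, n)| n == best)
        .map(|(owner, _)| owner);
    let first = leaders.next()?;
    if leaders.next().is_some() {
        None
    } else {
        Some(first)
    }
}
```

Whether a crate named exactly like the prefix (tokio itself, say) should count towards the total is a design question the sketch leaves out; here only names of the form prefix-something are counted.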

This probably requires more thought, but, something like that should work and would be backwards compatible as @kornel already pointed out.

Reserving prefixes seems to me a reasonable alternative and would give @retep998 the ability to get all the winapi- crates without having to make them all (perhaps moot since the crates already exist now…).

This doesn’t really require namespacing.

The aggressive form of “ad hoc namespaces” would be to prevent publishing any crate matching /([^-_]*)?[-_].*/ where the crate $1 already exists.
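
As a rough sketch of that rule using only the standard library (no regex crate): `ad_hoc_namespace` extracts what the pattern calls $1, and `may_publish` applies the check against a map of existing crates. Both names are hypothetical, and the exemption for the owner of the root crate is an added assumption rather than part of the rule as stated.

```rust
use std::collections::HashMap;

/// The text before the first `-` or `_` in a crate name, i.e. capture
/// group 1 of /([^-_]*)?[-_].*/. Returns `None` if there is no separator.
pub fn ad_hoc_namespace(name: &str) -> Option<&str> {
    name.find(|c: char| c == '-' || c == '_')
        .map(|idx| &name[..idx])
}

/// `existing` maps already-published crate names to their owner
/// (a hypothetical model of the registry).
pub fn may_publish(existing: &HashMap<String, String>, name: &str, user: &str) -> bool {
    match ad_hoc_namespace(name).and_then(|ns| existing.get(ns)) {
        // The root crate exists: assume only its owner may publish under it.
        Some(owner) => owner.as_str() == user,
        // No separator, or no crate named after the prefix: unrestricted.
        None => true,
    }
}
```

For example, if tokio is owned by alice, then tokio-core and tokio_extras are only publishable by alice, while newthing-core stays open because no crate named newthing exists.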

It does probably make sense to limit namespace ownership to some value n per linked account, though. But I think it could (and maybe should) require ownership of the crate to claim that crate’s namespace.

Keep in mind that not everybody wants to control their prefix. I explicitly do not want a crate being called diesel- to be restricted to projects maintained by Diesel. There are dozens of crates that other people have made called diesel-* or diesel_* to extend Diesel in various ways, and I would have no interest in preventing them from existing or using that name. Indeed, by not having an explicit namespace which implies control by a person or group, people are able to more easily find crates that are meant to extend or integrate with Diesel, without those projects having to be blessed by the Diesel project.

Ideally, those would be called: somebodyelsesprefix-cratehelper-diesel. So, if I as the owner of the “cookiemonster-” prefix wanted to create a diesel helper that targeted the “countchocula” database, I’d name it:

  • cookiemonster-countchocula-diesel

Now, I know that isn’t how things have been, but it would be a better place to be (IMHO).