Pre-RFC: User namespaces on crates.io

Putting aside the UX and aesthetics of the syntax, let me go back to my second bullet point and conclusion:

"Adding new syntax or other things that require changes to the module system significantly complicates the problem, the number of touch points in the overall project, and should probably be avoided. [...] A simple namespacing solution would be one which adds an immutable namespace identifier to the crates.io database [...] and ideally doesn't touch the module system or require changes to anything other than crates.io."

You claim your RFC is "precisely what [I] ask(ed) for", but this looks like new syntax?

I think it would certainly require changes to Cargo and anything which consumes Cargo metadata:

error: failed to parse manifest at `Cargo.toml`

Caused by:
  invalid character `~` in dependency name: `~github:1234/somepkg`, the first character must be a Unicode XID start character (most letters or `_`)

One note about UX/aesthetics, or more specifically this:

I think it's very important that namespaces have human meaningful names, not (potentially large) numbers. This is why we have DNS instead of just using IP addresses. For new users, a GitHub user ID looks like 60937195. Having crates with numbers like that in them rather than a human-meaningful name means that humans can no longer remember package names, which is a huge step backwards in UX.

Moderation note: Folks, this conversation is not headed in a good direction. Before commenting, please consider whether your contribution meaningfully advances the discussion.

I think the big takeaway from this discussion and RFC has been heard: communication is hard, doubly so for contentious issues such as this one. In future proposals, it would probably be a good idea to prioritize reaching consensus on a high-level design with the crates.io team before plodding through the details.

My proposal changes crates.io package names, not crate names. The Cargo.toml file would look like this, which parses fine:

[dependencies]
something = { package = "~github:1234/something" }

There are no changes required to Cargo or rustc to accept this new package name syntax.

I don't think package names need to be memorable, or user-meaningful. Given that users can still upload packages without namespaces, I think this feature would be primarily used by people who want to upload packages where their name is already "taken" by an existing registration.

My plan for my own packages, should I need to put them on crates.io, is to use UUIDs. The UX isn't as good because crates.io and docs.rs treat the package name as the crate name, but at least the library would still be accessible to projects that are required to register with crates.io.

Just because Cargo doesn't error during parsing doesn't mean this requires no changes to it. As far as I can tell, the current Registry Index Format Specification cannot represent a package name containing the character /. As a test, I published a registry in which / is treated as the usual directory separator, and installing from it fails with a weird error (I expect any fix for this would be to error earlier, since the protocol does not support it):

[dependencies]
bs58.version = "0.3.1"
bs58.package = "~github:1234/something"
bs58.registry-index = "https://ipfs.io/ipfs/QmVbNCcr9XMxZjf744qYsDjfgtKCcpv6BL4RpuKTvEnXCP"
> cargo build --config net.git-fetch-with-cli=true -Z unstable-options
    Updating `https://ipfs.io/ipfs/QmVbNCcr9XMxZjf744qYsDjfgtKCcpv6BL4RpuKTvEnXCP` index
  Downloaded ~github:1234/something v0.3.1 (registry `https://ipfs.io/ipfs/QmVbNCcr9XMxZjf744qYsDjfgtKCcpv6BL4RpuKTvEnXCP`)
error: No such file or directory (os error 2)

Trying to force in a file with a / in the name isn't allowed by git (and would be a really bad idea anyway):

> git mktree
040000 tree ae1a8b238b69ecd9da638f6426099c9ca8a43b31^I~github:1234/something
fatal: path ~github:1234/something contains slash
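
For anyone following along, here is a rough sketch of why the slash is a problem at the index level. It assumes the directory layout described in Cargo's registry index documentation (`1/`, `2/`, `3/{first char}/`, otherwise `{first two}/{next two}/`); `index_path` is a hypothetical helper written for illustration, not Cargo's actual code.

```rust
use std::path::PathBuf;

/// Relative path of a package's index file, assuming the documented
/// registry index layout. Hypothetical helper, not taken from Cargo.
fn index_path(name: &str) -> PathBuf {
    // Index file names are lowercased.
    let lower = name.to_lowercase();
    let chars: Vec<char> = lower.chars().collect();
    let prefix: String = match chars.len() {
        0 => String::new(),
        1 => "1".to_string(),
        2 => "2".to_string(),
        3 => format!("3/{}", chars[0]),
        _ => format!(
            "{}/{}",
            chars[0..2].iter().collect::<String>(),
            chars[2..4].iter().collect::<String>()
        ),
    };
    // The package name is meant to be a single file name under the prefix.
    PathBuf::from(prefix).join(lower)
}

fn main() {
    let ok = index_path("bs58");
    let bad = index_path("~github:1234/something");
    println!("{} -> {} components", ok.display(), ok.components().count());
    println!("{} -> {} components", bad.display(), bad.components().count());
}
```

On a Unix-like system the ordinary name yields three path components (two prefix directories plus the file), while the slashed name yields four: the part before the / silently becomes an extra directory, which is exactly the entry git refuses to store as a single name.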

Using some other character as the separator might be possible, but testing that also fails, this time with a Cargo internal error. I believe this is a bug and that it should work, since the protocol places no limits on the characters allowed in package names (other than those implied by using git as the transport):

[dependencies]
bs58.version = "0.3.1"
bs58.package = "~github:1234|something"
bs58.registry-index = "https://ipfs.io/ipfs/Qmeuz3m8QWdRahB8Vvgk9R5kvASEvrnNZGaFyy3WNJqSz7"
> cargo build --config net.git-fetch-with-cli=true -Z unstable-options
    Updating `https://ipfs.io/ipfs/Qmeuz3m8QWdRahB8Vvgk9R5kvASEvrnNZGaFyy3WNJqSz7` index
  Downloaded ~github:1234|something v0.3.1 (registry `https://ipfs.io/ipfs/Qmeuz3m8QWdRahB8Vvgk9R5kvASEvrnNZGaFyy3WNJqSz7`)
error: failed to find ~github:1234|something v0.3.1 (registry `https://ipfs.io/ipfs/Qmeuz3m8QWdRahB8Vvgk9R5kvASEvrnNZGaFyy3WNJqSz7`) in path source
note: this is an unexpected cargo internal error
note: we would appreciate a bug report: https://github.com/rust-lang/cargo/issues/
note: cargo 1.44.0-nightly (390e8f245 2020-04-07)

(Though I'm not sure how cross-platform the character : is in filenames; I could see this being impossible to use on Windows.)
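
A quick way to check a candidate name against the characters Windows reserves in file names (`< > : " / \ | ? *`) is sketched below. `windows_unsafe_chars` is a hypothetical helper, and the sketch ignores other Windows restrictions such as control characters and reserved device names.

```rust
/// Characters that Windows does not allow in file or directory names.
const WINDOWS_RESERVED: &[char] = &['<', '>', ':', '"', '/', '\\', '|', '?', '*'];

/// Returns the distinct reserved characters found in `name`, in order of
/// first appearance. Hypothetical helper for illustration only.
fn windows_unsafe_chars(name: &str) -> Vec<char> {
    let mut found: Vec<char> = Vec::new();
    for ch in name.chars() {
        if WINDOWS_RESERVED.contains(&ch) && !found.contains(&ch) {
            found.push(ch);
        }
    }
    found
}

fn main() {
    for name in ["bs58", "~github:1234|something", "~github:1234/something"].iter() {
        println!("{} -> {:?}", name, windows_unsafe_chars(name));
    }
}
```

Note that under this check the | separator is flagged as well as the :, so neither variant of the name would unpack into a directory on Windows.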

Ah, I forgot to change the package name in one more place. Cargo does restrict the package name more than git does: updating the registry again to

bs58.registry-index = "https://ipfs.io/ipfs/QmPyWhUY98UFQyKRHTBgGJXXtdNPXCBMHXrrFD9QdU4iGT"

gives an error:

error: failed to parse manifest at `/Users/nemo157/.cargo/registry/src/ipfs.io-ee09e7da863f2c3c/~github:1234|something-0.3.1/Cargo.toml`

Caused by:
  invalid character `~` in package name: `~github:1234|something`, the first character must be a Unicode XID start character (most letters or `_`)
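
To make the rule in that error message concrete, here is a minimal sketch of a check along the same lines. It is an approximation: `char::is_alphabetic` and `char::is_alphanumeric` stand in for the Unicode XID_Start and XID_Continue properties (the standard library does not expose those), and `check_package_name` and its messages are hypothetical, not Cargo's implementation.

```rust
/// Approximate package-name check modeled on the quoted error message.
/// Assumption: alphabetic/alphanumeric are used as stand-ins for the
/// Unicode XID_Start / XID_Continue properties.
fn check_package_name(name: &str) -> Result<(), String> {
    let mut chars = name.chars();
    match chars.next() {
        None => return Err("name is empty".to_string()),
        Some(first) if !(first.is_alphabetic() || first == '_') => {
            return Err(format!("invalid first character `{}` in `{}`", first, name));
        }
        Some(_) => {}
    }
    // Remaining characters: letters, digits, `_` or `-`.
    for ch in chars {
        if !(ch.is_alphanumeric() || ch == '_' || ch == '-') {
            return Err(format!("invalid character `{}` in `{}`", ch, name));
        }
    }
    Ok(())
}

fn main() {
    for name in ["bs58", "~github:1234|something", "github-1234-something"].iter() {
        println!("{} -> {:?}", name, check_package_name(name));
    }
}
```

Under these assumptions the ~ is rejected as a first character, and even without it the : and | would be rejected later in the name, so a scheme like this would need the validation itself to change.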

Sonatype has a full-time staff which manually curates groupIds and promises a maximum turnaround time of two days when submitting new ones.

I signed up for a groupId in OSSRH in 2017, based on a domain I control. It was processed by a human (JIRA ticket ID OSSRH-30792) and the process took about 2 days.

Regardless, OSSRH's groupIds are at least in part manually curated by a full-time staff at Sonatype.

Please calm down.

The article that @bascule links is still part of the official, current documentation (linked from https://central.sonatype.org/pages/ossrh-guide.html as "Why the wait?"). So its publishing date is not that important.

The ticket you link is for claiming a subpath of com.github., where the author only has to prove that they are indeed in control of the account.

Your anecdotal evidence may differ, but Sonatype still documents what @bascule describes and has experienced, so dismissing their evidence as "anecdotal" is unfair: they went through the documented process.

For what it's worth, I've been chatting with people about optional crate-based namespaces for years, and gotten mostly positive responses from people on the teams as well as community members. A lot of projects have wanted it for basically being able to signal ownership in an unspoofable way (e.g. async/foo, hyper/foo, icu4x/foo, tokio/foo).

The goal is explicitly not to solve general squatting (just squatting/overlap of prefix-foo crates). There are some open questions about how stuff gets imported but it's all solvable.

I've been intending to pre-RFC this after incorporating all the feedback from team members and the community (i.e. something that is actually likely to merge!), but every time the topic of namespacing comes up it devolves into an unconstructive mess, where everyone wants different things and people have strong opinions.

I have more time on my hands now and was intending to start drafting this, but honestly, after seeing this discussion, I feel pretty reluctant again. I might do it anyway, but please, please be more constructive in such discussions.

Created: 04/23/17 10:39 PM Resolved: 04/24/17 11:07 PM

So a little over 24h from the time the ticket was opened to resolution.

Last thing I'll say on this thread: the whole point is that OSSRH works by a JIRA ticket system. crates.io does not have one, nor does it have staff to provide this kind of support. It needs fully automated, immutable solutions.

Package systems like Maven Central, which require this sort of human support staff, aren't good examples for proposed changes to crates.io.

The ticket was created on 04/23/17 10:39 PM, with a response by a human at Sonatype by 04/23/17 10:39 PM asking for user action (first release + a comment). This happened at 04/25/17 04:06 PM. This is "about 2 days". The last action may be in the hands of the requester, but it's still part of the process. You could argue that the reviewer's part is the end of the process on Sonatype's side, but in that case you'd need to say so here, instead of just claiming that @bascule is factually wrong.

In general, looking at "namespace" tickets in the repository does show that Sonatype is rather fast (good!), but quite a lot of them have had manual intervention, especially if people don't own a domain or need access to a namespace on behalf of their employer.

Moderator note: Thanks for the extra details, everyone, but let's not dive any further into Sonatype's exact process unless absolutely necessary. Please steer this back to specific lessons/suggestions for the Rust project.

It does not work on Windows, as I found out pretty painfully a few months ago (for something totally unrelated).

I've read this thread, and I do not think they have been shown incorrect at all. I wonder what you are referring to here.

Moreover, "names must be immutable" is a value proposition, a design goal, it's not per se a statement that can be true or false. I doubt you'll convince anyone by telling them that their design goal is "incorrect". The benefits of this design goal have been stated repeatedly (most importantly: no "left-pad"-style issues), and I haven't seen good arguments against it in this thread. You seem to value proper immutable names less than other concerns, which is fair, but what is not fair is expecting everyone to agree with that.

Anything DNS-based will either end up in a situation where you own a DNS name but not the corresponding cargo namespace (as a previous owner of that hostname already used the cargo namespace), or else ownership of a package changes suddenly and without an action by the previous owner when ownership of the DNS name changes. Ergo, using DNS is a no-go for an immutable registry like crates.io.
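
To illustrate the first half of that dilemma, here is a toy sketch of a first-claim-wins namespace table. Everything in it (`Namespaces`, `claim`, the account numbers) is hypothetical and not crates.io's schema; it only assumes that a namespace, once claimed, can never be reassigned.

```rust
use std::collections::HashMap;

/// Hypothetical first-claim-wins namespace table (not crates.io's schema).
struct Namespaces {
    owners: HashMap<String, u64>,
}

impl Namespaces {
    fn new() -> Self {
        Namespaces { owners: HashMap::new() }
    }

    /// Immutable policy: once a namespace has an owner, nobody else can claim it.
    fn claim(&mut self, namespace: &str, account: u64) -> Result<(), String> {
        if let Some(&owner) = self.owners.get(namespace) {
            if owner != account {
                return Err(format!("`{}` already belongs to account {}", namespace, owner));
            }
        }
        self.owners.insert(namespace.to_string(), account);
        Ok(())
    }
}

fn main() {
    let mut ns = Namespaces::new();
    // Account 1 controls example.com today and claims the namespace.
    println!("{:?}", ns.claim("example.com", 1));
    // The domain later changes hands; account 2 now controls the DNS name
    // but is refused the namespace.
    println!("{:?}", ns.claim("example.com", 2));
}
```

The only way to let the second claim succeed is to overwrite the existing owner, which is the other half of the dilemma: package ownership changes without any action by the previous owner.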

I read and quoted that comment. It did not contain any arguments for how to achieve immutable names with DNS, or for why using DNS has sufficiently many benefits to make it worth dropping immutable names.

One problem with a DNS-based solution: domain names cost money. Would we really be comfortable telling someone that if they want a namespace, they need to pay up?

Link? I don't recall seeing that anywhere in this thread, and imo it's a major issue.

There are plenty of public suffixes, like github.io, neocities.org, and js.org, for anyone who wants a domain for free.

I'm not seeing a good rationale for using domain names. If a crate namespace owner has to prove that they own a particular web domain, what problem does that solve? How is it better than just using arbitrary identifiers?