Pre-RFC: Formal squatting policy on crates.io

Alternatively, make packages behave more like namespaces, so that whoever publishes foobar is the only one who can publish foobar/xyz, or find another way to combine different levels of packages. That way, orgs like amethyst would get a reasonable naming scheme without having to worry about conflicting names or other published packages using that naming scheme.

2 Likes

That doesn't solve the issue of users squatting crates the way swmon is, though.

who cares? all their stuff gets moved to swmon/ and we forget about them.

1 Like

No one seems to have any objections to the three criteria proposed... It all seems to be "why even have rules if people will get around them?". And then people bring up completely unrelated issues such as hijacking, as if someone couldn't go buy a random crate name and execute the imagined attack right now.

1 Like

Any policy for forcibly transferring ownership of crates from their current owners has security implications which you are blowing off, much in the same way you were accusing others of blowing off a potential namesquatting policy because it's trivially circumventable.

...is not an excuse for ignoring the security implications of a namesquatting policy. Claiming we should ignore a threat because another threat of similar severity exists is security nihilism.

6 Likes

This squatting issue has gone in circles already, so here's a fast-forward:

  • There is value in short, memorable and relevant names. If you eliminate squatting by not letting anyone have them, then you haven't solved the problem, you've destroyed the value.

    • There's an infinite number of nonsensical meaningless names (like "nokogiri" or hashes), so using them is a way to get around squatters, but that is a loss of value in naming.
    • The crates ecosystem is very open-source-centric, so it is possible to cooperate to ensure that the crate with the best name is actually useful, and not a mine to avoid.
  • Namespacing moves squatting of crate names to squatting of namespaces. I'll register the "google/" namespace, and we're back to square one.

    • Using GitHub user/org names in crate names increases dependence on GitHub and effectively outsources moderation to them. Naming of Rust crates would be governed by an entity that doesn't care about Rust's problems.
    • User names don't make good crate names. Is it retep998/winapi or retep989/winapi? Do you really want to make remembering this mandatory for using Rust on Windows?
    • GitHub user and org names can change (only internal ID is stable), and old names become available for registration again. This creates a huge problem when someone changes their name on GitHub.
  • Malware will be removed by crates-io, so that isn't a motivation for anti-squatting policy. Security aspects are a huge can of worms themselves, and much wider than just typosquatting. Many of these things can be done and would be useful regardless of squatting policy.

25 Likes

Small note, “nokogiri” is not a nonsense name; it’s a kind of Japanese saw. Japan is where Ruby originated, and parsing a string is kind of like chopping it up.

17 Likes

One could imagine a system that has

  • a nonsensical name for each crate
  • sensible->nonsensical mapping in each toml
  • a public curated sensible->nonsensical list + a lint/tool to check against it
  • local settings in $HOME folder for which list to check against

Compiler/cargo would use nonsensical names for dependency resolution. Users would always see sensible equivalents from either the curated list or from local overrides. Code would use sensible names from toml.

Re-assigning a sensible curated name would be rare and require a heavy process akin to RFC. It would be possible to quickly replace a mapping with a stop sign in case of malicious code detection.
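
A minimal sketch of the lint described above, purely for illustration. It assumes the curated list can be modelled as a map from sensible names to opaque crate IDs, with a `Blocked` entry standing in for the "stop sign". The names `Curated`, `Finding`, `lint_dependency` and `lint_manifest` are hypothetical and not part of cargo or crates.io.

```rust
use std::collections::BTreeMap;

/// One entry in the (hypothetical) curated sensible -> nonsensical list.
#[derive(Debug, Clone, PartialEq)]
enum Curated {
    /// The sensible name currently points at this opaque crate ID.
    Id(String),
    /// The "stop sign": the mapping was pulled, e.g. after malicious code was detected.
    Blocked,
}

/// What the lint reports for a single dependency in a manifest.
#[derive(Debug, PartialEq)]
enum Finding {
    /// The manifest agrees with the curated list.
    Ok,
    /// The manifest maps the sensible name to a different crate than the list does.
    Mismatch { expected: String },
    /// The curated list has put a stop sign on this name.
    Blocked,
    /// The curated list has no opinion on this name.
    Uncurated,
}

/// Check one sensible -> nonsensical mapping from a manifest against the curated list.
fn lint_dependency(
    sensible: &str,
    opaque_id: &str,
    curated: &BTreeMap<String, Curated>,
) -> Finding {
    match curated.get(sensible) {
        Some(Curated::Id(expected)) if expected.as_str() == opaque_id => Finding::Ok,
        Some(Curated::Id(expected)) => Finding::Mismatch {
            expected: expected.clone(),
        },
        Some(Curated::Blocked) => Finding::Blocked,
        None => Finding::Uncurated,
    }
}

/// Check every mapping in a manifest, in name order.
fn lint_manifest(
    manifest: &BTreeMap<String, String>,
    curated: &BTreeMap<String, Curated>,
) -> Vec<(String, Finding)> {
    manifest
        .iter()
        .map(|(name, id)| (name.clone(), lint_dependency(name, id, curated)))
        .collect()
}
```

In this sketch a local override in $HOME would simply be a different `curated` map passed to the same functions, and dependency resolution would only ever see the opaque IDs.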


Curation could be assisted by/augmented by/replaced with information on how many other crates refer to nonsensical name X under sensible name Y. Perhaps some measure of crate popularity/ranking could be taken into account. In any case users should be free to replace this with an alternative mechanism of their choosing.

2 Likes

Personally, I would still be interested in someone taking the time to write an exhaustive analysis and I still question the value of continuing another spitballing/pile-on thread on the subject.

2 Likes

I just re-read most of that thread, but I didn't see any direct responses to any of kornel's namespacing bullet points (and about half of them didn't even get mentioned). Plus, despite having read most of the past threads on namespaces and squatting, I've never gotten the impression that any of these problems are uncontroversially "solved" by other communities, in the sense that there's no debate to be had anymore. It's obvious that other communities have taken other approaches, but there was all sorts of debate not only over whether their decisions were right for them, but also over which of their successes and failures would translate to Rust's ecosystem.

In other words, I don't think the situation is anywhere near simple enough to justify being so dismissive about these objections.

(personally, the arguments that namespaces wouldn't really accomplish much seem solid to me, but for squatting I've never been able to form an opinion either way; there's decent arguments in a million different directions and no clear way of breaking that logjam)

6 Likes

For everyone: please keep in mind my original proposal. Regardless of your preference, namespaces do not currently exist. Squatting does. I put forward three criteria that would need to be met for a crate to be transferred. Please address that and stay with the original topic.

6 Likes

Nailed it. This is what makes these particular discussions so exhausting:

  • Claims that it's "easy" to solve problem X if you just do Y, and that everything works great for project Z which chose to go in the proposed direction
  • Disregard for:
    • the voluminous past discussions on the topic
    • considerations of the decisions already made and the need to continue to support those, a.k.a. "legacy considerations"
    • the associated drawbacks of the proposed approach along with issues raised and re-raised in spite of all of that

That's also to say: I have also seen these things done right in other threads, which do a good job of summarizing past issues and discussion, try to introduce something novel or address previously raised problems, and are pragmatic about the pros and cons. I can't say I'm seeing that here.

+1 to both

7 Likes

I should clarify, I'm of the opinion that without

the only thing which is possible is

1 Like

At this point I'll quote josh:

In particular, see this comment that gives the team's thinking (as of January 2019) and notes outstanding issues. The comment following it lays out some additional concerns.

4 Likes

I agree that any attempt to rigorously define namesquatting is an open invitation to come up with crates that are squatting on some name without violating the letter of the new rule. "I'll know it when I see it", possibly delegated to some third party that is not interested in securing the name for themselves or even the community, would be a more useful rule.

I am more interested in what would happen to the "liberated" crate names. Will names be confiscated on demand or on a regular basis? If it's the latter, there's nothing stopping namesquatters from taking them again under a different username. If it's the former, who will determine which user will get the name if multiple users request the name at approximately the same time?

But your "original proposal" is not original. This topic has been discussed extensively and contentiously. In those discussions the difficulty in setting criteria for squatting or the impracticality of proposed moderation measures have led many to back solutions in which namespacing is a component. Even if one doesn't share that view, the conversation can't be ignored. Insisting that we only discuss your proposal is wanting to center your perspective without considering and learning from the long debate on this topic -- especially since nothing you've said is new to that discussion.

Which is to say, I think the minimum barrier to raising this discussion on IRLO should be providing an exhaustive summary, as @anp suggests, and then explaining how the post adds to, resolves, reframes, or refutes some significant part of that past discussion. Otherwise, I think these posts should be summarily closed with a pointer to those minimum criteria.

[Note this is my personal opinion and not the opinion of the crates.io team]

I agree with many on this thread who say your proposal is trivial to bypass, and I don't see a reason why adopting such a policy would be good for the crates.io ecosystem. Anyone newly squatting crates wouldn't be affected by it, since they could just add a public item to their crate. I doubt it would affect old squatters either: most of the big squatters did that intentionally and, I guess, are still active online, just choosing not to reply to emails about crate transfers. After reading this post, they could simply update all their crates with a script that adds a public item to each.
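
To illustrate how small that update would be: assuming, as this post implies, that the proposed criteria hinge on whether a crate exposes any public item, the entire `src/lib.rs` of a squatted crate could look like the sketch below. The constant name and value are made up for the example.

```rust
// Hypothetical src/lib.rs of a squatted crate after a scripted update.
// The single public item below exists only so the crate is no longer "empty";
// it provides no functionality.
pub const PLACEHOLDER: &str = "reserved";
```

A crate like this is no more useful than an empty one, which is why a purely mechanical criterion is easy to satisfy without changing the squatter's behavior.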

I don't personally think this policy would be useful, but I'd love for eRFC 2614 to be picked up again.

5 Likes

By "meaningless" I mean it's not possible to guess what the crate is for from its name. It's like a one-way hash: only after you know the answer, you can see how it's connected to the name.

It's cute when a couple of popular crates do this, but it's not a scalable solution:

12 Likes

This is why we have a description field that shows up in search results:

Which is important even if you could name your crate after what it does, to resolve ambiguity. Is the hash crate for the data structure or the one-way function?

4 Likes

Descriptions are of course helpful, but there are other places where crate names are used. Clear names have the benefit of not needing to be followed by a description everywhere.

Not all names are clear, but that doesn't mean that clear names aren't valuable.

10 Likes