Pre-RFC: Formal squatting policy on crates.io

jhpratt · November 15, 2019, 8:57pm

Currently, crates.io has no policy on squatting (ref):

We do not have any policies to define 'squatting', and so will not hand over ownership of a package for that reason.

Surely it makes sense to allow someone to use the crate name if an existing "crate" is clearly not being used and there is no apparent attempt at doing so. Off the top of my head, here are some basic criteria:

Unquestionably useless — zero useful behavior. A trivial check for this is if there are no public exports.
Most recent publish at least six months ago. Combined with an inactive public repository, if one exists, this would indicate a lack of development.
Crate owner cannot be reached, or was reached and did not respond within one month.
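
The three criteria above could be combined into a single check, roughly like this. It is only a sketch: `CrateSnapshot` and its fields are made up for illustration and do not correspond to anything crates.io actually exposes, and the day counts are my approximation of "six months" and "one month".

```rust
/// Hypothetical snapshot of what a registry might know about a crate.
/// None of these fields map to a real crates.io API.
struct CrateSnapshot {
    /// Number of public items exported by the newest version.
    public_exports: usize,
    /// Days elapsed since the most recent publish.
    days_since_last_publish: u32,
    /// Days since the first attempt to contact the owner,
    /// or `None` if nobody has tried yet.
    days_since_contact_attempt: Option<u32>,
    /// Whether the owner replied to that attempt.
    owner_responded: bool,
}

/// "Six months" and "one month", approximated in days (an assumption).
const STALE_AFTER_DAYS: u32 = 183;
const CONTACT_WINDOW_DAYS: u32 = 30;

/// Returns true only when all three baseline criteria hold at once.
fn meets_baseline_criteria(c: &CrateSnapshot) -> bool {
    let useless = c.public_exports == 0;
    let stale = c.days_since_last_publish >= STALE_AFTER_DAYS;
    let unreachable = match c.days_since_contact_attempt {
        Some(days) => !c.owner_responded && days >= CONTACT_WINDOW_DAYS,
        // No contact attempt yet, so the one-month clock has not started.
        None => false,
    };
    useless && stale && unreachable
}
```

Requiring all three conditions together keeps the check conservative: a crate with even one public export, or an owner who answers, is left alone.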

Excluding the last point, this would include every crate by swmon, which I think is a good baseline. I'd think that it would make sense to have the potential new author contact the crates team, which would then start the one-month time period for attempted contact.

Let me know your thoughts — I certainly think something needs to be done in this area.

17cupsofcoffee · November 15, 2019, 9:09pm

This has been discussed many times over the years - the main issue that comes up is how easy it is for the squatters to move the goalposts:

If you have a complexity check, they can just export a trivial function or two.
If you require publishing every six months, people will just push another version every six months.

This leaves you in a situation where there's still squatting, but legitimate crate authors have to jump through hoops to keep their crates.

This isn't to say I'm against the idea of a better policy (there are some cases of squatting that are completely blatant), but the reason it's not been done until now isn't because no one cares, it's because it's a hard problem to solve in a way that'll make everyone happy.

RustyYato · November 15, 2019, 9:11pm

Previously on crate squatting:

Some Pre-RFCs to deal with it

Some related incidents

jhpratt · November 15, 2019, 9:12pm

@17cupsofcoffee Absolutely it's possible to move the goalposts and trivially bypass these requirements. My suggestion is to create the absolute minimum that most people would agree would constitute "squatting". Further requirements could come in the future, for sure.

bascule · November 15, 2019, 9:17pm

Even if you manage to accomplish these things:

Formalize a namesquatting policy (yours seems rather complex)
Find people willing to review disputed crates under a namesquatting policy (either volunteers willing to put in a lot of time / deal with spam, or paid to review crates via ??? funding source)

...those people then become a potential point-of-attack to perform crate hijacking / software supply chain attacks. So not only do they have to give up time (or you have to find some way to pay them for their time), but they need to follow good security practices or they put the crate ecosystem in danger.

Is namesquatting really such a problem that you think it makes sense to potentially endanger the entire crates.io ecosystem with increased risks of software supply chain attacks in order to solve it?

If you try to automate it by performing mechanical checks for something like "no public exports", then crate squatters can just export a public stub, and you've accomplished nothing.

jhpratt · November 15, 2019, 9:27pm

As I said in my previous response, this is intended to be a baseline. Yes, it is trivial to bypass by having a public export and/or updating on a regular basis. Many crates that are squatted do not do this, and I suspect that swmon (as an example) won't be going through to update their 100+ crates if a policy were put in place, given that there appears to be no way to contact them to take over the crate anyways (contrary to what the README says on crates.io).

This is by no means intended to be a catch-all for squatting; rather, it should be unquestionably clear as day that the name is being taken for no purpose.

I believe most of this could be automated to some level. Checking for zero public exports is doable, as is the six-month update requirement. At that point, a "claim this crate" form/link/whatever could be put up, letting a user initiate an internal process. The server could send the crate author an email (getting it from GitHub if available?), letting them know that if they don't respond, the crate will be transferred.
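
To make the "claim this crate" flow a bit more concrete, here is a rough sketch of it as a small state machine. The names `ClaimState`, `ClaimEvent`, and `step` are invented for this example, and the length of the waiting window is passed in rather than fixed, since nothing here is decided.

```rust
/// Hypothetical states of one pending claim on a crate name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClaimState {
    /// The owner has been emailed and the clock is running.
    AwaitingOwner { days_waited: u32 },
    /// The owner replied in time, so the claim is dropped.
    KeptByOwner,
    /// The window elapsed with no reply, so the name changes hands.
    Transferred,
}

/// Hypothetical events that move a claim forward.
enum ClaimEvent {
    DayPassed,
    OwnerResponded,
}

/// Advances a claim by one event. Finished claims never change again.
fn step(state: ClaimState, event: ClaimEvent, window_days: u32) -> ClaimState {
    match (state, event) {
        (ClaimState::AwaitingOwner { .. }, ClaimEvent::OwnerResponded) => {
            ClaimState::KeptByOwner
        }
        (ClaimState::AwaitingOwner { days_waited }, ClaimEvent::DayPassed) => {
            let days_waited = days_waited + 1;
            if days_waited >= window_days {
                ClaimState::Transferred
            } else {
                ClaimState::AwaitingOwner { days_waited }
            }
        }
        (finished, _) => finished,
    }
}
```

Any reply from the owner ends the claim, which keeps the automated path limited to crates nobody is answering for.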

How would this allow for hijacking? Perhaps every version published should be required to be empty? That would eliminate any crate that used to be useful but no longer is.

anp · November 15, 2019, 9:59pm

Given this awesome and very long list of previous discussions I would expect any new proposal to summarize the current status and address outstanding points raised previously. This is an expensive discussion to revisit from scratch.

naftulikay · November 15, 2019, 10:50pm

Yes, this topic has nearly been discussed to death already, and there aren't solutions that address everyone's concerns.

Since PyPI is also a victim of similar squatting issues, I've specifically opted out of hosting there on many of my projects from time to time; up until fairly recently, PyPI's uptime has also been a significant issue, which thankfully crates.io does not experience. In any case, if you really want to dodge the squatting issue or work with a forked crate, you can always bypass crates.io and link directly to the Git repository of the software you want.

josh · November 15, 2019, 10:57pm

RFC 2614 provides a first step towards this. It lost momentum, and needs reviving.

derekdreery · November 16, 2019, 8:04am

It seems like the issue of bad actors on package sites is getting worse more generally: https://hacks.mozilla.org/2019/11/announcing-the-bytecode-alliance/

kornel · November 16, 2019, 12:06pm

Like porn, squatting is very difficult to strictly define in legal terms, but not as hard to identify when it happens. Keep in mind that squatters always know when they squat.

If a squatter takes the charade to the point of actually regularly publishing unique, working, useful code, then that's not squatting any more. But anything less than that can be spotted by a human.

I know crates-io doesn't have enough people to review all crates, but I think mere risk of a human reviewing a squatted crate and removing ownership dramatically changes the equation for squatters.

Currently crates-io gives squatters a guarantee of immunity. They can squat and keep the crate forever, no matter how obvious the squatting is. It's easy, and allows them to act in overtly malicious and anti-social ways, and know nothing will happen, because crates-io officially doesn't care.

But if crates-io merely said squatting is not allowed, and ownership may be revoked, users may be banned for it, then it's immediately harder to do anything "useful" with squatting. crates-io doesn't even have to enforce it at scale, merely give a credible threat. That will be enough to prevent squatting with an intent to sell later, because potential buyers would report the crate instead.

josh · November 16, 2019, 1:27pm

lnicola · November 16, 2019, 4:43pm

I fully agree with you. If "I know it when I see it" works for porn in a legal context, I don't see why we can't have it as a policy for crates.io.

derekdreery · November 16, 2019, 6:31pm

I fully understand the reasons for not creating policy to address the issue of squatting and pro-actively making it harder for bad actors (as opposed to re-actively removing malicious crates), as these activities would consume significant resources. Rather than solutions, I offer some thoughts about the current state of play:

I notice that npm have moved in the direction of namespaces (e.g. @babel/core) which we could do in rust/cargo.
Progress on sandboxing procedural macros and build scripts is very welcome.
It's actually quite hard to view the source that you have downloaded from crates.io. Maybe there could be a cargo tool that shows you it easily.
My current way of working around this problem is that I have a bit of a web of trust in that I recognize certain authors, and trust them to check their dependencies, but mostly I just ignore the problem and hope for the best, which is not a great state of things.
A minimal crate resolution strategy could help to make sure you don't pull in a new malicious version of a package just by running cargo update. Also, community-curated collections of (crate, version) sets where there is only one version included per crate (known as package sets) that are known to work together could provide a kind of "LTS" for packages, meaning that once checked by the community (or a WG or similar), stay immutable and trusted. I've used these with purescript and they're awesome not only for security, but also because you don't end up in dependency version hell. A good description of what they are can be found on the purescript package-set repo.
If you trust the authors of the library, use dynamic versioning; if you don’t, use a fixed version.

- My recipe against dependency hell
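
To illustrate the minimal resolution point from the list above, here is a small sketch comparing "pick the lowest compatible version" with "pick the highest compatible version". The compatibility rule is a deliberately simplified caret-style check that assumes major version 1 or higher, and `resolve_minimal` / `resolve_maximal` are names made up for the example, not Cargo internals.

```rust
/// (major, minor, patch); tuples compare lexicographically.
type Version = (u32, u32, u32);

/// Simplified caret-style rule: same major version, not older than `req`.
/// Assumes major >= 1; real semver rules for 0.x versions differ.
fn compatible(req: Version, candidate: Version) -> bool {
    candidate.0 == req.0 && candidate >= req
}

/// Minimal strategy: the lowest published version that satisfies `req`.
fn resolve_minimal(req: Version, available: &[Version]) -> Option<Version> {
    available.iter().copied().filter(|&v| compatible(req, v)).min()
}

/// Maximal strategy: the highest published version that satisfies `req`.
fn resolve_maximal(req: Version, available: &[Version]) -> Option<Version> {
    available.iter().copied().filter(|&v| compatible(req, v)).max()
}
```

With the minimal strategy, publishing a new compatible release does not change what gets selected, whereas the maximal strategy picks it up on the next update. That is the property that would keep a freshly published malicious version from being pulled in automatically.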

chrisd · November 16, 2019, 6:51pm

I think the issue here is crates using a name but not doing anything with it. IMHO, dealing with crates that are actively malicious or otherwise untrustworthy should be treated as a separate issue from squatting.

The problem here, as I understand it, is only that a name is unusable due to squatting and it can also add noise to search results.

derekdreery · November 16, 2019, 7:25pm

I think there is some overlap, both in terms of the things themselves (malicious activity and squatting) and in terms of their mitigations. For example the @babel/core namespacing style both helps work around squatting, and means that you don't need to allow transfers in crate ownership (just fork the crate and release it with a different prefix), thereby also helping with security.

Just an idea off the top of my head: If you had both namespaced packages and package sets, then you could have package sets with the additional restriction that only one package with a given name (without the prefix) is in the set, and then you wouldn't have to include the prefix when selecting packages from the set.
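
A quick sketch of how that restriction might look, assuming names of the form `@prefix/name` as in the npm example. `build_package_set` is a hypothetical helper, not part of any existing tool.

```rust
use std::collections::HashMap;

/// Builds a lookup from bare package name to its full prefixed name.
/// Returns the clashing bare name as an error if two entries share one.
fn build_package_set(full_names: &[&str]) -> Result<HashMap<String, String>, String> {
    let mut set = HashMap::new();
    for &full in full_names {
        // Everything after the last '/' is the bare name; an entry
        // without a prefix is treated as its own bare name.
        let bare = full.rsplit_once('/').map_or(full, |(_, name)| name);
        if set.insert(bare.to_string(), full.to_string()).is_some() {
            return Err(bare.to_string());
        }
    }
    Ok(set)
}
```

Once a set builds successfully, a user could write just `core` and have it resolve to `@babel/core`, because the set guarantees no other `core` is present.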

bill_myers · November 16, 2019, 7:32pm

Could just go for the "perfect" solution of requiring all new crate names to be "<user_public_key_hash_in_base64>/<readable_crate_name>", where submitting requires providing a public key that hashes to the specified hash and signing the submission request with it.

In addition to making squatting no longer a meaningful concept, this also allows Cargo to get packages from untrusted registry servers, since the crate signature verification provides security itself (not against version downgrades, but the version field in Cargo.toml addresses that).
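
For what it's worth, the name check itself would be small. The sketch below covers only the "public key hashes to the prefix" part and leaves out signature verification entirely. The hash-and-encode step is passed in as a closure (`hash_to_text`, a name invented here) so the example does not pretend to pick a particular hash function or library.

```rust
/// Checks that `crate_name` has the form `<key_hash>/<readable_name>`
/// and that the prefix equals the encoded hash of `public_key`.
/// `hash_to_text` is a caller-supplied stand-in for "hash, then base64".
fn name_matches_key(
    crate_name: &str,
    public_key: &[u8],
    hash_to_text: impl Fn(&[u8]) -> String,
) -> bool {
    // Split at the LAST '/', because standard base64 output can itself
    // contain '/', while the readable name is assumed never to.
    match crate_name.rsplit_once('/') {
        Some((prefix, readable)) => {
            !readable.is_empty() && prefix == hash_to_text(public_key)
        }
        None => false,
    }
}
```

One wrinkle this surfaces: the standard base64 alphabet includes `/` and `+`, so a real scheme would probably want the URL-safe alphabet to keep names unambiguous.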

bascule · November 16, 2019, 9:49pm

The alternative to @org/package syntax I've seen suggested is a way to claim a myprefix-* or myprefix_* crate namespace, which would work with the existing module system.
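
As a sketch of what "claiming a prefix" might mean mechanically, assuming a claim on `myprefix` covers `myprefix-*` and `myprefix_*` but neither the bare name `myprefix` nor unrelated names that merely start with the same letters (`falls_under_prefix` is a made-up helper):

```rust
/// Returns true if `crate_name` falls inside the namespace claimed by
/// `claimed_prefix`, i.e. it is the prefix followed by '-' or '_' and
/// at least one more character.
fn falls_under_prefix(crate_name: &str, claimed_prefix: &str) -> bool {
    if claimed_prefix.is_empty() {
        return false;
    }
    match crate_name.strip_prefix(claimed_prefix) {
        Some(rest) => {
            (rest.starts_with('-') || rest.starts_with('_')) && rest.len() > 1
        }
        None => false,
    }
}
```

Requiring the separator is what stops a claim on `myprefix` from also swallowing something like `myprefixes`.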

HeroicKatora · November 16, 2019, 10:07pm

Alternatively, make packages behave more like namespaces, so that by publishing foobar, only you get to publish foobar/xyz, or find another way to combine different levels of packages. That way orgs like amethyst would get a reasonable naming scheme without having to worry about conflicting names or other published packages with that naming scheme.

jhpratt · November 16, 2019, 10:15pm

That doesn't solve the issue of someone squatting crates the way swmon is, though.
