Crates.io squatting

I think case by case is really the only way it can be done, but it can be augmented by having a reporting function on crates.io that would make it easier for the folks deciding to focus their attention.

People have aired concerns earlier about crates becoming available for reuse: if you depend on some crate and that crate suddenly changes ownership, there are all kinds of tricky trust issues. Are all the downstream users okay with giving as much trust to the new owner as they did to the previous one? In the end a crate can execute arbitrary code on your machine, so there are big risks here.

At the same time, I do think that shrinking away from trying to fix these situations is not a good long-term strategy. The same person 5 years later might also be less trustworthy for whatever reason; better to hand off their useful crate to some other maintainer who has proved to do good work.

No matter what, the arbitration would likely (1) have to provide transparency on what decisions are made, and roughly on what basis, but (2) must be allowed some leeway; very strict rules can easily be gamed, and that’s a risk, which is why I think actual people deciding together would be the best way forward.

You can combine 2 and 3. Community clicks a button and there is a review team which studies the request.

A combination of all these would be best.

  1. Define which cases are definitely fine (e.g. you get a grace period of 6 months reserving a crate, but after that a report may be accepted)
  2. Define which cases are definitely bad (spam, typosquatting)
  3. The gray area in between will need human input.
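To make the three buckets concrete, here is a rough sketch in Rust. Everything in it is an assumption for illustration: the `Report` fields, the `Verdict` names, and the 183-day constant standing in for "6 months" are hypothetical, and how a report gets flagged as spam or typosquatting is left open.

```rust
/// Hypothetical facts attached to a squatting report.
struct Report {
    days_since_reserved: u32,
    is_spam: bool,
    is_typosquat: bool,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Keep,
    Remove,
    NeedsHumanReview,
}

/// Assumed grace period of roughly six months.
const GRACE_PERIOD_DAYS: u32 = 183;

fn triage(report: &Report) -> Verdict {
    // Case 2: definitely bad (spam, typosquatting).
    if report.is_spam || report.is_typosquat {
        return Verdict::Remove;
    }
    // Case 1: definitely fine, the reservation is still inside the grace period.
    if report.days_since_reserved <= GRACE_PERIOD_DAYS {
        return Verdict::Keep;
    }
    // Case 3: the gray area goes to people.
    Verdict::NeedsHumanReview
}
```

The point of the sketch is only that the automated part stays tiny; anything it cannot decide is handed to the review team rather than guessed at.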

I think forcefully taking ownership of crates in anything more than the most trivial cases will be very contentious, and I’m not in favour of it. We absolutely should not take a package from a person only because that package has <5 active users, the publisher has <20 packages, and the community mistrusts the publisher. Things can break for real people when you un-register a package that’s in use. On top of that, the moderation schemes that are currently being proposed can easily take 1 FTE of continued effort, which is way too much for a small organisation like Mozilla Rust to pay.

And if we pursue a policy of un-registering only trivial cases, then is this committee really worth the effort? Determined people can still get around it. And there might still be boatloads of squatting.

Of course there isn’t so much spam right now. I understand that some people are afraid that there will be an “explosion” in spam or whatever, but this is not a given, and it’s also not something that’s much harder to fix retroactively.

But I’m in favour of proactive non-committee solutions like CAPTCHAs, phone validation and rate limiting. All of those I think will be generally accepted by the community. Perhaps even a fee schedule.


Hi! I stopped following this topic after 20 replies, so maybe someone already mentioned this, but what if we identified a crate not only by its name but by the combination of author and crate name? The same way GitHub and other similar services do.

So in Cargo.toml one should use:

[dependencies]
regex = { version = "1", author = "JohnSmith" }

For backward compatibility, if the author is not mentioned then cargo should use the regex crate from the author who registered it first.
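A minimal sketch of that lookup rule in Rust, assuming a made-up in-memory registry. The `Entry` type, its `registered_seq` field, and the `resolve` function are hypothetical and not part of Cargo or crates.io.

```rust
/// One registry entry. All fields are hypothetical, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    name: String,
    author: String,
    /// Lower means the name was registered earlier.
    registered_seq: u64,
}

/// Resolve a dependency by crate name and an optional author.
fn resolve<'a>(registry: &'a [Entry], name: &str, author: Option<&str>) -> Option<&'a Entry> {
    let mut candidates = registry.iter().filter(|e| e.name == name);
    match author {
        // Explicit author: only an exact (name, author) pair matches.
        Some(a) => candidates.find(|e| e.author == a),
        // No author given: fall back to whoever registered the name first.
        None => candidates.min_by_key(|e| e.registered_seq),
    }
}
```

With two `regex` entries in the registry, `resolve(&registry, "regex", Some("JohnSmith"))` picks JohnSmith's crate, while `resolve(&registry, "regex", None)` picks the earliest registration, so existing Cargo.toml files keep resolving as before.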

This approach would also prevent the problem with changing ownership of a crate.

Of course this will make cargo a bit more complex, but I think it’s worth it.


@HaronK I was thinking the same, but with the original author’s GitHub username. The Cargo.toml author isn’t necessarily unique, and email addresses can change.

AFAIK publishing still requires a GitHub account - assuming other identity providers become supported in the future, that would imply another level of namespacing. So the full logical name would be something like:

idprovider.username.cratename

e.g.

github.fredbloggs.frobnify
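As a rough sketch, splitting such a name could look like this in Rust. The `CrateId` type and `parse_crate_id` function are hypothetical, and the sketch assumes that none of the three segments may itself contain a dot.

```rust
/// Hypothetical parsed form of `idprovider.username.cratename`.
#[derive(Debug, PartialEq)]
struct CrateId<'a> {
    provider: &'a str,
    username: &'a str,
    crate_name: &'a str,
}

/// Parse a fully qualified name; returns None unless there are exactly
/// three non-empty, dot-separated segments.
fn parse_crate_id(s: &str) -> Option<CrateId<'_>> {
    let mut parts = s.split('.');
    let provider = parts.next()?;
    let username = parts.next()?;
    let crate_name = parts.next()?;
    // Reject a fourth segment and any empty segment.
    let has_empty = [provider, username, crate_name].iter().any(|p| p.is_empty());
    if parts.next().is_some() || has_empty {
        return None;
    }
    Some(CrateId { provider, username, crate_name })
}
```

Here `parse_crate_id("github.fredbloggs.frobnify")` yields the provider `github`, the username `fredbloggs` and the crate `frobnify`, while a bare `frobnify` is rejected.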

@mrec You’re right. I also see that there can be more than one crate author. Is there a ‘main’ one? Oh! On top of that, there are also the owners of the crate…

In fact the original author, or even the publisher of the last version does not need to have any permissions to the repository :slight_smile:

We could also introduce crates.io authors. All currently existing crates would have no author, but new ones would. And if no author is mentioned in Cargo.toml, then cargo would look for a crate without an author.


I really like how Docker Hub has done it. I’m not that into the details, but I know they have official Repositories and then they have the user ones.

If you're going to put github.com/username/cratename in the namespaced identifier, you can just as well add https:// to the identifier and use Git URLs instead :slight_smile:


Why not github.com/notriddle :arrow_right: notriddle.github.io?

Second of all, aren't you just moving the squatting problem over to GitHub? You can register names there just as freely as you can register them on crates.io. Top-level domains have a cost, which limits squatters in a way that GitHub accounts don't.


Top-level domains also cost money that not every programmer has.


There is a fair amount to think through in Cargo's behavior so that it does not automatically upgrade to an otherwise compatible version when a change of ownership occurs. Having a public key hosted by the registry has other (solvable) issues to address.

Cargo could handle this by:

  • Including the appropriate public key signature in the Cargo.lock file; cargo would check that new releases are signed against the key. Otherwise the user would be prompted/warned/etc.
  • To allow for key rotation for projects which do not change ownership, new public keys can be signed with the old private key. A registry host would prevent new releases being published signed with an old key.
  • In the case that a key is compromised, crates-io (and self-hosted registries) can prevent uploading new packages signed by the poisoned key.
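The client-side check implied by these points might be sketched as below. This is only a model under loud assumptions: keys are plain string IDs, a `rotations` entry means "the new key was endorsed by the old key" with the actual signature verification assumed to have happened elsewhere, and `Trust`, `check_release` and their parameters are hypothetical names, not Cargo or crates.io APIs.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum Trust {
    /// Release is signed by the key pinned in the lock file.
    Trusted,
    /// Release is signed by a key reachable through endorsed rotations.
    Rotated,
    /// Unknown or revoked key: prompt or warn the user.
    Untrusted,
}

/// `rotations` maps an old key ID to the new key ID it endorsed
/// (signature checks are assumed to be done before this point).
fn check_release(
    pinned_key: &str,
    release_key: &str,
    rotations: &HashMap<String, String>,
    revoked: &HashSet<String>,
) -> Trust {
    // A poisoned key is never acceptable, even if it is the pinned one.
    if revoked.contains(release_key) {
        return Trust::Untrusted;
    }
    if release_key == pinned_key {
        return Trust::Trusted;
    }
    // Follow the rotation chain starting at the pinned key.
    let mut current = pinned_key;
    let mut seen = HashSet::new();
    while let Some(next) = rotations.get(current) {
        if !seen.insert(current) {
            break; // guard against a cyclic rotation table
        }
        if next.as_str() == release_key {
            return Trust::Rotated;
        }
        current = next.as_str();
    }
    Trust::Untrusted
}
```

A change of ownership would show up as a release key with no chain back to the pinned key, which is exactly the case where cargo would stop and ask instead of upgrading silently.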

While I generally like the domain name solution, I think it complicates the story around finding crates that are well supported by the community (such as serde). I think it's valuable to at least come up with a story for how this could be improved in the context of domain naming (or a similar idea such as pet-naming, which was mentioned in a previous name squatting discussion).


Such curation takes waaaay more work than writing "don't squat" in the rules and deleting the worst spam from time to time.


I’ve reached out to the crates.io maintainers in the past. The answer was that crate squatting does not conflict with crates.io policy: https://github.com/rust-lang/crates.io/issues/624 http://doc.crates.io/policies.html


One squatter is even a Rick Astley fan: https://crates.io/crates/gcr (click on documentation)


Who decides what is "Squatting"? I'd question why someone would want to spend their valuable time on this. What is to be gained?

If people start becoming creative about squatting, we will end up adding a "Squat of the Week" to TWiR... :sweat_smile:


Probably the same thing as URL squatting. It's a bet on Rust becoming popular, and thus good crate names becoming valuable.
