Crates.io squatting

I disagree. The best rating is "links". In other words, usage. The most used crates should get the highest rating. The Authors of the most used crates should get higher ratings. That's it. Anything else is gameable.

Except that Amazon ratings are heavily, heavily gamed.

You should visit Black Hat World. Find out the awesome amount of effort that gets put into gaming PageRank.

3 Likes

When shooting down ranking/reputation ideas because they’re not perfect, can be gamed, etc., please compare them to the status quo, not to an imaginary 100% bulletproof ideal.

For example, there was a paralysis in discussion about crates-io crate ranking, because votes, reviews, links, metadata, IP addresses, user accounts, crate content, usage numbers, etc. all can be faked and gamed in some way.

So because no proposal was absolutely ideal, bulletproof, and impossible to game, all were shot down, and we’ve stayed with a status quo that is worse than all of them: raw, unfiltered, game-all-you-want download numbers, without any attempt to even filter out bots and repeated installs in CI (and with a disclaimer that it’s not a ranking, the crates just happen to be sorted by this not-a-ranking number…)
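To make "filter out bots and repeated installs" concrete, here's a rough sketch of what even a minimal filter could look like. Everything in it is an assumption for illustration: the `Download` record, the hashed `client_id`, and the bot marker strings are made up, and I don't know what crates.io's logs actually contain.

```rust
use std::collections::HashSet;

/// One raw download event. All fields are hypothetical; the real
/// crates.io logs may look nothing like this.
#[derive(Debug, Clone)]
pub struct Download {
    pub crate_name: String,
    pub client_id: String, // e.g. a hashed IP address (assumption)
    pub day: u32,          // days since some arbitrary epoch
    pub user_agent: String,
}

/// Crude bot check based on assumed marker substrings, not a real bot list.
fn looks_like_bot(user_agent: &str) -> bool {
    let ua = user_agent.to_lowercase();
    ["bot", "crawler", "spider"].iter().any(|m| ua.contains(m))
}

/// Count downloads of `crate_name`, counting each client at most once
/// per day and skipping anything that looks like a bot.
pub fn filtered_downloads(events: &[Download], crate_name: &str) -> usize {
    let mut seen: HashSet<(&str, u32)> = HashSet::new();
    for e in events {
        if e.crate_name == crate_name && !looks_like_bot(&e.user_agent) {
            seen.insert((e.client_id.as_str(), e.day));
        }
    }
    seen.len()
}
```

This is still gameable (rotate client IDs, spoof the user agent), but a CI job reinstalling the same crate fifty times a day would count once instead of fifty times.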

8 Likes

Which seems problematic if a domain is actually fought over - you wouldn't want to have to rename your software package, especially if widely used already.

1 Like

Indeed, take-overs and transfers would be a nightmare for contributors, especially if two crate names inside the package collide (or near-collide). And I do believe collisions are likely: programmers are not the most imaginative for naming, so core, base, etc… are likely to show up in many a namespace.

1 Like

Guilty.

There's a reason I use "foo", "bar", ... :wink:

1 Like

The status quo is that people get their recommendations through third parties like the Awesome List, GitHub, Reddit, the forum, Stack Overflow, and general search engines. Not only is the community not centralized around any one of those places (which, in itself, makes gaming it trickier), they are all better at abuse mitigation than crates.io will ever be.

If I had my way, crates.io wouldn’t even have search or categories, and it would address everything with UUIDs.

Crates have human-oriented identifiers, which at this point are baked heavily into the language and cargo ecosystem, so the boat on UUIDs has unfortunately sailed.

Awesome Lists add some level of moderation for discovery, but going back to the original problem - they can’t fix abuse on crates.io itself.

2 Likes

Is abuse on Crates.io a theoretical or actual problem? It’s not clear to me that there is an actual problem (yet). Just curious what your feelings on this are and how that should influence the discussion and solution.

It's perhaps a problem, but more importantly an opportunity. The opportunity is to turn it from a working system into a system that makes programmers' work easier.

Exactly. To state that a system should not be used without proposing an alternative is to propose doing nothing. And doing nothing is the worst idea ever, IMO.

The fact that Google still exists and has not been burned down by angry mobs proves that ranking can be done successfully. Sure there are ways to cheat, but if that were a reason to shut down Google... oh boy, finding stuff on the internet would be a mess.

Why not implement a selection of ways to track the reputation of crates by reviews, links, usage, etc and let people set up their own custom filters?

Amazon has implemented verified reviews to work against this (the review account must have the purchase registered). A similar thing can be done by designing a web-of-trust for Rust, where you can filter the reviews based on the author's reputation and exclude reviews from low-reputation authors.
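As a rough sketch of that filtering step (not a full web-of-trust, which would also need a way to compute reputation in the first place), something like the following would do. The `Review` struct, the reputation map, and the `min_reputation` threshold are all hypothetical names I made up for illustration.

```rust
use std::collections::HashMap;

/// A review of a crate. Hypothetical shape, for illustration only.
pub struct Review {
    pub author: String,
    pub crate_name: String,
    pub stars: u8, // 1..=5
}

/// Average star rating for `crate_name`, ignoring reviews whose author's
/// reputation is below `min_reputation`. Authors missing from the map are
/// treated as reputation 0. Returns `None` if no review passes the filter.
pub fn trusted_rating(
    reviews: &[Review],
    reputation: &HashMap<String, u32>,
    crate_name: &str,
    min_reputation: u32,
) -> Option<f64> {
    let stars: Vec<f64> = reviews
        .iter()
        .filter(|r| r.crate_name == crate_name)
        .filter(|r| reputation.get(&r.author).copied().unwrap_or(0) >= min_reputation)
        .map(|r| f64::from(r.stars))
        .collect();
    if stars.is_empty() {
        None
    } else {
        Some(stars.iter().sum::<f64>() / stars.len() as f64)
    }
}
```

Because `min_reputation` is a parameter, each user could pick their own threshold, which fits the "let people set up their own custom filters" idea.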

Any problem in this world can be solved given the willpower to do so. I for one am not just going to roll over and accept defeat on the question of improving crate discoverability and naming.

It is an actual problem. It’s starting small scale (people grabbing 20 crate names they like), but it’s inevitable that someone will run a script and grab 1000 or 10000 names. The site allows this, but most importantly the policy allows it, so once someone squats crates.io big time, they will have the right to argue it’s all A-OK. Applying an anti-squatting policy retroactively is a bigger ask than setting one beforehand.

To put it another way: if crates.io doesn’t have a squatting policy, the first person with a script will set the policy for you.
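A policy doesn't need fancy tooling to be enforceable. As a minimal sketch, assuming the registry keeps a record of who registered a new name and when, flagging scripted grabs could be as simple as a per-owner, per-day count. The `Registration` struct and the `max_per_day` threshold are hypothetical; the right threshold is a policy question, not something this code answers.

```rust
use std::collections::HashMap;

/// A new-crate registration event. Hypothetical shape.
pub struct Registration {
    pub owner: String,
    pub day: u32, // days since some arbitrary epoch
}

/// Return the owners who registered more than `max_per_day` new crate
/// names on any single day, sorted alphabetically and without duplicates.
pub fn bulk_registrants(regs: &[Registration], max_per_day: usize) -> Vec<String> {
    let mut per_owner_day: HashMap<(&str, u32), usize> = HashMap::new();
    for r in regs {
        *per_owner_day.entry((r.owner.as_str(), r.day)).or_insert(0) += 1;
    }
    let mut flagged: Vec<String> = per_owner_day
        .into_iter()
        .filter(|(_, n)| *n > max_per_day)
        .map(|((owner, _), _)| owner.to_string())
        .collect();
    flagged.sort();
    flagged.dedup();
    flagged
}
```

This only produces a list for a human to review; it doesn't decide what counts as squatting.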

6 Likes

Earlier in the thread, it was shown that a few people were abusing the system by reserving a bunch of names and requesting “contact” from anyone who wanted to use them. This thread’s gotten too long for me to find it, but it’s somewhere here.

Is it worth making a distinction between "crates.io doesn't say anything about squatting at all" and "crates.io explicitly allows crate squatting"? It's my understanding that crates.io doesn't explicitly allow it, and doesn't explicitly forbid it. While a serial squatter could make the claim that they have not violated the letter of the crates.io policy, I think this discussion thread shows that the community is very much against it, and so this person would have a hard time arguing that they didn't expect there to be repercussions.

On Reddit, u/dirtlamb5 pointed out that the user cratesio has grabbed 188 short crate names just today.

1 Like

I wonder if it is possible this is someone proving a point?

I think what we need more is a policy that ensures a squatted crate is handed over to a person who needs it. Right now the squatter can refuse to do so and there is nothing much anyone can do about it.

One thing in this discussion I’m still not clear on: What is the definitive definition of “squatting”? I’ve heard a lot of things that boil down to the equivalent of, “I can’t define pornography, but I know it when I see it.”

1 Like

Everyone, Crates.io incident 2018-10-15

Thanks.

4 Likes

Editions provide a way to make transitions in certain parts of the language. Would they also provide a way to make transitions elsewhere in the ecosystem? Too late for it to be a Rust 2018 thing, but could each edition provide an implicit namespace on crates.io? E.g., today the situation is “I have some legit dependencies on crate x, but it is no longer maintained”, so unfortunately the name has to stay taken forever. Instead, it could become “For Rust 2021, I’ll need to run a fixup script to change the dependencies to be 2018/x because the name is going to be freed.”

This might be a horrible idea, but it would let us put off supporting / in crate dependencies until the next edition release, and it would also allow people to usually just refer directly to the crate name (e.g., "to do this, you need to install clap" rather than "to do this, you need to install kbknapp/clap") since that kind of communication can come with the implicit connection to the current edition.
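To show what I mean by the implicit namespace, here's a small sketch of the name resolution and the fixup step. The `edition/name` syntax and both function names are hypothetical; nothing like this exists in cargo today.

```rust
/// Resolve a dependency name to an "edition/name" path.
/// A bare name like "clap" is assumed to live in the namespace of
/// `current_edition`; a name that already carries a prefix, like
/// "2018/x", is kept as is.
pub fn resolve(dep: &str, current_edition: u32) -> String {
    match dep.split_once('/') {
        Some((edition, name)) => format!("{}/{}", edition, name),
        None => format!("{}/{}", current_edition, dep),
    }
}

/// The "fixup script" step: pin every bare dependency name to
/// `old_edition` so it keeps pointing at the same crate after the
/// bare name is freed for the next edition.
pub fn pin_to_edition(deps: &[&str], old_edition: u32) -> Vec<String> {
    deps.iter().map(|d| resolve(d, old_edition)).collect()
}
```

So under Rust 2021, `resolve("clap", 2021)` would give `2021/clap`, while a dependency the fixup script had already rewritten to `2018/x` stays `2018/x`.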

We have that kind of partitioning of names of entities in everyday speech when we refer, e.g., to a particular player being on the Bengals (implicitly referring to the current year’s roster) vs a particular player being on the 1987 Bengals.

A review of spammed/squatted crate names would then only have to occur in the run-up to a new edition. (Granted it doesn’t solve the short-term problem of massive spamming, but it provides a cleaner transition point for new crates to take over old crate names.)

1 Like

This is true of most concepts and not necessarily an argument against policies about squatting. There are fuzzy cases of whether or not something is theft or murder such that we may struggle to produce a formal definition that satisfactorily applies. In those cases determinations are made about the specific case and possibly our definitions of the concepts or regulations are refined, but it’s not a reason to give up having policies or enforcement mechanisms against those acts, or a reason to not believe those acts are things that exist and can be discussed.

A policy could simply refer to “squatting” and give large leeway in moderation decisions, or it could adopt one of the many definitions offered in this thread such as @kornel’s. Obviously those definitions are not exhaustive, deterministic mechanisms that will unambiguously resolve every possible situation, but they at least identify creating a crate for purposes other than sharing substantive, non-malicious code with the community as against the policy of crates.io.

I tend to think the maven-like domain name model is the best long-term path for cargo, but that doesn’t mean easier and more direct policies and mechanisms shouldn’t be put in place sooner.

2 Likes