Pre-RFC: Formal squatting policy on crates.io

Not everyone has public info, though. swmon is the first person that comes to mind, where there isn't even an email on file due to the age of the squatting.

Dealing with existing crates is a separate issue (overlapping with what to do with non-squatted but abandoned crates), and likely a lost cause. Changing squatting rules for future crates is much less controversial than changing retroactively.

3 Likes

The problem with a change like that is that it, by definition, has no teeth. Policies with no enforcement train people to ignore the policies.

1 Like

You're talking about the "happy path" in a system that fundamentally relies on manual review by humans, when the subject at hand is what goes wrong.

In general it doesn't seem like you've responded to any of the concerns about what can go wrong except to downplay them.

You ignored my question from earlier: "Is namesquatting really such a problem that you think it makes sense to potentially endanger the entire crates.io ecosystem with increased risks of software supply chain attacks in order to solve it?"

But perhaps I can ask a simpler question: can you at least acknowledge there are software supply chain risks here, and if you can do that, can you also give them due consideration?

I squat the names of crates I eventually intend to publish projects under. I think my success rate in that department is ~60%. I willingly give up names of squatted crates to people who ask unless I have immediate plans for using them.

Should that be "frowned upon"?

2 Likes

Do you think you're squatting and shouldn't be doing it?

I'm suggesting here the absolutely lowest bar of low bars — self enforcement.

I self-describe what I'm doing as squatting in the associated commit messages for the squatted crate boilerplate, but if I go on to release a usable library under a name I've squatted, I don't consider what I'm doing wrong, especially in the absence of a formal policy.

I consider it quite useful to be able to reserve names I intend to use in advance.

If there were a formal policy condemning it, I would definitely think twice. I'd probably still do it, but feel bad about it, whereas today I'm elated to claim certain names and rather enjoy the absence of a namesquatting policy.

2 Likes

With squatting we have extremes such as taking a bunch of names out of spite/malice/trolling, where hopefully everyone can agree such actions are undesirable. And we have the other extreme, where the author of a legit crate wants a placeholder for a crate they're honestly planning to release, which is harmless. And between those we have lots of gray area.

I think it's possible to give reasonable definitions of the extremes, but there will always be some gray area. The tragedy here is that bringing up the existence of the gray area is a surefire way to stop any action against the obviously, unquestionably bad behaviors.

The problem can be reduced to the heap paradox:

  • Let's say that registering 1 unused crate is OK.
  • Registering 208827064576 crates (all 8-char names) is NOT OK.

At what number between these two does it become too much? We'll never agree. But as long as we disagree, registering 208827064576 crates remains allowed.
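As a quick sanity check on that figure: it matches 26^8, i.e. every name of exactly 8 characters over the 26 lowercase ASCII letters. That alphabet is an assumption made here to reproduce the number, and the helper name `name_count` is purely illustrative.

```rust
/// Number of distinct names of exactly `len` characters drawn from an
/// alphabet of `alphabet_size` symbols (illustrative helper).
pub fn name_count(alphabet_size: u64, len: u32) -> u64 {
    // Each of the `len` positions can independently hold any symbol.
    alphabet_size.pow(len)
}

/// Assumption: only the 26 lowercase ASCII letters are counted.
pub fn eight_char_lowercase_names() -> u64 {
    name_count(26, 8)
}
```

With those assumptions, `eight_char_lowercase_names()` evaluates to 208827064576, the number used in the list above.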

9 Likes

@kornel that's a good way to frame the problem, and when framed that way, it seems like more of a general crates.io abuse issue.

1 Like

You're talking about the "happy path" in a system that fundamentally relies on manual review by humans, when the subject at hand is what goes wrong.

I'm setting an absolute baseline, as I've mentioned before. Every single crate by swmon matches all three criteria I laid out. That's over 100 crates. I do not dispute that there are likely a number of squatted crates that don't fall under these criteria; my goal isn't to be the final solution.

You ignored my question from earlier: "Is namesquatting really such a problem that you think it makes sense to potentially endanger the entire crates.io ecosystem with increased risks of software supply chain attacks in order to solve it?"

But perhaps I can ask a simpler question: can you at least acknowledge there are software supply chain risks here, and if you can do that, can you also give them due consideration?

Using the three criteria I laid out, what risks are there? The first point is that it is unquestionably useless, so literally nobody should be depending on the crate. If it has a use case, it doesn't fall under these criteria, and won't be transferred.

I squat the names of crates I eventually intend to publish projects under. I think my success rate in that department is ~60%. I willingly give up names of squatted crates to people who ask unless I have immediate plans for using them.

Should that be "frowned upon"?

Somewhat. The key is that you're willing to transfer them. No one would be able to claim it for six months under my criteria, so you'd have plenty of time to get things together if you wanted to still use the name. And that's if the name-holder cannot be reached. Some users (again, swmon) can't be reached by any means.

2 Likes

And you haven't acknowledged any of the responses which point out why this is a bad idea except by dismissing them without actually addressing them.

If you set a minimum bar which allows people to claim crates as squatted as a fully automated process, then I think one of two things will happen:

  • new squatters will just re-squat previously squatted crates
  • the old squatters will republish their war chest of squatted crates to be slightly above your baseline

...and you have accomplished nothing.

If you loop humans in to do manual review (who's volunteering or paying them to do this work?), and they have the power to hijack crates from their current owners and transfer them to other users, then that runs a risk of software supply chain attacks.

When I brought this up before, you responded:

Yes, it is trivial to bypass

...so what's the point exactly?

Regardless of what approach you take, what you're proposing is a huge amount of work to accomplish something you admit is trivial to bypass.

1 Like

If you set a minimum bar which allows people to claim crates as squatted as a fully automated process, then I think one of two things will happen:

  • new squatters will just re-squat previously squatted crates
  • the old squatters will republish their war chest of squatted crates to be slightly above your baseline

...and you have accomplished nothing.

To me, that's more of an argument to do more, not less. The user I've been using as an example (swmon) appears to have no public activity in the Rust community. What evidence do you have that they will republish the 100+ crates?

If a new squatter comes in, hopefully they will at least respond (or be able to be reached, unlike the current situation).

If you loop humans in to do manual review (who's volunteering or paying them to do this work?), and they have the power to hijack crates from their current owners and transfer them to other users, then that runs a risk of software supply chain attacks.

When I brought this up before, you responded:

Yes, it is trivial to bypass

...so what's the point exactly?

Again I ask, what supply chain attack? One of the criteria is that the crate is unquestionably useless. No one would depend on an empty crate for obvious reasons; there is zero risk of breakage. If they are depending on it, there's nothing they could be importing, so, again, no breakage. As I said in a previous comment, all versions should be empty, though.

What is the harm in trying to partially solve the issue, rather than trying to sweep it under the rug by bringing up nonexistent risks? You mention that I have dismissed criticisms; feel free to quote which ones I have. All I've done is dismiss namespacing, as that (might) deal with the issue, not my specific proposal. Aside from that, I just requested people stay on topic.

This post is my opinion as an individual. I am not speaking on behalf of the crates.io team. Nothing in this post reflects any consensus that has been reached among the team.

I would like to point out that it is impossible to programmatically determine whether a crate exports any useful items. build.rs exists. Turing complete proc macros exist. Even if we ignore the halting problem, there are many useful crates out there which require some unknown set of features or native dependencies in order to compile.

Even more simply, crates.io is deployed on Linux infrastructure. Any short-term implementation of this would almost certainly be fooled by #![cfg(windows)]. This could certainly change, but any RFC would need to justify why this is worth the time and monetary investment required to fix this, instead of everything else we could be working on.
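To make the cfg point concrete, here is a minimal sketch. It uses the outer attribute `#[cfg(windows)]` on a module rather than the crate-level `#![cfg(windows)]`, so that the snippet still compiles on any host. The module `api` and both function names are made up for illustration.

```rust
// This module only exists when the crate is compiled for a Windows target.
// On a Linux build host the compiler never sees its contents.
#[cfg(windows)]
pub mod api {
    pub fn platform_name() -> &'static str {
        "windows"
    }
}

/// Reports whether the gated module above is part of this build.
/// `cfg!(windows)` is evaluated at compile time for the current target.
pub fn exports_api_on_this_target() -> bool {
    cfg!(windows)
}
```

A checker that compiles or inspects the crate on Linux would see no `api` module at all, even though Windows users get a real, usable item.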

I'm certainly interested in seeing a realistic proposal come out of this, but let's stop pretending that there's any programmatic detection of "useful" that exists. Any realistic proposal of determining squatting will be done by humans, and will need to clarify who is responsible for this work, and justify the time investment.

8 Likes

I do take your point about infrastructure concerns and the impossibility of automatically detecting if a crate is "useful". But surely detecting crates like this would be possible? It imports nothing, has no build.rs and doesn't declare anything but the default test.

That aside, I do agree any actual transfer of ownership should involve humans so I think your point about the human cost still stands regardless.

Would you agree that a quick check would be a decent threshold to pass before passing it off to a human, where edge cases could then be handled?

Usual "this is not the opinion of the crates.io team" stuff.

To be clear, there is no evidence, but my guess is that the account is someone else's throwaway: if I were to do something controversial like this, I would think twice before doing it with my main account. Also, other big squatters are indeed active, even though we might not agree with their motivations (I personally don't).

This is already "solved", as we now require email verification before publishing a crate.

My point is, it's so easy to work around this policy that it accomplishes nothing in the long term, other than maybe only removing the crates of swmon. At that point a policy "You shouldn't be swmon to register crates" would be as (or probably more) effective.

Are you suggesting the crates.io team should review each non-empty uploaded crate, or am I missing something?

2 Likes

(not providing an official crates.io team opinion)

There's another solution that I haven't seen discussed: now that Cargo supports alternative registries, anyone can run a registry with any policies they want. This also enables you to have namespaces if you want; the registry becomes the namespace.

4 Likes

(Relatedly, a while back I threw together an example of cross-registry dependencies to show that DNS-based namespacing is possible. The major limitation is that crates from crates.io cannot depend on alternative registries, but an ecosystem of mutually trusting registries could potentially be built.)

2 Likes

While alternative registries have their uses, they are a very unappealing proposition in this case. crates-io has a very favorable position in Cargo, as well as a huge head start and mindshare, which prevents any alternative registry from seriously competing with it.

  • crates-io is the default, both in the technical sense and as the place where users expect crates to be.

    • Installation from crates-io is the easiest. Cargo doesn't require users to explicitly specify crates-io as a registry. Crate names aren't obviously scoped to crates-io, they're implicitly assumed to be there.
    • OTOH any other registry has to explain how to configure the registry in Cargo and the extra syntax needed to use it, and has to deal with the risk of user confusion (and even the security issues it may cause) when crate names overlap between registries.
  • crates-io crates are not allowed to use crates from alternative registries. This is a good policy overall, but it has a side effect of creating a strong network effect in favor of crates-io.

    • Because of this limitation, and the incumbent position of crates-io, library authors who want to reach the maximum audience must publish on crates-io, even if they also publish elsewhere. But publishing under two different names is a complication, and doesn't solve the naming problem for crates-io users, who will be the majority.
  • Registries require trust, and crates-io is officially trusted by Rust. Even though I can start a registry, I can't expect Rust users to have the same level of trust in me.

So asking users to switch to an alternative registry is a huge ask. It creates technical hurdles and has to overcome network effects and trust issues. Users would have to be really pissed off at crates-io to jump.
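For concreteness, the extra configuration and dependency syntax look roughly like this. The registry name `example-registry`, the index URL, and the crate name `some-crate` are placeholders, not real services or packages.

```toml
# In .cargo/config.toml: every user has to declare the registry first.
[registries.example-registry]
index = "sparse+https://registry.example.com/index/"

# In Cargo.toml: each dependency from that registry must name it explicitly.
[dependencies]
some-crate = { version = "1.0", registry = "example-registry" }
```

A crates-io dependency needs neither step, which is the asymmetry described in the list above.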

Overall the problem is as difficult as disagreeing with ICANN policies. Good luck with alternative DNS roots.

9 Likes

Many of the suggestions here are similarly huge asks of the crates.io and/or moderation teams. If those solutions are under consideration, then this solution should be as well.

I have seriously considered it. I've built lib.rs and prototyped a registry replacement. I have a year of real effort put in and first-hand experience that going against crates-io is an uphill battle that won't move the needle.

Alternative registries have good uses, but replacing the whole of crates-io because of a disagreement about one thing makes no sense.

13 Likes