Pre-RFC: Formal squatting policy on crates.io

I strongly disagree. If the maintainer decides to transfer ownership, that's one thing. However, I don't think we should allow takeover of existing crates by (essentially) arbitrary third parties. I frequently run cargo upgrade, but this change would require me to verify that none of my dependencies have been transmuted into completely different crates.

I think the benefits of having crates not change out from under people's feet vastly outweigh the benefits of making certain crate names accessible.

4 Likes

Do you use it for breaking upgrades as well? If a crate had a breaking release (e.g. v0.1.0 -> v0.2.0 or v1.0.0 -> v2.0.0), you should check the changes in the crate either way. And with the mentioned restrictions on which versions can be published after a crate transfer, you will notice it, so under this proposal crates will not "change out from under people's feet".

If you still think such measures are insufficient, we could also add further barriers to using a crate after a transfer, e.g. we could add an owner field without which you would not be able to use post-transfer versions:

foo = { version = "0.7", owner = "new_owner" }
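
To make the idea concrete, here's a minimal sketch of how a resolver might apply such a field. Everything in it is hypothetical: Cargo has no `owner` key today, and the `Release`, `Dep`, and `is_usable` names are made up for illustration only.

```rust
/// Hypothetical model of one published version of a crate.
pub struct Release {
    pub version: &'static str,
    /// Account that owned the crate when this version was published.
    pub owner: &'static str,
}

/// Hypothetical model of a dependency entry; `owner` mirrors the
/// proposed (not existing) `owner = "..."` key.
pub struct Dep {
    pub owner: Option<&'static str>,
}

/// Versions published by the original owner stay usable for everyone.
/// Versions published after a transfer are usable only when the
/// dependent names the new owner explicitly.
pub fn is_usable(release: &Release, original_owner: &str, dep: &Dep) -> bool {
    if release.owner == original_owner {
        return true;
    }
    dep.owner == Some(release.owner)
}
```

With this rule, an existing `foo = "0.6"` line keeps resolving to pre-transfer versions only, and nothing published by the new owner is picked up until the dependent opts in.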

And what if the current owner cannot be reached, the crate is empty, and has been so for a period of time? There's zero risk of breakage there.

1 Like

Personally, I'd be fine with that, as long as it was never possible to have actually used the crate. What I object to is changing the ownership of existing non-empty crates (i.e. crates which someone could actually have used) without involving the owner.

Just so you know, that's my original proposal :slight_smile:

3 Likes

We haven't really got deep into discussing solutions and their costs, because the mere idea of doing anything gets shot down quickly. And not-yet-presented solutions are shot down for imposing too large a cost on the crates-io team, without anyone even knowing what that cost is!

So here's one zero-cost solution: Change crates-io policy to say that squatting is frowned upon. Literally just a wording change in one doc.

Currently, a large portion of squatted crates are taken by well-meaning people (these "please contact me if you want this" crates). They're unlikely to be stubborn rule-breakers running a sophisticated trolling-for-profit operation. They've just grabbed a bunch of names, because that is explicitly allowed by crates-io, very easy to do, and has no downsides for the squatter. The solution to this may be as simple as politely asking them not to do that.

edit: changed that to Proposal for crates-io crate name reservation

8 Likes

Not everyone has public info, though. swmon is the first person that comes to mind, where there isn't even an email on file due to the age of the squatting.

Dealing with existing crates is a separate issue (overlapping with what to do with non-squatted but abandoned crates), and likely a lost cause. Changing squatting rules for future crates is much less controversial than changing retroactively.

3 Likes

The problem with a change like that is that it, by definition, has no teeth. Policies with no enforcement train people to ignore the policies.

1 Like

You're talking about the "happy path" in a system that fundamentally relies on manual review by humans, when the subject at hand is what goes wrong.

In general it doesn't seem like you've responded to any of the concerns about what can go wrong except to downplay them.

You ignored my question from earlier: "Is namesquatting really such a problem that you think it makes sense to potentially endanger the entire crates.io ecosystem with increased risks of software supply chain attacks in order to solve it?"

But perhaps I can ask a simpler question: can you at least acknowledge there are software supply chain risks here, and if you can do that, can you also give them due consideration?

I squat the names of crates I eventually intend to publish projects under. I think my success rate in that department is ~60%. I willingly give up names of squatted crates to people who ask unless I have immediate plans for using them.

Should that be "frowned upon"?

2 Likes

Do you think you're squatting and shouldn't be doing it?

I'm suggesting here the absolutely lowest bar of low bars — self enforcement.

I self-describe what I'm doing as squatting in the associated commit messages for the squatted crate boilerplate, but if I go on to release a usable library under a name I've squatted, I don't consider what I'm doing wrong, especially in absence of a formal policy.

I consider it quite useful to be able to reserve names I intend to use in advance.

If there were a formal policy condemning it, I would definitely think twice. I'd probably still do it, but feel bad about it, whereas today I'm elated to claim certain names and rather enjoy the absence of a namesquatting policy.

2 Likes

With squatting we have one extreme, taking a bunch of names out of spite/malice/trolling, where hopefully everyone can agree such actions are undesirable. And we have the other extreme, where the author of a legit crate wants a placeholder for a crate they're honestly planning to release, which is harmless. And between the two we have lots of gray area.

I think it's possible to give reasonable definitions of the extremes, but there will always be some gray area. The tragedy here is that bringing up the existence of the gray area is a surefire way to stop any action against the obviously, unquestionably bad behaviors.

The problem can be reduced to the heap (sorites) paradox:

  • Let's say that registering 1 unused crate is OK.
  • Registering 208827064576 crates (all 8-char names) is NOT OK.

What number in between these two stops being too much? We'll never agree. But as long as we disagree, registering 208827064576 crates remains allowed.
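
For reference, the figure above matches 26^8, i.e. every 8-character name over the 26 lowercase ASCII letters. That alphabet is an assumption on my part (the name rules on crates.io allow more characters than that), and `name_count` is just an illustrative helper:

```rust
/// Number of distinct names of exactly `len` characters drawn from an
/// alphabet of `alphabet_size` symbols.
pub fn name_count(alphabet_size: u64, len: u32) -> u64 {
    alphabet_size.pow(len)
}
```

`name_count(26, 8)` evaluates to 208827064576, which fits comfortably in a `u64`.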

8 Likes

@kornel that's a good way to frame the problem, and when framed that way, it seems like more of a general crates.io abuse issue.

1 Like

You're talking about the "happy path" in a system that fundamentally relies on manual review by humans, when the subject at hand is what goes wrong.

I'm setting an absolute baseline, as I've mentioned before. Every single crate by swmon matches all three criteria I laid out. That's over 100 crates. I do not dispute that there are likely a number of squatted crates that don't fall under these criteria; my goal isn't to be the final solution.

You ignored my question from earlier: "Is namesquatting really such a problem that you think it makes sense to potentially endanger the entire crates.io ecosystem with increased risks of software supply chain attacks in order to solve it?"

But perhaps I can ask a simpler question: can you at least acknowledge there are software supply chain risks here, and if you can do that, can you also give them due consideration?

Using the three criteria I laid out, what risks are there? The first point is that it is unquestionably useless, so literally nobody should be depending on the crate. If it has a use case, it doesn't fall under these criteria, and won't be transferred.

I squat the names of crates I eventually intend to publish projects under. I think my success rate in that department is ~60%. I willingly give up names of squatted crates to people who ask unless I have immediate plans for using them.

Should that be "frowned upon"?

Somewhat. The key is that you're willing to transfer them. No one would be able to claim it for six months under my criteria, so you'd have plenty of time to get things together if you wanted to still use the name. And that's if the name-holder cannot be reached. Some users (again, swmon) can't be reached by any means.

2 Likes

And you haven't acknowledged any of the responses which point out why this is a bad idea except by dismissing them without actually addressing them.

If you set a minimum bar which allows people to claim crates as squatted as a fully automated process, then I think one of two things will happen:

  • new squatters will just re-squat previously squatted crates
  • the old squatters will republish their war chest of squatted crates to be slightly above your baseline

...and you have accomplished nothing.

If you loop humans in to do manual review (who's volunteering or paying them to do this work?), and they have the power to hijack crates from their current owners and transfer them to other users, then that runs a risk of software supply chain attacks.

When I brought this up before, you responded:

Yes, it is trivial to bypass

...so what's the point exactly?

Regardless of what approach you take, what you're proposing is a huge amount of work to accomplish something you admit is trivial to bypass.

1 Like

If you set a minimum bar which allows people to claim crates as squatted as a fully automated process, then I think one of two things will happen:

  • new squatters will just re-squat previously squatted crates
  • the old squatters will republish their war chest of squatted crates to be slightly above your baseline

...and you have accomplished nothing.

To me, that's more of an argument to do more, not less. The user I've been using as an example (swmon) appears to have no public activity in the Rust community. What evidence do you have that they will republish the 100+ crates?

If a new squatter comes in, hopefully they will at least respond (or be able to be reached, unlike the current situation).

If you loop humans in to do manual review (who's volunteering or paying them to do this work?), and they have the power to hijack crates from their current owners and transfer them to other users, then that runs a risk of software supply chain attacks.

When I brought this up before, you responded:

Yes, it is trivial to bypass

...so what's the point exactly?

Again I ask, what supply chain attack? One of the criteria is that the crate is unquestionably useless. No one would depend on an empty crate for obvious reasons; there is zero risk of breakage. If they are depending on it, there's nothing they could be importing, so, again, no breakage. As I said in a previous comment, all versions should be empty, though.

What is the harm in trying to partially solve the issue, rather than trying to sweep it under the rug by bringing up nonexistent risks? You mention that I have dismissed criticisms; feel free to quote which ones I have. All I've done is dismiss namespacing, as that (might) deal with the issue, not my specific proposal. Aside from that, I just requested people stay on topic.

This post is my opinion as an individual. I am not speaking on behalf of the crates.io team. Nothing in this post reflects any consensus that has been reached among the team.

I would like to point out that it is impossible to programmatically determine whether a crate exports any useful items. build.rs exists. Turing complete proc macros exist. Even if we ignore the halting problem, there are many useful crates out there which require some unknown set of features or native dependencies in order to compile.

Even more simply, crates.io is deployed on Linux infrastructure. Any short-term implementation of this would almost certainly be fooled by #![cfg(windows)]. This could certainly change, but any RFC would need to justify why this is worth the time and monetary investment required to fix this, instead of everything else we could be working on.
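
A minimal sketch of the cfg problem, using an item-level `#[cfg(windows)]` instead of the crate-level attribute so that the snippet still compiles everywhere. The module `imp` and the helper `visible_items` are made up for illustration:

```rust
// On non-Windows targets this module is removed before type checking,
// so a tool that builds the crate on Linux never sees it.
#[cfg(windows)]
pub mod imp {
    pub fn describe() -> &'static str {
        "windows-only functionality"
    }
}

/// Hypothetical helper: names of the public items that exist on the
/// target this code was compiled for.
pub fn visible_items() -> Vec<&'static str> {
    if cfg!(windows) {
        vec!["imp::describe"]
    } else {
        Vec::new()
    }
}
```

Built on Linux, `visible_items()` is empty and the crate looks like a placeholder, even though Windows users get a real API.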

I'm certainly interested in seeing a realistic proposal come out of this, but let's stop pretending that any programmatic detection of "useful" exists. Any realistic proposal for determining squatting will rely on humans, and will need to clarify who is responsible for this work and justify the time investment.

7 Likes

I do take your point about infrastructure concerns and the impossibility of automatically detecting if a crate is "useful". But surely detecting crates like this would be possible? It imports nothing, has no build.rs and doesn't declare anything but the default test.

That aside, I do agree any actual transfer of ownership should involve humans so I think your point about the human cost still stands regardless.

Would you agree that a quick check would be a decent threshold to pass before passing it off to a human, where edge cases could then be handled?