Namespacing on Crates.io


#41

That’s the thing. I don’t want to force that distinction. If Diesel would name a crate diesel-foo, and someone else wants to create a crate for that, I want them to be able to use the same name Diesel would use.

I’m not sure why this should work any differently than the rest of software. If I create a program that cleans the Windows registry, I’m allowed to call it “Windows Registry Cleaner”, regardless of whether or not it’s officially tied to Windows.


#42

I don’t personally care whether some crate is an “official” add-on to a certain project. The stuff I care about with my third party libraries are things like documentation, a pit-of-success API, being actively maintained, etc, etc. None of that stuff is strongly correlated with official-ness.

The only situation where I could imagine caring is if I had to enter into a legal, contractual arrangement with all maintainers of code I consume. Or if I decided the right way to audit all my transitive unsafe code is to vet the maintainers rather than auditing code. But I dunno how either of those would ever end up being the case.


#43

It’s just my opinion, so take it as that, but I would protest against namespacing without a slash or colon.

As @sgrif mentioned, if you create a framework and want to encourage others to write plugins or extensions, sandboxing diesel-* would make things counterintuitive: I’d have to name my crate naftulikay-diesel-extension, which doesn’t seem like an improvement on naftulikay/diesel-extension or diesel-extension.

I don’t have a perfect way forward, I don’t know that one exists, but hyphenating would seem to only add to the confusion. Any solution to this problem is going to be disruptive, but honestly I for one don’t mind go get github.com/naftulikay/gro or specifying that as a dependency. Since only the tail end of the repository is going to be used as the crate name, use gro::*; still works as expected.

If I had a vote, I’d vote for just biting the bullet and moving toward $VCS/$OWNER/$REPO. Leave the current crates as is, but start making new crates on Cargo register the full VCS link. This would prevent new squatting from happening and we’d be able to slowly weed out and get rid of the existing squatted crate names.

Just my two cents and entirely my opinion, so take it as that, an opinion :slight_smile:


#44

There seems to be somewhat of an informal name-spacing already happening using hyphens though. If you were to formalize it using the following rules, it might not be disruptive at all:

  • Any crate that starts with [A-Za-z][A-Za-z0-9]+-: the owner of that crate claims that prefix (say the crate is foo-bar; then the owner of “foo-bar” gets the “foo-” prefix assigned to them)
  • If more than one crate owner has crates with the same prefix, disambiguate as follows: 1) the owner of the most crates with that prefix gets that prefix, 2) crates with that same prefix owned by other owners compete using the same rules for the next sub-segment, 3) rounds continue until all crates starting with the original prefix are assigned an owned sub-prefix (up to and including the final segment)
  • After this is done on the initial rollout, no one is able to create new crates using an already owned prefix
  • A new user needs to register one or more prefixes for their use before they can create new crates. All new crates must be created under one of their owned prefixes (either auto-assigned during the initial cut-over, or requested and assigned)
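To make the first round of the rules above concrete, here is a minimal sketch in Rust. It is not crates.io code: the function names `first_prefix` and `assign_prefixes`, the owner names, and the tie-break (the owner that sorts first wins) are all assumptions for illustration. Only the first segment is handled; the later rounds for sub-segments are left out.

```rust
use std::collections::BTreeMap;

/// Returns the first segment plus its trailing hyphen ("foo-" for "foo-bar").
/// Returns None for single-segment names and for segments that do not match
/// [A-Za-z][A-Za-z0-9]+ (which requires at least two characters).
fn first_prefix(crate_name: &str) -> Option<&str> {
    let idx = crate_name.find('-')?;
    let segment = &crate_name[..idx];
    let mut chars = segment.chars();
    let first = chars.next()?;
    let rest_ok = segment.len() >= 2 && chars.all(|c| c.is_ascii_alphanumeric());
    if first.is_ascii_alphabetic() && rest_ok {
        Some(&crate_name[..=idx])
    } else {
        None
    }
}

/// First round only: for each prefix, the owner with the most crates wins.
/// Ties go to the owner that sorts first (an arbitrary assumption; the
/// proposal does not say how ties are broken).
fn assign_prefixes<'a>(crates: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, &'a str> {
    // prefix -> owner -> number of crates under that prefix
    let mut counts: BTreeMap<&str, BTreeMap<&str, usize>> = BTreeMap::new();
    for &(name, owner) in crates {
        if let Some(prefix) = first_prefix(name) {
            *counts.entry(prefix).or_default().entry(owner).or_insert(0) += 1;
        }
    }
    counts
        .into_iter()
        .filter_map(|(prefix, owners)| {
            owners
                .into_iter()
                // Highest count wins; on equal counts the smaller owner name wins.
                .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))
                .map(|(owner, _)| (prefix, owner))
        })
        .collect()
}
```

Given `("foo-bar", "alice")`, `("foo-baz", "alice")` and `("foo-qux", "bob")`, this assigns “foo-” to alice. Under the full proposal, bob’s crate would then compete for the “foo-qux-” sub-prefix in the next round. A single-segment crate such as `tic` claims nothing.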

Now, this leaves some legacy crates in sub-namespaces of a larger namespace, but there is nothing that can be done about that without breaking backwards compatibility.

Another useful thing would be that an owner of a prefix can delegate sub-prefixes to other owners if they choose to. So, the owner of the “cookiemonster-” prefix could delegate “cookiemonster-kiebler-” to another owner. This “sub-prefix” would work just like sub-prefixes previously assigned. Now, the owner of “cookiemonster-” can no longer create crates called “cookiemonster-kiebler-something” because only the owner of “cookiemonster-kiebler-” may create such crates.


#45

This means that the owner of tic prevents anyone from publishing tic-tac, who then prevents anyone from publishing tic-tac-toe. Though the real complaint here is that you’re basically describing a hierarchical directory permission structure, but instead of using a path separator, you’re using -, which people do not typically use for that purpose. Some people use - to replace _, for example. (And you still don’t stop anybody from using tictactoe.)


#46

I just don’t see a difference between using "-", "_", ":", "::", "/", "\" or whatever as a directory separator. I’m just saying to formalize what is largely an informal convention anyway (I think). If it isn’t really, then maybe this isn’t as desirable. I don’t think it is useful to get hung-up on what is and what isn’t a “directory separator” as there are lots of different conventions in different contexts.

And, no, the owner of “tic” would not claim the “tic-” prefix. The owner of “tic-tac” might claim the “tic-” prefix (if they’re the one with the most crates starting with “tic”). No existing single-segment crates would become a prefix owned by anyone on their own. Those crates would persist in the global namespace and no further one-word crates (without a “-” in them) would be permitted to be registered going forward.


#47

I don’t think it’s a good idea to enforce any rules that use names to express relationships between crates. Names should stay nominal, and human organizations should orchestrate any crate relations via documentation or other publications. If you’re handing out permissions over resources, you have to get humans to negotiate the facts regardless.

I also don’t think namespacing is really a good idea… I see the appeal – it makes it feel like everyone has more ‘space’ without stepping on each other’s names – but it actually only does that by forcing people to use longer crate names. If we all decided to use a minimum of 40 characters for our crate names, we would probably not have anybody squatting on them either.


#48

I disagree. So, there can be only one “tic-tac-toe” crate, right? So every other implementation of “tic-tac-toe” on Crates.io is a second-class citizen right now. Under the new regime going forward it would be:

  • orgprefix1-tic-tac-toe
  • userfoo-tic-tac-toe
  • etc.

With no special “tic-tac-toe”. Now, this doesn’t fix the past, but, it improves the future in a backwards compatible fashion.

Yes, that is correct. But now, the owner of the “serde-” namespace or the “mysoftwarecompany-” namespace has some control of who can publish within that namespace and “Squatting” becomes not as useful or easy and there is some level of accountability introduced that self-organizes.


#49

This argument was addressed in the initial policies of crates.io four years ago. I get that you disagree with the conclusions, but continuously relitigating what has already been discussed to death is exhausting for the team.

EDIT: I’m bad at discourse apparently. This was intended to specifically reply to the comment above mine, not the OP of the thread


#50

I don’t see the problem caused by a special tic-tac-toe or why rust-tic-tac-toe, std-tic-tac-toe, or anything of the sort would not also be special.


#51

So, in that case, we can just ignore this thread and close it down. It’s already been decided. No further discussion needed/wanted/allowed is what you are saying. Someone opened this thread, others are chiming in with opinions about it and possible backwards compatible ways forward. Saying, “we’re not doing name-spacing and that’s that” seems a little off-putting.

My apologies for offering an opinion in support of others with similar ideas/opinions.

If I’m to understand what you are saying correctly, then any further discussion of this matter is moot and future threads on this should immediately be shut down and debate ended. OK. I guess that could work.

Namespacing on crates.io is off the table permanently. Got it.


#52

I still believe that it’s important to discuss all this. It has certainly been enlightening for me and I have found it productive, even if it doesn’t manifest in a real change.

The fork in the road here is that, on an individual level, people will stop using crates.io and just hardcode the Git repository URL into Cargo.toml. Invariably, forks will happen and crates will diverge, and the end result is that crates.io stops working for people, who will go elsewhere.

There are a few avenues for the future:

  • change nothing: we could try to automate detection of crate squatting but still have to have manual user intervention to address violations
  • make a hard action to stop accepting “root-level” crates and sandbox under some scheme: this would stop squatting from continuing, if namespacing is done right. We could solve the existing squatted crates and not have this be a perpetual problem.

The discussion thus largely groups into one of these categories. Within the latter category, it’s a discussion on how to do namespacing, like based on VCS, based on some prefix, based on an org/user name in GH, based on an org/user name that crates keeps track of, etc.

I have found the discussion useful. I’m grateful to everyone who has participated so far.


#53

Definitely. Nobody is trying to shut the discussion down. I’m just asking that folks try to keep the discussion above “I disagree on the conclusion that has been drawn in the past”, and try to bring new information to the table. There’s been a lot of discussion around this lately, and the team does try to keep on top of it.

Probably worth re-iterating that any actual change would need to be the result of an RFC. Also keep in mind that such an RFC would not only need to describe why this is a change we want, but also why it is important to prioritize this right now.


#54

This is worse in multiple ways than using the domain:

  • It still doesn’t address squatting, as you correctly point out. It just kicks the can a few meters down the road.
  • It more or less doubles the length of crate names people have to write in their source code, and the addition does not provide any benefit whatsoever. And no, rename-dependency is not a solution here.
  • There will be tons of annoying issues; like some person having published a crate foo-bar and some other person having published foo … which of them gets to claim the namespace foo and therefore gets to lock the other person out of their crate?
  • It makes it impossible to easily upgrade a crate from some organization (that doesn’t maintain their crate anymore) to some fork that is being maintained, without having to touch every source file. One of the strengths of Maven’s approach has exactly been the ability to change old.unmaintained:package to new.maintained:package in the dependency file and move on without interruption.
  • It still burdens crates.io maintainers with having to make decisions in conflicts, an issue that just doesn’t exist with using domains.

On a more personal note, I consider crates with hyphens extremely ugly: the tiny amount of mental overhead having to translate from the crate name foo-bar to the name being used in source code (foo_bar) is a reason to avoid the crate. My line of thought is “If crate authors are fine with that tiny inconvenience, what other “tiny inconvenience” do they have in store for me further down the line?”.

I’m all for reinventing things if they are better than what we had before, but please let’s not reinvent things that are clearly worse than approaches that have worked perfectly well for more than a decade.


#55

FWIW, I thought this was a pretty clever minimalistic approach which solves a lot of problems without making a lot of changes. Also call me crazy but I like hyphens.


#56

It actually does address squatting. It reduces the incentive. It makes it harder to glom onto the good-will and “Brand” of other crates and makes reserving top-level “words” less desirable. It’s easier to limit the number of “Prefixes” someone can have without requiring justification because they can create any number of crates once they have even one prefix. But all their crates will be associated with their prefix(es) and not able to inappropriately associate with another’s “brands”. Reducing the incentive to “Squat” is what this does, and so it does in fact address squatting in a most direct and efficient fashion.

Not necessarily, but, even so, “So what”? And why isn’t “rename-dependency” (or a similar mechanism) useful here? Many, many, other languages do something similar.

Update your Maven pom.xml. Update your Cargo.toml. What’s the difference? I’m not seeing an issue that you’re seeing.

No, it really doesn’t. You ask for a prefix; if it’s available, you get it. Once you hit the “limit” of how many prefixes you can reserve, you can’t get any more without paying money or providing justification. You can create as many crates as you want without worrying about “conflicts” within your prefix(es). Almost zero burden on maintainers. Probably less than today.

Subjective criteria like this aren’t very useful. Whether or not it is “ugly” is largely irrelevant.

Good. It seems a number of people believe this would be better than what we have now with little “churn” required and almost 100% backwards compatibility. So, you agree, we should do it then?

Oh. You’ve declared it “clearly” worse so, never mind. I think if you had arguments that were more than just opinions about not liking it, as opposed to specific issues it does not address, your argument would be more illuminating. Saying something is “Clearly” anything, without providing solid arguments, is not useful in determining the merits of a proposal.

All of that being said, I don’t think this proposal is “Clearly” the best we can do or that there aren’t possibly better ideas, but it does seem like a reasonable idea that maintains backwards compatibility and requires no compiler or cargo changes and very few changes to crates.io.

I think approaching this thread with ideas on how name-spacing might be made to work is the most useful thing for the discussion. Once ideas have been vetted and there is agreement about the “best” ideas, the next steps are preparing an RFC and then pushing it through the RFC process.

If you are opposed to the idea of name-spacing, no matter the form, then repeating that in this discussion isn’t really useful. If you want to oppose the RFC, if there is ever one, by all means, I encourage you to do so; that’s what the process is for. But shooting down every possible solution in this thread because you are opposed to the idea seems premature.

That’s just my 2 cents I guess (or maybe that was for like a buck-fifty :slight_smile:).

I would hope that we could use this thread to ferret out possible “Namespacing Solutions” and leave the discussion of whether or not name-spacing should be had to the RFC approval process or another thread.


#57

This is already unrelated. lib.name is already unrelated to package.name. The package error-chain provides the crate error_chain, the package lazy_static provides the crate lazy_static, the package pistoncore-glutin_window provides the crate glutin_window.
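As a rough illustration of that decoupling, here is a small Rust sketch. When a package does not set lib.name, Cargo derives the library crate name from the package name by replacing hyphens with underscores; an explicit lib.name overrides that default. The function name `crate_name` is made up for this example, and it is an assumption here that pistoncore-glutin_window gets its crate name from an explicit lib.name.

```rust
/// Models how a library crate name relates to a package name.
/// `lib_name` stands in for an explicit `lib.name` entry in Cargo.toml;
/// without one, hyphens in the package name become underscores.
fn crate_name(package_name: &str, lib_name: Option<&str>) -> String {
    match lib_name {
        // An explicit lib.name is used as-is, whatever the package is called.
        Some(name) => name.to_string(),
        // Default: same as the package name, with '-' replaced by '_'.
        None => package_name.replace('-', "_"),
    }
}
```

With this model, `crate_name("error-chain", None)` yields `error_chain`, while `crate_name("pistoncore-glutin_window", Some("glutin_window"))` yields `glutin_window`: the prefix lives only in the package name and never reaches source code.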

That last one is actually the perfect illustration of how this “ad hoc” namespacing would work (and does, currently, for the ad hoc users!).

This is not a problem, because you specify the package name in Cargo.toml, which is decoupled from the crate name. If the package old.unmaintained provides the crate package, and you want to replace it with the package new.maintained’s version of the crate package, you just s/old.unmaintained/new.maintained/ in your Cargo.toml and it works the same as the substitution in Maven’s pom.xml.

This is not anything to do with namespacing. This is the decoupling of package and library names.

EDIT:

This has nothing to do with how you specify the package. This has everything to do with the library name. These two things are not the same. Please avoid conflating the two; it will help the discussion. Changing the package provider of a library crate is the exact same, no matter how you specify the package.


#58

One thing that I think may not be discussed enough is how people will actually work around squatting. Honestly, if I have a Crate name that I think is appropriate that’s already taken, I’ll just use it.

When I wrote this CLI utility in Python, I found that it collided with something already on PyPI. For a number of reasons, I stuck with that name anyway and instructed users to just use the full Git URL. When that inevitably happens for a crate name, I’ll likely do the same thing.

As an aside, I’m not comparing PyPI to Crates.io, I’ve never had a problem downloading something from Crates.io, and I can’t say the same about PyPI. They are, however, both similar in not namespacing.

I think I might be in the minority here, but I really like full namespaces with a VCS URL. There isn’t a case in which a collision or squatting would be Crates’ fault or responsibility to deal with. If someone squats microsoft or google on GitLab or another VCS provider, that’s not Crates’ problem.

It also makes it really easy to substitute a fork of a library. The crate name remains the last segment of the URL past the final slash. If I want to substitute github.com/rust-lang/rand with github.com/naftulikay/rand, there are no changes I need to make in my source code.
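A minimal sketch of that rule in Rust, purely to illustrate the proposal: this is not how Cargo or crates.io resolve names today, and `crate_name_from_path` is a hypothetical helper.

```rust
/// Under the proposed $VCS/$OWNER/$REPO scheme, the crate name would be the
/// last path segment. Returns None if the path has no usable final segment.
fn crate_name_from_path(path: &str) -> Option<&str> {
    // Ignore any trailing slashes so "host/owner/repo/" still yields "repo".
    let trimmed = path.trim_end_matches('/');
    let last = trimmed.rsplit('/').next()?;
    if last.is_empty() {
        None
    } else {
        Some(last)
    }
}
```

Both `github.com/rust-lang/rand` and `github.com/naftulikay/rand` map to `rand`, which is why swapping in the fork would leave `use rand::*;` in source code unchanged.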

Anyway, I’m kind of done shaving this yak :smile:


#59

I think you’re conflating or mixing up “crate name” and “package name” a little. I think that not keeping these two things distinct in the discussion clouds the issues somewhat.

Actually, I’d be happy with either proposal, but, the proposal with “prefixes” has the nice feature that it doesn’t require new cargo handling, only updates to crates.io. The other main proposal that you advocate, using VCS/User/ as prefix, is entirely good, but, I think it requires more churn in the ecosystem to implement.


#60

Perhaps the prefixes can merely be distinguished for crates that haven’t been given the blessing of the prefix owner. Surely the tic-tac-toe crate owner doesn’t mind that crates.io points out that their crate is not part of the tic-* family of crates?