Crates.io package policies

Packages Policy for Crates.io

In a previous post to the Rust blog, we announced the preview launch of crates.io, giving the Rust community a way to easily publish packages. After a few weeks of kicking the tires, and hearing the most common questions people have about the registry, we wanted to clarify the rationale behind some of the design decisions. We also wanted to take the opportunity to be more explicit about the policies around package ownership on crates.io.

In general, these policies are guidelines. Problems are often contextual, and exceptional circumstances sometimes require exceptional measures. We plan to continue to clarify and expand these rules over time as new circumstances arise.

Package Ownership

We have had, and will continue to have, a first-come, first-served policy on crate names. Upon publishing a package, the publisher will be made owner of the package on Crates.io. This follows the precedent of nearly all package management ecosystems.

Removal

Many questions are specialized instances of a more general form: “Under what circumstances can a package be removed from Crates.io?”

The short version is that packages are first-come, first-served, and we won’t attempt to get into policing what exactly makes a legitimate package. We will do what the law requires us to do, and address flagrant violations of the Rust Code of Conduct.

Squatting

Nobody likes a “squatter”, but finding good rules that define squatting that can be applied mechanically is notoriously difficult. If we require that the package has at least some content in it, squatters will insert random content. If we require regular updates, squatters will make sure to update regularly, and that rule might apply over-zealously to packages that are relatively stable.
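To make the difficulty concrete, here is a minimal sketch of what such a mechanical rule might look like. Everything in it is hypothetical: the `Package` fields, the thresholds, and the `looks_squatted` function are illustrative assumptions, not anything crates.io implements.

```rust
/// Hypothetical package metadata; these fields are illustrative only and
/// are not part of any real crates.io data model.
pub struct Package {
    pub lines_of_code: u32,
    pub days_since_last_release: u32,
}

/// A naive mechanical rule: flag a package as squatted if it is nearly
/// empty or has not been released recently. The thresholds are arbitrary.
pub fn looks_squatted(pkg: &Package) -> bool {
    const MIN_LINES: u32 = 50;
    const MAX_IDLE_DAYS: u32 = 365;
    pkg.lines_of_code < MIN_LINES || pkg.days_since_last_release > MAX_IDLE_DAYS
}
```

Under this rule, a squatter who pads a package with filler and republishes weekly (say 500 lines, 7 days idle) passes the check, while a finished, stable library that has not needed a release in two years is flagged.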

A more case-by-case policy would be very hard to get right, and would almost certainly result in bad mistakes and regular controversies.

Instead, we are going to stick to a first-come, first-served system. If someone wants to take over a package, and the previous owner agrees, the existing maintainer can add them as an owner, and the new maintainer can then remove the previous owner. If necessary, the team may reach out to inactive maintainers and help mediate the process of ownership transfer. We know that this means, in practice, that certain desirable names will be taken early on, and that those early users may not be using them in the best way (whether they are claimed by squatters or just low-quality packages). Other ecosystems have addressed this problem through the use of more colorful names, and we think that this is actually a feature, not a bug, of this system. We talk about this more below.

The Law

For issues such as DMCA violations and trademark or copyright infringement, Crates.io will respect Mozilla Legal’s decisions with regard to content that is hosted.

Code of Conduct

The Rust project has a Code of Conduct which governs appropriate conduct for the Rust community. In general, any content on Crates.io that violates the Code of Conduct may be removed. There are two important, related aspects:

  • We will not be pro-actively monitoring the site for these kinds of violations, but relying on the community to draw them to our attention.
  • “Does this violate the Code of Conduct” is a contextual question that cannot be directly answered in the hypothetical sense. All of the details must be taken into consideration in these kinds of situations.

We plan on adding ‘report’ functionality to alert the administrators that a package may be in violation of some of these rules.

Namespacing

In the first month with crates.io, a number of people have asked us about the possibility of introducing namespaced packages.

While namespaced packages allow multiple authors to use a single, generic name, they add complexity to how packages are referenced in Rust code and in human communication about packages. At first glance, they allow multiple authors to claim names like http, but that simply means that people will need to refer to those packages as wycats' http or reem's http, offering little benefit over package names like wycats-http or reem-http.

When we looked at package ecosystems without namespacing, we found that people tended to go with more creative names (like nokogiri instead of “tenderlove’s libxml2”). These creative names tend to be short and memorable, in part because of the lack of any hierarchy. They make it easier to communicate concisely and unambiguously about packages. They create exciting brands. And we’ve seen the success of several 10,000+ package ecosystems like NPM and RubyGems whose communities are prospering within a single namespace.

In short, we don’t think the Cargo ecosystem would be better off if Piston chose a name like bvssvni/game-engine (allowing other users to choose wycats/game-engine) instead of simply piston.

Because namespaces are strictly more complicated in a number of ways, and because they can be added compatibly in the future should they become necessary, we’re going to stick with a single shared namespace.

Organizations & related packages

One situation in which a namespace could be useful is when an organization releases a number of related packages. We plan on expanding the ‘tags’ feature to indicate when multiple crates come from one organization. Details about this plan will come at a later time.


Good to see clarification on this issue though I’m a bit disappointed to see namespaces off the table. I was more interested in namespaces for the ability to maintain a fork rather than a completely different package with a similar name.

Perhaps it would be enough if cargo allowed some packages to be installed from git repos and some from crates.io instead of all or nothing. Is there a technical reason for this limitation?

It’s semi-technical, semi-cultural: first of all, cargo does let you do that, just not for packages uploaded to crates.io. The reason is that you should be able to expect anything on crates.io to be able to be built, forever. An external git link can go away.

That's a good distinction and it's the case I have in mind actually.

The use case is that you have a package published on crates.io and need to update it but can't because it depends on another package that hasn't been updated with the fix/feature/whatever. Basically, it creates a bottleneck that either namespaces or being able to depend on non-crates.io packages would resolve. It's a problem I've run into on a number of occasions in other languages/ecosystems so not just a theoretical concern.

I am very disappointed with the rationale for no namespacing.

First off, let’s dismiss the idea that there are technical arguments against namespaces. There are none. I certainly don’t see any in the justification text.

Arguing that package names tend to be more “creative” is pretty laughable. Look at the problems people are having today finding good domain names to see how this works out in the long run. I don’t want to have to care about naming when I publish an experimental package (which isn’t mentioned here at all, of course, because that is horrible with global namespacing), only when I publish an “official” one.

Most of the package managers that have global namespaces are curated and can make informed decisions about who gets what name. Rubygems, NPM and now Cargo are not. It’s not a feature. It’s poor behavior and I hate to see things that are notoriously bad about Rubygems be exported to Rust.

That Rubygems and NPM are prospering with 10,000+ packages doesn’t mean much when we have Github with what, millions? Namespacing works.

Leiningen already provides a better solution: default to namespacing, and fall back to a single name if name and group match. This has all the advantages of a namespaceless system. Since 90% of the justification is just arguing against wycats/http, I think it’s a pretty relevant point that you don’t need to have that.
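The fallback rule described above can be sketched in a few lines of Rust. This is an illustration only, assuming a `group/name` syntax; the `Coordinate` type and both functions are hypothetical names, and nothing here reflects how Cargo or crates.io actually parse package names.

```rust
/// A package coordinate made of a group (namespace) and a name.
#[derive(Debug, PartialEq, Eq)]
pub struct Coordinate {
    pub group: String,
    pub name: String,
}

/// Parses "group/name", or a bare "name", which is treated as "name/name".
/// Returns None for empty parts or for more than one slash.
pub fn parse_coordinate(input: &str) -> Option<Coordinate> {
    let mut parts = input.split('/');
    let first = parts.next()?;
    let second = parts.next();
    if parts.next().is_some() || first.is_empty() {
        return None;
    }
    let (group, name) = match second {
        Some(name) if !name.is_empty() => (first, name),
        Some(_) => return None,
        // A bare name stands for a package whose group equals its name.
        None => (first, first),
    };
    Some(Coordinate {
        group: group.to_string(),
        name: name.to_string(),
    })
}

/// Renders the short form when group and name match, the full form otherwise.
pub fn display(coord: &Coordinate) -> String {
    if coord.group == coord.name {
        coord.name.clone()
    } else {
        format!("{}/{}", coord.group, coord.name)
    }
}
```

With this scheme, `piston` resolves to group `piston`, name `piston` and is still displayed as `piston`, while `wycats/http` keeps its full form.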

Organizations as tags does not make sense. People like organizations because they imply ownership. I should be able to trust repositories from the same organization. Tags are for curation, not ownership. If two groups tag a repository, which one is the owner? If only one can tag a repository, you already have namespacing, just by a different name, and might as well expose it on the frontend.

In conclusion, namespaces are strictly worse than a solution like Leiningen’s. Please reconsider. If not, that’s fine, but I will probably not publish any packages on crates.io. Thanks for making it so easy to use GitHub though!

As a side note, seeing someone talk about “exciting brands” when discussing what should be a technical decision makes me very concerned. Can we keep marketing speak out of this? Not only do I find that a pretty flimsy justification, even if it’s true I don’t want my package manager to make marketing decisions for me. I want it to manage my packages.


@darinmorrison this is more of a pain point pre-1.0, for sure. But it’ll go away when the language is more stable.

Plus, you can fix that issue with local overrides, as the user.


In package systems that support both a central registry and github (like Rubygems and npm), people publish experimental packages to Github and more stable packages to the registry. Using github as a staging ground for experimental packages makes it easy for users of the packages to keep up to date with ongoing, experimental development, and relieves the package author of the need to constantly publish to the registry just to get small changes to users.

That hasn't been my experience. Rubygems, npm, pypi, CocoaPods, nuget, and bower all use a single namespace, and I'm almost certainly missing some. Flat namespaces are the dominant solution in this space, including in package ecosystems with huge numbers of packages. There is real cost to namespacing, so you'll have to show that all of those ecosystems are ignoring a significant source of real-world problems to make progress with this line of reasoning.

The idea, which still has to be fleshed out, is to allow cases where a single project is broken up into a number of smaller packages to tag those packages with their organization name (for example, @iron or @piston). Those tags would be reserved to the organization, and indeed, they would largely be used for curation.

I personally work on frameworks, and have recently criticized npm for being technically incompatible with certain framework requirements, and even I am clear-eyed enough to recognize that the vast majority of libraries in any package ecosystem do not need the overhead of curation and grouping. When a library is part of a curated whole, I would like to call that out explicitly in the UI, and not have to guess what the semantic meaning of the (proposed) mandatory namespace is.

Policies involving naming are never purely technical decisions. That's why they are ranked as one of the two hardest problems in computer science :wink:

In closing, my personal experience as a user of many languages without namespaces doesn't match the fear you're expressing about a single flat namespace. I think you could make progress in this conversation by trying to crisply identify and roughly measure the costs you associate with the global namespace, so we could compare the cost/benefit on both sides.

I would ask that you take seriously the lack of widespread concern about a global namespace in many package systems, and try to formulate an explanation for it. Both understanding the real-world costs and the reasons that those costs aren't rapidly identified would help me to square your deep concern with my personal experiences.


I am very disappointed by the lack of namespaces.

  • It’s so easy to occupy the best names, such as crates.io/crates/http, while providing a bad implementation. Unfortunately, such crates are very easy to find in search, and people end up using them by mistake.

  • Without namespaces, when using crates.io/crates/author-name-http, people are always required to write extern crate "author-name-http" as http; to shorten the name. We already declared “author-name-http” in Cargo.toml, so why repeat it? If it were namespaced as crates.io/crates/author-name/http, people could simply write extern crate http;, which is much easier.

  • People (crate users) do not like long names. And people (crate authors) do not like racking their brains to find a unique (“creative” and short) crate name. If you think this is easy, just try to register xxxxx@gmail.com without . - _ in the name, and you will see how hard it is.

  • A namespaced crate is not harder to use. You always copy a code snippet (e.g. [dependencies.xxx/yyy] ...) into Cargo.toml, with or without a namespace. The namespace never makes copying harder.

  • Non-namespaced crates will hurt the ecosystem. Many people do need namespaced crates; perhaps one of them will fork Cargo/crates.io and add a namespace feature in the future.

Edit: I’ve just added/modified the last two paragraphs.


In my experience using PyPI, another non-namespaced package repository, I have never seen a package name made of a username prefix followed by a generic package name, such as your example author-name-http.

If you can show some hard evidence that this problem exists in repositories such as PyPI or whatever Ruby uses, I would agree with you.


When we looked at package ecosystems without namespacing, we found that people tended to go with more creative names (like nokogiri instead of "tenderlove's libxml2"). These creative names tend to be short and memorable, in part because of the lack of any hierarchy. They make it easier to communicate concisely and unambiguously about packages. They create exciting brands. And we've seen the success of several 10,000+ package ecosystems like NPM and RubyGems whose communities are prospering within a single namespace.

This seems like a conventions issue, not a namespacing issue. This could be solved by a policy encouraging creative names for packages. It seems like a strawman to compare piston vs bvssvni/game-engine -- recall it was piston on github (which is namespaced) before Crates.io packages even existed -- a fairer comparison would be piston vs bvssvni/piston.

The complexity argument seems weak, as well. [dependencies.servo/util] (for instance) is hardly more complex than [dependencies.util]. It seems reasonable to keep allowing extern crate util as long as there isn't a name clash, at which point something like the following could easily work (and would usually be unnecessary):

    [dependencies.servo/util]
    name = "servo_util"
    path = "../util"

Actually, Servo seems like a perfect example of why namespacing is beneficial. Servo couldn't even be released on Crates.io in its current form without changing many of its crates (such as util) to be prefixed by servo-, at which point it's just manually namespacing itself.
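The clash-handling idea suggested above (use the bare name unless two dependencies share it) could be sketched as follows. This assumes hypothetical dependency keys of the form namespace/name, which Cargo does not support; `resolve_crate_names` and `bare_name` are illustrative names, not an existing API.

```rust
use std::collections::HashMap;

/// Returns the part after the last '/', or the whole key if there is none.
fn bare_name(dep: &str) -> &str {
    dep.rsplit('/').next().unwrap_or(dep)
}

/// Given hypothetical dependency keys of the form "namespace/name", returns
/// the identifier each would get in code: the bare name when it is unique
/// among the dependencies, or "namespace_name" when two of them clash.
pub fn resolve_crate_names(deps: &[&str]) -> HashMap<String, String> {
    // Count how many dependencies share each bare name.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &dep in deps {
        *counts.entry(bare_name(dep)).or_insert(0) += 1;
    }

    deps.iter()
        .map(|&dep| {
            let name = bare_name(dep);
            let ident = if counts[name] > 1 {
                // Clash: fall back to a namespaced identifier.
                dep.replace('/', "_")
            } else {
                name.to_string()
            };
            (dep.to_string(), ident)
        })
        .collect()
}
```

For the dependencies servo/util, piston/util, and wycats/http, this yields servo_util, piston_util, and http: only the clashing pair needs the longer identifiers.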

While I don't feel terribly strongly about it, the arguments against namespacing have mostly convinced me that there aren't, in fact, any good arguments against namespacing.


I’m also very disappointed in the decision to forgo namespacing. GitHub and Docker have worked fabulously in this regard. I’d contend that flat package repositories like PyPI, RubyGems, and npm have flourished in spite of their flat namespaces, not in any way because of them.

Having to resort to “clever” and “creative” names in order to make a package unique is not a benefit at all. I’m guilty of this myself, having come up with names that are either cryptic or simply don’t convey what the package does just because the name I wanted to use was taken by someone else years ago.

Having silly names that don’t describe what the package does is a detriment to learning and adoption for new users. For example, it’d be much better for a new Rubyist to check for gems named “xml” or “xml_parser” to find an XML parser. The fact that the de facto package for XML parsing is “nokogiri” is institutional knowledge that impedes discovery.

I also agree with the above poster who noted that having the user’s name in the name of the package itself makes imports more cumbersome because you will usually want to alias the package name to one without the user’s name. Otherwise you’ll end up having wycats_http or the like littered throughout your code.


nuget has no namespaces, yes- but it promotes a strong convention of naming packages after actual .NET namespaces. this allows me to use packages Microsoft.Serialization.Json and Newtonsoft.Json, without having to think about what ‘unique brand’ (seriously??) someone’s come up with for their library.

on crates.io no convention of unnamespaced packages is in place and one would already be difficult to add. conventions are like that. point taken that (non-conventional) namespaces can be added backwards compatibly; I hope this happens ASAP.

Could you please enumerate what some of those real costs are? The only one I can think of is the implementation / switching cost. But if we're going to do it (which I believe we should), I think now is the time.


I'm not so sure about Docker. index.docker.io has tons of namespace-less packages, and for many (most?) of them, it is not entirely clear if they are prepared by the organization commonly associated with the name (e.g., the debian image has probably not been built on the Debian build infrastructure). Maybe I'm a bit cynical, but I'm pretty sure that users will only look at the namespace structure if the un-namespaced image does not work for them.

I don't use Docker, but this post reminded me of Docker Image Insecurity | Hacker News and this quote from the linked article:

I assumed this referenced Docker’s heavily promoted image signing system and didn’t investigate further at the time.

The naming issue is unrelated to that. Currently, Docker fails to deliver the bits securely in some scenarios (they do have HTTPS security at the transport, so it is not completely broken). Even if this is solved and users are guaranteed to receive the bits for the debian image if they run the docker pull debian command, this still doesn’t make sure that the user’s expectation of what debian means in this context aligns with what Docker has in its naming database.

Not having namespaces at all may make it more obvious that this naming problem exists, and may even encourage collaboration (so that you end up with one sqlite3 package, and not dozens). This is what happens in Debian, where the majority of upstream software artifacts are packaged just once. If you don’t like it, you can file a bug, propose changes, and try to convince the package maintainer. Technically, you can even just upload your change because package ownership is not enforced for Debian Developers.

Obviously, this involves a lot of work (e.g., there is manual review before new packages are added to the archive, a step that can take quite a bit of time), but I think the Debian software repository benefits from that a lot.


There is another use case which NPM Inc has been addressing: better supporting enterprises/private registries. I work at a large software company and the NPM scopes are great for keeping all our private packages grouped together. The npm client has evolved a lot over the last year to support interconnected features like scopes, multiple registries, multiple authentication tokens.

I do not know when big companies will adopt Rust but having in place mechanisms like the above will be welcomed at that time.

Tally one more on the side of namespaces. I haven’t heard a single concrete argument in favor of a flat structure other than that namespaces “add complexity”, and I firmly believe that the additional information and structure they provide greatly outweigh that.

I’ve also noted a lack of discussion of the very successful Java Maven ecosystem. I hate Maven the build tool, but the dependency infrastructure works fantastically well and is used by all of the major JVM build tools, including sbt for Scala, Leiningen for Clojure, etc. Just the primary Maven server has close to 100K packages, and that doesn’t take into account that it is a federated system and build scripts can add additional resolvers to other public and private servers (which Rust also needs support for).

I think you do the community a disservice if you ignore how successful that approach has been.


I change my opinion. +1 to flat. The people who know how to make package managers know what they are doing.


There are plenty of package managers which namespace their packages in some way. There is more to the packaging world than just Rubygems and NPM.