[Pre-RFC]: Packages as Namespaces

There’s been a lot of chatter about adding namespaces to crates.io recently, so apologies if this idea has already been discussed.

Summary

Allow packages to be used as namespaces in a way that is backwards compatible with the existing flat namespace.

Details

The owners of a crate on crates.io can add sub-crates which are referred to by the name ‘toplevelcrate/subcrate’ in Cargo.toml files. For example, instead of publishing regex-syntax, the regex maintainers could have published regex/syntax. This does not answer all concerns about squatting, but it does handle the case where someone squats names in the ecosystem that a popular crate like piston develops around itself. You will still be able to publish a crate in the top level namespace, and it will still feel like the natural way of doing things.

Why?

There is a segment of the rust community that is very concerned about name squatting and abuse of the global namespace, and another segment of the community which is concerned about the increased friction that namespaces can bring. This proposal attempts to answer some of the concerns of the former, while retaining the low friction experience we currently have. This proposal keeps the focus on packages, which may have maintainers who come and go, over people. Personally, I tend to develop trust for and familiarity with a particular software package first, and then to gain trust for the maintainer(s), and the package-focus of the current system is one that I really like.

Concern: Mapping crates.io names to rust-language crate names

Slashes will be mapped to underscores in the same way that dashes currently are. This does mean that two different crates.io crates could map to the same rust language name, but I am skeptical that it will come up much.
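
To make the overlap concrete, here is a minimal sketch of the mapping described above. The function name `to_rust_name` is made up for illustration; it is not part of Cargo or crates.io.

```rust
/// Hypothetical helper (not a Cargo API): maps a crates.io name to the
/// identifier used in Rust code, treating `/` the same way `-` is treated.
fn to_rust_name(registry_name: &str) -> String {
    registry_name
        .chars()
        .map(|c| if c == '/' || c == '-' { '_' } else { c })
        .collect()
}
```

Under this mapping, `to_rust_name("regex/syntax")` and `to_rust_name("regex-syntax")` produce the same identifier, which is the collision mentioned above.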

If you skip mapping between slash and hyphen, that's similar to this proposal:

I think the advantage of using crates as namespaces rather than allowing people to reserve prefixes is that there is a guarantee that ‘piston/edit’ is sanctioned by the core piston team, whereas with reserved prefixes someone else could reserve the prefix first. With crates as namespaces, the reservation happens automatically, without creating the possibility of stepping on anyone else’s toes.

The other issue with reserved prefixes is that it makes it easier to squat large segments of the top level namespace. What if I reserve the prefix “a”? That seems obviously abusive, but where do you draw the line and how do you enforce it?

The main downside of packages as namespaces seems to be that it would require more changes to existing infrastructure (though it still does not seem like that huge of a change from the outside looking in).

I discussed optional namespacing like this with @wycats and I think @withoutboats this summer. We ended up with these notes. They discuss a couple of issues (they also discuss the grouping of crates as packages; ignore that).

The main motivation, somewhat echoed in the pre-rfc, is declaring affiliation: It’s nice for all tokio-maintained crates to be called tokio/foo. Users know to trust these crates, and it’s harder to fool people with an unofficial tokio-badcode. The motivation was not to solve squatting per se (and it doesn’t really solve squatting, but alleviates some of its symptoms)

We didn’t settle on “crates as namespaces” (we also considered the option of namespaces being something you register separately). I personally think it works either way, but crates-as-namespaces has some advantages.

A lot of the discussion was around how you import things. Post-2018, extern crate shouldn’t be necessary, but you still have to name the crate in Cargo.toml and import it in your rust code.

Assuming you import as foo/bar in Cargo.toml:

  • you use items via use bar::baz, but the name bar can be changed in Cargo.toml ("foo/bar" = {version="0.1", alias = "quux"})
  • you use items via use bar::baz, but the name bar can be changed in lib.rs (extern crate "foo/bar" as quux)
  • you use items via use foo_bar::baz (this is what this RFC proposes)
  • we straight up allow / in paths, perhaps with a new sigil. Something like use @foo/bar::baz.
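
The options above differ only in what the crate is called inside Rust code. As a rough sketch (the enum and function names are hypothetical, and the first two options are folded into one case because they differ only in where the rename is written), the local name for each option could be computed like this:

```rust
/// The import options listed above, as a hypothetical enum.
enum ImportStyle<'a> {
    /// `use bar::baz`, optionally renamed (in Cargo.toml or via `extern crate ... as`).
    LastSegment { alias: Option<&'a str> },
    /// `use foo_bar::baz`
    Underscore,
    /// `use @foo/bar::baz`
    Sigil,
}

/// Returns the crate prefix a `use` statement would start with for a
/// dependency written as `namespace/name` in Cargo.toml.
fn use_prefix(dep: &str, style: &ImportStyle) -> String {
    let (namespace, name) = match dep.split_once('/') {
        Some(parts) => parts,
        // Plain top-level crate: unchanged from today.
        None => return dep.to_string(),
    };
    match style {
        ImportStyle::LastSegment { alias } => alias.unwrap_or(name).to_string(),
        ImportStyle::Underscore => format!("{}_{}", namespace, name),
        ImportStyle::Sigil => format!("@{}/{}", namespace, name),
    }
}
```

Only the `Underscore` and `Sigil` cases keep the namespace visible in the local name, which matches the preference stated in the notes for the Rust name and the Cargo.toml name to agree.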

There’s also the option of making foo/ cosmetic on Crates, i.e. as an affiliation marker and not a namespace.

We didn’t really end up choosing anything here; these were just the options we discussed. The things that we cared about were that it should be low-friction and that ideally the local rust name is the same as the Cargo.toml name (i.e. foo/bar or foo_bar), so you don’t have to remember when to drop the foo.

Either way, this set of tradeoffs should probably be discussed a bit more and explicitly included in the RFC.


To be clear, I’m not representing the views of cargo team members here, just trying to share the results of a discussion I had with some of them.

This is my personal opinion. It should not be taken as a statement from the crates.io team, nor does it imply that anybody else on the team shares my opinion.

I do not think we can just paper over the naming collision between foo/bar and foo-bar. You mention that we map from underscores to hyphens, but ignore the fact that we very explicitly do not allow two crates that would normalize to the same name to be uploaded. There are two main reasons for this.

The first is that this opens up a potential attack vector. If you take some example code involving extern crate foo_bar, but accidentally depend on foo-bar instead of foo/bar, your code may compile but not use the library you thought it would.

The second is that we have a lot of precedent for disallowing things that could result in “your code stops compiling just by adding a crate”. This is why the orphan rule exists, and it’s also part of why we disallow foo_bar if foo-bar already exists. I do not think that saying “I am skeptical that it will come up much” is a satisfactory answer here.
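
For illustration, the uniqueness rule described above can be sketched as a toy registry. This is not the crates.io implementation; `ToyRegistry` and its methods are invented here, and treating `/` like `-` and `_` is the assumption under discussion.

```rust
use std::collections::HashMap;

/// Toy registry index; a sketch, not the crates.io implementation.
#[derive(Default)]
struct ToyRegistry {
    /// canonical name -> name as originally published
    by_canonical: HashMap<String, String>,
}

impl ToyRegistry {
    /// Assumption: `-`, `_` and the proposed `/` all normalize to `_`.
    fn canonical(name: &str) -> String {
        name.replace(|c: char| c == '-' || c == '/', "_")
    }

    /// Rejects a new crate whose canonical name is already taken.
    fn publish(&mut self, name: &str) -> Result<(), String> {
        let key = Self::canonical(name);
        if let Some(existing) = self.by_canonical.get(&key) {
            return Err(format!("`{}` collides with existing crate `{}`", name, existing));
        }
        self.by_canonical.insert(key, name.to_string());
        Ok(())
    }
}
```

With this rule in place, publishing foo-bar first makes both foo_bar and foo/bar fail, so the owner of foo could be locked out of their own namespace by an existing hyphenated crate.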

More generally, I also would like to see an explanation of why you think that uploading a crate called foo should grant you special ownership of that namespace. That’s not generally how this sort of thing works. When you trademark a name, that doesn’t mean nobody else can use it as part of their name, just that they can’t use it to try and imitate you.

I’d love to see a little more clarification around the specific problem this is trying to solve. There’s some vague mention of squatting, but little explanation of how this proposal is connected to squatting.

Which then brings us into alternatives, which aren’t really addressed at all. Are we just trying to solve the problem of visibility of packages owned by a single org? Is this something that can just be solved by adjusting the UI to make the owning org more visible? (e.g. you can already see all the “official” Diesel crates at https://crates.io/teams/github:diesel-rs:core)

Is the goal to help users figure out whether a crate is made by the maintainers of the main package or a third party? (Assuming that we will remove malicious crates if they are uploaded, why do we think that distinction is important?) Could this be solved by making the owners more prominent in the UI?

There’s also the unanswered question of should this be mandatory, and if not whether it should be opt in or opt out. I know I personally have no interest in reserving the Diesel namespace. I want authors who create plugins for Diesel to be treated the same as the authors of Diesel. If we make this opt-out, what about the authors of less maintained crates who may feel similarly, but not know they have to opt out of this?

I think this is a good start, but there’s a lot of questions that could use more explanation.

I would like to re-iterate that these are my personal opinions, and I am not speaking on behalf of the team.

That's an interesting point. Currently we have informal collaboration and project-crate is not just "it's (official) part of the project", but can also mean "works with the project".

This is my personal opinion. It should not be taken as a statement from the crates.io team, nor does it imply that anybody else on the team shares my opinion.

I think that's the significantly more common case. Often it's both. diesel-migrations happens to be maintained by the Diesel project, but the more important property is "it's an implementation of database migrations for Diesel". Try searching for chrono- or tokio- or diesel- in the crates.io search. All the top results are things that integrate with or add new behaviors to those projects, and are not maintained by the owner of the main project. (Note: I am not trying to present this as evidence that this is the most common case. This obviously has sampling bias. However I think those are some really good examples of the kinds of crates I'm referring to)

I would like to re-iterate that these are my personal opinions, and I am not speaking on behalf of the team.

It’s great to hear that others were already thinking along similar lines.

It sounds like the way that Cargo.toml/crates.io names are translated into rust names is going to be what makes or breaks this idea. I’m certainly not wedded to translating / to _; it was just the first thing that occurred to me. I recognize that “I don’t think it will come up much” isn’t a very satisfying answer. The idea of translating / to :: or some new sigil seems like a better approach. I think making the outer namespace cosmetic just makes the whole name clash problem worse because then I would have to resort to Cargo.toml aliases to deal with using both foo and bar/foo in the same project.

I agree with your concerns about how to map crates.io names to rust names. The notes that @Manishearth linked had some better ideas.

In terms of getting special ownership of the foo namespace because you own foo, I think it is important to note that anyone else can still publish a foo-awesome-extention crate in the top level namespace. The situation for third party authors of ecosystem crates would be exactly the same as it is today. The only difference would come in how easy it is for users to determine the level of affiliation a crate has with the "core crate". Today, if I stumble upon regex-syntax I have a bit of a research burden to figure out how it relates to regex. Under the proposed system, I could tell at a glance.

A key part of this idea is the focus on packages instead of organizations. It's not just a question of making organizations more visible. Packages are very focused, while some organizations can become quite sprawling. I don't really care if a package lives in the google namespace, but the fact that it lives in the protocolbuffers namespace tells me a bit more.

There’s also the unanswered question of should this be mandatory, and if not whether it should be opt in or opt out. I know I personally have no interest in reserving the Diesel namespace. I want authors who create plugins for Diesel to be treated the same as the authors of Diesel.

You could opt not to use the new feature, and continue to publish crates maintained by the diesel team in the global namespace. Then users searching for diesel plugins would not see any diesel/foo crates, so all crates would be created equal in the world of diesel. No one else would be able to publish something in the diesel namespace, but if you never used it this wouldn't matter.

Speaking only at a very high level & trying not to get too deep into technical issues (I agree with sgrif’s post in that regard).

I think it would be valuable for the registry to somehow enable some kind of very visible shared branding among crates by the same project. I don’t think this has anything to do with squatting per se, though it may eliminate some squatting we see today: for example, some people currently squat obvious names that would be associated with their brand so that other people can’t claim them and dilute their identity. However, it’s clear that some projects value their identity and want to protect it, and that users sometimes make trust decisions based on the belief that a package is part of a project, rather than a third party extension to it. All in all, I think it would be really great to have a way of very clearly and visibly identifying packages that are part of the same project.

(I’ll note that by introducing the hyphen mapping you seem to weaken this branding aspect of your proposal because packages from the “foo” project won’t be distinguished from other packages beginning with “foo-”. Indeed, from a technical perspective this mapping seems untenable: we can’t allow crate name collisions.)

However, I’m not certain that this shared identity should have the properties of a namespace. One other reason people like optional namespaces is that it makes it easy to upload forks, e.g. I could upload withoutboats/libc to fork libc. While technically I can already upload withoutboats-libc, the system doesn’t really have an affordance for adding forks to the registry, making it very uncommon, whereas I think this would make it more common.

I think enabling forks to be a part of the registry is on the surface appealing, but could have very serious negative consequences. While I could fork serde, for example, once I do, all of my libraries that depend on my fork are incompatible with libraries that depend on proper serde, because withoutboats_serde::Serialize is not the same trait as serde::Serialize. Encouraging forks breaks up the ecosystem, which is a negative outcome we’re trying to avoid. That’s why all of our mechanisms for supporting forks and downstream patches are designed to avoid allowing you to upload your fork to crates.io.
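
The incompatibility can be shown in miniature without the real crates. In this sketch the modules `upstream` and `fork` are stand-ins for serde and a hypothetical withoutboats_serde, and `Point`, `emit` and `Bridge` are invented for the example.

```rust
// Stand-ins for the real crates: `upstream` plays `serde`, `fork` plays a
// hypothetical `withoutboats_serde`. The trait bodies are identical on purpose.
mod upstream {
    pub trait Serialize {
        fn to_text(&self) -> String;
    }
}

mod fork {
    pub trait Serialize {
        fn to_text(&self) -> String;
    }
}

/// A type from a library that depends on the fork.
struct Point(i32, i32);

impl fork::Serialize for Point {
    fn to_text(&self) -> String {
        format!("({}, {})", self.0, self.1)
    }
}

/// A function from a library that depends on upstream.
fn emit<T: upstream::Serialize>(value: &T) -> String {
    value.to_text()
}

// `emit(&Point(1, 2))` does not compile: `Point` implements `fork::Serialize`,
// and the compiler treats that as unrelated to `upstream::Serialize`.
// Someone has to write glue like this to cross the boundary:
struct Bridge<T>(T);

impl<T: fork::Serialize> upstream::Serialize for Bridge<T> {
    fn to_text(&self) -> String {
        self.0.to_text()
    }
}
```

Even though the two traits are textually identical, values only cross between the two halves of the ecosystem through a wrapper such as `Bridge`, which is the fragmentation cost described above.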

So I’d be inclined to enable a more visible shared branding without all of the consequences of “optional namespaces.” As a minimal first step, one thing we could do is just make ownership more visible in the web interface, without any change to cargo. For example, if you have a github organization as an owner of a package, we could make the page different so that the GitHub org is very prominently displayed, we link to other packages owned by that org, etc.

Regarding shared branding, on crates.rs I use GitHub repo URLs to detect related crates, so for example I see that rayon-core is part of Rayon, and put Rayon in the top navigation breadcrumbs:

It's generally nice, except:

  • Not all projects use a monorepo. I'm planning to extend this to repos from the same GitHub orgs as well.
  • UI for this is hard. Parent crate in the top nav is not visible enough. Putting it in front of the crate name is too visible.

For this I don't need namespaces. It could be a key in Cargo.toml like parent-crate = "rayon" or a workspace-like config file in the main repo.
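
As a rough idea of the repo-based detection, grouping crates by their repository URL could look like the sketch below. This is not the crates.rs code; the function name and the input shape (pairs of crate name and repository URL) are assumptions.

```rust
use std::collections::BTreeMap;

/// Groups crate names by repository URL.
/// Input pairs are (crate name, repository URL), supplied by the caller.
fn group_by_repository<'a>(crates: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut groups = BTreeMap::new();
    for &(name, repo) in crates {
        // Ignore a trailing slash so that equivalent URLs land in one group.
        groups
            .entry(repo.trim_end_matches('/'))
            .or_insert_with(Vec::new)
            .push(name);
    }
    groups
}
```

Crates that share a monorepo end up in the same group; projects split across several repositories would need the org-level extension mentioned above.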

@withoutboats, those are some good points. I certainly have come to understand that mapping / to _ won’t get off the ground.

I intended this proposal as a compromise to try to fix what seems to be a growing rift in the community, and I’m a bit worried that improving branding alone won’t go far enough to satisfy the namespaces camp. Technical decisions shouldn’t be political, so take that with a grain of salt.

I also think it would be a real shame to end up with a situation where branding was focused on the authors of a crate rather than the crate itself.

The name mapping issue has to be resolved, and it’s worth trying to get some input from those who feel strongly about namespaces needing to exist in some form.

Certainly, I think making branding more visible from the UI would be a great thing.

Thanks for kicking the discussion into gear, @ethanpailes.

The name mapping is certainly an issue. @Manishearth’s suggestion of remapping foo/bar to foo::bar looks better (IMO) but isn’t airtight either, because the parent crate foo may contain a module called bar. I suppose we could invent a new crate namespace separator, that is not / or ::, so as not to conflict with existing Rust syntax. As a strawman, I’ll use PHP’s awful choice of \. In that case, we just add \ as a permissible character in crate names, so you’d write foo\bar = "42.0" in Cargo.toml and use foo\bar::baz::Quux; in lib.rs.
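
To show why a dedicated separator removes the ambiguity, here is a small sketch that splits a path written with the strawman \ separator. The type and function names are hypothetical, and real name resolution in the compiler is of course far more involved.

```rust
/// A `use` path split into its parts under the strawman `\` separator.
#[derive(Debug, PartialEq)]
struct CratePath<'a> {
    namespace: Option<&'a str>, // parent crate, if any
    krate: &'a str,             // the crate itself
    items: Vec<&'a str>,        // module/item segments after the crate
}

fn parse_use_path(path: &str) -> CratePath<'_> {
    // Everything before the first `::` names the crate; the rest is ordinary Rust.
    let mut segments = path.split("::");
    let head = segments.next().unwrap_or("");
    let (namespace, krate) = match head.split_once('\\') {
        Some((ns, name)) => (Some(ns), name),
        None => (None, head),
    };
    CratePath { namespace, krate, items: segments.collect() }
}
```

With this split, `foo\bar::Quux` names the crate bar inside the foo namespace, while `foo::bar::Quux` still means the module bar inside the crate foo, so the two can no longer be confused.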

I feel like it would be a good idea for any proposed RFC related to squatting or namespacing to do a little research into other languages’ registries (PyPI, npm, RubyGems, Maven, etc.) to see if other more mature (in terms of age) registries have:

  • Experienced repeated abuses from squatting
  • Done anything about it. And was it effective?
  • Set up associations between related packages
  • Learned lessons from past successes/failures in this area

Rust is far from unique in wanting to set up an open source collection of packages, so I’d have to imagine there are some other-language registry maintainers we could learn a lot from.

Regarding squatting and npm, I’ve posted in the squatting thread: Crates.io squatting

Yeah it seems like we would have to introduce a new sigil which is (1) not a binary operator, and (2) not part of the current name syntax (so just not :: or .). \ would work from a technical perspective, but I agree with your aesthetic misgivings. I don’t really have any obviously better ideas, but I’ll throw a few of them into the mix:

::: would work, but I think it would make people’s eyes swim.

:> doesn’t look terrible to my eyes, ::> could also work.

I could really use help figuring out what color to paint the fence.

The problem with this approach is that it adds even more complexity to the module system for users to deal with, which is already a weak point for rust. I would hate to undo the effort put into simplifying that aspect of the language.

We could also follow npm and prefix the namespace’s name with some sigil, like @tokio/core, which would then be parsed as a single ident.

I’ve looked around at some prior art to get an idea of how viable a single flat namespace is in the long run. npm’s “scopes” seem like the closest thing to this proposal in the wild (the npm GitHub issue where they adopted scopes has a discussion which is relevant here). Most flat-namespace ecosystems seem to have some sort of formal arbitration policy around package ownership, with Ruby being the notable exception[1].

I think there are still legitimate concerns with every permutation of this proposal that has been discussed, but I think the new-sigil variant is good enough to deserve a day in court. I’ll start writing something up. It has been really useful to get all this input during the Pre-RFC process. Thanks everyone!

[1]: Python has: https://www.python.org/dev/peps/pep-0541/, npm has: https://www.npmjs.com/policies/disputes, but I struggled to find a policy for RubyGems.

This is my personal opinion. It should not be taken as a statement from the crates.io team, nor does it imply that anybody else on the team shares my opinion.

In general, I like this technique, although I think we need more user research to make sure this is solving the problems we hope it’s solving. Namely, as said other places in this thread, this really only addresses the “shared branding”/officialness problem. Is this a problem worth solving? Is this the best solution to that problem?

In addition to the questions already raised about sigils/naming/etc, I’d like to see more details around permissions made explicit in an eventual RFC along these lines: it seems like whoever is designated as an owner (either directly or via team ownership) of crate “foo” should be able to create new crates “foo/whatever”. Then should crates “foo/whatever” automatically get the same owners of crate “foo”? That seems like the common case and would save folks time having to remember to add the other owners, but are there scenarios where that wouldn’t be desired? If, after creation, the owners of either the crate “foo” or the crate “foo/whatever” are changed, should those changes be synced to the other crates in the group? Only if the top level crate’s owners are changed? What would folks expect?
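
One possible answer to these questions, sketched as a toy model: only an owner of “foo” may create “foo/whatever”, and the new crate starts with a copy of the parent’s owners that is not kept in sync afterwards. The `Owners` type and every rule in it are assumptions for the sake of discussion, not a description of crates.io.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy model of crate ownership; names and rules here are assumptions.
#[derive(Default)]
struct Owners {
    by_crate: HashMap<String, BTreeSet<String>>,
}

impl Owners {
    /// Publishes `name` on behalf of `user`. A sub-crate `foo/x` may only be
    /// created by an owner of `foo`, and starts with a copy of `foo`'s owners.
    fn publish(&mut self, user: &str, name: &str) -> Result<(), String> {
        if self.by_crate.contains_key(name) {
            return Err(format!("`{}` already exists", name));
        }
        let owners = match name.split_once('/') {
            // Top-level crate: the publisher becomes the only owner.
            None => BTreeSet::from([user.to_string()]),
            Some((parent, _)) => {
                let parent_owners = self
                    .by_crate
                    .get(parent)
                    .ok_or_else(|| format!("parent crate `{}` does not exist", parent))?;
                if !parent_owners.contains(user) {
                    return Err(format!("`{}` does not own `{}`", user, parent));
                }
                parent_owners.clone() // a snapshot: later changes are not synced
            }
        };
        self.by_crate.insert(name.to_string(), owners);
        Ok(())
    }
}
```

The snapshot-on-creation choice is the simplest one to implement; syncing later ownership changes in either direction would need an explicit policy, which is exactly the open question raised above.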

EDIT: Also, thank you @ethanpailes for taking the time to write this up! :heart:

I like this style of proposal since it gives two desirable properties at once: affiliation and namespace security. I wonder if symbol mangling could be done in a backwards compatible manner by using double underscore as a mangled separator: the crate foo/bar becomes extern crate foo__bar;. Icky, but if nobody has a crate name with double underscores already it could be done without changes to the ecosystem.
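
A minimal sketch of that mangling, assuming (as stated above) that no existing crate name contains a double underscore. The function names are invented for the example.

```rust
/// Hypothetical mangling: `/` becomes `__`, `-` becomes `_` as today.
fn mangle(registry_name: &str) -> String {
    registry_name.replace('/', "__").replace('-', "_")
}

/// Recovers (namespace, crate) from a mangled name, if it has a namespace.
/// Only sound if no crate name contains `__` on its own.
fn split_mangled(ident: &str) -> Option<(&str, &str)> {
    ident.split_once("__")
}
```

Unlike the single-underscore mapping, foo/bar and foo-bar now mangle to different identifiers, so the two crates can coexist in one dependency graph.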

With regard to Carol’s questions of namespace author permissions, the obvious answers are either “make comprehensive security permissions that can be twiddled to do whatever a user might want”, or “crate namespace groups should trust the members of that group and do their own governance”. I like the latter solution.

Something else that I don’t think has been brought up yet: at the risk of making a hairy issue more complicated, is there any reason the level of nesting should stop at one?