[Pre-RFC]: Packages as Namespaces

This is my personal opinion. It should not be taken as a statement from the crates.io team, nor does it imply that anybody else on the team shares my opinion.

I do not think we can just paper over the naming collision between foo/bar and foo-bar. You mention that we map hyphens to underscores, but that ignores the fact that we very explicitly do not allow two crates that would normalize to the same name to be uploaded. There are two main reasons for this.

The first is that this opens up a potential attack vector. If you copy some example code involving extern crate foo_bar, but accidentally depend on foo-bar instead of foo/bar, your code may compile against a library other than the one you intended.

The second is that we have a lot of precedent for disallowing things that could result in “your code stops compiling just by adding a crate”. This is why the orphan rule exists, and it’s also part of why we disallow foo_bar if foo-bar already exists. I do not think that saying “I am skeptical that it will come up much” is a satisfactory answer here.
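
To make the collision concrete, here is a minimal sketch of a uniqueness check on normalized names. It assumes the proposal's mapping of / to _ alongside the existing treatment of hyphens and underscores; the Registry type and canonical_name function are hypothetical names for illustration, not crates.io code.

```rust
use std::collections::HashSet;

/// Canonical form used for collision checks in this sketch:
/// both `-` and `/` are treated as `_` (assumed rule, for illustration only).
fn canonical_name(name: &str) -> String {
    name.chars()
        .map(|c| if c == '-' || c == '/' { '_' } else { c })
        .collect()
}

/// Hypothetical registry that remembers only canonical names.
struct Registry {
    taken: HashSet<String>,
}

impl Registry {
    fn new() -> Self {
        Registry { taken: HashSet::new() }
    }

    /// Rejects any name whose canonical form is already taken.
    fn publish(&mut self, name: &str) -> Result<(), String> {
        if self.taken.insert(canonical_name(name)) {
            Ok(())
        } else {
            Err(format!("`{}` collides with an existing crate name", name))
        }
    }
}
```

Under these assumptions, once foo-bar exists, publishing either foo_bar or foo/bar is rejected, which is the same rule that blocks foo_bar today.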

More generally, I also would like to see an explanation of why you think that uploading a crate called foo should grant you special ownership of that namespace. That’s not generally how this sort of thing works. When you trademark a name, that doesn’t mean nobody else can use it as part of their name, just that they can’t use it to try and imitate you.

I’d love to see a little more clarification around the specific problem this is trying to solve. There’s some vague mention of squatting, but little explanation of how namespaces are connected to squatting.

That brings us to alternatives, which aren’t really addressed at all. Are we just trying to solve the problem of visibility of packages owned by a single org? Is this something that can just be solved by adjusting the UI to make the owning org more visible? (e.g. you can already see all the “official” Diesel crates at https://crates.io/teams/github:diesel-rs:core)

Is the goal to help users figure out whether a crate is made by the maintainers of the main package or a third party? (Assuming that we will remove malicious crates if they are uploaded, why do we think that distinction is important?) Could this be solved by making the owners more prominent in the UI?

There’s also the unanswered question of should this be mandatory, and if not whether it should be opt in or opt out. I know I personally have no interest in reserving the Diesel namespace. I want authors who create plugins for Diesel to be treated the same as the authors of Diesel. If we make this opt-out, what about the authors of less maintained crates who may feel similarly, but not know they have to opt out of this?

I think this is a good start, but there are a lot of questions that could use more explanation.

I would like to re-iterate that these are my personal opinions, and I am not speaking on behalf of the team.

That’s an interesting point. Currently we have informal collaboration and project-crate is not just "it’s (official) part of the project", but can also mean "works with the project".

This is my personal opinion. It should not be taken as a statement from the crates.io team, nor does it imply that anybody else on the team shares my opinion.

I think that’s the significantly more common case. Often it’s both. diesel-migrations happens to be maintained by the Diesel project, but the more important property is “it’s an implementation of database migrations for Diesel”. Try searching for chrono- or tokio- or diesel- in the crates.io search. All the top results are things that integrate with or add new behaviors to those projects, and are not maintained by the owner of the main project. (Note: I am not trying to present this as evidence that this is the most common case. This obviously has sampling bias. However I think those are some really good examples of the kinds of crates I’m referring to)

I would like to re-iterate that these are my personal opinions, and I am not speaking on behalf of the team.

It’s great to hear that others were already thinking along similar lines.

It sounds like the way that Cargo.toml/crates.io names are translated into Rust names is going to be what makes or breaks this idea. I’m certainly not wedded to translating / to _; it was just the first thing that occurred to me. I recognize that “I don’t think it will come up much” isn’t a very satisfying answer. The idea of translating / to :: or some new sigil seems like a better approach. I think making the outer namespace cosmetic just makes the whole name clash problem worse, because then I would have to resort to Cargo.toml aliases to deal with using both foo and bar/foo in the same project.

I agree with your concerns about how to map crates.io names to rust names. The notes that @Manishearth linked had some better ideas.

In terms of getting special ownership of the foo namespace because you own foo, I think it is important to note that anyone else can still publish a foo-awesome-extention crate in the top level namespace. The situation for third party authors of ecosystem crates would be exactly the same as it is today. The only difference would come in how easy it is for users to determine the level of affiliation a crate has with the “core crate”. Today, if I stumble upon regex-syntax I have a bit of a research burden to figure out how it relates to regex. Under the proposed system, I could tell at a glance.

A key part of this idea is the focus on packages instead of organizations. It’s not just a question of making organizations more visible. Packages are very focused, while some organizations can become quite sprawling. I don’t really care if a package lives in the google namespace, but the fact that it lives in the protocolbuffers namespace tells me a bit more.

There’s also the unanswered question of should this be mandatory, and if not whether it should be opt in or opt out. I know I personally have no interest in reserving the Diesel namespace. I want authors who create plugins for Diesel to be treated the same as the authors of Diesel.

You could opt not to use the new feature, and continue to publish crates maintained by the diesel team in the global namespace. Then users searching for diesel plugins would not see any diesel/foo crates, so all crates would be created equal in the world of diesel. No one else would be able to publish something in the diesel namespace, but if you never used it this wouldn’t matter.

Speaking only at a very high level & trying not to get too deep into technical issues (I agree with sgrif’s post in that regard).

I think it would be valuable for the registry to somehow enable some kind of very visible shared branding among crates by the same project. I don’t think this has anything to do with squatting per se, though it may eliminate some squatting we see today: for example, some people currently squat obvious names that would be associated with their brand so that other people can’t claim them and dilute their identity. However, it’s clear that some projects value their identity and want to protect it, and that users sometimes make trust decisions based on the belief that a package is part of a project, rather than a third party extension to it. All in all, I think it would be really great to have a way of very clearly and visibly identifying packages that are part of the same project.

(I’ll note that by introducing the hyphen mapping you seem to weaken this branding aspect of your proposal, because packages from the “foo” project won’t be distinguished from other packages beginning with “foo-”. Indeed, from a technical perspective this mapping seems untenable: we can’t allow crate collisions.)

However, I’m not certain that this shared identity should have the properties of a namespace. One other reason people like optional namespaces is that it makes it easy to upload forks, e.g. I could upload withoutboats/libc to fork libc. While technically I can already upload withoutboats-libc, the system doesn’t really have an affordance for adding forks to the registry, making it very uncommon, whereas I think this would make it more common.

I think enabling forks to be a part of the registry is on the surface appealing, but could have very serious negative consequences. While I could fork serde, for example, once I do, all of my libraries that depend on my fork are incompatible with libraries that depend on proper serde, because withoutboats_serde::Serialize is not the same trait as serde::Serialize. Encouraging forks breaks up the ecosystem, which is a negative outcome we’re trying to avoid. That’s why all of our mechanisms for supporting forks and downstream patches are designed to avoid allowing you to upload your fork to crates.io.
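
The incompatibility can be reproduced in miniature. The sketch below uses two modules as stand-ins for a crate and its published fork; serde_like and forked_serde_like are made-up names, and the one-method Serialize trait is a simplification that does not match serde's real API.

```rust
// Stand-in for the original crate (hypothetical, simplified trait).
mod serde_like {
    pub trait Serialize {
        fn serialize(&self) -> String;
    }

    pub fn to_string<T: Serialize>(value: &T) -> String {
        value.serialize()
    }
}

// Stand-in for a published fork with byte-for-byte identical source.
mod forked_serde_like {
    pub trait Serialize {
        fn serialize(&self) -> String;
    }

    pub fn to_string<T: Serialize>(value: &T) -> String {
        value.serialize()
    }
}

pub struct Point {
    pub x: i32,
    pub y: i32,
}

// A library built on the fork implements only the fork's trait.
impl forked_serde_like::Serialize for Point {
    fn serialize(&self) -> String {
        format!("({}, {})", self.x, self.y)
    }
}

pub fn demo() -> String {
    let p = Point { x: 1, y: 2 };
    // The next line would not compile: `Point` does not implement
    // `serde_like::Serialize`, even though the two traits look identical.
    // serde_like::to_string(&p);
    forked_serde_like::to_string(&p)
}
```

Identical definitions in different crates are still distinct traits to the compiler, so every downstream library has to pick one side.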

So I’d be inclined to enable a more visible shared branding without all of the consequences of “optional namespaces.” As a minimal first step, one thing we could do is just make ownership more visible in the web interface, without any change to cargo. For example, if you have a github organization as an owner of a package, we could make the page different so that the GitHub org is very prominently displayed, we link to other packages owned by that org, etc.

Regarding shared branding, on crates.rs I use GitHub repo URLs to detect related crates, so for example I see that rayon-core is part of Rayon, and put Rayon in the top navigation breadcrumbs.

It's generally nice, except:

  • Not all projects use a monorepo. I'm planning to extend this to repos from the same GitHub orgs as well.
  • UI for this is hard. A parent crate in the top nav is not visible enough. Putting it in front of the crate name is too visible.

For this I don't need namespaces. It could be a key in Cargo.toml like parent-crate = "rayon" or a workspace-like config file in the main repo.

@withoutboats, those are some good points. I certainly have come to understand that mapping / to _ won’t get off the ground.

I intended this proposal as a compromise to try to fix what seems to be a growing rift in the community, and I’m a bit worried that improving branding alone won’t go far enough to satisfy the namespaces camp. Technical decisions shouldn’t be political, so take that with a grain of salt.

I also think it would be a real shame to end up with a situation where branding was focused on the authors of a crate rather than the crate itself.

The name mapping issue has to be resolved, and it’s worth trying to get some input from those who feel strongly about namespaces needing to exist in some form.

Certainly, I think making branding more visible from the UI would be a great thing.

Thanks for kicking the discussion into gear, @ethanpailes.

The name mapping is certainly an issue. @Manishearth’s suggestion of remapping foo/bar to foo::bar looks better (IMO) but isn’t airtight either, because the parent crate foo may contain a module called bar. I suppose we could invent a new crate namespace separator, that is not / or ::, so as not to conflict with existing Rust syntax. As a strawman, I’ll use PHP’s awful choice of \. In that case, we just add \ as a permissible character in crate names, so you’d write foo\bar = "42.0" in Cargo.toml and use foo\bar::baz::Quux; in lib.rs.

I feel like it would be a good idea for any proposed RFC related to squatting or namespacing to do a little research into other languages’ registries (PyPI, npm, RubyGems, Maven, etc.) to see if other more mature (in terms of age) registries have:

  • Experienced repeated abuses from squatting
  • Done anything about it, and whether it was effective
  • Set up associations between related packages
  • Learned lessons from past successes/failures in this area

Rust is far from unique in wanting to set up an open source collection of packages, so I’d have to imagine there are some other-language registry maintainers we could learn a lot from.

Regarding squatting and npm, I’ve posted in the squatting thread: Crates.io squatting

Yeah it seems like we would have to introduce a new sigil which is (1) not a binary operator, and (2) not part of the current name syntax (so just not :: or .). \ would work from a technical perspective, but I agree with your aesthetic misgivings. I don’t really have any obviously better ideas, but I’ll throw a few of them into the mix:

::: would work, but I think it would make people’s eyes swim.

:> doesn’t look terrible to my eyes, ::> could also work.

I could really use help figuring out what color to paint the fence.

The problem with this approach is that it adds even more complexity to the module system for users to deal with, which is already a weak point for rust. I would hate to undo the effort put into simplifying that aspect of the language.

We could also follow npm and prefix the namespace’s name with some sigil, like @tokio/core, which would then be parsed as a single ident.

I’ve looked around at some prior art to get an idea of how viable a single flat namespace is in the long run. npm’s “scopes” seem like the closest thing to this proposal in the wild (the npm github issue where they adopted scopes has a discussion which is relevant here). Most flat-namespace ecosystems seem to have some sort of formal arbitration policy around package ownership, with ruby being the notable exception[1].

I think there are still legitimate concerns with every permutation of this proposal that has been discussed, but I think the new-sigil variant is good enough to deserve a day in court. I’ll start writing something up. It has been really useful to get all this input during the Pre-RFC process. Thanks everyone!

[1]: Python has: https://www.python.org/dev/peps/pep-0541/, npm has: https://www.npmjs.com/policies/disputes, I struggled to find a policy for rubygems.

This is my personal opinion. It should not be taken as a statement from the crates.io team, nor does it imply that anybody else on the team shares my opinion.

In general, I like this technique, although I think we need more user research to make sure this is solving the problems we hope it’s solving. Namely, as said elsewhere in this thread, this really only addresses the “shared branding”/officialness problem. Is this a problem worth solving? Is this the best solution to that problem?

In addition to the questions already raised about sigils/naming/etc, I’d like to see more details around permissions made explicit in an eventual RFC along these lines: it seems like whoever is designated as an owner (either directly or via team ownership) of crate “foo” should be able to create new crates “foo/whatever”. Then should crates “foo/whatever” automatically get the same owners of crate “foo”? That seems like the common case and would save folks time having to remember to add the other owners, but are there scenarios where that wouldn’t be desired? If, after creation, the owners of either the crate “foo” or the crate “foo/whatever” are changed, should those changes be synced to the other crates in the group? Only if the top level crate’s owners are changed? What would folks expect?
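
One possible answer to these questions, sketched purely for discussion: only an owner of “foo” may create “foo/whatever”, and the new crate starts with a copy of the parent's owners that is not kept in sync afterwards. The Registry type and its methods are hypothetical and do not reflect how crates.io stores ownership.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical registry mapping crate names to their sets of owners.
struct Registry {
    owners: HashMap<String, BTreeSet<String>>,
}

impl Registry {
    fn new() -> Self {
        Registry { owners: HashMap::new() }
    }

    /// Publishes a new crate. For `parent/child` names the publisher must
    /// already own `parent`, and the child inherits a snapshot of its owners.
    fn publish(&mut self, user: &str, name: &str) -> Result<(), String> {
        if self.owners.contains_key(name) {
            return Err(format!("crate `{}` already exists", name));
        }
        let owners = match name.split_once('/') {
            Some((parent, _)) => {
                let parent_owners = self
                    .owners
                    .get(parent)
                    .ok_or_else(|| format!("parent crate `{}` does not exist", parent))?;
                if !parent_owners.contains(user) {
                    return Err(format!("`{}` is not an owner of `{}`", user, parent));
                }
                // Snapshot copy: later owner changes are not synced (assumed policy).
                parent_owners.clone()
            }
            None => {
                let mut set = BTreeSet::new();
                set.insert(user.to_string());
                set
            }
        };
        self.owners.insert(name.to_string(), owners);
        Ok(())
    }

    fn owners_of(&self, name: &str) -> Option<&BTreeSet<String>> {
        self.owners.get(name)
    }
}
```

Whether the snapshot should instead be a live link to the parent's owner list is exactly the open question above; the sketch picks the simpler behavior only to have something concrete to react to.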

EDIT: Also, thank you @ethanpailes for taking the time to write this up! :heart:

I like this style of proposal since it gives two desirable properties at once: affiliation and namespace security. I wonder if symbol mangling could be done in a backwards compatible manner by using double underscore as a mangled separator: the crate foo/bar becomes extern crate foo__bar;. Icky, but if nobody has a crate name with double underscores already it could be done without changes to the ecosystem.
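
A rough sketch of that mangling, assuming hyphens keep mapping to underscores and that any segment which could make the result ambiguous is rejected outright. The mangle function and its rejection rules are assumptions for illustration, not an existing Cargo or rustc behavior.

```rust
/// Maps a registry name such as `foo/bar` to the identifier `foo__bar`.
/// Returns `None` when a segment is empty, already contains `__`, or
/// starts or ends with `_`, since the mangled form would then be ambiguous.
fn mangle(registry_name: &str) -> Option<String> {
    let mut parts = Vec::new();
    for segment in registry_name.split('/') {
        // Existing convention: hyphens become underscores in Rust code.
        let ident = segment.replace('-', "_");
        let ambiguous = ident.is_empty()
            || ident.contains("__")
            || ident.starts_with('_')
            || ident.ends_with('_');
        if ambiguous {
            return None;
        }
        parts.push(ident);
    }
    // Double underscore is the mangled namespace separator.
    Some(parts.join("__"))
}
```

With these rules, mangle("foo/bar") gives foo__bar, while a name that already contains a double underscore is refused, which is why it matters whether such names exist in the index today.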

With regard to Carol’s questions of namespace author permissions, the obvious answers are either “make comprehensive security permissions that can be twiddled to do whatever a user might want”, or “crate namespace groups should trust the members of that group and do their own governance”. I like the latter solution.

Something else that I don’t think has been brought up yet: at the risk of making a hairy issue more complicated, is there any reason the level of nesting should stop at one?

@carols10cents, the note about permissions is a really good one. I had definitely planned to flesh the permissions model out more in the full proposal. I think @icefoxen’s point about basically trusting the maintainers of a top level package to manage their own community is a good one. Selecting good defaults is probably the key thing for this RFC.

With respect to user research, I think that looking at why npm adopted scopes can provide a lot of value. They have already done some of the legwork about this sort of thing and talked about it in the open, so that’s great. I started this discussion because of the flood of chatter about this stuff on /r/rust and other boards, and I think those discussions can also be mined.

I promise I’ll write up the actual RFC real soon now.

If we did have multi-level nesting, I wonder if it could start one level up with the registry. In other words: @foo/bar would be short for @cratesio/foo/bar and registries would just be namespaces that happened to have their own URL.

Sourcegraph search of the index for __

repo:^github\.com/rust-lang/crates\.io-index$ __

One crate, teardown_tree___treap, whose description is:

“Do not use - intended for internal use of teardown_tree crate”

Which seems like it would enjoy being teardown_tree/treap instead?

This is the singular occurrence of a double underscore in the index, and is indeed a triple underscore.

(This post is merely meant to be informative and not to express support one way or another)

Awesome datapoint!