[Pre-RFC] Crates expropriation policy


#21

<3 proactively creating and revisiting policies is something i care about a lot and spent a lot of time working on at npm (if you think we’re in a tricky space right now with crates.io zomg i have horror stories for you).

the team is coming together and also really does care about this. i know the threads have been heated and i know we have a lot to improve on re: communication.

“it’s a problem, we just don’t have time to deal with it now” is definitely where we are at. and we’re working on building capacity so that we can address everything we need to- and even more so, we desire to address it proactively instead of re-actively :slight_smile: the entire team agrees that crates.io is a huge boon for rust, and we want to keep it that way <3


#22

I would still like to see this go through to a full RFC, even if there’s a feeling that it might not be the thing we need to prioritize right now. Even if that does end up being the case, I’d really like to have that documented by an RFC being officially postponed. I also think a discussion about exactly how big of a problem there is/perceived to be, and exactly where this should be prioritized is a healthy one to have.


#23

I’m trying hard to not be snarky here: do you not think that requiring contributors to crates.io attend a synchronous meeting will lead to a bubble type structure of the kind that might not lead to a better crates.io?

I experience this as a very high barrier to entry, so despite strong motivation to contribute this is a massive turnoff for me personally even before taking into account the actual timeslot, which is completely infeasible from my CET timezone.

I also frankly find it hard in general to square remarks on wanting to create an inclusive community with the tendency to get a lot of the work done in synchronous meetings, which makes it harder for people with day jobs that don’t involve working on Rust and/or people with (small) children and/or people across timezones.


#24

if you would like to join the team and the time of the meeting does not work for you- please let me know! i, and @sgrif, the leads of the team are very amenable to changing the schedule to fit the needs of folks.

that being said, team based work is how the Rust project works and has worked for a very long time. i agree that it does indeed create limitations but those limitations are indeed the limitations of the human condition. a condition i also find frustrating but have yet to be able to solve on my own.

inevitably, the work must be done. it must be done by a set of folks with shared context with a reliability and trust that it will get done. this will require sync communication on some level. it’s true that indeed this does limit types of participation but it attempts to limit it in a practical way: an unlimited number of asynchronously proposed features does not naturally turn into a coherent product.

the team is young and currently requires sync communication. virtually every team on the rust project has a sync component, and this is largely due to the desire for a coherent product. it’s possible that we will get to a place where we can operate completely asynchronously, but we aren’t there yet. in fact, in my experience, most open source projects aren’t there yet. that being said, we want to accommodate as many folks as possible.

one of my largest jobs in rust thus far has been making the project more accessible. i’ll be the first to say we aren’t there yet. however, structure and, yes, synchronous communication are some of the best first building blocks i’ve encountered and implemented.

in the end, the work must get done. the product must be coherent. we’re working with a team based structure for now and that will change as the team evolves. i’m sorry that you feel excluded, i’d like to learn more about how to better include you.

i will say tho, that constantly fielding asynchronous calls for action from folks who don’t share context and understanding, doesn’t scale. at the moment, communicating that context and understanding is difficult due to understaffing. it’s a chicken and egg situation and we are doing our best. i’m deeply sorry you feel harmed by this. if there’s any way i can help do let me know, but also know that i am also trying to protect team members from harm when i direct action to these synchronous calls. the crates.io team is largely volunteers and to be completely frank i think the calls from the community have been unfairly overwhelming for the team and i am genuinely seeking to shield them in hopes that it improves their experience and productivity. if the team is constantly reacting to contextless calls for action, the team cannot act proactively.

i hope we can agree that acting proactively is the best move to take. i know the team needs to improve its ability to communicate context. the whole rust team does, to be honest. these are large problems that i haven’t seen any open source project solve successfully at scale. these are open questions that we are solving on the fly now. we care a lot but it’s just actually not super easy. please feel free to shoot us a message with ideas. i am available personally to also discuss it.


#25

Just to add one thing onto what @ag_dubs said, you can absolutely still get involved if you’re unable to attend meetings, it’s just going to be a bit harder. We’re trying to do a better job of providing quest issues, and other ways to help people hop in, but that’s something that takes a lot of time to do well, and we’re already really pressed for time on our urgent issues.

This is part of why one of our main focuses right now is to grow the team, so we have more people and more time to focus on improving this. But for folks to actually join the team, I do think the meetings are important. For a team that operates on consensus, especially when we often have a huge amount of context that we need to communicate, the sync meetings have been extremely helpful.


#26

It’s been said well by @ag_dubs and @sgrif, but I would also like to emphasize that until recently the team has been small and informal. Fortunately that is now changing and we are entering a phase of growth. It looks like we had 7 team members and 7 observers participate in the meeting yesterday. I think that’s fantastic, especially given that just 6 weeks ago none of this was in place.

While a synchronous meeting may not be ideal, I think it has been essential to establishing a regular cadence for the project. It has also put in place a mechanism for us to efficiently discuss and reach consensus around our current top priorities. Overall, I’m more optimistic now than ever for the future of crates.io.


#27

We should not stress the cargo team and the deprecation scheme proposed here seems poorly defined.

We should instead consider adopting a hierarchical crate naming structure in some near future rust edition/epoch, say 2020 or 2022. It might resemble:

All existing top-level crate names become deprecated, except for compiler provided crates like std, core, etc. as well as a few crates like futures that integrate tightly with compiler crates.

I’d personally favor also giving top-level crate names to crates that provide their respective functionality extremely well, including a canonical and elegant interface, so serde and rand, but… I’d foresee disagreement around doing this and it’s not so important. Also, there are many domains like crypto in which one seemingly cannot provide a sufficient interface, making this option impossible there.

We instead reserve almost all top-level names for maintainers, whether organizations or individuals, who must first demonstrate control over some relatively limited name resource with a roughly sane name: a DNS name, a github or gitlab user or organization name, a stackoverflow name with reputation over 1000, or similar.

We’d verify these conditions automatically to minimize the cargo team’s workload. If you cannot satisfy such an automatic condition then you may request a top-level maintainer name manually with a pull request on some cargo-requests repository, which provides a forum for objections.

There are roughly two ways to manage a hierarchical system from cargo itself: We’d likely permit names like RustCrypto/digest but another interesting option would be including a maintainer’s whole namespace, like:

[dependencies]
namespace = "RustCrypto"
sha3 = ...

I’m nervous about this second option because namespaces would commonly overlap and users might become confused as to what namespace they pulled from. We might find tricks to mitigate this concern however.

In both options, we’d expose only the lower level crate names to rustc, not the maintainer name, so we should only need to edit Cargo.toml files for the new edition, not actual rust code.
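As a rough illustration of the first option, here is a minimal sketch of how a name like RustCrypto/digest might be split so that only the lower-level crate name reaches rustc. Everything in it is an assumption made for illustration: the `maintainer/crate` syntax, the `QualifiedName` type, and the `parse_qualified` and `rustc_crate_name` helpers are hypothetical and not part of Cargo.

```rust
/// A hypothetical hierarchical crate name such as `RustCrypto/digest`.
#[derive(Debug, PartialEq, Eq)]
pub struct QualifiedName<'a> {
    /// Maintainer namespace; `None` for a legacy top-level name.
    pub maintainer: Option<&'a str>,
    /// The lower-level crate name.
    pub krate: &'a str,
}

/// Splits `maintainer/crate` on the first `/`.
/// Returns `None` if the input is empty, either part is empty,
/// or the crate part contains a further `/`.
pub fn parse_qualified(name: &str) -> Option<QualifiedName<'_>> {
    match name.split_once('/') {
        Some((maintainer, krate)) => {
            if maintainer.is_empty() || krate.is_empty() || krate.contains('/') {
                None
            } else {
                Some(QualifiedName { maintainer: Some(maintainer), krate })
            }
        }
        None if name.is_empty() => None,
        None => Some(QualifiedName { maintainer: None, krate: name }),
    }
}

/// The name Rust code would see: the maintainer part is dropped,
/// and `-` is mapped to `_` as Cargo already does for crate names.
pub fn rustc_crate_name(q: &QualifiedName<'_>) -> String {
    q.krate.replace('-', "_")
}
```

With this shape, `RustCrypto/digest` and a legacy top-level `digest` would both be seen as `digest` by rustc, which is why only Cargo.toml would need to change.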

tl;dr. We can avoid the problem here with a hierarchical scheme that requires only a slight complication to registering as a maintainer and a modification to Cargo.toml files, with minimal work for the crates.io team and no modifications to actual rust code. It remains to work out the best modification to Cargo.toml files, however.


[Pre-RFC]: Packages as Namespaces
An idea to mitigate attacks through malicious crates
#28

It’s an interesting idea to have edition-level namespaces; it could be a nice incentive to keep crates updated, and you would be able to see the “freshness” of a project’s dependencies. But there are several serious issues:

  • After registration for a new edition namespace is open, there will be a “gold rush” to register names, and some less active contributors could lose their project names to squatters. This could be remedied to some extent with a certain period in which a name can only be registered by the owner of the same name in the previous edition, but the problem will still be there.
  • It will be a political decision to choose which project is worth being a top-level crate. What about regex, winapi, ring and many other important projects? What about crates which they depend upon? I am afraid it will be much more stressful for the crates.io team, and I am not sure that “1000 stars on github” is the right metric to apply here.
  • Changes to Cargo.toml and the registry will be quite substantial; I am not sure if it can be done in a backwards compatible way even considering editions.

In the current proposal, the only additional workload for the crates.io team will be to attempt to contact the owners 3 times and, if there is no answer, to transfer the crate to the requester. We could even automate it, e.g. by making a bot which mentions owners automatically in the relevant issue/PR. And I am not sure what you mean by “poorly defined”. Could you be more specific?
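To make the workflow concrete, here is a minimal sketch of the "three contact attempts, then transfer" rule as a small state machine that such a bot could drive. The type and function names (`RequestState`, `Event`, `step`) are hypothetical, and the sketch assumes that a reply from the owner simply closes the request; the proposal itself may handle that case differently.

```rust
/// Unanswered contact attempts required before a transfer (3 in the proposal).
pub const MAX_ATTEMPTS: u32 = 3;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestState {
    /// The owner has failed to answer `unanswered` attempts so far.
    Pending { unanswered: u32 },
    /// The owner answered; the crate stays where it is (assumption).
    OwnerResponded,
    /// Every attempt went unanswered; the crate may be transferred.
    Transferable,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// A contact attempt expired without an answer.
    AttemptUnanswered,
    /// The owner replied to a contact attempt.
    OwnerReplied,
}

/// Advances a transfer request by one event.
/// `OwnerResponded` and `Transferable` are terminal in this sketch.
pub fn step(state: RequestState, event: Event) -> RequestState {
    match (state, event) {
        (RequestState::Pending { .. }, Event::OwnerReplied) => RequestState::OwnerResponded,
        (RequestState::Pending { unanswered }, Event::AttemptUnanswered) => {
            let unanswered = unanswered + 1;
            if unanswered >= MAX_ATTEMPTS {
                RequestState::Transferable
            } else {
                RequestState::Pending { unanswered }
            }
        }
        (terminal, _) => terminal,
    }
}
```

A request would start as `RequestState::Pending { unanswered: 0 }`, and the bot would feed it one event per expired or answered attempt.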


#29

What I like about this proposal is that it really gets to the heart of the issue: the negative externality of an uncurated package repository is not just that good names are taken by squatted packages, but also possibly by packages that are suboptimal for any of many other reasons. Here are some examples of ways a package could be suboptimal:

  • It could have an insufficient API (e.g. it isn’t finished)
  • It could introduce UB in safe code, or have some other security vulnerability
  • It could be old code and not keep up with the idioms of Rust in its API

None of these are directly tied to being maintained or not (which this RFC tries to make the factor), but the assumption here is that a well-intentioned maintainer would want to fix these issues, so they would tend to disappear except in code for which the maintainer is now unreachable. I’m not sure that assumption holds, but it’s certainly clear that an unreachable maintainer is less likely to address issues than an active one.

However, I think this gets into the very obvious problems with any kind of policy like this at all: intervening to make sure that packages are in some sense good, useful, or fit for a purpose introduces a lot of work for the project as well as a lot of potential for messy, unpleasant fights. Rust has shown that there is a somebody store for a lot of purposes, but moderation hasn’t been one of them and I personally suspect the problem is intractable.

Ultimately, I believe this is just an implication of crates.io’s being uncurated. A better approach to trying to make packages with obvious names useful would be to have a curated package repository, which limits some of the pressure because you’re never taking someone’s package away, you’re just not letting them get one.

However, the Rust project is currently not really able to maintain a curated package repository. (I’d also point out that the core team proposed something a bit similar to this a year or two ago, and it was very widely unpopular.) Since there’s a lot of community interest in having an experience which is more curated, I think it would be a worthwhile effort for community members to try to set up a curated repository with a high standard for inclusion. We hope to stabilize alternative registries soon, so really nothing prevents this from being a totally unofficial community effort.


#30

I have considered starting an alternative filtered/curated registry, but I don’t think it’s actually possible in practice.

I’ve started a separate topic about the deal-breaker with alternative registries:


#31

On the other hand, it should be possible to independently organize and implement a curated view of crates.io, choosing only crates that achieve the independent organization’s criteria for inclusion. You yourself could start good.crates.rs, for instance, and form a group to help decide which were the good ones. As a newbie, I’d find such an advice site useful.

A curated view of crates.io wouldn’t prevent name squatting, but (again as a newbie) I don’t much care whether a crate that does what I want has the perfect name, so long as I can find it by its keywords or description — or by a recommendation from something like good.crates.rs.


Silo effect of alternative registries
#32

Please let’s discuss alternative registries in the other thread, not in this one about the expropriation policy of crates.io.


#33

Regarding prior art: Python has PEP 541, which is surprisingly similar in spirit and is on course to be accepted.


#34

I proposed revoking all crate names only once, because after that we would rely on the existing name-filtering schemes used by github, gitlab, DNS, etc.

We could use the repository url for github, etc. to give all existing crates a top-level maintainer name and thus provide a second-level crate name, so some RustCrypto maintainer gets created automatically during this one-time upgrade.
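A minimal sketch of that one-time derivation, assuming the maintainer name is simply the first path segment of a repository URL on a known host. The host list and the `maintainer_from_repository` helper are hypothetical, and a real migration would also need to handle crates with missing or unusual repository fields.

```rust
/// Hosts whose first path segment is treated as a maintainer name.
/// This list is an assumption made for the sketch.
const KNOWN_HOSTS: [&str; 2] = ["github.com", "gitlab.com"];

/// Derives a maintainer name from a crate's repository URL.
/// Returns `None` for unknown hosts, non-http(s) URLs, or URLs
/// without an owner segment.
pub fn maintainer_from_repository(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    let (host, path) = rest.split_once('/')?;
    if !KNOWN_HOSTS.contains(&host) {
        return None;
    }
    // The owner (user or organization) is the first path segment.
    let owner = path.split('/').next()?;
    if owner.is_empty() {
        None
    } else {
        Some(owner)
    }
}
```

For example, a crate whose repository is `https://github.com/RustCrypto/hashes` would land under the `RustCrypto` maintainer name, while a crate hosted elsewhere would fall through to the manual request path.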

I’d personally favor select crates having that top-level name, but maybe only crates that spend years in some nursery. And I’d expect enough voices to disagree with doing that so that it’d never happen.

We should permit existing crates to continue using their current top-level name for one edition, or maybe only for their current version number.

I’m more worried that namespaces being this cheap would encourage creating new crates for minor forks when maintainers disagree over small details, like exposed fields.