Pre-RFC: Configurable authentication requirements and non-Git indices

julian · July 4, 2019, 10:19pm

Support for custom registries has been added, but they’re fairly limited in what they can be used for. Currently, authentication is only used for the operations where crates.io itself needs it, and there is no option to configure this.

One major use case for this would be companies with private registries. Currently, the only way to ensure privacy of these is to keep them behind a firewall, but that may not be viable for all. Adding a configuration option that sends the Authorization header on all API requests and on downloads (or uses some other auth scheme for downloads) would be very useful for this case.
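To make the idea concrete, here is a minimal Python sketch of what such an option could change on the client side. The `auth_required` flag and the `build_headers` helper are hypothetical names for illustration only; they are not existing Cargo settings or APIs, and the header value format is whatever the registry expects.

```python
def build_headers(token, endpoint_needs_auth, auth_required=False):
    """Return HTTP headers for one registry request.

    auth_required is a hypothetical per-registry option: when True, the
    token is sent on every request, including index reads and downloads.
    When False, the token is only sent where the endpoint itself needs it.
    """
    headers = {"Accept": "application/json"}
    if auth_required or endpoint_needs_auth:
        if token is None:
            raise ValueError("registry requires a token but none is configured")
        # Assumption: the registry accepts the raw token as the header value.
        headers["Authorization"] = token
    return headers
```

With the flag off, a download request carries no credentials; with it on, the same request includes the `Authorization` header.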

The reason I mention non-Git indices is that some companies have projects with stricter privacy requirements, where only certain people are allowed to know about them, and with Git it’s not as easy to filter who sees which packages. In theory, separate registries could solve this, but that is not as streamlined.

I’m ready to write an RFC for this if it’s likely to be accepted.

gbutler · July 5, 2019, 12:34am

I would look into seeing if you could implement the necessary functionality through cargo plug-ins instead of in cargo itself. Then, no RFC required and it does not require the core team to take on maintenance burden of the extensions. You, and other interested parties, can simply maintain one or more cargo plugins that provide the necessary functionality.

julian · July 5, 2019, 12:36am

I’m not sure how this would be able to be implemented externally, as it’s part of the core way Cargo deals with packages.

gbutler · July 5, 2019, 12:55am

A local proxy perhaps that is started from a cargo plug-in?

Cargo ------> Crates.io
       |----> Local Proxy (Listening on Local Port) -----> Private Repo #1 
                                                      |--> Private Repo #2

You could use it like this:

cargo proxy start
cargo ... normal commands

cargo proxy start starts the local proxy which reads configuration of available private repositories and the necessary authentication information (whatever is desired/needed).

The location of the private repositories would then be something like, “localhost:63000” (or whatever port you wish to use for your proxy).

Cargo would then try to read packages from the local proxy, which would then use the list of configured private repositories and authentication information to proxy the requests through to the appropriate private registry.
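The routing step of such a proxy could look roughly like the Python sketch below. Everything in it is an assumption for illustration: the path prefixes, the upstream URLs, the tokens, and the `route` helper are made up, and a real proxy would also have to forward the request and stream the response back.

```python
# Hypothetical proxy configuration: local path prefix -> (upstream base URL, token).
UPSTREAMS = {
    "/repo1/": ("https://registry-one.example/", "token-one"),
    "/repo2/": ("https://registry-two.example/", "token-two"),
}


def route(path):
    """Map a request path received on localhost to an upstream URL and headers."""
    for prefix, (base, token) in UPSTREAMS.items():
        if path.startswith(prefix):
            rest = path[len(prefix):]
            # The proxy adds the credentials that Cargo itself would not send.
            return base + rest, {"Authorization": token}
    raise LookupError(f"no private registry configured for {path!r}")
```

A request for `/repo2/api/v1/crates/foo/1.0.0/download` on the local port would then be forwarded to the second registry with its token attached.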

I could see something like that working.

julian · July 5, 2019, 3:11am

That’s an interesting proposal, but I have concerns about idle power usage on laptops. I guess, if implemented correctly, it could remain completely idle until a connection occurs. I’ll look into implementing this.

One other concern would be the requirement to start it every time. Perhaps a wrapper could be made to check if it is running, and put that in the shell rc.
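A wrapper like that mostly needs a cheap "is it already running?" check. A minimal Python sketch, assuming the proxy listens on the `localhost:63000` address suggested above (the function name is hypothetical):

```python
import socket


def proxy_is_running(host="localhost", port=63000, timeout=0.2):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if nothing is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The wrapper would start the proxy only when this returns `False`. Note that this only shows that some process is listening on the port, not that it is the proxy.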

Eh2406 · July 5, 2019, 4:05am

Working from memory here, so this is not an official statement from the team. The Cargo team has discussed this several times, so yes, we agree that the current private registry support is not sufficient for all users. As I recall, the reason no work has been done on this in the past is that we don’t have the bandwidth to find the relevant stakeholders and come up with a shared design (not to mention the lack of bandwidth to do the implementation). If you want to spearhead that design/implementation work, we are open to working with you!

Now for some details:

A way to configure sending some authorization to more endpoints is definitely needed. Open questions include: how should it be configured, what forms of authorization should it support, does this leave the API open to MITM or credential stuffing attacks, and what threat model do we need to be robust to? All of these need to be answered for the current users and the foreseeable upcoming users.

As for non-Git indexes, this may be more complicated than it first appears. One piece of background is that many different people (think they) want this, each for different reasons, and they end up talking past each other. There are some important design constraints on how the index works given the design of Cargo. For example, the first interaction of Cargo with an index is to do a git clone; backwards compatibility means that it needs to work with indexes that don’t know about the new API. An additional example is that Cargo’s resolver is assumed to be fast (at least for the happy path) and gets called quite a lot, at least once per build, so a new API will need to have a persistent on-disk copy and a reliable way to know if it is still up to date.
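The last constraint can be sketched in Python as follows. This assumes the registry exposes some cheap version marker (for a Git index that would be the commit hash; for an HTTP API it might be something like an ETag). The file layout and the helper names are hypothetical and are not how Cargo stores its index.

```python
import json
from pathlib import Path


def store(cache_file, remote_version, entries):
    """Persist index entries together with the version they were fetched at."""
    payload = {"version": remote_version, "entries": entries}
    Path(cache_file).write_text(json.dumps(payload))


def load_if_fresh(cache_file, remote_version):
    """Return cached entries if they match remote_version, else None."""
    path = Path(cache_file)
    if not path.exists():
        return None
    cached = json.loads(path.read_text())
    if cached.get("version") != remote_version:
        return None  # stale: the caller has to refetch and call store()
    return cached["entries"]
```

The resolver can then run against the on-disk copy on every build and only pay for a network fetch when the version marker changes.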

One thing that makes the Git index less onerous than it may at first appear is that Cargo always does a pull from master, so the returned commit does not need to be from the same tree as anything that was previously sent (see “Cargo's crate index: upcoming squash into one commit”). If, for example, the URL has a username in it and the server needs to figure out what that user is allowed to see, it is entirely correct for a custom registry to return a totally new Git repo with one commit on each request.
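On the server side, the per-user filtering could be as simple as the Python sketch below. The access-control table, the crate names, and the `visible_index` helper are hypothetical; turning the filtered result into a one-commit Git repo is left out.

```python
# Hypothetical access-control data: crate name -> users allowed to see it.
ACL = {
    "public-utils": {"alice", "bob"},
    "secret-project": {"alice"},
}


def visible_index(index, acl, user):
    """Return only the index entries that `user` is allowed to see."""
    return {
        name: versions
        for name, versions in index.items()
        if user in acl.get(name, set())  # crates without an ACL entry stay hidden
    }
```

The server would run this for the username found in the request URL and serve the result as a fresh repository, so users never learn that crates outside their view exist.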

julian · July 5, 2019, 4:18am

The note about Cargo replacing the index each time is very useful. Perhaps on the server, these copies could be stored with timestamps of the latest update to change them out.

The question about indices not knowing about the new API could be solved by defaulting to Git and checking via a value in .cargo/config to see if the index is non-Git.
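As a rough Python illustration of that default, assuming the registry's section of .cargo/config has already been parsed into a dictionary. The `index-kind` key and its values are invented for this sketch and do not exist in Cargo.

```python
def index_kind(registry_config):
    """Return the index protocol for a registry, defaulting to Git."""
    # "index-kind" is a hypothetical key; registries that omit it keep
    # today's behaviour, which preserves backwards compatibility.
    kind = registry_config.get("index-kind", "git")
    if kind not in ("git", "http"):
        raise ValueError(f"unknown index kind: {kind!r}")
    return kind
```

Existing registries need no configuration change, since a missing key means Git.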

For the issue of MITM and credential stuffing, I think the tokens obtained via cargo login should be used with this, and HTTPS should be enforced on these repositories. Honestly, HTTPS should be standard for any web application.
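Enforcing HTTPS on the client could be a simple guard before any token is attached. A minimal Python sketch (the `auth_headers` helper is a hypothetical name, not Cargo code):

```python
from urllib.parse import urlsplit


def auth_headers(url, token):
    """Return auth headers for url, refusing to send a token in cleartext."""
    if urlsplit(url).scheme != "https":
        raise ValueError("refusing to send a token over a non-HTTPS connection")
    return {"Authorization": token}
```

This only covers cleartext transport; it does nothing about the credential stuffing question.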

I am more than willing to do the work on this design/impl, so let me know. I can come up with an RFC for you tomorrow.

withoutboats · July 5, 2019, 1:34pm

I’d rather not see engineering effort put into moving the index away from git without very strong motivation. This would be a really herculean effort given how intertwined cargo is with its git index today.

julian · July 5, 2019, 2:42pm

I have noticed how intertwined Cargo is with Git. I think the new git repo each time would be a good middle ground.

Eh2406 · July 5, 2019, 2:48pm

> I am more than willing to do the work on this design/impl, so let me know.

We can always use help! I think the next step is to find the users of custom registries and interview them to find out what their pain points are in practice. Then an RFC can document "these are the problems they are having and this is how to directly solve them". Changing the interface for custom registries is a large permanent ask. Adding an alternative to Git doubly so. An RFC is going to need really well researched justification. Building a plugin, as @gbutler suggested, and demonstrating that it is getting used, would also be a good way to demonstrate the usefulness of the design.

carols10cents · July 5, 2019, 2:58pm

I see you’ve opened these issues:

Here are some related issues with various discussions:

julian · July 5, 2019, 3:05pm

This Pre-RFC is a combination of the first three you mentioned. The issue of an auth scheme would be a breaking change, however, so we must tread lightly with that.

bascule · July 7, 2019, 12:15am

These are two incredibly huge, nonspecific, and unrelated proposed changes to the way Cargo currently works. Before you write an RFC, you might want to focus on a particular, tangible goal.

For authentication methods, in addition to the issues @carols10cents posted you may at least want to peruse this thread (not that I think it contains particularly good ideas, but it does contain a lot of good discussion):

github.com/rust-lang/crates.io

Non-Github account creation

opened 09:51PM - 29 Apr 16 UTC

ntninja

C-enhancement ✨ A-accounts E-big

http://doc.crates.io/crates-io.html says:

> Acquiring an API token
>
> First thing’s first, you’ll need an account on crates.io to acquire an API token. To do so, visit the home page and log in via a GitHub account (required for now).

Any plans to change this? Not publishing on crates.io meanwhile…

----

# Current status

The team consensus is summarized in [this comment](https://github.com/rust-lang/crates.io/issues/326#issuecomment-216662599):

> Yes there's no particular reason that we don't have anything other than GitHub yet beyond that no one's actually implemented it. It was always the intent to have a variety of login options, and then you could link multiple login varieties to the same account (e.g. you can log in via oauth from either Twitter or GitHub)

That is, we are in favor of adding additional login methods, and we do not need any further discussion or 👍🏻 s on this issue so it is locked.

But there are a number of hurdles to get there:

- crates.io usernames currently match the GitHub usernames, so if we allow a different provider we will need a way to avoid conflicts
- the current system for assigning teams as crate owners is unfortunately directly tied to GitHub teams
- from what I was told, GitHub is somewhat good at preventing spam accounts and other malicious activity, which we can't handle ourselves with the existing resources of the project
- the code and database are quite tied to the assumption that GitHub is the only login (the users table has column gh_login and gh_id for example), so significant refactorings need to take place to enable adding other services

The way forward is to find solutions for all of these, writing RFCs as necessary to propose these solutions, then implementing the necessary changes. If you are interested in helping with this work, please feel free to get started!

julian · July 7, 2019, 12:22am

I’ve dropped non-Git indices for now, so I’m focusing on authenticating all requests.

paddycarey · July 23, 2019, 3:39pm

Hi, I'm an engineer at Cloudsmith, we provide hosted private registries via our service and currently work around the authentication issues discussed in this thread in a few different ways (URL-based tokens, git indexes generated on the fly per user, etc).

We'd be happy to discuss our pain points in an appropriate forum and arrange intros/interviews with some of our customers so they can discuss any issues they've come across in practice if that helps.
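For readers unfamiliar with the URL-based token workaround mentioned above, one common shape is to carry the token in the userinfo part of the index URL. The Python sketch below only illustrates that general idea; the thread does not describe the actual URL layout used by any particular service, and the host name, token, and `with_token` helper are made up.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_token(url, token):
    """Embed token as the userinfo part of url (https://TOKEN@host/...)."""
    parts = urlsplit(url)
    # Percent-encode the token so characters like "@" or "/" stay unambiguous.
    netloc = f"{quote(token, safe='')}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, `with_token("https://registry.example/index.git", "abc123")` produces `https://abc123@registry.example/index.git`. The trade-off is that the token then lives wherever the URL is stored or logged.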

Eh2406 · July 23, 2019, 5:24pm

That would be great! Thank you! We are generally in favor of open discussion, so this is a great place to explore how things can be made better for you. @julian opened an RFC, and I would love to get your thoughts in that discussion.

julian · July 26, 2019, 12:51am

As @Eh2406 said, I have opened an RFC with an initial design in there. Opinions are appreciated. The note about tokens within the URL is a good stopgap, however. I hadn't thought of that.

steffahn · December 22, 2024, 5:13pm

This topic was automatically closed 540 days after the last reply. New replies are no longer allowed.
