Pre-RFC: Configurable authentication requirements and non-Git indices

Support for custom registries has been added, but what they can be used for is fairly limited. Currently, Cargo sends authentication only on requests where crates.io requires it, and there is no option to change this.

One major use case for this would be companies with private registries. Currently, the only way to keep these private is to put them behind a firewall, but that may not be viable for everyone. Adding a configuration option that sends the Authorization header on all API requests and on downloads (or supports other auth schemes for downloads) would be very useful for this case.
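To make the request concrete, here is a minimal Python sketch of what such an option could control. The `auth_required` flag, the `build_request` helper, and the example URL are assumptions for illustration only; none of them is existing Cargo configuration or API.

```python
import urllib.request


def build_request(url: str, token: str, auth_required: bool,
                  needs_auth_today: bool) -> urllib.request.Request:
    """Build a registry request, attaching the token when policy says so.

    auth_required    -- hypothetical per-registry option proposed above
    needs_auth_today -- True for endpoints that already send the token
                        (publishing, for example)
    """
    req = urllib.request.Request(url)
    if auth_required or needs_auth_today:
        # The token is sent as the value of the Authorization header.
        req.add_header("Authorization", token)
    return req
```

With `auth_required=False` the behaviour stays as it is today, so registries that do not opt in would see no change.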

The reason I mention non-Git indices is that some companies have projects with stricter privacy requirements, where only certain people are allowed to know about them, and with Git it’s not as easy to filter who sees which packages. In theory, this could be solved with separate registries, but that is not as streamlined.

I’m ready to write an RFC for this if it’s likely to be accepted.

I would look into whether you could implement the necessary functionality through cargo plug-ins instead of in cargo itself. Then no RFC is required, and the core team does not have to take on the maintenance burden of the extensions. You, and other interested parties, can simply maintain one or more cargo plug-ins that provide the necessary functionality.

I’m not sure how this could be implemented externally, as it’s part of the core way Cargo deals with packages.

A local proxy perhaps that is started from a cargo plug-in?

Cargo ------> Crates.io
       |----> Local Proxy (Listening on Local Port) -----> Private Repo #1 
                                                      |--> Private Repo #2 

You could use it like this:

cargo proxy start
cargo ... normal commands

cargo proxy start starts the local proxy which reads configuration of available private repositories and the necessary authentication information (whatever is desired/needed).

The location of the private repositories would then be something like, “localhost:63000” (or whatever port you wish to use for your proxy).

Cargo would then try to read packages from the local proxy, which would use the list of configured private repositories and authentication information to forward each request to the appropriate private registry.
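A rough Python sketch of that proxy idea follows. Everything specific in it is an assumption made for illustration: the `REGISTRIES` table, the path prefixes, the example hostnames, and the tokens. It forwards only GET requests; an index served as a Git repository over HTTP would need more than this.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical configuration: local path prefix -> (upstream base URL, token).
REGISTRIES = {
    "/team-a/": ("https://registry-a.example.com/", "token-a"),
    "/team-b/": ("https://registry-b.example.com/", "token-b"),
}


def route(path, registries=REGISTRIES):
    """Map a local request path to (upstream URL, headers), or None."""
    for prefix, (base, token) in registries.items():
        if path.startswith(prefix):
            return base + path[len(prefix):], {"Authorization": token}
    return None


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = route(self.path)
        if target is None:
            self.send_error(404, "no private registry configured for this path")
            return
        url, headers = target
        upstream = urllib.request.Request(url, headers=headers)
        try:
            with urllib.request.urlopen(upstream) as resp:
                body = resp.read()
                status = resp.status
        except OSError as err:  # URLError and HTTPError are OSError subclasses
            self.send_error(502, str(err))
            return
        # Relay the upstream status and body back to Cargo.
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=63000):
    """Listen on localhost only, so the credentials stay on this machine."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

Calling `serve()` would block and listen on port 63000; Cargo would then be pointed at `http://localhost:63000/team-a/` and never see the token itself.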

I could see something like that working.

That’s an interesting proposal, but I have concerns about idle power usage on laptops. I guess, if implemented correctly, it could remain completely idle until a connection occurs. I’ll look into implementing this.

One other concern would be the requirement to start it every time. Perhaps a wrapper could be made to check if it is running, and put that in the shell rc.
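Such a wrapper could be as small as the following Python sketch. It assumes the proxy listens on a known local TCP port and that `cargo proxy start` (the hypothetical plug-in command suggested earlier) launches it; the function names are made up for this example.

```python
import socket
import subprocess


def proxy_is_running(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def ensure_proxy(port: int = 63000) -> None:
    """Start the proxy in the background unless it is already listening."""
    if not proxy_is_running(port):
        # Hypothetical plug-in command; not an existing cargo subcommand.
        subprocess.Popen(["cargo", "proxy", "start"])
```

The check is a single local connection attempt, so running it from the shell rc on every new terminal should cost very little.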

Working from memory here, so this is not an official statement from the team. The Cargo team has discussed this several times, and yes, we agree that the current private registry support is not sufficient for all users. As I recall, the reason no work has been done on this in the past is that we don’t have the bandwidth to find the relevant stakeholders and come up with a shared design (not to mention the lack of bandwidth to do the implementation). If you want to spearhead that design and implementation work, we are open to working with you!

Now for some details:

Configuring a way to send some authorization with more endpoints is definitely needed. Open questions include: how should it be configured, what forms of authorization should it support, does this leave the API open to MITM or credential stuffing attacks, and what threat model do we need to be robust to? All of these need to be answered for the current users and the foreseeable upcoming users.

As for non-Git indexes, this may be more complicated than it first appears. One piece of background is that many different people (think they) want this, each for different reasons, and they end up talking past each other. There are some important design constraints on how the index works given the design of Cargo. For example, the first interaction of Cargo with an index is to do a git clone; backwards compatibility means that it needs to work with indexes that don’t know about the new API. An additional example is that Cargo’s resolver is assumed to be fast (at least for the happy path) and gets called quite a lot, at least once per build, so a new API will need to have a persistent on-disk copy and a reliable way to know if it is still up to date.
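To illustrate the last constraint, here is a minimal Python sketch of a persistent on-disk copy with a freshness check. It is not how Cargo works; the `IndexCache` class, the JSON file layout, and the idea of a cheap "version tag" (an ETag or commit hash, say) are all assumptions for the sake of the example.

```python
import json
from pathlib import Path


class IndexCache:
    """Persistent copy of index entries plus the version tag they came from."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        """Return (tag, entries), or (None, {}) when nothing is cached yet."""
        if not self.path.exists():
            return None, {}
        data = json.loads(self.path.read_text())
        return data["tag"], data["entries"]

    def store(self, tag, entries):
        self.path.write_text(json.dumps({"tag": tag, "entries": entries}))

    def refresh(self, remote_tag, fetch_entries):
        """Reuse the cached copy when its tag matches; otherwise refetch.

        remote_tag    -- cheap-to-obtain version marker from the server
        fetch_entries -- callable that downloads the full set of entries
        """
        tag, entries = self.load()
        if tag is not None and tag == remote_tag:
            return entries  # happy path: nothing is downloaded
        entries = fetch_entries()
        self.store(remote_tag, entries)
        return entries
```

The resolver would only ever read the local file; the network is touched when the tag changes, which is roughly the property the Git index provides today.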

One thing that makes the Git index less onerous than it may at first appear is that Cargo always does a pull from master, so the returned commit does not need to be from the same tree as anything that was previously sent. (see Cargo's crate index: upcoming squash into one commit) If, for example, the URL has a username in it and the server needs to figure out what that user is allowed to see, it is entirely correct for a custom registry to return a totally new git repo with one commit on each request.
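The server-side filtering step could look something like this Python sketch. The in-memory `index` shape, the `acl` table, and the rule that crates without an ACL entry are public are assumptions made for illustration; building the single-commit repository from the result is left out.

```python
def visible_index(index, acl, user):
    """Return only the crates `user` may see.

    index -- crate name -> list of JSON lines, one per published version
    acl   -- crate name -> set of allowed users; crates missing from the
             ACL are treated as visible to everyone (an assumption)
    """
    return {
        name: lines
        for name, lines in index.items()
        if name not in acl or user in acl[name]
    }
```

The registry would then write the filtered entries into a fresh repository, commit once, and serve that to the requesting user.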

The note about Cargo replacing the index each time is very useful. Perhaps the server could store these copies along with the timestamp of the latest update, so it knows when to swap them out.

The question about indices not knowing about the new API could be solved by defaulting to Git and checking via a value in .cargo/config to see if the index is non-Git.
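As a sketch of that default, in Python: the `index-kind` key and the `"http"` value are hypothetical names invented for this example, and the function takes the already-parsed registry table rather than reading `.cargo/config` itself.

```python
def index_kind(registry_config: dict) -> str:
    """Return the index protocol for a registry, defaulting to Git.

    `index-kind` is a hypothetical key, not an existing Cargo setting.
    """
    kind = registry_config.get("index-kind", "git")
    if kind not in ("git", "http"):
        raise ValueError(f"unsupported index kind: {kind!r}")
    return kind
```

A registry entry that only sets `index` would keep working unchanged, since the missing key falls back to `"git"`.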

For the issue of MITM and credential stuffing, I think the tokens obtained via cargo login should be used for this, and HTTPS should be enforced on these repositories. Honestly, HTTPS should be standard for any web application.
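Enforcing that on the client side could be as simple as the following Python sketch, which refuses to attach a token to anything that is not an HTTPS URL. The `auth_headers` helper is a made-up name for illustration.

```python
from urllib.parse import urlsplit


def auth_headers(url: str, token: str) -> dict:
    """Return headers carrying the token, refusing non-HTTPS URLs."""
    if urlsplit(url).scheme != "https":
        raise ValueError(
            f"refusing to send credentials to {url!r}: HTTPS is required"
        )
    return {"Authorization": token}
```

Failing loudly, rather than silently dropping the header, makes a misconfigured registry URL obvious instead of producing a confusing 401 later.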

I am more than willing to do the work on this design/impl, so let me know. I can come up with an RFC for you tomorrow.

I’d rather not see engineering effort put into moving the index away from git without very strong motivation. This would be a really herculean effort given how intertwined cargo is with its git index today.

I have noticed how intertwined Cargo is with Git. I think the new git repo each time would be a good middle ground.

I am more than willing to do the work on this design/impl, so let me know.

We can always use help! I think the next step is to find the users of custom registries and interview them to find out what their pain points are in practice. Then an RFC can document "these are the problems they are having and this is how to directly solve them". Changing the interface for custom registries is a large permanent ask. Adding an alternative to Git doubly so. An RFC is going to need really well-researched justification. Building a plugin, as @gbutler suggested, and demonstrating that it is getting used, would also be a good way to demonstrate the usefulness of the design.

I see you’ve opened these issues:

Here are some related issues with various discussions:

This Pre-RFC is a combination of the first three you mentioned. The auth scheme would be a breaking change, however, so we must tread lightly with that.

These are two incredibly huge, nonspecific, and unrelated proposed changes to the way Cargo currently works. Before you write an RFC, you might want to focus on a particular, tangible goal.

For authentication methods, in addition to the issues @carols10cents posted you may at least want to peruse this thread (not that I think it contains particularly good ideas, but it does contain a lot of good discussion):

I’ve dropped non-Git indices for now, so I’m focusing on authenticating all requests.

Hi, I'm an engineer at Cloudsmith, we provide hosted private registries via our service and currently work around the authentication issues discussed in this thread in a few different ways (URL-based tokens, git indexes generated on the fly per user, etc).

We'd be happy to discuss our pain points in an appropriate forum and arrange intros/interviews with some of our customers so they can discuss any issues they've come across in practice if that helps.

That would be great! Thank you! We are generally in favor of open discussion, so this is a great place to explore how things can be made better for you. @julian opened an RFC, and I would love to get your thoughts in that discussion.

As @Eh2406 said, I have opened an RFC with an initial design in there. Opinions are appreciated. The note about tokens within the URL is a good stopgap, however. I hadn't thought of that.