Cargo sparse protocol feedback thread

Cargo's new index protocol will be available starting in Rust 1.68, which will be released on 2023-03-09. This new "sparse" protocol should usually provide a significant performance improvement when accessing crates.io.

We would like your help in testing this new feature and infrastructure. If you use beta (1.68) or nightly-2023-01-21 or newer, set the environment variable CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse, or edit your .cargo/config.toml file to add:

[registries.crates-io]
protocol = "sparse"
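The environment variable and the config key above name the same setting. As a rough sketch (assuming the usual Cargo convention of prefixing `CARGO_`, upper-casing the key, and turning dots and dashes into underscores), the mapping can be written as follows. The function name `config_key_to_env_var` is made up for illustration and is not part of Cargo.

```rust
/// Hypothetical helper: derive the environment variable name that
/// corresponds to a Cargo config key such as `registries.crates-io.protocol`.
/// Assumes the convention: `CARGO_` prefix, upper case, `.` and `-` become `_`.
fn config_key_to_env_var(key: &str) -> String {
    let mut name = String::from("CARGO_");
    for ch in key.chars() {
        match ch {
            // Separators in the config key become underscores.
            '.' | '-' => name.push('_'),
            // Everything else is upper-cased.
            other => name.extend(other.to_uppercase()),
        }
    }
    name
}
```

Calling it with `"registries.crates-io.protocol"` yields the variable name used in the announcement.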

We would like to hear reports on your experience. If you run into a problem, please open an issue. If you would like to post general feedback, please leave a comment on this thread.

Along with fetching crates and running cargo update, we'd also like to hear if you have any issues when running cargo publish. Another data point that may be helpful is to gauge how many users are behind a restrictive firewall, proxy, or other network environment that prevents access to the index.

More information is available in this blog post: Help test Cargo's new index protocol | Inside Rust Blog
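For anyone curious why the sparse protocol is faster: instead of cloning the whole git index, Cargo requests one small file per crate over HTTPS. The sketch below shows how such a per-crate path could be derived, assuming the standard registry index layout (1-, 2-, and 3-character names get special directories, longer names are split by their first four characters) and assuming `https://index.crates.io/` as the index root. The function names are hypothetical, and this is an illustration rather than Cargo's own code.

```rust
/// Hypothetical helper: relative path of a crate's file in the index.
/// Returns `None` for empty or non-ASCII names (this sketch simply
/// assumes crate names are ASCII).
fn sparse_index_path(crate_name: &str) -> Option<String> {
    if crate_name.is_empty() || !crate_name.is_ascii() {
        return None;
    }
    // Index file paths use the lower-cased crate name.
    let name = crate_name.to_ascii_lowercase();
    let path = match name.len() {
        1 => format!("1/{}", name),
        2 => format!("2/{}", name),
        3 => format!("3/{}/{}", &name[..1], name),
        _ => format!("{}/{}/{}", &name[..2], &name[2..4], name),
    };
    Some(path)
}

/// Hypothetical helper: full URL, assuming the crates.io sparse index root.
fn sparse_index_url(crate_name: &str) -> Option<String> {
    sparse_index_path(crate_name).map(|p| format!("https://index.crates.io/{}", p))
}
```

Under these assumptions, resolving `serde` only needs the single file at `se/rd/serde` rather than a full index clone.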


In limited testing so far, on FreeBSD 13.1, seems to work fine, and quickly.

The sparse option for fetching has been working like a charm, as expected!

For publishing, I just published a workspace of crates with the sparse protocol enabled, and the two things I noticed during the publish process were:

  • Two crates ended up timing out after around 100 "complete" had been listed. I haven't previously seen a timeout with the git-based publishing process, so I'm not sure whether this was expected. I published ~20 crates with this script and only two timed out, however. I haven't published a ton of crates with the git-based protocol where Cargo waits for the version to be in the index before finishing the publish command, but the sparse version did feel a bit slower than the git-based version (not that this was head-to-head, mind you, just based on previous workspace publications).
  • Personally I found the "Waiting ..." UI to be sort of confusing. There's a "Fetch ..." progress bar but it's not clear to me what progress is being made as it always jumps to the right quickly. The "complete" counter additionally increments by 2-at-a-time, and the "pending" counter sticks at 1. I originally thought that this was doing recursive resolution since the UI looked the same as during a normal cargo fetch, but I suspect it's doing repeated http requests to find the latest version after watching it more.

The publication process nonetheless worked without flaw. Thanks again to everyone who's worked on this feature!


Thanks for the feedback @alexcrichton!

Regarding the delays, there are a few issues there. One is that it relies on invalidating the CDN cache, which can be a little slow with our current provider. One possibility is using a different service which you may be familiar with.

Another issue is that crates.io is currently processing requests to update the index one at a time. A potential improvement would be to batch them.

A much longer-term solution is to have a different index format that does not require cache invalidation, which is tracked in Sparse registry indexes should be viewable atomically · Issue #10928 · rust-lang/cargo · GitHub.

Another much longer-term solution is to have a different publishing API and workflow that would allow publishing multiple packages atomically.

We will likely need to increase the timeout for when Cargo gives up waiting for the update to appear. The default is currently 60 seconds. There is also a -Z publish-timeout option to change that, though that is not necessarily something we intend to stabilize.
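To make the waiting behavior concrete, here is a minimal sketch of a poll-until-timeout loop of the kind described: keep checking whether the new version is visible, and give up once a deadline (60 seconds by default, per the above) has passed. The function `wait_until` and its parameters are hypothetical. Cargo's real implementation is more involved, and the readiness check here is just a closure standing in for "re-fetch the index entry and look for the version".

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: call `is_ready` repeatedly until it returns true
/// or `timeout` has elapsed. Returns true if the condition was met in time.
fn wait_until<F>(timeout: Duration, interval: Duration, mut is_ready: F) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        // Stand-in for "is the published version in the index yet?"
        if is_ready() {
            return true;
        }
        let now = Instant::now();
        if now >= deadline {
            return false;
        }
        // Sleep for the polling interval, but never past the deadline.
        thread::sleep(interval.min(deadline - now));
    }
}
```

With a 60-second timeout and a 1-second interval this makes roughly 60 checks before giving up, which is why slow CDN cache invalidation can turn into a timeout.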

For the UI, yea it is just repeatedly trying to update the index. Ideally I think it should just be a single progress bar. That is tracked in Extra "Updating index" messages after publishing a crate · Issue #11304 · rust-lang/cargo · GitHub and I would love for someone to work on that.

Sounds like everything I experienced is within the realm of "yes, that's expected", so sounds good to me! Nothing is a showstopper by a long shot, and I look forward to Rust 1.68 where I can update CI workflows and such to use sparse by default!

Added it to the Rust Beta CI tasks for my work project. Seems to consistently save us between 40 and 80s (looks like it saved 30-40s on the registry update, I'm not entirely sure where the other time savings are from, but I'll take it)

I have used the sparse option to publish a series of packages. It ran very smoothly and I had no problems.

One point I would like to mention is that when I first used it, I was confused by "Waiting" in the last line. Since the line also mentioned ctrl-c, I wasn't sure whether the process had completed. However, I was able to determine that it was over because the "Waiting" was green and the prompt was displayed.

It would be a little easier to understand if "Waiting" was in the past tense or the ctrl-c message was turned off when the process was finished successfully, wouldn't it?

   Compiling pkg v0.0.0 (/path/to/pkg-0.0.0)
    Finished dev [unoptimized + debuginfo] target(s) in 59.53s
    Packaged 22 files, 115.4KiB (23.8KiB compressed)
   Uploading pkg v0.0.0 (/path/to/cwd)
    Updating crates.io index
     Waiting on `pkg` to propagate to crates.io index (ctrl-c to wait asynchronously)

As a comparison, for example, cargo install showed the following in the past tense:

   Compiling pkg v0.0.0 (/path/to/cwd)
    Finished release [optimized] target(s) in 2m 39s
   Replacing /path/to/bin/exe
    Replaced package

I was publishing some more crates today for the first time and I noticed that instead of incrementing by 2 the counter was incrementing very rapidly at around 80 per tick. That wasn't necessarily an issue but I was soon rate limited by crates.io with the message:

...
   Uploading wit-bindgen-cli v0.3.0 (/home/acrichto/code/wit-bindgen)
error: failed to publish to registry at https://crates.io

Caused by:
  the remote server responded with an error (status 429 Too Many Requests): You have published too many crates in a short period of time. Please try again after Mon, 13 Feb 2023 21:21:15 GMT or email help@crates.io to have your limit increased.

I'm not sure whether these two events were related, but I wanted to raise the issue since I'm publishing ~8 crates back-to-back and haven't had this come up before. I realize as I type this, though, that all the crates are new to crates.io, whereas many of my publishes historically are for new versions of existing crates, so there may also just be a smaller rate limit for new crates that I'm running into.

The "waiting" message is printed before we block on the server to finish the publish. It is not finished until pretty much at process exit. Hitting ctrl-c is an appropriate action to take if you do not care to wait.

I'm a little confused by this. Why would "Waiting" being green affect things? By "prompt was displayed", do you mean the process exited and you saw your command prompt? If so, then yes, the waiting was complete by that point. If it didn't wait very long, then that is great! I sometimes have to wait several seconds.

I had a cargo-edit user report that mixing the sparse registry (cargo) with the git registry (cargo-edit) wasn't working, and they got

Error: reference 'refs/heads/master' not found; class=Reference (4); code=NotFound (-3)

Caused by:
    reference 'refs/heads/master' not found; class=Reference (4); code=NotFound (-3)

I'm still waiting on a reproduction case but figured I'd give people an early heads up in case something comes of this. We use crates-index under the hood and I tried running a thin wrapper around crates-index without the registry folder and without the index's .git folder and neither reproduced the problem.

See `cargo upgrade` fails to get crates version when using sparse registry · Issue #841 · killercup/cargo-edit · GitHub

I'm not sure on whose end it is, but running cargo-release in CI gave me a strange error of

error: object not found - no match for id (bc74b69f0a5f9e5286a67f2abad52b85e5c727bb); class=Odb (9)

after waiting for confirmation of upload. run 1, run 2; a local run just after these two CI runs went fine.

For added context, cargo-release is using crates-index as well.

I have posted https://github.com/rust-lang/cargo/pull/11713 with some tweaks to how the publish status is displayed. It is a bit difficult to succinctly convey the appropriate amount of information, but I gave it a shot.

I don't think the ticking was directly related to your rate limit. The way the progress bar works with the sparse protocol is a little convoluted (there is a global progress bar that doesn't really know how many requests are required). The PR I linked above implements a new progress bar that should tick once per second with a defined limit (60 seconds).
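As a rough illustration of a progress bar that advances once per second toward a fixed limit, the sketch below renders one line of text per tick. This is not the code from the PR and not Cargo's actual output format. The function name `render_wait_progress` and the bar layout are invented for the example.

```rust
/// Hypothetical helper: render a fixed-width bar for `elapsed_secs` out of
/// `limit_secs`. Elapsed time is clamped so the bar never overflows.
fn render_wait_progress(elapsed_secs: u64, limit_secs: u64, width: usize) -> String {
    let elapsed = elapsed_secs.min(limit_secs);
    // Number of filled cells, proportional to elapsed / limit.
    let filled = if limit_secs == 0 {
        width
    } else {
        (elapsed as usize * width) / limit_secs as usize
    };
    format!(
        "[{}{}] {}s/{}s",
        "=".repeat(filled),
        " ".repeat(width - filled),
        elapsed,
        limit_secs
    )
}
```

Because the limit is known up front, the bar has a well-defined end, unlike a counter that grows with however many requests happen to be made.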

There are separate rate limits for publishing new versions versus publishing new crates. I believe the latter is somewhat more restrictive, though I'm not sure what the exact values are.


Perhaps, before exiting, it would be useful to output a Finished message or similar, to avoid any confusion from the last message indicating that work was (then) still pending?


Yes, I uploaded multiple packages, and in every case I did not have to wait long for the publish to finish successfully and return to the command prompt. So there is nothing wrong with the functionality; I just felt it would be easier to understand if there was a Finished message.


Yes, I think so!

Just used cargo extensively via a satellite connection on airplane wifi, and the sparse protocol made the difference between unusable and usable.

I also discovered that several important tools, such as cargo upgrade and cargo deny, haven't yet been upgraded to support the sparse index. Those tools tried to download the git index, which failed miserably (long delay followed by server-side timeout).


We just released sparse index support in Bazel's crates.io integration (Support sparse indexes by illicitonion · Pull Request #1857 · bazelbuild/rules_rust · GitHub) via crates-index support (initial support for which was released in 0.19.7).

By default this integration used to do a full index clone whenever it needed to resolve versions, and now it uses the sparse index. We are seeing typical version resolution time drop from 2-4 minutes down to <10 seconds.


Love it! Looking forward to seeing this as the default. FWIW my usage is on CentOS, and that works great.

Apologies if this is the wrong place.

I have just started to implement a cargo plugin making use of the sparse protocol for a popular repository manager. I have the basics implemented for proxying to crates.io and hosting private crates, but I have come across an issue whereby I don't seem to be able to switch entirely from crates.io to an alternative registry using the sparse protocol. I am finding that I need to explicitly state which dependencies come from that alternative registry, for example:

tokio = { registry = "my-repository" }

Is there a way to switch entirely? Specifying my alternate registry as the default only seems to have an effect for publishing, AFAICT.

sparse protocol is SO much nicer to work with