Rust distribution mirrors


#1

I’m looking to base one of my startup projects on Rust, and one of the first things to do is make sure everything can be mirrored locally. This includes the Rust distribution itself, obviously; I’d like to be able to pin any version of Rust and/or Cargo should the need arise, and preferably rustup too, as they’re all distributed in essentially the same way. However, it seems nothing is being done in this direction, so I’d like to bring the matter up for discussion.

Mirrors can be essential to any large-scale deployment, especially one that sits behind a firewall; for those of us living in areas with limited global connectivity, mirrors can even be the only means of access. For example, in many parts of China connectivity to Amazon S3 is painfully slow or blocked entirely; developers there can’t get Rust without jumping through hoops, and that certainly hurts adoption. (As a side note, I live on a university campus and the connectivity here is excellent. So you may or may not hear complaints about the GFW from any given Chinese individual; people connect from different places, and the stupidities of the different GFW operators vary too.)

The various distribution addresses are all hard-coded at present. It shouldn’t be very hard to extend the related tooling to accept a mirror address and simply use it as the base for constructing the full address. Because the layout should be consistent across mirrors, and all content is static and signed, nothing should break, and the actual work should be easy and minimal.
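To make the idea concrete, here is a minimal sketch of building a full address from a configurable base. The function name `dist_url`, the constant `DEFAULT_DIST_BASE`, and the mirror host are made up for illustration; this is not how the existing tooling is written, just one way the base could be swapped out.

```rust
/// Assumed default base: static.rust-lang.org is the upstream distribution host.
const DEFAULT_DIST_BASE: &str = "https://static.rust-lang.org";

/// Hypothetical helper: build the full address of a distribution file.
/// `mirror_base` is the optional mirror address; `relative_path` is the path
/// inside the (identical) directory layout, e.g. "dist/channel-rust-stable.toml".
fn dist_url(mirror_base: Option<&str>, relative_path: &str) -> String {
    // Fall back to the upstream host when no mirror is configured.
    let base = mirror_base.unwrap_or(DEFAULT_DIST_BASE);
    // Join with exactly one slash, whatever the caller passed in.
    format!(
        "{}/{}",
        base.trim_end_matches('/'),
        relative_path.trim_start_matches('/')
    )
}
```

With something like this, `dist_url(Some("https://mirror.example.org/rust/"), "dist/channel-rust-stable.toml")` would point at the mirror, while passing `None` keeps today’s behaviour. Since only the base changes, signature checks on the downloaded files would be unaffected.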

One blocker remains, though: proper signature verification during install. I remember seeing this brought up around the rustup 1.0 release, and IIRC a proper solution was postponed. But maybe we can start preparing for all of this and come up with a plan.

Your opinions and comments are welcome!


#2

All this would be great to have. We need someone to do the work. :smile:


#3

It would be good to have this capability. It requires a lot of small pieces to come together. We are slowly working in a direction where this will be more feasible.

Some factors involved:

  • It needs to be easy to produce a Rust distribution (the static.rust-lang.org / rustup directory format basically) from source, from the build system. This is one focus of current release infra changes. At some point we will expect that a single command in the build system will be suitable for producing toolchains that rustup can install (with different signing keys).

  • It needs to be easy to mirror the important parts of static.rust-lang.org. I talked to somebody on IRC about this yesterday.

  • It needs to be easy to mirror not only the crates.io index, which carol has worked on, but also the crates.io S3 bucket. I don’t believe anybody has made progress on that.

  • Finally it needs to be possible to redirect the tools to use these alternate sources in a convenient way.

It seems totally feasible to me that a single tool could mirror both static.rust-lang.org and crates.io, and help configure the local environment to use them. The data is not particularly complex. It’s just a matter of legwork.


#4

First of all, thanks for the explanation! I see that there are lots of moving targets to deal with.

I’m fully aware of this effort; we certainly wouldn’t want to do anything on the mirror front before the new release infra is in place.

The common practice among Linux distributors seems to be setting up rsync. However, there is one alternative approach that I find particularly attractive: Python’s PEP 381, on PyPI’s mirror infrastructure. With it, no rsync server is needed, yet you still get all the benefits of incremental syncing; I’m just not confident about how structurally similar the PyPI and crates.io metadata are. (I’ve never had a chance to read the crates.io code, despite having wanted to for a long time.)

Hmm, this would be a problem if we were to go with rsync. With the PyPI approach the problem doesn’t exist in the first place, since from the master mirror’s perspective the mirror clients look exactly like Cargo. (We might want to provide a request header that, when sent, tells the server not to increase the download count, for cleaner statistics; that’s non-essential, though.)

This is definitely doable. Just model the tool based on the PyPI counterpart. It’s basically a pip with all irrelevant features removed, plus directory layout and sync state management. Swap the pip with cargo and write everything in Rust, and you get the Rust mirroring tool.
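As a rough illustration of the "sync state management" part, here is a minimal sketch that works out what still has to be fetched. It assumes the tool can obtain the set of upstream releases from somewhere (for example the index); the names `Release`, `SyncState`, `pending` and `mark_done` are invented for this example and the downloading itself is left out.

```rust
use std::collections::BTreeSet;

/// One published crate version as (name, version), e.g. ("foo", "1.0.0").
type Release = (String, String);

/// Hypothetical sync state: the releases already stored in the local mirror.
#[derive(Default)]
struct SyncState {
    mirrored: BTreeSet<Release>,
}

impl SyncState {
    /// Return the upstream releases not mirrored yet, in sorted order.
    fn pending(&self, upstream: &BTreeSet<Release>) -> Vec<Release> {
        upstream.difference(&self.mirrored).cloned().collect()
    }

    /// Record a release as mirrored once its download has succeeded.
    fn mark_done(&mut self, release: Release) {
        self.mirrored.insert(release);
    }
}
```

Each run would then call `pending`, download those files the same way Cargo would, and call `mark_done` for each one that succeeds, so an interrupted run simply picks up where it left off.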