The vendored crates tend to contain a [workspace] section. They work okay when I use =version, but when I use them as submodules they error, saying there are conflicting workspaces.
I don't want to use { git = ... }, because I already have the repo checked out locally; a git dependency forces unnecessary round trips from local to remote.
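For what it's worth, here is a sketch of a local-checkout override that avoids both the network round trip and the workspace clash. The crate name foo and the path vendor/foo are made up; this assumes the submodule is checked out there:

```toml
# Root Cargo.toml. A [patch] override redirects the crates.io version
# of foo to the local submodule checkout, with no network round trip.
[patch.crates-io]
foo = { path = "vendor/foo" }

# If foo's own manifest declares a [workspace], excluding its directory
# here avoids the "conflicting workspaces" error.
[workspace]
exclude = ["vendor/foo"]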
I want a solution to all of these problems. I vendor a lot, so this is a recurring problem.
On a tangent, I have a preference for git submodules because they are inherently more decentralized. Centralized hosting tends to cause problems: sometimes their sites go down, or they block my IP.
I think it's a pain to add and remove git submodules UX-wise too. Maybe cargo should interoperate with git to automate this submodule vendoring workflow.
I saw there is a command, cargo vendor, but I haven't used it, sorry.
Personally I would rather see effort put into changes to make vendoring less necessary; for example, changes to the core language to make it easy for an application to adapt to whatever versions of its dependencies happen to be available, rather than insisting on specific versions.
I ran the command cargo vendor. It's absurd. It made a directory containing all dependencies, with no version control. Now git shows 10K+ changed files. Am I supposed to commit all of that bloat? And what if I want to merge changes back into upstream?
I'm not sure what the person who wrote this feature intended me to do. This is absurd.
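For context on the intended workflow: cargo vendor prints a source-replacement snippet that you are expected to paste into .cargo/config.toml, roughly:

```toml
# Snippet printed by `cargo vendor` (directory name may differ):
# all crates.io lookups are redirected to the local ./vendor tree.
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
```

So the answer to "am I supposed to commit these" is apparently yes: the vendored tree becomes part of your repository, and upstreaming changes from it is left to you.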
I want users to be able to use my software by cloning and building it, full free-software philosophy. And sure, the sources should be easily decentralized, like every dependency being pinned with a Merkle root hash. I'm an expert at this.
Pinning dependencies without a very good reason, and a documented plan for when to stop pinning them, is exactly the sort of thing I want to see Rust, and the industry at large, move away from.
More generally, the question I feel I need to ask you is, why do you think you need to vendor at all? What are these library packages that add up to 10KLOC that you can't use as normal dependencies, what's wrong with them, and what's stopping you from working around the problems from within your own code?
Copying code into your code is the definition of vendoring. What did you expect?
Really vendoring like that only makes sense for extremely niche use cases, such as airgapped development. And these days a private registry that mirrors the crates of interest is probably a better option.
I'm not sure how this is related to vendoring at all. Cargo will download dependencies specified in the lock file; submodules will have git fetch them; vendoring copies them into your repo. But none of that changes whether it is free software or not.
and what's stopping you from working around the problems from within your own code?
Unfortunately some things are JUST better done by modifying the internals of a component than tinkering with its externals.
Modern programming, stateful or functional, is inherently haunted by encapsulation of state and logic. In many cases it's better to refactor the dependencies than to work around them.
Pinning dependencies without a very good reason,
Dependencies should be pinned down with a Merkle root hash, built from something like SHA-256. This enables fully decentralized package management and reproducible builds, because it's deterministic: you can obtain the sources from anywhere and verify that they are correct and safe.
I do not expect copying code from a git repository into my git repository to show up as 10K+ additions, because this totally defeats the point of git, besides looking disgusting.
Vendoring should work like git submodules. Actually, I think cargo should make every downloaded copy of a package a git repository, so that when I vendor them I can edit and commit right in there.
If you simply copy sources into your repo, you lose the ability to PR into upstream, to pull new updates, and all the other features of version control and community collaboration.
Tracking versions as 0.1.x should be abandoned. Replace them all with git commit hashes.
You can build a language stating that versions {x, y, z} are compatible on top of that protocol.
And finally, the package manager can solve the dependency graph with Z3.
It’s not a merkle root or any other kind of content-addressing hash. Somebody could build an index from crate hashes to the files, but it’s not implicit in the design.
I’ve experimented with generating registries into ipfs before, and it’s possible to do something that works, but it’s really not ergonomic. I think really integrating cargo with some kind of CAS would require changes to explicitly support it.
This cannot work as it would remove semantic versioning.
You would now need an extra copy of each crate for each minor/patch version every dependency uses, which would make communicating between crates using anything other than std impossible.
In effect, tokio can never update anything and glam can never add anything. This then extends to any crate that is used to communicate between different crates.
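The incompatibility can be sketched in a few lines (Python as a stand-in for Rust's crate instantiation; the names tokio and Instant are just placeholders): two hash-pinned copies of the "same" library produce distinct types that do not interoperate:

```python
import types

def load_crate_copy(name):
    """Simulate building a separate copy of a crate pinned at some hash."""
    mod = types.ModuleType(name)
    class Instant:  # each copy defines a brand-new, distinct class
        pass
    mod.Instant = Instant
    return mod

tokio_a = load_crate_copy("tokio@hash_a")
tokio_b = load_crate_copy("tokio@hash_b")

t = tokio_a.Instant()
# Same source code, but the types are incompatible across copies:
print(isinstance(t, tokio_b.Instant))  # False
```

Semver-compatible ranges exist precisely so the resolver can collapse such duplicates into one shared copy.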
Also, while this may only help a little, the crate cargo-patch allows changing crates without pinning the version, by using patch files.
I'm a bit confused with what your end goal is here. You mention submodules being more robust and decentralized, but all your submodules point at GitHub, which is very centralized and has significantly worse uptime than crates.io. Also, some of the submodule links 404 for me, maybe because the repos are private, so I don't think I'd be able to clone this repository properly.
To be more robust against crates.io downtime, you can self-host mirrors of crates.io, which isn't necessarily decentralized, but an alternative.
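A sketch of what such a mirror setup can look like in .cargo/config.toml (the mirror URL is hypothetical):

```toml
# Redirect crates.io lookups to a self-hosted sparse-index mirror.
[source.crates-io]
replace-with = "my-mirror"

[source.my-mirror]
registry = "sparse+https://crates-mirror.example.com/index/"
```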
I think it would be very useful to focus on how to solve the concrete problems you have with the current crates.io way of dependency management, instead of the specific submodule solution you're using to try to get around them.
You mention submodules being more robust and decentralized, but all your submodules point at GitHub, which is very centralized and has significantly worse uptime than crates.io
What do you think really counts as supporting decentralization? Using an obscure GitHub alternative? I appreciate the attempt to break their monopoly, but it's already a lost battle. They have a monopoly over attention, the entry point to repo hosting.
At this point it does not matter any more. If anything, I increase my publicity by using GitHub, which serves my goals.
Using git submodules, regardless of where you host them, is decentralizing on its own. Git is a decentralized version control system, as we all know. That's it.
We can make a decentralized git router to resolve git-commit to any available hosting platform.
To be more robust against crates.io downtime, you can self-host mirrors of crates.io,
That's not decentralized. Git submodules are, because they only point to hashes (preferably SHA-2 hashes, but the git project lags so much they are still using SHA-1). Crates.io is an authority, which causes more problems when it goes down. For an authority that wants people to trustlessly mirror its data, I wonder whether crates.io has a keypair and regularly publishes signed indices over all its data (I guess not?).
Note that I have no problem with crates.io being an authority; it just needs to be done right. I have more ideas for decentralizing an authority, because I dedicate a lot of time to researching decentralization tech.
You would now need an extra copy of each crate for each minor/patch version every dependency uses. Which would make communicating between crates using anything other than std impossible.
I already said: you can have a language over the hash-based registry.
A language stating that {x, y, z}, where x, y, z are versions, are compatible, etc. Semantic versioning already expresses such constraints, e.g. that minor versions are compatible. The package manager collects these constraints and feeds them into a constraint solver, such as an SMT solver. That's how they work.
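A toy version of that pipeline (pure Python, brute-force enumeration instead of a real SMT solver; the package names, versions, and the constraint are all invented):

```python
from itertools import product

# Candidate versions per package, e.g. resolved from a hash-based registry.
candidates = {
    "serde": ["1.0.100", "1.0.200"],
    "app":   ["0.3.1"],
}

def compatible(choice):
    """Constraint collected from manifests: app requires serde >= 1.0.150."""
    major, minor, patch = map(int, choice["serde"].split("."))
    return (major, minor, patch) >= (1, 0, 150)

def solve(candidates, check):
    """Enumerate version assignments and return the first satisfying one."""
    names = list(candidates)
    for combo in product(*(candidates[n] for n in names)):
        choice = dict(zip(names, combo))
        if check(choice):
            return choice
    return None

print(solve(candidates, compatible))  # picks serde 1.0.200
```

A real resolver replaces the brute-force loop with a proper solver and encodes each semver range as a constraint, but the shape (constraints in, assignment out) is the same.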
That's often the case when I look into the designs of non-cryptography-aware people.
I’ve experimented with generating registries into ipfs before
I never implied IPFS is the best choice. It's too bloated, and last time I checked the implementations suck. Whether you use it or not, I advocate adopting self-authenticating data structures such as those presented in IPFS. They are always compatible: you can make a centralized authority more decentralized by signing its data. The authority stays an authority and keeps functioning, but now you can get its data from other servers because it's signed.
Everybody's use case is different, but personally that's a hard pass for me.
Relative to using semver, using git hashes is just painful, because hashes are more human-hostile than semver is, and as the name implies, there is a semantic meaning attached to a semver.
On top of that, it would break massive parts of the ecosystem, specifically the parts that rely on semver at this time, none of which are going anywhere any time soon.
I've been using Rust successfully for about 10 years now without having to reach for vendoring even once, just using the facilities provided by Cargo and crates.io.
So to me this has the feel of an X/Y problem.
What is it you're really trying to achieve here with vendoring?
I just discovered that dependent packages do not inherit the [patch] section, which is absurd and annoying, because I already declared that I wanted to vendor a crate in that one package with [patch].
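This matches Cargo's documented behavior: [patch] is only honored in the root manifest of a workspace (or the top-level package being built), so the override has to move there. Roughly, with the crate name and path as placeholders:

```toml
# Root Cargo.toml of the workspace. A [patch] section declared in a
# member crate's manifest is ignored; the override must live here.
[patch.crates-io]
foo = { path = "vendor/foo" }
```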