I am interested in improving Cargo's dependency resolution and downloading for distributed development.
In distributed development, developers do not necessarily share [much] infrastructure. A famous example of this is the Linux kernel. There is no "one place" where Linux lives. There is no single git repo to which all developers `git push` at the end of the day. For the kernel, a significant amount of developer communication takes place on mailing lists, through patches.
It is easy and straightforward to pass around a single git repository, regardless of how decentralized your workflow is. Problems arise when dependencies are introduced between crates from different repositories. At present, I believe that there are only two basic options for referring to crates. Please correct me if I am wrong:
- Reference by absolute URL:
  - Crate registries like crates.io, and alternative/private registries, are centralized places which must be common to all developers. The crate registry is essentially a mechanism for turning a crate name + version into an absolute URL.
  - Git repositories are another option, but they too must be referenced by absolute URLs that are common among all developers.
- Reference by local filesystem: Workspaces and relative file paths are intended to assemble together crates which are part of the same repository. Such crates are, by necessity, always versioned and released together.
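To make the options concrete, here is a minimal sketch of how each one looks in a manifest today. The crate names, versions, and the URL are hypothetical placeholders; only the `version`, `git`, `tag`, and `path` keys are Cargo's own.

```toml
# Cargo.toml (illustrative only; foo, bar, baz and the URL are made up)
[dependencies]
# Registry: the name + version is turned into a download location by the registry.
foo = "1.2"

# Git: the absolute URL is written into the manifest and must work for every developer.
bar = { git = "https://git.example.com/team/bar.git", tag = "v0.3.0" }

# Local filesystem: a path relative to this manifest, typically within the same repository.
baz = { path = "../baz" }
```

In the first two forms, the location is fixed either by the registry or by the manifest of the depending crate; in the third, it is fixed by the on-disk layout.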
Neither of these methods works well for distributed development.
The requirement for using absolute URLs means that a crate's identity (name + version + origin) is inextricably linked to where it can be found. In cases where developers can't share, or don't want to share, centralized server infrastructure, this makes things difficult. There is no [easy] way to tell Cargo:
- "Look for this dependency relative to something else, like my repo's `origin`."
- "Use this dependency by name, but require (or allow) the top-level crate to declare where it can be found."
Some apparent workarounds, which I believe do not really solve the problem, include:
- Path-based dependencies, in conjunction with git submodules. Submodules are awkward to work with. They also lead to lower-level projects declaring dependencies for higher-level projects, which isn't good.
- Local registries. These require that each and every developer publish all crates locally. This may entail a non-trivial amount of "manual labor," as crates have to be published in dependency order. Unpublished development snapshots cannot be used.
- Vendored dependencies. These are perfectly workable, but the present-generation tools require an existing registry or an absolute-URL repository source.
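As a rough illustration of the first and third workarounds, the fragments below show roughly what they look like on disk. The crate name `lowlevel` and the `third_party/` directory are hypothetical; the second fragment follows the source-replacement configuration that `cargo vendor` suggests for a project whose dependencies all come from crates.io.

```toml
# Cargo.toml of the higher-level crate, assuming `lowlevel` has been
# added as a git submodule at third_party/lowlevel (hypothetical layout).
[dependencies]
lowlevel = { path = "third_party/lowlevel" }
```

```toml
# .cargo/config.toml after vendoring into ./vendor
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
```

Note that the vendoring configuration still names the original source (`crates-io` here) as the thing being replaced, so the dependency's identity remains tied to that origin.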
Is there community interest in supporting distributed workflows? Does anyone have any suggestions for improving these workflows? I will gladly assist with writing an RFC, if one is called for.
Related topics / further reading:
- #6859: feature request for relative paths to `git` repositories. I have commented here.
- #6713: "Want config option for definitively specifying local crate paths"
- `path` overrides cannot be used "to tell Cargo how to find local unpublished crates"
(EDIT: Crates are also identified by their origin, and not just name + version.)