Make version selection work on repository dependencies

It seems to me that the "version" attribute for a repository dependency only works for validation of the latest version:

dep = { git = "...", version = "x.y.z" }

If the repo has a "release" and a tag for a version "a.b.c" where "a < x", it fails to retrieve it. The "tag" attribute does work.

It'd be nice if version selection worked as well, by searching tags of the form "v*" and reading the Cargo.toml file for validation. I don't think a whole registry setup should be needed for this to work.
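To make the request concrete, here is a minimal sketch of what "search tags of the form v* and pick a matching version" could look like. It is not how Cargo works today. The names `parse_version`, `caret_matches`, and `select_tag` are hypothetical, the tag list is assumed to come from something like `git tag -l`, and pre-release and build metadata are ignored for brevity.

```rust
/// A bare-bones semantic version: (major, minor, patch).
/// Assumption: no pre-release or build metadata.
type Version = (u64, u64, u64);

/// Parse exactly three dot-separated numeric components.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Caret-style compatibility, mirroring the default meaning of
/// `version = "x.y.z"` in a manifest: at least `req`, and the leftmost
/// non-zero component must stay the same.
fn caret_matches(req: Version, v: Version) -> bool {
    if v < req {
        return false;
    }
    if req.0 > 0 {
        v.0 == req.0
    } else if req.1 > 0 {
        v.0 == 0 && v.1 == req.1
    } else {
        v == req
    }
}

/// Among tags shaped like `v<major>.<minor>.<patch>`, return the one with
/// the highest version compatible with `req`. Other tags are skipped.
fn select_tag<'a>(tags: &[&'a str], req: Version) -> Option<&'a str> {
    tags.iter()
        .filter_map(|tag| {
            let v = parse_version(tag.strip_prefix('v')?)?;
            caret_matches(req, v).then_some((v, *tag))
        })
        .max_by_key(|(v, _)| *v)
        .map(|(_, tag)| tag)
}
```

Given the tags v0.9.0, v1.2.3, v1.10.0 and v2.0.0 and a requirement of 1.0.0, `select_tag` picks v1.10.0, because versions are compared numerically rather than as strings. Checking the Cargo.toml at the selected tag against the requirement would be a separate step.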

Prior art: the Swift Package Manager works like this, as the default way to specify dependencies.

The version is also what's used when publishing the package (git and path get stripped from the dependency spec). The intended design for having both git/path and version is to allow developing multiple packages together and then publishing without having to edit the Cargo.toml just for the release.

IMO overloading version to also search tags would be surprising behavior. Maybe there could be another key added to the git package selection like semver-tag that specifies a semver constraint like version but searches git tags instead of a registry?

Does it use one key as both a way to search git repository tags and to search a registry index?

SPM launched without registries, everything was either a local dependency or a git repository. So it had to use semver tags in order to function.

You probably also want to search for $crate/v* for crates (that's at least the pattern I use) that split off sub-crates internally.

Agreed. In my work at Cloudflare, which depends on a bunch of crates in private git repos, this lack of automatic version resolution has been the biggest pain point.

It's even worse when using version tags, because crates with even slightly different git parameters like tag/branch/revision are considered completely separate and incompatible (it changes their internal ID/URL used by Cargo).

For some crates we've settled on using branch = "v1" and having branches like v1 and v2 (without minor/patch parts!) to approximate semver-like behavior, but it's a non-default workflow and requires some manual repository management.

What I do now is just omit tag/branch and if something breaks, you can fix it or use the last working tag. The branch vN technique is clever.

Let’s work through today’s behavior, all with git specified:

| | Is on crates.io (or other registry) | Local use only with version | Local use only without version |
|---|---|---|---|
| rev = "<hash>" | Specific commit locally (version verified), semver in registry | Specific commit (version verified) | Specific commit |
| rev/tag/branch | Current commit locally (version verified), semver in registry | Current commit (version verified) | Current commit |
| (nothing else) | Latest commit on default branch locally (version verified), semver in registry | Latest commit in default branch (version verified) | Latest commit in default branch |
| (desired) | Latest tag matching semver locally, semver in registry | Latest tag matching semver | Latest commit in default branch |

(One thing to notice here is that today manifests don’t distinguish between the first and second columns…but I also think that’s okay, since the crates.io behavior is always in addition to the local behavior, never overriding it.)

It’s clear that that third row is technically redundant; it’s equivalent to rev = "HEAD" (or perhaps in practice branch = "main"). But (a) changing the meaning here could break existing manifests, and more importantly (b) people probably don’t want to type that. So getting the fourth-row behavior, which seems useful and reasonably straightforward, can’t just steal the “simplest” syntax of git + version.

Throwing out some ideas:

  1. Make this edition-based; manifests with edition = 2024 get the new behavior; others get the old. I don’t think we’d want to do this, but I’m listing it here for completeness; it could also be combined with another solution when a direction has been chosen.

  2. New package-wide key use-semver-git-dependency-resolution = true or similar, which changes the meaning of entries with git+version specified.

  3. Dependency-specific key, same as above but on a per-dependency basis.

  4. Modification of 3 that also allows specifying the tag pattern, something like semver-tags = "v{}". Nicely flexible for non-standard tag formats, but a downside that you have to specify it every time instead of cargo making it Just Work.

  5. Modification of 4 where we reuse the tag specifier for this. I don’t love this because someone could have a tag named v{} but they probably don’t?

I'd go with option 2, plus a dependency-specific pattern that defaults to "v{version}". In later editions, 'use-semver-git-dependency-resolution' defaults to true. You can also turn this on or off per dependency with a new dependency-specific key. If the pattern can't match some weird case, tough luck: use tag, rev, etc.

If the package-wide switch is meant to default to true in later editions, maybe make it mandatory in the next edition so you're forced to configure it explicitly, and then make it default to true in the edition after that. That way there are no weird surprises in the next edition beyond Cargo failing on one missing line (better than using SemVer when it isn't wanted).

With this you reuse 'version' (desirable in my opinion), you can turn it on or off package-wide, you can configure the pattern in odd cases, and you can turn it on/off per dependency.

Using the key 'version' for SemVer lookup is desirable and already done in the "prior art" (so it'd be familiar to users of that prior art). Using 'tag' is... confusing. With version you assume SemVer (unless configured otherwise, of course); with tag... you just assume a specific tag. Using another key (semver-tag) alongside 'version'... mmm, not sure.

I'm new at Rust so forgive me if what I said makes no sense he he.

Is this actually desired to be used with published packages? My feeling is that normally using git with semver tag lookup would be an alternative used with non-published internal packages. So using a different key in the dependency spec rather than version wouldn't cause an issue with duplicating information.


For the tag format there are quite a few formats in use:

  • {version} (used by rust-lang/rust)
  • v{version} (used by serde)
  • {package_name}-{version} (used by tokio)

I have also personally used a format that couldn't be automatically derived from the dependency metadata, but is consistent so could be templated (basically {package_suffix}-{version} for a sub-crate in a repo, rather than the full package-name).
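As a rough illustration of how such templates could be applied, the sketch below extracts the version text from a tag given a template. The function name `version_from_tag` and the placeholder syntax (`{package_name}`, `{version}`) are assumptions taken from the formats listed above, not an existing Cargo feature, and the tag names in the usage note are only examples.

```rust
/// Extract the version text from `tag` according to `template`.
///
/// The template may contain `{package_name}` and must contain `{version}`
/// exactly where the version text appears. Returns `None` if the template
/// has no `{version}` placeholder or the tag does not fit the template.
fn version_from_tag<'a>(template: &str, package_name: &str, tag: &'a str) -> Option<&'a str> {
    // Fill in the package name first, so only `{version}` is left open.
    let pattern = template.replace("{package_name}", package_name);
    // Everything before and after `{version}` must match the tag literally.
    let (prefix, suffix) = pattern.split_once("{version}")?;
    tag.strip_prefix(prefix)?.strip_suffix(suffix)
}
```

For example, `version_from_tag("v{version}", "serde", "v1.0.200")` yields `Some("1.0.200")`, and `version_from_tag("{package_name}-{version}", "tokio", "tokio-1.38.0")` yields `Some("1.38.0")`. The returned text is not validated here: a tag like tokio-util-0.7.11 would yield "util-0.7.11" under the tokio template, so the caller still has to parse the result as a version and discard anything that fails.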

One thing to worry about is that the semver spec doesn't specify what tags look like. The spec mentions v{semver} in the FAQ to specifically point out that this is NOT a semver.

But it's not unprecedented anymore. This is the standard practice for versioning GHA actions. I would not be surprised to see GH add some functionality for automating this in the future. Similarly, adding tag bumping into release scripts like cargo release should be fairly simple.

I really think no, most of the time. I think for the most part crates in a workspace should always have a version tag requiring the latest published version of their workspace siblings.

(And more should even use exact dependency versions. If you have an internal detail crate split like lib-macros that would be in the same package if multi-lib-crate packages were available, it should be an =version dependency requirement.)

And you didn't mention package/version either.

In general, I feel automatically guessing the tag format is going to bias against workspaces and toward repo-per-package.

Proposal:

version-tag = "v1.0"

Semantics: use the version-tag tag if available, otherwise list the tags with git tag -l {version-tag}.*, do a dot-separated numeric sort, and take the latest. (And if the version key is present, verify it).
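A minimal sketch of those semantics, assuming the tag list has already been fetched (for instance from `git tag -l`). The names `numeric_key` and `resolve_version_tag` are made up for illustration, and verifying the `version` key is left out.

```rust
/// Turn "10" or "1.2" into a numeric sort key; `None` if any
/// dot-separated part is not a plain number.
fn numeric_key(rest: &str) -> Option<Vec<u64>> {
    rest.split('.').map(|p| p.parse::<u64>().ok()).collect()
}

/// Resolve `version_tag` against the repository's tags:
/// 1. an exact tag named `version_tag` wins;
/// 2. otherwise take tags matching `{version_tag}.*`, sort the part after
///    the prefix numerically component by component, and pick the latest.
fn resolve_version_tag<'a>(tags: &[&'a str], version_tag: &str) -> Option<&'a str> {
    if let Some(exact) = tags.iter().find(|t| **t == version_tag) {
        return Some(*exact);
    }
    let prefix = format!("{}.", version_tag);
    tags.iter()
        .filter_map(|tag| {
            // Tags whose remainder is not purely numeric are skipped.
            let key = numeric_key(tag.strip_prefix(prefix.as_str())?)?;
            Some((key, *tag))
        })
        .max_by(|a, b| a.0.cmp(&b.0))
        .map(|(_, tag)| tag)
}
```

With the tags v1.0.2, v1.0.9, v1.0.10 and v1.1.0, resolving `version-tag = "v1.0"` gives v1.0.10: the numeric sort places 10 after 9, and v1.1.0 is outside the `v1.0.*` glob.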

Alternatively:

  • put this behavior on tag if the exact tag is not present
  • make version-tag a template and use v{}, putting version in the {}
  • skip the templating and always assume the version is the last bit; just concatenate {version-tag}{version}
    • actually perhaps required to use git tag -l globs?

I think “use the version-tag tag if available” isn’t good enough for semver: if you add a dependency on “v1.1”, you probably want to be able to use v1.1.1 as well when that comes out!

The idea is that upstream would have v1.1.0 and v1.1.1, and if they have the tag v1.1, they would change it from pointing at v1.1.0 to v1.1.1 when publishing the latter.

If upstream has a v1.1 tag, they're saying that what it points to is the most recent release in the v1.1 release series.

I guess that’s reasonable, but it seems at odds with allowing the rest of the tag name to be arbitrary. Either Cargo should be imposing a standard for tag names or it shouldn’t; the idea that you have to write “1.1.0” and not “1.1” but can prefix it with “v” or “release-” or whatever seems silly. If you want to match a specific tag, use tag.

EDIT: Today it occurred to me that I was assuming “1.1” is a valid semver version, and it is not. Semver versions always have three components, and the Cargo book says you should always use three components for package.version as well.

Git doesn't work well with changing tags, by design.

https://git-scm.com/docs/git-tag#_on_re_tagging

I think having anything called a version that isn't using semver compatibility rules would be very confusing.

That did give me an idea though: take the version-tag (or whatever it's called) value, split off any non-numeric prefix, parse the remainder as a semver ^ requirement. When looking at the tags in the repository skip any missing the non-numeric prefix, for the remainder strip the prefix off and parse them as a semver version. Then apply normal semver matching rules to choose which tag to use.

Splitting the prefix may be a little complicated, it has to handle both an alphabetic-prefix directly concatenated onto the version (v1.2.3) and an alphanumeric-prefix with separator (bs58-1.2.3, bs58/1.2.3) to handle common patterns. But I think there's a rule that should handle the majority of situations somewhere there; one I can't see an easy disambiguation for would be {package_name}.{version}, bs58.1.2.3 would be an alphabetic-prefix followed by the invalid semver 58.1.2.3—but I haven't seen any repositories using that. (An idea might be to have a reverse semver parser and parse from the end forward, I don't know that that is possible given the optional trailing fields).
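One possible reading of that rule, as a sketch only. The splitting heuristic here is an assumption: a prefix ends at the last `-` or `/` if there is one, otherwise at the first digit. All function names are hypothetical, versions are limited to three numeric components, and pre-release tags such as v1.2.3-beta are not handled, since the `-` would be mistaken for a separator.

```rust
/// Split "v1.2.3" into ("v", "1.2.3") and "bs58-1.2.3" into ("bs58-", "1.2.3").
fn split_prefix(s: &str) -> (&str, &str) {
    // Prefix with separator: everything up to and including the last '-' or '/'.
    if let Some(i) = s.rfind(|c: char| c == '-' || c == '/') {
        return s.split_at(i + 1);
    }
    // Otherwise: the leading run of non-digit characters.
    let i = s.find(|c: char| c.is_ascii_digit()).unwrap_or(s.len());
    s.split_at(i)
}

/// Parse exactly three dot-separated numbers.
fn parse_triple(s: &str) -> Option<[u64; 3]> {
    let parts = s
        .split('.')
        .map(|p| p.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;
    match parts.as_slice() {
        &[major, minor, patch] => Some([major, minor, patch]),
        _ => None,
    }
}

/// Caret (^) compatibility: at least `req`, same leftmost non-zero component.
fn caret_ok(req: [u64; 3], v: [u64; 3]) -> bool {
    v >= req
        && match req {
            [0, 0, _] => v == req,
            [0, minor, _] => v[0] == 0 && v[1] == minor,
            [major, _, _] => v[0] == major,
        }
}

/// Pick the highest tag that shares the prefix of `version_tag` and whose
/// version is caret-compatible with the version part of `version_tag`.
fn select_by_version_tag<'a>(tags: &[&'a str], version_tag: &str) -> Option<&'a str> {
    let (prefix, req_text) = split_prefix(version_tag);
    let req = parse_triple(req_text)?;
    tags.iter()
        .filter_map(|tag| {
            // Tags without the prefix, or with an unparsable rest, are skipped.
            let v = parse_triple(tag.strip_prefix(prefix)?)?;
            if caret_ok(req, v) { Some((v, *tag)) } else { None }
        })
        .max_by_key(|&(v, _)| v)
        .map(|(_, tag)| tag)
}
```

Under this heuristic, `bs58.1.2.3` splits into `bs` and `58.1.2.3`, which fails to parse, so the ambiguous `{package_name}.{version}` format is rejected rather than guessed at.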