Recently, the icu4x project had a bit of a debate about whether our crates should be icu_foo or icu-foo. The primary motivating factor for both sides was not aesthetics but what people will default to. icu-foo was motivated by the argument that dashes are what most people typically default to, whereas icu_foo was motivated by the argument that newcomers may not be 100% familiar with the dash-to-underscore conversion that happens before Rust code sees the crate name, so it's better to be consistent for them. Rather belatedly, we realized that we had already published an icu-locale crate, effectively locking us in to dashes (or to picking a new name for the locale crate) if we want internal consistency.
I'd been thinking about this for a while, but this incident really motivated me to finally post this pre-RFC.
Summary
Crates.io, cargo, and docs.rs will treat crate names as identical under transformations where dashes and underscores are replaced with each other.
Crates will still have a canonical name that uses dashes or underscores, but it only matters for presentation.
Motivation
Crates.io already prevents you from publishing foo_bar when foo-bar already exists (and vice versa). The equivalence class of crate names under replacement between dashes and underscores already uniquely defines a single crate. Crates.io and docs.rs already perform redirects.
However, every time I type a crate name in Cargo.toml I need to remember whether the crate uses dashes or underscores. This is annoying and rather unnecessary.
New projects are also forced to make a choice between dashes and underscores, and most of the tradeoffs there have to do with the choice that people will pick first (to minimize friction when working with this crate). Dashes are more of a typical default pick, but new Rustaceans not aware of the dash-to-underscore conversion may first try underscores.
The Rust project has so far not taken a stance strongly preferring one or the other, nor does it seem likely to, so this problem isn't going anywhere.
It seems to me we can make all of this a moot point by treating them as equivalent in the backends.
Guide-level explanation
Crates with underscores or dashes in their names can be referred to with any name that is equivalent to the original under the replacement of one or more dash/underscore with the other separator. This applies to Cargo.toml, crates.io, and docs.rs.
Reference-level explanation
When published, crates have a canonical name which is their name when published. This crate will have an equivalence class equal to all names that can be formed by replacing one or more - with _ or vice versa in the crate.
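One way to picture the equivalence class is through a normalized key: two names are in the same class exactly when they agree after every dash is rewritten to an underscore. The following is a minimal sketch of that idea, not code from Cargo or crates.io; the function names `normalize` and `is_equivalent` are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper: maps every name in an equivalence class to a single
/// key by replacing each `-` with `_`. The choice of `_` as the key form is
/// arbitrary; the canonical (published) name is tracked separately.
fn normalize(name: &str) -> String {
    name.replace('-', "_")
}

/// Two crate names are treated as equivalent if they differ only in
/// whether each separator is `-` or `_`.
fn is_equivalent(a: &str, b: &str) -> bool {
    normalize(a) == normalize(b)
}
```

Under this sketch, `is_equivalent("icu-locale", "icu_locale")` holds, while `is_equivalent("foobar", "foo_bar")` does not, since dropping a separator is not part of the equivalence.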
crates.io and docs.rs will perform redirects when you visit any name within the equivalence class that is not the canonical name (this is already the case).
Cargo will also treat these crates as equivalent. When traversing the registry trie, it will traverse both underscore and dash options for the crate, picking up the first matching crate it finds. This is technically a breaking change for custom registries (see below), though I'm not sure if people would actually care about that.
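The lookup described above could be sketched roughly as follows. This is an illustration under stated assumptions rather than how Cargo's registry code is structured: the index is modeled as a plain `HashSet<String>` of canonical names instead of the real on-disk trie, and `variants` and `resolve` are hypothetical names.

```rust
use std::collections::HashSet;

/// Hypothetical helper: enumerates every name in the equivalence class of
/// `name` by trying both `-` and `_` at each separator position.
/// A name with k separators yields 2^k candidates.
fn variants(name: &str) -> Vec<String> {
    let mut out = vec![String::new()];
    for ch in name.chars() {
        if ch == '-' || ch == '_' {
            // Branch: each prefix continues with either separator.
            let mut next = Vec::with_capacity(out.len() * 2);
            for prefix in &out {
                next.push(format!("{prefix}-"));
                next.push(format!("{prefix}_"));
            }
            out = next;
        } else {
            for prefix in &mut out {
                prefix.push(ch);
            }
        }
    }
    out
}

/// Hypothetical lookup: returns the canonical name of the first crate in
/// `index` that matches any variant of `requested`.
/// `index` stands in for the registry and holds canonical names only.
fn resolve<'a>(index: &'a HashSet<String>, requested: &str) -> Option<&'a str> {
    variants(requested)
        .into_iter()
        .find_map(|candidate| index.get(&candidate).map(String::as_str))
}
```

On crates.io at most one variant can exist, so "first match" is unambiguous there. On a custom registry that allows both foo-bar and foo_bar, the result would depend on enumeration order, which is one concrete form of the breaking change mentioned above.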
Cargo will also treat names within this equivalence class as equivalent when looking for path or git dependencies, i.e. the following is okay:
# ./Cargo.toml
[dependencies]
foo-bar = { path = "../foobar" }
# ../foobar/Cargo.toml
[package]
name = "foo_bar"
Cargo will, in its user interface, report the canonical name of the crate.
Drawbacks
The ban on uploading an underscore-crate when a dash-crate exists (and vice versa) is a crates.io feature, not a Cargo feature. It does not apply to custom registries. Any solution that makes the Cargo codebase itself aware of dashes and underscores may be a breaking change for custom registries.
We could potentially add support for "renames" to the registry format, but this would bloat the set of crates in the registry. Such a feature may eventually be useful for folks wishing to migrate to optional namespacing.
Rationale and alternatives
We could simply not do this, but this confusion seems to crop up a lot.
We could also "solve" this by, as a community, deciding that either dashes or underscores are the accepted idiomatic style and using that style for newer crates. Over time this would diminish the problem and, if we ever add support for renames, potentially get rid of it entirely.
Prior art
This has been discussed in the past at https://github.com/rust-lang/cargo/issues/2775
Unresolved questions
None so far
Future possibilities
It's worth considering the interaction of this feature with Pre-RFC: Packages as Optional Namespaces or whatever namespacing solution we pick. So far it does not clash.
Any solution that involves teaching Cargo about renames may also be useful for supporting renames for smoother migration to namespaced packages.