[Pre-RFC] Hyphens in crate names, redux


#1

At the request of @DanielKeep, I’ve fleshed out and finished off his original proposal.


  • Feature Name: hyphens_considered_harmful
  • Start Date: 2015-02-25
  • RFC PR: (leave this empty)
  • Rust Issue: (leave this empty)

Summary

Disallow hyphens in package and crate names. Propose a clear transition path for existing packages.

Motivation

Currently, Cargo packages and Rust crates both allow hyphens in their names. This is not good, for two reasons:

  1. Usability: Since hyphens are not allowed in identifiers, anyone who uses such a crate must rename it on import:

    extern crate "rustc-serialize" as rustc_serialize;
    

    This boilerplate confers no additional meaning, and is a common source of confusion for beginners.

  2. Consistency: Nowhere else do we allow hyphens in names, so having them in crates is inconsistent with the rest of the language.

For these reasons, we should work to remove this feature before the beta.

However, as of January 2015 there are 589 packages with hyphens on crates.io. It is unlikely that simply removing hyphens from the syntax will work, given all the code that depends on them. In particular, we need a plan that:

  • Is easy to implement and understand;

  • Accounts for the existing packages on crates.io; and

  • Gives as much time as possible for users to fix their code.

Detailed design

  1. On crates.io:

    • Reject all further uploads of packages with hyphenated names. Packages with hyphenated dependencies will still be allowed, though.

    • On the server, migrate all existing hyphenated packages to underscored names. Keep the old packages around for compatibility, but hide them from search. To keep things simple, only the name field will change; dependencies will stay as they are.

  2. In Cargo:

    • Continue allowing hyphens in package names, but treat them as having underscores internally. Warn the user when this happens.

      This applies to both the package itself and its dependencies. For example, imagine we have an apple-fritter package that depends on rustc-serialize. When Cargo builds this package, it will instead fetch rustc_serialize and build apple_fritter.

  3. In rustc:

    • As with Cargo, continue allowing hyphens in extern crate, but rewrite them to underscores in the parser. Warn the user when this happens.

    • Do not allow hyphens in other contexts, such as the #[crate_name] attribute or --crate-name and --extern options.

      Rationale: These options are usually provided by external tools, which would break in strange ways if rustc chooses a different name.

  4. Announce the change on the users forum and /r/rust. Tell users to update to the latest Cargo and rustc, and to begin transitioning their packages to the new system. Party.

  5. Some time between the beta and 1.0 release, remove support for hyphens from Cargo and rustc.
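As a rough illustration of the rewriting rule in steps 1 to 3, here is a minimal sketch in Rust. The function names (`normalize_name`, `hyphen_warning`, `check_new_upload`) and the message wording are hypothetical; they are not taken from Cargo, rustc, or crates.io, and the real implementations may look quite different.

```rust
/// Hypothetical helper: rewrite hyphens to underscores, and report whether
/// anything changed so the caller can decide to warn.
fn normalize_name(name: &str) -> (String, bool) {
    let normalized = name.replace('-', "_");
    let changed = normalized != name;
    (normalized, changed)
}

/// Hypothetical helper for the transition period in Cargo and rustc:
/// returns the warning text if the name had to be rewritten.
fn hyphen_warning(name: &str) -> Option<String> {
    let (normalized, changed) = normalize_name(name);
    if changed {
        Some(format!(
            "name `{}` contains hyphens; treating it as `{}`",
            name, normalized
        ))
    } else {
        None
    }
}

/// Hypothetical check for new uploads: only the package's own name is
/// rejected; hyphenated dependency names are not inspected here.
fn check_new_upload(package_name: &str) -> Result<(), String> {
    if package_name.contains('-') {
        Err(format!("hyphenated name `{}` is not accepted", package_name))
    } else {
        Ok(())
    }
}
```

Under these assumptions, `normalize_name("apple-fritter")` yields `apple_fritter` with the changed flag set, while a name such as `libc` passes through untouched and produces no warning.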

C dependency (*-sys) packages

RFC 403 introduced a *-sys convention for wrappers around C libraries. Under this proposal, we will use *_sys instead.

Drawbacks

Code churn

While most code should not break from these changes, there will be much churn as maintainers fix their packages. However, the work should not amount to more than a simple find/replace. Also, because old packages are migrated automatically, maintainers can delay fixing their code until they need to publish a new version.

Loss of hyphens

There are two advantages to keeping hyphens around:

  • Aesthetics: Hyphens do look nicer than underscores.

  • Namespacing: Hyphens are often used for pseudo-namespaces. For example in Python, the Django web framework has a wealth of addon packages, all prefixed with django-.

The author believes the disadvantages of hyphens outweigh these benefits.

Alternatives

Do nothing

As with any proposal, we can choose to do nothing. But given the reasons outlined above, the author believes it is important that we address the problem before the beta release.

Disallow hyphens in crates, but allow them in packages

What we often call “crate name” is actually two separate concepts: the package name as seen by Cargo and crates.io, and the crate name used by rustc and extern crate. While the two names are usually equal, Cargo lets us set them separately.

For example, if we have a package named lily-valley, we can rename the inner crate to lily_valley as follows:

    [package]
    name = "lily-valley"  # Package name
    # ...

    [lib]
    name = "lily_valley"  # Crate name

This will let us import the crate as extern crate lily_valley while keeping the hyphenated name in Cargo.

But while this solution solves the usability problem, it still leaves the package and crate names inconsistent. Given the few use cases for hyphens, it is unclear whether this solution is better than just disallowing them altogether.

Make extern crate match fuzzily

Alternatively, we can have the compiler consider hyphens and underscores as equal while looking up a crate. In other words, the crate flim-flam would match both extern crate flim_flam and extern crate "flim-flam" as flim_flam. This will let us keep the hyphenated names, without having to rename them on import.

The drawback to this solution is complexity. We will need to add this special case to the compiler, guard against conflicting packages on crates.io, and explain this behavior to newcomers. That’s too much work to support a marginal use case.
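To make the fuzzy-matching alternative and its conflict problem concrete, here is a minimal sketch in Rust. The names `canon`, `names_match`, and `find_conflicts` are hypothetical and do not correspond to anything in rustc or crates.io; the sketch also assumes the input list contains no exact duplicates.

```rust
use std::collections::HashMap;

/// Map a hyphen to an underscore; leave every other character alone.
fn canon(c: char) -> char {
    if c == '-' { '_' } else { c }
}

/// Hypothetical comparison: two crate names match if they are equal once
/// hyphens and underscores are treated as the same character.
fn names_match(a: &str, b: &str) -> bool {
    a.chars().map(canon).eq(b.chars().map(canon))
}

/// Hypothetical registry check: group names by their canonical form and
/// return every group with more than one member, i.e. names that would
/// become ambiguous under fuzzy matching.
fn find_conflicts<'a>(names: &[&'a str]) -> Vec<Vec<&'a str>> {
    let mut groups: HashMap<String, Vec<&'a str>> = HashMap::new();
    for &name in names {
        groups.entry(name.replace('-', "_")).or_default().push(name);
    }
    let mut conflicts: Vec<Vec<&'a str>> = groups
        .into_values()
        .filter(|group| group.len() > 1)
        .collect();
    conflicts.sort(); // deterministic order for the caller
    conflicts
}
```

With this sketch, `flim-flam` and `flim_flam` compare as equal, which is exactly why a registry holding both names would have to reject or disambiguate one of them.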

Repurpose hyphens as namespace separators

Alternatively, we can treat hyphens as path separators in Rust.

For example, the crate hoity-toity could be imported as

    extern crate hoity::toity;

which is desugared to:

    mod hoity {
        mod toity {
            extern crate "hoity-toity" as krate;
            pub use krate::*;
        }
    }

However, on prototyping this proposal, the author found it too complex and fraught with edge cases. Banning hyphens outright would be much easier to implement and understand.

Unresolved questions

None so far.


#2

I liked the way Perl modules used to be organized in a two-level hierarchy. It made sense and sounded more professional than the usual random project names. But lately CPAN is just as full of “fancy” names as other project sites, so I don’t think an attempt at organization would have much chance anyway. So not worth it.


#3

Thanks for this! I think this is a well written RFC :smile:

This may want to explicitly mention that the foo-sys convention will become foo_sys (or propose some other alternative). The “Drawbacks” section may not be the right place for it, but it may want to be mentioned somewhere at least.

Another possible alternative here would be for the compiler itself to consider - == _ for the sake of crate name comparison. That way --extern foo-baz=... would match against either extern crate foo_baz or extern crate "foo-baz" as foo_baz;


#4

If packages with hyphens are going to be renamed, can I explicitly request renames to something else? I’m not entirely a fan of _sys.


#5

Thanks for doing this. Again, apologies for dropping the ball halfway through.

One note: it would be nice to be able to use hyphens in binary crate names (since they can’t be linked against) and maybe static libraries (i.e. not rlibs), though that could be something that Cargo can do by renaming the output file. By no means is it necessary, since it could be allowed later in a backward-compatible fashion. (Actually, it might be worth making it a follow-up RFC to allow arbitrary names for output crates, that is, crates that will never be consumed by another Rust program.)


#6

Cheers for the review, @alexcrichton! I’ve addressed both your points.

Though I agree with you on _sys, I’d rather not allow these requests. Under the current proposal, the instructions will simply be “change all hyphens to underscores”. If we allow custom renames, it’ll end up as “change all hyphens to underscores… unless someone chose a different name, in which case spend a few minutes looking it up”.

It’ll already be annoying having to fix all those hyphens, and I don’t want to make it harder for everyone involved.

If you’d like to change the convention, then we can do that in a separate RFC.

We can already override the output filename in rustc using -o, but yes it would be nice to have something like that in Cargo.


#7

If the only counterargument to “Disallow hyphens in crates, but allow them in packages” is that it leaves the package name and crate name inconsistent, then I would argue that Cargo shouldn’t be allowing us to specify them separately. That is, either we wholly believe that having two names (one for package, one for crate) is a bad thing, and we follow that to its logical conclusion, or else … (tada!) there is no valid counter-argument remaining to “Disallow hyphens in crates, but allow them in packages,” and yet several pro-arguments (no package renaming, aesthetics, pseudo-namespacing).


#8

Two years on and there are more crates with hyphens AND underscores, though I haven’t seen any yet that use both. Could we please prevent new crates from having hyphens? This way in a decade we’ll be able to say: look, 90% of the crates are using underscores, we should standardise on that. Think of the children!


#9

We plan to make it not matter which one you use. rust-lang/cargo#2775