I would like to point out that this behavior is pretty confusing, especially for anyone who just casually reads the Cargo.toml
and expects package names to be the same as paths in the source. And some tools have problems with it, like cargo-machete
for instance.
As part of brainstorming what could be done, `cargo new foo` could create a manifest like:
```toml
[package]
name = "$USER-foo"
...
[lib]
name = "foo"
```
I've not dug too much into splitting `package.name` and `lib.name` before:
- What does renaming do when the names are split like this?
- How do we help ensure the `use` name is discoverable? A one-to-one mapping between the declaration and the use of anything is, I think, important for understanding code. In this case in particular, that covers adding a dependency, trying to look up a dependency from the code you are looking at, etc.
- If we de-emphasize `package.name`, we will likely also need to emphasize each `bin.name`.
Note that while it would be trivial for registries to pick up on a custom `lib.name`, anything more (`bin.name`, whether a package has a lib, etc.) would require the registry to re-implement the auto-target discovery code from cargo. I have been toying with the idea of finding a way to perform all auto-discovery as part of `cargo package` (and disabling the `auto` fields) so crates.io wouldn't have to re-implement anything.
If you specify a dependency as `name = { package = "package" }`, the crate name used in code is always `name`, regardless of what the package has set as `lib.name`.
If you don't set an explicit `lib.name`, the behavior is exactly that `lib.name` is set to `package.name` with any `-` replaced with `_`, with no other difference in behavior.
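The two rules above can be sketched as a pair of small functions. This is only a model for discussion, not Cargo's actual code: the `Package` struct and both function names are made up for illustration, and normalizing `-` to `_` in a renamed dependency key is my understanding of Cargo's behavior rather than something stated above.

```rust
/// Hypothetical, simplified view of a package manifest.
struct Package<'a> {
    /// `package.name`
    name: &'a str,
    /// `lib.name`, only if it was set explicitly.
    lib_name: Option<&'a str>,
}

/// Effective library crate name: the explicit `lib.name` if there is one,
/// otherwise `package.name` with every `-` replaced by `_`.
fn lib_crate_name(pkg: &Package) -> String {
    match pkg.lib_name {
        Some(name) => name.to_string(),
        None => pkg.name.replace('-', "_"),
    }
}

/// Name that shows up in `use` paths. `rename` is the dependency key when the
/// dependency is declared as `key = { package = "..." }`; it takes priority
/// over whatever the package set as `lib.name`.
fn name_in_code(pkg: &Package, rename: Option<&str>) -> String {
    match rename {
        // Assumption: a renamed key is normalized the same way as a package name.
        Some(key) => key.replace('-', "_"),
        None => lib_crate_name(pkg),
    }
}
```

So a package `foo-bar` with no explicit `lib.name` is used as `foo_bar`, and a rename on the dependency side overrides an explicit `lib.name` entirely.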
We already don't have this property, so if it's considered important, then there should be a warning for setting an explicit `lib.name`.
But I actually agree; many people don't know that `package.name` and `lib.name` can be set independently, so running into packages which do so tends to be surprising (and because of this, people tend not to set `lib.name` even if it would improve use of the crate; e.g. nalgebra recommends that consumers use `extern crate nalgebra as na;` and `extern crate nalgebra_glm as glm;`).
At one point I toyed with just always using the `{ package = "package" }` form of dependencies, to make the fact that package and lib name are distinct more immediate. In a world where package names differing from lib names had been the norm from the beginning, I'd expect this to be the only way to declare dependencies, and the `lib.name` key would essentially be what `cargo add` uses when first adding a package.
It's already fairly common for `bin.name` to be different than `package.name`. cargo already isn't really meant to be a binary package repository, but rather a library one.
I think a reasonable heuristic would be to display as
- `bin.name (package.name)` if a single `bin.name` is set explicitly and no `lib.name` is set,
- `lib.name (package.name)` if `lib.name` is set explicitly,
- `package.name` if no `lib.name` or `bin.name` are set, and
- `package.name` if multiple `bin.name` are set but no `lib.name` is set.
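To make the heuristic concrete, here is a rough sketch of it as a function. The `Names` struct and `display_name` are hypothetical names for illustration only; nothing like this exists in cargo or crates.io as far as I know, and it only looks at names that were set explicitly.

```rust
/// Hypothetical summary of the names a package sets explicitly.
struct Names<'a> {
    /// `package.name`
    package: &'a str,
    /// Explicit `lib.name`, if any.
    lib: Option<&'a str>,
    /// Every explicitly set `bin.name`.
    bins: Vec<&'a str>,
}

/// Pick the string to display for a package, following the four cases above.
fn display_name(n: &Names) -> String {
    match (n.lib, n.bins.as_slice()) {
        // An explicit lib.name always wins.
        (Some(lib), _) => format!("{} ({})", lib, n.package),
        // Exactly one explicit bin.name and no lib.name.
        (None, [bin]) => format!("{} ({})", bin, n.package),
        // No explicit names, or several bins and no lib: plain package name.
        (None, _) => n.package.to_string(),
    }
}
```

For example, a package `foo-cli` with a single explicit bin named `foo` would be shown as `foo (foo-cli)`.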
This gives developers a reasonable amount of control to get the result they prefer.
This also leads to thoughts of what it would look like for a package to contain multiple lib crates versioned, packaged, and distributed as a single unit. Existing discussion of such has mostly limited itself to encapsulated/private crate units, since in the case of multiple public crates it's mostly sufficient to just publish them as separate crates with `=` dependencies to keep them in lockstep.
What if two crates use the same display name of "serde"? That would be very confusing as they are incompatible with each other even if they have the exact same source. You can't take data structures with a Serialize impl of one version and use them with a serializer that uses the other crate that had a display name of serde. The current method that forces a different crate name prevents this confusion.
That's what happens when a feature, while available to hard-core users, stays hidden: it isn't well documented or promoted, so it goes unnoticed and its potential problems never get handled.
One might be serde (serde) and the other would be serde (myorg-serde) or serde (serde-5678). But yeah, I know there're people who don't like the idea, just like how they think different users sharing the same (display) name would confuse them.
I found the Rust concept of crate/package extremely confusing.
I'll try to show what I got from The Rust Programming Language 7.1. Packages and Crates (a), 14.3. Cargo Workspaces (b), The Rust Reference 14. Linkage (c), The Cargo Book 3.2. The Manifest Format (d), 3.2.1. Cargo Targets (e) and 3.3. Workspaces (f), and how it confused me. Please correct me if I made a mistake.
- A crate is the smallest amount of code that the Rust compiler considers at a time. [a]
- A package is a bundle of one or more crates that provides a set of functionality. [a]
- Each `Cargo.toml` file defines a package. [a][d][f]
- A workspace is a set of packages that share the same Cargo.lock and output directory. [b]
- You can define a workspace in a `Cargo.toml` file that consists of one or more packages that are defined by their respective `Cargo.toml` files. [f]
- The `[lib]` section in `Cargo.toml` specifies the library target which defines a "library" that can be used and linked by other libraries and executables. [e]
- There's a field called `crate-type` under this `[lib]` section, which controls how the compiler generates artifacts. [c]
- Cargo packages consist of targets which correspond to source files which can be compiled into a crate. The list of targets can be configured in the `Cargo.toml` manifest. [e]
- You can only specify one library (target) for each package (`Cargo.toml`). [e]
- A package can contain as many binary crates as you like, but at most only one library crate. [a]
...
I mean, WTF? From what I've seen, the term crate sometimes means build unit, sometimes package, sometimes library, sometimes target, which is really ambiguous.
I'll take the file hierarchy of the repo github.com/hawkw/mycelium as an example.
- The root `Cargo.toml` defines a workspace, with subdirectory `bitfield` being one of its members.
- The root `Cargo.toml` defines a root package named `mycelium-kernel`.
- The root `Cargo.toml` specifies a `[lib]` target with the name `mycelium-kernel` for the workspace. I suppose this name specification can be omitted?
- The `Cargo.toml` under subdirectory `bitfield` defines a package named `mycelium-bitfield`.
- Subpackage `mycelium-bitfield` specifies no target, but it is published on crates.io individually.
- According to the doc example, you use `mycelium-bitfield` by `use mycelium_bitfield;`
- `cordyceps` is `mycelium-bitfield`'s peer subpackage under the subdirectory `cordyceps`.
- If I installed cordyceps along with mycelium-bitfield, what is the target lib? Is it `cordyceps` + `mycelium-bitfield`, or is it `mycelium-kernel` as specified by the `[lib]` section in the root `Cargo.toml` of the workspace? Should I `use cordyceps;` and `use mycelium_bitfield;`, or `use mycelium_kernel;`? Why? (According to various people here, specifying `lib.name = "mycelium-kernel"` would result in you writing `use mycelium_kernel;`)
- What would happen if I specified `lib.name = "bitfield"` in the `Cargo.toml` under subdirectory `bitfield`? Would that even compile?
- If you have installed mycelium + mycelium-bitfield, does that mean you can access mycelium-bitfield via either `mycelium-kernel` or `mycelium-bitfield`?
From The Rust Programming Language 14.3. Cargo Workspaces:
The workspace has one target directory at the top level that the compiled artifacts will be placed into; the `adder` package doesn't have its own target directory. Even if we were to run `cargo build` from inside the adder directory, the compiled artifacts would still end up in add/target rather than add/adder/target.
From The Rust Reference 14. Linkage:
With all these different kinds of outputs, if crate A depends on crate B, then the compiler could find B in various different forms throughout the system. The only forms looked for by the compiler, however, are the `rlib` format and the dynamic library format.
So here are my questions:
- what does the `lib.name` field really do?
- how do we know the boundary of a crate, as in the smallest amount of code that the Rust compiler considers at a time?
- what is the most accepted definition of the term `crate`?
- how does single-repo-multiple-published-packages(crates) work?
- how should we understand all these concepts?
I'd be interested to see the history but it didn't seem too bad? A crate is a Cargo.toml with a `[package]` section, so it seems pretty clear to me that "crate" is just a cutesy name for a package.
Since the only thing you can depend on with cargo is a library target of a crate, and you can only have one library target for a crate, there's no distinction drawn between depending on a crate and depending on its target.
As I understand things:
- `lib.name` is the name that you use in source code.
- You follow the included `mod` statements from the root file for the target, e.g. `main.rs` vs `lib.rs` by default. These can use cfg and path attributes, so there's no simpler accurate method.
- As I mentioned, there's little need to distinguish between crates as proper packages and their library targets, so the two are generally conflated depending on context.
- Generally a monorepo of packages will use workspaces so they can all be developed with dependencies on each other locally without publishing, but at publish time they are treated as independent crates (paths in dependencies are removed, etc.) and crates.io doesn't care. The only other time this comes up is git dependencies, where you have to provide the repository root, and cargo tree searches for the package name (which I find a bit lame).
- In general, if you need to worry about this, you've probably done something wrong! Having multiple targets in a crate is somewhat less favored now; workspaces cover the same need with less confusion.
- There exists a glossary which may help.
- Generally I don't think a normal user needs to know these concepts all at once. It's just that this proposal is trying to solve a deep-rooted issue, so the author needs to understand more than usual.
Maybe we could have a "cargo-by-example" book teaching people common patterns for setting up projects, so that they can check example-style docs instead of the current reference-like docs. That is also a way to "promote" features.
Check The Rust Programming Language 14.3. Cargo Workspaces:
The workspace has one target directory at the top level that the compiled artifacts will be placed into; the `adder` package doesn't have its own target directory. Even if we were to run `cargo build` from inside the adder directory, the compiled artifacts would still end up in add/target rather than add/adder/target.
From what I've learnt, you can only specify 1 library target for the entire workspace, or am I wrong about that?
Because I've only seen examples of workspaces that have one `[lib]` section specified in the root `Cargo.toml`.
So if I specified a `[lib]` section in a subpackage `Cargo.toml`,
- will that be ignored by `Cargo` because the subpackage doesn't have its own target directory?
- or is it a valid specification of the subpackage's target (e.g. `lib.name` that controls the name you use in source code), and the only thing ignored by `Cargo` is the target directory, so compiled artifacts will not be placed into the subpackage's target directory, even if we have these lines:

```toml
[lib]
name = "subpackage_name"
```

in `/subpackage/Cargo.toml`, while we have

```toml
[lib]
name = "rootpackage_name"
```

in `/Cargo.toml`?
Let me rephrase my questions:
- Is it true that while you can have only 1 target directory specified according to workspace definition, each package can specify its own library target (that would have the compiled artifacts be placed together in the workspace target directory)?
- If you install a package from crates.io, whose `Cargo.toml` defines a workspace, how does that workspace interact with your own workspace? Or is that workspace definition in the `Cargo.toml` file from the crate/package repo removed during the `cargo package` process?
Yes, one target directory, where all the targets are placed on build; and workspaces are removed from crates on publish, they're a completely local concept, at least so far as I know!
Keep in mind also that you don't need to have a package at the root of a workspace; that is, you can have a Cargo.toml with only a `[workspace]` section.
Ah, right, I missed it, from The Cargo Book 4.5.4. cargo package DESCRIPTION
This command will create a distributable, compressed `.crate` file ... This performs the following steps:
...
Create the compressed `.crate` file.
- The original `Cargo.toml` file is rewritten and normalized.
- `[patch]`, `[replace]`, and `[workspace]` sections are removed from the manifest.
So in the end it is possible to have a `[lib]` section for each subpackage, since target ≠ target directory, so a subpackage can have its own lib target, and it is no longer a subpackage when it's `package`d for `publish`. It just happened that I failed to find a good example showing how this could be done.
And this can be further simplified to:
- `lib.name (package.name)`, when a lib target exists, no matter whether `lib.name` is set explicitly or not (`lib.name` defaults to `package.name`),
- then `bin.name (package.name)` (if we haven't got the name from the previous step) when a single bin target exists (`bin.name` for `src/main.rs` defaults to `package.name`. Caution: if multiple bin targets exist and only one has its name set explicitly, I don't think that the explicitly set name should be the "representative name" for a "bin-type crate"),
- then `package.name` for multi-bin packages.
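As a rough sketch of this simplified ordering: the `Targets` struct and `representative_name` below are made up for illustration, target discovery is reduced to a `has_lib` flag and a list of bins, and the `-` to `_` replacement for a defaulted `lib.name` follows what was said earlier in the thread.

```rust
/// Hypothetical summary of a package's targets after auto-discovery.
struct Targets<'a> {
    /// `package.name`
    package: &'a str,
    /// Whether a lib target exists at all (explicit or discovered).
    has_lib: bool,
    /// Explicit `lib.name`, if any.
    lib_name: Option<&'a str>,
    /// One entry per bin target; `None` means its name was not set explicitly.
    bins: Vec<Option<&'a str>>,
}

/// Representative display name: lib first, then a lone bin, then the package.
fn representative_name(t: &Targets) -> String {
    if t.has_lib {
        // A defaulted lib.name is package.name with `-` replaced by `_`.
        let lib = t
            .lib_name
            .map(String::from)
            .unwrap_or_else(|| t.package.replace('-', "_"));
        return format!("{} ({})", lib, t.package);
    }
    if let [bin] = t.bins.as_slice() {
        // A lone bin without an explicit name defaults to package.name.
        let bin = (*bin).unwrap_or(t.package);
        return format!("{} ({})", bin, t.package);
    }
    // Several bins (even if one has an explicit name) or no targets at all.
    t.package.to_string()
}
```

Note that with several bin targets the function falls back to the package name even when one of them has an explicit name, matching the caution above.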
I think the usage of such icons is good for distinguishing different types of crates. Currently I don't know how to tell a lib crate from a non-lib crate on crates.io at first glance!
The only thing that isn't intuitive enough is that `lib.name` overrides everything.
But again, most people go to crates.io to look for libraries, and it would already have been confusing to them if a "crate" had a `package.name` (crate name), a `lib.name` and `bin.name`s which differ from each other.
And I think there's a benefit if developers are discouraged from publishing packages with both a `[lib]` target and `[bin]` targets. They don't look intuitive even now without a crates.io UI change!
On the other hand, a developer who can understand the relationship between `bin.name`, `lib.name` and `package.name` would have no difficulty with the new crates.io UI mentioned above.
Whether we should omit the "(package.name)" part, if that name equals the display-name (either lib.name or bin.name) before it, is a matter of ui design.
I agree that we habitually conflate the concept of a package/dependency and the library crate which it exports, and don't do much of anything to prevent this. Especially with edition 2018 paths and the extern prelude removing the need for `extern crate`, the concept of crates as the unit of coherence and compilation, especially w.r.t. how it differs from the package as the unit of versioning and distribution, is an advanced topic that most developers manage to go without clarifying.
This, plus the annoying fact that English words can mean more than one thing depending on context (and sometimes be ambiguous even with context), means that the answer is usually "it's complicated." Packaging is complicated, and as much as Cargo has done to make the common case simple/easy, it also has a tendency to make the complexity cliff when you step outside the goldilocks zone feel even steeper.
I wonder, now that we have come up with a draft of the change, should I file an issue here referring to this thread, or wait for the rust dev team to do that?
I mean, are rust dev team members collecting feedback on this forum?
While on the surface this can be viewed as just a UX change in crates.io, I think this is a major change to how we view crates and has an impact on cargo documentation, likely `cargo new`, etc. Personally, I lean towards this going through the RFC process, starting with a more formal PreRFC here.
Ok then, what can I do to help for now?
One use case that I think we should be careful of with bin crates is sometimes the bin crate is subordinate to the lib crate within the package, rather than the other way around.
pulldown-cmark is a great example of this, though in that case the explicit bin crate is named the same as the package.