Pre-RFC: Workspace Member Auto-Discovery

  • Feature Name: workspace-member-autodiscovery
  • Start Date: 2021-9-22
  • RFC PR: N/A
  • Rust Issue: N/A

Summary

If a directory named members is present in the root of a crate without an explicit workspace members declaration, and the members directory contains one or more subdirectories, those subdirectories will be inferred to be workspace members, as if

[workspace]
members = ["members/*"]

were present in the root's Cargo.toml.

Motivation

By providing a convenient default, we encourage uniformity of workspace layouts, making workspaces easier to navigate and identify as workspaces. Additionally, it becomes slightly easier to create a workspace crate, or transition an existing crate to a workspace, as there is no need for a [workspace] declaration in the root Cargo.toml.

Guide-level explanation

(First part stolen from The Cargo Book.)

Root package

A workspace can be created by adding a [workspace] section to Cargo.toml. This can be added to a Cargo.toml that already defines a [package], in which case the package is the root package of the workspace. The workspace root is the directory where the workspace's Cargo.toml is located. Alternatively, workspace members can be placed in a directory called members, and cargo will automatically add them as workspace members.

For example, given the following directory layout:

foo/
  Cargo.toml
  members/
    bar
    baz
  …

Crates in foo/members/bar and foo/members/baz will be inferred to be workspace members. Note that it is an error for these directories to exist without containing valid crate manifests; that is, both foo/members/bar/Cargo.toml and foo/members/baz/Cargo.toml must exist and be valid.

This behavior can be overridden by including an explicit workspace members declaration.

Reference-level explanation

If a directory named members is present in the crate root, and the root manifest does not contain an explicit workspace members declaration, cargo will behave as if the crate's root manifest contained:

[workspace]
members = ["members/*"]
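As a rough illustration of this rule, here is a minimal Rust sketch using only the standard library. The function name infer_members and the has_explicit_members flag are hypothetical and not part of Cargo; a real implementation would parse the manifest and reuse Cargo's existing glob expansion. The sketch assumes the strict behavior from the guide-level explanation, where a subdirectory without a Cargo.toml is an error, and it only checks that the manifest exists, not that it is valid.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the directories that would be inferred as
/// workspace members of `root`. `has_explicit_members` stands in for
/// "the root manifest already declares workspace members".
fn infer_members(root: &Path, has_explicit_members: bool) -> io::Result<Vec<PathBuf>> {
    let members_dir = root.join("members");
    // An explicit declaration, or the absence of `members/`, disables inference.
    if has_explicit_members || !members_dir.is_dir() {
        return Ok(Vec::new());
    }
    let mut found = Vec::new();
    for entry in fs::read_dir(&members_dir)? {
        let path = entry?.path();
        if !path.is_dir() {
            continue; // `members/*` is only meant to pick up subdirectories
        }
        // Strict variant: every subdirectory must carry a manifest.
        if !path.join("Cargo.toml").is_file() {
            return Err(io::Error::new(
                io::ErrorKind::NotFound,
                format!("{} has no Cargo.toml", path.display()),
            ));
        }
        found.push(path);
    }
    found.sort(); // read_dir order is platform-dependent
    Ok(found)
}
```

For the foo/ layout shown earlier, infer_members(Path::new("foo"), false) would yield foo/members/bar and foo/members/baz, provided both contain a Cargo.toml.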

Drawbacks

cargo would report an error when invoked in a pre-existing crate whose members directory contains subdirectories that are not valid Rust crates.

If a pre-existing crate has a members directory that does contain Rust crates, the root crate might unexpectedly become a workspace with those crates as members.

Rationale and alternatives

We could choose not to do this, with the downside of still requiring users to add an explicit workspace members declaration, and of not encouraging crate layout uniformity.

We could also use an alternative method of auto-discovery of workspace members, for example treating all directories in the root crate which contain Cargo.toml files as workspace members. However, this has the disadvantage of not promoting crate layout uniformity, and of causing false positives from crates which are in the root, but are not workspace members.

Prior art

Prior art includes the existing auto-discovered targets, including binary targets in src/bin, examples in examples, integration tests in tests, and benchmarks in benches. These seem to be useful and widely used, so supporting automatic workspace member discovery seems likely to be well received.

Unresolved questions

Some projects use crates as the name of the subdirectory that contains workspace members, so we could consider using that instead, although members is arguably clearer, since it contains workspace members, not just arbitrary crates.

Future possibilities

There are a number of steps required to set up a new workspace, or to transition an existing crate to being a workspace, for example adding and keeping the members declaration up-to-date, mirroring manifest metadata in multiple places, figuring out how to publish members, and adding path dependencies between members.

These barriers to using workspaces might inadvertently deter crate authors from using them.

This RFC is quite a modest improvement, but combined with other changes, for example:

  • Inferring workspace path dependencies, so they need not be declared in the manifest.
  • Allowing manifest entries such as license and repository to be placed under the [workspace] section, to apply to all members.
  • Adding support to cargo for publishing an entire workspace atomically.
  • Allowing crates to serve as namespaces, so that crate members can have non-global names like foo/bar, perhaps accessed as workspace::bar, where foo is the root crate and bar is a workspace member, to alleviate the need for workspace members to have globally unique names.

would make it much easier to use workspaces, and thus encourage crate authors to use them where appropriate.

I just want to note that the future possibilities are in no way reliant on member auto-discovery. (And the workspace improvement I want the most is to be able to use workspace subcrates that don't have to be publicly usable, but still publish a public library using them.)

Today, the conventional folder seems to be crates, so I'd argue for using that. See matklad's Large Rust Workspaces for actual real-world recommendations on how to structure a meaningfully sized (but less than full-Google monorepo) workspace; if we're going to add a new convention, we should probably build on that lived experience from rust-analyzer.

There are a lot of projects that are mixed Rust projects with other languages. These codebases often have their own conventions in place for where to put sub-projects, which will end up not following the ideal structure of crates/*. I think we should still support this.

Also, it's always seemed to me that the crates/* pattern is just a patch around the fact that micromanaging workspace.members is a pain in the ass, tbh.

That's correct, I just wanted to note that this would make workspaces marginally easier to use, which, along with other things that made workspaces marginally easier to use, would make workspaces much easier to use. Is there a way that I could change the wording to make this clearer?

I agree that crates seems to be the common choice today, and I wouldn't mind using that instead. I'll mention the choice in "unresolved questions". However, members fits better when compared with the other autodiscovery directories, which are all named after the thing they autodiscover (src/bin, examples, tests, benches).

In that blog post, the main point is to put member crates in some subdirectory, but there isn't any particular rationale or justification for the name of that subdirectory, so I don't think it can really be taken as a strong argument for the name crates.

I do experience the problem of micro-managing the workspace.members value, in part because I don't like the solution of adding the additional indirection of crates/* (that is, an extra directory level between the workspace Cargo.toml and the crate directories), but I don't think this is the right solution for it.

I'm not sure I value uniformity of workspace layouts all that much. In fact, in my view it is an anti-pattern to have a Cargo.toml that defines both a project and a workspace (I think Aleksey mentioned this in his article, as well) so most of my workspace Cargo.tomls only have the workspace.members value.

(So I guess what I would like is if autodiscovery resulted in finding all sibling directories that contain a Cargo.toml.)

Most of my projects have workspace crates as siblings in the top-level directory of their repository. Except one project where I have a few crates in a shared/ subdirectory for technical reasons (git submodule). I don't have many other things to put in the top-level repo directory, so I don't feel I need to put crates in their own dir.

Another subdirectory would cause rightward drift in the file browser in my IDE's sidebar.

Cargo already forces a src subdir. In this case at least the name is obvious and short, and something I would usually do anyway. members is very bikesheddable.

I don't think uniformity is that important, given that there's cargo metadata for discovering crates and targets.

I think I'd be more interested in this if the corresponding workspace = "../.." could also be elided somehow. Without also removing that…I'm not sure how useful I would find this personally. But I only maintain 3 projects with workspace setups, so maybe my experience is just narrow here. Maybe if one could do workspace = auto or something? Though it'd have to be some non-string to disambiguate from some weird auto subdirectory layout, so maybe workspace = true works better.

Either way, I wouldn't want additional entries hanging around to become an error. For example, entries named members/README.md, members/doc/, or members/.git should not cause errors just because they're not crate directories.
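One way to get that leniency, sketched in Rust with only the standard library: treat an entry under members/ as a workspace member only if it directly contains a Cargo.toml, and silently skip everything else. The name discover_lenient is hypothetical, and this is an assumption about how the proposal could behave, not a description of what Cargo does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical lenient discovery: collect only those entries of
/// `members_dir` that directly contain a `Cargo.toml`.
fn discover_lenient(members_dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(members_dir)? {
        let path = entry?.path();
        // Plain files (README.md) and manifest-less directories (doc/, .git)
        // fail this check and are skipped rather than reported as errors.
        if path.join("Cargo.toml").is_file() {
            found.push(path);
        }
    }
    found.sort(); // read_dir order is platform-dependent
    Ok(found)
}
```

The trade-off is that a member whose manifest was deleted or misnamed would quietly drop out of the workspace instead of producing a diagnostic.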

You already don't need to add that, if the child crate of the workspace is below the workspace manifest in the filesystem tree. Just omit the workspace key and it'll be inferred.
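For illustration, the upward search behind that inference can be approximated as follows. This is a simplified sketch, not Cargo's implementation: find_workspace_root is a hypothetical name, and the textual check for a [workspace] line stands in for real manifest parsing.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper: walk up from a member crate's directory and return
/// the first ancestor whose `Cargo.toml` contains a `[workspace]` table.
fn find_workspace_root(member_dir: &Path) -> Option<PathBuf> {
    // `ancestors()` yields `member_dir` itself first, so skip it and
    // start the search at its parent.
    for dir in member_dir.ancestors().skip(1) {
        let manifest = dir.join("Cargo.toml");
        if let Ok(text) = fs::read_to_string(&manifest) {
            // Simplification: a real implementation would parse the TOML.
            if text.lines().any(|line| line.trim() == "[workspace]") {
                return Some(dir.to_path_buf());
            }
        }
    }
    None
}
```

With a layout like foo/members/bar, calling find_workspace_root on foo/members/bar would return foo as long as foo/Cargo.toml has a [workspace] section, which is why the explicit workspace key is only needed when the member lives outside the root's directory tree.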


I'd like to counterpropose another solution: "just" make cargo new/cargo init do the "right" thing in a workspace environment.

Currently, if you cargo new a crate inside a workspace, cargo will print an error along the lines of "crate thinks it's in a workspace, but isn't." Cargo could instead (offer to) add the crate to the workspace root's workspace.members array, so that things "just work" by default. It would skip this when the new crate is already covered by a glob, in which case no error is printed today either.
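The "already covered by a glob" check could look roughly like this. It is a deliberately simplified sketch: pattern_covers and needs_explicit_entry are hypothetical names, paths are assumed to be relative to the workspace root and '/'-separated, and the only wildcard handled is a whole path component written as *. Cargo's real glob handling is more general.

```rust
/// Returns true if `path` matches `pattern` component by component, where a
/// pattern component that is exactly `*` matches any single path component.
fn pattern_covers(pattern: &str, path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('/').collect();
    let comps: Vec<&str> = path.split('/').collect();
    pat.len() == comps.len()
        && pat.iter().zip(&comps).all(|(p, c)| *p == "*" || p == c)
}

/// Hypothetical decision for `cargo new`: the new crate needs its own entry
/// in `workspace.members` only if no existing entry already covers it.
fn needs_explicit_entry(members: &[&str], new_member: &str) -> bool {
    !members.iter().any(|m| pattern_covers(m, new_member))
}
```

For instance, with members = ["members/*"], a new crate at members/bar would need no edit, while one at tools/xtask would have to be appended to the array.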

This wouldn't endorse a convention for where to put crates in a workspace, but rather just transparently work with any convention, only requiring manual editing when renaming folders.

This does kinda fall more into the cargo-edit domain, though, as IIRC cargo still never writes to (an existing) Cargo.toml, only reads it.

Once we have toml_edit in cargo, this was the first thing I was going to do, even before merging cargo add.

EDIT: For the status of toml_edit, see "Make `toml_edit` viable for `cargo`" (Issue #133, toml-rs/toml, on GitHub).
