Anonymous utilities crates

It seems like often when I have crates in a workspace, I run into situations where I want to use some utility code private to multiple crates, in which case I have two non-ideal options:

  1. Duplicate the code in question in both crates
  2. Make a 3rd crate for the shared utilities, and live with semver forever

These are typically just internal implementation details, and making them public only reduces flexibility. It occurred to me that it might be possible to introduce a third option by adding a third kind of crate, which I am calling an "anonymous utility crate". The general idea is that crates in the same workspace could depend on it, but crates outside of the workspace would not be able to add a dependency on it.

With that, the consequences of making the API public would be much reduced because the ability to introduce a dependency is still reserved by the workspace.

I can only imagine this has probably come up before, but I can't recall it?

FYI, nested packages have been discussed but ran into issues with the Index.

A concept of workspaces doesn’t (currently) exist on crates.io, so I’m not sure what mechanism could allow…

You aren’t trying to address problems only in the context of local crates, are you?


As far as I understand the discussion in the RFC @epage linked above, that one doesn’t address use-cases where a “utility crate” is supposed to be shared between multiple packages (without effectively code-duplication[1], anyway), or am I missing something?


  1. perhaps just automated code duplication through the publishing process?

My main thought was to avoid adding a notion of workspace to crates.io by leveraging characters that aren't valid in crate names in the "naming" of anonymous utility crates. That is to say, pick a random Unicode character and prefix a crate name with it (perhaps even use a completely random name). Then allow that name to be used in Cargo.lock, but make it invalid in the dependencies section of Cargo.toml.

It occurs to me that that may not work though, because I believe at least last time I looked that crate name validity was an aspect of crates.io and not Cargo.
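To make the naming idea a bit more concrete, here is a rough sketch of the two checks it would need. Everything in it is an assumption for illustration: the `§` prefix is an arbitrary hypothetical marker, the function names are made up, and `is_registry_name` only approximates the crates.io naming rule as I understand it (ASCII alphanumerics, `-` and `_`, starting with a letter, at most 64 characters). It is not how Cargo or crates.io actually validate names.

```rust
/// Hypothetical marker character for anonymous utility crates. Any character
/// that is not valid in an ordinary registry crate name would do.
const ANON_PREFIX: char = '§';

/// Approximation of the registry naming rule (an assumption, not the real
/// validator): starts with an ASCII letter, then ASCII alphanumerics, `-`
/// or `_`, and at most 64 bytes long.
fn is_registry_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    name.len() <= 64 && chars.all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Names a user could write in a `[dependencies]` table: ordinary names only,
/// so a prefixed name can never be requested by hand.
fn allowed_in_manifest(name: &str) -> bool {
    is_registry_name(name)
}

/// Names the resolver could record in Cargo.lock: ordinary names, plus
/// ordinary names carrying the anonymous prefix.
fn allowed_in_lockfile(name: &str) -> bool {
    match name.strip_prefix(ANON_PREFIX) {
        Some(rest) => is_registry_name(rest),
        None => is_registry_name(name),
    }
}
```

With this split, a name like `§shared-utils` would pass `allowed_in_lockfile` but fail `allowed_in_manifest`, which is the asymmetry the idea relies on. Whether the registry side could accept such names at all is exactly the open question.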

I don't exactly understand what you mean by local only. I have two crates, crate-a and crate-b. Both are library crates published to crates.io, and both exist in a single workspace. One crate depends on the other, and both want to use the shared utilities. Neither wants to expose the shared utilities as part of its public API (but I don't feel like that needs to be enforced for this to be useful).

I just meant “situations not involving crates.io (or other registries)” though I do suppose your mention of semver should have made it clear to me already that this is about crate registries :smiling_face_with_halo:

Yeah, that is the impression I get from the RFC. Still, automated code duplication would be an improvement over what I currently do (duplicate code between crates, since amazingly making it public somehow manages to seem even worse).

But with the trick I mentioned above (a name that is valid for crate resolution but invalid for specifying dependencies), the intent was that these utility crates would actually be shared.

I haven't tried it, but perhaps something like this could work to avoid duplicating the code?

```rust
#[path = "../shared/mod.rs"]
mod shared;
```

The only workaround that I know of that would make this work when publishing to crates.io is to do something like git clone during build.rs, because ../shared ends up outside the crate package source tree for at least one of the crates. I haven't considered that workaround viable though, because it has all kinds of downsides.
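As a rough illustration of why at least one crate ends up reaching outside its own package, here is a small sketch. It assumes a hypothetical layout where `shared/` lives inside crate-a's package directory, and that a `#[path]` attribute in `src/lib.rs` is resolved relative to the directory containing that file. The helper names are made up, and this is not how Cargo itself decides what to package.

```rust
use std::path::{Component, Path};

/// Returns true if a package-relative path climbs above the package root
/// (or is absolute), i.e. it would not be part of the packaged source tree.
fn escapes_package_root(rel: &Path) -> bool {
    let mut depth: i32 = 0;
    for component in rel.components() {
        match component {
            Component::ParentDir => {
                depth -= 1;
                if depth < 0 {
                    return true;
                }
            }
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            // Absolute paths never stay inside the package.
            Component::RootDir | Component::Prefix(_) => return true,
        }
    }
    false
}

/// Resolves a `#[path]` value relative to the directory of the source file
/// that contains it, then checks whether it leaves the package.
fn path_attr_escapes(source_file: &str, path_attr: &str) -> bool {
    let dir = Path::new(source_file).parent().unwrap_or(Path::new(""));
    escapes_package_root(&dir.join(path_attr))
}
```

Under those assumptions, `path_attr_escapes("src/lib.rs", "../shared/mod.rs")` is false for crate-a, but crate-b would need something like `"../../crate-a/shared/mod.rs"`, which does escape its package root and so would be missing from the published package.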

If you have a symlink, cargo package / cargo publish will follow it and include the files.

The downside is that this puts constraints on Windows contributors (they must enable symlink support and ensure git has symlink support enabled).

It occurs to me there is another similar approach: using multiple instances of a git submodule, which is maybe more Windows-friendly.

I think the inconveniences of all these solutions are likely to end up worse than just copying the code unfortunately.

I think it would be fine to publish a crate with a description of something like "Internal implementation details for [name of your project] only; depending on this crate directly is not recommended" and give it a version starting with 0. That would strongly discourage another crate from depending on it, and it would be very clear that breakage to the public API should be expected if someone does choose to depend on it.

Is there a scenario you're concerned about that would only be solved by forbidding any dependency on the internal crate?

I guess the reason I hesitate to do that is that if someone goes the route of reading code, copy/pasting the dependency from some code where it is intended to be used, and then copying some line of code, they may never encounter the description or read the docs.

More than anything, it is the principle that publishing a crate should come hand in hand with participating in compatibility guarantees, and these types of crates have no intention of ever being stable. So right or wrong, it feels to me like it dilutes the strength of semver to be a crate and yet never participate in the stability guarantees.

In that sense I'd rather save my "I told you so's" for people clearly working around the intent by downloading the crate/building an rlib, over people who may accidentally copy/paste some internal usage, or had an AI generate code which used a crate which was intended to be internal.

You'd still need to follow semver in these crates or Cargo could end up picking an incompatible version. However, semver never says you have to hit 1.0 and then never go to 2.0. Whether to keep your majors low or not is dependent on the product you are creating.

You could even give it a 0.0.x version to emphasize that you shouldn't be using it.
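For reference, here is a minimal sketch of how a default (caret) version requirement treats 0.x and 0.0.x versions, which is why the choice of version matters here. The `Version` type and function names are made up for illustration, and pre-release and build metadata are ignored. Cargo's real resolver does considerably more than this.

```rust
/// A plain `major.minor.patch` version (pre-release/build metadata ignored).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parses exactly three dot-separated numbers, e.g. "0.2.3".
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Would a caret requirement on `req` (e.g. `"0.2.3"` in Cargo.toml) accept
/// `candidate`? The leftmost non-zero component must stay the same.
fn caret_compatible(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false;
    }
    if req.major > 0 {
        candidate.major == req.major
    } else if req.minor > 0 {
        candidate.major == 0 && candidate.minor == req.minor
    } else {
        // 0.0.z: every patch release counts as breaking.
        candidate == req
    }
}
```

So a dependent asking for 0.2.3 can be given 0.2.9 but not 0.3.0, while a dependent asking for 0.0.3 only ever gets 0.0.3. An internal crate that breaks its API still has to bump the right component, or dependents may be resolved to an incompatible release.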

I guess what I would say is that our current situation, where none of this code is pub and there is no separate version distinct from the versions of crate-a and crate-b themselves, is pretty ideal for our use case. Inventing a nonsense version and doing a breaking change for every release is still less ideal than just not making it pub in the first place.