Pre-RFC: Cargo mutually exclusive features

I guess the question is: should it focus on conflict resolution (and, in particular, on approaches which make this sort of feature activation conflict-free)?

In my alternative above I ended up using a slightly different syntax, so I think it's worth at least considering conflict-free approaches up front so whatever syntax is chosen can support it.

BTW, the analogy to semver-major conflict resolution gives me an idea: also duplicate crates if they have incompatible exclusive features.

Currently Cargo separates and duplicates crates by the key (semver_major,). It could instead duplicate by the key (semver_major, exclusive_features). So for example, instead of "error: diesel must have one back-end", you may end up with diesel (with mysql) and diesel (with postgres) in your dependency tree.
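
To make the grouping idea concrete, here is a minimal Rust sketch, not Cargo's actual resolver. The `Request` struct, the `UnitKey` alias, and the `unify` function are all hypothetical names, and the sketch assumes each request picks at most one exclusive feature. Additive features are unioned within a unit, while a different exclusive feature yields a separate copy of the crate.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One dependency edge as requested by some crate in the graph (hypothetical model).
#[derive(Debug, Clone)]
pub struct Request {
    pub name: &'static str,
    pub semver_major: u64,
    /// The chosen mutually exclusive feature, if any.
    pub exclusive: Option<&'static str>,
    /// Ordinary additive features.
    pub additive: &'static [&'static str],
}

/// Key that decides whether two requests can share one compiled copy.
pub type UnitKey = (&'static str, u64, Option<&'static str>);

/// Groups requests into compiled units. Additive features are unioned within
/// a unit; differing exclusive features produce separate units.
pub fn unify(requests: &[Request]) -> BTreeMap<UnitKey, BTreeSet<&'static str>> {
    let mut units: BTreeMap<UnitKey, BTreeSet<&'static str>> = BTreeMap::new();
    for r in requests {
        let key: UnitKey = (r.name, r.semver_major, r.exclusive);
        units.entry(key).or_default().extend(r.additive.iter().copied());
    }
    units
}
```

With two requests for diesel+mysql and one for diesel+postgres, this yields two units rather than an error, which is the duplication behaviour described above.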

That assumes that the different feature/capability configurations are of top-level significance to the binary. Requiring the binary to have deep enough knowledge of its entire dependency graph to set configurations for the whole thing seems unreasonable, whether that happens implicitly or because every package with configurable dependencies has to pass those configurations along to its dependents in some way.

Hi! Thanks for working on this - there's a clear need for some kind of exclusive feature model which isn't currently being met by Cargo.

One comment on the examples - the TOML doesn't seem to make sense from a data model POV:

[features]
json = []

and

[features.json]
...

are referring to the same entity with different types, so this either shouldn't parse, or if it does it loses one or the other. The more detailed description later on clarifies this to some extent, but it doesn't help with making the initial explanation any clearer.

I like the use of the term "capabilities", but I'm not sure about this design and their interaction with features. Features are already fairly awkward to use at scale (it's hard to tell where a feature has been enabled when analyzing a large dependency graph), and this seems to make the problem worse.

More generally, you don't really frame what problems exclusive features are needed to solve. As I see it there are two cases:

  1. The two features are intrinsically incompatible - if a single binary contained both then it would be non-viable (it either wouldn't build, or would behave badly on execution). An example of this would be linking to two versions of a foreign library with conflicting symbols (though this is what the link metadata is intended to address).
  2. The features change either the API or the functionality of the package in incompatible ways which affect its dependents. This is currently completely unaddressed, however I think it can be handled more gracefully than this proposal.

The former is something that should be flagged at build time as an error, but it needn't necessarily prevent dependency resolution (this is a subtle point, but for my use-case distinguishing resolution from build is important). This is currently a limitation with link, especially as it doesn't distinguish build script dependencies from library dependencies (though -Zavoid-dev-deps looks like it could help with this).

For the latter case, rather than failing at either resolution or build time, I think it should just split the dependency graph. In other words, if both "A with capability X" and "A with capability Y" appear, then treat them as completely separate dependencies, in the same way that two distinct versions of A currently are.

As @withoutboats mentioned above, non-brittleness is one of the ecosystem's good properties we'd like to preserve. If you have an executable A depending on B and C, and B currently depends on D+X, but then you rev C and it depends on D+Y, you don't want the build to break. You just want B and C to link with the variants of D which satisfy their requirements.

Of course this gets complicated if B and C expose definitions from their D variants as part of their public API, but that's no worse than having version skew on D.

Perhaps another way of putting it: you could consider capabilities as being extra flags on the version which are never semver compatible with each other, so 1.0+X is never compatible with 1.0+Y, but would be compatible with 1.1+X.
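
As a rough illustration of that rule, here is a small Rust sketch. The `Tagged` type and `satisfies` function are hypothetical, and the model is deliberately simplified: it ignores patch versions and the special handling of 0.x releases.

```rust
/// A crate version tagged with an optional capability flag (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Tagged {
    pub major: u64,
    pub minor: u64,
    pub capability: Option<&'static str>,
}

/// Returns true when `candidate` can stand in for `required`:
/// same major version, a minor version at least as new, and an
/// identical capability flag.
pub fn satisfies(required: Tagged, candidate: Tagged) -> bool {
    required.major == candidate.major
        && candidate.minor >= required.minor
        && required.capability == candidate.capability
}
```

Under this model a requirement for 1.0+X accepts 1.1+X but rejects 1.0+Y, so the resolver would keep two copies, just as it does for two semver-incompatible versions.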

Yes, I agree with @jsgf (disclosure: we're coworkers). The "fundamentally incompatible because of underlying system characteristics" and "currently incompatible but can be made compatible through splitting versions" models are different and we shouldn't lock ourselves into either, I think.

I will say that it's quite possible we may want to solve the latter in a different way, perhaps through building tooling to make it easier to upload multiple crates with different names to a package repository out of a single codebase. In that case the incompatibility would be encoded in the crate name, not the version number.

This notion of mutually exclusive flags as making them semver incompatible and allowing duplication is certainly an interesting approach to solving the problem! It would be worth exploring the implications more fully if there's enough energy for a working group on this idea.

This doesn't sound unreasonable to me. Leaf dependencies may need to know which runtime/back-end/OS API to use, but these things aren't just implementation details. They may affect things like operating system compatibility or cause performance issues that affect the whole binary.

The only thing I'm worried about is the fragility of such configuration, similar to how the patch.crates-io section is fragile and can cease to apply at any time. For example, I may want to make HTTPS requests with a specific implementation of a CA certificate store. I'll configure the fooTLS library to use it. But if my requests library changes its dependency to the barTLS library at any time, my fooTLS config won't be applicable any more.

Regarding your HTTPS request library example, I've come across this very problem "in the wild" recently: https://github.com/seanmonstar/reqwest/pull/1058

The solution I went with was to just add the root certificates of all configured stores together, and wait for people with use cases that require true exclusive switching between backends to speak up, because I'm not sure that their use cases are indeed real. If exclusive switching is desired, I think a runtime way of configuring it when building a Reqwest client would be much better than compiling all of reqwest twice and increasing compile time even more :).

Personally I like that features are purely additive. As has already been stated, this has a number of useful properties and it would be a loss to give them up even in a limited way.

Obviously people do sometimes need mutually exclusive features and I understand (in terms of complexity) wanting to reuse the feature field for that as much as possible. But would it really be so bad to separate "capabilities" from "features" so that they are distinct concepts?

Yea, since these would have different rules from the "features" feature, I would also want it to be a different feature.

This seems like a good solution to the problem that RFC 2962 was trying to solve.

There are still some common points: a "capability" will still want to control the crate's dependencies ("capability X needs crate foo, Y needs bar"). We could have one of:

  1. a second mechanism for configuring dependencies at resolve time
  2. overload/integrate features and capabilities
  3. factor dependency control out from features and then make both features and capabilities use that

I think 3 is the most palatable from an implementation point of view, but it would end up looking like 2 to users, since additive features already exist and we don't want to make massive changes to the file format or UX, and I think they're likely to remain the most common in future (i.e., we should make sure that capabilities are only used when definitely needed).

So I'm sympathetic with the RFC's attempt to fold capabilities and features into a unified conceptual mechanism, but I'm not sure that this specific approach is the right way to do it. (TBH I've read it a few times and I don't really understand it.)

That's just at #[cfg(...)] evaluation time, isn't it? It doesn't help with the problem of "what if more than one is set?" - the only options are to flag a compile-time error, or to actually handle more than one of them being set.
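
For reference, the "flag a compile-time error" option is what crates can already do by hand today with `compile_error!`. A minimal sketch follows; the feature names `json_foo` and `json_bar` are borrowed from the examples in this thread and are purely illustrative.

```rust
// Reject builds that enable both back-ends at once.
#[cfg(all(feature = "json_foo", feature = "json_bar"))]
compile_error!("features `json_foo` and `json_bar` are mutually exclusive");

/// Name of the JSON back-end selected at compile time, or "none" if neither
/// feature is enabled.
pub fn json_backend() -> &'static str {
    if cfg!(feature = "json_foo") {
        "json_foo"
    } else if cfg!(feature = "json_bar") {
        "json_bar"
    } else {
        "none"
    }
}
```

This only reports the conflict once the crate is being compiled; it does nothing to stop two dependents from requesting both features during resolution, which is the gap being discussed.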

I'm saying that that RFC was trying to solve the same problem, but without addressing the issue in cargo. This approach, adding mutually exclusive features or "capabilities" in Cargo, seems like the right solution to the underlying problem.

I think this is similar to the existing way that dependencies are modeled:

[dependencies]
foo = "1.2.3"
# or, equivalently:
foo = { version = "1.2.3" }

# or, equivalently:
[dependencies.foo]
version = "1.2.3"

To be honest, I lost track of what you are actually discussing. I don't see any real feedback on the RFC other than: a) the problem seems to be a valid one, and b) all kinds of other things could be done.

My proposal would be, if you have some concrete things that you think should be changed on the RFC, then let me know, and I will try to incorporate that feedback.

Those are either/or syntaxes; you cannot have

[dependencies]
serde = "1"
[dependencies.serde]
features = ["std"]

because that's attempting to create the key dependencies.serde as both a Table and a String.

An alternative to the "Basic Example" that would be supported is:

[features]
yaml = []

[features.json]
requires = ["json_backend"]

[features.json_foo]
subfeatures = [ "foo_json" ]
provides = ["json_backend"]

[features.json_bar]
subfeatures = [ "bar_json" ]
provides = ["json_backend"]

Where each feature appears once as either a simple array of subfeatures, or a table of subfeatures and other keys.
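
To show what a checker for this layout might look like, here is a hedged Rust sketch. It assumes one possible semantics: every capability must end up with exactly one enabled provider. The `Feature` struct and `check` function are hypothetical names that mirror the `requires` and `provides` keys above; nothing here is an existing Cargo API.

```rust
use std::collections::BTreeMap;

/// A feature entry in table form (field names mirror the TOML keys).
#[derive(Debug, Clone, Copy)]
pub struct Feature {
    pub name: &'static str,
    pub requires: &'static [&'static str],
    pub provides: &'static [&'static str],
}

/// Checks the enabled features: every capability that is required or provided
/// must have exactly one enabled provider. On failure, returns the offending
/// capabilities with their provider counts, sorted by capability name.
pub fn check(enabled: &[Feature]) -> Result<(), Vec<(&'static str, usize)>> {
    let mut providers: BTreeMap<&'static str, usize> = BTreeMap::new();
    for f in enabled {
        for cap in f.requires {
            // Record the requirement even if nothing provides it.
            providers.entry(*cap).or_insert(0);
        }
        for cap in f.provides {
            *providers.entry(*cap).or_insert(0) += 1;
        }
    }
    let bad: Vec<_> = providers.into_iter().filter(|&(_, n)| n != 1).collect();
    if bad.is_empty() { Ok(()) } else { Err(bad) }
}
```

Enabling json together with json_foo passes, while enabling json_foo and json_bar together reports json_backend with two providers.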

Something that I haven't seen discussed. Can a dependency provide a capability? Or, is this limited strictly to features?


[dependencies.serde_json]
version = "1"
provides = [ "json_backend" ]

I love the concept of capabilities. Just to throw it out there, since @withoutboats mentioned wanting to require the binary crate to specify the choice, here is another, slightly different idea. This is a pure addition to the feature system and doesn't change it at all.

# lib/Cargo.toml
[lib.capabilities.json_backend]
# activate features and/or optional deps based on capabilities
json_foo = [ "serde_json" ]
json_bar = [ ]

# default values for capabilities perhaps?
[capabilities]
json_backend = "json_foo"

# bin/Cargo.toml
[capabilities]
# the entire dep graph now understands that for the capability
# "json_backend" we want it to be "json_foo"
json_backend = "json_foo"
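
A rough Rust sketch of how resolution could work under this idea, assuming the binary's [capabilities] table overrides a library's default and that an unknown value is an error. The `resolve` function and its signature are hypothetical, not anything Cargo provides.

```rust
use std::collections::BTreeMap;

/// Resolves one capability for the whole graph (hypothetical model).
/// `options` are the values a library accepts, `default` is the library's
/// fallback, and `root` is the `[capabilities]` table of the binary crate.
pub fn resolve<'a>(
    capability: &str,
    options: &[&'a str],
    default: Option<&'a str>,
    root: &BTreeMap<&str, &'a str>,
) -> Result<&'a str, String> {
    // The binary's choice wins; otherwise fall back to the library default.
    let choice = root
        .get(capability)
        .copied()
        .or(default)
        .ok_or_else(|| format!("no value chosen for capability `{capability}`"))?;
    if options.contains(&choice) {
        Ok(choice)
    } else {
        Err(format!("`{choice}` is not a valid value for `{capability}`"))
    }
}
```

Because the value is chosen once at the root, every crate in the graph sees the same answer, so two libraries cannot disagree about the back-end.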

I could see the community come together with convention on capability names such as tls or async-runtime, etc.

I think the best solution is to add a capabilities section:

[package]
name = "web-framework"
...

[capabilities]
json = ["crate1", "crate2"]
tls = ["rustls", "openssl", "boring-ssl"]

Ideally, a library depending on web-framework should be able to indicate that it requires the tls capability, without setting the exact implementation:

[dependencies]
web-framework = { version = "1", capabilities = ["tls"] }

This requires that the library has the exact same API for each feature that is part of the same capability.

As a result, only the final binary has to specify the exact features that provide the needed capabilities, which makes conflicts very unlikely.

For convenience, it should be possible to propagate capabilities upward in the dependency tree, for example:

# crate a
[features]
feature1 = []   # mutually
feature2 = []   # exclusive

[capabilities]
foo = ["feature1", "feature2"]

# crate b
[features]
feature1 = ["a/feature1"]
feature2 = ["a/feature2"]

[capabilities]
foo = ["feature1", "feature2"]

[dependencies]
a = "1"