Pre-RFC: Cargo mutually exclusive features

Hi everyone,

I started to work on an RFC, trying to solve the GitHub Issue:

https://github.com/rust-lang/cargo/issues/2980 – Mutually exclusive features

In a nutshell, the idea is to annotate cargo features with a "capability" that they can provide, and to ensure that at most one feature providing a given capability is active at a time.

It was suggested to write up an RFC, and post a draft to this forum. As I am new to this, please excuse me if that wasn't the right way to do it.

The pre-RFC branch is here:

https://github.com/ctron/rfcs/blob/feature/cargo_feature_caps_1/text/0000-cargo-mutex-features.md

Cheers

Jens

4 Likes

Thanks for working on this! Pre-RFCs usually have their RFC inline to make it easier to read and respond to. I'll just do that now:

NOTE: This is work in progress!

Based on the idea from: https://github.com/rust-lang/cargo/issues/2980#issuecomment-700687022

Summary

This RFC proposes a way to implement "mutually exclusive" features in Cargo, by introducing the concept of "capabilities" of a feature.

Motivation

Cargo currently supports features, which allow one to use conditional compilation. A common use case is to provide some kind of capability, backed by different implementations, based on the feature selected by the user.

In a number of cases, these implementations are mutually exclusive. While custom build.rs scripts can be used to partly implement this behavior, it is not a great experience, neither from the implementor's side nor from the user's side.

Examples

One example is the crate k8s-openapi. The crate compiles for different Kubernetes versions. However, at most one API version may be active.

To achieve this, the Cargo.toml currently contains features (v1_11 to v1_19), and has a build.rs, which uses custom code to validate this.

Another example is the crate stm32f7xx-hal (and its sibling crates), which require exactly one hardware configuration to be enabled. Again, the Cargo.toml contains a list of features, one for each hardware configuration. It also introduces a meta feature named device-selected, which gets set to simplify checks in the code if a platform was selected.

Guide-level explanation

In addition to an identifier, features can specify one or more capabilities that they provide or require, when the feature is enabled.

Basic example

The following example shows the basic idea:

[features]
json = []
yaml = []
json_foo = [ "foo_json" ]
json_bar = [ "bar_json" ]

[features.json]
requires = ["json_backend"]

[features.json_foo]
provides = ["json_backend"]

[features.json_bar]
provides = ["json_backend"]

In the above example there are two standard features, json and yaml, which can be activated as needed. The json feature additionally declares that it requires the capability json_backend. json_backend is a simple identifier, which is local to the crate. The two additional features json_foo and json_bar both provide the functionality of a JSON backend. Due to some limitation, they are mutually exclusive, and the user has to choose one.

If the user requests the feature json, but does not select a feature providing the capability json_backend, then cargo will fail the build:

$ cargo build --features json
error: Package `example v0.0.0 (/path)` requires the capability `json_backend`, but no feature which could provide
the capability is enabled. Possible candidates are:
  - json_foo
  - json_bar

At the same time, if a user selects more than one feature that provides json_backend, cargo will fail the build as well:

$ cargo build --features json,json_foo,json_bar
error: Package `example v0.0.0 (/path)` has multiple features enabled that provide the mutually exclusive capability
`json_backend`. Enabled features providing this capability are:
  - json_foo
  - json_bar 

Another example

In the area of embedded development, it may be useful to let the user choose buffer sizes at compile time to tweak the memory footprint of an application. The following example shows how this can be achieved using features and capabilities:

[package]
requires = ["buffer-size"]

[features]
default = ["1k"]
1k=[]
2k=[]
4k=[]

[features.1k]
provides = ["buffer-size"]

[features.2k]
provides = ["buffer-size"]

[features.4k]
provides = ["buffer-size"]

Using the requires field, the build requests the user to select a feature providing the capability buffer-size, and at the same time, defaults to the feature 1k, which will provide an implementation of a 1k buffer.

The user has the ability to choose a different buffer size, using the different features. At the same time, cargo will ensure that exactly one provider of buffer-size is enabled.

Reference-level explanation

Add as cargo feature

The use of capabilities would need to be added to the list of cargo features.

Cargo.toml

The model of the cargo file needs to be extended, to allow for additional feature information. Specifically:

Field for "requires"

Add a field requires: Vec<String> to Package, which lists the capabilities required by this crate.

FeatureMap and FeatureValue

Currently, the features section in a Cargo file is expected to be a map of String -> Vec<String>, which is translated into a FeatureMap, defined as BTreeMap<InternedString, Vec<FeatureValue>>.

The type FeatureValue is currently defined as:

pub enum FeatureValue {
    Feature(InternedString),
    Crate(InternedString),
    CrateFeature(InternedString, InternedString),
}

Instead of having Vec<FeatureValue> as the map's value, we would add a FeatureInformation (or other name) struct:

pub struct FeatureInformation {
    capabilities: Vec<InternedString>,
    enables: Vec<FeatureValue>,
}

Which would be represented in the Cargo.toml as:

[dependencies]
bar = "*"

[features]
#foo = [ "bar", "bar/feature", "baz"] # should still work
foo = { capabilities = ["cap1"], enables = ["bar", "bar/feature", "baz"] }

# alternatively
[features.foo]
capabilities = ["cap1"]
enables = ["bar", "bar/feature", "baz"]

Of course the existing format (commented out in the above example) should still work. However, it doesn't support the use of capabilities.

Aside from moving the string array to the enables field, I would not change its format. This should help people migrate to the new format, as you only need to shift the values to the new field.
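To illustrate how both manifest forms could end up in the same internal model, here is a minimal Rust sketch. It is not Cargo's actual code: `ManifestFeature` and `normalize` are hypothetical names, and plain `String`s stand in for `InternedString` and `FeatureValue`.

```rust
/// Hypothetical in-memory form of one `[features]` entry as written in Cargo.toml.
#[derive(Debug, Clone, PartialEq)]
enum ManifestFeature {
    /// Existing format: `foo = ["bar", "bar/feature", "baz"]`.
    Simple(Vec<String>),
    /// Proposed format: `foo = { capabilities = [...], enables = [...] }`.
    Detailed {
        capabilities: Vec<String>,
        enables: Vec<String>,
    },
}

/// Simplified stand-in for the proposed `FeatureInformation` struct.
#[derive(Debug, Clone, PartialEq)]
struct FeatureInformation {
    capabilities: Vec<String>,
    enables: Vec<String>,
}

/// Converts either manifest form into the single internal representation.
/// The existing format simply yields an empty capability list.
fn normalize(feature: ManifestFeature) -> FeatureInformation {
    match feature {
        ManifestFeature::Simple(enables) => FeatureInformation {
            capabilities: Vec::new(),
            enables,
        },
        ManifestFeature::Detailed { capabilities, enables } => FeatureInformation {
            capabilities,
            enables,
        },
    }
}
```

With this shape, the rest of the feature handling would only ever see `FeatureInformation`, regardless of which format the manifest used.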

Add additional validation

When validating, cargo builds up a list of capabilities: Map<String, Vec<&FeatureInformation>>, listing the found capabilities, and their providing features. Now it validates:

  • that the number of enabled features providing a capability is less than 2
  • that for each required capability (for package and feature) the number of features is exactly 1
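The two checks above could be sketched in Rust roughly as follows. This is only an illustration under simplifying assumptions: `EnabledFeature` and `validate` are hypothetical names, capabilities are plain strings, and only the already-enabled features are passed in.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified stand-in for one enabled feature: its name, the capabilities
/// it provides, and the capabilities it requires.
#[derive(Clone, Copy)]
struct EnabledFeature<'a> {
    name: &'a str,
    provides: &'a [&'a str],
    requires: &'a [&'a str],
}

/// Checks both rules for the set of enabled features.
/// `package_requires` lists the capabilities required by the package itself.
fn validate(package_requires: &[&str], enabled: &[EnabledFeature<'_>]) -> Result<(), String> {
    // Map each capability to the names of the enabled features providing it.
    let mut providers: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for feature in enabled {
        for cap in feature.provides {
            providers.entry(*cap).or_default().push(feature.name);
        }
    }

    // Rule 1: fewer than two enabled providers per capability.
    for (cap, names) in &providers {
        if names.len() > 1 {
            return Err(format!(
                "multiple features provide `{}`: {}",
                cap,
                names.join(", ")
            ));
        }
    }

    // Rule 2: every required capability (package or feature) has a provider.
    // Together with rule 1 this means exactly one provider.
    let mut required: BTreeSet<&str> = package_requires.iter().copied().collect();
    for feature in enabled {
        required.extend(feature.requires.iter().copied());
    }
    for cap in required {
        if !providers.contains_key(cap) {
            return Err(format!("no enabled feature provides `{}`", cap));
        }
    }
    Ok(())
}
```

For the basic example, enabling json and json_foo passes, while enabling json alone, or json with both json_foo and json_bar, returns an error.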

Conditional compilation

It would be possible that no selected feature provides a certain capability. Thus it makes sense to allow conditional compilation also using capabilities (in addition to #[cfg(feature="foo")]).

Conditional compilation can use the attribute capability, which is handled the same way feature is, just with capabilities:

#[cfg(capability="json_provider")]
fn get_provider() -> Option<Provider> {
    Some(json_provider::new())
}

#[cfg(not(capability="json_provider"))]
fn get_provider() -> Option<Provider> {
    None
}
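On the Cargo side, this could work much like features do today, where each enabled feature is passed to rustc as `--cfg feature="name"`. A minimal sketch, assuming a hypothetical helper `capability_cfg_args` and that the `capability` cfg key is passed the same way:

```rust
/// Builds the extra `rustc` arguments for the capabilities provided by the
/// enabled features, mirroring how `--cfg feature="..."` is passed today.
/// The `capability` key is the one proposed in this RFC, not an existing one.
fn capability_cfg_args(provided: &[&str]) -> Vec<String> {
    let mut args = Vec::new();
    for cap in provided {
        args.push("--cfg".to_string());
        args.push(format!("capability=\"{}\"", cap));
    }
    args
}
```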

Capability identifier/name

The capability identifier should be limited to: ASCII letters, digits, _, and -. A future extension (see below) could be a "global capability", which could be scoped by adding a prefix like global:.
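The naming rule could be checked with a small function like the following sketch. The name `is_valid_capability_name` is hypothetical, and rejecting the empty string is an additional assumption not stated above.

```rust
/// Returns true if `name` only uses ASCII letters, digits, `_`, and `-`.
/// Assumption: an empty name is rejected as well.
fn is_valid_capability_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```

Since `:` is not an allowed character, a prefix such as global: stays available for a later extension without clashing with existing names.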

Drawbacks

An argument against doing this could be that, using build.rs, you can already achieve similar results without the need to introduce additional concepts. While this is partially true, it is not a very user-friendly experience.

Additionally, if different people create different ways of processing situations like this in Rust code, contained in build.rs, it may be complicated over time as different concepts and ways of handling similar situations evolve.

Also, it is problematic (or even impossible) to create proper tooling on top of logic contained in build.rs files.

Rationale and alternatives

NOTE: to be written …

  • Why is this design the best in the space of possible designs?
  • What other designs have been considered and what is the rationale for not choosing them?
  • What is the impact of not doing this?

This design solves the original use case, of providing mutually exclusive features. It does this by using a clearly defined piece of information, the "capability". It also re-uses the existing feature definition and format (in Cargo.toml) as far as possible. So it isn't completely new, just an extension.

Additionally, it leverages the capability concept to allow conditional compilation and "requiring" of a capability, which plays along with the original use case.

Leveraging the concept at a global level is left as a future task (see below).

As the feature definition (FeatureInformation) could now easily be amended with additional information (like "conflicts", "deprecated", …), it feels like a future-proof, yet simple extension to the current information model.

However, the current proposal stops at this point and leaves out implementing additional capability requirements (also see Prior art).

Cargo.toml feature definition

An alternate definition of capabilities for features in the cargo file could be to amend the FeatureValue enum:

pub enum FeatureValue {
    Feature(InternedString),
    Crate(InternedString),
    CrateFeature(InternedString, InternedString),
    Capability(InternedString),
}

In Cargo.toml this could be represented as:

[dependencies]
bar = "*"

[features]
foo = ["provides:bar", "bar", "bar/feature", "baz"]
       ^               ^      ^              ^
       |               |      |              |--- Feature
       |               |      |
       |               |      |--- CrateFeature
       |               |
       |               |--- Crate
       |
       |--- Capability

However, this would require changing the way FeatureValue is parsed from a string. This would mean putting more logic into the specialized format. I think that the value/format of the features is already overloaded. So it makes sense to provide some actual fields, like the dependencies have as well.
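To make the extra parsing logic concrete, here is a minimal sketch of what such a parser might look like. It is not Cargo's actual implementation: `parse_feature_value` is a hypothetical name, and because telling a feature from an optional dependency needs the dependency list, both are collapsed into one `FeatureOrCrate` variant here.

```rust
/// Simplified version of the amended enum, using plain strings.
#[derive(Debug, PartialEq)]
enum FeatureValue {
    /// A plain name: another feature or an optional dependency.
    FeatureOrCrate(String),
    /// `crate/feature`
    CrateFeature(String, String),
    /// `provides:capability` (the proposed new form)
    Capability(String),
}

/// Parses one string from a feature's value list.
fn parse_feature_value(value: &str) -> FeatureValue {
    if let Some(cap) = value.strip_prefix("provides:") {
        FeatureValue::Capability(cap.to_string())
    } else if let Some((krate, feature)) = value.split_once('/') {
        FeatureValue::CrateFeature(krate.to_string(), feature.to_string())
    } else {
        FeatureValue::FeatureOrCrate(value.to_string())
    }
}
```

Every new kind of information would add another prefix and another branch here, which is the overloading concern described above.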

Prior art

Package managers like RPM, deb, …

Package managers like RPM and others already have the concept of capabilities. It is possible to define dependencies on specific other packages, or on "capabilities", which may be provided by different packages.

For example, an RPM of sendmail may declare that it provides the capability of MTA (mail transfer agent). http://rpmfind.net/linux/RPM/fedora/32/x86_64/s/sendmail-8.15.2-43.fc32.x86_64.html. At the same time, other packages can provide the same capability: http://rpmfind.net/linux/rpm2html/search.php?query=MTA

A crate can be seen as a "software package" and Cargo as a package manager.

A key difference from package managers like RPM is that RPM doesn't consider a capability "mutually exclusive". This can, however, be achieved with additional rules, like "Conflicts". Such package managers also support weak dependencies, like "Suggests" or "Recommends". However, adding such concepts makes things more complex, harder to implement, and harder for the user to understand.

OSGi

The Java programming language itself did not have a module concept before Java 9. However, OSGi did provide a "bundle" concept on top of Java before that. Dependencies between bundles can be declared explicitly by referencing another bundle, or using generic capabilities. See: https://blog.osgi.org/2015/12/using-requirements-and-capabilities.html

"Bundles" in OSGi would map to "crates" in Rust.

While the capabilities concepts have been re-used in OSGi package managers (like P2 or BND), OSGi itself enforces the capabilities during runtime. Which Rust/Cargo would not do.

For OSGi it is also possible to use much more complex definitions, providing actual values in the capabilities, and enforcing requirements using LDAP queries. While this allows one to create rather complex constructs, it also makes things harder to understand and maintain, both from an implementation and from a user perspective.

If such complex scenarios are required, this proposal still allows for the use of custom code in the build.rs script to implement more complex requirements.

Unresolved questions

In the future there should definitely be a second look at the use of "global capabilities" (see below).

Future possibilities

Global example

The focus of this RFC is on crate-local capabilities. However, it would be reasonable to expand this also on a global level.

A common example in the embedded space is the "panic handler" or "allocator", which must be selected, but must also be unique on the whole application.

Currently, people simply comment out alternatives (https://github.com/rust-embedded/cortex-m-quickstart/blob/master/Cargo.toml).

For example, if it would be possible to declare global capabilities as well, different crates could offer different panic handler implementations. Cargo could then ensure that exactly one is part of the build tree.

Consider the application having the following Cargo.toml

requires = ["global:panic-handler"]

[dependencies]
panic-rtt-target = { version = "0.1.1", features = ["cortex-m"] }
panic-halt = "0.2.0"
panic-itm = "0.4.1"

[features]
panic-by-rtt = ["panic-rtt-target"]
panic-by-halt = ["panic-halt"]
panic-by-itm = ["panic-itm"]
panic-by-mock = []

[features.panic-by-mock]
provides = ["global:panic-handler"]

Additionally, assume the dependencies panic-rtt-target, panic-halt and panic-itm each have a Cargo.toml containing:

provides = ["global:panic-handler"]

A provides declaration on the global section defines that a crate will always provide this capability.

With this configuration, Cargo will now ensure that exactly one implementation of a panic-handler is enabled in the build. It can also provide guidance on features to enable or disable.

And my response: this is a decent way of going about it, but it feels like it doesn't quite fit in with how other stuff in the Cargo manifest works. In particular, it feels weird that capabilities don't really get defined anywhere -- they are just 'provides'ed and 'requires'ed. I also worry that adding yet another concept here will make manifests using this feature substantially harder to understand.

This older Pre-Pre-RFC is also somewhat related, if you haven't done so maybe read through it and link to it as well? Pre-Pre-RFC: making `std`-dependent Cargo features a first-class concept. The still experimental namespaced-features work is also relevant reading, I think: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#namespaced-features.

Another alternative I can think of is to have ! prefixes, which you can simply use in feature values. So this:

[features]
json_foo = ["foo_json", "!bar_json"]

would amount to not allowing the json_foo feature to be selected at the same time as the bar_json feature. (Note that your basic example doesn't actually define foo_json either -- is this an optional dependency? For clarity, it would probably be helpful to make it explicitly a feature, assuming that optional dependencies result in implicitly defined features of the same name.) This has the advantage that it's easier to understand in simple cases and doesn't introduce a new concept, although it will get messier in the case where you have many alternatives for a single "capability".
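As a rough illustration of how such a check could work, here is a minimal Rust sketch. The `!` prefix is only the idea floated above, not existing Cargo syntax, and `find_conflict` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Returns the first pair (feature, excluded feature) where both are enabled.
/// `features` maps each feature name to its value list, where a leading `!`
/// marks a feature that must not be enabled at the same time.
fn find_conflict<'a>(
    features: &BTreeMap<&'a str, Vec<&'a str>>,
    enabled: &[&str],
) -> Option<(&'a str, &'a str)> {
    for (&name, values) in features {
        if !enabled.contains(&name) {
            continue;
        }
        for &value in values {
            if let Some(excluded) = value.strip_prefix('!') {
                if enabled.contains(&excluded) {
                    return Some((name, excluded));
                }
            }
        }
    }
    None
}
```

With N alternatives, each feature would need N-1 such `!` entries, which is where this gets messy.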

Finally, cc @ehuss.

Thanks for this pre-RFC. This is definitely a real problem that users experience, and one which deserves careful attention and a better solution than what we have.

The trickiest design constraint that plays into this feature is our desire to encourage ecosystem compatibility. Both Rust and cargo have made design decisions (including the design of the current cargo features feature to be additive) to encourage that compiling two different packages together does not result in errors. This isn't a hard guarantee, but we try not to design features that would encourage incompatibility. As a result, I think users experience a lot less "DLL hell" with cargo and Rust than with other tools, because it's very rare for two dependencies to be incompatible with one another.

Mutually exclusive features are naturally incompatible with one another. As a result, I would be inclined toward a design in which only the end binary can select which of these features they wish to turn on; libraries would have to be compatible with all feature selection.

13 Likes

This looks really interesting, although I do not have time to give it the careful attention it deserves. Two important points will need to be addressed before this can be accepted:

  • How it will be stored in the index. Specifically how do we arrange for older Cargos to either ignore crates that use this or do something reasonable.
  • To what degree will the resolver backtrack in cases where there is a conflict deep in the tree. It sounds like the answer is none: it will pick the crates you asked for and error when it starts the build.

But mutually exclusive features are already possible in the current system though. Or are you saying that baking this feature in would make it more likely?

Officially, mutually-exclusive features are not supported. Authors are discouraged from using existing additive features as if they were mutually-exclusive.

I guess the idea is that if it's not officially supported, crate authors will try to use some other approach, or at least consider what to do when more than one feature gets selected.

I have a use-case for a mutually-exclusive feature: selecting whether a sys crate should be linked statically (and optionally built from source) or dynamically.

This is tricky, because a good default for it depends on the OS and the sys crate. You want to dynamically link to libraries that come with the OS, but statically link everything else. OSes vary greatly in what they have available for linking. Also some apps/libraries may need a very specific version of a sys dependency, so they may prefer static linking in all cases.

That static vs dynamic choice doesn't fit into additive features. I'm not sure if it fits into provides/capability model either, but that's something to consider.

2 Likes

@ctron What is the basis for limiting this feature to ASCII? Rust uses UTF-8. Less than 5% of the world's population have English as their native language, and 80% don't have any real facility in English. [These statistics were found by a trivial Google search.]

Why should a program written for and maintained by colleagues who write in Arabic or Thai or Hanzi have to limit use of this feature to the subset of words which use only unaccented vowels and consonants that can be written in ASCII? That seems to me rather naïve. (Note that the word naïve cannot be spelled correctly using only ASCII.)

One pattern I've found helpful when trying to deal with conflicting features today is to assign them a sort of priority/precedence (manually, through some annoying #[cfg(all(feature = "priority2", not(feature = "priority1")))]-style gating). With this approach, if you enable both the priority1 and priority2 features, the priority1 feature takes precedence.

This has been handy for toggling on various backends when only one can be selected, but selecting for the highest performing backend.

It'd be nice if this were a bit easier. It has the benefit of being conflict-free, which I think is more in-line with how Cargo features are expected to work. Granted it's unhelpful in cases where backends are truly mutually exclusive and there's no clear precedence over when one should be selected over the other.

1 Like

You can simplify that slightly by using a build.rs that checks the active features, then sets an internal cfg flag for the one that is actually active, then use that instead of the features in your cfg() tests.

1 Like

Thanks for taking care of putting the post in the right format.

The drawback of the "!" style syntax is that a) it puts more complexity into the string value of the feature. Basically, this introduces a "conflicts" feature value, which is itself a new concept, and it makes future enhancements more complicated as well. Which leads to b): adding the concept of a "capability" allows you to define the requirement that you must select one feature to provide it. I don't think this would be possible with the ! syntax.

I think I can do a better job of describing this, and of explicitly mentioning the ! syntax proposal in the RFC. Do you think that would help?

I think this mixes two things: a) who selects the feature and b) whether that selection is valid. I agree that the final application (not the library) should select the feature set, and that is still the case. This RFC only introduces a way to verify that the selection the user has made actually works.

In some cases you have conflicting features, and it is hard or impossible to design a way around the conflict (e.g. the buffer size use-case or the Kubernetes API case). In cases like this, a library cannot support all features at the same time.

I just found out that the forum prefers to answer in a single post, so I will put the other comments all in this one:

  • I don't have any detailed answer yet: older versions of Cargo would/should simply ignore that, and would run into compilation issues as they would now. So I don't think that changes much. Only when you use a cargo version that has support for this, you would get a proper error message instead.
  • As the "global" version of capabilities is out of scope (future enhancement) this should be on a per-crate level for now. So each crate would be verified individually during the build.

I don't see how. Neither do others. Can you give a pointer on how to do this?

You are right! With "ASCII" I meant "letters and digits", not the original 7-bit ASCII. My motivation is to have an identifier, and a separator. So that later on, it would be possible to introduce a prefix like "global:". Similar to the "crate/feature" syntax of the existing "feature value". I don't think there is much value in using emojis as capabilities, although it may be fun. :grin: … Of course it would also be possible to add a different namespacing method later on, and simply allow all of UTF-8 as a capability identifier.

This RFC tries to simplify this and give a proper tool for exactly this use-case. I understand that you can do many more things in a build.rs. However, as I tried to reason in the RFC, this will cause more complex logic to be written, which needs to be maintained, and which will diverge and bit-rot over time. It also doesn't allow tooling to be used to manage it. And just imagine how complicated that expression will get when you have 10 features that are mutually exclusive.

I think the case of:

ought to be mentioned. It's an approach that is cumbersome for the library author(s), but has the advantage of solving some issues quite magnificently: the APIs with and without the rc "feature" are incompatible, and yet you can have both versions coexist within a dependency tree thanks to the crate having been duplicated.

Although I reckon this will be no panacea, since it has the following drawbacks:

  • if two features are incompatible because they export, linker-wise, the same unmangled symbol, then splitting those into two crates will not solve it;

  • it also has the issue of a combinatorial explosion of the total number of published crates :sweat_smile:

Still, I think it is worth mentioning for the other cases :slightly_smiling_face:

That's why I mentioned using a build.rs script to not write that expression:

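// build.rs
use std::env;

/// Returns true if Cargo enabled the given feature for this build.
/// Cargo exposes each enabled feature as `CARGO_FEATURE_<NAME>`, with the
/// name uppercased and `-` replaced by `_`.
fn feature_active(feature: &str) -> bool {
    let feature = feature.to_uppercase().replace('-', "_");
    let name = format!("CARGO_FEATURE_{}", feature);
    env::var_os(&name).is_some()
}

fn main() {
    // Listed in priority order: the first active feature wins.
    let executor = ["tokio", "async-std"]
        .iter()
        .find(|feature| feature_active(feature))
        .expect("an executor to be configured");
    // The cfg value has to be quoted to form `executor="tokio"`.
    println!("cargo:rustc-cfg=executor=\"{}\"", executor);
}

// in code
#[cfg(executor = "tokio")]
mod tokio;
#[cfg(executor = "async-std")]
mod async_std;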

Adding more features to this priority-override-mutually-exclusive-set is just adding them to the array to test for.

Perhaps you're referring to something different from the use case I described, but the one I gave was more along the lines of:

Given features x, y, and z:

  • have x override y and z if either or both are activated
  • have y override z if both are activated (but x is not)

i.e. a declarative conflict-free approach to selecting a "winning" feature from a set of mutually exclusive ones. Another way of thinking about it is assigning priorities to conflicting features, and picking the highest priority feature if all of them are activated.

However as far as I can tell from this pre-RFC, the described mechanism has inherent conflicts rather than a way to resolve activation of multiple mutually exclusive features, particularly based on this part:

Now it validates:

  • that the number of features providing a capability is less than 2
  • that for each required capability (for package and feature) the number of features is exactly 1

It also seems like you could update the RFC to introduce conflict resolution (if it's already in there, I missed it, mea culpa), perhaps by restructuring the provides attribute into something like a providers attribute which contains an ordered list by preference. Just spitballing here, based on your previous example:

[features]
json_backend = { providers = ["json_foo", "json_bar"] }
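A minimal sketch of how such an ordered list might be resolved, assuming a hypothetical helper `pick_provider` and that "first enabled entry wins" is the intended rule:

```rust
/// Picks the winning provider for a capability.
/// `providers` is ordered by preference; the first one that is also enabled
/// is selected, so enabling several providers is not an error.
fn pick_provider<'a>(providers: &[&'a str], enabled: &[&str]) -> Option<&'a str> {
    providers.iter().copied().find(|p| enabled.contains(p))
}
```

Here enabling both json_foo and json_bar would select json_foo, while enabling neither yields no provider at all.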

Which leads to the problem that everyone who wants a similar mechanism needs to write custom Rust code. Everyone does it a bit differently, and everyone has to maintain that logic. Tools cannot process/understand it, as it is not declared, but custom logic.

Besides, you now have two locations defining features, the build script and the Cargo file.

You still can use this approach though, if you prefer.

The focus of this RFC is not to resolve conflicts, only to detect them. In the end, the user enables the features as needed. This RFC can help in avoiding conflicts in the code, by pointing them out, through the defined capabilities.

I can imagine, that future improvements could build an automatic resolution on top of this. As additional information can be added to features now.

This is a very low bar. I'm not sure if that is sufficient.

Compare to semver-major crate requirement conflicts. Cargo not only detects when dependencies want incompatible versions of crates, but also automatically mitigates the problem by including multiple copies of the crate. I have some projects where duplicated crates never happen, but I also have some projects where I have dozens of duplicated crates, and it's too hard to deduplicate them (e.g. because some crates use old dependencies and upgrading them would require a major rewrite). If Cargo only detected semver-major conflicts, I would be completely stuck and unable to compile such projects.

Another package manager I use is Debian's apt. It can detect version conflicts, but it can't mitigate them. That works mostly fine, until it doesn't. If I need to install a Debian package that requires a different version of some core package, it can cause an avalanche of upgrades that can render the whole system unusable.

Detecting conflicts is better than silent failures, but OTOH some conflicts can be completely unsolvable dead-ends. Users need to have the ability and help from tooling to overcome the conflicts.

2 Likes