Differentiating semver-safe and semver-unsafe #[cfg] usage?

While discussing this pre-RFC, one of the issues raised so far is that it creates a potential semver hazard. It occurs to me that this isn't a problem with #[cfg] globally (in general, cfg is allowed to break semver), but only with a subset of #[cfg] usage. Primarily this applies to features, which are intended to be additive-only and not break semver.

Has there ever been any discussion of forking #[cfg] into two forms -- a semver-safe version (disallowed in positions that could break semver), and a semver-unsafe version (allowed more permissively), and restricting features to only be used in the former?

This would provide, I think, three benefits:

  • It would clean up situations right now where features can break semver, despite recommendations not to (e.g. #[cfg(feature = "...")] decorating a function parameter).

  • It would allow semver-unsafe functionality, like custom conditionals, to be added to the language more permissively, in positions currently unsupported due to semver-safety concerns.

  • Like unsafe in general, it would draw code authors' attention to where they might be breaking semver, and would allow analysis tools to warn about potential API compatibility breakage.
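
To make the first bullet concrete, here is a minimal sketch of a feature-gated function parameter. The function `area`, the caller, and the feature name "verbose" are all made up for illustration; the point is only that turning the feature on changes the signature and breaks callers written against the feature-less version.

```rust
// Hypothetical library function. With the (made-up) "verbose" feature off,
// the signature is `fn area(width: u32, height: u32) -> u32`.
// With the feature on, a third parameter appears, so every existing
// two-argument call stops compiling: enabling the feature is not additive.
pub fn area(
    width: u32,
    height: u32,
    #[cfg(feature = "verbose")] label: &str,
) -> u32 {
    // This statement only exists when the parameter it uses exists.
    #[cfg(feature = "verbose")]
    println!("computing area for {label}");
    width * height
}

// A downstream caller written against the feature-less signature.
// It compiles only while "verbose" is disabled.
pub fn caller() -> u32 {
    area(3, 4)
}
```

Because Cargo unifies features across the dependency graph, some other crate enabling "verbose" would be enough to break `caller`, which is the hazard a semver-safe form of #[cfg] would presumably reject.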

There are situations where you explicitly want to make changes that break semver on a compiler flag. Consider, for example, a demo build of a video game stripping out code used only in the full version, and doing so in some cases on individual function parameters or traits. Here, semver isn't a concern -- this is an application build with no aspirations of library-level compatibility or a public API.

Can you elaborate more on what the different positions would be, and how the compiler would check them?

There's a ton of stuff that could break semver, and checking stuff that's cfg'd out is hard -- after all, it might be referencing stuff that doesn't even exist without the cfg.

Good question, and that's tricky. My limited understanding of the feature guidance is that "enabling a feature shouldn't break code that already compiles against that library".

More likely "safe" #[cfg] positions in that case would be:

  • Decorating a function signature
  • Decorating a struct declaration

More likely "unsafe" #[cfg] positions would be:

  • Decorating an individual parameter of a function
  • Decorating a trait impl (potential repeat impl or coherence problems)

It's probably nontrivial to assess all of these and definitively rule in/out semver safety. That said, I think my main concern is finding ways to actually permit more semver-breakage, but in a disciplined way, primarily for application development where you're not concerned about supporting a semver'd public API.

One place where you explicitly want semver-unsafe cfg abilities is when features are being used (misused?) to enable explicitly unstable/breaking APIs. Stability attributes are generally a better solution for this, but those have been unstable/mostly-std-only for way too long...

This is often necessary for optional dependencies. For example:

#[cfg(feature = "serde")]
impl serde::Serialize for MyType { ... }

#[cfg(feature = "serde")]
impl<'de> serde::Deserialize<'de> for MyType { ... }

My understanding of the intent of the coherence rules is to make sure that any given trait impl should be valid in only one crate, for the express purpose of making conditional and future trait impls semver-safe.

I guess my questions then are, in the context of RFCs to expand the scope and power of #[cfg]:

  • Is the responsibility of features not breaking semver actually a responsibility of #[cfg] as a whole (and the responsibility of potential expansions to #[cfg] usage)?

  • If so, is there a way to separate out the usages of #[cfg] that are semver-sensitive (again, mainly just features) from those that don't care about semver hazards?

#[cfg] is an insanely complex thing to keep track of. Simply ensuring that all possible configurations compile is already hard enough. Tracking that none of those configurations break semver borders on impossible for moderately complex #[cfg] usages.

Thus I don't think that separating "semver-stable" and "semver-unstable" is possible, makes any sense or would be a good addition to the language. You can, of course, explicitly introduce features which gate semver-unstable changes, but that's something that can be decided at the library level, and there is no way to enforce proper usage at the language level.

With regard to the #[cfg] on where clauses proposal, I consider it too complex in general. Sure, it may look convenient to just #[cfg]-out some bits of code here and there, but the end result of such a process is unreadable code with a huge combinatorial complexity of feature interactions. I have PTSD from copious use of ifdef-gated code in some C/C++ projects. That's the last thing I'd want to see in Rust.

At first glance it may look easier to gate a few where-clauses and a few statements in the function body, but in the end it's much easier to read, and thus maintain, if you split it into several entirely different implementations, or better, if you are able to abstract the diverging logic. For example, you can introduce a new trait which encodes the possible logical differences, and #[cfg]-gate its different implementations. The consuming function would just use the trait as normal, without any #[cfg]-directives.
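
A rough sketch of that trait-based approach, borrowing the demo-build scenario from earlier in the thread. The trait `Edition`, the type `Game`, the function `can_play`, and the feature name "demo" are all hypothetical, and this assumes an application crate where the two configurations are never mixed.

```rust
// Hypothetical trait that captures the one place where the builds differ.
pub trait Edition {
    fn name() -> &'static str;
    fn max_level() -> u32;
}

pub struct Game;

// Exactly one of the two impls below is compiled, depending on the
// (made-up) "demo" feature. All the #[cfg] noise lives here.
#[cfg(feature = "demo")]
impl Edition for Game {
    fn name() -> &'static str {
        "demo"
    }
    fn max_level() -> u32 {
        3
    }
}

#[cfg(not(feature = "demo"))]
impl Edition for Game {
    fn name() -> &'static str {
        "full"
    }
    fn max_level() -> u32 {
        50
    }
}

// The consuming function uses the trait as normal and carries no #[cfg]
// of its own, so its signature is the same in every configuration.
pub fn can_play<E: Edition>(level: u32) -> bool {
    level <= E::max_level()
}
```

The gating stays on whole impl blocks, so `can_play::<Game>(level)` reads the same in both builds and only the trait impls need to be checked per configuration.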
