Suppose that a crate's API has a public struct that wraps an enum that's an implementation detail:
// Initial API
pub struct Foo { inner: FooInner }
enum FooInner { A, B, C }
Intuitively one might think that the crate can later decide to expose that internal enum without breaking backwards compatibility:
// API is later changed to this
pub enum Foo { A, B, C }
However, this is technically not backwards-compatible. This is because client code might match the Foo type with a rest pattern (Foo { .. }). This is because, currently, a rest pattern only matches structs, and doesn't match enums.
// Client code.
// Works with the initial API, but doesn't work with the later API.
fn do_something(x: Foo) {
match x {
Foo { .. } => {}
}
}
To fix this, I propose that a pattern like Foo { .. } should match any value with type Foo, regardless of what Foo is. Any thoughts?
I've had this thought myself but not mentioned it anywhere.
More broadly, the principle is that it should be a non-breaking change to replace a struct that has entirely private fields with a different type (that implements all the same traits). Thus, the next question to ask is: Are there any other language features that would have to change to support this, or is struct pattern matching the only one?
cargo-semver-checks maintainer here. TIL about this accidental breaking change! Thanks for writing it up here.
Until Rust is tweaked to make this no longer breaking, I think this is something we should lint for and warn users about, since it would be awful for a maintainer to find out about this edge case only after publishing an accidental breaking change.
Uniformity was the reasoning in accepting Foo { .. } for tuple-structs and unit-structs, so full from me. Technically this should be a small RFC, but I think it could "sneak" in with just lang FCP under existing RFC'd design guidance.
(Disclaimer: I'm on T-opsem, which is a subteam of T-lang, but have no special input on T-lang decisions. Barring maybe a greater familiarity with the grammar edge cases than typical given previous involvement with WG-grammar, rust-analyzer's AST, and syn.)
Note that becoming an enum or union is not the only possible change, because newly-declared nominal types are not the only kinds of types that can be given a name, thanks to type aliases. One could also imagine replacing a struct-with-private-fields type with:
Type-alias-impl Trait opaque type (when that stabilizes)
Type alias for an array or tuple
Type alias for a slice
Type alias for dyn Trait, if the original type was !Sized
It's separately a breaking change to make two types equal that weren't before, because it could create a trait implementation overlap downstream — but all of the above can be made to not overlap using a newly introduced contained type, e.g.
// Old
pub struct Foo([Bar; 8]);
struct Bar(u32);
// New
pub type Foo = [Bar; 8];
pub struct Bar(u32);
So, if we decide Rust should follow the principle that a struct with private fields is the maximally opaque type that can always be replaced with some other kind of type, then Foo { .. } needs to match almost anything, not just enums and unions in addition to structs.
Other solutions to the problem that are not expanding what the pattern can be applied to:
Shrink instead: prohibit Foo { .. } except when Foo has at least one visible field, or no fields. That way, all the cases where this pattern can be used are cases where changing the type away from a struct would be a breaking change.
This would have to be done across an edition, so it wouldn't be completely reliable protection since old editions can still be used.
Introduce a new kind of explicitly opaque newtype (or an attribute modifying structs), which cannot ever be matched by any syntax but the general ones (_ and ident). This is the same as the previous except that it is opt-in by the definer of the type.
Another option: change the spelling of the “match a type but nothing else” pattern (leaving the old one intact, of course). Foo _, for example (though probably not a good one).
EDIT: I suppose this doesn’t fix the problem unless the old syntax is at least deprecated.
Another solution is combining the two approaches: Have a new #[opaque] attribute that prevents external users from matching the struct with a rest pattern, and a #[not_opaque] to allow the matching. (Exact syntax is bikesheddable.) In the current edition, default to #[not_opaque]. In a future edition, default to #[opaque].
Good point. So, reduce my list of candidates to TAIT and dyn Trait, both of which can introduce actually completely new types (that still aren't enum or union types).
Oddly, the pattern Foo(..) can only match a tuple-like struct named Foo whose fields are all public. However, the pattern Foo {..} can match any struct named Foo at all.
I cannot manage to recall exactly where, but I know I've seen a Struct { .. } pattern used in a macro to check the type of a place. Although, _: Struct should generally be as good as Struct { .. }, so I don't recall why Struct { .. } was used, unfortunately… best inference is that it's generic related (Struct { .. } is fully unconstrained on Struct's generics whereas _: Struct only accepts the zero-generic instantiation). I do think minimizing coercion options was a part of it.
That's an interesting point. I guess maybe we should allow u32<..> and Vec<..> and such -- after all, u32<>already works -- as a way to say "look, there might be generics here, and I don't care". Kinda like how Vec::from(x) works without needing to be Vec::<_>::from(x) -- or Vec::<_, _>::from(x) I suppose, with allocators -- and thus we could say Vec::from is Vec::<..>::from.
Interestingly, Vec::from is Vec::<>::from. An empty or missing <> in an expression-path means "infer all generics," but an empty or missing <> in a type-path means "default all generics." This is an unreasonable level of annoying to me for how irrelevant it really is.
Hello fellow Crustaceans,
I hope that this is not too off-topic.
I have collected a few examples as testcases how this should be handled in cargo-semver-checks.
Since rust-SemVer is a bit new for me, did I miss an edge case?
I have omitted these testcases, as those are otherwise covered by different (less edgecase-y) lints:
the struct ⇒ type def changes mentioned by @kpreid.
an empty struct which is marked #[non_exhaustive].
Update (for those only watching this forum):
These lints relating to this post were released in cargo-semver-checks=v0.36.0, meaning that projects doing this change now get an appropriate warning: