Rest patterns (`Foo { .. }`) should match non-struct types

Suppose that a crate's API has a public struct that wraps an enum that's an implementation detail:

// Initial API
pub struct Foo { inner: FooInner }
enum FooInner { A, B, C }

Intuitively one might think that the crate can later decide to expose that internal enum without breaking backwards compatibility:

// API is later changed to this
pub enum Foo { A, B, C }

However, this is technically not backwards-compatible. This is because client code might match the Foo type with a rest pattern (Foo { .. }). This is because, currently, a rest pattern only matches structs, and doesn't match enums.

// Client code.
// Works with the initial API, but doesn't work with the later API.
fn do_something(x: Foo) {
    match x {
        Foo { .. } => {}
    }
}

To fix this, I propose that a pattern like Foo { .. } should match any value with type Foo, regardless of what Foo is. Any thoughts?

12 Likes

I've had this thought myself but not mentioned it anywhere.

More broadly, the principle is that it should be a non-breaking change to replace a struct that has entirely private fields with a different type (that implements all the same traits). Thus, the next question to ask is: Are there any other language features that would have to change to support this, or is struct pattern matching the only one?

5 Likes

:wave: cargo-semver-checks maintainer here. TIL about this accidental breaking change! Thanks for writing it up here.

Until Rust is tweaked to make this no longer breaking, I think this is something we should lint for and warn users about, since it would be awful for a maintainer to find out about this edge case only after publishing an accidental breaking change.

I've opened an issue for writing a cargo-semver-checks lint for this case: New lint: public API struct became enum or union · Issue #954 · obi1kenobi/cargo-semver-checks · GitHub

8 Likes

Uniformity was the reasoning in accepting Foo { .. } for tuple-structs and unit-structs, so full :+1: from me. Technically this should be a small RFC, but I think it could "sneak" in with just lang FCP under existing RFC'd design guidance.

(Disclaimer: I'm on T-opsem, which is a subteam of T-lang, but have no special input on T-lang decisions. Barring maybe a greater familiarity with the grammar edge cases than typical given previous involvement with WG-grammar, rust-analyzer's AST, and syn.)

I remember talking about this before, where was it --

searches

-- ah, here: Pre-RFC: Stable rustdoc URLs - #31 by elidupree

though that thread was about its implications for documentation filenames, and I didn't think about fixing it.

Well, fixing it seems good!

1 Like

Note that becoming an enum or union is not the only possible change, because newly-declared nominal types are not the only kinds of types that can be given a name, thanks to type aliases. One could also imagine replacing a struct-with-private-fields type with:

  • Type-alias-impl Trait opaque type (when that stabilizes)
  • Type alias for an array or tuple
  • Type alias for a slice
  • Type alias for dyn Trait, if the original type was !Sized

It's separately a breaking change to make two types equal that weren't before, because it could create a trait implementation overlap downstream — but all of the above can be made to not overlap using a newly introduced contained type, e.g.

// Old
pub struct Foo([Bar; 8]);
struct Bar(u32);
// New
pub type Foo = [Bar; 8];
pub struct Bar(u32);

So, if we decide Rust should follow the principle that a struct with private fields is the maximally opaque type that can always be replaced with some other kind of type, then Foo { .. } needs to match almost anything, not just enums and unions in addition to structs.

9 Likes

Other solutions to the problem that are not expanding what the pattern can be applied to:

  • Shrink instead: prohibit Foo { .. } except when Foo has at least one visible field, or no fields. That way, all the cases where this pattern can be used are cases where changing the type away from a struct would be a breaking change.

    This would have to be done across an edition, so it wouldn't be completely reliable protection since old editions can still be used.

  • Introduce a new kind of explicitly opaque newtype (or an attribute modifying structs), which cannot ever be matched by any syntax but the general ones (_ and ident). This is the same as the previous except that it is opt-in by the definer of the type.

1 Like

Another option: change the spelling of the “match a type but nothing else” pattern (leaving the old one intact, of course). Foo _, for example (though probably not a good one).

EDIT: I suppose this doesn’t fix the problem unless the old syntax is at least deprecated.

not the best example yet, because it still breaks downstream code like

trait UserTrait {}
impl<T, const N: usize> UserTrait for [T; N] {}
impl UserTrait for Foo {}
2 Likes

Another solution is combining the two approaches: Have a new #[opaque] attribute that prevents external users from matching the struct with a rest pattern, and a #[not_opaque] to allow the matching. (Exact syntax is bikesheddable.) In the current edition, default to #[not_opaque]. In a future edition, default to #[opaque].

Good point. So, reduce my list of candidates to TAIT and dyn Trait, both of which can introduce actually completely new types (that still aren't enum or union types).

Oddly, the pattern Foo(..) can only match a tuple-like struct named Foo whose fields are all public. However, the pattern Foo {..} can match any struct named Foo at all.

I think at the very least we can warn about that in every edition.

TBH I'd even be tempted to bump it to deny, because I'm not sure why you'd ever be doing it in the first place? The only time I've ever seen Vec{ .. } as a pattern, for example, is in this joke: Don't like `<_>`? Avoid it with this one weird trick! - The Rust Programming Language Forum

(Well, only to deny if we'd also decided to ban it in an upcoming edition. It doesn't meet my bar for deny-by-default otherwise.)

I cannot manage to recall exactly where, but I know I've seen a Struct { .. } pattern used in a macro to check the type of a place. Although, _: Struct should generally be as good as Struct { .. }, so I don't recall why Struct { .. } was used, unfortunately… best inference is that it's generic related (Struct { .. } is fully unconstrained on Struct's generics whereas _: Struct only accepts the zero-generic instantiation). I do think minimizing coercion options was a part of it.

That's an interesting point. I guess maybe we should allow u32<..> and Vec<..> and such -- after all, u32<> already works -- as a way to say "look, there might be generics here, and I don't care". Kinda like how Vec::from(x) works without needing to be Vec::<_>::from(x) -- or Vec::<_, _>::from(x) I suppose, with allocators -- and thus we could say Vec::from is Vec::<..>::from.

1 Like

Interestingly, Vec::from is Vec::<>::from. An empty or missing <> in an expression-path means "infer all generics," but an empty or missing <> in a type-path means "default all generics." This is an unreasonable level of annoying to me for how irrelevant it really is.

4 Likes