How to handle pattern matching on constants

What are these uses? I'm keen to understand the use cases. (sorry, I couldn't parse the long list of data very well).

(To my mind, I like the idea of banning pattern matching on non-built-in types (perhaps also allow public C-like enums?), but I would like to understand the reasons it is used before committing to that position).

https://github.com/rust-lang/rust/blob/master/src/libsyntax/codemap.rs#L977

https://github.com/rust-lang/rust/blob/master/src/libsyntax/diagnostic.rs#L810

https://github.com/rust-lang/rust/blob/master/src/libsyntax/parse/parser.rs#L2072

https://github.com/rust-lang/rust/blob/master/src/libsyntax/ext/base.rs#L683

https://github.com/rust-lang/rust/blob/master/src/librustc/middle/region.rs#L650

1 Like

@petrochenkov thanks for gathering that data. To my eye, these mostly look like they are intended to compare for equality, and so would continue to work the same under any set of rules (presuming Eq impls exist, of course).

Yes, agreed. I guess saying that this "solves" the privacy question was an overstatement -- it addresses the "downstream clients continue to build" part of privacy, but the value of private fields can still be "observed" in a sense. This seems to be the overlap of what I was calling "privacy busting" and "abstraction busting".

If we could get away with it, I’d ideally like to move towards a “you can only pattern match on constants” world (the intersection alternative). This gives us maximum flexibility later on and may not actually break code in practice. Cases which use structure constants can switch to the manual desugaring of guards if required.

I definitely don’t think we can do this if it’s heavily used, however, so I’d at the very least expect a crater run, 1-2 cycles of warnings of usage, and then turning it into an error.

2 Likes

In the short term, I think the intersection approach would be best.

Long term, I think using PartialEq would match what I expect most. While I’m perhaps not very familiar with pattern matching due to mostly being familiar with C++, I expect pattern matching to be user extensible and respect the existing ways that users can change the behavior of a type. If I define PartialEq for a type, I expect the semantics of the entire language to respect it, not just parts of the language. That’s probably the biggest consideration for me, since I don’t like special cases (if I wanted lots of special cases to remember, I’d use C++ rather than Rust :slight_smile: )

I expect pattern matching to be user extensible.

I think this may be the primary difference. I am familiar with pattern matching, and I do not expect pattern matching to be user extensible. As far as I know, neither OCaml nor Haskell have user extensible pattern matching. On the other hand, I think familiarity argument is a moot, since people familiar with pattern matching is a minority, and Rust does not target that minority. In that sense I am interested in what people expect.

Continuing on my expectation, even if pattern matching were user extensible, I do not expect it to use user extensible equality. This is because equality is not sufficent to match patterns.

Consider Expr type, with values like Int(4) and Add(Int(2), Int(2)). Assume user extensible equality for Expr is defined such that expressions are compared after evaluation. (This is probably not a good idea for this particular case, but it is a stand-in for user extensible equality here.) Following constants are defined.

const ONE: Expr = Int(1);
const TWO: Expr = Int(2);
const THREE: Expr = Int(3);
const FOUR: Expr = Int(4);
const TWO_AND_TWO: Expr = Add(TWO, TWO);
const ONE_AND_THREE: Expr = Add(ONE, THREE);

Using user extensible equality, following patterns match TWO_AND_TWO: FOUR, TWO_AND_TWO, ONE_AND_THREE, Add(TWO, TWO). This pattern is not matched: Add(ONE, THREE). This pattern is matched: Add(x, y), but equality does not provide what should be the value of x and y.

1 Like

Not sure if you've seen it, but Scala has nice support for extensible patterns. I'd be interested in pursuing similar solutions at some point, though I don't think it's something we need in short term or in medium term. Still, while I don't think we have to go out of our way to make patterns 100% extensible, I think it makes sense for pattern matching to respect the extensibility that we DO have.

This is also my current opinion. It seems best if we can make this decision in an "affirmative" way -- that is, picking the semantics that we want freely, without feeling constrained. I will gin up a branch that enforces this condition and do a crater run to see what the results are.

That's not quite what I had in mind. In particular, Add(TWO, TWO) would not match against Int(4), because matching the pattern Add(TWO, TWO) is distinct from matching the constant TWO_AND_TWO. The former would test that the variant is Add, and then test whether the two arguments are equal to TWO. The latter would check whether the enum as a whole is equal to TWO_AND_TWO.

I am not familiar with pattern matching in Scala. Does Scala’s extensible pattern matching use extensible equality? I can’t really imagine how.

Aren’t we saying exactly the same thing? Of course expression Add(TWO, TWO) does not match pattern Int(4), but constant expression TWO_AND_TWO does match constant pattern FOUR. Or am I misunderstanding?

The big problem with “extensible pattern matching” is that equality tests are not sufficient. For example, Option<T> implements PartialEq when T implements it. This means that if I pattern match on a constant Option, consistency demands that it use that?

enum Weird<T> {
    Yes(T),
    No
}
impl<T> PartialEq for Weird<T> {
    fn partial_eq(self, other: Self) -> bool {
        panic!();
    }
}
const OPT_C = Weird::Yes(3);

match Weird::Yes(3) {
    OPT_C => calls PartialEq under this proposal,
    Weird::Yes(_) => cannot call PartialEq,
    Weird::Yes(3) => should this call PartialEq?,
    Weird::No => should this call PartialEq?,
}

Scala allows you to define a function which will be used to match a value, so you can for example have a function that takes a string and parses it into an int, and the function can return Some(value) on success or None on failure, which doesn’t match. And then you can do a match which extracts the value of the int. For the record, I find this functionality to be more than a little weird in the way it works, because it doesn’t quite fit in with the rest of the language in various cases.

And I think matching something structurally is fundamentally different than doing a value match. If I do:

match x { 1 => 10, _ => 0 }

Then what I’m doing is asking for equality checks. The fact that the compiler may or may not be able to optimize this is largely an implementation detail, not a semantic one.

Whereas if I do:

match x { Some(1) => 10, Some(_) => 0, None => 0 }

I want something different. I want ‘structural’ matching on Option, but I want value matching on the thing inside Some, if it structurally matches Some, and therefore, it would use equality.

So, if you have a const value being matched, it uses value matching, since the compiler isn’t expected to introspect the structure, since there is no visible structure. Syntactically this makes sense, since the programmer didn’t specify structure, they specified a value. If you want to match the structure of something, you should type it out and make it obvious that’s what you want, right?

3 Likes

I think it would be like this:

match Weird::Yes(3) {
    OPT_C => // panics
    Weird::Yes(_) => // structurally matches against Weird, doesn't call PartialEq
    Weird::Yes(3) => // this can't run, given the above, but if it did, it would structurally match Weird, but call PartialEq for 3
    Weird::No => // structurally matches, without calling PartialEq
}

Note, this is consistent with the current behavior for builtin types, ignoring consts.

1 Like

As my other reply suggests, I agree with this completely.

Constants usually behave as if they were simply inlined at the use site (the reference even describes their functionality that way). It would seem odd if manually inlining the constant changed the behavior of a match (except perhaps by one version being forbidden), and strikes me as something people would screw up when refactoring.

That’s trivially not true, because it’s possible to have a constant with a value specified for a private field used outside the module which has access to the private field. Hence, if the code were inserted verbatim, it would fail to compile due to privacy errors.

(I did say one version might be forbidden.) But yes, I should have been more precise.

If both forms are valid Rust and all the tokens in the definition refer to the same things in both places, then I think they should behave the same way. Put another way: if the constant and its definition are equivalent when used as expressions, they should also be equivalent when used as patterns (if both are accepted).

(These are the same sorts of caveats that occur with macros, which really do expand to their definitions, but not quite in a literal text-copy-paste way.)

This is primarily because I expect using a constant as an expression to be more common, and I think it would be bad if you could think of a constant as a literal value almost everywhere, but in patterns doing so would give you incorrect behavior, often without even a warning from the compiler.

1 Like

Edit: after rereading the OP, I support tweaks to the current behavior. Fix the holes. Don’t attempt to base match on PartialEq, because that’s not powerful enough to do everything, and doing it sometimes is inconsistent.


I’ve been trying to think of a behavior that avoids leaking private members, gives consistent behavior with Weird (which pretty much has to be structural matching for everything), and isn’t stupidly complicated:

PatternMatch(P: Pattern, X: Value) -> bool:

  1. If P ends in @var, check if the rest of the P matches with X. If so, the pattern matches and X is bound to var.
  2. If P is _, P always matches X.
  3. If P is a single variable, P always matches X. Also bind X to the variable P (the complexities of borrow checking this are identical to the current implementation).
  4. If P is a different type than X, throw a type error. Type inference is performed, but structural pattern matching between different types makes no sense.
  5. If their type is a primitive, pattern matching is done by direct comparison.
  6. If their type has any private member variables in its definition, or is a generic (and thus we don’t know one way or the other), throw an access error.
  7. If their type is an enum, they do not match if their discriminants are different. If their discriminants are the same, they match if their variant is C-like, or if PatternMatch(P’, X’) returns true, where P’ and X’ are the tuple or struct after the discriminant.
  8. If their type is a tuple or struct, they match if, for every member M, PatternMatch(P.M, X.M) returns true.

I guess I can summarize this as: structural equality is identical to equality for primitives, and is identical to destructuring and comparing all the member variables for structural equality for compound types, including the restrictions on destructuring (privacy, parametricity, etc).