How to handle pattern matching on constants

#4

For me, I never expected pattern matching to run ==, so I am against such changes. The current semantics roughly agree with my intuition.

I agree with forbidding associated constants and function calls in constants for pattern matching. I am not sure about privacy checking. In my mind, there can be a public constant whose value contains private fields. I don’t see it as busting privacy, because after all, for the case to trigger, constants need to be public.

#5

It is unclear how big a problem this will be in practice, but the interaction with exhaustiveness definitely adds new and exciting ways for you to break clients. I guess there is another middle ground that I hadn’t considered, which is to preserve the existing runtime semantics but remove the interaction with exhaustiveness checking etc. This would even permit associated constants at some point.

#6

To elaborate a bit more on this alternative, which I will call Pattern conversion only at trans time. The idea would be that we treat constants as opaque when checking for exhaustiveness and dead-code, which ameliorates the privacy-busting and encapsulation-busting cons. It also permits associated constants to appear, since we only need to do pattern expansion during trans, at which point we are fully monomorphized.

It does not, however, resolve the fact that we have two notions of equality, so there is still abstraction-busting that lets you distinguish “deeply structural equal” and “semantically equal” (even if you didn’t intend to do so).
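
To make the two notions of equality concrete, here is a minimal sketch. The `Ratio` type, the `HALF` constant, and the helper functions are hypothetical names invented for illustration; they do not come from the thread. The structural check is written out by hand as a field-by-field pattern, which is roughly what expanding a constant into a pattern amounts to.

```rust
// Hypothetical type for illustration only.
#[derive(Clone, Copy)]
pub struct Ratio {
    num: i64,
    den: i64,
}

// "Semantic" equality: 1/2 and 2/4 compare equal.
impl PartialEq for Ratio {
    fn eq(&self, other: &Self) -> bool {
        self.num * other.den == other.num * self.den
    }
}

pub const HALF: Ratio = Ratio { num: 1, den: 2 };

/// Field-by-field ("deeply structural") comparison against HALF.
pub fn structurally_half(r: Ratio) -> bool {
    matches!(r, Ratio { num: 1, den: 2 })
}

/// Comparison through the user-defined PartialEq impl.
pub fn semantically_half(r: Ratio) -> bool {
    r == HALF
}

/// A value that is semantically, but not structurally, one half.
pub fn two_quarters() -> Ratio {
    Ratio { num: 2, den: 4 }
}
```

For `two_quarters()`, the two helpers disagree, and that disagreement is the kind of observable difference this post describes.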

#7

In my mind, the exhaustiveness backward-compatibility hazard is an orthogonal problem to constants in pattern matching. While the interaction with exhaustiveness is certainly surprising and undesirable, adding a new enum variant can already break clients through exhaustiveness today, and the interaction doesn’t seem worse than that.

We had RFC 757 to address exhaustiveness problems and I think it is a good solution for this too. If you have public constants TRUE and FALSE but do not want clients to see them as exhaustive, you can make them an extensible enum.

#8

I disagree that these problems are the same. It is well known that if you export a public enum you can break clients, yes, but if you can break clients by changing private details of your structure, that is another thing.

EDIT:

There is no enum here to mark as extensible or inextensible, though.

#9

The “private field matching” problem is annoying as it can break encapsulation, and I think the best way to prevent it is to prohibit matching on values with private fields. I also don’t care much for projections in match - you can always write the guard version.

On the other hand, I don’t think the const fn problem is really problematic - callers can observe their callee’s precise behaviour, and if the call occurs at compile time, they can observe it at compile time, with it not being much worse than [u8; 1/(HASH_ONE^HASH_ZERO)].
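
As a rough illustration of the array-length trick mentioned above: the `hash` function here is a made-up stand-in, and `HASH_ZERO` / `HASH_ONE` are assumed to be its results for 0 and 1. The point is only that a caller can make its own compilation depend on the exact values a `const fn` produces.

```rust
// Hypothetical const fn standing in for some library's hash function.
const fn hash(x: u32) -> u32 {
    x.wrapping_mul(2_654_435_761)
}

const HASH_ZERO: u32 = hash(0);
const HASH_ONE: u32 = hash(1);

// The length expression is evaluated at compile time. If hash(0) and
// hash(1) were equal, the XOR would be 0 and the division would fail
// during constant evaluation, so this type is only usable while the
// two hashes differ.
type Probe = [u8; (1 / (HASH_ONE ^ HASH_ZERO)) as usize];

/// Forces the array length above to be evaluated.
pub fn probe_size() -> usize {
    std::mem::size_of::<Probe>()
}
```

If the library later changed `hash` so that the two values collided, code using `Probe` would stop compiling, which is the compile-time observation the post refers to.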

+1 for making unreachable arm detection a lint - this is required for improving check_match backwards-compatibly.

#10

trans-time expansion still allows you to do structural equality on types with private fields, which breaks parametricity in possibly-unintended ways (not that we will have any parametricity left after specialization anyway, but this seems more “accidental”).

I would prefer that matching always do structural equality, even on types with floating point.

#11

What are these uses? I’m keen to understand the use cases. (sorry, I couldn’t parse the long list of data very well).

(To my mind, I like the idea of banning pattern matching on non-built-in types (perhaps also allow public C-like enums?), but I would like to understand the reasons it is used before committing to that position).

#12

https://github.com/rust-lang/rust/blob/master/src/libsyntax/diagnostic.rs#L810

#13

@petrochenkov thanks for gathering that data. To my eye, these mostly look like they are intended to compare for equality, and so would continue to work the same under any set of rules (presuming Eq impls exist, of course).

#14

Yes, agreed. I guess saying that this “solves” the privacy question was an overstatement – it addresses the “downstream clients continue to build” part of privacy, but the value of private fields can still be “observed” in a sense. This seems to be the overlap of what I was calling “privacy busting” and “abstraction busting”.

#15

If we could get away with it, I’d ideally like to move towards a “you can only pattern match on constants” world (the intersection alternative). This gives us maximum flexibility later on and may not actually break code in practice. Cases which use structure constants can switch to the manual desugaring of guards if required.

I definitely don’t think we can do this if it’s heavily used, however, so I’d at the very least expect a crater run, 1-2 cycles of warnings of usage, and then turning it into an error.
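
The manual desugaring to guards mentioned above might look like the following sketch. `Rgb`, `BLACK`, `WHITE`, and `name` are hypothetical names chosen for illustration; the only assumption is that the type implements `PartialEq`.

```rust
// Hypothetical struct and constants, for illustration only.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Rgb {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

pub const BLACK: Rgb = Rgb { r: 0, g: 0, b: 0 };
pub const WHITE: Rgb = Rgb { r: 255, g: 255, b: 255 };

pub fn name(c: Rgb) -> &'static str {
    match c {
        // Instead of the constant pattern `BLACK => ...`, bind the value
        // and compare it explicitly with `==` in a guard.
        x if x == BLACK => "black",
        x if x == WHITE => "white",
        _ => "other",
    }
}
```

Because guards never count towards exhaustiveness, the wildcard arm is required, and the comparison unambiguously goes through the type's `PartialEq` impl.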

#16

In the short term, I think the intersection approach would be best.

Long term, I think using PartialEq would match what I expect most. While I’m perhaps not very familiar with pattern matching due to mostly being familiar with C++, I expect pattern matching to be user extensible and respect the existing ways that users can change the behavior of a type. If I define PartialEq for a type, I expect the semantics of the entire language to respect it, not just parts of the language. That’s probably the biggest consideration for me, since I don’t like special cases (if I wanted lots of special cases to remember, I’d use C++ rather than Rust :slight_smile: )

#17

I expect pattern matching to be user extensible.

I think this may be the primary difference. I am familiar with pattern matching, and I do not expect pattern matching to be user extensible. As far as I know, neither OCaml nor Haskell has user-extensible pattern matching. On the other hand, I think the familiarity argument is moot, since people familiar with pattern matching are a minority, and Rust does not target that minority. In that sense I am interested in what people expect.

Continuing on my expectation, even if pattern matching were user extensible, I do not expect it to use user-extensible equality. This is because equality is not sufficient to match patterns.

Consider an Expr type, with values like Int(4) and Add(Int(2), Int(2)). Assume user-extensible equality for Expr is defined such that expressions are compared after evaluation. (This is probably not a good idea for this particular case, but it is a stand-in for user-extensible equality here.) The following constants are defined:

const ONE: Expr = Int(1);
const TWO: Expr = Int(2);
const THREE: Expr = Int(3);
const FOUR: Expr = Int(4);
const TWO_AND_TWO: Expr = Add(TWO, TWO);
const ONE_AND_THREE: Expr = Add(ONE, THREE);

Using user-extensible equality, the following patterns match TWO_AND_TWO: FOUR, TWO_AND_TWO, ONE_AND_THREE, Add(TWO, TWO). This pattern does not match: Add(ONE, THREE). This pattern matches: Add(x, y), but equality does not tell us what the values of x and y should be.
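
A runnable sketch of this example follows. Two things are assumptions rather than part of the post: the operands are stored as `&'static Expr` so that the recursive constants can be written without heap allocation, and the helper functions `operands` and `is_add_one_three` are hypothetical names for hand-written structural matches.

```rust
#[derive(Debug)]
pub enum Expr {
    Int(i64),
    // Assumption: references keep the constants allocation-free.
    Add(&'static Expr, &'static Expr),
}

impl Expr {
    pub fn eval(&self) -> i64 {
        match self {
            Expr::Int(n) => *n,
            Expr::Add(a, b) => a.eval() + b.eval(),
        }
    }
}

// User-defined equality: compare after evaluation.
impl PartialEq for Expr {
    fn eq(&self, other: &Self) -> bool {
        self.eval() == other.eval()
    }
}

pub const ONE: Expr = Expr::Int(1);
pub const TWO: Expr = Expr::Int(2);
pub const THREE: Expr = Expr::Int(3);
pub const FOUR: Expr = Expr::Int(4);
pub const TWO_AND_TWO: Expr = Expr::Add(&TWO, &TWO);
pub const ONE_AND_THREE: Expr = Expr::Add(&ONE, &THREE);

/// Structural destructuring recovers the operands, which `==` alone cannot.
pub fn operands(e: &Expr) -> Option<(i64, i64)> {
    match e {
        Expr::Add(Expr::Int(x), Expr::Int(y)) => Some((*x, *y)),
        _ => None,
    }
}

/// Structural test for the shape Add(Int(1), Int(3)).
pub fn is_add_one_three(e: &Expr) -> bool {
    matches!(e, Expr::Add(Expr::Int(1), Expr::Int(3)))
}
```

With these definitions, `TWO_AND_TWO == FOUR` and `TWO_AND_TWO == ONE_AND_THREE` both hold, while `is_add_one_three(&TWO_AND_TWO)` does not, and only the structural match in `operands` can bind x and y.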

#18

Not sure if you’ve seen it, but Scala has nice support for extensible patterns. I’d be interested in pursuing similar solutions at some point, though I don’t think it’s something we need in the short or medium term. Still, while I don’t think we have to go out of our way to make patterns 100% extensible, I think it makes sense for pattern matching to respect the extensibility that we DO have.

#19

This is also my current opinion. It seems best if we can make this decision in an “affirmative” way – that is, picking the semantics that we want freely, without feeling constrained. I will gin up a branch that enforces this condition and do a crater run to see what the results are.

#20

That’s not quite what I had in mind. In particular, Add(TWO, TWO) would not match against Int(4), because matching the pattern Add(TWO, TWO) is distinct from matching the constant TWO_AND_TWO. The former would test that the variant is Add, and then test whether the two arguments are equal to TWO. The latter would check whether the enum as a whole is equal to TWO_AND_TWO.
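
One way to read the distinction drawn here is sketched below with two hand-written functions standing in for the two patterns. This is an interpretation under stated assumptions, not an implementation of any compiler behaviour: `Expr` uses `&'static Expr` operands, its `PartialEq` compares after evaluation as in the earlier example, and the function names are hypothetical.

```rust
pub enum Expr {
    Int(i64),
    Add(&'static Expr, &'static Expr),
}

impl Expr {
    pub fn eval(&self) -> i64 {
        match self {
            Expr::Int(n) => *n,
            Expr::Add(a, b) => a.eval() + b.eval(),
        }
    }
}

// Assumed user-defined equality: compare after evaluation.
impl PartialEq for Expr {
    fn eq(&self, other: &Self) -> bool {
        self.eval() == other.eval()
    }
}

pub const TWO: Expr = Expr::Int(2);
pub const TWO_AND_TWO: Expr = Expr::Add(&TWO, &TWO);

/// Stand-in for the pattern `Add(TWO, TWO)`: first test the variant,
/// then compare each argument to TWO with `==`.
pub fn matches_add_two_two(e: &Expr) -> bool {
    match e {
        Expr::Add(a, b) => **a == TWO && **b == TWO,
        _ => false,
    }
}

/// Stand-in for the constant pattern `TWO_AND_TWO`: compare the whole
/// value with `==`.
pub fn matches_two_and_two(e: &Expr) -> bool {
    *e == TWO_AND_TWO
}
```

Under this reading, `Int(4)` is rejected by `matches_add_two_two` because the variant test fails, but accepted by `matches_two_and_two` because the whole-value comparison goes through `==`.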

#21

I am not familiar with pattern matching in Scala. Does Scala’s extensible pattern matching use extensible equality? I can’t really imagine how.

#22

Aren’t we saying exactly the same thing? Of course expression Add(TWO, TWO) does not match pattern Int(4), but constant expression TWO_AND_TWO does match constant pattern FOUR. Or am I misunderstanding?

#23

The big problem with “extensible pattern matching” is that equality tests are not sufficient. For example, Option<T> implements PartialEq when T implements it. This means that if I pattern match on a constant Option, consistency demands that it use that?

enum Weird<T> {
    Yes(T),
    No,
}
// Deliberately hostile impl: any use of == on Weird panics.
impl<T> PartialEq for Weird<T> {
    fn eq(&self, _other: &Self) -> bool {
        panic!();
    }
}
const OPT_C: Weird<i32> = Weird::Yes(3);

match Weird::Yes(3) {
    OPT_C => {}          // calls PartialEq under this proposal
    Weird::Yes(_) => {}  // cannot call PartialEq
    Weird::Yes(3) => {}  // should this call PartialEq?
    Weird::No => {}      // should this call PartialEq?
}