Pre-RFC: syntax sugar for `matches!`

This has nothing to do with software. It is about syntax.


Not necessarily possible, at least not easily. The => from match is not just a binary operator, so this is not your usual operator expression parsing anymore. To make it even more complicated, the match-arm-=> is either PAT if EXPR => EXPR or PAT => EXPR while the matches-=> would be EXPR => PAT. Total madness to apply any kind of associativity.

Approaching a if b => c => d you would at least need to try parsing c => d as an expression which can either fail (when c is only valid syntax for a pattern) or succeed but then be immediately discarded because you actually need to parse it as a pattern followed by => instead (and make sure to discard it quickly, crucially before fully parsing d, otherwise you’re quickly in quadratic complexity land). Also imagine the inconsistency if the match arm a if b => c => d is supposed to mean a if (b => c) => d but the match arm a => c => d probably still means a => (c => d) (since the left hand side of the arm can only be a pattern).


And taking any analogy to == would suggest to have no associativity but require parentheses anyways.

Yes. I would like the new matches! syntax to be analogous to ==.

Requisite note:

(Disclaimer: speaking as a member of the grammar wg but not on behalf of the group)

Adding new sigils is highly problematic for macro_rules! macros. E.g. >>= gets handled as a single token for macro_rules!, and must be matched as >>= and not as > > = or >> = or anything else. ~= is currently interpreted as ~ =, and the two literal sequences in source code are completely interchangeable in both macro definitions and use. Adding a new token to glue the two together would potentially be possible, but difficult and with surprising edge cases, in order to avoid breaking macro_rules! macros.
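
A minimal sketch of the "interchangeable" point, assuming a current compiler where `~` and `=` are still lexed as two separate tokens. The macro name `is_match` and the function `tilde_eq_demo` are made up for illustration; nothing like them exists in std.

```rust
// Hypothetical macro: the `~=` in the matcher is really the two
// existing tokens `~` and `=`, not a single glued token.
macro_rules! is_match {
    ($value:tt ~= $pattern:pat) => {
        matches!($value, $pattern)
    };
}

fn tilde_eq_demo() -> (bool, bool) {
    let x = Some(3);
    // Written without a space at the call site.
    let glued = is_match!(x ~= Some(_));
    // Written with a space: the same matcher arm accepts it.
    let split = is_match!(x ~ = Some(_));
    (glued, split)
}
```

If `~=` became a single token, one of these two call sites would presumably stop matching, which is the breakage described above.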

Only parsing the ~= token in a new edition mode would make the trade-off a decent amount less problematic, but it'd still be a surprisingly subtle edition difference. It would also make calling macros cross-edition using ~= weird for 202X->2018 (you would have to break it into ~ =) and impossible for 2018->202X (as there is no way for edition 2018 to write the token if it's edition gated).

(Disclaimer²: I am not providing an opinion on new syntax for matching at this time.)


fixed

Actually I think I understand. Any extra sigils would break custom syntax in somebody's macro_rules. I hate to bring up the try operator but I'm curious what path they took to avoid this.

? wasn't really problematic because it's a single character and wasn't already part of a multi-character operator. Multi-character operators are the challenge for the reasons mentioned above.


Regarding the analogy with Perl, it's '=~', not '~=', and there is also '!~' for "does not match".


Just a thought, how would you expect bindings to work inside a match arm?

I'm picturing something like

let myvar = Some(Ok(6));
match myvar {
    Ok(x) if myvar ~ Some(y) => { /* do something with x */ }
}

but I can also imagine something like

let var1 = Ok(7);
let var2 = Some(6);
match var1 {
    Ok(x) if var2 ~ Some(y) => { /* do something with x and y */ }
}

It seems like this issue interferes with a huge number of proposals for new syntax. Which in most cases is not much of a problem, since most proposals for new syntax are unjustified for other reasons. (I'm not a fan of this one.) Indeed, it's plausible that a need will never arise to add new sigils to Rust. But if one ever does, it will be too bad if this gets in the way.

Surely there's some kind of solution. Using >>= as an example, what if it were parsed as two tokens, but with a special case to make it an error if the tokens were not physically adjacent in the source file? Yes, there are edge cases (proc macros), and yes, it's a hack, but it should be doable. And Rust already has at least one hacky edge case around sigil tokenization: the fact that >> is sometimes treated as two closing brackets, despite being a single token. C++ used to force you to write > > in that situation, but people realized that it makes no sense when there's no real ambiguity in the syntax; it was fixed in C++11, and from the beginning in Rust. A similar principle should apply here.
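
To make the `>>` precedent concrete, here is a small sketch (the function name `nested_len` is just for illustration) where the same two characters close two generic argument lists in one place and act as the shift operator in another:

```rust
fn nested_len() -> usize {
    // Here `>>` closes two generic argument lists; no `> >` needed.
    let rows: Vec<Vec<u8>> = vec![vec![1, 2], vec![3]];
    // Here the same characters are the right-shift operator.
    let halved = 8usize >> 1;
    rows.len() + halved
}
```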


In my opinion, the fact that matches!() can be implemented purely as a macro without any problems indicates exactly that it should not be a language feature.

It's already syntactic sugar anyway, basically – it stands for a match with a catch-all that defaults to false. The advantages (if any) of having it as a builtin feature would be marginal, but it would be heavily redundant, because pattern matching already exists in the language.
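
As a rough illustration of that equivalence (function names are hypothetical, and the hand-written version is a sketch of what the macro stands for, not its literal expansion):

```rust
fn with_macro(opt: Option<u8>) -> bool {
    matches!(opt, Some(n) if n != 42)
}

fn by_hand(opt: Option<u8>) -> bool {
    match opt {
        Some(n) if n != 42 => true,
        // The catch-all arm that defaults to false.
        _ => false,
    }
}
```

Both functions should agree on every input, e.g. `None`, `Some(0)` and `Some(42)`.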


Yeah, I think this is not really an issue, but rather an artifact of the current implementation, which we'll need to fix for libraryfication anyway.

The core problem is that the "text -> token trees -> AST" model is a lie. There's no such thing as a universal token tree format, because proc macros and macros by example already use TTs of different shapes. Now, the rest of the compiler "happens" to use mbe-style token trees, but that's a pretty ad hoc model.

The right way to think about TTs is as an interface between the compiler and macros. When expanding a macro, the compiler needs to lower its internal representation to the TT format appropriate for that macro. The knowledge that $tt matches == but not ~= should be part of this lowering layer.

To my mind, the infix matches is the obvious choice. If you know what a match statement is, the meaning of if foo matches Ok(0) { is immediately clear without learning any new rules.

I certainly wouldn't want a new sigil for this, especially not one with ~, which has many other meanings, like "approximately" and "not". The slightly shorter code can't possibly be worth the downside of adding one more thing for new users to stumble over.


I think that, if a language construct were to be added to replace the matches! macro, this would read quite nicely, even if the keyword matches is a bit long by Rust's standards (compare fn, impl, type, let, etc).

However what could make this worth it for me is the orthogonal feature of being able to pattern match while also using && to create larger boolean expressions in an if-style block. Currently this is possible in a match expr but it brings with it 2 levels of indentation for the match expression arms, which is not always desirable.

So basically I'd like something like this to be possible:


let opt = Some(42u8);

if let Some(num) = opt && num != 42 {
    // take action
} 

But that brings with it its own design issues w.r.t. composability of boolean expressions.

What about struct literals? matches! gets around any parse ambiguity because it has surrounding delimiters.

What is the meaning of the following?

if foo { bar } matches baz { bar } { bar }

Think about what the parser needs to know for that to work, and what a human will have to do to understand it.

We already have a similar case when it comes to if and struct literals where we recover somewhat gracefully, but it is an ugly hack that I would like to avoid infecting other parts of the grammar.
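
For reference, a sketch of the existing if/struct-literal case on stable Rust. The `Point` type and both function names are made up for the example. As far as I know, rustc rejects the unparenthesized form `if *p == Point { x: 0 } { ... }` and suggests adding parentheses.

```rust
#[derive(PartialEq)]
struct Point {
    x: i32,
}

fn is_origin_parenthesized(p: &Point) -> bool {
    // The parentheses keep the struct literal's `{` from being
    // read as the start of the `if` body.
    if *p == (Point { x: 0 }) {
        true
    } else {
        false
    }
}

fn is_origin_macro(p: &Point) -> bool {
    // Inside the macro's own delimiters the struct pattern is unambiguous.
    matches!(p, Point { x: 0 })
}
```

An infix matches would have no such delimiters around its right-hand side, which is where the ambiguity in the example above comes from.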


I already address this issue in the OP. This feature does not include variable bindings.

Nothing left to design here; the RFC landed almost 2 years ago.


And in the meantime, just in case it wasn't clear, matches! supports guards:

if matches!(opt, Some(num) if num != 42) {
    // take action
} 

Is there another way to parse this besides

if matches!(foo{bar}, baz{bar}) { bar }

?

Yes.

if matches!(foo { bar }, baz) {
    bar
}
// next statement:
{
    bar
}

How could it even know? It's 100% ambiguous:

#[derive(PartialEq, Eq)]
struct foo { bar: () }
type baz = foo;
const baz: foo = foo { bar: () };
const bar: () = ();

fn main() {
    // if foo { bar } matches baz { bar } { bar }
    if matches!(foo { bar }, baz) { bar } { bar }
    if matches!(foo { bar }, baz { bar }) { bar }
}

(playground)

Funnily enough it actually works on stable. Haven't figured out which one it's using yet (playground)