Control flow in final operand?

rye · September 9, 2022, 1:41am

Should the following code compile? I'm trying to determine whether a GitHub issue is appropriate for this case and, if so, whether this is a diagnostics issue (i.e. this shouldn't compile) or a bug (i.e. this should compile).

fn foo(input: bool) -> u8 {
    match input {
        true => 0b01 << 4,
        false => 0b00 << 4,
    } & 0b00110000
}

fn main() {
    println!("{}", foo(true));
}

(Spoiler: it doesn't compile.)

I ran into this earlier and was a bit confused both by the diagnostics rustc emits and by the fact that this doesn't compile at all. When I put () around the whole expression that foo returns, it compiles just fine. When I assign the expression (without parens) to a variable and put just the variable's name on the last line, it also works. It's only when the match (or an if) is a subexpression of the final operand that rustc rejects the block. Is there something about mixing match / if with other operators specifically in the final operand context that messes things up?
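
For reference, a minimal sketch of the variable-binding workaround described above. The binding name `shifted` is made up for illustration:

```rust
fn foo(input: bool) -> u8 {
    // On the right-hand side of `let`, the match is in expression position,
    // so it is parsed as an expression and its value can be used afterwards.
    let shifted: u8 = match input {
        true => 0b01 << 4,
        false => 0b00 << 4,
    };
    // `&` now has a left operand, so it is the binary bitwise AND.
    shifted & 0b0011_0000
}

fn main() {
    println!("{}", foo(true));
    println!("{}", foo(false));
}
```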

CAD97 · September 9, 2022, 2:23am

This is an unfortunate consequence of semicolon elision. Formatting with rustfmt gives the game away:

fn foo(input: bool) -> u8 {
    match input {
        true => 0b01 << 4,
        false => 0b00 << 4,
    }
    &0b00110000
}

Because the match is in "statement position", it's interpreted as a statement rather than an expression. This is what allows you to not always require a ; after a block-like expression (e.g. match, if, or even regular blocks).
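
A small sketch of that reading, using a hypothetical function (`tail_is_a_reference` is not from the original post). Here the match arms evaluate to (), so the match is accepted as a statement, and the `&` line is a separate tail expression that takes a reference:

```rust
// Hypothetical example: the match is a complete statement of type `()`,
// and `&0b0011_0000` on the next line is the function's tail expression.
fn tail_is_a_reference(input: bool) -> &'static u8 {
    match input {
        true => println!("got true"),
        false => println!("got false"),
    }
    // Unary `&`: a reference to the literal, not a bitwise AND with the match.
    &0b0011_0000
}

fn main() {
    println!("{}", tail_is_a_reference(true));
}
```

The original foo is parsed the same way, which is why the compiler then complains about types rather than about syntax.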

How the syntax is interpreted is decided long before types are available. Though it'd actually be much worse if blocks were conditionally statements or expressions depending on their type, as then parsing would become undecidable.

As such, this is a diagnostics issue. You get much better results from + than &, so there's maybe some hope at least... although, the + error is a syntax error and not a type error.

error: leading `+` is not supported
 --> src/main.rs:5:7
  |
5 |     } + 0b00110000
  |       ^ unexpected `+`
  |
help: parentheses are required to parse this as an expression
  |
2 ~     (match input {
3 |         true => 0b01 << 4,
4 |         false => 0b00 << 4,
5 ~     }) + 0b00110000
  |

schungx · September 9, 2022, 10:35am

This kind of thing usually trips up parsers for context-free grammars when the following token is ambiguous; in this case, & can mean either take-a-reference or bitwise AND. (Context-free because the parser cannot depend on context to parse, the context here being the type returned by the match.)

When the parser sees match (or if, for that matter), it has to decide whether to parse it as an expression or a statement.

Here it seems to prefer statement, and I can guess why: it is much more common for & to begin the following expression than to be a bitwise operator applied to the match.

I would say the following is probably quite common in code, and users will scream if it constantly gets parsed as a bitwise &...

match something {
    true => do_something(),
    false => do_something_else()
}
&return_value

On the other hand, it is probably much less common to start off a statement with +, so Rust disallows it instead.

ckaran · September 9, 2022, 1:12pm

The fun part is that the following works (and to @CAD97's point about using rustfmt, with the parentheses it gets formatted correctly as an expression):

fn foo(input: bool) -> u8 {
    (match input {
        true => 0b01 << 4,
        false => 0b00 << 4,
    }) & 0b00110000
}

fn main() {
    println!("{}", foo(true));
}


cuviper · September 9, 2022, 2:03pm

It's not just for statements -- Rust doesn't have unary + at all.
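
Unary `-` does exist, though, so a leading `-` after a block-like statement starts a new expression just as `&` does. A rough sketch, with a made-up function `minus_one`:

```rust
// Hypothetical example: the `if`/`else` is a statement of type `()`,
// and `-1` on the following line is parsed as a unary negation.
fn minus_one(input: bool) -> i32 {
    if input {
        println!("true branch");
    } else {
        println!("false branch");
    }
    -1
}

fn main() {
    println!("{}", minus_one(true));
}
```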

Lonami · September 9, 2022, 11:21pm

I recently opened "Nested match not being treated as an expression", which I find to be quite similar to OP's code listing (and I wouldn't be surprised if my issue was itself a duplicate). Another user left this link to the reference: "Statements - The Rust Reference".

tczajka · September 9, 2022, 11:40pm

Your example is different because it's not about semicolon elision but rather about comma elision in a match.

I couldn't find anything in the reference that the ambiguity is resolved in favor of always adding a comma there, but it looks like it is in practice, so either it needs to be documented, or it's a bug.
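
As a rough illustration of comma elision (this is not Lonami's actual code, which isn't reproduced in this thread; `describe` and its values are made up): an arm whose body is a block needs no trailing comma, and parenthesizing a nested match lets it be used as an operand inside an arm.

```rust
fn describe(n: u8) -> u8 {
    match n {
        // Block-bodied arm: the comma after `}` may be omitted.
        0 => {
            println!("zero");
            1
        }
        // The parentheses make the nested match an ordinary operand of `+`.
        1 => (match n {
            1 => 2,
            _ => 3,
        }) + 10,
        _ => 0,
    }
}

fn main() {
    println!("{} {} {}", describe(0), describe(1), describe(2));
}
```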

system · December 8, 2022, 11:40pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
