Here is a case where a human reviewer might misinterpret how code is actually executed:
fn main() {
{ false } || { true };
}
unless the code is conventionally formatted with line-breaks:
fn main() {
{
false
}
|| { true };
}
In this specific case, the ambiguity is inadvertently pointed out by rustc because the author intended a logical OR expression, while the compiler follows two specific rules:
- the compiler always prefers parsing a block as a separate statement in ambiguous cases;
- the result type of a block expression must be () when it's parsed as a statement with its trailing semicolon omitted.
Combined, these two rules trigger a confusing type error that eventually points out the ambiguity (related issue: `{ expr1 } || expr2;` logical or / closure ambiguity · Issue #150552 · rust-lang/rust · GitHub).
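For comparison, here is a minimal sketch of two spellings that should make the compiler read the same tokens as a logical OR, assuming that is what the author wants. The function names are hypothetical and exist only for this illustration; the `allow` attributes merely silence style lints about redundant braces and parentheses.

```rust
// Sketch: two ways to force `{ false } || { true }` to parse as a logical OR.
// Function names are hypothetical, chosen only for this illustration.

#[allow(unused_braces, unused_parens)]
fn or_in_let() -> bool {
    // After `let ... =` the parser expects an expression, so the leading
    // block cannot end a statement and `||` is the binary operator.
    let result = { false } || { true };
    result
}

#[allow(unused_braces, unused_parens)]
fn or_in_parens() -> bool {
    // The line starts with `(`, not `{`, so the block is not taken as a
    // standalone expression statement.
    ({ false } || { true })
}
```

Both functions return true when called, for example from `main`. In both cases the leading block no longer sits at the start of a statement, which is the position where the "prefer a statement" rule applies.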
But what if the author actually intended:
{
expr1
}
|| expr2;
by writing { expr1 } || expr2;?
If expr1 evaluates to (), there is no compiler error. The "unused closure that must be used" warning could be turned off or lost in the noise. Meanwhile, a reviewer or collaborator might read everything before the ; as a single logical OR expression. What can we do to reduce this kind of ambiguity?
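To make the silent case concrete, here is a small sketch in which expr1 evaluates to () and expr2 has a visible side effect. The function name and the two counters are assumptions made up for this demonstration; `Cell` is used only so that the closure and the later reads can share the counters.

```rust
use std::cell::Cell;

// Hypothetical demo function: returns
// (times the block ran, times the right-hand side ran).
#[allow(unused_must_use)] // silences the unused-closure warning for the demo
fn block_then_closure() -> (u32, u32) {
    let block_runs = Cell::new(0u32);
    let rhs_runs = Cell::new(0u32);

    // Parsed as two statements: the block `{ ... }` (of type `()`), then a
    // closure `|| ...` that is created and dropped without ever being called.
    { block_runs.set(block_runs.get() + 1) } || rhs_runs.set(rhs_runs.get() + 1);

    (block_runs.get(), rhs_runs.get())
}
```

Calling `block_then_closure()` yields `(1, 0)`: the block runs once, while the right-hand side never runs because it is only the body of an unused closure. A reader who takes the line for a logical OR would expect neither of those counts.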
A Naive Proposal
We could restrict the conditions under which a trailing semicolon can be omitted. Specifically, we could require that an ExpressionWithBlock used as a statement can only have its trailing semicolon omitted when it is the last statement before a newline.
So
{
expr1
}
|| expr2;
can only have alternate forms like this:
{ expr1 }
|| expr2;
or this:
{ expr1 }; || expr2;
but not this:
{ expr1 } || expr2;
However, this would be a breaking change, and it would contradict the following rule from Whitespace - The Rust Reference:
Rust is a “free-form” language, meaning that all forms of whitespace serve only to separate tokens in the grammar, and have no semantic significance.
A Rust program has identical meaning if each whitespace element is replaced with any other legal whitespace element, such as a single space character.
Intuitively, though, the claim that "A Rust program has identical meaning if each whitespace element is replaced..." is already slightly inaccurate because of line comments: replacing the line break that ends a line comment with a normal space character would certainly break the program.
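The line-comment caveat can be illustrated with a rough sketch that treats source code as plain text. Both helpers below are hypothetical, and the comment stripper is deliberately naive (it ignores string literals and block comments); it only shows that code following a line comment is swallowed once the line break becomes a space.

```rust
// Hypothetical helper: replaces every line break with a single space.
fn flatten_line_breaks(src: &str) -> String {
    src.replace('\n', " ")
}

// Hypothetical helper: naive removal of `//` comments, line by line.
// It does not understand string literals or block comments.
fn strip_line_comments(src: &str) -> String {
    src.lines()
        .map(|line| line.split("//").next().unwrap_or("").trim())
        .filter(|code| !code.is_empty())
        .collect::<Vec<_>>()
        .join(" ")
}
```

For the input `"let a = 1; // note\nlet b = 2;"`, stripping comments leaves both `let` statements, but stripping comments after `flatten_line_breaks` leaves only `let a = 1;`, because the second statement has become part of the comment.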
Discussion
Even if expr1 in { expr1 } || expr2; evaluates to (), the "unused closure that must be used" warning provides a hint. But is that enough?
I'm not sure whether this should be a hard error, or whether a future edition should even change the default parsing preference so that { expr1 } || expr2; is interpreted as a logical OR expression with its result implicitly discarded, given the cost of a breaking change and of adding exceptions to existing rules. I'm curious to hear the community's thoughts on whether the "free-form" nature of Rust should be strictly preserved, or whether we should lean into giving line breaks semantic significance in cases like this.