I’ve a few questions on statements based on the following observations.
//! Statements and semicolons
// We can capture statements in macros.
macro_rules! stmts_semicolon { ($($stmt: stmt);*) => {}}
macro_rules! stmts_whitespace { ($($stmt: stmt)*) => {}}
// Single expression statements must not be terminated by semicolons when
// captured by a macro. This implies that the semicolon is **not** a part of
// the statement.
stmts_semicolon!{1}
stmts_whitespace!{1}
// Likewise with let statements.
stmts_semicolon!{let x = ()}
stmts_whitespace!{let x = ()}
// When we get to two expression statements in a row, we need the separator.
stmts_semicolon!{1; 1}
stmts_whitespace!{1 1}
// Items are also statements
stmts_semicolon!{struct X{}}
stmts_whitespace!{struct X{}}
// And the separator is needed between multiple item declaration statements.
stmts_semicolon!{struct X{}; struct X{}}
stmts_whitespace!{struct X{} struct X{}}
// Some items require ending semicolons; that semicolon belongs to the item and is not a statement separator.
stmts_semicolon!{struct X;}
stmts_whitespace!{struct X;}
// So we need double semicolons when separating them.
stmts_semicolon!{struct X;; struct X;}
stmts_whitespace!{struct X; struct X;}
// But note, an empty statement is not valid for this macro.
// stmts_semicolon!{;} // ~err: expected a statement
// ^
// If the expression is a block with unit return, it must not end with a semicolon.
stmts_semicolon!{{}}
// And if the expression ends with non-unit, it must not either.
stmts_semicolon!{{0i32}}
// But the actual semantics in blocks requires semicolons terminating blocks
// with non-unit returns.
fn blocks_with_semis() {
{} // End of statement
// {0i32} ~err: mismatched types
// ^^^^ expected (), found i32
{0i32} /* not end of statement */ ; // End of statement
() // Explicit block expression.
}
// Furthermore, within blocks, extraneous semicolons are allowed and ignored.
// The Rust reference grammar, which the compiler does not use, calls these
// statements, while the compiler just detects and discards them.
fn extraneous_semicolons() {
;;;;;;;;;;
}
// So, is the following an extraneous semicolon?
fn maybe_extraneous_semicolon() {
{}; // Extraneous or ends the block expression?
1;
()
}
// And in blocks, not every statement needs to end with a semicolon.
// Specifically items that aren't semicolon terminated don't need a semicolon
// after them in a block either.
fn statements_without_semicolons() {
struct Foo {}
struct Bar {}
()
}
So this leaves me in a weird spot with semicolons.
In non-macro-land, they appear to be required as part of an expression or
let statement, while they are forbidden in macro-land. Should I just ignore
the statement macro matcher?
Furthermore, in blocks, are semicolons that aren’t strictly necessary
“extraneous” or “empty”?
And finally, is a semicolon after a block or control flow expression of unit
type one of those extraneous/empty statements or is it actually a part of the
expression statement?
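To make the macro side of the question concrete, here is a minimal sketch that reuses the `$($stmt:stmt);*` pattern from `stmts_semicolon!` above. The `count_stmts!` macro is a hypothetical helper I made up for illustration, not something from the standard library; it only counts how many statements the matcher captured, so the separator behavior can be checked with assertions.

```rust
// Hypothetical helper (not from the original post): counts the statements
// captured by the same `$($stmt:stmt);*` pattern used in `stmts_semicolon!`.
macro_rules! count_stmts {
    ($($stmt:stmt);*) => {
        // Each captured statement contributes 1. `stringify!` is used only so
        // that `$stmt` appears inside the repetition; nothing is executed.
        0usize $(+ { let _ = stringify!($stmt); 1usize })*
    };
}

fn main() {
    // No statements at all.
    assert_eq!(count_stmts!(), 0);
    // A single expression or let statement, written without a semicolon.
    assert_eq!(count_stmts!(1), 1);
    assert_eq!(count_stmts!(let x = ()), 1);
    // Two statements need the `;` separator between them.
    assert_eq!(count_stmts!(1; 1), 2);
    assert_eq!(count_stmts!(struct X {}; struct Y {}), 2);
}
```

The trailing-semicolon forms (for example `count_stmts!(1;)`) are deliberately left out, since the pattern as written treats `;` purely as a separator.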
which cuts out exact textual duplicates that often (though not always) arise due to the cache having multiple versions of the same crate. For the pat fragment I used pat[^h].
[^2]: vis isn’t even stable and yet it is still more common than stmt!
I recently did a similar investigation.
I have a possible explanation for the stmt matcher's behavior there.
Regarding statements outside of macros:
{}; // Extraneous or ends the block expression?
The answer is that it's unobservable and doesn't matter!
Right now the parser can immediately eat one or two semicolons after a "naked statement" even if they are not required, and then eat the remaining extraneous semicolons one by one, but that's an implementation detail.
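A small sketch of what "unobservable" means here, assuming nothing beyond stable Rust. Both function names are hypothetical and exist only for this comparison: the two bodies differ only in semicolons that are not required, and they behave the same. The compiler may warn about the redundant semicolons, which is why the lint is allowed explicitly.

```rust
// Hypothetical functions for illustration; the names are not from the thread.
#[allow(redundant_semicolons, dead_code)]
fn with_extra_semicolons() -> i32 {
    {};             // unit block followed by a semicolon it does not need
    ;;;             // semicolons on their own
    { 0i32 };       // non-unit block: this semicolon is required
    struct Foo {};  // item followed by a semicolon it does not need
    42
}

#[allow(dead_code)]
fn without_extra_semicolons() -> i32 {
    {}
    { 0i32 };       // still required, the block's value must be discarded
    struct Foo {}
    42
}

fn main() {
    // Whether the extra semicolons are "empty statements" or part of the
    // preceding statement makes no difference to the result.
    assert_eq!(with_extra_semicolons(), without_extra_semicolons());
}
```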
Probably not a regular macro, because it already follows the way Rust parses expressions. You could do it using a procedural macro, which gives you freedom to use any language grammar you want. However, you’d quickly find cases where it makes the language ambiguous and writing a sensible parser for it pretty hard.
Oh nice, but if I add return () it shows how hacky it is:
warning: expected `;`, found `return`
--> src/main.rs:10:9
|
10 | return ()
| ^^^^^^
|
= note: This was erroneously allowed and will become a hard error in a future release