Eliminating "seemingly unnecessary" braces

mcy · August 27, 2020, 6:35pm

Rust has a number of expression productions that have the following pattern:

keyword <tokens> { ... }

where keyword is not valid immediately after an expression. Examples include for, while, if, match, unsafe, and so on. Most of these productions (all of them except match, really) end in a block expression.

It's pretty common to nest these expressions in a way that seems very wasteful of indentation but which I can't figure out a better way of writing. The worst offender in my own code is match-in-for:

for x in xs {
  match x {
    // ...
  }
}
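As a concrete baseline, here is a minimal sketch of that nesting in today's Rust, next to one way to flatten it without any grammar change: a closure body does not need braces, so `for_each` can take a bare `match`. The `Event` enum and both function names are hypothetical, made up purely for illustration.

```rust
/// Hypothetical event type, used only for illustration.
enum Event {
    Add(i32),
    Sub(i32),
    Reset,
}

/// The nested form: the `match` arms sit two indentation levels deep.
fn total_nested(events: &[Event]) -> i32 {
    let mut total = 0;
    for e in events {
        match e {
            Event::Add(n) => total += n,
            Event::Sub(n) => total -= n,
            Event::Reset => total = 0,
        }
    }
    total
}

/// One level flatter: a closure body may be any expression, including `match`.
fn total_flat(events: &[Event]) -> i32 {
    let mut total = 0;
    events.iter().for_each(|e| match e {
        Event::Add(n) => total += n,
        Event::Sub(n) => total -= n,
        Event::Reset => total = 0,
    });
    total
}
```

The closure form is not a drop-in replacement: `break`, `continue`, `return` and `?` inside the arms no longer refer to the enclosing loop or function, which is part of why a syntactic solution is attractive.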

unsafe is another big offender in some places.

My question: is there anything specific, other than ossifying syntax, that stops us from changing the production for every ExpressionWithBlock (that isn't a bare block or a match) such that it ends in another ExpressionWithBlock rather than a BlockExpression? In other words, allow productions like

for x in xs match x { ... }
if cond unsafe { ... }
while let Some(blah) = foo match blah { ... }
async loop { ... }
for x in y for z in x { ... }

As far as I know, none of these productions are currently allowed, and I think that in some cases they would be helpful for decreasing unnecessary indentation. This isn't to say that all of the above are good ideas, but Rust's syntax is already free-form enough that you can form some pretty groddy but correct syntax already (see: the weird expressions test).

I think this is morally dual to how $pat => { $expr } is sometimes written out by rustfmt when it wraps a match arm.

More extremely, we could imagine allowing any expression that starts with a specialized keyword instead of a block, allowing for silly things like while cond continue; or potentially useful things like for x in xs yield x; (in an imaginary world where we stabilize generators). I suspect this would significantly complicate the grammar (mostly around semicolons).

(One could also extend this to functions, fn foo(x: i32) -> i32 match x { ... }, but I'm certain that would not play well with where-clauses.)

Of course, this is the "fill-in-the-matrix" version of indentation compression. Perhaps a better approach might be to identify common "double blocks" and introduce alternative syntax to deal with them.

For example, consider the somewhat obvious starting point of

for $refutable in $iter { ... }
// =>
for x in $iter { if let $refutable = x { ... } }

One might be tempted to write for let $refutable in $iter { ... } instead, but the analogy with while let would imply that you should break on the first match failure!

One might also consider allowing a production like unsafe $expr, comparable to the suggested const $expr form from RFC 2920, though those are somewhat less powerful in that you may write unsafe if cond { .. } but not if cond unsafe { .. }. Unclear if it matters though.

H2CO3 · August 27, 2020, 6:54pm

Please google "goto fail", a massive security bug caused by the fact that C and C++ allow unbraced statements when braces are """redundant""". It is a mistake that we should not repeat, and that Rust's designers have carefully avoided by requiring braces.

kornel · August 27, 2020, 7:49pm

On a few occasions I wanted this, because it reduces rightwards drift of indentation.

However, that falls into cute/clever syntax sugar, and it will probably confuse someone about what async for does, or they'll read if cond unsafe {} as if (cond unsafe) {}.

As usual, there's a trade-off that the less "redundant" stuff you have to write, the more careful readers of that code have to be.

It'd be nice if it was a compile-to-verbose-Rust language dialect.

bascule · August 27, 2020, 8:01pm

As it were, I highlighted this incident in a 2014 talk I gave on Rust Cryptography, noting that mandatory braces would've prevented this problem:

mcy · August 27, 2020, 8:34pm

I am aware of this problem in C/C++, given that I review about 500 LOC of it per week, and a lot of that time is spent telling people off for writing if (cond) continue;. This syntactic change is unrelated, because the classic problem is:

if (cond)
  expr;
  expr;  // Not bound!

whereas in all the suggested syntactic productions, you are still required to include braces to terminate the "full expression", with an expectation that the formatter would include the necessary braces any time a line break between "block expression starters" occurred (much like it does for match arms today).

Of course, one could always write the form

for x in xs
  match x {
    // ...
  }
  match y {
    // ...
  }

but this is what we have formatting tools and linters for (one can argue those are optional, but if you don't aggressively use such tools you should not pretend to be building a secure product). Such tools (of a sufficiently high quality) only started existing for C/C++ very recently.

Also, this misses that the first part of my post was an introduction to the problem, describing a naive solution. The interesting design space is for the second part: what common patterns can we identify and provide a robust, concise syntax for? Extreme indentation due to alternating for/match blocks, where the match scrutinee is often trivial, is a real readability problem in Rust code.

toc · August 27, 2020, 9:24pm

The other "normal" objection I can think of is that syntax elision can often mean that it's harder for a parser to provide sensible suggestions for slightly incorrect code, e.g.

for x in xs
  match x {
    // ...
  }
}

Which isn't handled perfectly in rustc...

...
1 | fn main() {
  |           - this opening brace...
...
7 |     }
  |     - ...matches this closing brace
8 | }
  | ^ unexpected closing delimiter

but is highlighted decently in my IDE as for x in xs ?? (for this simple example ymmv). This has pretty strong implications for teaching-by-compiler-error. I don't know how much this actually affects your suggestion, but it should be mentioned.

scottmcm · August 27, 2020, 11:33pm

FWIW, C# technically would allow many similar collapsings as its braces are usually optional, but the coding standards generally say not to. In fact, I've never seen any that allow writing for or while without braces.

The one common exception was that you could write

using (var foo = ...)
using (var bar = ...)
{
    ...
}

But of late not even that one is suggested, as the language now allows it to be written like this:

using var foo = ...;
using var bar = ...;
...

This comes up periodically. A common point against it is that it's unclear whether semantically it should be filter (as you expanded it) or take_while (expanding to let mut it = IntoIterator::into_iter($iter); while let Some($refutable) = it.next() { ... }).
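The two readings give different results on the same input. A minimal sketch in current Rust, with `Some(n)` standing in for `$refutable`; the function names are hypothetical:

```rust
/// "filter" reading: skip items that do not match the pattern and keep looping.
fn sum_filter(items: &[Option<i32>]) -> i32 {
    let mut sum = 0;
    for item in items {
        if let Some(n) = item {
            sum += n;
        }
    }
    sum
}

/// "take_while" reading: stop at the first item that does not match.
fn sum_take_while(items: &[Option<i32>]) -> i32 {
    let mut sum = 0;
    let mut it = items.iter();
    while let Some(Some(n)) = it.next() {
        sum += n;
    }
    sum
}
```

For `[Some(1), None, Some(2)]` the first function adds both `1` and `2`, while the second stops at the `None` and only adds `1`.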

matklad · August 28, 2020, 6:33pm

One is allowed: else if. Otherwise, one would have to write else { if.
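For reference, a small sketch of that one existing case, showing that the shorthand and the fully braced form behave identically (function names are hypothetical):

```rust
/// Written with the `else if` shorthand.
fn sign_short(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

/// The same logic with the inner `if` wrapped in its own block.
fn sign_long(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else {
        if n == 0 {
            "zero"
        } else {
            "positive"
        }
    }
}
```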

ckaran · August 31, 2020, 1:17pm

While I see the lure of this, I agree with the comments that @H2CO3 and @bascule made; in my mind, the security headaches aren't worth the tradeoff. And, yes, I did see your comment about limiting where this could be done, but at that point it becomes sufficiently limited that I'm not sure that the benefits outweigh the costs that @anon2808951 mentioned.

For me at least, I prefer everyone being forced to use braces as they are now. I've worked with code where people have different preferences for where they place their braces, and how they indent their code; I hate it. It forces me to switch mental gears to match whatever code I'm currently reading is using, but I don't have any guarantee that I've switched into the right gear (e.g., when I'm reading C/C++ code, some people use braces everywhere, while others won't for single-line if statements. I won't know what style is being used until I see the first example of the style that is used). The worst is when multiple people have been working on the same file, but using different standards, so now I have to switch gears in the middle of the code, all for no useful reason. Honestly, this is one of the places where I feel like Python got it right; by using indentation to demarcate nesting, it forced everyone to use the same style, which dramatically improved legibility over C/C++.

Aloso · August 31, 2020, 5:36pm

Another example where this is a problem:

async try { .. }.unwrap()

How is this parsed? It could be one of

(async try { .. }).unwrap()
async ((try { .. }).unwrap())
async try ({ .. }.unwrap())

The last one isn't backwards compatible, but it seems like a valid option when curly braces are not required anywhere. Overall, I think this would make the grammar much more complicated, leading to confusion and frustration.

CAD97 · August 31, 2020, 8:14pm

Nobody has proposed that.

The loosest proposal was to allow block-like expressions as the block for block-like expressions.

That would mean that your example would unambiguously parse as (async try { ... }).unwrap().

The looser proposal is to pick specific keyword block-like expressions to combine, the same way we have else if { ... } as shorthand for else { if { ... } }.

I definitely think that allowing this in general would not be a good idea, but it is possible that specific cases (like else if) could benefit from a combined syntactic form. I just haven't found or seen any specific cases yet.

jthemphill · September 2, 2020, 10:12pm

I'm very much in favor of mandatory braces, but I do think there's a third path here. Rust has augmented if and while statements with if let Some(x) = x {} and while let Some(x) = x {}, taking advantage of reserved keywords to avoid the ambiguity and accidental assignments that are common in C code.

We could do something similar and augment for loops to have an optional match keyword:

for match x in xs {
  X::Foo => { /* ... */ },
  X::Bar => { /* ... */ },
}

This removes the "redundant" braces in the original code without introducing goto fail;-type risk. It's probably a fair amount of work, and it's probably not worth the effort it takes to build, but I don't see any reason why we wouldn't welcome this if the work were already done.
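To get a feel for the construct without touching the compiler, it can be roughly emulated with a declarative macro. This is only a sketch: the macro name `for_match!`, its `x in xs => { arms }` invocation syntax, the payload on `X::Foo` and the `count` function are all hypothetical, and it simply expands to the nested form.

```rust
/// Hypothetical macro emulating `for match x in xs { arms }`.
/// Invocation: `for_match!(x in xs => { arms });`
macro_rules! for_match {
    ($x:ident in $xs:expr => { $($arms:tt)* }) => {
        for $x in $xs {
            match $x {
                $($arms)*
            }
        }
    };
}

/// Hypothetical enum mirroring the `X::Foo` / `X::Bar` arms above.
enum X {
    Foo(u32),
    Bar,
}

/// Sums the `Foo` payloads and counts the `Bar`s in a single flat loop.
fn count(xs: Vec<X>) -> (u32, u32) {
    let (mut foo_sum, mut bars) = (0, 0);
    for_match!(x in xs => {
        X::Foo(n) => { foo_sum += n; },
        X::Bar => { bars += 1; },
    });
    (foo_sum, bars)
}
```

Because the expansion is an ordinary `for` containing a `match`, `break` and `continue` inside the arms still refer to the loop.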

mjbshaw · September 2, 2020, 10:14pm

Isn't that essentially what the OP proposes?

jthemphill · September 2, 2020, 10:24pm

OP proposed

@H2CO3's issue with this construct is that someone could easily write

for x in xs
  match x { ... }
  match y { ... }

and get different behavior than they expected, because they think that for loop braces are optional as they are in C. This perception exists whether or not braces actually are optional, or whether we only allow for x in xs match x to exist.

But if we instead wrote

for match x in xs { ... }

we wouldn't be creating this same perception.

mjbshaw · September 2, 2020, 11:43pm

Thanks, the difference is subtle enough that I didn't notice it.

cliff · September 3, 2020, 1:52pm

mcy:

Of course, one could always write the form
for x in xs
  match x {
    // ...
  }
  match y {
    // ...
  }
but this is what we have formatting tools and linters for (one can argue those are optional, but if you don't aggressively use such tools you should not pretend to be building a secure product). Such tools (of a sufficiently high quality) only started existing for C/C++ very recently.

This is literally the problem you said your proposal is designed to avoid. Pointing users to linters to avoid it is kicking the can down the road. Might as well allow any expression a la C, and tell users they should be linting to avoid the problem.

I tend to think the noise of the extra braces is less headache than the complexity of figuring out which order keywords are supposed to go in to make the compiler happy. I already struggle with this with things like ref mut vs mut ref and async pub fn vs pub async fn. (Let's not even talk about unsafe pub async fn.)

mcy · September 3, 2020, 2:51pm

This is the type of construct I was alluding to with my proposal[1] anyway: identifying places where sugar similar to if let would allow us to reduce nesting in common patterns. Personally, I would have gone so far as to write this as

for match in xs { ... }

since the scrutinee variable x is unreachable if it is not Copy.

[1] Calling it a proposal is a stretch; it was more of a "is it worth thinking about this at all?" and "Is there a deeper design space than doing the stupid thing and doing this trivial grammar relaxation?" Unfortunately, it seems the discussion was not as constructive as I hoped it could have been.

(Also, the example you give would have the same behavior one "should expect" of C: the first match is in the loop, the second is not. Doesn't change that this behavior is dumb. This is why we have formatters.)

This is not the problem I want to solve. If you feel the need to indent the match expression you should have used braces in the first place (something a formatter would enforce, which it currently does for match arms in some cases). The problem I actually set out to solve is common nesting patterns that we could eliminate by intelligently adding new grammar constructs.

Rust already has similar syntactic pitfalls, especially around closures, that can't be dealt with intrinsically in the language:

let x = || foo(); bar();

This is unavoidable unless every single production which can contain a dynamic number of sub-productions is braced. We accept this middle ground because it makes the code easier to read and write.
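A runnable sketch of that pitfall, assuming `foo` and `bar` are stand-ins that merely record that they were called (the logging setup and the function name `closure_pitfall` are hypothetical):

```rust
use std::cell::RefCell;

/// Returns the names of the calls that actually ran.
fn closure_pitfall() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let foo = || log.borrow_mut().push("foo");
    let bar = || log.borrow_mut().push("bar");

    // Reads like a closure with a two-statement body, but the closure ends
    // at the first `;`, so `bar()` is a separate statement that runs now.
    let _x = || foo(); bar();

    // `_x` is never called, so "foo" is never logged.
    let calls = log.borrow().clone();
    calls
}
```

Here `closure_pitfall()` returns only `["bar"]`. A formatter would put `bar();` on its own line, which makes the actual structure visible.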

Building quality software writ large in 2020 is simply not possible without ancillary tooling like formatters, linters, and presubmit-testing, because we are human, and because we make mistakes. What we should be doing is making the language easy to write, and the formatter's output easy to read and subsequently modify.

tesuji · September 3, 2020, 3:40pm

This example isn't convincing. I can clearly see this is two statements.

mcy · September 3, 2020, 5:39pm

I'm not here to convince anyone of that, because this is a place where reasonable people can disagree. I'm here to discuss things like the for match construct sketched by someone else above.

At any rate, I don't feel like this thread is going to be constructive, so perhaps it should be locked (or forked?).

steffahn · September 5, 2020, 12:37pm

To give an actual syntactic pitfall around closures that has to do with the fact that closures don’t need braces (but they do need them when the return type is explicit):

|| {
    /* ... */
}.method()

is parsed differently than

|| -> _ {
    /* ... */
}.method()

(demonstration in the playground—admittedly, it is probably uncommon in practice that both expressions actually compile)
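A self-contained sketch of the same difference, arranged so that both forms compile. It uses different methods in the two cases: `pow` exists on `i32`, while `twice` comes from a hypothetical helper trait `Twice` that is implemented only for closures.

```rust
/// Hypothetical helper trait, implemented only for closures returning `i32`.
trait Twice {
    fn twice(&self) -> i32;
}

impl<F: Fn() -> i32> Twice for F {
    fn twice(&self) -> i32 {
        (self)() + (self)()
    }
}

/// No return type: the closure body is the whole expression
/// `{ 21_i32 }.pow(2)`, so `f` is a closure and nothing runs until it is called.
fn method_on_block() -> i32 {
    let f = || { 21_i32 }.pow(2);
    f()
}

/// Explicit return type: the body must be a block and ends at `}`,
/// so `.twice()` is called on the closure itself, right away.
fn method_on_closure() -> i32 {
    let n = || -> i32 { 21 }.twice();
    n
}
```

In the first function `f` has a closure type and yields `441` when called; in the second, `n` is already an `i32` with value `42`.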
