[Feature Request] Opt-in for significant whitespace

clichekhfan · June 19, 2022, 9:42pm

Python was the first programming language I learned, and I really liked the way it used indentation and newlines to cut out some of the clutter that other languages have. It made programming feel a lot less intimidating when I was first learning. I also believe that it is important to maintain block indicators when using less full-featured IDEs that do not make it easy to distinguish spaces from tabs, as this can create very hard-to-find bugs. Because of this, and the individual nature of style preferences, I do think having the language use braces and semicolons by default is the right way to go.

However, I think that adding a way to opt into significant whitespace would be a nice feature for Rust to have. I'm not quite sure how much work this would entail on the compiler side, but I do think it is feasible.

I think this could almost be done with a macro, if not for the requirement that the input for a macro be encapsulated in braces or something similar, which defeats the purpose. I suppose the entire contents of the file could be enclosed and passed to a proc_macro, but there are rules about what you are allowed to replace with what, and I'm not sure how that works when the entire file is your input. What I have in mind is something that adds an opening brace when the indentation level increases after an appropriate keyword (if, while, fn, etc.) and a closing brace when the indentation returns to the previous level.

If this is not possible with a macro I think it could be implemented with a tag at the top of the file like #![feature(significant_whitespace)] or with something like a single colon operator to indicate where to start.

this:

if (blue):
    color = "Blue"
    println!("I'm blue!")

would become this:

if (blue) {
    color = "Blue";
    println!("I'm blue!");
}
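
As a rough illustration of the translation described above, here is a minimal sketch of a text preprocessor. The function name braces_from_indent and all of its rules are assumptions made for this example, not an existing tool: a line ending in `:` opens a block, indentation is spaces only, and every other non-blank line gets a semicolon. It deliberately ignores strings, comments, multi-line expressions, and tail expressions.

```rust
/// Naive sketch: turn indentation-delimited blocks into brace-delimited ones.
/// Assumptions (hypothetical, for illustration only): a line ending in `:`
/// opens a block, indentation uses spaces only, and every other non-blank
/// line is a statement that receives a `;`.
fn braces_from_indent(src: &str) -> String {
    let mut out = String::new();
    // Indentation widths of the block headers that are currently open.
    let mut open: Vec<usize> = Vec::new();

    for line in src.lines() {
        let body = line.trim();
        if body.is_empty() {
            continue; // blank lines carry no structure in this sketch
        }
        let indent = line.len() - line.trim_start().len();

        // Close every block whose header is indented at least as far as this line.
        while let Some(&header_indent) = open.last() {
            if indent > header_indent {
                break;
            }
            open.pop();
            out.push_str(&" ".repeat(header_indent));
            out.push_str("}\n");
        }

        out.push_str(&" ".repeat(indent));
        if let Some(header) = body.strip_suffix(':') {
            // Block header: `if (blue):` becomes `if (blue) {`.
            out.push_str(header.trim_end());
            out.push_str(" {\n");
            open.push(indent);
        } else {
            out.push_str(body);
            out.push_str(";\n");
        }
    }

    // Close whatever is still open at end of input, innermost first.
    while let Some(header_indent) = open.pop() {
        out.push_str(&" ".repeat(header_indent));
        out.push_str("}\n");
    }
    out
}
```

Feeding it the three-line example above yields the braced version with semicolons. Because it appends a semicolon to every statement line, it would break a method chain split across lines and cannot express a block's tail expression.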

It might be worth separating out the feature of inserting semicolons at line endings, as keeping semicolons explicit allows things like this:

io::stdin()
    .read_line(&mut guess)
    .expect("Failed to read line");

instead of:

io::stdin().read_line(&mut guess).expect("Failed to read line");

In my opinion the first version is easier to read than the second. I might try to take a stab at implementing this as a macro myself. What do other people think?

eggyal · June 19, 2022, 10:06pm

Rather than doing this at the language level, you could have a translation layer in front of the compiler. Indeed, this could sit in your IDE so you're presented with significant whitespace but the file is saved as Rust.

That way the choice resides with the individual programmer and is not perpetuated onto their collaborators or the wider ecosystem (where consistency is a tremendous aid).

steffahn · June 19, 2022, 10:09pm

Certainly not the first time this was suggested. I don’t have particular ones in mind at the moment, but perhaps others can provide links to existing discussions they remember.

In my experience new users of Rust lose their desire for such fundamental changes to the syntax fairly quickly. I can relate to your initial desire as I personally came from Haskell (another language with few parentheses and significant whitespace).

Regarding the concrete proposal at hand, there are a few technical points. For instance, Rust explicitly does not require parentheses around if conditions, so you’d write if blue { … } instead of if (blue) { … } in current syntax, and then the proposal might become

if blue:
    color = "Blue"
    println!("I'm blue!")

yet there are other uses of the colon already, which makes this

- confusing, as it adds yet another meaning (beyond struct fields, type annotations, and trait bounds)
- somewhat in conflict with a planned feature called type ascription, where expr: type is an expression, so : cannot terminate an expression
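
For reference, a small sketch of the colon uses already mentioned (struct fields, type annotations, trait bounds). The Point type and describe function are made up for this illustration.

```rust
use std::fmt::Display;

// Hypothetical type, used only to show where `:` already appears.
struct Point {
    x: i32, // colon in a field declaration
    y: i32,
}

// Colon in a trait bound (`T: Display`) and in argument type annotations.
fn describe<T: Display>(label: T, p: Point) -> String {
    let sum: i32 = p.x + p.y; // colon in a type annotation on a binding
    format!("{}: {}", label, sum)
}
```

Calling describe("total", Point { x: 1, y: 2 }) adds yet another colon, this time in the struct literal.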

As you already noticed, implicit semicolons are nontrivial, but there’s even more problems: in Rust a semicolon has some actual meaning in the case of block/function return expressions. There is a difference between

{
    foo();
    bar();
}

and

{
    foo();
    bar()
}

the latter returns the return value of bar() from the block, the former block evaluates to ().
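
A minimal sketch of that difference, with placeholder foo and bar functions (assumed here; bar returns an arbitrary number so the result is visible):

```rust
fn foo() {}

fn bar() -> i32 {
    42
}

/// Evaluates both block forms and returns their values side by side.
fn block_values() -> ((), i32) {
    // Trailing `;` after `bar()`: the block evaluates to `()`.
    let unit = {
        foo();
        bar();
    };
    // No trailing `;`: the block evaluates to whatever `bar()` returns.
    let value = {
        foo();
        bar()
    };
    (unit, value)
}
```

An implicit-semicolon rule would have to decide, from layout alone, which of these two the programmer meant.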

Finally, with any major syntactic change or addition, there are potential problems with how well it interacts with macros, including existing macros that might not be able to work with code that has semantic whitespace. This particular proposal is also fundamentally problematic because the whole macro infrastructure currently operates on “token trees”, with no whitespace information. E.g. the current macro design is such that macros expand to entirely unformatted code without any whitespace; and when parsing macro arguments, the compiler does nothing beyond tokenization, so code with semantic whitespace could be hard to feed into a macro [1]. I wouldn’t say it’s necessarily actually impossible to make this work, but I’d say it still might be impossible, and it’s definitely either impossible or very hard.

[1] Note that the compiler can make no a priori assumptions about whether or not something fed into a macro is proper Rust syntax, so there could be no preprocessing to “translate” semantic whitespace early, either, so I suppose it would need to be processed by the macro itself…
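
To illustrate the token-tree point, here is a small sketch using a token-counting macro_rules! macro (a common idiom; the name count_tts and the sample input are chosen for this example). The matcher receives the same tokens however the input is laid out, so it cannot tell whether done() was indented under the if or not.

```rust
/// Counts the token trees it is given, one recursive step per token tree.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

/// Feeds the same tokens to the macro in two different layouts.
fn same_tokens_different_layout() -> (usize, usize) {
    let one_line = count_tts!(if blue: color = "Blue" done());
    // Indentation suggests `done()` sits outside the `if`, but a
    // `macro_rules!` matcher only sees the token sequence.
    let indented = count_tts!(
        if blue:
            color = "Blue"
        done()
    );
    (one_line, indented)
}
```

Both invocations count the same number of token trees, since line breaks and indentation are not tokens.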

afetisov · June 19, 2022, 10:31pm

The primary motivation for significant whitespace, historically, was to enforce consistent formatting at the language level, which both makes code of different projects easier to read and removes the dangerously misleading formatting interacting with poor language features (the famous goto fail bug).

Rust doesn't have any of those problems. Its syntax was carefully crafted in a way which eliminates the common syntactic ambiguities by design (the braces around then-else branches are mandatory, unlike many C-legacy languages; this also allows omitting the parentheses around the condition). Its formatting is enforced by rustfmt, which is widely used at the ecosystem level, and it is culturally expected that some autoformatter (either rustfmt or a different one) is used on the projects. Even if you encounter a project which didn't use proper formatting, recovering it is as simple as running rustfmt on the project.

With those two issues gone, what other motivation is left for significant whitespace?

So the feature you're really asking for is not significant whitespace, it is the possibility to omit braces and semicolons. That, frankly, isn't a strong motivation, or something that is an issue once you use the language even a little bit. Do note that it would be a heavily disruptive change to the ecosystem, with different projects using or avoiding significant whitespace, perhaps even within the same project (depending on the implementation). It is also basically guaranteed to clash with the existing syntax, creating issues for macros (which, by the way, cannot rely in any way on the formatting of their contents by design) and requiring a significantly reworked syntax.

I don't believe that it is even possible to eliminate most braces from Rust; there are so many different syntactic constructs built around them (expression blocks, async blocks, closures, const expressions, destructors, labels etc) that it is unlikely the braceless syntax can be made to work at all, except for some minor special cases. Also note that some of the reasons people hated and tried to avoid e.g. semicolons also don't exist in Rust. Whereas in old languages with primitive parsers omitting a semicolon would result in a confusing parse error, meaningless to someone not well-versed in parsers, omitting a semicolon in Rust will result in a precise compiler error telling you to insert that semicolon. Whereas in old languages a semicolon was just a kludge to ease parsing, in Rust it has semantic meaning, differentiating statements from value-returning expressions (and a semicolon can be omitted in many cases, but you will have to use some non-ambiguous terminator anyway, like a brace).
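
A short sketch of a few of the brace-based constructs listed above, all inside one made-up function (brace_constructs is a hypothetical name; labeled blocks with a value need a reasonably recent stable Rust):

```rust
fn brace_constructs(n: i32) -> i32 {
    // Expression block: the braces delimit a value, not just a scope.
    let doubled = {
        let t = n * 2;
        t
    };
    // Closure with an explicit return type, which requires a braced body.
    let add_one = |v: i32| -> i32 { v + 1 };
    // Labeled block that can be left early with a value.
    let clamped = 'limit: {
        if doubled > 100 {
            break 'limit 100;
        }
        doubled
    };
    // `if` as an expression: no parentheses needed, braces mandatory.
    if clamped % 2 == 0 { add_one(clamped) } else { clamped }
}
```

Each of these would need its own indentation-based spelling in a braceless syntax.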

kornel · June 19, 2022, 10:37pm

Rust has already chosen one syntax, and got all the documentation, tutorials, books, QA sites, and all the tooling using this syntax. At this point any other official syntax would cause chaos and fragment the language.

CAD97 · June 19, 2022, 10:57pm

Though I do have to repost this fun macro:

macro_rules! asi {
    ($($stmt:stmt)*) => { $($stmt)* }
}

afetisov · June 19, 2022, 11:23pm

There is a precedent. Scala has introduced optional significant whitespace in Scala 3. Now, my personal opinion is that it's a gratuitous change which will cause more trouble than it's worth. But I think we would need someone with experience of Scala 2 to Scala 3 transition to really confirm or deny that perception.

eggyal · June 20, 2022, 7:47am

I had understood the proposal was to replace braces with significant indentation, which (I think) should be straightforward to mechanically translate in both directions?

But otherwise I agree with everything else you said!

afetisov · June 20, 2022, 9:16am

It can't be straightforward because you need to distinguish between significant and insignificant whitespace, and this may cause parser ambiguities. Steffahn has given some examples higher in the thread.

system · September 18, 2022, 9:17am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
