Make (Some) Separators Optional

I’m not in favour of omitting semicolons in general. It hasn’t worked out well for JS: it caused quirks, and in the end the majority of developers write semicolons anyway.

However, I wouldn’t mind a bit of flexibility on whether semicolons are terminators {a;b;} or separators {a;b}. Both interpretations have pros and cons, and which one is required varies between languages and from construct to construct.
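For anyone skimming, here is a small sketch of how the two readings already differ in a Rust block today. The function names (bump, separator_style, terminator_style) are made up for illustration:

```rust
// Hypothetical helper: increments the counter and returns the new value.
fn bump(counter: &mut i32) -> i32 {
    *counter += 1;
    *counter
}

// `{a; b}`: the last expression has no `;`, so the block evaluates to it.
fn separator_style(counter: &mut i32) -> i32 {
    bump(counter);
    bump(counter)
}

// `{a; b;}`: every expression is terminated, so the block evaluates to `()`.
fn terminator_style(counter: &mut i32) {
    bump(counter);
    bump(counter);
}
```

Both functions run the same two calls; only the value of the block changes.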

Coming from C I keep making the mistake of typing struct Foo {};.


Personally, I keep forgetting when semicolons are required and when they’re forbidden in C++, Rust and JavaScript. And I keep forgetting to get my indentation right in Python. Changing any of the above wouldn’t really reduce my fat-fingering any more than making Rust use === for equality would make me stop accidentally writing === in my C++ code every day.

While Go can get away with ASI, I see that as the exception that proves the rule. ASI is acceptable in Go because it’s a very opinionated and simple language targeting a fairly specific type of project (web services), and thus it can successfully get most of its userbase using the one true coding style. Plus, it’s still using newlines as statement terminators, even with ASI. Although Rust is more opinionated than C, it’s also trying to have a much broader appeal than Go, and is much more complicated because it tries to do a bunch of things Go (and C) simply can’t do.

tl;dr I think every language benefits from some kind of explicit “this is the end of the statement” character, be it semicolon or newline or whatever, except for languages that don’t have statements at all.


I’ll probably never get used to commas separating fields, so :heart: to whoever put in “help: struct fields should be separated by commas”. But additionally allowing semicolons (or omitting them) and causing style guide arguments would be worse than the status quo, IMHO.

I definitely get tripped up after structs. (Apparently my brain filed Rust under C++-like, not C#-like.) It doesn’t help that struct Bar1; must have it but struct Bar2 {}; must not. Though "expected item, found ;" is clear enough that I don’t lose meaningful time to it.
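To spell out the two struct forms side by side (a minimal sketch; the derives are only there so the values can be compared):

```rust
// Unit struct: the trailing `;` is required.
#[derive(Debug, Default, PartialEq)]
struct Bar1;

// Braced struct with no fields: at item level a trailing `;` here is
// rejected with "expected item, found `;`".
#[derive(Debug, Default, PartialEq)]
struct Bar2 {}
```

Both are zero-sized and otherwise behave much the same, which is probably why the differing punctuation is so easy to get wrong.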

@kornel Any particular places for terminators/separators? I know that arrays, (non-unit) tuples, structs, use, and enums all allow both. Is it just the end of a block? (Hmm, traits are terminator-only, though personally I like that.)
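As a quick illustration of the places listed above where the comma works as either terminator or separator, here is a sketch with a trailing comma in each one. The names Kind, Pair and trailing_commas are invented for the example:

```rust
use std::cmp::{max, min,}; // `use` list with a trailing comma

#[derive(Debug, PartialEq)]
enum Kind { Small, Large, } // enum variants

#[derive(Debug, PartialEq)]
struct Pair { lo: i32, hi: i32, } // struct fields

// Tuple type, array literal, struct literal and tuple value,
// each written with a trailing comma.
fn trailing_commas() -> (Pair, Kind, [i32; 3],) {
    let values = [3, 1, 2,];
    let pair = Pair {
        lo: min(values[0], values[1]),
        hi: max(values[0], values[1]),
    };
    (pair, Kind::Small, values,)
}
```

Dropping every one of those trailing commas gives an equivalent program.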


Single-function trait declaration requires ; after the function. This particular case looks odd to me because fn{} must not have a semicolon.
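A minimal sketch of that contrast, with a made-up trait Greet and type English:

```rust
trait Greet {
    // Signature only, no body: the `;` is required.
    fn greet(&self) -> String;
}

struct English;

impl Greet for English {
    // Body present: no `;` after the closing brace.
    fn greet(&self) -> String {
        String::from("hello")
    }
}
```

The `;` stands in for the missing body, so it is doing a different job from the one after a statement.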

Lua was mentioned and is interesting in that whitespace isn’t significant. The EBNF also fits on approximately one screen: https://www.lua.org/manual/5.3/manual.html#9

The following code is (I think) the only ambiguity due to having optional semicolons:

local a = 0
local b = a
("%d"):format(b)

Here, a("%d") is parsed as a function call rather than as assigning a to b and then calling format on the string “%d”.

However, in this case, the result of the format function is being ignored, which makes it contrived. I can’t really think of a non-contrived ambiguity, though I’m sure they exist.

My opinion about Rust and optional semicolons is that I’d prefer the redundancy, thanks.


You might want to read some previous threads about the semicolons:

I've had to do a lot of Go programming lately and have been bothered by the semicolons in Rust :slight_smile:

Yes, and it turns out that style isn't just a recommendation, but actually necessary in some cases -- precisely due to the automatic semicolon insertion. That is, a program like this is illegal:

func main()
{
	fmt.Println("Hello, playground")
}

I happen to prefer the usual Go style, but this kind of gotcha can be confusing and annoying to people who are used to formatting their code however they like. Removing semicolons is not free, since it imposes restrictions that weren't there before.

So if Rust were to make some semicolons optional, please take great care to handle code like above nicely.

I keep forgetting ; after a large let x = { … } block.

if foo {
   bar
} else {
   baz
} // no ;
let x = if foo {
   bar
} else {
   baz
}; // ';' required!

So I’d vote to make that one optional :slight_smile:

But that is because it's actually a let (of course). This case, too, would suffer from potential ambiguity, depending on what follows:

let z = {
    let _x = if foo { bar } else { baz }
    *p
}

Contrived, perhaps, but syntactically ambiguous without semicolons (multiplication or result value).
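To make the two readings concrete, here is a sketch that compiles today. The function names and the i32 types are my own assumptions; the snippet above leaves them unspecified:

```rust
// With the `;`: the `let` ends there, and `*p` is a dereference
// that becomes the value of the block.
fn deref_reading(foo: bool, bar: i32, baz: i32, p: &i32) -> i32 {
    let _x = if foo { bar } else { baz };
    *p
}

// Same tokens with no `;` in between: `*` is binary multiplication
// applied to the result of the `if` expression.
fn product_reading(foo: bool, bar: i32, baz: i32, p: &i32) -> i32 {
    let x = if foo { bar } else { baz } * p;
    x
}
```

With foo = true, bar = 2, baz = 3 and p pointing at 5, the first should return 5 and the second 10, so a parser without the semicolon would have to guess between two well-typed programs.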

I personally don't like the proposal for the opposite reason: it prevents me from breaking long expressions into lines without having to worry that this might arbitrarily create an unintended split.

Hence I would argue the opposite, if anything: allow additional, superfluous semicolons where they don't hurt, such as at the end of structs:

struct Zorg;           // currently required
struct Zerg { ... };   // currently forbidden, could be allowed

Yes, absolutely this! I have been writing Rust for two years and still, without fail, make this mistake every time I turn a unit struct into a struct with fields. What's more, this is allowed:

fn func() {
    fn inner_func() -> u32 {
        3
    };

    // the above is "allowed" because it actually means:
    //     fn inner_func() -> u32 { 3 }    // <-- an item...
    //     ;                               // <-- ...followed by a statement
}

which frequently appears as the result of turning a closure into a function, but causes errors once the inner function is pulled out to item level.


@ExpHP I would love it as well; I’ve also been bitten by struct Foo {}; many times.

However, there seem to be backwards-compat hazards: https://play.rust-lang.org/?gist=0db4692763e19e7c54c955f6bcd411f6&version=stable&mode=debug

Perhaps this is fixable by keeping in mind that a ; was matched as an item, and then the macro will see this as having matched the next infinite sequence of ; in matchers?

EDIT: A more refined rule would be to count the number of consecutive ;s matched as an item, and count exactly that many ;s as matched in a macro matcher.

If we can think of a good way to keep macros working I’d love to work with you on an RFC.


I'm not sure what this is showing.

To be clear, my thought is for the single token ; to be a valid item. Is your example intended to show that having $a:item match the input ; is a backcompat hazard?

Yeah, that's my idea also.

Yes. If you naively just make ; match an item, then those currently working macros will break, I think.

@Centril but they seem to work fine for :stmt matchers in statement context, so I’m not sure how they would be problematic for :item matchers in item context.

fn main() {
    macro_rules! i1 { ($($x:stmt);*) => {} }

    i1! { struct Foo {}; struct Bar {} }
    
    macro_rules! i2 { ($x:stmt; $y:stmt) => { } }
    
    i2! { struct Foo {}; struct Bar {} }
    
    macro_rules! i3 { ($x:stmt;; $y:stmt) => { } }

    i3! { struct Foo {};; struct Bar {} }
}

Rust currently gives a clear error for a semicolon at the end of the struct; it won’t pass silently. That makes it trivial to catch and remove.

error: expected item, found `;`
 --> src/main.rs:3:2
  |
3 | };
  |  ^ help: consider removing this semicolon

Along the same lines, we could and should detect empty statements inside a function, and lint against those. Clippy had a feature request to do exactly that, but punted it to rustfmt. Perhaps we should reconsider adding that?


My concern is that in macro_rules! i1 { ($($x:item);*) => {} } the token ; would be eaten as an item and then there is no ; to match as the separator. That is: given struct F {}; struct G {} it is interpreted as struct F {}; and struct G {}.

While the current diagnostic is better than nothing, I think this simple mistake occurs frequently enough to get in the way of writing flow, so making struct Foo {}; legal would make life easier. We can normalize this via rustfmt, removing any redundant ;s.


Huh? I would think it matches just fine:

  • $x:item matches struct F {}, stopping before the ; because struct F {}; is not a valid item (or a prefix thereof)
  • $();* matches the semicolon nonterminal, and thus repeats
  • $x:item matches struct G {}
  • $();* does not match a semicolon, and thus ends
  • EOF; the expansion succeeds
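The steps above can be checked against today's behaviour with a small sketch. The macro count_items and the wrapper count_two are hypothetical names; the macro only stringifies each matched item and counts them, so it says nothing about how a future "; is an item" rule would behave:

```rust
// Matches zero or more items separated by `;` and counts them.
// Each item is only turned into a string, never emitted.
macro_rules! count_items {
    ($($x:item);*) => {
        <[&str]>::len(&[$(stringify!($x)),*])
    };
}

// Two items with one `;` between them: the `;` is consumed as the
// repetition separator, not as part of either item.
fn count_two() -> usize {
    count_items! { struct F {}; struct G {} }
}
```

On current stable this should count two items, which matches the walk-through: the item fragment stops before the ; and the separator picks it up.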

They are not redundant – they are necessary for unambiguous parsing. The designers didn't put them in the language out of pure passion.

That's a call for all sorts of pain. Rust is intentionally not the """convenient""" whitespace language – those – basically – heuristics tend to introduce all sorts of hard to debug errors arising out of discrepancies between what's "obvious" to the human eye vs. what rules and exceptions and special cases the compiler handles.

I am a long time Swift user and I can tell you, life would be much simpler in Swift with semicolons. The language has all sorts of ugly assumptions about where expressions and statements end, and it's just extremely irritating.


But ; is now a valid item, so you’d get the following instead?

  1. $x:item matches struct F {} since it is a valid item
  2. $x:item matches ; since it is a valid item
  3. ; is expected as a non-terminal separator, but it has already been matched in 2. The matcher fails.

That would require it to attempt to parse two items right after the other, but $(<stuff>)<delim>* must match <delim> before it attempts to match <stuff> again.
