Don't keep complicating the syntax (soft post, maybe off topic, maybe irrelevant)

Hello,

Across the Rust versions, it appears that Rust is kind of following C++'s direction syntax-wise - the syntax keeps getting more complicated, even after release 1.0. Eventually that could lead to something like C++ and Haskell, where everyone is using their own subset. In fact, a few Rustaceans have already said that Rust’s syntax is about as complex as Haskell’s! That might eventually cause safety / scalability issues for large teams, or at least complicate the job for the compiler creators and programmers. I’ve created this post because quite a few of the people who propose new features for Rust are suggesting additions to the syntax…

As I said, soft post, probably quite wrong lmao.

7 Likes

One of the reasons that there is now a Grammar-WG is to release a formal grammar that RFC proposers have to consider when they make proposals to modify the grammar. In other words, “Show us what parts of the grammar have to change, and whether those changes require a change to the parser (such as more lookahead, or backtracking).” IMO many of these proposals will fall by the wayside when considered in light of the extent of required grammar changes.

4 Likes

Regarding pure syntax sugar, I think the solution would be to have compile-to-rust languages a’la CoffeeScript to experiment with the additions. That may satisfy people who want really fancy syntax or don’t like Rust’s, and help try out additions for Rust better to avoid adding something half-baked to Rust proper.


BTW, in your question there are a couple of implied assumptions:

  • that more syntax makes language more complex,
  • that adding stuff leads to problems like C++ has

which I don’t think necessarily have to happen. For example, returning iterators from functions was hard or impossible in some cases. Rust has added impl Iterator syntax to make that simpler. Similarly, working with futures is currently quite cumbersome, but Rust is adding pinned references and async/await syntax to make futures simpler to use.
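To illustrate the impl Iterator point, here is a minimal sketch. The function names are hypothetical and not from any real crate; the second function shows the boxed workaround that was needed before `impl Trait` in return position existed.

```rust
// Hypothetical helper: yields the even squares of the numbers in 0..limit.
// The concrete type is a Filter<Map<Range<u32>, closure>, closure>, which
// cannot be written out because closure types have no name. `impl Iterator`
// lets the signature promise "some iterator of u32" instead.
fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).map(|n| n * n).filter(|sq| sq % 2 == 0)
}

// The older workaround: box the iterator and pay for a heap allocation
// plus dynamic dispatch on every call to `next`.
fn even_squares_boxed(limit: u32) -> Box<dyn Iterator<Item = u32>> {
    Box::new((0..limit).map(|n| n * n).filter(|sq| sq % 2 == 0))
}
```

Both versions produce the same items; the first just does it without the allocation and with a simpler signature, which is the sense in which the added syntax reduced complexity here.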

C++ is used as a cautionary tale of misfeatures and piled-on complexity, but I think C++ is unique in that regard: it’s built on top of 45-year-old C, and it itself has nearly 40 years of backwards compatibility to preserve. In the near future Rust is going to have it easier, because Rust 1.0 is a much more modern and cleaner foundation for the language than the early C/C++ was, and Rust was able to learn from their mistakes.

In 40 years Rust 1.x may likely be outdated and similarly problematic. But Rust has editions which are able to break things more than C++ can, so maybe Rust 2050 edition won’t be that bad?

18 Likes

Note that IRLO is intentionally a place for early-stage ideas and developing things. It's absolutely expected that there will be a whole bunch of things here that many people don't like, or that discuss all sorts of changes.

But once proposals get more formal, "do we really need syntax here?"-type responses are very common. Some recent examples:

I am very wary of growing the number of reference types, especially in a sort of "narrow, targeted" fashion

~ RFC: Direct and Partial Initialization using &uninit T by RustyYato · Pull Request #2534 · rust-lang/rfcs · GitHub

I propose that we postpone considering this until we have enough const generics in nightly to have a concrete comparison between this and a pure library solution.

~ https://github.com/rust-lang/rfcs/pull/2581#issuecomment-435233740

And yes, this conversation does come up occasionally. One good reply from last time:

Another thread: Raising the bar for introducing new syntax

6 Likes

Strictly in terms of grammar complexity:

I really don’t think the grammar complexity is that much! See the current draft of the grammar that the working group will be starting from. (It’s currently very ambiguous because it lacks disambiguation for anything, but it successfully parses all of rust-lang/rust except for the files that aren’t supposed to parse.)

Compare the C++ grammar.

The Rust linked (ambiguous, preliminary) grammar has around 115 productions in it. The linked C++ has around 280. Haskell has around 150. This isn’t a fair comparison, as the linked grammars are specified within differing contexts that provide differing amounts of groundwork (and different grammar authors can create productions where they aren’t strictly necessary), but it is roughly indicative.

However, I think that (formal) grammar complexity is the wrong concern for the user of the language (as opposed to someone who’s trying to manipulate the textual representation programmatically, such as a compiler or pre-processor developer).

What the user of the language primarily cares about is the amount of surprise. This is roughly correlated with grammar complexity, but it is not determined by it. And more important still is semantic complexity, as opposed to grammatical complexity.

Just as an example, I think the addition of &out, &own, or other &modes is practically free from a grammar standpoint (assuming, just for the sake of argument, that the mode were already a keyword). However, it is very heavy on the semantic side, and weighs heavily on the language’s complexity budget.

13 Likes

I would like to disagree with you here. I think that &out (write-only) is a logical addition to &mut (read+write) and & (read-only). We could wish for a more consistent naming (e.g. something like &r, &rw, &w), but this ship has already sailed.

2 Likes

I’m trying to think of what new syntax we’ve added since 1.0 & all I can come up with is ?. I don’t think grammatical changes or keyword additions really impact the user experience of syntax complexity (though they can make things more complex in other ways).

3 Likes

(Fun fact: parsing C++ is undecidable because the template system is Turing complete.)

Yeah that's very hard to answer cause it depends on what you mean by "new syntax"... For example: are trait aliases new syntax? are new attributes new syntax? I suppose unions/const-generics/impl-trait/dyn/const-fn/async/await/try is new syntax... I filed some syntax-bug-fix RFCs that were accepted -- those could be new syntax but it would be strange to say so... Raw identifiers come to mind...

It's hard to understand what constitutes new syntax to various people imo.
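For a concrete feel of two items on that list, here is a small sketch of raw identifiers and the dyn keyword. The function names are made up for illustration; whether either counts as "new syntax" is exactly the judgment call described above.

```rust
use std::fmt::Debug;

// Raw identifier: the `r#` prefix lets the keyword `match` be used as an
// ordinary function name.
fn r#match(needle: &str, haystack: &str) -> bool {
    haystack.contains(needle)
}

// `dyn Trait` marks a trait object explicitly. The same type used to be
// written as a bare `&Debug`.
fn describe(value: &dyn Debug) -> String {
    format!("{:?}", value)
}
```

Neither changes what a program can express; both only change how existing concepts are spelled.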

1 Like

It’s technically not Turing complete, so it’s only nearly undecidable :wink:

2 Likes

:smile:

(Disclaimer: regular readers probably know what my take is on what follows, and I probably know where we disagree too. I just want to summarize my views to OP – regard everything below as #[cfg(member_status = "new")] :smile:)

I agree with OP in that “complicating the syntax” is a problem, but, as others mentioned it too, it’s probably the smallest one. That said, some discussions in recent proposals for features for which the syntax would require throwing away bounded lookahead parsing make me cringe a bit, along with pulling out specific existing subsets of the language with new syntax a la try fn.

As far as I can tell, however, most unwanted complexity in Rust feature proposals comes from semantics, or at least it’s an interaction between syntax, semantics, and possibly in either case, breaking existing patterns or symmetries, or adding so-called surface area (ie. “things to learn”) which is disproportionately large compared to the advantages it brings or how common it is.

A common problem I can observe is proposals that go straight against Rust’s original, fundamental ideas. Overloading the = operator was a recent one I recall, but in the past we’ve had requests for C++ style move constructors, stdlib APIs which are unsound but not unsafe, OO-style inheritance, and many others.

There’s another tendency whereby a narrow set of newcomers try to bend the language so it matches their expectations about other languages they have previously used, instead of learning the goals and idioms of Rust, which creates a similar kind of tension. A less serious variant thereof is people understanding and knowing how Rust is different from e.g. traditional OO or pure functional languages, but still trying to add their favorite pet features, which accumulate over time.

Tangential with regards to C++: I do think it’s a valid fear, because Rust recently kept picking up features quicker than C++ as far as I can tell, and I think this rate is going to lead to unsustainable growth. Yes, C++ has a 40-year-old baggage of backward compatibility. But it also started on top of a very small and consistent language (C wasn’t always as quirky and messy as it eventually became trying to support all the exotic, niche platforms of the 70s and 80s). And conversely, Rust also cares a lot about backwards compatibility. So I don’t think that we can just dismiss the issue with hubris and say “but we know better than those lowly C++ guys” :smile: – the issue is real.

In any case, I don’t think discussions of this problem are (or should be considered) irrelevant or off-topic. However, the topic does get brought up every once in a while (though not nearly as often as feature requests).

20 Likes

That's something that bothers me a bit, as it leads to a situation in which the same points must be made again and again, mostly to new people every time. As someone who abhors manual labor that can be perfectly well automated, it just feels like a lot of wasted effort over time.

Coupled with the fact that poor additions to the language not only affect everyone now but also exert a cost decades (at least) into the future, that is why I personally think it's not necessarily a bad thing to raise the bar a bit. Perhaps a checklist or guidelines of some kind before submitting an RFC. An RFC author not taking those into account could then be grounds for dismissal of the RFC, after which the author can try again with a revision or another RFC (assuming everything happens in good faith). Now, I say this with the full knowledge that RFCs are consensus building mechanisms. The role of the guidelines would then effectively be a pre-filter for things that were never going to pass muster anyway e.g. because they go against language philosophy (e.g. implicit integer conversion), or because they're objectively poor language features (e.g. unsound-but-not-unsafe code, or OO-style inheritance), reducing the necessary amount of manual labor.

3 Likes

That’s something that bothers me a bit, as it leads to a situation in which the same points must be made again and again, mostly to new people every time.

Coupled with the fact that poor additions to the language affect not just everyone now but exerts a cost decades (at least) into the future, that is why I personally think it’s not necessarily a bad thing to raise the bar a bit.

I think people are giving too much weight to threads and discussions here. Just because something is discussed doesn't mean it has any chance of getting implemented. And I don't see that the Rust language gets arbitrarily extended.

11 Likes

However, this general observation has been violated on quite a few occasions. The more worrying pattern is that some of the very controversial changes are still being implemented and accepted even against wide disagreement of many people in the community. A recent example of this was the "default binding modes" (also known as "match ergonomics" previously) RFC.
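For readers who have not followed that RFC, a minimal sketch of what default binding modes change. The function names are hypothetical; the two functions behave identically and differ only in how the pattern is written.

```rust
// Before default binding modes: the reference is dereferenced by hand and
// `ref` is needed to borrow the String instead of moving it.
fn name_len_explicit(name: &Option<String>) -> usize {
    match *name {
        Some(ref s) => s.len(),
        None => 0,
    }
}

// With default binding modes: a non-reference pattern matched against a
// reference makes `s` bind by reference automatically (s: &String).
fn name_len_ergonomic(name: &Option<String>) -> usize {
    match name {
        Some(s) => s.len(),
        None => 0,
    }
}
```

The second form is shorter, but the fact that `s` is a reference no longer appears anywhere in the pattern, which is roughly what the objections were about.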

I also have a feeling that several (most?) language team members and core developers are generally more than subtly biased in favor of changing and extending the language. That is opposite to the design process of some other languages, e.g. Go was designed by only adding features that the designer could specifically "talk the others into".

NB: I'm not saying that "Go is better than Rust", which IMO is not the case – I'm merely saying that this is probably a better approach to language design knowing the enormous complexity of design space we are facing. I formulated this in the past as the following: the burden of proof should be on those who propose a change, and it should not be the case that the language can arbitrarily change under its users' feet unless proposed changes are constantly "fought against". (That's a strong word, but you get the idea.)

3 Likes

The ? operator was also controversial and some said it's going to ruin explicit control flow of Rust, etc. I think it turned out great and Rust has the best error handling of any language I know.

When Cargo was introduced I didn't like it, and I was vocally against build.rs. I liked my Makefiles, and couldn't fathom why would anyone want to use "NIH" Rust build system over "standard" make or bash. And now I'm seeing how well it works, and I'm so glad nobody listened to me back then :slight_smile:

20 Likes

Sure, that can happen too. To me, the dyn keyword was such an occasion, when I initially opposed it, but it turned out pretty good. But it’s not always the case. (Certainly not with default binding modes for me.) And anyway: features can always be added, but they can hardly ever be removed. Also, if the language design process is to be called democratic, then such a bias probably doesn’t fit well with it. (I see that this can be resolved by just redefining what the design process is like – I’m not trying to go in that direction.)

Yup. That’s what I’m worried about.

Also, let’s first assume that most people who suggest changes to the Rust language are people who work with Rust either as a hobby or professionally. Under that assumption, the fact that lots of people are suggesting changes to the Rust programming language itself means that the community’s general idea is that we should extend the language - THAT is quite worrying. The answer to everything isn’t syntax - to be honest, I’d like Rust to be more like a safer C with a few more abstractions - basically C with safe pointers, references, etc, and additional abstractions like generics and a better macro system. Ideally I’d like Rust to still have C’s philosophy, just to a less extreme degree - if you want something, implement it by yourself (for the most part. The language shouldn’t give you every single higher level feature, that’s the task of the std and the programmer).

3 Likes

It worked out for ?, but that doesn't mean it will work out every time. The question mark operator also had a long, long design phase with try! available on stable for basically ever, which is lacking in many newer proposals. Many features testable on nightly aren't even widely promoted, as can be seen by people being surprised when impl Trait in argument position landed. And I assume when type ascription lands with the "no need to specify types in signatures" aspect, people are going to be surprised again.

I also don't remember much critique of ? being not explicit enough, I remember a lot of concerns about it being easy to overlook. Which is still something you have to keep in mind. If you have unsafe code upholding important invariants, overlooking a single-char conditional early return can have serious consequences, especially if it only happens in edge-cases.

I sometimes still write explicit matching and returns when I am worried about it being overlooked.
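A minimal sketch of the two styles, using a hypothetical parsing helper. In this case the explicit version is equivalent because the error type is the same on both sides; in general `?` also converts the error through `From`.

```rust
use std::num::ParseIntError;

// With `?`: each early return on error is a single trailing character.
fn sum_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

// Written out: the same control flow, with every early return visible
// as its own `return` statement.
fn sum_with_match(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.trim().parse::<i32>() {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    let y = match b.trim().parse::<i32>() {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    Ok(x + y)
}
```

The explicit form is noisier, but in code that must restore an invariant before returning, the visible `return` makes each exit point harder to miss.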

3 Likes

I suspect this may be a real problem. In earlier Rust versions I used nightly 90% of the time, because stable just wasn't good enough. Nowadays, the majority of the features I wanted are on stable, and I have no reason to use nightlies any more, so I'm not trying out new features.

1 Like

It's one of the issues, another is feature bundling. E.g. impl Trait in return position (which everyone wanted) coming bundled with impl Trait in argument position. Or the type ascription part that includes the omit-types-in-signatures feature. I'm sure there are others, but I'm no longer closely following.
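To make the two positions concrete, a small sketch with hypothetical function names. Return position hides one concrete type chosen by the function; argument position is shorthand for an anonymous generic parameter chosen by the caller.

```rust
use std::fmt::Display;

// Return position: the caller only learns that the result implements Display.
fn greeting() -> impl Display {
    "hello"
}

// Argument position: accepts any type implementing Display.
fn shout(msg: impl Display) -> String {
    format!("{}!", msg)
}

// The equivalent named generic. Unlike `shout`, the caller can name the
// type explicitly, as in `shout_generic::<i32>(3)`.
fn shout_generic<T: Display>(msg: T) -> String {
    format!("{}!", msg)
}
```

The argument-position form adds no new capability over `shout_generic`, which is why some people saw it as a separate feature that rode along with the return-position one.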

I like to compare that to the earlier days: when ? originally came up, it had a partner sigil ! that was an implicit unwrap. People liked ? but really disliked !. If they had to stay bundled, I assume back then neither would have been added and we'd still all use try!. Similarly, I still think the move from &mut to &uniq failed because it was bundled with "make let bindings always mutable".

4 Likes