Raising the bar for introducing new syntax

mgeisler · May 6, 2018, 10:08am

Hi all,

I've been looking at Rust for just a little over a year now, and I am wondering about the philosophy behind adding new syntax to the language. I feel that there is a constant push towards adding new syntax to the language and I wonder how high the bar is set for this.

Basically, I feel the bar is set very low -- my impression as a casual user is that the thinking is more along the lines of a happy

Sure, this syntax looks better than what we had before, let's include it!

than a more conservative and measured response like

Okay, your syntax looks cool -- now please explain why adding it outweighs the cost of adding it to the language. Is it orthogonal to existing features? Does it naturally extend existing features?

Two things got added recently:

RFC-1192: Inclusive ranges and especially the new ..= syntax
RFC-1682: Short-hand struct field initialization (Foo { x, y, z })

Both RFCs have a section about drawbacks, but neither mentions the cost of introducing new syntax for something that can already be expressed with the existing syntax! They also both have a section about alternatives, but "do nothing" is not mentioned as an explicit alternative.

When reading the RFCs, I get the impression that the starting point is "this is a serious problem and we must solve it by introducing new syntax". That is not at all the starting point I would expect. Instead, I would expect that there would be a huge burden of proof on the new syntax, especially when it doesn't add anything new¹ to the language.

Have I missed something? Were there other pieces of syntax that only got introduced after a proper push back had been applied and hard questions got answered? Perhaps the questions got asked somewhere on this forum or in one of the myriad of GitHub comments?

¹ Since Rust is a Turing complete language, one could argue that no new syntax can really add anything new to the language. To me, this highlights that new syntax must be really, really clear and must fit really, really well with the existing syntax in order to be added.

leonardo · May 6, 2018, 10:10am

I don't think that's true. Almost every day you see new syntax proposed in the forum, and it gets shot down regularly.

mgeisler · May 6, 2018, 10:11am

Please read the above as a plea for keeping Rust a small and elegant language. It’s cool that the language improves, but please try to keep new syntax minimal and orthogonal.

Go is an interesting point on the spectrum: you can read Effective Go in an afternoon and afterwards you know almost the entire language! That’s pretty cool and I think it has contributed a lot to the success of the language. Now, Rust is more ambitious as a language than Go, so it is a bigger language. But it should still be kept as small as possible.

mgeisler · May 6, 2018, 10:15am

Yes, I know it does – just today I saw a proposal for letting || something denote a closure that throws away its arguments.

However, I still see changes that make me wonder about the general attitude towards new syntax in the language.

The struct short-hand in particular made me question the elegance of the language. To me, it’s fundamentally a terrible idea since it mixes things from completely different levels of abstraction – one is the name of a local variable, and the other is the declared name of a struct field. It feels weird to let those two scopes become intertwined in this way.

Centril · May 6, 2018, 11:06am

When I write RFCs; I sometimes consider if I should add a "do nothing" section to the alternatives; and I did at the beginning, but now I'm coming over to the idea that it is redundant noise for the reader since it is always an option to do nothing with an RFC.

I don't know how inclusive ranges are redundant; 0...n is getting deprecated and 0 .. (n + 1) is not equivalent. You can write RangeInclusive { start: 0, end: n }, but that seems unergonomic.
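To illustrate the non-equivalence, here is a minimal sketch. The helper names are made up for this example, and on current stable Rust the long-hand form goes through the `RangeInclusive::new` constructor rather than a struct literal:

```rust
use std::ops::RangeInclusive;

/// Counts the values in `0..=n` using the inclusive-range syntax.
fn count_inclusive(n: u8) -> usize {
    (0..=n).count()
}

/// Tries to express the same range as the half-open `0..(n + 1)`.
/// Returns `None` when `n + 1` does not fit in a `u8`.
fn count_half_open(n: u8) -> Option<usize> {
    n.checked_add(1).map(|end| (0..end).count())
}

fn main() {
    // For small bounds the two spellings agree.
    assert_eq!(count_inclusive(3), 4);
    assert_eq!(count_half_open(3), Some(4));

    // At the top of the type only the inclusive form can cover every value.
    assert_eq!(count_inclusive(u8::MAX), 256);
    assert_eq!(count_half_open(u8::MAX), None);

    // The long-hand spelling without the `..=` operator.
    let long_hand: RangeInclusive<u8> = RangeInclusive::new(0, 3);
    assert_eq!(long_hand, 0..=3);
}
```

The half-open spelling breaks down exactly when the upper bound is the largest value of the type, which is the case the inclusive form exists for.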

Feels like an obvious ergonomics win to me to not have to write Foo { x: x, y: y, z: z }. The DRY principle should apply over simplicity of the language. I am surprised this addition is controversial.

I find this really unpersuasive. Writing programs is first and foremost about communicating intent effectively and unambiguously in such a way that other humans understand them. There are plenty of Turing-complete languages you'd never want to write anything in, including Iota and Jot or the SKI combinator calculus.

Yes; you can learn Go in 1 hour, but this is mainly because you are already familiar with everything from other languages and because it has few constructions. That doesn't mean that programs written in Go are easier to understand or more correct or that it is easier to write programs in Go. I'd say the opposite is true; Go makes it hard to build abstractions that help you communicate effectively and preserve correctness. It is not a language I personally aspire to in language design. But of course this is only my opinion, which you are free to disagree with!

EDIT: I don't mean to bash Go; They have their philosophy and we have ours; But ours and theirs are radically different in my view.

That said; we should not frivolously add new lexical syntax, but try to build on as much of the old syntax as we got so that new language additions fit within the overall story of the language.

leonardo · May 6, 2018, 11:08am

And it was (rightly) shot down.

Repeating two times the names of the fields doesn't make the language less bug-prone. It's just useless redundancy. And I like this improvement.

Regarding the ..= syntax, a lot of people think it looks bad, but having a way to denote closed intervals is kind of necessary. And once you have open interval syntax it's kind of expected to have something for intervals closed on the right. You can argue that the open interval syntax is not necessary, and that's true; Python has range(), etc.

H2CO3 · May 6, 2018, 12:11pm

While I don’t mind the struct field shorthand syntax in particular (and in fact I use it quite regularly), I can understand your feelings about it. Actually, I’d go even further: it’s not only syntactic additions that should be harder to make to the language, but probably all sorts of changes.

Rust is a very unique language in the sense that it managed to avoid or get rid of most pieces of bad design found in other, existing languages. It’s currently the only systems programming language that I could honestly recommend to use without subsetting. From what I can tell, the overwhelming majority of proposals that attempt to change the language are made by newcomers, and they are exactly about bringing these misfeatures back into Rust from other languages that the proposer might be more familiar with.

Fortunately, many of these proposals meet the reasonable opposition of the more experienced user base; however, I don’t feel that relying on a set of enthusiastic forum visitors to stop the language from steering into a completely new direction (that sometimes sharply opposes its past design goals) is a sustainable enough approach. If eventually it comes to the situation that nobody challenges an otherwise obviously unreasonable proposal, because everybody got tired of the constant influx of ideas that try to make Rust more like C++ or JavaScript, then will that obviously bad proposal just be accepted?

I think that acceptance of new proposals and RFCs should work exactly the other way around. Since the effects of adding to and/or changing the language are severe, it should rather be the case that a proposal or an RFC defaults to being rejected unless there are significantly more of those who support it than those who don’t. (To be specific, for example I wouldn’t consider a support-to-oppose ratio of 2:1 good enough for this purpose.)

mgeisler · May 6, 2018, 12:12pm

Of course it's shorter, but I don't think this is everything in language design.

To me, this feature encourages you to couple two very different levels of abstraction: one is names of local variables and the other is names of struct fields. Changing a local variable from foo to bar now becomes a "replace foo with foo: bar (but only in struct initialization!)" exercise. It adds a new corner-case to something which should be very simple.
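A small sketch of the rename scenario I mean. The `Point` struct and the function names are invented for this example:

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Locals happen to share their names with the fields, so the
/// shorthand applies.
fn with_shorthand() -> Point {
    let x = 1;
    let y = 2;
    Point { x, y }
}

/// After renaming the local `x` to `horizontal`, the initializer has to
/// switch back to the long-hand `field: value` form for that field.
fn after_rename() -> Point {
    let horizontal = 1;
    let y = 2;
    Point { x: horizontal, y }
}

fn main() {
    // Both spellings build the same value; only the source text differs.
    assert_eq!(with_shorthand(), after_rename());
}
```

The compiler does catch a forgotten edit here (an unknown field `horizontal` is an error), so the cost is extra editing rather than silent breakage.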

I like Rust so much precisely because it's not Go. I used Go for 10 months at a previous job and there were lots of things I really disliked about it. Mostly that the language as such seems to be afraid of introducing abstractions, something which Rust is good at.

I'm just wishing for good, reusable abstractions, not one-off syntax that serves very small purposes -- if that makes sense?

Well, I've certainly never expected to have syntax for closed intervals

mgeisler · May 6, 2018, 12:19pm

Thanks for mentioning this -- I think it would be a good idea to always include it explicitly. Even though the RFC author is excited about his or her new idea, it is always important to be able to play "devil's advocate" for a bit in order to really test the idea.

So I would encourage putting a "do nothing" section into any RFC so the author gets into the right mindset and remembers to argue about the advantage of not implementing the RFC.

Centril · May 6, 2018, 1:13pm

This is exactly the way the RFC process does not work. It is not a popularity contest; and I hope it never will be. For language additions, it is rather based on finding consensus within the language team for something. The language team is then responsible for making sure that the objections of people towards the RFC are considered and the replies to those. If someone repeats one objection someone else has made, it should not count twice.

It's not being brief that is my primary concern; it is repeating information. For example, taking:

let binding = MyType {
    foo: foo,
    bar: bar,
    baz: baz,
};

you are repeating foo, bar and baz each twice. This is not the worst offender, but you still have redundancies.

Let's take another example:

let first_binding  = initial_value;
let second_binding = my_first_fun(first_binding);
let third_binding  = my_second_fun(second_binding);
let fourth_binding = my_third_fun(third_binding);

You are obscuring what is happening here with a bunch of temporaries, each of which appears twice, diagonally across consecutive lines.

Compare this to:

let result = my_third_fun . my_second_fun . my_first_fun $ initial_value

(This is Haskell syntax: it composes the functions and then applies the composed function to initial_value.)

or in Rust:

let result = initial_value.first().second().third();

Absolutely it does! I think we should try to design consistent syntax that fits well within a broader system.

I find that saying just "we can always opt to not do this" does not actually play devil's advocate. Instead, it is more important to actually find concrete drawbacks. In my own experience, just saying "do nothing" more often leads to not bothering to find concrete drawbacks.

gbutler · May 6, 2018, 2:01pm

I'm not sure you could ever expect better than a 2/3rds super-majority on any proposal. I'd say a 2:1 ratio (which is 2/3rds in favor) would be a good cut-off at least.

H2CO3 · May 6, 2018, 2:42pm

There was a Twitter conversation between Manishearth and me a couple of days ago where he specifically mentioned that “the language team isn’t a dictatorship” and that “RFCs work on community opinion”. This seems to oppose your assertion.

Centril · May 6, 2018, 2:59pm

Except I didn’t say it was a dictatorship; I said: “The language team is then responsible for making sure that the objections of people towards the RFC are considered and the replies to those.”

Some examples can be helpful:

phaylon · May 6, 2018, 3:18pm

I believe this would all work a lot nicer if the discussions and opinions of the language team weren’t rather secret.

If an RFC has many people disagreeing, either via comments or votes, that should at least trigger some more extensive discussion about why the language team still went ahead. At the moment it just feels like a dismissal of the people who are against things.

I would also love that if an implementation strategy outlined by an RFC, or another community discussion was changed at a later point by the language team, it would trigger a new RFC. A change in direction later on in a tracking issue just feels like things are being hidden.

mgeisler · May 6, 2018, 4:21pm

Centril:

It’s not being brief that is my primary concern; it is repeating information. For example, taking:
let binding = MyType {
    foo: foo,
    bar: bar,
    baz: baz,
};
you are repeating foo, bar and baz each twice. This is not the worst offender, but you still have redundancies.

Yeah, I don't like redundancies either. But about this concrete syntax, I'll be happy to try and flesh out a bit more why I think it doesn't fit the language:

As already mentioned, the names of local variables are suddenly influenced by names of structs defined elsewhere in the code. This mixes local concerns with more global concerns. I don't think there is any other feature in Rust where local variables are automatically used in this way.
To add to this, what exactly do x and y mean in T{x, y}? Are they variable names? Are they struct fields? It messes with my usual logic about what can be substituted for what in a program.

Normally, I can inline the value for x everywhere I see x used as a variable in my program. I cannot do that with this hybrid syntax. So I guess x is not actually an r-value here -- I guess it's then a struct field name? But then we end up in the spooky situation where mentioning a struct field name automatically looks for a value in the surrounding scope.
Given that I don't have a clear mental picture of what the symbols of my program mean, I start wondering where the boundaries are for this mechanism. Can I use a local variable holding a String to initialize a &str field?

I didn't know off the top of my head (which is a hint that the feature added some complexity to the language) but the answer is no. I tested with
```
struct T<'a> { name: &'a str }
```
and got a compilation error when trying to use the short-hand notation like this:
```
T{ &name }
```
This makes me feel that the syntax is quite specialized.
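For comparison, a minimal sketch of what does compile with the same struct. The function names are made up for this example:

```rust
struct T<'a> {
    name: &'a str,
}

/// Long-hand form: any expression that coerces to the field type is
/// accepted, so `&owned` (a `&String`) works for a `&str` field.
fn len_via_long_hand(owned: String) -> usize {
    let t = T { name: &owned };
    t.name.len()
}

/// Shorthand form: it only accepts a bare identifier, so a local with
/// the field's name and a compatible type has to be bound first.
fn len_via_shorthand(owned: String) -> usize {
    let name: &str = &owned;
    let t = T { name };
    t.name.len()
}

fn main() {
    assert_eq!(len_via_long_hand(String::from("World")), 5);
    assert_eq!(len_via_shorthand(String::from("World")), 5);
}
```

So the shorthand covers exactly the case "a local with the field's name", and anything else (a borrow, a literal, a call) falls back to the long-hand form.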
To the best of my knowledge, until this feature was introduced, you could always count on comma, separated, words to denote a positional construct. That is, item order mattered. This is true for function signatures and calls, tuples and tuple structs, vec![...] vector construction, and probably more...

Now, T{x, y} means the same as T{y, x}, but T(x, y) means something very different from T(y, x). This is a lack of consistency.
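A short sketch of that asymmetry, with two hypothetical types (`Named` and `Positional`) standing in for T:

```rust
#[derive(Debug, PartialEq)]
struct Named {
    x: i32,
    y: i32,
}

#[derive(Debug, PartialEq)]
struct Positional(i32, i32);

/// Builds the same named struct with the shorthand fields written in
/// both orders.
fn named_both_orders(x: i32, y: i32) -> (Named, Named) {
    (Named { x, y }, Named { y, x })
}

/// Builds a tuple struct with the arguments written in both orders.
fn positional_both_orders(x: i32, y: i32) -> (Positional, Positional) {
    (Positional(x, y), Positional(y, x))
}

fn main() {
    // Order inside the braces is irrelevant: fields are matched by name.
    let (a, b) = named_both_orders(1, 2);
    assert_eq!(a, b);

    // Order inside the parentheses is significant.
    let (c, d) = positional_both_orders(1, 2);
    assert_ne!(c, d);
}
```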
Further, I would hope that Rust can one day introduce something akin to Python keyword arguments for function calls: you explicitly mention the function arguments (in any order). That syntax could be very similar to the long-hand struct initialization syntax:
```
some_function(age: 123, bar: "Hello")
```
Here order shouldn't matter, so this would be the same as:
```
some_function(bar: "Hello", age: 123)
```
Notice how the short-hand struct initialization syntax has messed up the potential for a nice symmetry between initializing a struct and making a function call.

Without having thought too hard about this, my preference would have been to use a positional syntax here as well. It could perhaps have looked like this:

struct T<'a> {
    foo: i32,
    bar: &'a str,
}

let x = 123;
let y = "Hello";
let z = String::from("World");
T{x, y};      // same as T{foo: x, bar: y}
T{y, x};      // same as T{foo: y, bar: x}   -- compile error!
T{123, y};    // same as T{foo: 123, bar: y} -- literals just work
T{x, &z};     // same as T{foo: x, bar: &z}  -- expressions just work
T{x, bar: y}; // same as T{foo: x, bar: y}   -- mixed usage

Such a syntax would make the order of the fields part of the type -- I'm not sure if they are seen as such today? This would imply that reordering the fields would be a breaking change (just like reordering function arguments is a breaking change today).

Centril · May 6, 2018, 4:39pm

Some of your objections make sense to me; so it is not the slam dunk I thought it was...

The RFC mentions that:

Rust already allows similar syntax for destructuring in pattern matches: a pattern match can use SomeStruct { field1, field2 } => ... to match field1 and field2 into values with the same names. This RFC introduces symmetrical syntax for initializers.

In this light; I personally think it makes perfect sense from a consistency perspective, beyond being DRY.
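A minimal sketch of that symmetry, reusing the SomeStruct name from the RFC text. The `swap_fields` helper is invented for illustration:

```rust
#[derive(Debug, PartialEq)]
struct SomeStruct {
    field1: u32,
    field2: u32,
}

/// Takes a struct apart with the pattern shorthand and puts it back
/// together with the initializer shorthand, swapping the two values.
fn swap_fields(s: SomeStruct) -> SomeStruct {
    // Pattern shorthand: binds locals named after the fields.
    let SomeStruct { field1, field2 } = s;
    // Shadow the locals with their values exchanged.
    let (field1, field2) = (field2, field1);
    // Initializer shorthand: reads locals named after the fields.
    SomeStruct { field1, field2 }
}

fn main() {
    let swapped = swap_fields(SomeStruct { field1: 1, field2: 2 });
    assert_eq!(swapped, SomeStruct { field1: 2, field2: 1 });
}
```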

But the syntax Foo { field: var } will look up var in the surrounding scope; The field init shorthand syntax just eliminates one step. (Sidenote: field: var is mentally jarring and looks like type ascription; it should have been Foo { field = var }, but too late now...)

Here I agree; I think the syntax could be extended and accept &name and &mut name. Maybe it is inconsistent to allow this, but it is quite useful.

This one is tricky; this would actually make the argument names of all functions part of the signature (they aren't today..) and subject to semver instantly.

I've done some thinking on unnamed structs tho; but the current FRU mechanism is in the way: Unnamed structs · GitHub

This feels more brittle to me

rpjohnst · May 6, 2018, 5:12pm

The aspect of Go that was brought up here is the same aspect that was brought up in the throw RFC. It is about being economical, about presenting a simple and coherent whole, about not becoming a "feature zoo."

That aspect of Go is extremely admirable, and does not in any way contradict the ability to build abstractions. There are more languages than Go that work toward this; Go is just one of the more recent and popular ones.

Now, Rust has to have a larger idea budget. It provides more control over memory layout, and has the accompanying borrow checker. It already has generics, which are fairly complicated. But what it does have is already extremely powerful! We really ought to consider new (especially syntactic) features very carefully, not merely in terms of their own tradeoffs, but in terms of how they affect the system as a whole. That is, avoid becoming a "feature zoo."

rpjohnst · May 6, 2018, 5:12pm

As far as this particular feature is concerned, I could personally take it or leave it on the "feature zoo" metric. I like it a lot and use it regularly, and I will say that this doesn't feel like anything new or strange; for example, take pattern matching, which already allowed the inverse.

In general, I think the idea of "using the same name in more than one place to connect things" is actually a fairly widespread thing. It comes from ML-ish languages, where you often don't have a single "declaration" point for a name. In-band lifetimes match polymorphic function types in ML-family languages, for example.

MajorBreakfast · May 6, 2018, 5:14pm

I just want to say that I think that the language team is doing a great job.

What’s important is that new additions are true to Rust’s core principles. Here’s a list with a few (you’ve seen them all, I’m sure):

“Fearless concurrency”
“Fast, reliable, productive: Pick three”
“Stability without stagnation”
“Safety, speed, and concurrency”
“Stability as a Deliverable”
Edit: Uh, I forgot an important one: “Zero-cost abstractions”

I think that we shouldn’t be afraid of change. Also, sometimes an imperfect decision is better than no decision at all. But, as I said above, I think that the language team has done an admirable job in picking out the good ideas and the language is better for it.

Centril · May 6, 2018, 5:30pm

I don't disagree with being economical and coherent. Nor am I in favor of a feature zoo. What I think is important is the power to mental complexity ratio and consistency.

In Go it absolutely does; and I am not sure that the claim of coherency actually is true for that language. On the one hand, its proponents say that they favor explicitness, on the other, they have interfaces that magically get implemented without explicitly saying so.

Rust has much more expressive power, but is also limited from a Haskell or say Idris user's perspective. Fortunately, const generics, GATs, -> impl Trait, const fn, async fn, and proc macros are all being worked on.
