Placeholder Syntax

One thing I love from Scala is placeholder syntax:

An expression may contain embedded underscore symbols _ at places where identifiers are legal. Such an expression represents an anonymous function where subsequent occurrences of underscores denote successive parameters.

Placeholder syntax is a nice shorthand that can be used for anonymous functions. I have not had the time to write out a full RFC yet, so these are just some thoughts:

The anonymous functions in the left column use placeholder syntax. Each of these is equivalent to the anonymous function on its right:

_ + 1              |x| x + 1
_ * _              |x, y| x * y
_.map(f)           |x| x.map(f)
_.map(_.inner)     |x| x.map(|y| y.inner)

Placeholder syntax can be really useful when working with iterators. For example, this code:

users
   .iter()
   .filter(|x| x.any(|y| y.verified))
   .map(|x| x.name)
   .collect();

Would become the following with placeholder syntax:

users
   .iter()
   .filter(_.any(_.verified))
   .map(_.name)
   .collect();

I personally love the style of the above code, and do not think that it is sacrificing any readability. In fact, I find the second example with placeholder syntax more readable than the first.

_ already denotes something to be "inferred by the compiler", so placeholder syntax is consistent with existing language semantics.
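For reference, here is a runnable sketch of the closure-based version of the iterator example in today's Rust. The `User` and `Email` types and the `verified_names` function are assumptions made up for illustration; the snippets above do not define what `users` contains.

```rust
// Hypothetical types, invented for illustration only.
struct Email {
    verified: bool,
}

struct User {
    name: String,
    emails: Vec<Email>,
}

// Collects the names of users that have at least one verified email.
fn verified_names(users: &[User]) -> Vec<String> {
    users
        .iter()
        .filter(|u| u.emails.iter().any(|e| e.verified))
        .map(|u| u.name.clone())
        .collect()
}

fn main() {
    let users = vec![
        User { name: "ann".to_string(), emails: vec![Email { verified: true }] },
        User { name: "bob".to_string(), emails: vec![Email { verified: false }] },
    ];
    println!("{:?}", verified_names(&users));
}
```

Under the proposal, each `|u| ...` and `|e| ...` above is what a `_` would stand in for.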

I disagree about the readability (as I have never programmed in Scala, I would be very lost as to what it means).

I also don't think such a feature would be worth implementing: it is not that much shorter, and other than that I see no other benefit.

The biggest problem with such a feature is scoping. How do you know how big a scope the placeholder is supposed to turn into the closure?

Is foo(bar(_, 5)) translated to foo(|x| bar(x, 5)) (gives us partial application for free! Amazing!) or foo(bar(|x| x, 5))? Or even |x| foo(bar(x, 5))?

In fact, is _.name |x| x.name or (|x| x).name? (More actually illustrative, consider _.call(5), as that method name is actually available on closures.)

You can come up with rules, but they're going to disagree with a large number of people's first guess, no matter which rules you pick.

I personally aesthetically like _ closures, and spent a decent amount of effort trying to describe reasonable rules for how big the closure should be, to avoid surprises. I think it is possible to define clear rules that avoid surprising edge cases while remaining reasonably useful in all uses, but I also came to the conclusion that this does not fit a language such as Rust, which prioritizes locally obvious semantics.

Lack of overloading would generally mean there's only one valid (as in it type checks) interpretation that would compile, but I can't rule out the possibility of an actual semantic ambiguity.
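To make the ambiguity concrete, here is a sketch of the three readings of foo(bar(_, 5)) written with today's closures. All helper names are invented for illustration. Note that each reading needs differently typed helpers to compile, which is why type checking would usually leave only one valid interpretation.

```rust
// Hypothetical helpers; none of these come from a real library.
fn bar(x: i32, n: i32) -> i32 {
    x + n
}

// A `foo` that expects a closure (needed for reading 1).
fn foo_takes_fn(f: impl Fn(i32) -> i32) -> i32 {
    f(1)
}

// A `bar` that expects a closure (needed for reading 2).
fn bar_takes_fn(f: impl Fn(i32) -> i32, n: i32) -> i32 {
    f(n)
}

// A `foo` that expects a plain value (needed for readings 2 and 3).
fn foo_takes_value(v: i32) -> i32 {
    v * 2
}

// Reading 1: foo(|x| bar(x, 5)), i.e. partial application of `bar`.
fn reading_1() -> i32 {
    foo_takes_fn(|x| bar(x, 5))
}

// Reading 2: foo(bar(|x| x, 5)), i.e. the placeholder alone is the closure.
fn reading_2() -> i32 {
    foo_takes_value(bar_takes_fn(|x| x, 5))
}

// Reading 3: |x| foo(bar(x, 5)), i.e. the whole expression is the closure.
fn reading_3() -> impl Fn(i32) -> i32 {
    |x| foo_takes_value(bar(x, 5))
}

fn main() {
    println!("{} {} {}", reading_1(), reading_2(), reading_3()(1));
}
```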

I don't think _ is right for Rust, because in other contexts it means ignored/unused/whatever, and in closures it would be an actual non-ignored value.

Swift has:

numbers.sort { $0 > $1 }

that means numbers.sort(|a, b| a > b). So maybe something in that direction?
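As a side note, the Rust spelling above is a loose translation: `sort` takes no closure, and the comparator form is `sort_by`, which returns an `Ordering` rather than a `bool`. A minimal sketch of the descending sort in today's Rust, with `sort_descending` being a name made up for this example:

```rust
// Sorts in descending order, the effect of Swift's `{ $0 > $1 }` predicate.
fn sort_descending(mut numbers: Vec<i32>) -> Vec<i32> {
    // Comparing `b` against `a` reverses the natural ascending order.
    numbers.sort_by(|a, b| b.cmp(a));
    numbers
}

fn main() {
    println!("{:?}", sort_descending(vec![3, 1, 2]));
}
```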

Honestly I've not been that annoyed by the lack of placeholder syntax like this in Rust.

For comparison, it was really annoying in old C# when the way to make a delegate was delegate(string x) { Console.WriteLine(x); }, because that's just so wordy. But one of the newer versions added new syntax to make it just x => Console.WriteLine(x), and that's been fine. I could often just pass Console.WriteLine as a method group here, like passing Add::add instead of |a, b| a + b in Rust, but I'm using the full delegate here for a better parallel to the following Rust example.

Similarly, with Rust just using |x| println!("{}", x) it hasn't bothered me to need to name the value. It reminds me of how I'm not annoyed that I need the i in for i in 0..n -- that could also have a placeholder, but I think it's good it doesn't, and just using the by-convention typical name is fine.
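A small sketch of the Add::add comparison, assuming a simple sum over a slice (the function names are invented for the example). Both forms do the same thing; the path form just avoids naming the arguments.

```rust
use std::ops::Add;

// Explicit closure: the arguments are named `a` and `b`.
fn sum_with_closure(values: &[i32]) -> i32 {
    values.iter().copied().fold(0, |a, b| a + b)
}

// Function path: no argument names needed at all.
fn sum_with_path(values: &[i32]) -> i32 {
    values.iter().copied().fold(0, Add::add)
}

fn main() {
    let values = [1, 2, 3];
    println!("{} {}", sum_with_closure(&values), sum_with_path(&values));
    // Naming the value in a short closure is cheap.
    values.iter().for_each(|x| println!("{}", x));
}
```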

:100:

Especially when you consider that the placeholder might be getting passed to a macro, which makes everything way more complicated.

+1 for this syntax. Perhaps coming from a different background I found it surprising Rust doesn't have placeholder syntax for values, since it essentially has placeholder syntax for types (let v : Vec<_> = ....collect();).

Scoping should be fairly narrow. foo(bar(_, 5)) becomes foo(|x| bar(x, 5)), _.name becomes |x| x.name. While I know this transformation should not be type aware (that is, it should happen based just on the parse tree), I do think that the type system will catch the majority of mistakes, especially while learning. That is, foo(bar(_.name, 5)) expecting foo(|x| bar(x.name, 5)) is probably going to be a type error.

I'd love to see this in Rust.

It does:

let Some(_) = Some(2);
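
For anyone who wants to try this, a sketch of the two existing uses of _ in today's Rust: as an inferred type and as a wildcard pattern. As far as I know, a plain `let` with a refutable pattern such as `Some(_)` is rejected by the compiler, so the sketch uses `if let` instead. The function names are made up for the example.

```rust
// `_` in type position: the element type is inferred by the compiler.
fn collect_inferred() -> Vec<i32> {
    let v: Vec<_> = (1..=3).collect();
    v
}

// `_` in pattern position: the payload is matched but not bound to a name.
fn has_value(opt: Option<i32>) -> bool {
    if let Some(_) = opt {
        true
    } else {
        false
    }
}

fn main() {
    println!("{:?} {}", collect_inferred(), has_value(Some(2)));
}
```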

This could be solved with explicit delimiters, which could actually make the end of the expression more apparent than it is with the existing |x| expr notation in some cases. The trouble is we're out of delimiters ... but not if we leave ASCII.

foo(bar(⟦_⟧, 5))     foo(bar(|x| x, 5))
foo(⟦bar(_, 5)⟧)     foo(|x| bar(x, 5))
⟦ _.call(5) ⟧        |x| x.call(5)
⟦ _ ⟧.call(5)        (|x| x).call(5)

You want more than one argument? Underscores not descriptive enough?

⟦ x, y | x + y ⟧     | x, y | x + y

(Not entirely joking.)

(Could this be implemented with a proc macro? #[placeholder_closures] enables this notation for the subsequent item...)

No, attribute macros have to take valid syntax as an input.

But this can't be handled by a bang macro, either, as the non-ASCII characters are unrecognized tokens and rejected even in positions without any syntax.

(e.g. macro_rules! m { (⟦_⟧) => {}; } is an error.)

You could probably do a proc macro to make m!(_ + 5) and such work, though. In that case I'd make the bang macro invocation the extent of the closure. An attribute macro doesn't work, again because it's currently not valid syntax, as _ is not a valid expression.

(News to me actually; I thought it was syntactically accepted but semantically rejected as an expression, as a reserved identifier, but it's also rejected in #[cfg]'d out code.)
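A rough sketch of the bang-macro idea, using a declarative macro rather than a proc macro. The macro name `p!` is made up, and it only recognizes two fixed shapes; a real proc macro would have to walk arbitrary expressions. The macro invocation itself marks the extent of the closure.

```rust
// Hypothetical macro, illustration only: handles exactly two shapes.
macro_rules! p {
    // `_ * _`: two placeholders become two parameters.
    (_ * _) => {
        |a, b| a * b
    };
    // `_ + expr`: one placeholder becomes one parameter.
    (_ + $rhs:expr) => {
        |x| x + $rhs
    };
}

// Adds five to every element via the placeholder-style macro.
fn add_five(values: Vec<i32>) -> Vec<i32> {
    values.into_iter().map(p!(_ + 5)).collect()
}

// Multiplies all elements together via the two-placeholder form.
fn product(values: Vec<i32>) -> i32 {
    values.into_iter().fold(1, p!(_ * _))
}

fn main() {
    println!("{:?} {}", add_five(vec![1, 2, 3]), product(vec![2, 3, 4]));
}
```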

Yes, but. Wouldn't this defeat the whole point of the proposal? I'd rather not have two, approximately equally complex syntaxes for closures. Once we start writing ⟦ x, y | x + y ⟧, we might as well stick with |x, y| x + y.

It was not an entirely serious proposal, but the virtue I see in it is entirely due to the visually-obvious ending delimiter for the closure. As evidence for the value of this, see the earlier draft of my post in which I wrote that foo(bar(|x| x, 5)) would correspond to foo(bar(⟦_, 5⟧)) instead of foo(bar(⟦_⟧, 5)).

... Come to think of it, would the entirely ASCII syntax

    { arg, arg, ... | STATEMENTS }

be ambiguous with any other existing syntax in the language? With or without the additional frill of allowing {| _ + 5} to be shorthand for { _ | _ + 5 }. This would, hypothetically, replace the existing

    | arg, arg, ... | EXPR or BLOCK

closure syntax. I'm guessing it's a non-starter because it's ambiguous with | as a binary operator.

One of these? %)

\[ _ + 3 ]
.[ _ + 3 ]
^[ _ + 3 ]
~[ _ + 3 ]

The reason | works as a closure delimiter is that, as a binary operator, there's no valid reason to start an expression with |. So the solution to | being a binary operator already exists: start your closure declaration with |. But then you're quickly on the road to existing closure syntax.

If you want to create a closure that omits naming its arguments, why not just do something like |..| _ + 3? Creating some alien prefix or bracket undermines the whole point of trying to simplify closure expressions.

But most of these simple closures only have one argument, so that doesn't make it shorter?

Also, a general downfall of all the "each _ is a separate closure parameter" proposals is that quite often, especially with iterators, you work with an n-tuple instead of n individual arguments, which is impossible with the shorthand (_.0 + _.1 would not work).
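A sketch of the tuple case with today's closures, using `zip` as the typical source of pairs (function names invented for the example). In both versions the closure has one parameter that is used twice, whereas _.0 + _.1 would denote two separate parameters.

```rust
// Destructures the pair in the parameter list.
fn pairwise_sums(xs: &[i32], ys: &[i32]) -> Vec<i32> {
    xs.iter().zip(ys.iter()).map(|(a, b)| a + b).collect()
}

// Same closure, but naming the tuple and using both of its fields.
fn pairwise_sums_by_field(xs: &[i32], ys: &[i32]) -> Vec<i32> {
    xs.iter().zip(ys.iter()).map(|pair| pair.0 + pair.1).collect()
}

fn main() {
    println!("{:?}", pairwise_sums(&[1, 2], &[10, 20]));
    println!("{:?}", pairwise_sums_by_field(&[1, 2], &[10, 20]));
}
```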

C trigraphs: ??( _ + 3 ??). (No, I'm not seriously suggesting this.)

To be clear, I am mostly okay with the existing closure syntax, and my original post was tongue-in-cheek. But there is one thing I genuinely don't like about the existing closure syntax: the lack of an explicit ending delimiter for single-expression closures. If not for the obvious problems with using non-ASCII punctuation, I would call ⟦ x, y | x + y ⟧ an unambiguous improvement on | x, y | x + y just because of that.

And explicit delimiters also solve the problem raised upthread about not knowing where a placeholder-based closure ends, so hey.

You can optionally use {} as a delimiter if you want/it's unclear, though, right?

|x, y| { x + y }

Rustfmt will eat those, although it won't eat

{ |x, y| x + y }
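
Both spellings are valid today and behave the same; a quick sketch to check, with made-up function names. The compiler may emit an unused-braces warning for the second form, but as far as I can tell it still compiles.

```rust
// Braces around the closure body.
fn add_with_inner_braces(x: i32, y: i32) -> i32 {
    let add = |x: i32, y: i32| { x + y };
    add(x, y)
}

// Braces around the whole closure expression.
fn add_with_outer_braces(x: i32, y: i32) -> i32 {
    let add = { |x: i32, y: i32| x + y };
    add(x, y)
}

fn main() {
    println!("{} {}", add_with_inner_braces(1, 2), add_with_outer_braces(1, 2));
}
```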

Why are numbers better than letters? I.e., why would

`{ `1 + `2/`1 }

be better than

|x, y| x + y/x

?

The point here isn't to come up with a new syntax. The point is to come up with a better syntax (for something that, IMO, is already very neat and succinct).

I could be wrong about this, but given the constraints the Rust language is under (in this case specifically making the semantics of a given piece of code obvious in a local context), I'd say that the current closure syntax is either some kind of (local) optimum, or quite close to it.

The various suggestions I've seen here suffer from various issues:

  1. Syntax unsupported by the parser. It could be added, but then the issue becomes that we might as well stick with the current syntax.
  2. _-based syntax suffers from ambiguous scoping rules, and using it in macros would be a potent recipe for bugs; that is, those two language features would not compose well.
  3. None of the options I've seen is all that much shorter than the current closure syntax. Combined with the previous issues, it raises the question of whether it's worth spending engineering effort on when something like GATs, full const generics (or, dare to dream, full-blown dependent typing) or other language-based features would yield a much better ROI. This matters because even though volunteers are joining the Rust community, the resources available to the language team in particular are still quite limited from what I can see. And this makes sense: working on the language part of Rust isn't exactly low-hanging fruit and requires quite a bit of expertise.