Compact closure args

Rust’s closure syntax is a little on the heavy side:

map(|x| x.frobnicate())

It stutters the arg name, and the pipes have lost some of their appeal with the removal of the do syntax.

Compare to Scala’s compact form:

map(_.frobnicate())

Groovy has something similar, the it keyword:

map(it.frobnicate())

I’ve seen this mentioned for Rust in passing in other places, but I haven’t seen it discussed head-on yet, so: how about a compact syntax for closure args?

I’m personally not crazy about _ or it, especially for multi-arg closures. I was thinking something like:

map($0.frobnicate())

which would allow you to specify positional arguments out-of-order, though I’m definitely not tied to this particular sigil or syntax and would be fine with _, it, or something else.

Ruby also uses |x|, and I’d challenge the assertion that |x| is significantly heavier than _.

But anyway, in general, Rust is about being explicit, so I don’t think that this gains very much, and adds more implicitness.

For this specific case, I would prefer (and I think Rust already allows):

map(frobnicate)

The hard part is if you want to partially apply some arguments.
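A minimal sketch of that in current Rust, assuming a hypothetical Widget type (its frobnicate and scale methods are invented here for illustration): a method path can be passed directly when the signature lines up, but fixing one argument still needs an explicit closure.

```rust
// Hypothetical type, used only to have something to call methods on.
struct Widget {
    id: u32,
}

impl Widget {
    fn frobnicate(&self) -> u32 {
        self.id * 2
    }

    fn scale(&self, factor: u32) -> u32 {
        self.id * factor
    }
}

// No closure needed: the method path already has the shape `map` wants.
fn frobnicate_all(widgets: &[Widget]) -> Vec<u32> {
    widgets.iter().map(Widget::frobnicate).collect()
}

// Partially applying `factor` is where the explicit closure comes back.
fn scale_all(widgets: &[Widget], factor: u32) -> Vec<u32> {
    widgets.iter().map(|w| w.scale(factor)).collect()
}
```

Calling frobnicate_all on widgets with ids 1 and 2 gives [2, 4]; scale_all with a factor of 3 gives [3, 6].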

"But anyway, in general, Rust is about being explicit, so I don't think that this gains very much, and adds more implicitness."

I would say that Rust is about being correct. If it were about being explicit, then why doesn't it require us to annotate all the types, all of the captures, and so on? (In English: I don't think this is a strong argument.)

I think when Steve says "Rust is about being explicit" he means more accurately that Rust is about minimizing magic. It attempts to not trade significant magic for convenience.

Benjamin Striegel said on the mailing lists back in December:

Every time that we've added magic, we've lived to regret it and ultimately revert it. If there exists only a single lesson of the past few years of language evolution, let it be this.

_.frob() is 4 fewer characters and $0.frob() is 3 fewer characters than |x| x.frob(); they’re objectively lighter-weight.

I should be clear, I don’t think the current pipes are horrible. I just think closures in particular benefit from terseness, especially in chained invocations.

What’s implicit about this? You already don’t have to annotate argument types for closures.

How is this ‘magic’? It’s just taking closure args via a different means.

I don’t know what _ in a closure means. I guess it’s an implicit default variable (doesn’t Perl do something like this?). But then what happens with multiple arguments? It’s not at all obvious. There’s no natural way to extend that, because there’s a magic variable that has behavior you have to memorize before using.

With the | syntax it’s slightly more keypresses, but variable definitions work mostly the same way as in functions (bar not having to annotate types): You list the arguments, and you put commas between them. If I see a closure and it has |x| and I want to add another argument to it, nobody has to explain to me that I should try |x, y|: it’s just the natural thing to do. No, I have not written Ruby before (I program for scientists, and thus favor Python).

Perhaps you could make a slightly stronger case for $narg. But then you lose out on both (1) knowing exactly how many arguments you have and (2) being able to annotate the types of your arguments.

If your case is going to be “this is equally/less implicit/magic than the current syntax” I think you will find it hard to make that point.
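To illustrate the point about pipes mirroring function parameters, here is a small sketch in current Rust. The helpers apply1 and apply2 and the function add are made up for this example; the closure forms themselves are ordinary Rust.

```rust
// Hypothetical helpers that simply call the closure they are given.
fn apply1(f: impl Fn(i32) -> i32, a: i32) -> i32 {
    f(a)
}

fn apply2(f: impl Fn(i32, i32) -> i32, a: i32, b: i32) -> i32 {
    f(a, b)
}

// An ordinary function with the same parameter list as the closures below.
fn add(x: i32, y: i32) -> i32 {
    x + y
}

fn closure_forms() -> [i32; 4] {
    [
        // One argument between the pipes.
        apply1(|x| x + 1, 1),
        // Adding a second argument is just another name and a comma.
        apply2(|x, y| x + y, 1, 2),
        // Types can be annotated exactly as in a function signature.
        apply2(|x: i32, y: i32| -> i32 { x + y }, 1, 2),
        // A plain function fits the same slot.
        apply2(add, 1, 2),
    ]
}
```

With these inputs closure_forms returns [2, 3, 3, 3].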

Ok, yeah, _ in Scala is mysterious for multiple args, which is why I said I personally don’t like it. I’m primarily referring to the $n syntax, which doesn’t strike me as magic, just different; I only mentioned the others on the off chance people would be more receptive to aping prior art.

This is only intended for closure args, not free closures, so you know how many arguments you have via the caller, statically enforced by the compiler. If you want to annotate the types of your arguments, just use the regular syntax; this is sugar only.

(Cf http://discuss.rust-lang.org/t/auto-currying-in-rust/149/18)

Maybe you could use “|n” as the expression, so that no other symbol is spent on this: “|” is already associated with closures, and a numeric literal following “|” would not be ambiguous. I think that syntax would also be interesting for nested lambda expressions. Consider this ridiculously absurd example:

map( |0.filter( ||0.getNumber() != |0.getForbiddenNumber() ) )

How is it known what should be closurefied?

Edge cases:

  • map(foo($0 + 1)): is it |x| map(foo(x + 1)), map(|x| foo(x + 1)), map(foo(|x| x + 1)) or map(foo((|x| x) + 1))?
  • something(some_iter.all($0)), is that something(|x| some_iter.all(x)) or something(some_iter.all(|x| x))
  • let a = { let b = $0 + 1; ... }, is it let a = |x| { let b = x + 1; ... } or let a = { let b = |x| x + 1; ... }

Basically, as soon as you have nested expressions (which is every Rust program ever), there needs to be a rule for what expressions are in the closure and what are not.

Sure, you’d just make a rule. ‘Tightest fn that requires a closure arg’ seems reasonable. So for your cases:

  • map(foo($0 + 1)) is map(foo(|x| x+1)) if foo takes a closure arg; map(|x| foo(x + 1)) if it doesn’t.
  • something(some_iter.all($0)) is something(some_iter.all(|x| x))
  • let a = { let b = $0 + 1; ... } isn’t allowed b/c it’s not a closure in arg position.

I think you’re overstating how often this would be ambiguous or confusing; nested expressions are common, but fns that take closure args that themselves take closure args are not.

The cases that benefit the most are chained transforms. Something like

foo.iter().map(|x| x.bar).filter(|x| x > 0).collect()

becomes

foo.iter().map($0.bar).filter($0 > 0).collect()

In the single-arg case, the Scala syntax is even nicer:

foo.iter().map(_.bar).filter(_ > 0).collect()

but it falls apart with multiple args. I’d be happy with only making the sugar available for single-arg closures, as well, which would make the feature simpler while still covering a lot of use cases.
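For reference, a compilable version of that chain in current Rust might look like the sketch below. Item and its bar field are hypothetical stand-ins for whatever foo holds. One detail the shorthand glosses over: filter hands its closure a reference, so the comparison needs a dereference (or a |&x| pattern), which a compact form would presumably have to account for too.

```rust
// Hypothetical element type with the `bar` field used in the chain.
struct Item {
    bar: i32,
}

fn positive_bars(foo: &[Item]) -> Vec<i32> {
    foo.iter()
        // `x` is `&Item` here; field access auto-dereferences.
        .map(|x| x.bar)
        // `x` is `&i32` here, so it is dereferenced before comparing.
        .filter(|x| *x > 0)
        .collect()
}
```

Given items with bar values -1, 3, 0 and 7, positive_bars returns [3, 7].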

I'm not sure if introducing name resolution & type checking into choosing how a syntactic construct is 'defined' is a good idea. Maybe it's not such a problem in practice, but it is certainly magical, since you can't tell what's going on without checking type signatures, and even trait implementors, e.g. the map(foo(...)) example could have foo<T: SomeTrait>(...) and then behaviour would differ based on whether any closure types implement SomeTrait.

Rust already requires you to check signatures to figure out what’s happening in a lot of cases, because of type inference. And isn’t, for instance, ‘a + b’ a syntactic construct that’s defined in terms of name resolution and type checking? I’m not a compiler writer so the distinction isn’t immediately clear to me.

As far as I can tell it’d be the only place in the grammar where you need to do typechecking to know how to parenthesize an expression, i.e. how to build the AST.

This would be the only dependence on type signatures for control flow, i.e. when each of foo, bar and baz are executed in foo(bar(baz($0 + 1))) changes based on the precise type signatures of those functions.
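The control-flow point can be made concrete with today's explicit closures. This is only a sketch: Rust has no overloading, so the two readings of foo(bar(baz($0 + 1))) need differently typed stand-ins (foo_closure/bar_value versus foo_value/bar_closure), all of which are invented here, as is the shared log that records call order.

```rust
use std::cell::RefCell;

// Shared log of which function ran, in order.
type Log = RefCell<Vec<&'static str>>;

fn baz(log: &Log, n: i32) -> i32 {
    log.borrow_mut().push("baz");
    n * 2
}

// Stand-ins for reading A: foo(|x| bar(baz(x + 1)))
fn bar_value(log: &Log, n: i32) -> i32 {
    log.borrow_mut().push("bar");
    n + 10
}

fn foo_closure(log: &Log, f: impl Fn(i32) -> i32) -> i32 {
    log.borrow_mut().push("foo");
    f(1)
}

// Stand-ins for reading B: foo(bar(|x| baz(x + 1)))
fn bar_closure(log: &Log, f: impl Fn(i32) -> i32) -> i32 {
    log.borrow_mut().push("bar");
    f(1) + 10
}

fn foo_value(log: &Log, n: i32) -> i32 {
    log.borrow_mut().push("foo");
    n
}

// foo runs first and decides when (and whether) bar and baz run.
fn reading_a() -> (i32, Vec<&'static str>) {
    let log = Log::default();
    let result = foo_closure(&log, |x| bar_value(&log, baz(&log, x + 1)));
    (result, log.into_inner())
}

// bar runs eagerly, before foo is ever entered.
fn reading_b() -> (i32, Vec<&'static str>) {
    let log = Log::default();
    let result = foo_value(&log, bar_closure(&log, |x| baz(&log, x + 1)));
    (result, log.into_inner())
}
```

Both readings compute 14 here, but reading A logs foo, baz, bar while reading B logs bar, baz, foo, so the order of side effects depends on where the closure boundary falls.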

For the record it seems Scala doesn’t avoid this issue either.

The suggestion in that comment is to have a separate marker to introduce the lambda, which, while slightly reducing compactness, doesn’t seem unreasonable.

E.g., just as a strawman to illustrate, if we use @ to introduce “compact closures”, and _ for implicit arguments:

foo(@bar(_ + 1))

would be the equivalent of:

foo(|x| bar(x + 1))

(Again, this is not a proposal - just a demonstration.)

One particular tactic which doesn’t seem like it would be horrible would be to use the same symbol both to introduce the lambda and then as a sigil for its positional arguments.

There are four reasonable possibilities I can think of for this symbol, each with different potential conflicts.

Each of the below examples is equivalent to: sort_by(|x0, x1| foo(x0) > foo(x1))

  • sort_by(@ foo(@0) > foo(@1))

This likely conflicts with e.g. the proposed use of @ for attributes (but there are many potential uses for @).

  • sort_by($ foo($0) > foo($1))

This likely conflicts with macros. (I’m not super familiar with macros but I know they use $.)

  • sort_by(? foo(?0) > foo(?1))

This likely conflicts with the proposed postfix ? for exception propagation.

  • sort_by(\ foo(\0) > foo(\1))

This likely conflicts with the idea of a \foo b for infix function calls.

Potentially (at least for some of these) we could also allow just the sigil without a number, or _, in which case the numbering would be implicit based on the order they appear in the code. (Probably the two would be mutually exclusive, i.e. either all arguments have to be numbered or none.) EDIT: Actually this would likely be ambiguous with the very idea of using the sigil to introduce the lambda… but _ would still be OK. (Unless we use that for default arguments or something.)
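For comparison, the explicit baseline that all four variants stand for can be written in current Rust roughly as below. Two things here are assumptions of this sketch rather than part of the examples above: foo is a made-up key function (absolute value), and the comparison uses cmp because the standard library's sort_by expects the closure to return an Ordering rather than the bool that > would produce.

```rust
// Hypothetical key function; any function from an element to a comparable key would do.
fn foo(x: &i32) -> i32 {
    x.abs()
}

// Explicit two-argument closure, the form the sigil variants would abbreviate.
fn sort_by_foo(values: &mut [i32]) {
    values.sort_by(|x0, x1| foo(x0).cmp(&foo(x1)));
}
```

Sorting [3, -1, -2] this way yields [-1, -2, 3].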

Of course, we could just take the KISS route, which would most likely be the best one, and define simple mechanical rules about the scoping of implicit closures. For instance:

The scope of an implicit closure extends to EITHER

  • the innermost function or method call,

OR

  • the innermost expression consisting of only unary and binary operators

whichever comes first, with a placeholder _ in place of one or more subexpressions. This translates to a lambda with a number of arguments equal to the number of placeholders, which are substituted in their place left-to-right.

And if you want to do anything more complicated, write an explicit lambda.

Some examples of how the above would work:

  • foo(f(x, _)) => foo(|a| f(x, a))

  • foo(_.f(x, y)) => foo(|a| a.f(x, y))

  • foo(x + _ + z) => foo(|a| x + a + z)

  • foo(f(_, g(x))) => foo(|a| f(a, g(x)))

  • foo(f(x, g(_))) => foo(f(x, |a| g(a)))

  • foo(f(_, g(_))) => foo(|a| f(a, |b| g(b)))

  • foo(f(_, _)) => foo(|a, b| f(a, b))

  • foo(x * _ + _) => foo(|a, b| x * a + b)

  • foo(f(x, _ * y + z)) => foo(f(x, |a| a * y + z))

  • foo(f(x) * _ + z) => foo(|a| f(x) * a + z)

  • foo(f(_) * y + z) => foo((|a| f(a)) * y + z)

  • foo(f(_) * _ + z) => foo(|a| (|b| f(b)) * a + z)

And so on. And if that’s not what you wanted, the typechecker will let you know.

(Maybe not these exact rules; these are just the first I thought of. But the point is to keep them simple, predictable, and mechanical.)
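A few of the right-hand sides above, written out as they would have to be typed today. This is a sketch under assumptions: foo1, foo2 and foo_value stand in for a foo that expects a one-argument closure, a two-argument closure, or a plain value respectively (Rust has no overloading), and f, f_closure and g are arbitrary made-up functions.

```rust
// Stand-ins for `foo`, one per shape of argument it might expect.
fn foo1(c: impl Fn(i32) -> i32) -> i32 {
    c(5)
}

fn foo2(c: impl Fn(i32, i32) -> i32) -> i32 {
    c(5, 7)
}

fn foo_value(n: i32) -> i32 {
    n
}

// Arbitrary functions playing the roles of `f` and `g`.
fn f(a: i32, b: i32) -> i32 {
    a - b
}

fn g(a: i32) -> i32 {
    a * 10
}

// A variant of `f` whose second parameter is itself a closure.
fn f_closure(a: i32, c: impl Fn(i32) -> i32) -> i32 {
    a - c(5)
}

fn desugared(x: i32) -> [i32; 4] {
    [
        // foo(f(x, _)): the placeholder sits directly in the call to f.
        foo1(|a| f(x, a)),
        // foo(f(_, g(x))): g(x) has no placeholder, so it stays inside the closure body.
        foo1(|a| f(a, g(x))),
        // foo(x * _ + _): two placeholders in one operator expression, filled left to right.
        foo2(|a, b| x * a + b),
        // foo(f(x, g(_))): the closure stops at the innermost call, g.
        foo_value(f_closure(x, |a| g(a))),
    ]
}
```

With x = 2 this evaluates to [-3, -15, 17, -48].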

I personally prefer explicit syntax. It feels like being explicit is The Rust Way, so I don’t see a lot of gain from a change like that.