Idea: |> operator for postfix calls to macros and free functions

Just one more footgun caused by Rust's not requiring fp<N> literals to have a digit on both sides of the decimal point when a decimal point is present. That decision was certainly one of the class of "let's save typing one character, and the hell with the visual ambiguity" (e.g., vs a method call).

6 Likes

I was recently surprised by things like 1. being accepted as float literals, because I have some memory of Rust not accepting them as FP literals sometime in 2014, but https://rust.godbolt.org says that Rust 1.0 already accepts them.
I may be wrong and can't check right now, so perhaps those are false memories, but perhaps someone actually thought it was a good idea to support this and intentionally changed the behavior right before 1.0.
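
For reference, a small sketch of how current stable Rust reads the two look-alike forms. The function names are made up for illustration, and nothing here says anything about pre-1.0 behavior.

```rust
/// `1.` is a floating-point literal; the return type makes it an `f64`.
fn trailing_dot_literal() -> f64 {
    1.
}

/// Here the dot begins a method call on the integer literal `1`.
fn method_call_on_integer() -> String {
    1.to_string()
}

fn main() {
    assert_eq!(trailing_dot_literal(), 1.0);
    assert_eq!(method_call_on_integer(), "1");
}
```

The lexer decides between the two by looking at what follows the dot: an identifier start means "integer plus method call", anything else means "float literal".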

2 Likes

I think the answer to that is "yes", but it has to be called as a function with UFCS. For example:

a |> MyStruct::my_method(b,c)

a |>my_method(b,c) would not be valid, unless there is a free function called "my_method" in scope.

Which brings up another point I forgot to mention: I think the call position should specifically allow using full paths, not just the names of the functions (or macros).
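
To spell out what those two forms would presumably desugar to, here is a sketch in today's Rust. The struct body, the free function, and the assumption that the left-hand side simply becomes the first argument (here passed as `&a`) are all illustrative; the thread does not specify how referencing would work.

```rust
struct MyStruct {
    base: i32,
}

impl MyStruct {
    // Associated function with a receiver: callable as a method or by path.
    fn my_method(&self, b: i32, c: i32) -> i32 {
        self.base + b + c
    }
}

// A free function with the same name; it does not clash with the method.
fn my_method(a: &MyStruct, b: i32, c: i32) -> i32 {
    a.base * b * c
}

fn main() {
    let a = MyStruct { base: 2 };
    // Assumed meaning of `a |> MyStruct::my_method(b, c)`:
    assert_eq!(MyStruct::my_method(&a, 3, 4), 9);
    // Assumed meaning of `a |> my_method(b, c)` with the free function in scope:
    assert_eq!(my_method(&a, 3, 4), 24);
    // Ordinary method syntax still resolves to the associated function.
    assert_eq!(a.my_method(3, 4), 9);
}
```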

1 Like

Your writing this with minimal spacing around the .> operator, and all the talk of method syntax, concerns me, because it makes me think you're thinking of this as a variant of the . operator, which is very wrong for the data-crunching use cases I care about. Data-crunching pipelines want a lowest-precedence pipeline operator, one that should be thought of as an alternative to assigning intermediate results to a temporary variable; not a high-precedence operator that is thought of as an alternative call syntax. (I don't know what use cases you have in mind.)

For instance, in data-crunching contexts it may make sense to write a bare lambda as an element in one of these pipelines,

let alpha = 0.1; 
let crunched = raw .> |m| m * alpha .> filter(|e| e > 0.01) .> ...;

where the second .> operator and everything that follows should not be part of the lambda expression.

(Apart from this I am indifferent between |> and .>.)

2 Likes

@zackw Good that you bring that up. @tmccombs’s proposal proposes a call style similar to method calls, but leaves the . method call syntax itself as it works today. Consequently similar precedence as . is intended. |> suggests low precedence, doesn’t it? That’s unintuitive if it actually isn’t intended to be low precedence.

Your use case can be implemented with an iterator and functional programming or, if your dataset is large, with a parallel iterator from rayon. This syntax (with precedence similar to .) is intended to be quite useful for iterators. @Centril came up with the whole controversial idea about changing the method call syntax because it would be useful for crates like itertools in this post (I think that’s where the idea first popped up).
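
A rough sketch of that iterator version, assuming the pipeline is meant to scale each element and then drop small values. The function name `crunch` and the slice-of-`f64` input are assumptions; the original snippet does not say what `raw` is.

```rust
/// Scale every element by `alpha`, then keep the values above `threshold`.
fn crunch(raw: &[f64], alpha: f64, threshold: f64) -> Vec<f64> {
    raw.iter()
        .map(|m| m * alpha)
        .filter(|e| *e > threshold)
        .collect()
}

fn main() {
    // alpha = 0.5 keeps the arithmetic exact for this demonstration.
    let crunched = crunch(&[0.01, 1.0, 4.0], 0.5, 0.01);
    assert_eq!(crunched, vec![0.5, 2.0]);
}
```

Each closure here is delimited by the parentheses of the method call, so there is no question about where it ends.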

Regarding:

let crunched = raw .> |m| m * alpha .> filter(|e| e > 0.01) .> ...;

Haskell has such an operator & (infixl 1, defined in Data.Function) such that you can write:

expr & \a -> a + 1 & \a -> a + 1

However, I don't find that it is used often, partly because it is hard to tell from the operator alone where the lambda ends. I think .> in this example suffers the same problem.

My proposal can be decomposed into two independent parts:

  1. allow method syntax for free functions and associated functions.
  2. allow paths in method syntax, i.e. foo.Arc::clone() etc.

They interact well with each other, but they are orthogonal.

Correct me if I am wrong, but I think what is controversial is 1. but not 2.

As I elaborated in the post you linked to, UMCS can be made to work well in documentation by listing relevant free functions that take the type as their first argument (or a reference to it..) at the type. A more general solution is to list all functions that mention the type in some page easily reachable from the type's documentation. This is sorely needed today anyways in my experience.

Associated functions of a trait with no type parameters can't be called with method syntax, however, they do "belong" to the type. I think most existing free functions also naturally bias towards one type as a receiver, or can be made to do when encoding the function. "Belonging" is too fuzzy a notion to have such rigid rules as we have right now. I also don't think the mental model of belonging goes away when letting free functions be called as methods. I know this, because when writing in Haskell, there are only free functions, but yet, I do think about which type to put first and what function belongs to what type all the time.

1 Like

How so? Do you mean if you put closure definitions in there? I think, if possible at all (currently not planned), that should require parentheses.

  • Yes, the first wins the prize for being controversial
  • The second one is a bit controversial. It's a rarely used syntax. Personally, I find Arc::clone(&foo) is easier to understand. foo.Arc looks like Arc was a field on foo. The backwards order seems unintuitive. Feature-wise the current solution has no auto-deref, but that's no big deal because it is hardly ever used. I'm not convinced that the problem which the proposed syntax tries to solve exists.

(Edit: I've edited the text of the second point)

Given the snippet due to @zackw:

let crunched = raw .> |m| m * alpha .> filter(|e| e > 0.01) .> ...;

It is hard to tell whether to interpret this as:

let crunched = raw .> (|m| m * alpha) .> filter(|e| e > 0.01) .> ...;

or:

let crunched = raw .> |m| m * (alpha .> filter(|e| e > 0.01) .> ...);

If you require parentheses, then it is not an infixl 1 operator (1 is the second-lowest precedence level in Haskell, where fixity levels run from 0 to 9; l means left-associative).
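
For comparison, today's Rust already settles the analogous question for ordinary binary operators: an unparenthesized closure body extends as far to the right as it can, which corresponds to the second reading. How a new .> operator would interact with that rule is not decided anywhere in this thread; the sketch below (with made-up function names) only shows current behavior.

```rust
fn greedy_body(x: i32) -> i32 {
    // No parentheses: `* 2 + 1` is all part of the closure body.
    let f = |m: i32| m * 2 + 1;
    f(x)
}

fn delimited_body(x: i32) -> i32 {
    // Parentheses end the closure at `)`, so `+ 1` applies to the call result.
    (|m: i32| m * 2)(x) + 1
}

fn main() {
    // If `+ 1` were outside the closure, `greedy_body` would not type-check.
    assert_eq!(greedy_body(0), 1);
    assert_eq!(delimited_body(5), 11);
}
```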

UFCS in the form of <Type as Trait>::fun(recv, args..) is itself a rarely used syntax, but Type::fun(recv, args..) is not. The first syntax is included to be consistent and uniform, which I think a language should be. I want to avoid ad-hoc rules. Once you're in method syntax mode, I think you should be able to stick with it. The edit distance from recv.fun(args..) to recv.Type::fun(args..) is lower than that to Type::fun(recv, args..) and so it disturbs flow less.

Well yes, if you write it like foo.Arc then it is unclear what it means. But when you write foo.Arc::clone() it is considerably clearer because there is a path separator. Also, if you want to invoke the field of a type as a function, you have to write (val.field)(args..).
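
A short sketch of that last rule in current Rust, using a hypothetical `Handler` type that has both a field and a method named `callback`:

```rust
struct Handler {
    // A field holding a function pointer.
    callback: fn(i32) -> i32,
}

impl Handler {
    // A method that happens to share the field's name.
    fn callback(&self, x: i32) -> i32 {
        x - 1
    }
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    let h = Handler { callback: double };
    assert_eq!(h.callback(10), 9); // method call
    assert_eq!((h.callback)(10), 20); // call through the field
}
```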

It is the same order as normal method application, so I don't see why it is unintuitive. The normal syntax is foo.clone(), but Arc::clone(&foo) is just used to highlight the fact that this is an Arc being copied.
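
The two spellings available today, side by side. The helper name is hypothetical, and the proposed foo.Arc::clone() form is not valid Rust, so it appears only in a comment.

```rust
use std::sync::Arc;

/// Clone the same `Arc` both ways and report the resulting strong count.
fn clone_both_ways(foo: &Arc<String>) -> usize {
    let _by_path = Arc::clone(foo); // makes it obvious that only the refcount is bumped
    let _by_method = foo.clone(); // same call, written in method syntax
    // Proposed (not valid today): foo.Arc::clone()
    Arc::strong_count(foo)
}

fn main() {
    let foo = Arc::new(String::from("shared"));
    assert_eq!(clone_both_ways(&foo), 3);
    // Both temporary clones were dropped when the function returned.
    assert_eq!(Arc::strong_count(&foo), 1);
}
```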

The syntax iter.flatten() being ambiguous when you have two different traits with the same method name is a problem which manifests all the time when things are added to the standard library traits.
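
A minimal illustration of the clash and of the path-based workarounds that exist today. The traits `Short` and `Long`, the module `units`, and the type `Metres` are invented for this sketch; they stand in for something like a standard-library trait and an extension-crate trait that both define the same method name.

```rust
mod units {
    pub trait Short {
        fn describe(&self) -> String;
    }
    pub trait Long {
        fn describe(&self) -> String;
    }
}

struct Metres(u32);

impl units::Short for Metres {
    fn describe(&self) -> String {
        format!("{}m", self.0)
    }
}

impl units::Long for Metres {
    fn describe(&self) -> String {
        format!("{} metres", self.0)
    }
}

fn short_form(m: &Metres) -> String {
    // Path call through the trait; no `use units::Short;` is needed.
    units::Short::describe(m)
}

fn long_form(m: &Metres) -> String {
    // Fully qualified form naming both the type and the trait.
    <Metres as units::Long>::describe(m)
}

fn main() {
    let m = Metres(2);
    // With both traits imported, `m.describe()` would be rejected as ambiguous.
    assert_eq!(short_form(&m), "2m");
    assert_eq!(long_form(&m), "2 metres");
}
```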

The syntax recv.<path::to::Trait>::method(args..) (if we can make that work..) is also useful because it allows you to call and import in one go, there's no need to put a use path::to::Trait; in there.

1 Like

The problem I have with adding this for piping data using postfix syntax is, first, that it adds a new operator that someone has to learn, and second, that if we added a pipe trait whose method is given some function, and possibly some additional arguments, it would generally work just like postfix calls for free functions.

We would need variadic type parameters in order to write such a trait. And even then, it wouldn't work for macros.
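
Without variadics, one possible approximation is a trait whose method takes a single closure, with any extra arguments captured by that closure. This is only a sketch under that assumption; the trait name `Pipe` and the helper `add` are illustrative, and it does nothing for macros.

```rust
trait Pipe: Sized {
    /// Pass `self` to `f` and return whatever `f` returns.
    fn pipe<R>(self, f: impl FnOnce(Self) -> R) -> R {
        f(self)
    }
}

// Blanket implementation: every sized type gets `.pipe(...)`.
impl<T> Pipe for T {}

fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let start: i32 = 3;
    // The extra argument `4` is captured by the closure instead of being
    // forwarded by the trait method.
    let result = start.pipe(|x| add(x, 4)).pipe(|x| x * 2);
    assert_eq!(result, 14);
}
```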

After thinking about it a little, if we had postfix macros (or infix macros), then it would be straightforward to define a pipe macro. Something like:

macro_rules! pipe {
  ($self:self, $f:path($($args:tt)*)) => { $f($self, $($args)*) };
  ($self:self, $m:path!($($args:tt)*)) => { $m!($self, $($args)*) };
}

I actually think I prefer that to adding a new operator. But it would require postfix macros (that RFC is currently postponed).
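
Until something like that exists, a prefix macro on stable Rust can approximate the idea. The sketch below is not the postfix macro described above: the `recv => call` surface syntax is invented here, and paths are matched as `::`-separated identifiers because macro_rules does not allow a `path` fragment to be followed directly by a parenthesis.

```rust
macro_rules! pipe {
    // `recv => path::to::f(a, b)` becomes `path::to::f(recv, a, b)`.
    ($recv:expr => $($f:ident)::+ ( $($args:expr),* $(,)? )) => {
        $($f)::+ ($recv, $($args),*)
    };
    // `recv => m!(tokens)` becomes `m!(recv, tokens)`.
    ($recv:expr => $($m:ident)::+ ! ( $($args:tt)* )) => {
        $($m)::+ ! ($recv, $($args)*)
    };
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    assert_eq!(pipe!(3 => i32::pow(2)), 9);
    assert_eq!(pipe!(4 => double()), 8);
    assert_eq!(pipe!(1 => vec!(2, 3)), vec![1, 2, 3]);
}
```

Because the receiver is matched as an ordinary expression, it is moved into the call; borrowing has to be written explicitly at the call site.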

However, one issue that both a pipe trait and this macro would run into is, how it would interact with borrowing. For a pipe trait, it would have to have at least three methods for self, &self and &mut self. And probably the same for the macro, unless the postfix macro system has some kind of magic to reference the self expression if necessary.
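
One way the three-method shape could look for a trait, sketched with hypothetical names (`PipeBorrow`, `pipe_ref`, `pipe_mut`) and single-closure methods to sidestep the variadics question:

```rust
trait PipeBorrow: Sized {
    /// Consumes the receiver.
    fn pipe<R>(self, f: impl FnOnce(Self) -> R) -> R {
        f(self)
    }
    /// Lends the receiver immutably.
    fn pipe_ref<R>(&self, f: impl FnOnce(&Self) -> R) -> R {
        f(self)
    }
    /// Lends the receiver mutably.
    fn pipe_mut<R>(&mut self, f: impl FnOnce(&mut Self) -> R) -> R {
        f(self)
    }
}

impl<T> PipeBorrow for T {}

fn main() {
    let mut v = vec![3, 1, 2];
    v.pipe_mut(|v| v.sort()); // mutable borrow, `v` stays usable
    assert_eq!(v, vec![1, 2, 3]);
    assert_eq!(v.pipe_ref(|v| v.len()), 3); // shared borrow
    assert_eq!(v.pipe(|v| v.into_iter().sum::<i32>()), 6); // consumes `v`
}
```

The caller has to pick the method that matches the borrow they want, which is exactly the kind of bookkeeping that built-in method syntax does automatically through auto-referencing.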

2 Likes

While yes, you would need variadic generics for a pipe method to accept an arbitrary number of arguments, we could write a small set of up to, say, 12 functions, differing by numeric suffix and number of parameters, which should cover over 99% of use cases with regard to actual functions.
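
A sketch of the first three members of such a family. The names `PipeN`, `pipe0`, `pipe1`, `pipe2` and the helper `scale_shift` are made up for illustration; a real version would presumably generate the remaining arities with a macro.

```rust
trait PipeN: Sized {
    fn pipe0<R>(self, f: impl FnOnce(Self) -> R) -> R {
        f(self)
    }
    fn pipe1<A, R>(self, f: impl FnOnce(Self, A) -> R, a: A) -> R {
        f(self, a)
    }
    fn pipe2<A, B, R>(self, f: impl FnOnce(Self, A, B) -> R, a: A, b: B) -> R {
        f(self, a, b)
    }
}

impl<T> PipeN for T {}

// Hypothetical three-argument free function used in the demonstration.
fn scale_shift(x: i32, k: i32, c: i32) -> i32 {
    x * k + c
}

fn main() {
    let x: i32 = 2;
    assert_eq!(x.pipe0(i32::abs), 2);
    assert_eq!(x.pipe1(i32::pow, 3), 8);
    assert_eq!(x.pipe2(scale_shift, 10, 1), 21);
}
```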
