Pre-RFC: Cascadable method call

This is what Dart uses and it's the best choice available, yet in Dart it looks quite noisy, and that makes it a no-go for Rust. Moreover, the noise would be amplified because the proposed syntax is supposed to apply to every method that takes &mut self, not just on demand.

P.S. It would be really nice if people helped me figure out how to present this argument in the most approachable way, so I'd be freed from the obligation to repeat it again and again and again and again...

Do you really consider the difference between "a method taking a shared reference" and "a method taking a unique reference" important enough to require different call syntax? Note that this is not the difference between "modifying" and "non-modifying" - a method can take a &mut self without modifying the data, or a method can take a &self and modify something using interior mutability.
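To make that distinction concrete, here is a minimal sketch in current Rust. The types Counter and Buffer are hypothetical and exist only for illustration: Counter::bump takes &self yet changes state through a Cell, while Buffer::view takes &mut self yet writes nothing.

```rust
use std::cell::Cell;

/// Hypothetical type: mutates through a shared reference.
struct Counter {
    hits: Cell<u32>,
}

impl Counter {
    /// Takes `&self`, but still changes state via interior mutability.
    fn bump(&self) -> u32 {
        let next = self.hits.get() + 1;
        self.hits.set(next);
        next
    }
}

/// Hypothetical type: exclusive access without modification.
struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    /// Takes `&mut self`, but only hands out a unique view; nothing is written here.
    fn view(&mut self) -> &mut [u8] {
        &mut self.data
    }
}

fn main() {
    let counter = Counter { hits: Cell::new(0) }; // note: not declared `mut`
    counter.bump();
    counter.bump();
    assert_eq!(counter.hits.get(), 2);

    let mut buffer = Buffer { data: vec![1, 2, 3] };
    let len = buffer.view().len();
    assert_eq!(len, 3);
    assert_eq!(buffer.data, vec![1, 2, 3]);
}
```

Under a receiver-based rule, bump would get the "non-modifying" call syntax and view the "modifying" one, which is the opposite of what each method actually does.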

4 Likes

Ouch. No thanks, that's very hard to read, as it's very easy to confuse with ... The same goes for ...: it actually used to be the inclusive range syntax before ..=, but it was changed, exactly because it was easy to confuse with .., so I don't think it could possibly be justified more in another context.

3 Likes

I consider this difference important enough when methods are chained e.g. in Rust there's a lot of chains like

let x = something.as_mut().first().replace(another);

where it's very easy to get lost.

Compare it with

let x = something~as_mut().first() replace (another);

and you instantly see the emphasis

What I don't see is why ~ and a whitespace though. What's the difference? Why doesn't first() get the special sigil, even though it also mutates (assuming an iterator here)? That one-liner you cited is perfectly readable with the current syntax, as much as one can read it without going to the documentation. If you didn't guess without a tilde that as_mut() provides mutability, that's not the fault of the language.

2 Likes

Because a) they work well enough, and b) we don't have a better choice, because .. is unsuitable as you've already agreed, and any other operator would be unobvious. Yet. Another. Time. I. Repeated. It.

It got the . because it doesn't mutate anything. That's just an example method.

This isn't about readability but about understanding the code and how much trust you need to put in the people who wrote it. See how easily I fooled you with .first()?

So, you're right. When I'm reading a decent chunk of code like the TensorFlow example in the OP, I constantly feel like it's my fault that I don't understand it. It's my fault that I cannot remember a lot of things at once. It's my fault that I cannot distinguish important information from unimportant. It's my fault that I didn't read the documentation and that I cannot decipher every bit of information that people put into method names and variables. Perhaps I'm capable of it but I'm just too lazy and impatient to complete such a task. I'm the real source of the fault.

I'm not questioning the particular choice of symbols. I'm questioning, as a reader of the code, why these methods are called using a different syntax, and why there are two different alternatives in the first place? I wouldn't have guessed that the only difference is mutability, that doesn't seem like a strong enough motivation.

You didn't fool me. Apparently this is some non-real code, which does not even follow the usual naming conventions, and with which you are trying to demonstrate a point that can't, pretty much by construction, stand for non-real code. If it were real code, I could have 1. recognized the mutability of every frequently-used function because I know the stdlib functions by heart, or 2. consulted the docs or read the definition of the first() method in case I did not know it, and the problem would simply not exist.

1 Like

Because this is a useful difference that makes it possible to implement method cascading, and because it removes the complexity of the builder pattern, which frustrates newcomers as well as experienced programmers, and because these methods are different internally, and because semantically the first is a command and the second is a query, and because it removes a lot of mut and temporary variables from the code, and because we may get a pipe-forward operator out of it, and because it removes some dereferencing/assignment verbosity when working with Copy types, and because all of this together is supposed to make the language more pleasant to work with.

Is there a convention that prevents .first() from consuming self and returning a mutable value? Is there a convention that lets you guess the effect of every method with a non-obvious name?

Anyway, the following compiles:

struct X<T>(T);

impl <T> X<T> {
    fn as_mut(&mut self) -> X<&mut T> {
        X(&mut self.0)
    }

    fn first(self) -> Option<T> {
        Some(self.0)
    }
}

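struct O;

fn main() {
    let mut x = X(O {});

    // `as_mut` borrows `x`, `first` consumes the temporary wrapper, and
    // `replace` is `Option::replace` on the resulting `Option<&mut O>`.
    x.as_mut().first().replace(&mut O {});
}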

so it's real.

Well, if you feel happy and productive in this way then perhaps you really don't see the problem which I'm trying to solve and cannot give any useful advice.

Do you suggest the required syntax being chosen automatically or the library writer choosing what syntax must be used?

For example, do I understand correctly that AtomicU32::fetch_add shall be called like this?

let prev = my_atomic.fetch_add(1, Ordering::Relaxed);

It takes &self by design because for atomics being able to modify them with a shared pointer is a feature, not a bug.

How about Rc::clone - is it considered a "modifying" or a "non-modifying" method?

How about Vec::as_mut_slice that takes &mut self, but does not modify it on its own?

If you want to force a difference between functions taking &self and &mut self, what would happen to methods that take their receiver in other ways?

// All these are public methods from the standard library
fn count(self) -> usize; // Iterator::count
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>; // Future::poll
fn into_string(self: Box<str>) -> String; // str::into_string
fn wake(self: Arc<Self>); // Wake::wake
fn wake_by_ref(self: &Arc<Self>); // Wake::wake_by_ref

Does anything change for associated functions that aren't callable with method call syntax at all?

pub fn strong_count(this: &Rc<T>) -> usize; // Rc::strong_count

Does anything change for normal functions without a self-like parameter at all?

Does anything change for passing functions by name?

// `Vec::len` takes `&self` and returns a value we want
let _ = my_vec_of_vecs.iter()
    .map(Vec::len)
    .max();
// `Vec::clear` takes `&mut self` and returns nothing
let _ = my_vec_of_vecs.iter_mut()
    .map(Vec::clear)
    .count(); // Not exactly idiomatic, but still okay
// `Vec::pop` takes `&mut self` and returns a value we want
let _ = my_vec_of_vecs.iter_mut()
    .map(Vec::pop)
    .max();
// `Vec::into_iter` and `Iterator::count` take `self`
let _ = my_vec_of_vecs.into_iter()
    .map(Vec::into_iter)
    .map(Iterator::count)
    .max();

7 Likes

It's not real in the sense that it's not using any actual, well-known APIs. Just because you can write some code doesn't mean that it's going to be idiomatic, well-recognizable, conventional, or easy to read.

My advice is not to add more syntax in order to solve non-existent problems.

3 Likes

If you keep having to repeat your arguments, perhaps the problem is not that people don't understand them, but that they simply don't agree with them. In particular:

  • You say it's the best choice available, because you've already ruled out other choices as "unavailable". This seems to be based in part on the criticism leveled at your previous proposal (which suggested .~). But, in the first place, concatenation-as-cascade is a more disruptive change to the language (macros, if conditions, etc.) than "just" adding a new sigil, so it needs to be more motivated than simply "add a new cascading operator". Any substantive argument against adding a new operator is only made stronger if the new operator is concatenation.

    Moreover, it's a very "my way or the highway" kind of approach. You may have already evaluated a variety of syntaxes and discarded them for various reasons, but that doesn't mean the rest of the language community would agree, or that it's pointless to involve the community in workshopping the syntax because you have already determined the One True Method Cascade Syntax and all other debate is a waste of time.

  • You say that any other syntax would be unobvious. But concatenation for method calls is, to me, also very much not obvious, so that's really not a discriminator. For that matter, . isn't particularly "obvious"; you have to learn what it means when you're introduced to methods in the first place, and you're proposing a new syntax to be parallel with that.

    (One thing that's really unobvious to me is removing () as well as ., which doesn't seem to be backed up by any of the arguments in favor of cascading in general, and makes a weird asymmetry between commands that take an argument and commands that don't.)

Syntax is the easiest thing for people to understand, so everyone can have an opinion on it. It seems to me you're spending a lot of effort insisting on the particular syntax you've chosen, and not enough time motivating the real innovation, which is "introduce language level support for cascading". It may be useful to introduce a placeholder syntax as a rhetorical device (similar to how yeet is used in discussions about error handling). I might suggest q. I realize you've given a lot of careful deliberation to syntax, but the value of using a dummy is that you can focus on one thing at a time; i.e., establish that there is a need for method cascading to be supported at a language level, and only later pivot to the argument that q should in fact be simply concatenation.

With that preamble, I'm interested in how you explain the exact difference between cascadeable and non-cascadeable methods. Because it seems to me the discussion has waffled between three things:

  • what kind of receiver the method takes, i.e. &self vs &mut self etc.
  • whether the method is semantically a "command" or a "query"
  • whether the method returns Self or () or some other type.[1]

As others have already pointed out, those are three separate things, and assuming you can distinguish between commands and queries based on the receiver or return type does not seem like it makes the language simpler. Which one of these things, in your opinion, is the main point of cascadable method syntax, and how do you deal with the situations (helpfully elaborated by tanriol) where they don't agree with the other two?


  1. Although, in your second post, you mention fn quux(&mut self, _: i32, _: i32) -> bool as a cascadable method signature, even though it returns neither Self nor (). Is that a mistake?

19 Likes

I'm not sure what problem you are actually trying to solve here. Is it just a way of indicating what argument types (self, &mut self, &self) methods take? Could this be solved by things external to the language like IDE annotations?

My second question, you give this example:

let x = (x) sin;
let y = (y / 2) cos;
let z = (x + y - z) wrapping_add (4) tan;

// In current Rust:

let x = x.sin();
let y = (y / 2).cos();
let z = (x + y - z).wrapping_add(4).tan();

I don't think either is obviously better than the other, and if you were to ask people whichever they prefer they will just tell you whatever they are used to. Remember that new editions and breaking language changes have an incredibly high barrier of entry. "I prefer this syntax" is not sufficient.

FWIW, there are some very, very long sentences in your original posts, which makes it (for me, at least) hard to read and understand. They speak much but say little.

12 Likes

They just provide the API; consumers must then choose an appropriate notation to call it. But that seems to be a perfect target for IDE autocorrection, so in the end users may not even need to consult the documentation to know which notation to use.

Yes. But we could annotate it with something like #[interior_mut] or #[command] or #[mutates] to enable:

let prev = my_atomic fetch_add (1, Ordering::Relaxed);

Updating a reference counter doesn't seem to be an important enough mutation to require a different notation. Most likely this should look like any other .clone().

Then the mutation will happen on the usage side, so a different notation is still justified:

let slice = vec~as_mut_slice();
slice[x] = y;

This is how they ought to look:

let n = iter.count();

let poll = future~poll();

let string = my_str.into_string();

waker wake;

waker wake_by_ref;

So, perhaps #[interior_mut] annotation should be a part of my proposal.

Nothing except we may have a pipe-forward operator to call them

A good question. Perhaps we should lint against passing commands by name e.g.

let _ = my_vec_of_vecs.iter_mut()
    .map(|v| v~pop()) // this is preferred
    .max();

But otherwise the current syntax should work




The truth is that we don't live in a world where every API is well-known, well-documented, idiomatic, well-recognizable, convenient, and easy to read.




We've already learned during the .await proposal period that adding operators is costly because it increases the line noise of code. We've also learned that consistency with ?, consistency with .field, and any other consistency that helps in understanding the construct by itself doesn't really matter, and that it may even have negative value. Consider also that cascading syntax would be used much more often than .await, and my choice of a non-operator approach should become obvious. So, it's not that people don't agree - they don't listen.

Everyone is free to propose alternatives; I'd even be happy if someone found a better syntax than the one I propose. Moreover, I've shared a lot of my knowledge, a lot of prior art references, and some very good examples for testing prototypes. Yet I've not seen any interesting alternative to consider, so it seems to me that any further development in this area would be a waste of time.

IMO every syntax choice is unobvious. The problem isn't to make it easy to understand but to create an alignment of constructs that makes sense in as many contexts as possible. For example

foo bar () baz () qux () quux ();

looks bad exactly because the symmetry between commands that take arguments and commands that take none stands in the way. Moreover, preserving () here doesn't add much value because, in method call syntax, () is required first of all to disambiguate between the method and field namespaces - something that isn't very useful in a cascading context, and if necessary we may just use foo.bar call (baz).

Language-level support for cascading isn't an innovation here, because it was considered even before Rust reached the 1.0 milestone, and people agreed that the cascade! crate is a sufficient workaround. The innovation here is to make it more viable than a macro implementation, which is why so many additional things were introduced at once, including operatorless notation, formatting, command/query/request separation, fewer mut annotations, builders, saturating operators, pipe-forwarding, etc.

Perhaps not everyone knows, but the majority of these related topics were already opened in the past - one thing at a time, each suggesting some placeholder syntax. There was always a lack of motivation, and some workaround using the current syntax was suggested as an alternative. What a waste of time it would be to reopen them only to meet the same fate!

That said, I'm trying to show that we can solve a lot of problems with a minimum of solutions, so that there would be enough of a motivation/changes balance to make it pass.

There's no main point among these three things - all of them are real properties, but for the end user they would most likely be useless. People just asked for semantics and I answered.

I think the main point of cascadable syntax might be that a call is guaranteed to return the same type as its input, and this holds under chaining, e.g. Foo::new() bar baz qux quux is guaranteed to return Foo because every method in this chain is guaranteed to return Foo. This makes the language simpler because users won't need to jump to the documentation of every method and remember their effects in order to understand what the chain returns. Nor do they need to identify the name of each method for that: it's instantly obvious that there's some Foo with some updates. This just makes it easier to understand what's in scope!

See the answer above. I'd compare the proposed syntax with if in Rust: it's a statement, an expression, branching, a ternary, and a scope all at once; it may include pattern matching, may include else, may be chained, and may behave differently at the end of a function; nevertheless, all of its properties work together in synchrony. There's no single problem that if solves, and it's similar with my syntax.

That's a bold statement. Remember that you'll have to make people listen and agree with you to get your proposal accepted.

This may allow people to use the notation without reading the documentation, but it doesn't mean that they will understand what they're using, nor that they will be able to read it later! Remember that reading code is as important as writing code, or even more so. And while you claim that this would better convey intentions, that won't matter if the reader is lost in all the new syntax. Even if they know the syntax, having to look for ~ and spaces in addition to . won't be easy.

I feel the opposite though. You're trying to solve few or tiny problems by introducing a lot of syntax and the tradeoffs involved don't seem to convince people, at all.

I think you're too focused on individual use cases without looking at how this would fit into the whole language. If I look at the syntax for individual uses it doesn't look that bad, but when I try to imagine it with the rest of the language it just feels like a mess.

This proposal reminds me of Kotlin's scope functions. They're not new syntax but serve the same purpose as what you're trying to add. They allow you to avoid repeating the receiver when chaining multiple function calls that don't return this/self, and there are also a bunch of them for specifying the exact semantics you want. However, they're confusing: many people don't know when to use one over the other and always end up using the first thing that does the job. This also shows an alternative to your proposal: add something like tap to the stdlib.
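For illustration, a tap-like helper can already be written as an ordinary extension trait. The sketch below is an assumption about what such a helper might look like; the trait name TapMut and the method tap_mut are hypothetical and not part of the standard library.

```rust
/// Hypothetical extension trait, loosely modelled on the "tap" idea.
trait TapMut: Sized {
    /// Runs `f` on a mutable borrow of `self`, then returns `self` by value.
    fn tap_mut(mut self, f: impl FnOnce(&mut Self)) -> Self {
        f(&mut self);
        self
    }
}

// Blanket impl so that every sized type gets `tap_mut`.
impl<T> TapMut for T {}

fn main() {
    // `push` and `sort` return `()`, yet the calls can still be chained.
    let v = Vec::new()
        .tap_mut(|v| v.push(3))
        .tap_mut(|v| v.push(1))
        .tap_mut(|v| v.sort());
    assert_eq!(v, vec![1, 3]);
}
```

This keeps a single call syntax and makes the "return the receiver" behaviour explicit at each step, at the cost of a closure per call.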

12 Likes

Okay, let's check my understanding.

The proposal is to separate method calls (only method calls, not other associated functions or functions in general) into three classes:

  • "query" syntax (thing.method(args)) is used in two distinct cases:

    • non-modifying (for example, my_vec.len()) - functions that take &self (or something equivalent) and return a value
    • consuming/transforming (for example, my_iterator.count() or my_boxed_str.into_string()) take self (or something equivalent) and return something derived from it

    Q. Why does the second case use the "query" syntax? If the intended semantics is "this call does not affect self", then functions consuming self look wrong here.

    Q. Is this about modifying the receiver or about something else? Should fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> be a query or a request?

    The "query" syntax is the default for all methods not taking &mut self (or something equivalent), but can be opted out from with a special attribute (let's call it #[not_query]). Any method taking &mut self cannot opt into using this syntax.

    Methods using "query" syntax cannot be called with any other syntax. Changing a method from "query" syntax to any other or from any other to "query" (for example, due to forgotten or unneeded #[not_query]) is always a breaking change for consumers.

  • "request" syntax (thing~method(args)) is used for modifying methods that return a useful value you need. These include all methods taking &mut self and all methods annotated with #[not_query]. Assumption: one can use this with "useless" return values too if they wish to avoid hardcoding some specific "usefulness" check.

    Any methods using "request" syntax can also be called using "command" syntax in case you don't need its return value.

  • "command" syntax (thing method (args)) is used for modifying methods that don't return a useful value (for example, my_vec clear) or when you don't care about the return value (for example, my_vec pop). These include all methods taking &mut self and methods annotated with #[not_query].

    Any methods usable with "command" syntax can also be called using the "request" syntax if desired.

    The special "feature" of this call syntax is that the call expression returns the method call receiver, discarding its return value. This makes it possible to chain these calls just like normal method chaining (and even with a bit less sigils).
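If that reading is right, the "command" form can be approximated in current Rust with a small macro. This is only a sketch under that assumption: chain_commands! is a hypothetical name (it is not the API of the cascade crate), and it simply evaluates the receiver once, runs each call, discards the return values, and yields the receiver.

```rust
/// Hypothetical helper: `chain_commands!(recv; a(..); b(..))` evaluates `recv`
/// once, calls each method on it for its side effect, and returns `recv`.
macro_rules! chain_commands {
    ($recv:expr; $($method:ident ( $($arg:expr),* ));+ $(;)?) => {{
        let mut recv = $recv;
        // Each return value is deliberately discarded, as in "command" syntax.
        $( let _ = recv.$method($($arg),*); )+
        recv
    }};
}

fn main() {
    // Roughly what `Vec::new() push (1) push (2) pop push (3)` would mean.
    let v = chain_commands!(Vec::new(); push(1); push(2); pop(); push(3));
    assert_eq!(v, vec![1, 3]);
}
```

Note that the value returned by pop is lost here, which is exactly the case where the "request" form would be needed instead.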

Am I missing any key parts of the initial proposal? Are the assumptions above correct?

In other words, this proposal includes two different changes to the current model:

  • force all receiver-modifying methods to use a separate method call syntax (~ instead of .)
  • allow receiver-modifying methods to use a special syntax if you don't need their return values that returns the receiver instead of the return value (and thus can be chained).

Honestly, I don't think this to be worth the churn, but others may disagree.

5 Likes

What people disagree with is the problem statement.

You yourself admit that this is a large change in how code would be written. So to justify a large change, it needs to solve a large, pervasive problem (and ideally, enable something new).

The desire for method cascading is one that could potentially be served by a language feature: the existence of fn(self) -> Self and fn(&mut self) -> &mut Self methods can easily be interpreted as evidence of that desire.

But if this is the problem you're tackling, here's some of the issues standing in the way:

  • You're tying it to a query/command split for methods, which feels unmotivated (what's the problem statement here?)
  • There's also something about mut pattern bindings involved? I'm not sure.
  • Builders are increasingly transitioning to the "typed builder" approach, which doesn't use a Self -> Self signature, it uses a Builder<FieldUnset, ...> -> Builder<FieldSet, ...> signature.
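To show why a typed builder does not fit a "same type in, same type out" guarantee, here is a minimal sketch. All names (RequestBuilder, UrlUnset, UrlSet, Request) are hypothetical and chosen only for illustration.

```rust
// Hypothetical state types: the builder's type records whether `url` was set.
struct UrlUnset;
struct UrlSet(String);

struct RequestBuilder<U> {
    url: U,
    retries: u32,
}

struct Request {
    url: String,
    retries: u32,
}

impl RequestBuilder<UrlUnset> {
    fn new() -> Self {
        RequestBuilder { url: UrlUnset, retries: 0 }
    }

    /// Changes the builder's type, so the signature is not `Self -> Self`.
    fn url(self, url: &str) -> RequestBuilder<UrlSet> {
        RequestBuilder { url: UrlSet(url.to_owned()), retries: self.retries }
    }
}

impl<U> RequestBuilder<U> {
    /// Available in every state and leaves the state unchanged.
    fn retries(mut self, retries: u32) -> Self {
        self.retries = retries;
        self
    }
}

impl RequestBuilder<UrlSet> {
    /// Only callable once the URL has been provided.
    fn build(self) -> Request {
        Request { url: self.url.0, retries: self.retries }
    }
}

fn main() {
    let req = RequestBuilder::new().retries(3).url("https://example.com").build();
    assert_eq!(req.url, "https://example.com");
    assert_eq!(req.retries, 3);
}
```

Calling build before url is a compile error, and it is precisely the type change in url that a receiver-returning cascade cannot express.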

This is the reason I've suggested what you have is a design for a new language: it's such a drastic overhaul of how methods are handled.

Especially for a large addition/change, it needs to start with everyone agreeing on a problem to be solved. Most things you'll see discussed have implicitly passed that bar; they're discussions of how to implement a feature that's been previously discussed as desirable, or they're of the clear form of how to improve usability of some feature. While you could argue method cascading is such a previously-discussed feature, it's still good practice to link and summarize previous discussion (especially when dealing with items that aren't already semi-active areas (e.g. const generics, async traits)), and your proposal goes much further than just addressing method cascading, anyway.

13 Likes

I think that mentioning things like a 'paradigm shift', and an #[interior_mut] attribute, points to a broader issue with your proposal. There's already an enormous amount of Rust code in existence - your proposal would require a substantial amount of that code to be modified, either now or as part of upgrading to a future edition.

Even if there was universal agreement that your cascadable method calls proposal was a good idea (which is far from being the case), what you're proposing is such a large change to how Rust code is written that it might as well be a new language (as @CAD97 and others have said). To implement this in Rust, however, it's not enough for your proposal to be a good idea - it would need to be so critical that the alternative (doing nothing) is not an option.

I think #[interior_mut] is a good example of this - you seem to view interior mutability as an important part of the API (especially with regard to the command/query distinction). However, this is explicitly not how interior mutability is handled in Rust - it's an implementation detail that consumers can (often) be completely oblivious to, which can be quite useful when implementing things like caching. While there are some cases where specific kinds of interior mutability may 'leak' out to consumers (e.g. RefCell implements neither RefUnwindSafe nor Sync), other interior mutability types like AtomicU8 do implement those auto traits.
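As a sketch of the caching case, assuming a hypothetical Report type and a toolchain with std::cell::OnceCell (stable since Rust 1.70): the method below writes to a cache on first use, yet its signature and its callers look exactly like those of any other &self getter.

```rust
use std::cell::OnceCell;

/// Hypothetical type whose summary is computed lazily and then cached.
struct Report {
    rows: Vec<u64>,
    total: OnceCell<u64>,
}

impl Report {
    fn new(rows: Vec<u64>) -> Self {
        Report { rows, total: OnceCell::new() }
    }

    /// Takes `&self`; the cache write does not show up in the signature.
    fn total(&self) -> u64 {
        *self.total.get_or_init(|| self.rows.iter().sum::<u64>())
    }
}

fn main() {
    let report = Report::new(vec![1, 2, 3]);
    assert_eq!(report.total(), 6); // computes and caches
    assert_eq!(report.total(), 6); // served from the cache
}
```

Requiring an attribute such as #[interior_mut] on total would turn the decision to add or remove the cache into a change visible to every caller.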

Regardless of whether or not your approach to interior mutability would be a good idea in general (especially for other languages), it's not the approach that Rust has chosen to take, and there's pretty much no chance of making such a drastic change to existing code at this point in Rust's development.


Separately, I'd like to second comments like this one:

While the command/query distinction may be useful in some specific cases, I strongly believe that it's a bad idea to bake it into the language in this way (even if we were re-designing Rust from scratch). Introducing additional ways of calling a method (on top of the already existing receiver.method_name(arg1, ..., argN) and Type::method(receiver, arg1, ..., argN)) introduces additional complexity for anyone reading Rust code with little to no benefit.

11 Likes

That's true, I admit that the introduction was quite a mess and my responses didn't clarify a lot.

This isn't complicated syntax and it's not verbose either, so I don't believe that readers would be lost in it.

I simply disagree with every statement here, and you may find out why if you re-read what I've said earlier in this thread (though it would be better to wait for the second version of this proposal).

Yes, your understanding is correct. This proposal is only about methods - it doesn't affect functions.

Perhaps it would be more correct to say that queries don't modify self in its original location.

That's why methods with signatures like (self, ..) -> Self are treated as commands: in current Rust they're most often used as x = x.self_consuming_method(), so in fact they modify self in its original location, but that's done on the call side; hence, to keep this proposal consistent, the same expression must be x self_consuming_method.

If we're talking about the fact that (&self, ..) -> T and (self, ..) -> T methods use the same notation, IMO there's just nothing useful in separating them. That said, we could get the local information that self was/wasn't moved into the method, but why do we need it? Both in let len = vec.len() and in let slice = vec.leak() the most interesting thing is the return value, so users should focus on it and not on ownership/borrowing in the RHS expression. Something like vec->len() would neither reduce the number of mut in code nor make returning &mut Self from methods obsolete, so it's really an unmotivated syntax.

It's only about modifying the receiver - the rest of the arguments don't interact in any way with the proposed syntax. This is how method cascading works, but it could also be said that in most cases the rest of the arguments are already mutated explicitly, e.g. something.fmt(&mut formatter), so any interaction is unnecessary.

Right, "query" is the default syntax: it's both familiar and encourages an immutable programming style, so it's also the preferred syntax. However, I'm not sure about restricting #[not_query] to queries only: it seems that a more useful attribute would be #[interior_mut], since it wouldn't be wrong to apply it to requests and commands in order to make updates to atomics, ref-cells, etc. explicit and also to get rid of another corner case. Anyway, tweaking this behavior seems to be out of scope for this proposal, so your intuition is correct.

In theory, misused method syntax is a subject for compiler warnings, and there should also be an attribute like #[allow(uniform_method_call)] or something like that to make these warnings suppressible.

  • I see nothing wrong with letting a query called through request notation compile - the user may just have forgotten to add #[interior_mut], or plans to add it later, so we shouldn't punish them with having to recompile the code
  • I see nothing wrong with letting a query called through command notation compile - again, the user may be considering a change to the API, and there might be an IDE intention to make this change more ergonomic

  • Calling requests with command notation is allowed unless the method is annotated with #[must_use] - that's obvious, and you've already agreed with it
  • Allowing requests called with query notation to compile is important for backward compatibility - methods with signatures like (&mut self, ..) -> &mut Self whose return value is used (basically any builder pattern usage) would be treated as requests, and if someone migrates to an edition with cascades, rewriting e.g. macros to make such method calls compile might be a lot of additional work

  • Allowing commands called with query notation to compile is important for the same reason - if every (self, ..) -> Self method chain in macros stopped compiling, it might be a big disruption; P.S. perhaps we should allow method chaining on (&mut self, ..) -> () methods in older editions to make them compatible with APIs written in later editions - after migration they should compile as well.
  • Compiling commands with request notation is also important for consistency - this means that we use the () return value; for example, fn x(mut v: Vec<i32>) { v~push(0) } could also be written as fn x(mut v: Vec<i32>) { v push (0); }, although the latter is the proper style

Like it or not but even in current Rust changing method signatures is most likely a breaking change for users. And IMO alterations between request/command/query are important enough to force users to review every usage like we currently do with adding another enum variant for example.


On the other hand, the proposed syntax reduces the number of unpleasant breaking changes, like switching a builder from the consuming flavor to the non-consuming one and vice versa (feels bad when you've used a language where such a problem doesn't exist), or changing a chainable API to a non-chainable one and vice versa.

This is correct, although I'm not sure what you mean by "hardcoding some specific "usefulness" check" - if the value is used then it's automatically useful?

This is also correct

Unfortunately, you still miss a lot of key parts, the most significant being that cascade syntax doesn't have a single special feature but many of them - that's why you fell into the same "not worth the churn" trap as everyone else in this thread.


First, since a command call is guaranteed to evaluate to its receiver, it is also guaranteed to evaluate to the same type. For example, we can't be sure what type Foo::new().bar() returns; in contrast, Foo::new() bar makes it locally visible that it returns Foo. And this property holds for an arbitrarily long chain of cascaded method calls, e.g. in builder usage or any other DSL this will be visible, and then .build() or ~build()? will indicate that a different type is returned at the end (and not somewhere in the middle). Moreover, the suggested notation without extra operators and without () amplifies this effect - with it, users would focus on the continuity of the cascade chain and not on other things that may suggest a change of the underlying type or anything else (perhaps this is how I could motivate the adjacent command notation?).

It's hard to argue against the claim that knowing what type a method returns is useful when learning APIs, so cascading isn't only about chaining non-chainable methods and revealing mutability.

Furthermore, I've made "the same type guarantee" available literally everywhere. That is, if a method baz.qux() currently returns the same type as its receiver, then with the proposed syntax we will always see that:

  • if it mutates then it will require the command syntax, so it will look like baz qux and users will see that no new type was introduced into the scope (we already know that)
  • if it doesn't mutate, that's a bit more complicated: preserving type identity with the baz qux notation would require baz to be mutable, so it's a no-go; then it's possible to imagine something like {baz} qux, but it wouldn't parse because it's ambiguous with "expression followed by identifier" (not a big problem BTW); hence the minimum amount of symbols needed to keep the type information might be ({baz}) qux, which isn't ergonomic
    • So I've introduced (baz) qux as shorthand for ({baz}) qux, which is also supposed to be enforced by the compiler in place of baz.qux()
    • That shorthand doesn't work with (&self, ..) -> &Self methods, although I've never seen such methods anywhere in current Rust, so we don't care about them

And to everyone's surprise this is a very consistent syntax: e.g. in (x) saturating_add (z) both (x) and (z) behave identically, and unlike with x.saturating_add(z) there's a beautiful symmetry that makes it very close to x + z. So we have both type information and a better notation!

This lets you guess how the introductory example works and why it should look exactly like this:

let x = (x) sin;
let y = (y / 2) cos;
let z = (x + y - z) wrapping_add (4) tan;

// In current Rust:

let x = x.sin();
let y = (y / 2).cos();
let z = (x + y - z).wrapping_add(4).tan();

It's still not a convincing example because this is imaginary code, sorry about that; for a long time I was unable to find a decent chunk of code that makes extensive use of mathematical operators, so this is what I came up with. Only recently I discovered this gist, which allows creating a very representative example that IMO beats every "readability argument" out there:

I will repeat that command notation is guaranteed both to reveal mutations made by methods and to reveal the type identity of the subjects of command chains. Both are very useful when learning unfamiliar APIs; keep in mind that for a newcomer every API is unfamiliar! This isn't only the ability to chain non-chainable methods, as Dart's .. operator was.


Second, command notation is designed in a way that removes the distinction between consuming builders, non-consuming builders and the temporary mutability idiom: these three concepts are replaced with a single one. IMO currently they're just workarounds for rough edges of the language, and they're confusing for a person learning Rust because they exist without any useful purpose. APIs may also migrate from one style to another, and this creates unnecessary friction even for people who already know Rust perfectly.

This is how cascading resolves the situation:

  • In current Rust we need (&mut self, ..) -> &mut Self methods on builders to make setters chainable, but with the proposed syntax it would be enough to have (&mut self, ..) -> (), which is already some simplification.
  • Then, because with the current syntax builder setters usually return &mut Self, it's impossible to append fn build(self) -> Thing at the end of such chains; only fn build(&self) -> Thing that clones the fields of the builder is possible. That's why setters with (self, ..) -> Self signatures are currently sometimes necessary to transfer ownership to the .build() call. And because cascades are guaranteed to evaluate to their input type, we can obtain ownership of self at the end, so only (&mut self, ..) -> () setters will make sense to use: another simplification.
  • Next, implementing methods with (self, ..) -> Self signatures would make sense only for Copy types that it isn't wise to take by reference, e.g. numbers. With that, such methods imply that the type implements Copy, and we may even turn this into a guarantee: yet another simplification.
  • Finally, APIs like Vec::push become chainable by default, so workarounds like implementing chainable wrappers or using temporary mutability are unnecessary. We also prevent people from intentionally implementing non-chainable APIs because "method chaining is confusing", since with distinct request/command/query notations such confusion cannot occur: the last simplification.
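For readers who haven't run into the two builder flavors, here is a minimal sketch of them in current Rust. The types Thing, RefBuilder and OwnedBuilder and the helper build_both are made up for illustration and don't come from any real crate:

```rust
// Hypothetical product type, used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Thing {
    name: String,
    size: u32,
}

// Non-consuming flavor: setters take and return `&mut Self`.
#[derive(Default)]
struct RefBuilder {
    name: String,
    size: u32,
}

impl RefBuilder {
    fn name(&mut self, name: &str) -> &mut Self {
        self.name = name.to_owned();
        self
    }
    fn size(&mut self, size: u32) -> &mut Self {
        self.size = size;
        self
    }
    // The chain ends in `&mut Self`, so `build` cannot take `self` by value
    // here and has to clone the fields instead.
    fn build(&self) -> Thing {
        Thing { name: self.name.clone(), size: self.size }
    }
}

// Consuming flavor: setters take and return `Self`, so `build(self)` can
// move the fields out without cloning.
#[derive(Default)]
struct OwnedBuilder {
    name: String,
    size: u32,
}

impl OwnedBuilder {
    fn name(mut self, name: &str) -> Self {
        self.name = name.to_owned();
        self
    }
    fn size(mut self, size: u32) -> Self {
        self.size = size;
        self
    }
    fn build(self) -> Thing {
        Thing { name: self.name, size: self.size }
    }
}

// Both flavors produce the same value; only the signatures differ.
fn build_both() -> (Thing, Thing) {
    let a = RefBuilder::default().name("a").size(1).build();
    let b = OwnedBuilder::default().name("a").size(1).build();
    (a, b)
}
```

Switching a library from one flavor to the other changes every setter signature, which is the kind of breaking change mentioned above.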

As we see, this notation adds a lot of defensive design to the language, so problems that people coming from garbage-collected languages aren't prepared to deal with simply cannot exist. IMO, it's better to spend time learning some language features than rewriting your code again and again.


Third, since it removes the temporary mutability idiom it also removes mut annotations during data initialization, so code just becomes more ergonomic to read and write:

let vec = Vec::new()
    push (1)
    push (2)
    push (etc);

// Vs current

let vec = {
    let mut vec = Vec::new();
    vec.push(1);
    vec.push(2);
    vec.push(etc);
    vec
};

// Or with a bad style

let mut vec = Vec::new();
vec.push(1);
vec.push(2);
vec.push(etc);

Recently I realized what another very interesting possibility it opens: this might finally allow IDEs to insert mut automatically when it's needed. Currently it must be specified manually because programmers must first express the intention to mutate something, otherwise it may lead to accidental mutations that we don't want (so this is another negative effect of implicit mutations under method calls...). By selecting a mutating notation (command or request) the programmer communicates this intention clearly, and even if this notation was inserted by the IDE it's still nearby to notice, so after that adding mut becomes a burden which the IDE should take care of: quite a big ergonomics and productivity gain!


Fourth, there's the formatting

Foo::new()
    bar (
    baz,
    Baz::qux()
        quux (
        quux,
        Quux::new(),
        , )
    , )

which:

  • Leaves very little room to diverge: it can be extended only in the vertical or horizontal direction
  • Could grow to gigantic scales (e.g. a view in a GUI DSL) while remaining easy to navigate
  • Requires half as much indentation as the current syntax
  • Makes the descending ladder of closing braces/brackets/parentheses half as steep

And I think the , ) and ) ) and } ) at the end of an expression are important, like Ok(()), because they show the context you were in: no need to scroll up and read the method name in order to recall it. Moreover, they're a very good target for IDE hints: they're easy to hover over, which may reveal the type returned by the cascade, and a click might scroll to the beginning of the cascade. This should be useful in GUI programming, where methods can be chained and nested very extensively and where it's currently easy to get lost.

However, even in simpler code this formatting has a positive effect: by design it prevents placing too many things on the same line, which IMO is quite a problem in current Rust; such antipatterns are just too subtle and can easily slip through code reviews. I have a very good example of that built on smithay's code:

Hence, there's something very beautiful about the geometry that the proposed construct takes.


Fifth, we can see that this formatting plays very nicely with closures, and similarly it plays very nicely with any other scope, e.g. from control flow constructs. As we remember from the .await proposal period, there was some desire in the community to have some sort of general pipelining, e.g. .match, .for, .if etc.; the proposed syntax brings this dream closer to reality. Again, a possibility like that would be mostly useful in GUI programming, to avoid temporary bindings in layout trees that could disrupt the declarative flow of such expressions and that just look confusing because their declaration would be placed too far from the usage.

The prior art for that feature is Dart's control flow collections feature as well as the Kotlin scope functions mentioned above, but these were prone to abuse and somewhat complex.

I have a much simpler vision of that feature:

TabBar::new()
    is_scrollable (false)
    also (
    for tab in tabs {
        super tab (Tab::new() text (tab))
    } )

Here super gives mutable access to the receiver of the nearest method call (not function!), so we've achieved a general pipelining that works perfectly with ownership/borrowing, looks bulky enough not to be abused, doesn't require intermediate concepts like functions/closures to be inserted to make e.g. branching/looping work, and most importantly doesn't require users to learn about currying/partial application and other complicated stuff from the functional programming world: as long as you know about cascades, receivers and basic Rust, it's obvious how it works.

The desugaring of the above expression would be the following:

{
    let mut x = TabBar::new();
    x.is_scrollable(false);
    x.also({
        for tab in tabs {
            // `y` must be mutable for the `text` setter to be called on it
            x.tab({ let mut y = Tab::new(); y.text(tab); y });
        }
    });
    x
}
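To make the desugaring concrete, here is a minimal sketch that compiles in current Rust. Tab, TabBar, their fields and build_tab_bar are hypothetical stand-ins (not druid or any other real API), and the also call is left out, so the loop mutates the receiver directly:

```rust
// Hypothetical widget types, just enough to make the desugaring compile.
#[derive(Debug, Default)]
struct Tab {
    text: String,
}

impl Tab {
    fn new() -> Self {
        Self::default()
    }
    // Non-chainable setter: `(&mut self, ..) -> ()`.
    fn text(&mut self, text: &str) {
        self.text = text.to_owned();
    }
}

#[derive(Debug, Default)]
struct TabBar {
    scrollable: bool,
    tabs: Vec<Tab>,
}

impl TabBar {
    fn new() -> Self {
        Self::default()
    }
    fn is_scrollable(&mut self, value: bool) {
        self.scrollable = value;
    }
    fn tab(&mut self, tab: Tab) {
        self.tabs.push(tab);
    }
}

fn build_tab_bar(tabs: &[&str]) -> TabBar {
    let mut x = TabBar::new();
    x.is_scrollable(false);
    // Stand-in for `also (for tab in tabs { super tab (...) })`:
    // `super` in the proposal corresponds to `x` here.
    for tab in tabs {
        x.tab({
            let mut y = Tab::new();
            y.text(tab);
            y
        });
    }
    x
}
```

The temporary mut bindings x and y are exactly what the cascade notation is meant to hide.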

We might imagine that the signature of the also method is this:

fn also<T>(&mut self, t: T) { }

However, I plan to implement it differently, because here &mut self will lock the receiver as unique, so x also (dbg!(super)) would have to be written as x also ({ dbg!(super); }), which isn't ergonomic. There's a better way to achieve the expected result: we make *const self receivers compile (this doesn't work currently for some reason; perhaps it's very unsafe to dereference such receivers, so I don't propose allowing that, only that methods should be able to take raw pointers to enable cascading/chaining):

fn also<T>(*const self, t: T) -> T { t }

This signature then allows unifying all the methods of the tap crate except specialized ones like tap_ok and tap_err, which could anyway be replaced with something like inspect and inspect_err on Option/Result itself. More specifically, we would have a single method with a very nice name that could be switched between cascaded/call notations to select tap/pipe behavior.
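For comparison, the two behaviors being unified can be sketched in current Rust as two separate methods. The Also trait below and its tap/pipe methods are written from scratch for illustration; they only mimic the idea and are not the tap crate's actual definitions:

```rust
// Hypothetical extension trait; names are illustrative only.
trait Also: Sized {
    // "tap" behavior: run a side effect on the value, then return the value itself.
    fn tap(mut self, f: impl FnOnce(&mut Self)) -> Self {
        f(&mut self);
        self
    }

    // "pipe" behavior: pass the value through a function and return its result.
    fn pipe<R>(self, f: impl FnOnce(Self) -> R) -> R {
        f(self)
    }
}

// Blanket impl so that every sized type gets both methods.
impl<T> Also for T {}

fn demo() -> (Vec<i32>, usize) {
    // Tapping keeps the receiver type: still a Vec<i32> after each call.
    let v = Vec::<i32>::new().tap(|v| v.push(1)).tap(|v| v.push(2));
    // Piping changes the type: Vec<i32> goes in, usize comes out.
    let n = v.clone().pipe(|v| v.len());
    (v, n)
}
```

In the proposal a single also method would cover both cases, with the notation at the call site choosing between them.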

We already have an example of how it looks in a tapping context, so here's an example of how it looks in a piping context, which I've built on code from the druid GUI framework:




So, there are at least five extra special features. None of them is more important than another. The main point of this syntax is in how well they align together and how well they fit into the language and into IDEs. It would be easy to discard each of them taken outside of the whole picture. That's why I insist on not focusing on a single thing.

Perhaps there's a lot of survivorship bias on this forum: people got used to three or more flavors of builders, people got used to non-chainable methods, and people got used to the lack of command/query separation on method calls, so they just don't see any of that as a large pervasive problem. They also know a lot of workarounds for the lack of pipe-forwarding in Rust, and perhaps they don't write a lot of GUI layouts in their day-to-day work, so they don't see it as something new and interesting either. How can I convince them that the problem exists?

=>

  • The problem with the lack of command/query/request separation is that with a uniform notation it's very easy to chain such methods and create a mess. For example, the expression vec.get(0).replace(x) may give the first impression that it replaces the 0-th element of the vec, while in reality replace operates on the temporary value returned by get and thus has no effect on vec. Similarly, tmp_file.create_new(true).open("~/tmp") may give the first impression that create_new is the most important method in the chain, and thus that /tmp/new_file will be created rather than ~/tmp. A non-uniform method call syntax will at least introduce some "intonation" into such method chains, which is supposed to motivate people to split such chains or at least to format them properly.
    • Again, it's not a problem for very experienced and disciplined programmers; only beginners may struggle with it, because it's not clear how methods should be chained and there's a lot of contradictory and outdated information on the internet
  • My proposal turns mut annotations into state annotations: since chaining non-chainable APIs allows omitting mut during data initialization, it's only required when updating some state.
    • The annoyance of mut is a problem on which people agreed in the past, so it's very surprising to me that nobody ITT focuses on it. Isn't it obvious how method cascading, as well as method chaining in current Rust, allows removing a lot of temporary mut from code?
  • It doesn't make sense anyway to turn every builder into a typed one because of the extra complexity and extra compilation time required; moreover, methods like append that gather something into a Vec presumably shouldn't return a next builder stage. IMO, it's even better if they stand out from the rest of the chain
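The vec.get(0).replace(x) case from the first point can be checked in current Rust. This is a small sketch with made-up function names: get returns a temporary Option<&i32>, and Option::replace only swaps the reference stored in that temporary:

```rust
// Misleading chain: looks like it overwrites element 0, but it does not.
fn replace_on_temporary() -> (Vec<i32>, Option<i32>) {
    let vec = vec![1, 2, 3];
    // `replace` puts `&9` into the temporary Option and hands back the old
    // `&vec[0]`; the vector itself is never written to (it isn't even `mut`).
    let old = vec.get(0).replace(&9).copied();
    (vec, old)
}

// What the reader probably expected: mutate the element in place.
fn replace_in_place() -> Vec<i32> {
    let mut vec = vec![1, 2, 3];
    if let Some(first) = vec.get_mut(0) {
        *first = 9;
    }
    vec
}
```

Both versions compile without errors, so nothing at the call site tells a newcomer which of the two they wrote.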

This overhaul is built on top of Rust, so I think this is still an appropriate place to discuss it: no other language has the same mutability system. Also, I've seen that the language team had some desire to implement "a simplified Rust", so my proposal doesn't seem to be out of scope for the project and might be useful. Anyway, implementing a whole new language (which most likely no one will use) isn't the best investment of my time at this point in life, so I'd rather share it with people to see if it works or not, and who knows, maybe it will inspire someone else before I make something out of it? That said, I'm not going to merge it into rustc tomorrow.

There's always a lot of opposition to every change, no matter whether it's large or not.