Pre-RFC: Cascadable method call

For almost a year I've been quietly preparing this proposal, and I invested that much effort because I really think it has potential and the feature has many implications. In a sense it's even older: in the past I posted a similar proposal that certainly influenced this one, but only this time have I figured out how to properly resolve all the issues with symbolic noise, complexity, indistinguishability, insufficient motivation, and general inconsistency with the paradigms prevailing in current Rust, with a solution that might be appealing to everyone.

Basically, I propose to add method cascading like in Dart, but:

  • Non-optional: it should become the new standard way to call methods with (&mut self, ..) -> _ and (Self, ..) -> Self signatures alike, which are usually used to update mutable bindings
  • Operatorless: its syntax is derived by subtracting the . operator and the empty () parentheses from a method call, with some formatting tricks added to make the result practical
  • Controllable: it lets you ergonomically select between mutate-in-place and move-and-mutate behaviors while completely eliminating burdensome choices such as the one between consuming and borrowing builder flavors

And exactly these properties make it very different from all prior art.
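To make the shape concrete before the large example below, here is a tiny before/after sketch with a hypothetical Counter type; the cascadable form appears only in a comment, since it is not valid Rust today:

// Hypothetical type used only for illustration.
struct Counter(u32);
impl Counter {
    fn new() -> Self { Counter(0) }
    fn bump(&mut self) { self.0 += 1; }              // (&mut self) -> _
    fn doubled(self) -> Self { Counter(self.0 * 2) } // (Self) -> Self
}

fn main() {
    // Today: a mutable binding plus a rebinding for the consuming method.
    let mut c = Counter::new();
    c.bump();
    let c = c.doubled();

    // Proposed cascadable form (roughly, per the bullets above):
    //     let c = Counter::new() bump doubled;
    let _ = c;
}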

Unfortunately, implementing it is also expected to be a disruptive enough change to require a new edition, a lot of potentially chunky additions to tooling, and some paradigm shift in how Rust is written.

Although mechanically the construct is still very simple (if not primitive) and fully backward-compatible (though it may generate a lot of compiler warnings), it nonetheless possesses a truly surprising amount of consistency with the rest of the language. I would even dare to claim that it fits into Rust better than .. fits into Dart, and thus that it's impossible to invent a better method cascading syntax!

Before demonstrating it I must warn that it could very easily be misunderstood simply because of the extreme amount of novelty, and it's easy to miss the point that I'm proposing something much more than a way to save a few symbols. Moreover, with no proper syntax highlighting, no experience editing anything like it, a somewhat unusual alignment of expressions, and no way to explain all corner cases at once, a lot of bias against it is to be expected. Anyway, this proposal exists because I'm convinced that, setting bias aside, the proposed syntax should resolve many more problems than it may create, and that in many respects Rust looks much better with it than without it.

With that introduction out of the way, here it is:

// The example was built on code from
// github.com/tensorflow/rust/blob/69e56ed02722a5930f28/examples/xor.rs#L73
fn train<P: AsRef<Path>>(save_dir: P) -> Result<(), Box<dyn Error>> {
    // ================
    // Build the model.
    // ================
    let mut scope = Scope::new_root_scope();
    let scope = &mut scope;
    // Size of the hidden layer.
    // This is far more than is necessary, but makes it train more reliably.
    let hidden_size: u64 = 8;
    let input = ops::Placeholder::new()
        dtype (DataType::Float)
        shape ([1u64, 2])
        .build(&mut scope.with_op_name("input"))?;
    let label = ops::Placeholder::new()
        dtype (DataType::Float)
        shape ([1u64])
        .build(&mut scope.with_op_name("label"))?;
    // Hidden layer.
    let (vars1, layer1) = layer(
        input.clone(),
        2,
        hidden_size,
        &|x, scope| Ok(ops::tanh(x, scope)?.into()),
        scope,
    )?;
    // Output layer.
    let (vars2, layer2) = layer(
        layer1.clone(), 
        hidden_size, 
        1, 
        &|x, _| Ok(x), 
        scope
    )?;
    let error = ops::sub(layer2.clone(), label.clone(), scope)?;
    let error_squared = ops::mul(error.clone(), error, scope)?;
    let optimizer = AdadeltaOptimizer::new()
        set_learning_rate (ops::constant(1.0f32, scope)?);
    let variables = Vec::new()
        extend (vars1)
        extend (vars2);
    let (minimizer_vars, minimize) = optimizer.minimize(
        scope,
        error_squared.clone().into(),
        MinimizeOptions::default().with_variables(&variables),
    )?;

    let all_vars = variables.clone()
        extend_from_slice (&minimizer_vars);

    let saved_model_saver = tensorflow::SavedModelBuilder::new()
        collection ("train", &all_vars)
        tag ("serve")
        tag ("train")
        signature (
        REGRESS_METHOD_NAME,
        SignatureDef::new(REGRESS_METHOD_NAME.to_string())
            input_info (
            REGRESS_INPUTS.to_string(),
            TensorInfo::new(
                DataType::Float,
                Shape::from(None),
                OutputName {
                    name: input.name()?,
                    index: 0,
                },
            ) )
            output_info (
            REGRESS_OUTPUTS.to_string(),
            TensorInfo::new(
                DataType::Float,
                Shape::from(None),
                layer2.name()?
            ) )
        , )
        .inject(scope)?;

    // =========================
    // Initialize the variables.
    // =========================
    let options = SessionOptions::new();
    let g = scope.graph_mut();
    let session = Session::new(&options, &g)?;
    let mut run_args = SessionRunArgs::new();
    // Initialize variables we defined.
    for var in &variables {
        run_args target (&var.initializer());
    }
    // Initialize variables the optimizer defined.
    for var in &minimizer_vars {
        run_args target (&var.initializer());
    }
    session.run(&mut run_args)?;

    // ================
    // Train the model.
    // ================
    let mut input_tensor = Tensor::<f32>::new(&[1, 2]);
    let mut label_tensor = Tensor::<f32>::new(&[1]);
    // Helper that generates a training example from an integer, trains on that
    // example, and returns the error.
    let mut train = |i| -> Result<f32, Box<dyn Error>> {
        input_tensor[0] = (i & 1) as f32;
        input_tensor[1] = ((i >> 1) & 1) as f32;
        label_tensor[0] = ((i & 1) ^ ((i >> 1) & 1)) as f32;
        let mut run_args = SessionRunArgs::new()
            target (&minimize);
        let error_squared_fetch = run_args.request_fetch(&error_squared, 0);
        run_args
            feed (&input, 0, &input_tensor)
            feed (&label, 0, &label_tensor);
        session.run(&mut run_args)?;
        Ok(run_args.fetch::<f32>(error_squared_fetch)?[0])
    };
    for i in 0..10000 {
        train(i)?;
    }

    // ================
    // Save the model.
    // ================
    saved_model_saver.save(&session, &g, &save_dir)?;

    // ===================
    // Evaluate the model.
    // ===================
    for i in 0..4 {
        let error = train(i)?;
        println!("Error: {}", error);
        if error > 0.1 {
            return Err(Box::new(Status::new_set(
                Code::Internal,
                &format!("Error too high: {}", error),
            )?));
        }
    }
    Ok(())
}
Original piece of code
fn train<P: AsRef<Path>>(save_dir: P) -> Result<(), Box<dyn Error>> {
    // ================
    // Build the model.
    // ================
    let mut scope = Scope::new_root_scope();
    let scope = &mut scope;
    // Size of the hidden layer.
    // This is far more than is necessary, but makes it train more reliably.
    let hidden_size: u64 = 8;
    let input = ops::Placeholder::new()
        .dtype(DataType::Float)
        .shape([1u64, 2])
        .build(&mut scope.with_op_name("input"))?;
    let label = ops::Placeholder::new()
        .dtype(DataType::Float)
        .shape([1u64])
        .build(&mut scope.with_op_name("label"))?;
    // Hidden layer.
    let (vars1, layer1) = layer(
        input.clone(),
        2,
        hidden_size,
        &|x, scope| Ok(ops::tanh(x, scope)?.into()),
        scope,
    )?;
    // Output layer.
    let (vars2, layer2) = layer(
        layer1.clone(), 
        hidden_size, 
        1, 
        &|x, _| Ok(x), 
        scope
    )?;
    let error = ops::sub(layer2.clone(), label.clone(), scope)?;
    let error_squared = ops::mul(error.clone(), error, scope)?;
    let mut optimizer = AdadeltaOptimizer::new();
    optimizer.set_learning_rate(ops::constant(1.0f32, scope)?);
    let mut variables = Vec::new();
    variables.extend(vars1);
    variables.extend(vars2);
    let (minimizer_vars, minimize) = optimizer.minimize(
        scope,
        error_squared.clone().into(),
        MinimizeOptions::default().with_variables(&variables),
    )?;

    let mut all_vars = variables.clone();
    all_vars.extend_from_slice(&minimizer_vars);
    let mut builder = tensorflow::SavedModelBuilder::new();
    builder
        .add_collection("train", &all_vars)
        .add_tag("serve")
        .add_tag("train")
        .add_signature(REGRESS_METHOD_NAME, {
            let mut def = SignatureDef::new(REGRESS_METHOD_NAME.to_string());
            def.add_input_info(
                REGRESS_INPUTS.to_string(),
                TensorInfo::new(
                    DataType::Float,
                    Shape::from(None),
                    OutputName {
                        name: input.name()?,
                        index: 0,
                    },
                ),
            );
            def.add_output_info(
                REGRESS_OUTPUTS.to_string(),
                TensorInfo::new(
                    DataType::Float,
                    Shape::from(None),
                    layer2.name()?
                ),
            );
            def
        });
    let saved_model_saver = builder.inject(scope)?;

    // =========================
    // Initialize the variables.
    // =========================
    let options = SessionOptions::new();
    let g = scope.graph_mut();
    let session = Session::new(&options, &g)?;
    let mut run_args = SessionRunArgs::new();
    // Initialize variables we defined.
    for var in &variables {
        run_args.add_target(&var.initializer());
    }
    // Initialize variables the optimizer defined.
    for var in &minimizer_vars {
        run_args.add_target(&var.initializer());
    }
    session.run(&mut run_args)?;

    // ================
    // Train the model.
    // ================
    let mut input_tensor = Tensor::<f32>::new(&[1, 2]);
    let mut label_tensor = Tensor::<f32>::new(&[1]);
    // Helper that generates a training example from an integer, trains on that
    // example, and returns the error.
    let mut train = |i| -> Result<f32, Box<dyn Error>> {
        input_tensor[0] = (i & 1) as f32;
        input_tensor[1] = ((i >> 1) & 1) as f32;
        label_tensor[0] = ((i & 1) ^ ((i >> 1) & 1)) as f32;
        let mut run_args = SessionRunArgs::new();
        run_args.add_target(&minimize);
        let error_squared_fetch = run_args.request_fetch(&error_squared, 0);
        run_args.add_feed(&input, 0, &input_tensor);
        run_args.add_feed(&label, 0, &label_tensor);
        session.run(&mut run_args)?;
        Ok(run_args.fetch::<f32>(error_squared_fetch)?[0])
    };
    for i in 0..10000 {
        train(i)?;
    }

    // ================
    // Save the model.
    // ================
    saved_model_saver.save(&session, &g, &save_dir)?;

    // ===================
    // Evaluate the model.
    // ===================
    for i in 0..4 {
        let error = train(i)?;
        println!("Error: {}", error);
        if error > 0.1 {
            return Err(Box::new(Status::new_set(
                Code::Internal,
                &format!("Error too high: {}", error),
            )?));
        }
    }
    Ok(())
}

At first sight, cascadable method calls make the code noticeably less noisy and less intricate, hence much easier to skim. But that isn't the biggest motivation for adding this syntax: the real value is that in its presence the . operator and mut annotations become much simpler to reason about. The former starts to primarily indicate a possible change of an expression's return type, while the latter starts to primarily indicate state shared among non-sequentially arranged expressions, which is essentially the most useful information we could squeeze out of them. Furthermore, a cascadable method call by itself encapsulates mutability without requiring any extra effort and creates a cascade of mutations.

Altogether this yields a "tight system" where all parts complement each other and nothing can go wrong. What truly matters is that this system should remove all the FUD associated with the fluent interface pattern that has plagued it since its invention; e.g. this Stack Overflow thread exposes the real source of it. In addition, with it we can be sure that a "mutpocalypse" will never happen again, because mut would become far less overused and we would therefore have far fewer valid reasons to dislike it. And since the system would be more reliable, more predictable, and lower-maintenance, it should have only a positive impact on developer productivity.
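For readers less familiar with the choice being referred to, here is a minimal sketch in today's Rust, with a hypothetical Config type, of the two builder flavors the proposal claims to make irrelevant: the consuming self -> Self style and the borrowing &mut self style, each awkward in different situations:

#[derive(Default, Debug)]
struct Config {
    verbose: bool,
    retries: u32,
}

impl Config {
    // Consuming flavor: chains nicely from a fresh value, but needs
    // `cfg = cfg.verbose(true)` once the value already lives in a binding.
    fn verbose(mut self, value: bool) -> Self {
        self.verbose = value;
        self
    }
    // Borrowing flavor: works fine in loops and conditionals, but the chain
    // itself can't produce the final value without a separate binding.
    fn retries(&mut self, value: u32) -> &mut Self {
        self.retries = value;
        self
    }
}

fn main() {
    let mut cfg = Config::default().verbose(true); // consuming
    cfg.retries(3);                                // borrowing
    println!("{cfg:?}");
}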


So, let's take a look from a slightly different perspective. Another very illustrative, though perhaps less convincing, example is this:

// The example was built on code from
// github.com/bitshifter/glam-rs/blob/20fe1be60caf99f/benches/support/mod.rs#L10
impl PCG32 {
    pub fn seed(initstate: u64, initseq: u64) -> Self {
        (PCG32 {
            state: 0,
            inc: (initseq << 1) | 1,
        } )
            next_u32  // https://crates.io/crates/tap
            .tap_mut(|x| x.state wrapping_add (initstate))
            next_u32
    }

    pub fn next_u32(&mut self) -> u32 {
        let xorshifted = ((self.state >> 18) ^ self.state) >> 27;
        let rot = self.state >> 59;
        self.state
            wrapping_mul (6364136223846793005)
            wrapping_add (self.inc | 1);
        ((xorshifted >> rot) | (xorshifted << ((rot) wrapping_neg & 31))) as u32
    }
}
Original piece of code
impl PCG32 {
    pub fn seed(initstate: u64, initseq: u64) -> Self {
        let mut rng = PCG32 {
            state: 0,
            inc: (initseq << 1) | 1,
        };
        rng.next_u32();
        rng.state = rng.state.wrapping_add(initstate);
        rng.next_u32();
        rng
    }

    pub fn next_u32(&mut self) -> u32 {
        let oldstate = self.state;
        self.state = oldstate
            .wrapping_mul(6364136223846793005)
            .wrapping_add(self.inc | 1);
        let xorshifted = ((oldstate >> 18) ^ oldstate) >> 27;
        let rot = oldstate >> 59;
        ((xorshifted >> rot) | (xorshifted << (rot.wrapping_neg() & 31))) as u32
    }
}

Almost every expression here says something about the proposed syntax:

  • The PCG32 constructor is wrapped in cascadable parentheses, with a distinguishing extra whitespace at the end, because cascadable method calls aren't allowed after braces; otherwise expressions like if x {} y () would be ambiguous. Fortunately, this doesn't cause any trouble in practice
  • The next_u32 method call lacks () parentheses because there's no need to disambiguate between the method and field namespaces; this way it also looks better, almost like a postfix operator
  • The tap_mut usage shows that in the presence of cascadable method calls there would be much higher demand for something like it, since the necessity to break the chain becomes more unpleasant; although here it's possible to take a better approach, which is described a bit further down
  • The order of expressions in the next_u32 method is also a bit more logical than in the original: with cascadable method calls we're able to get rid of the extra oldstate binding
  • The wrapping operators come much closer to their overflowing counterparts and, interestingly, we use them as implying a compound assignment underneath
  • The (rot) wrapping_neg operand isn't mutated because it's wrapped in parentheses, and that syntax is special: it moves/copies the value into the cascade, turning it into a local value

A lot of new information! But here's more to add:

  • Cascadable parentheses look beautifully consistent in mathematical expressions:

    let x = (x) sin;
    let y = (y / 2) cos;
    let z = (x + y - z) wrapping_add (4) tan;
    
    // In current Rust:
    
    let x = x.sin();
    let y = (y / 2).cos();
    let z = (x + y - z).wrapping_add(4).tan();
    
  • As a future possibility we may provide first-class language support for tapping (however, this feature is so powerful that it certainly deserves a separate proposal, so we shouldn't focus on it too much here):

    pub fn seed(initstate: u64, initseq: u64) -> Self {
        (PCG32 {
            state: 0,
            inc: (initseq << 1) | 1,
        } )
            next_u32
            also (super.state wrapping_add (initstate))
            next_u32
    }
    

And the next example will be the last. It shows that providing and using eDSLs could become extremely convenient with cascadable method calls, with the experience coming very close to that of syntax-sugar-rich languages:

// The example was built on code from
// github.com/flutter/gallery/blob/a3838/lib/demos/material/tabs_demo.dart#L164
#[override]
fn build(&self, context: &BuildContext) -> impl Widget {
    let tabs = [
        GalleryLocalizations::of(context).colorsRed,
        GalleryLocalizations::of(context).colorsOrange,
        GalleryLocalizations::of(context).colorsGreen,
    ];

    Scaffold::new()
        app_bar (
        AppBar::new()
            automatically_imply_leading (false)
            title (
            Text::from(
                GalleryLocalizations::of(context).demo_tabs_non_scrolling_title
            ) )
            bottom (
            TabBar::new()
                controller (self.tab_controller)
                is_scrollable (false)
                also (
                // we may use `tap` here
                for tab in tabs {
                    super tab (Tab::new() text (tab))
                } )
            , )
        , )
        body (
        TabBarView::new()
            controller (self.tab_controller)
            also (
            for tab in tabs {
                super child (Center::new() child (Text::from(tab)))
            } )
        , )
    , )
}
Original piece of code
@override
Widget build(BuildContext context) {
  final tabs = [
    GalleryLocalizations.of(context).colorsRed,
    GalleryLocalizations.of(context).colorsOrange,
    GalleryLocalizations.of(context).colorsGreen,
  ];

  return Scaffold(
    appBar: AppBar(
      automaticallyImplyLeading: false,
      title: Text(
        GalleryLocalizations.of(context).demoTabsNonScrollingTitle,
      ),
      bottom: TabBar(
        controller: _tabController,
        isScrollable: false,
        tabs: [
          for (final tab in tabs) Tab(text: tab),
        ],
      ),
    ),
    body: TabBarView(
      controller: _tabController,
      children: [
        for (final tab in tabs)
          Center(
            child: Text(tab),
          ),
      ],
    ),
  );
}

Such Rust code should be truly compact, modular, inspectable, fast to compile, and should play very nicely with development tools, which is why macros aren't nearly a close alternative to cascadable eDSLs. It's also by no means possible to achieve the same result with the builder pattern, since with cascadable eDSLs we don't need to think about whether &mut self or Self must be returned from setters, nor worry whether snippets like x = x.with_y() will cause problems downstream. Ultimately, no other comparable syntax intertwines language constructs so closely together while giving the same amount of useful guarantees or revealing the same amount of relevant information without turning into a complete mess.


So, these were only the most important points; there's much more that could be said about this syntax, e.g. that x add (1) could be used for incrementing numbers, that condition not and cond bitxor (true) could be used to flip booleans (!), that almost all mutations could become explicit, and that Rust is indeed expected to become easier to learn. A lot of questions about desugaring, prior art, and interaction with the rest of the language also remain to be answered. It's hard to fit everything into a single post, so I've compiled a really big pre-RFC document that certainly should be reviewed by someone before being published to a wider audience; I struggle with writing a lot, some important parts might still be missing, and at this point it's really hard to predict what community response it would get in its current state.

In particular, these remain the biggest concerns:

  • How much of a burden cascadable method calls will be to parse properly
    • In the Rust compiler
    • In rust-analyzer and other de-facto standard tools
    • In websites and text editors that aren't so tightly integrated with the language
  • How people will perceive it
    • Its potential to confuse may be much higher than expected
    • People may dislike changing their code to comply with it
    • It isn't obvious just how weird the syntax is
  • How hard it would be to provide a proper IDE experience
    • Without the "power of the dot", autocompletion may be the hardest part
    • Extra formatting assistance might also be required
    • Autoformatters must be aware of it
  • How it would interact with metaprogramming
    • There are some precedence issues in macros
    • With it, previously valid macros may issue a lot of warnings

That said, I want to gather some constructive feedback in this thread and then, if it turns out mostly positive, to find someone willing to review my RFC and help with rewording unclear sentences, fixing grammar errors, etc. Besides that, the majority of the work (excluding the implementation) is done and the design of the feature is complete. Whatever the outcome, at least we will now have the most practical way to implement method cascading in Rust, and some problems in its current design will be better expressed; the same syntax might also be considered for implementation in other languages.

I'm having an incredibly hard time understanding what exactly you're proposing here. You start out by pointing to a previous post of yours which seems to suggest something that looks different, and you mention the language "Dart", which I'm not familiar with.

You contrast with Dart (again, not useful at all to me), then list a bunch of disadvantages of / counter-arguments to your proposal (hard to grasp for a reader who has no idea what the proposal is about), and explain why they're perhaps not a problem.

Then follow lengthy "examples" where I'll have to play "find the difference" and "guess the meaning", and there are comments after those examples that point out consequences on a high / big-picture level, etc., as if the reader has by this point already magically gained some deep knowledge of what's proposed and what all the small technical pros and cons are... really weirdly organized; I've yet to find the place where you tell me what in the world this proposal is about.


Note that I did acknowledge your point

but after this, all your demonstration seems to be about re-writing code in a different way, saving a few symbols. At least that's all I got from playing my "find the difference"+"guess the meaning" game so far.


But perhaps the fact that this feature is impossible to understand without a proper explanation, as I've just experienced first-hand while reading your post, is enough to deduce that there's a very-close-to-0% chance that this will ever find its way into Rust. It's just going to be too big a change, causing too much churn.

But I'm open to being convinced otherwise, provided you focus less on vague non-technical qualities like "less noise", "beautiful" or "truly compact, modular, inspectable", and demonstrate some actual benefits of this new... what is it, syntax? language feature?... demonstrate your points on shorter examples, and also give (at least a rough version of) an abstract description of what exactly it is you propose here. What is the new syntax, and what is its meaning/semantics?

18 Likes

I've got lots of sympathy towards @160R and would like to encourage him/her to articulate more clearly, because @steffahn's points also ring true. @160R, would it make sense to share a link to the half-baked RFC?

2 Likes

I believe the proposal is a pipeline operator, i.e. built-in syntax for the .tap method. From the code examples:

 let mut builder = tensorflow::SavedModelBuilder::new();
 builder
        .add_collection("train", &all_vars)
        .add_tag("serve")
        .add_tag("train")

// becomes
tensorflow::SavedModelBuilder::new()
        collection ("train", &all_vars)
        tag ("serve")
        tag ("train")

Its syntax is derived by subtracting the . operator and the empty () parentheses from a method call

I agree that some smaller examples, and more details would be helpful. Namely, what is the signature of the tag method? Does it have to take self by value, or by reference?

... either, e.g. as given in the method's signature?

... but in the by-value case the method has to return a value of the same type for further chaining to be possible?

P.S. Putting "things" next to each other, separated by a space, as a sort of operator is how the ML family of languages indicates function application to parameters. I guess Rust doesn't use this syntax except after keywords. The insight that this syntax could be used for something sounds interesting.

1 Like

I thought so, too, for a moment, but then I read that "tapping" is only a future possibility, i.e. not the main point of this proposal.


Regarding the example you quoted

it seems like collection might simply be a different method than add_collection, with a different signature. Or whatever, I really don't know what's going on at all, which is my main problem here of course. All I can see is that lots of .s are removed for no apparent reason. If the feature is "remove lots of '.'s because we can", then I say "that's a bad feature".

4 Likes

I'm not sure I completely understand the proposal here, but I'll make a general statement based on the examples:

Adjacency is the most valuable syntax in a programming language. So much that you can separate entire paradigms based on it: functional languages are where adjacency is function application, concatenative languages are where adjacency is function composition.

So I'm skeptical of any proposal that uses it without being incredibly well motivated.

(And, as a Rust-specific note, I think defining adjacency to mean something in an expression is a breaking change to macro_rules, because follow sets aren't enforced properly today.)

12 Likes

I thought so, too, for a moment, but then I read that "tapping" is only a future possibility, i.e. not the main point of this proposal.

From what I can understand, this proposal allows method chaining fn(A -> B), fn(C -> D):

let x = tensorflow::SavedModelBuilder::new()
        collection ("train", &all_vars)
        tag ("serve")
        tag ("train");
// =>
let x = {
    let x = tensorflow::SavedModelBuilder::new();
    let x = x.collection("train", &all_vars);
    let x = x.tag("serve");
    let x = x.tag("train");
    x
};

While "tapping" from the proposal would allow different parts of the original expression without passing the return value along the chain (i.e: fn tap_mut(self, impl Fn(&mut Self -> ())) -> Self:

let x = (PCG32 {
    state: 0,
    inc: (initseq << 1) | 1,
})
   next_u32
   also (super.state wrapping_add (initstate))
   next_u32;
// =>
let x = {
    let mut x = PCG32 {
        state: 0,
        inc: (initseq << 1) | 1,
    };
    x.next_u32();
    x.state = x.state.wrapping_add(initstate);
    x.next_u32();
    x
};

That said, I have no idea if I understood correctly, and hope @160R can chime in with some more concrete semantics.

Well, I'm terrible at explaining concepts, sorry for that. I assumed readers here would be familiar with modern programming-language features and that saying "method cascading like in Dart" would be a sufficient introduction, or that they would do their own research on the topic; it's very googleable and other people have explained it much better than me. I think just becoming familiar with that would remove the need to guess too much at once.

Probably I should have provided more links to make the feature more discoverable...

I'm also surprised that the examples provided here are confusing and that people are asking for something simpler.

So, here's the summary section from my pre-RFC that should make the mechanics behind the proposed syntax more obvious (it also requires a lot of guessing, but it's still the best summary I've been able to produce).

Summary

EDITED: @SkiFire13 found a mistake — now it's fixed.

Add a cascadable method call syntax that performs operations like "move|copy|mutably borrow here => mutate => then pass the result on if required" and basically allows grouping mutations that are made in sequence on the same value with methods, thereby making them more explicit while isolating them in a separate scope. Unlike with the traditional concept of method cascading, its usages will be imposed by the compiler, since Rust's type system is able to infer the appropriate places; another substantial difference is that the syntax is much simpler: there are no symbol-operators whatsoever, and it rather resembles either an infix method call or a postfix operator:

// Dummy struct that provides methods for demonstration:
#[derive(Clone, Copy)]
struct X;
impl X {
    fn new() -> Self { Self {} }

    // Some cascadable signatures:
    fn foo(&mut self) {}
    fn bar(self) -> Self { self }
    fn baz(&mut self, _: i32) {}
    fn qux(self, _: i32, _: i32) -> Self { self }
    fn quux(&mut self, _: i32, _: i32) -> bool { true }
}

// Possible usages

let on_mut_ref = &mut X::new();
let on_mut_ref: &mut X = on_mut_ref foo bar baz (0) qux (1) quux (2,3) foo bar;
on_mut_ref foo bar baz (0) qux (1) quux (2, 3) foo bar;

let mut on_mut_val: X = X::new();
let mut on_mut_val: X = on_mut_val foo bar baz (0) qux (1) quux (2, 3) foo bar;
on_mut_val foo bar baz (0) qux (1) quux (2, 3) foo bar;

let on_val: X = X::new();
let on_val: X = (on_val) foo bar baz (0) qux (1) quux (2, 3) foo bar;
(on_val) foo bar baz (0) qux (1) quux (2, 3) foo bar;

let on_temp = X::new() foo bar baz (0) qux (1) quux (2, 3) foo bar;
let on_temp = (X {}) foo bar baz (0) qux (1) quux (2, 3) foo bar;

// Note: 
// - A heavy reliance on syntax highlighting is expected
// - Using a regular method call will result in a compiler warning
// - Without `X: Copy`, methods `bar` and `qux` would be unavailable on `on_mut_ref`

This RFC also introduces an annotation-like formatting style for cascadable method calls that span multiple lines, where the alignment of blocks may distantly resemble Markdown or configuration files:

// These structs provide a primitive eDSL for demonstration:
struct A;
struct B;
struct C;
impl A {
    fn begin() -> Self { Self {} }
    fn multiline(&mut self, _: B) {}
    fn multiline_multiarg(self, _: B, _: B) -> Self { self }
    fn finish(self) {}
}
fn function(b: B) -> B {
    b
}
impl B {
    fn new() -> Self { Self {} }
    fn oneline(&mut self, _: C) {}
}

// This formatting is impossible to break:
A::begin()
    multiline (          // <- "header"
    B::new()
        oneline (C {})
        oneline (C {})
        oneline (C {})
    , )                  // <- a closing "tag"
    multiline (
    function(B::new()
        oneline (C {})
    ) )
    multiline_multiarg (
    B::new()
        oneline (C {})
        oneline (C {}),
    function(B::new()
        oneline (C {})
        oneline (C {})
    )
        oneline (C {
            //inner block expanded
        } )
    , )
    .finish();

An important difference from regular method calls is that cascadable method calls aren't allowed after {} braces, in order to preserve the backwards compatibility of "block followed by a function call" ambiguous expressions like if x {} y (); extra parentheses, as in (if x {}) y (z), are required in such situations to make cascading available.
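For reference, the ambiguity being avoided already exists in today's grammar: a block-like expression in statement position ends the statement, so the tokens if x {} y () parse as two separate things. A minimal sketch (with hypothetical x and y) that compiles today:

fn y() -> i32 { 0 }

fn main() {
    let x = true;
    let value = {
        if x {} // a complete statement: the `if` block ends it
        y ()    // a separate call expression, which becomes the block's value
    };
    println!("{value}");
}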




I've put a lot of effort into that, and the motivation and guide-level explanation sections contain 7 subsections with examples, explanations, real-world use cases, etc.; it's so big that I don't remember any other RFC like it, and I'm really anxious about posting it all at once without being sure that people are ready for it.

Thank you for the support. I'll probably post it here, but only in parts, since it's very large and complicated, and posting everything at once may cause a lot of chaos, which I do not want. People may ask questions, I'll respond, and in this way we may also gain valuable knowledge about how a complete newcomer reacts to it without knowing enough context.

It's already clear to me that for many people the syntax isn't as intuitive as I expected and that it should have been introduced quite differently.

I think I understand what you're describing, and unfortunately: I really don't think there's any way Rust is going to adopt adjacency to mean pipelined application in this manner.

What you describe reads as a very different language. It would perhaps be a nice language to use, but it's significantly different than Rust.

Part of the appeal of Rust is that it uses a mostly familiar C descended syntax, even though most of its semantics are ML descended. If you know how to read C/C++/C#/Java/anything then you will have a general understanding of what Rust code does when you see it. There'll be some differences in specifics, of course, but the general idea of what the code does is familiar.

There's a concept of "complexity budget" in programming language design. Rust spends basically the entire thing on its concept of ownership and borrow lifetimes; this is not a concept seen anywhere else, and is something distinctly new that developers have to learn and get used to. (I'd argue that an unfortunate chunk of the complexity budget is then also spent on the module system not being file autodiscovery.) Adding an entirely novel syntax for doing method cascading basically blows any complexity budget out of the water, and what you have is an entirely unfamiliar language that few will put the effort into learning to get the benefit from.

If you feel strongly about this concept, I encourage you to explore implementing it. Perhaps you'll plant the seed to grow the next popular new language. But I don't think that trying to graft it wholesale into an existing language is doing anyone (including you) any favors.

Transpiling code into Rust is an interesting way of handling it, and having a working example of what you're proposing will make it immensely easier to make an argument for it (not to mention the increased understanding of the system you'd gain from implementing it).


(It's worth adding as a postscript that a long RFC does not necessarily a good RFC make; a shorter RFC that has clear explanations is better than a long one that relies on giant examples for communication.

The goal of a feature RFC like this is to tell an existing average Rust programmer what is new about the language with your feature, and how they can use it to improve their code. A comparative example serves no purpose to that sort of informative communication.)

7 Likes

My understanding of the root feature addition (replacement? you say it's non-optional) is that you write

The expression $expr $ident which translates to

  • let T represent the resulting type of $expr
  • evaluate $expr.$ident(), notably including autoref behavior
  • the full expression evaluates at type T
  • based on the signature of fn $ident:
    • if it's fn(&mut T), $expr is borrowed for the method call, and the expression evaluates to the mutated value
    • if it's fn(T) -> T, the expression evaluates to the return value of the method
    • otherwise it's a compiler error

The expression $expr $ident ($args,+) which translates to

  • let T represent the resulting type of $expr
  • evaluate $expr.$ident($args,+), notably including autoref behavior
  • the full expression evaluates at type T
  • based on the signature of fn $ident:
    • if it's fn(&mut T, $args,+), $expr is borrowed for the method call, and the expression evaluates to the mutated value
    • if it's fn(T, $args,+) -> T, the expression evaluates to the return value of the method
    • otherwise it's a compiler error

The rest of the post is just examples and corner-case adjustments to the grammar to (allegedly) make the syntax unambiguous and backwards compatible.

The alleged benefit is for the use of builder pattern method cascading and eDSLs, as well as that now

  • if a method is called .$ident it is known to return a different type than the type of its receiver
  • fewer bindings are made mut

This effectively makes methods without further arguments into custom postfix identifier operators, and methods with further arguments into pseudo-infix operators, which are required to return the same type as their lhs.
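Under that reading, here is a sketch in current Rust of what a mixed chain like the hypothetical X::new() foo bar baz (0) from the summary above would evaluate to (dummy definitions repeated to keep the sketch self-contained):

struct X;
impl X {
    fn new() -> Self { X }
    fn foo(&mut self) {}          // fn(&mut T): borrow, expression keeps the value
    fn bar(self) -> Self { self } // fn(T) -> T: expression becomes the return value
    fn baz(&mut self, _: i32) {}
}

fn main() {
    let value = {
        let mut tmp = X::new();
        tmp.foo();               // &mut T case: mutate in place
        let mut tmp = tmp.bar(); // T -> T case: rebind to the returned value
        tmp.baz(0);
        tmp
    };
    let _ = value;
}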

4 Likes

But have you motivated that adjacency is the best syntax for this feature in Rust?

For example, case in SML doesn't have braces, but in Rust it's match with braces.

What you're describing seems similar to With statements in VB. And of course Rust wouldn't have End With for them, but maybe it could translate to

let variables = with Vec::new() {
    .extend(vars1);
    .extend(vars2);
};

Instead of this from the OP:

let variables = Vec::new()
    extend (vars1)
    extend (vars2);

Using braces there, as well as spacing that fits the rustfmt conventions for function calls, just feels more like Rust to me. And having a distinct construct with a keyword and braces surrounding the special behaviour also avoids a bunch of the complications.
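(For comparison, the grouping itself is already expressible with a plain block in today's Rust, which is part of why any dedicated construct needs strong motivation. A self-contained sketch with dummy vars1/vars2:)

fn main() {
    let vars1 = vec![1, 2];
    let vars2 = vec![3];
    // Today's closest equivalent of the `with` block sketched above:
    let variables = {
        let mut v = Vec::new();
        v.extend(vars1);
        v.extend(vars2);
        v
    };
    assert_eq!(variables, [1, 2, 3]);
}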

I think focusing on the goals here, rather than the current syntax which many find unusual, would help you have a more productive conversation on this forum.

11 Likes

It's interesting that you consider this syntax so much more different. It still has almost the same shape, it would still be used in the same contexts, and there's already a lot of variety in how methods are called in other languages: C has ->, C# has ?., Kotlin has infix methods, Java allows creating closures with ::, and ultimately Dart added the .. operator, which does almost the same thing as the proposed construct; there's much more divergence between those. Of course, all of these languages share the . operator, but I'm not agitating to get rid of it either, only to make it more specific.

Okay, things like foo bar; implying "call method bar on foo" would be too different from C/C++/C# and whatever else we've seen before, I agree, but anyone who runs things like git push on a daily basis can still find a familiar pattern in it. Even the multiline formatting with ) ), , ), etc. could be compared to closing tags in HTML or to annotations on modules, so there's really nothing unreasonably different in this syntax.

Ironically, a big chunk of the motivation behind the proposed syntax is to simplify Rust, and I'm strongly convinced it really does that: first of all because it integrates exceptionally well with the ownership and borrowing system and thereby reveals a lot of what's going on under the hood.

To avoid repeating myself, here is the part of the motivation section that addresses that:

Motivation

...

But probably the biggest change the cascadable method call is supposed to bring to Rust is the removal of arguably the most confusing and unpleasant part of the current ownership and borrowing system: the inability to locally determine whether the primary intention of a method call is to "take" something or to "give" something. This makes Rust harder to work with and creates what is most likely the deepest valley on the learning curve of the language.

E.g. currently vec.push(x) isn't grammatically different from vec.get(x), just as vec.len() isn't grammatically different from vec.clear(), etc., and the same holds for all the take/give method pairs on familiar data structures like String. So when newcomers approach Rust, the whole move/own/borrow semantics is explained to them in terms of indirect and most likely unfamiliar concepts like scopes, references, dropping, etc., where it's at least still evident how data moves around. This alone may give the first impression that everything in Rust is structured mostly around that, which is quite scary for learners coming from high-level languages like C# and Java. What's worse, until they grasp the interaction between &mut self and Deref, method calls on anything look like magic, or even fraud if they've picked up the information that unsafe can be hidden somewhere nearby. Such feelings arise especially because the current book gives some information about &self but defers the explanation of how exactly methods interact with their receiver, as if it were something insignificant or shameful, as if &mut self were simply omitted. Moreover, what's even worse is that all their knowledge about fundamental concepts like structs, enums, slices, etc. is then built on top of this incomplete mental model, so a lot of trust and patience is required to get through that and not give up when trying to structure data in a useful way in a first program.

That said, the cascadable method call makes it possible to teach Rust on more familiar and representative data structures like vectors and strings and to defer lower-level concepts like memory management and scopes a little further:

// We can imagine this on `learn X in Y minutes` website:

// By default bindings are immutable
let greeting = String::from("Hello world!");

// we can only temporarily borrow them
print!("{}", &greeting);

// or rebind (move) at a new, possibly mutable location
let mut greeting = greeting;

// Then bindings can be reassigned
greeting = String::new();

// modified with cascadable method call syntax
greeting push_str (" Hello ");

// or modified elsewhere while temporarily mutably borrowed
std::io::stdin().read_line(&mut greeting).unwrap();

// We can borrow them as immutable as well
print!("{}!!!", &greeting);

// and then update again
greeting clear push_str (" Hello everyone?");

// The . syntax allows querying a different type of data from the existing one
let position_of_questionmark = greeting.len() - 1;

// while [..] allows referring to some "range" of it
print!("{}!", &greeting[0..position_of_questionmark]);

// Examples of arrays, enums, nested scopes, etc. follow...



My argument is that by making Rust appear simpler than it is, we cause more harm than good, since the result is confusing and hence eats more of the complexity budget. Agree or not, I think there are a lot of people who become frustrated simply because Rust looks like their previous language but doesn't behave the same way, and I even think it got its reputation as an extremely overhyped language because of that. Moreover, paradoxes like "mutations are explicit, but sometimes we close our eyes to that" don't feel honest: either you know they exist, or you're too overwhelmed to see them, or you just blindly trust the authority that says they're necessary. IMO there's a budget for that too, and in Rust it's more exhausted than the complexity budget.

Also, from @winksaville's reply below:

There is a link at the top to my previous proposal without the adjacency syntax, and there have been proposals to implement .. from Dart verbatim in Rust; all of them were unsuccessful. There's no way we would reuse any symbol for a different kind of method call, because "Rust is already verbose". So, what I propose is the best syntax possible.

I've done a lot of prototypes like that, and there are two conclusions:

  • You should apply this syntax to a really big chunk of code in order to understand whether it makes sense or not; four lines might make sense, but if you try to rewrite any of the examples I've provided above with something like that, the result will certainly look disgusting
  • Because Rust is already verbose, the syntax I propose is the best syntax possible

Sorry, I'm not interested in implementing new languages. I already know how it could be implemented in Rust; I'll post the desugaring section after gathering feedback on the motivation section.

A couple of observations:

  • Trying to read your example, the first thing I noticed is how difficult the syntax is for the user to parse. Whenever I see an indented line starting with an identifier, I immediately think of a function/method parameter, while here it's not one. This means I need to look more at the surrounding context to understand what that line is, and that hurts readability.

  • Allowing fn(T) -> T methods to be called on a &mut T is tricky. What should happen if the method dropped the T and then panicked? You can't put a valid T back behind the mutable reference, but the original owner of the T expects it to be valid. Some crates like replace_with and take_mut solve this by ending the process if the closure panics, but that is not acceptable for the language or the stdlib (a sketch of the workaround space follows below).
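As a sketch of that workaround space: with a T: Default bound one can leave a placeholder value behind, so a panic in the closure still leaves the slot holding a valid (if empty) value; without such a bound, the safe options are essentially aborting (the take_mut / replace_with approach) or unsafe code. The helper name update here is hypothetical:

// Hypothetical helper applying a `T -> T` transformation through a `&mut T`.
fn update<T: Default>(slot: &mut T, f: impl FnOnce(T) -> T) {
    let old = std::mem::take(slot); // leaves T::default() behind the reference
    *slot = f(old);                 // if `f` panics, the slot still holds a valid T
}

fn main() {
    let mut v = vec![1, 2, 3];
    update(&mut v, |mut v| { v.push(4); v });
    assert_eq!(v, [1, 2, 3, 4]);
}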

2 Likes

If Rust supported named, default, and variadic arguments, then the thing you saw would most likely indeed be a method/function parameter, so perhaps there's nothing wrong with that either.


It wouldn't allow fn(T) -> T on &mut T or any other unsafe magic:

Desugaring

Surprisingly, the desugaring mechanism isn't too complicated, although it depends on whether the receiver is propagated, on its type and mutability, and on the signature of the cascaded method; after type checking, some adjustments must also be uplifted to the desugared code. Fortunately, most of that is already used behind the current method resolution, while the rest doesn't seem too hard to add.

Since understanding desugaring is often very helpful when learning Rust features, complicating the already complicated method resolution even a bit further may seem unacceptable. But the complexity added here isn't what would make Rust any harder: the new syntax sugar must not hide anything exceptionally new or surprising, and in any case the majority of users already don't understand exactly how method resolution in Rust works. So examples that cover only basic usage, then intuition, trial and error, and a lot of practice: that is already the most popular way to master it, and with cascadable method calls it could even improve.

Reference mechanism

What follows might be considered an MVP whose purpose is primarily to be easily understood and to show that the examples in the reference section are valid. For simplicity, it's assumed that cascadable method call desugaring/resolution happens before the desugaring/resolution of regular method calls; in the final implementation these would probably be merged into one.

It splits into four stages:

  1. A temporary representation is created depending on the four "types" of receiver
  2. An adjustment is applied depending on whether the receiver is propagated further
  3. Adjustments are applied for each method with a (&mut self, ..) -> .. signature
  4. Method call resolution runs with the immutable borrow of the receiver excluded

This is unfolded below to demonstrate what the possible types of receiver are, how the temporary representation looks for each of them, and how the succeeding adjustments may affect it. The unfolding may look quite repetitive, but it still seems to be the cleanest way to present each case; it's not yet time to build abstractions. (A compilable sketch of the "mutable place" case is given right after the unfold.)

So, let's imagine two cascaded method calls, where the first has the signature fn a(self) -> Self and the second has fn b(&mut self, o: O) (the o argument doesn't affect desugaring and was added just to show how arguments would look), and then, if their receiver is:

1. Local value: x() a b (o)

    // Temporary representation is
    { let mut _x = x(); 'ADJ1 _x.a(); 'ADJ2 _x.b(o); 'ADJ3 }
  • 1.2. If receiver propagates further e.g. z(x() a b (o))

    // The last 'ADJ is replaced with internal variable
    { let mut _x = x(); 'ADJ1 _x.a(); 'ADJ2 _x.b(o); _x    }
    
    // Else receiver must be dropped at the end,
    // so the last 'ADJ is removed from the code
    { let mut _x = x(); 'ADJ1 _x.a(); 'ADJ2 _x.b(o);       }
    
  • 1.3. For each (&mut self, ..) -> .. method invoked

    // Corresponding 'ADJ is removed and then the rest are
    // replaced with assignments to the internal variable
    { let mut _x = x();  _x = _x.a();       _x.b(o);  ...  }
    
    // NOTE: here and further ... implies 'ADJ3 result
    
  • 1.4. Method resolution happens with excluded &Self

2. Moved-in value: let x = x(); (x) a b (o)

    // Parentheses around the receiver are mandatory and indicate
    // that the receiver was moved or copied in here

    // Temporary representation is
    { let mut _x = x; 'ADJ1 _x.a(); 'ADJ2 _x.b(o); 'ADJ3 }

    // NOTE: this representation as well as the further
    // adjustments are almost a copy-paste of those for
    // the local value receiver type; the only difference
    // is that the internal variable binds a value from the
    // external scope rather than a locally created one
  • 2.2. If receiver propagated further e.g. let x = x(); z((x) a b (o))

    // The last 'ADJ is replaced with internal variable
    { let mut _x = x; 'ADJ1 _x.a(); 'ADJ2 _x.b(o); _x    }
    
    // Else receiver must be dropped at the end,
    // so the last 'ADJ is removed from the code
    { let mut _x = x; 'ADJ1 _x.a(); 'ADJ2 _x.b(o);       }
    
  • 2.3. For each (&mut self, ..) -> .. method invoked

    // Corresponding 'ADJ is removed and then the rest are
    // replaced with assignments to the internal variable
    { let mut _x = x;  _x = _x.a();       _x.b(o);  ...  }
    
  • 2.4. Method resolution happens with excluded &Self

3. Mutable place: let mut x = x(); x a b (o)

    // Here desugaring operates directly on the receiver
    // without need in any internal variables
    { 'ADJ1 x.a(); 'ADJ2 x.b(o); 'ADJ3 }
  • 3.2. If receiver propagated further e.g. let mut x = x(); z(x a b (o))

    // The last 'ADJ is just replaced with receiver
    { 'ADJ1 x.a(); 'ADJ2 x.b(o); x     }
    
    // Else receiver shouldn't be moved into
    // cascade, so the last 'ADJ is removed
    { 'ADJ1 x.a(); 'ADJ2 x.b(o);       }
    
  • 3.3 For each (&mut self, ..) -> .. method invoked

    // Corresponding 'ADJ is removed and then the rest are
    // replaced with assignments to the receiver itself
    {   x = x.a();       x.b(o);  ...  }
    
  • 3.4. Method resolution happens with excluded &Self

4. Mutable reference: let x = &mut x(); x a b (o)

    // Here desugaring also operates directly on receiver
    // without need in any internal variables
    { 'ADJ1 x.a(); 'ADJ2 x.b(o); 'ADJ3 }

    // NOTE: this representation as well as the further adjustments
    // are almost a copy-paste of those for the mutable place
    // receiver type; the only difference is that the assignment
    // to the receiver adds a dereference as well
  • 4.2. If receiver propagated further e.g. let x = &mut x(); z(x a b (o))

    // The last 'ADJ is just replaced with receiver
    { 'ADJ1 x.a(); 'ADJ2 x.b(o); x     }
    
    // Else receiver shouldn't be moved into
    // cascade, so the last 'ADJ is removed
    { 'ADJ1 x.a(); 'ADJ2 x.b(o);       }
    
  • 4.3 For each (&mut self, ..) -> .. method invoked

    // Corresponding 'ADJ is removed and then the rest are
    // replaced with deref-assignments to the receiver itself
    {  *x = x.a();       x.b(o);  ...  }
    
  • 4.4. Method resolution happens with excluded &Self
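To make case 3 ("mutable place") concrete, here is a sketch in today's Rust of what the final desugaring from stage 3.3 corresponds to, with dummy definitions matching fn a(self) -> Self and fn b(&mut self, o: O):

struct O;
#[derive(Debug)]
struct X;
impl X {
    fn a(self) -> Self { self }
    fn b(&mut self, _o: O) {}
}
fn x() -> X { X }

fn main() {
    // Hypothetical `let mut x = x(); x a b (O)` after stages 3.1-3.3:
    let mut x = x();
    x = x.a();  // (self) -> Self method: result assigned back to the receiver
    x.b(O);     // (&mut self, ..) method: mutates in place
    println!("{x:?}");
}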

Here it's critical to instruct cascadable method resolution to exclude &Self from the list of "candidate" types for which a suitable method is searched among traits and inherent impls, because a cascadable method call is supposed to mutate and hence is fundamentally incompatible with immutable borrowing; any coercion from &mut Self to &Self also shouldn't happen here. APIs must be written with that in mind.

Interestingly, that could potentially lead to some compilation speed improvements, as the method lookup becomes more direct. It's possible to imagine some optimizations being added during implementation, e.g. methods compatible with cascading might internally be listed separately from methods compatible with regular method calls, and their usage prioritized accordingly. Autocompletion in IDEs could also benefit from something like that, so irrelevant results wouldn't be queried and listed.

What's also important is that the (&self, ..) -> Self signature, which in some cases could be compared to (self, ..) -> Self, is intentionally not cascadable; otherwise it would enable a postfix clone operator that allows writing nonsense like something clone;, which is basically cloning without cloning and mutation without mutation. Anyway, cascading such methods would in most cases be a strange solution.

For the remaining cases, in the future we could introduce an #[internal_mutability] annotation (naming may change) for (&self, ..) -> T methods that internally use Cell, Arc, etc., or that for any other reason are considered mutating (e.g. IO). That's outside the scope of this RFC, because something like that should first be proven useful/necessary/non-harmful and then separately undergo its own proposal and RFC processes, and there might be a lot of extra complexity involved.

Is the following a good explanation of method cascading?

https://en.m.wikipedia.org/wiki/Method_cascading

2 Likes

Yes, I've even added that link to the topic.

The concept is so simple, and it's so strange to me that so many people aren't familiar with it... I think I should create a follow-up post explaining this syntax, why it matters, and why I'm trying so hard to embed it in Rust.

Not everyone shares the same experiences with you. The RFC process is there to help you explain your reasoning and motivations in as clear as possible way so that others can get to the same place you are. Remember that any proposed feature is eventually going to be the first time someone runs into the concepts involved. It helps a lot if the docs explain things from the basics. Experienced devs (usually) know what they can skip; newer ones need more hand-holding. And where prior experiences differ from whatever gets decided, experienced devs can go back and clear things up.

5 Likes

Simple or not, it seems uncommon, as Wikipedia says:

Method cascading is much less common than method chaining – it is found only in a handful of object-oriented languages, while chaining is very common.

5 Likes

Indeed the concept is simple, and my personal take is that having the ".." syntax makes it more obvious than relying on "nothingness" (i.e. spaces). The Dart language tour's explanation of cascade notation is also informative and shows the "null-shorting cascade operator (?..)". Obviously, Rust would need something like this to handle Option and Result return values. Personally, I like the explicit nature of using visible characters rather than spaces; I suspect it would make IDE support easier too.

1 Like