[Closed][Pre-RFC]: Sub function

Summary

A “sub function” behaves just like a normal function, but it has access to everything that was declared before it in the caller.

Motivation

Often I find myself wanting to divide up a function to organize code better and/or to have parts in different files. The current ways (that I know of) to do this are either to create a struct containing the data or to pass all the data I need as arguments. That can be tedious if there’s a lot of data and if you want to do this multiple times, and it kinda clutters the code.

Explanation

I think the easiest way to explain is with examples:

Note: I made up the keyword “sb” here to describe the new function type, but feel free to suggest a different keyword or maybe syntax for this.

Note: The examples here are simple; the idea is to use this with many variables and bigger code.

In this example the “sub function” (for lack of a better name) automatically has access to the variable named value in the caller function without the developer having to pass it as an argument.

    fn calc() {
        let value = 100;
        increment();
        println!("{}", value); // 101;
    }

    sb increment() {
        value += 1;
    }

The “sub function” behaves like a normal function in that it can also take arguments and return values.

    fn calc() {
        let value = 100;
        let i = add(100);
        println!("{}", value); // 200;
        println!("{}", i); // 10;
    }
    sb add(x : u32) -> u32 {
        value += x;
        return 10;
    }

And local declarations will shadow the caller’s variables.

    fn calc() {
        let value = 100;
        increment();
        println!("{}", value); // 100;
    }
    sb increment() {
        let value = 0; // the value in calc() gets shadowed by local variable
        value += 1;
        // local value dies here
    }

Potential Issues:

  • Multiple calls to the “sub function” and variables declared in between:
    fn calc() {
        let value = 100;
        increment();
        let other_value = 2;
        increment();
    }
    sb increment() {
        value += 1;
        other_value += 1; // Error, needs to be declared before the first call
    }
  • Should a “sub function” be limited to only one caller?
    fn calc() {
        let value = 100;
        increment();
    }
    fn calc2() {
        let value = 10;
        increment(); // Possible, because it has declared value, but there could be a rule that says "sb increment" can only
                    // belong to one function
    }
    sb increment() {
        value += 1;
    }

This could also open up a special naming scheme (example syntax):

    fn calc() {
        let value = 100;
        increment();
    }
    sb calc::increment() {
        value += 1;
    }

Can you elaborate on what the difference between this and closures (Closures: Anonymous Functions that Capture Their Environment - The Rust Programming Language) is?

The value variable is not defined in the sb increment() function, but is defined in all of its possible callers.

It seems like a kind of dynamic scoping, and I do not think there’s a feasible way to introduce it in Rust.

I guess the sb function the poster wants could have multiple callers, while a closure has at most one caller.

This is introducing dynamic scoping, but as far as I know, Rust is using lexical scoping.
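To make the lexical-scoping point concrete, here is a minimal sketch in current Rust (the function name `lexical_demo` is made up for illustration). A closure resolves `value` to the binding visible where the closure is written, not to whatever happens to be named `value` at the call site, which is the opposite of what the proposed sb function would do.

```rust
// Hypothetical helper, only for illustration.
fn lexical_demo() -> (i32, i32) {
    let value = 1;
    // `read` captures the `value` binding that is in scope at this point.
    let read = move || value;
    // This is a new binding that shadows the first one; `read` does not see it.
    let value = 2;
    (read(), value)
}
```

Calling `lexical_demo()` gives `(1, 2)`: the closure still sees the first binding even though a different `value` is in scope where it is called.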

Yes, a function that uses dynamic scoping is what I am suggesting (I didn’t know about this term). A closure would have to be inside the function, though, and wouldn’t help with restructuring.

This just reads to me as a sort of “friend closure” with spooky C++-friend scoping. I question whether we want to bring friend-semantics into Rust; though I definitely understand the motivation. It feels quite cumbersome to write a local closure like

let my_local_fn = |..| {
    // ..
};

where, in other languages with nested functions, you just declare a local function.

Unfortunately, this sort of scoping is beyond the borrow checker’s understanding. If my_local_fn captures anything the borrow checker will hate you: either you’ll capture by reference, essentially locking up those captures until the closure’s last use (i.e., no moves or &muts), or, if you used a move closure, you’ll make those values disappear from the enclosing scope entirely… which is also confusing.
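A minimal sketch of the capture-by-reference case in current Rust, reusing the names `calc`, `value`, and `increment` from the first post. The commented-out line is an assumption about where a reader would naturally want to use `value`; it is the line the borrow checker would reject.

```rust
fn calc() -> i32 {
    let mut value = 100;
    // The closure captures `value` by mutable reference; the borrow
    // lasts until the closure's last use.
    let mut increment = || value += 1;
    increment();
    // println!("{}", value); // would not compile: `value` is still mutably borrowed
    increment(); // last use of the closure, so the borrow ends after this call
    value // usable again here
}
```

Here `calc()` returns 102, but `value` is off limits between the two calls, which is exactly where the first post's examples want to read it.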

So… I think you want to be able to write something that captures like a closure, but which only holds the borrows when it’s called, since you can prove it’s local? Imagine a desugaring like…

fn foo() {
  let mut y = String::new();
  sub fn local(s: &str) {
  // signature desugars to
  // fn local(/*capture*/ y: &mut String, s: &str)
    y.push_str(s);
  }
  local("foo"); // local(&mut y, "foo")
  y.push_str("bar");
  local("baz"); // local(&mut y, "baz")
}

Unfortunately, I can’t imagine what diagnosing the borrowck errors that result from this would look like, and it might ultimately be more confusing than intended. To be clear, the only reason this works is because local never escapes foo's scope; this means, among other things, that local is not coercible to a function pointer or to impl Fn, and, obviously, local must be declared inside of foo.

I suppose you could think of it like this

Without sugar:

fn nested(x : &mut i32) {
    *x += 1;
}

fn foo() {
    let mut x : i32 = 5;
    nested(&mut x);
    println!("{}", x);
    nested(&mut x);
    println!("{}", x);
}

With:

sb nested() {
    x += 1;
}

fn foo() {
    let mut x : i32 = 5; // Because the variable is mutable here,
    nested();           // it's mutable in nested();
    println!("{}", x);
    nested();
    println!("{}", x);
}

You can almost use macros for this, so I don’t see the point.
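For what it’s worth, here is a rough sketch of the macro approach in current Rust, assuming a macro named `increment!` (a made-up name). The “almost” comes from macro hygiene: a macro_rules! macro defined outside the function cannot refer to the caller’s local `value` on its own, so the variable name still has to be passed in.

```rust
// Defined outside the caller, like the proposed `sb increment()`.
macro_rules! increment {
    ($var:ident) => {
        $var += 1;
    };
}

fn calc() -> i32 {
    let mut value = 100;
    increment!(value); // expands to `value += 1;`
    increment!(value);
    value
}
```

With this, `calc()` returns 102, and because the macro expands in place, no borrow of `value` is held between the two uses.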

If that's all you want, then closures are perfectly fine and able to achieve this goal. (In fact, that is the point of closures.)

Good to know. However, my short advice is simply "please don't".

Both dynamic scoping in general, and the specific example you cited, severely impede local reasoning. And by "reasoning", I mean "reasoning by readers of the code". Even if this could be technically possible by teaching the compiler new tricks, I would very strongly oppose the idea, because it hides important information that is essential for a complete and correct understanding of the code.

Effectively, what this does is introduce an implicit global variable (or, when used generously, many of them). Implicit globals have been tried in many programming languages before, and the general professional consensus today is that the concept was a mistake and an anti-pattern. I think it would be a step backwards if Rust adopted anything like this.

The main reason: since there can be arbitrary (textual) distance between the definition and the use of a function, this feature would also allow an arbitrarily large distance between the definition (and call site) of a "sub function" and the variables it refers to. This means that part of the state the function captures might not even be in sight when one is reading the body or a use of the function. This most likely results in confusion and a misunderstanding of the code, which is exacerbated if the implementation of said functions is subsequently modified.

Thinking about it, what you describe here is basically a plain old object. If you have a bunch of state that you want to reuse across functions multiple times, the right way to do it is creating a type for it and making the functions that use it methods of the type. This way, the relationships and dependencies between code and data are clearly visible, and the compiler can also prove the correctness of the involved code more easily.
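A minimal sketch of that suggestion, applied to the calc/increment/add examples from the first post (the type name `Calc` and the variable `state` are made up here). The shared state becomes a field, and the would-be sub functions become methods that state their dependency explicitly through `self`.

```rust
// Hypothetical type holding the state that the "sub functions" shared.
struct Calc {
    value: u32,
}

impl Calc {
    fn increment(&mut self) {
        self.value += 1;
    }

    // Takes an argument and returns a value, like `sb add` in the first post.
    fn add(&mut self, x: u32) -> u32 {
        self.value += x;
        10
    }
}

fn calc() -> (u32, u32) {
    let mut state = Calc { value: 100 };
    state.increment();
    let i = state.add(100);
    (state.value, i)
}
```

Here `calc()` returns `(201, 10)`, and every read or write of `value` is visible at the call site as a method call on `state`.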

To be clear, it is usually the case that the burden of producing clear code should be on the writer, and not that the burden of understanding complex code making use of arbitrarily many language features should be on the reader. Therefore, any advantage in ergonomics of reading the code should be weighed much more heavily than writing convenience. Ad-hoc external state accumulation is a prime example of this principle in action.

Therefore, I suggest that instead of adding action at a distance to the language, take a step back, and revise the design of your code. You might spot some opportunities for cleanup and refactoring, including better organization of data in appropriate types.

I understand that readability is important for Rust, which is why I suggested a naming scheme that shows where the sub function belongs, but I understand that it is not good enough.

The reason I suggested this over just using plain objects is that in Rust you need to have all the data when creating an object. I could create multiple objects, but it felt weird to create objects that I only have one instance of.

Imagine you had to create 20 types, where every type needed the previously created types in order to be made, and there was no other way to do it. Instead of having one very long function, I wanted to move each individual creation into its own function, because it would be easier for me to see where to go to make specific changes. I could create a struct where all the types are wrapped in an Option, or create smaller structs stepwise, but those smaller structs wouldn't have any logical meaning except to save me from typing 19 arguments for the last function, 18 for the one before that, and so on.

But I understand that readability would be a huge problem, and I will try to organize the code in a different way that makes sense to me. Thank you

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.