Proposal: explicit reference to `self` for closures


#1

Recursive closures are possible in Rust, but such a closure cannot mutate its captured variables, because to recurse it has to keep a separate reference to the closure itself.
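For context, here is a minimal sketch of how recursion is commonly arranged with today's closures: the closure receives a shared reference to itself as an extra argument. The names `Rec` and `factorial` are made up for illustration and do not come from any library. Because the self-reference is a shared one, the closure has to be `Fn`, which is why it cannot mutate its captures directly.

```rust
// Hypothetical wrapper: holds a closure that takes a reference to the wrapper itself.
struct Rec<'a> {
    f: &'a dyn Fn(&Rec<'a>, u32) -> u32,
}

fn factorial(n: u32) -> u32 {
    let rec = Rec {
        // `rec` inside the closure is the wrapper, so `(rec.f)(rec, ..)` recurses.
        f: &|rec, k| if k == 0 { 1 } else { k * (rec.f)(rec, k - 1) },
    };
    (rec.f)(&rec, n)
}
```

Calling `factorial(5)` walks the closure through five recursive calls via the shared `&Rec` handle.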

I am thinking of a way to get around this restriction and make recursion easier. Like this:

let mut i1 = 1;
let mut i2 = 1;
let fib: &FnOnce(i:i32) -> i32 = &move |mut captured,i| {
    match i {
        0 => i1,
        1 => i2,
        i => {
            (captured.i1, captured.i2) =
                (captured.i2, captured.i1 + captured.i2);
            captured(i-1)
        }
    }
};

We make captured work like self in impl blocks: it is the closure itself. We can also allow it to be decorated, as in &captured, &mut captured, etc. If captured appears in the parameter list, it has to be the first parameter, and every access to a captured variable must carry the captured. prefix.

Note: The move keyword is not necessary here, because in the above example we receive captured by value, which forces everything captured to be moved into the closure.

The move keyword is still needed, as in the above case we have the option to copy or move. But move will be redundant whenever any captured variable is not Copy.

Once we have this, we may need to relax the Fn trait hierarchy so that Fn does not necessarily imply FnMut and FnMut does not necessarily imply FnOnce. With the above syntax the resulting type implements only one trait, and generating the other variants for calling may not be an easy task.

Of course, for backward compatibility we will not change anything if the first parameter of the closure is not captured. The resulting closure will still implement FnOnce when only FnMut is needed, and FnMut when only Fn is needed, as usual, and there is no prefix when accessing captured variables.

Another benefit is that this supports FnBox the same way as the others:

let mut i1 = 1;
let mut i2 = 1;
let fib: Box<FnBox(i:i32) -> i32> = box move |mut box captured,i| {
    match i {
        0 => i1,
        1 => i2,
        i => {
            (captured.i1, captured.i2) =
                (captured.i2, captured.i1 + captured.i2);
            captured(i-1)
        }
    }
};

Even if we later decide to make Box<FnOnce> just work, stabilizing FnBox is still a good option, as it makes the above work the same way as the other variants. EDIT (Except that we might want to stop today’s closure grammar from generating FnBox automatically. This means that if you still want to use FnBox, you have to add box self to the parameter list explicitly.)

It is too early to formalize this into an RFC; I just want to see how people feel about this.


#2

This would break any closures that named the first argument captured.

It’s also still trivial for Fn to imply FnMut and FnMut to imply FnOnce:

impl<F: Fn> FnMut for F {
    fn call_mut(&mut self, args: Args) -> Self::Output {
        Fn::call(&*self, args)
    }
}

impl<F: FnMut> FnOnce for F {
    fn call_once(mut self, args: Args) -> Self::Output {
        FnMut::call_mut(&mut self, args)
    }
}
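For reference, this implication already holds in current Rust, where `Fn` has `FnMut` as a supertrait and `FnMut` has `FnOnce` as a supertrait. A small sketch of what that means in practice follows; the helper names `call_fn`, `call_fn_mut`, `call_fn_once` and `demo` are invented for illustration.

```rust
// Hypothetical helpers, each demanding a different closure trait.
fn call_fn<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

fn call_fn_mut<F: FnMut(i32) -> i32>(mut f: F, x: i32) -> i32 {
    f(x)
}

fn call_fn_once<F: FnOnce(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

fn demo() -> (i32, i32, i32) {
    let offset = 10;
    // This closure only reads `offset`, so it implements `Fn`.
    let add = |x: i32| x + offset;
    // The same `Fn` closure is accepted wherever `FnMut` or `FnOnce` is required.
    (call_fn(&add, 1), call_fn_mut(&add, 2), call_fn_once(add, 3))
}
```

A closure that only implemented `FnOnce` could not be passed to `call_fn` or `call_fn_mut`, which is the direction of the hierarchy the proposal would have to relax.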

Maybe this could be done by using the fn keyword the way self is used? Though that would feel very weird. I’d still recommend leaving capture rules as they are (in fact, you missed two required captured. prefixes in your example).

(I’m not really in favor of this as written. I see the hole – it’s not possible to write a recursive function that captures scope – but I don’t see any case where this is needed or improves readability of the code.)


#3

error[E0424]: expected unit struct/variant or constant, found module `self`
 --> src/main.rs:3:16
  |
3 |     let foo = |self| 1;
  |                ^^^^ `self` value is only available in methods with `self` parameter

This error message implies that self could be made to refer to the closure with zero breakage, since |self| is rejected as a closure parameter today.


#4

My original concern about using self was that it would prevent the enclosing method’s self from being used in the closure body. But I think in that case we can let the user bind self to another variable and then capture that variable in the closure.
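That rebinding already works in today's Rust, where nothing about it is special. A minimal sketch, with `Counter`, `stepper` and `advance` as made-up names:

```rust
struct Counter {
    step: u32,
}

impl Counter {
    // Returns a closure that borrows the receiver under a different name.
    fn stepper(&self) -> impl Fn(u32) -> u32 + '_ {
        let this = self; // rebind the method receiver
        move |x| x + this.step // the body never mentions `self`
    }
}

fn advance(start: u32) -> u32 {
    let counter = Counter { step: 2 };
    let step = counter.stepper();
    step(step(start))
}
```

Under the proposal, the name `self` would then be free inside the closure body to mean the closure itself, while `this` keeps pointing at the method receiver.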

Using self has other benefits: it makes the closure look more like the way the FnXXX traits are defined. So when new users look at the signature of the trait, they immediately know how to use it.

So yes, I agree self might be better.


#5

You are right about Fn and FnMut.

But once we are able to do the above, we may look forward to supporting not just FnBox but also FnCell, FnRefCell, FnRc, etc., in exactly the same way. It will be hard to decide which one implies another, so it would be good to keep everything consistent.


#6

Technically, that may sound very plausible. In reality, however, it’s a huge change in the mental model: whenever one sees self in a closure from now on, they will have to think hard about whether it’s a captured self or it refers to the closure. That sounds quite dangerous to me: the two are very easy to confuse, and it makes the code surprising to read.

This also makes the code heavily context-dependent. I often refactor code moving it out of and into a closure. Maybe a closure grows too big and needs its own method, or maybe a method becomes trivial and can be inlined. This proposal makes it impossible to perform such refactoring, because the meaning of self changes as you move code from a self-referring closure to a method and vice versa.

@earthengine I don’t really see why it is useful in realistic code. When was the last time you needed a recursive closure that mutates its captures, and why wouldn’t you refactor it immediately to something clearer? The usage of such closures seems to be very bad style to me.


#7

Note: I’m not saying we should necessarily do this; I’m only saying what a non-breaking solution might be :wink:


#8

I’m 100% with @H2CO3 on this. Changing the meaning of self is not a good idea.


#9

@earthengine Can you provide a better use case where you actually need such a closure? IMO your example would be less complicated if it were written as a normal function that passes the state along:

fn fib_internal(curr: i32, next: i32, remaining: i32) -> i32 {
    if remaining > 0 {
        fib_internal(next, curr + next, remaining - 1)
    } else {
        curr
    }
}

pub fn fib (n: i32) -> i32 {
    fib_internal(0, 1, n)
}

IMO that’d be more explicit and easier to understand.

Using a boxed function for your code example would have a severe performance impact. You’d get thousands of unnecessary heap allocations if you boxed it. Boxing should only be used if required.


#10

Needless to say, this can also be done with closures in today’s Rust, only you will need a fixed-point combinator, like the following.

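fn fix<F, T, R>(f: F, t: T) -> R
where
    F: FnOnce(&dyn Fn(T) -> R, T) -> R + Clone,
{
    // Hand the closure a handle that re-enters `fix` with a fresh clone of itself.
    (f.clone())(&|t| fix(f.clone(), t), t)
}

fn main() {
    // The tuple is (counter, current, next); the state is threaded through the arguments.
    let fib = |i| {
        fix(
            move |s: &dyn Fn((u32, u32, u32)) -> u32, v: (u32, u32, u32)| match v.0 {
                0 => v.1,
                1 => v.2,
                i => s((i - 1, v.2, v.1 + v.2)),
            },
            (i, 1, 1),
        )
    };
    println!("{}", fib(46));
}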

Compared to the solution in the top post, this is less intuitive and harder for new users. Moreover, it depends on the fact that u32 is Copy, so mutation can be simulated by passing new values along. For other types, this may not be that easy and may require introducing Cell or RefCell.
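To illustrate the RefCell route for a non-Copy capture, here is a rough sketch. The driver `recurse` and the example `collatz_path` are hypothetical names chosen for this post, and the sketch assumes `start` is at least 1 so that the recursion terminates.

```rust
use std::cell::RefCell;

// Hypothetical driver: hands the closure a handle it can call to recurse.
fn recurse(f: &dyn Fn(&dyn Fn(u64), u64), arg: u64) {
    f(&|next| recurse(f, next), arg)
}

// Records every value visited by the Collatz iteration, assuming start >= 1.
fn collatz_path(start: u64) -> Vec<u64> {
    // The Vec is not Copy, so the shared closure mutates it through a RefCell.
    let path = RefCell::new(Vec::new());
    recurse(
        &|this, n| {
            // The mutable borrow ends with this statement, before recursing.
            path.borrow_mut().push(n);
            if n != 1 {
                this(if n % 2 == 0 { n / 2 } else { 3 * n + 1 });
            }
        },
        start,
    );
    path.into_inner()
}
```

The borrow has to be released before the recursive call; holding the `borrow_mut()` guard across `this(..)` would panic at run time, which is the kind of bookkeeping the proposal is trying to avoid.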

All I want from this feature is not to change anything that works today, but to allow a new way to express the same intention. So if you don’t like it, don’t use it! This applies to @H2CO3’s refactoring argument and to @MajorBreakfast’s FnBox argument. (And actually, if we do need FnBox, my proposal is an improvement, because the user has full access to the Box and the closure code can decide to move the content out of the box, which is not allowed in today’s Rust closures.)

As for real usage, well, it depends on how you define “complicated”. In the fib case, it seems easy to have a separate function. But this applies to all closures: if you always prefer separately defined functions, why do you need closures at all? Yet closures have proved their value in all modern programming languages.

Having said that, let me try to give another use case. Let’s say that an application listens on a port for connections. It only handles one connection at a time, but each time it receives a connection and handles it, it changes the way it authenticates the client.

let mut password = gen_password();
accept_client(listen_socket, move |mut self, client| {
    handle(client, &self.password);
    next_password(&mut self.password);
    accept_client(self.listen_socket, self)
});

As I have said, you can decide to not use the self part and you don’t have to use self in the body, so you can extract the function easily. But even with self you can still do it like this:

let mut password = gen_password();
accept_client(listen_socket, move |mut self, client| {
    let closure = self;
    handle(client, &closure.password);
    next_password(&mut closure.password);
    accept_client(closure.listen_socket, closure);
});

#11

You’d reach for a fix-point combinator? I’m impressed.

Call me a simple man, but I’d just go for this:

let fib: Box<dyn Fn(i32) -> i32> = Box::new(|i| {
    struct Closure { i1: i32, i2: i32 }
    impl Closure {
        fn call(&mut self, i: i32) -> i32 {
            assert!(i >= 0);
            match i {
                0 => self.i1,
                1 => self.i2,
                i => {
                    let old_i1 = self.i1;
                    self.i1 = self.i2;
                    self.i2 += old_i1;
                    self.call(i - 1)
                }
            }
        }
    }
    Closure { i1: 1, i2: 1 }.call(i)
});

#12

Sorry, but this argument still doesn’t work. If a language feature is introduced, it will be used, and eventually anyone will encounter and have to deal with code relying on (and bugs arising out of) that feature.


#13

I mean, you can’t prevent someone else from writing bad code, but at least you can decide not to write it yourself.

Back to the discussion of this feature: I don’t see how it should change your view of a piece of code. It is true that when you look at self you need to ask yourself what it refers to, as its meaning depends on context. But this is already happening right now. Look at @ExpHP’s code: the meaning of self changes inside the method body there too.


#14

In what way is it changed? Isn’t it the method’s receiver, as it always was?


#15

If the outer let is inside a method, then from the outside of the block, self is the outer method’s receiver, which will be a different object than the inner self.

This is exactly the same as in my code: self inside the closure now refers to a new object, regardless of whether the value is defined outside or not.
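A small sketch of that situation in today's Rust, with made-up names `Outer`, `Inner`, `run` and `get`: an impl block nested in a method body has its own self, distinct from the receiver of the enclosing method.

```rust
struct Outer {
    value: i32,
}

impl Outer {
    fn run(&self) -> (i32, i32) {
        // An item declared inside a method body gets its own `self`.
        struct Inner {
            value: i32,
        }
        impl Inner {
            fn get(&self) -> i32 {
                self.value // `self` is the Inner value here, not the Outer receiver
            }
        }
        let inner = Inner { value: self.value + 1 };
        (self.value, inner.get())
    }
}
```

Both structs have a field called `value`, yet each `self.value` resolves against the innermost `fn` that declares a `self` parameter.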


#17

His code is structured something like let fib:... = ...;, so it is a single let binding, which can be part of a function/method body.


#18

That’s not where the problem lies though. In that case, you could still always tell that self belongs to the innermost method.

Instead, the problem is found in code like this:

impl Foo {
  fn bar(&self) {
    self.do_thing(|&self| qux(self.some_member))
  }
}

Here, the self inside the closure refers to the captured variables of the closure. Now if it becomes longer and more complex, I might want to refactor it into a method:

impl Foo {
  fn bar(&self) {
    self.do_thing(Self::frobnicate)
  }

  fn frobnicate(&self) {
    qux(self.some_member)
  }
}

Here, the meaning of self changed from “closure capture” to “method receiver”. It is no longer the case that it is always the receiver of the innermost method; instead, it’s sometimes that, and sometimes a closure capture. And if some_member happens to be in both, the code can still compile and silently do the wrong thing, which is the worst possible scenario.


#19

> you could still always tell that self belongs to the innermost method.

True for the closure as well - you can still always tell once you see the |&self...| heading.

> Here, the meaning of self changed from “closure capture” to “method receiver”.

True again. But this happens every time you refactor carelessly. To refactor safely, you have to be careful about all the names the original code refers to. This feature doesn’t make it worse or better; just be careful.


#20

The problem is not with the syntax or that I don’t know what self is; the problem is that its meaning changes.

By that reasoning, we wouldn’t need any of Rust’s safety features, and we could just go back writing Heartbleed or goto fail in C. Dangling pointers or double-free? “You weren’t careful enough!” Race conditions? “You should have known better and used atomics!”

50+ years of computing history shows that “just be careful” doesn’t work. Programmers make mistakes no matter how careful they are, and a good language – especially one that claims to be correctness- and safety-oriented – tries to minimize the amount of errors one can make. That is ergonomic. A marginally-useful, but sometimes dangerous special case is not ergonomic.


#21

Right. So in terms of Rust safety features, your code below

fn bar(&self) {
    self.do_thing(|&self| qux(self.some_member))
}

should not compile, as the meaning of self has changed. To compile under the proposal, you will have to write:

fn bar(&self) {
    self.do_thing(|&self| qux(self.self.some_member))
}

The first self is the closure itself, and it captures the method self (outer one). So for the proposal, you will need to write self.self to refer to the outer self.

Once you later decide to refactor it, the compiler will again give you an error saying that the outer self does not have a member called self, so your code will not compile unless you remove the extra self.