[Pre-RFC] Explicit Closure Type


  • Feature Name: explicit_closure
  • Start Date: 2024-01-13

Summary

Instead of generating an in-place type for each closure, generate an explicit type and data layout for each locally written |x| { ... } expression.

Motivation

Explicit closure types allow passing a closure out of a function without wrapping the returned value inside Box<dyn ...>. Existing closure types become tricky when you try to return them from a function.

// without explicit closure type
fn curry_add(x: usize) -> Box<dyn Fn(usize) -> usize> {
    // `move` is required: the closure outlives `x`, so it must own it
    Box::new(move |y: usize| { x + y })
}

Introducing an explicit closure type may solve this problem smoothly, while the old method still applies when the closures that may be returned use two different data layouts.

// with explicit closure type
fn curry_add(x: usize) -> impl Fn(usize) -> usize {
    |y: usize| { x + y }
}

Of course, with an explicit closure, all captured data must be moved, and references to stack-allocated variables must not be returned from a function.

Guide-level explanation

A closure is a function with attached ('captured') data. The captured data can be stored and passed along with the function. For example, in the following function:

fn foo(x: X, y: Y) -> impl Fn(Z) -> W {
    let t = x + y;
    |z: Z| { z.mul(t).mul(x) }
}

The expression |z: Z| { ... } has type Closure<(X, T), (Z), W>, which means that values of types X and T are captured (and stored), (Z) is the argument list, and W is the return type.

Reference-level explanation

For captured values of types X1, X2, ..., XN (in capture order), argument types Y1, Y2, ..., YN, and return type Z, the compiler automatically generates a closure type that looks like Closure<(X1, X2, ..., XN), (Y1, Y2, ..., YN), Z>, together with a function-typed value. The closure type is declared as:

struct Closure<X, Y, Z> {
    // captured data
    data: X,
    // points to an automatically generated function 
    primitive: fn (X, Y) -> Z
}
impl<X, (Y1, Y2, ..., YN), Z> Fn<(Y1, Y2, ..., YN)> for Closure<X, (Y1, Y2, ..., YN), Z> {
    type Out = Z;
    fn apply(&self, args: (Y1, Y2, ..., YN)) -> Z {
        // pseudocode: forward the captured data and the argument tuple
        (self.primitive)(self.data, args)
    }
}

This cannot work. Closures are more than their type signature and the data they capture. The type of a closure must also determine the behavior when called. This is the reason why the types of closures like |x: i32, y: i32| x + y and |x: i32, y: i32| x - y are distinct, as are the types of actual fn items like fn add(x: i32, y: i32) -> i32 { x + y } and fn sub(x: i32, y: i32) -> i32 { x - y }. And all of these are distinct from the function pointer type fn(i32, i32) -> i32.
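
To make the distinction concrete, here is a minimal sketch that runs on stable Rust. The helper name call_static is mine, purely for illustration. It shows that closures and fn items are zero-sized (their type alone identifies the code), while a function pointer has to carry an address at run time:

```rust
use std::mem::size_of_val;

fn add(x: i32, y: i32) -> i32 { x + y }
fn sub(x: i32, y: i32) -> i32 { x - y }

/// Illustrative helper: statically dispatched, monomorphized once per concrete `F`.
fn call_static<F: Fn(i32, i32) -> i32>(f: F, x: i32, y: i32) -> i32 {
    f(x, y)
}

fn main() {
    let c_add = |x: i32, y: i32| x + y;
    let c_sub = |x: i32, y: i32| x - y;

    // Every closure and every fn item has its own zero-sized type.
    assert_eq!(size_of_val(&c_add), 0);
    assert_eq!(size_of_val(&c_sub), 0);
    assert_eq!(size_of_val(&add), 0);
    assert_eq!(size_of_val(&sub), 0);

    // All of them coerce to the single shared type fn(i32, i32) -> i32,
    // which must store the address of the target function.
    let p_add: fn(i32, i32) -> i32 = add;
    let p_sub: fn(i32, i32) -> i32 = c_sub; // non-capturing closures coerce too
    assert_ne!(size_of_val(&p_add), 0);

    assert_eq!(call_static(c_add, 5, 3), 8);
    assert_eq!(call_static(sub, 5, 3), 2);
    assert_eq!(p_add(5, 3), 8);
    assert_eq!(p_sub(5, 3), 2);
}
```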


Edit: I see your explanation goes on to show a concrete layout of struct Closure<X, Y, Z>; in line with what I’m saying above, just note that this is different from actual closures in Rust, which do not contain any function pointer.

Can you elaborate a bit on why function types don't work even if there is a function pointer inside the captured closure?

Alright, maybe “doesn’t work” isn’t quite accurate. It’s more that this is entirely different from how closures currently work. Your Pre-RFC description makes it sound like you want every closure expression to be of this kind of type, which would be an undesirable change if it makes closures generally less efficient by introducing virtual function calls.

Now I see what you mean after the edit. From what you describe, the current closure implementation treats the inner function as part of the closure's type (a dependent type under some encoding, as I see it). Calls can then be inlined directly and may enable further optimizations, which is hindered when the function is supplied as a function pointer, because the pointer carries no information about the function body. But I believe there is still a way to get equivalent performance: with good enough constant propagation, the target of the function pointer can be recovered at the call site.


Edit: As Rust is not a dependently typed language (for users), I believe explicit closure types are easier for ordinary users to understand.

Optimizations for devirtualization can never take you the whole way. For example, one thing that closures are currently very good at is guaranteeing they’ll be inlined if used only once, since every use site is already monomorphized for a distinct closure type anyway. Also, the zero-sized nature of non-capturing closures has advantages. To name just one thing that comes to mind:

If you create a Box::new(|x: i32, y: i32| x + y) as Box<dyn Fn(i32, i32) -> i32>, then because the original closure is a zero-sized type, the Box<dyn Fn(…)> created from it still does not involve any heap allocation.
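
A small sketch to check this on stable Rust. The function names boxed_add and boxed_add_offset are made up for the example. It relies on size_of_val reporting the size of the value behind the trait object, and on the documented behavior that Box::new does not allocate for zero-sized values:

```rust
use std::mem::size_of_val;

/// Illustrative: boxes a non-capturing (zero-sized) closure as a trait object.
fn boxed_add() -> Box<dyn Fn(i32, i32) -> i32> {
    Box::new(|x: i32, y: i32| x + y)
}

/// Illustrative: boxes a closure that captures one `i32`.
fn boxed_add_offset(offset: i32) -> Box<dyn Fn(i32, i32) -> i32> {
    Box::new(move |x: i32, y: i32| x + y + offset)
}

fn main() {
    let zero_sized = boxed_add();
    let capturing = boxed_add_offset(10);

    // For a trait object, size_of_val reports the size of the erased value.
    assert_eq!(size_of_val(&*zero_sized), 0);
    assert_eq!(size_of_val(&*capturing), std::mem::size_of::<i32>());

    assert_eq!(zero_sized(1, 2), 3);
    assert_eq!(capturing(1, 2), 13);
}
```

With an explicit function pointer stored inside every closure value, the first box would no longer be zero-sized.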

Hmm. In the zero-sized case, I believe coercing the closure to a fn is a better choice. For functions that return a zero-sized closure in some cases and a non-zero-sized one in others, you still have to allocate heap space and use virtual function calls anyway.

Maybe another type called ClosureOnce could be introduced for this case. Calls to ClosureOnce could be inlined immediately, similar to data layout optimizations described in the literature.

Can you elaborate a bit more on what the use case is? You are only showing how such an explicit closure type is generated, not how it is used or what advantage it has over Impl trait in type aliases - Impl trait initiative.

May I guess that you want an abstraction only over the behaviour and not over the arguments and captured state? To what extent can you implement that in a library, e.g. via a macro?

Functions can directly return explicit closures that use the same data layout. This gives the optimizer more chances to avoid virtual function calls, and it helps avoid heap allocations when you use impl Fn types in argument position.

I don't know what 'abstraction over behavior' means. And I don't see an explicit connection between this topic and the Impl trait in type aliases - Impl trait initiative you mentioned. I'm really confused. I will be more than grateful if you can elaborate on that.

Function arguments and function return types that involve “impl Fn…” types already involve no virtual function calls or heap allocations (beyond what the closure type being used already does). The language feature of function arguments with “impl Fn…” works very differently from function return types with “impl Fn…”, by the way, but that’s a separate concern.
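
As a rough illustration on stable Rust (apply_twice is a name I made up; curry_add mirrors the example from the motivation section), both positions are statically dispatched, and the returned closure is nothing more than its captured data stored inline:

```rust
/// Argument position: generic over whatever closure type the caller passes.
fn apply_twice(f: impl Fn(usize) -> usize, x: usize) -> usize {
    f(f(x))
}

/// Return position: one concrete (unnamed) closure type chosen by the body.
fn curry_add(x: usize) -> impl Fn(usize) -> usize {
    move |y: usize| x + y
}

fn main() {
    let add3 = curry_add(3);

    // The value holds only the captured usize: no Box and no function pointer.
    assert_eq!(std::mem::size_of_val(&add3), std::mem::size_of::<usize>());

    assert_eq!(add3(10), 13);
    assert_eq!(apply_twice(&add3, 1), 7);
    assert_eq!(apply_twice(curry_add(2), 1), 5);
}
```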


The connection is that you’re seemingly proposing a feature that gives a name to closure types, and #![feature(type_alias_impl_trait)] is a feature addressing the issue of giving names to closure types.


If you want to learn more about how closures in Rust work, what impl Trait return types are and how they’re represented at run-time, and how #![feature(type_alias_impl_trait)] is used and what problems it solves, feel free to do some online research (e.g. by consulting existing documentation, and maybe even RFC texts) and/or to seek help over on users.rust-lang.org.

Just to add, this doesn't even need to be a language feature, since it can be built on top of coercing uniquely-typed captureless closures to function pointers, e.g.

// option 1 (stable)
struct YjijiClosure<I, A, O> {
    captures: I,
    function: fn(&I, A) -> O,
}

// implementing Fn directly is unstable; a regular method works on stable
impl<I, A, O> YjijiClosure<I, A, O> {
    fn call(&self, args: A) -> O {
        (self.function)(&self.captures, args)
    }
}

fn close_over<I, A, O>(captures: I, function: fn(&I, A) -> O) -> YjijiClosure<I, A, O> {
    YjijiClosure { captures, function }
}

// option 2 (TAIT; nightly only, needs #![feature(type_alias_impl_trait)])
type Closure<I, A, O> = impl Fn(A) -> O;
fn close_over<I, A, O>(captures: I, function: fn(&I, A) -> O) -> Closure<I, A, O> {
    move |args| function(&captures, args)
}

This does require listing the captures instead of inferring them, but that's already required anyway if you want to fully name the type and unify distinct closures.
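
For illustration, here is how option 1 might be used on stable with a regular call method. The add_or_sub helper is hypothetical. It unifies two different behaviors under one nameable type, at the cost of storing a function pointer in each value:

```rust
struct YjijiClosure<I, A, O> {
    captures: I,
    function: fn(&I, A) -> O,
}

impl<I, A, O> YjijiClosure<I, A, O> {
    /// Regular method standing in for the unstable `Fn` impl.
    fn call(&self, args: A) -> O {
        (self.function)(&self.captures, args)
    }
}

fn close_over<I, A, O>(captures: I, function: fn(&I, A) -> O) -> YjijiClosure<I, A, O> {
    YjijiClosure { captures, function }
}

/// Hypothetical helper: two behaviors, one return type, no Box.
fn add_or_sub(x: usize, subtract: bool) -> YjijiClosure<usize, usize, usize> {
    // Non-capturing closures coerce to the same function pointer type.
    let add: fn(&usize, usize) -> usize = |x, y| *x + y;
    let sub: fn(&usize, usize) -> usize = |x, y| y - *x;
    close_over(x, if subtract { sub } else { add })
}

fn main() {
    // Both values have the same type, so they fit in one array.
    let ops = [add_or_sub(3, false), add_or_sub(3, true)];
    assert_eq!(ops[0].call(10), 13);
    assert_eq!(ops[1].call(10), 7);
}
```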

What I would find interesting is related but distinct: self-aware closures which can call themselves recursively. (This can again be shimmed for &self dispatch, but mut/once dispatch requires a language extension to borrowck captures as field access on closure-self.)

The motivation section seems to miss that you can already do this:

fn curry_add(x: usize) -> impl Fn(usize) -> usize {
    move |y: usize| { x + y }
}
// array of 3 closures with the same type:
let adders = [3,5,7].map(curry_add);
assert_eq!(adders.map(|f| f(10)), [13,15,17]);

You only need trait objects, and thus boxing, once there are multiple different closure types.
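
A sketch of that remaining case, with a hypothetical curry_op that picks between two closure expressions. Each expression has its own type, so a single impl Fn return type cannot cover both branches and a trait object is used instead:

```rust
/// Hypothetical example: the two branches yield two different closure types,
/// so the return type is a boxed trait object rather than `impl Fn`.
fn curry_op(x: usize, subtract: bool) -> Box<dyn Fn(usize) -> usize> {
    if subtract {
        Box::new(move |y: usize| y - x)
    } else {
        Box::new(move |y: usize| x + y)
    }
}

fn main() {
    let ops = [curry_op(3, false), curry_op(3, true)];
    assert_eq!((ops[0])(10), 13);
    assert_eq!((ops[1])(10), 7);
}
```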

Thanks for this informative example. I think this topic can be closed.
