Idea: let closures capture with argument passing semantics

Consider this snippet:

fn parse_line(line: &str) -> Result<Inst, String> {
    let mut words = line.split_whitespace();

    let mut parse_val = || {
        let expr = words.next();
        ...
    };

    let mut parse_label = || {
        let expr = words.next();
        ...
    };

    match words.next() {
        Some("mv") => Mv(parse_val()?, parse_val()?),
        Some("j") => J(parse_label()?),
        ...
    }
}

This does not work because both closures hold a mutable reference to words. However, there's no reason why this can't work, since we could make it work by passing words as an argument at each call site (and that's what I'm going to have to do, I guess). But what if we could get the same semantics as passing at each call site without having to write the argument explicitly?
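
For reference, here is a minimal sketch of the "pass it at each call site" version. The function name first_two and the upper/lower closures are made up for illustration, and the parameter type is annotated explicitly because the closure bodies call a method on the argument before any call site is seen.

```rust
use std::str::SplitWhitespace;

// Hypothetical example: neither closure captures `words`; each one borrows
// it only for the duration of a single call.
fn first_two(line: &str) -> Option<(String, String)> {
    let mut words = line.split_whitespace();

    let upper = |w: &mut SplitWhitespace<'_>| w.next().map(|s| s.to_uppercase());
    let lower = |w: &mut SplitWhitespace<'_>| w.next().map(|s| s.to_lowercase());

    // The two mutable borrows are sequential, so they never overlap.
    Some((upper(&mut words)?, lower(&mut words)?))
}
```

The closures return owned Strings here only to keep lifetimes out of the picture.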

Of course, if we pass the closure to another function, it has to take its references with it. So we'd have to keep the existing concept of a closure as an implicit struct holding references (or moved values), but we could implicitly take that struct apart and rebuild it each time the closure is used. I guess the reason we could do this with non-move closures and not with other values is that creating and destroying non-move closures has no side effects.

In any case, this is only useful for closures that are used in the same scope where they're created, which is not a very common use case. Just seems to be a case where we could be more permissive at no cost.


Why not make the "struct holding references" part explicit? Something like this:

let mut parser = impl {
    // Bikeshedding the syntax
    parse_val: || {
        let expr = words.next();
        ...
    },
    parse_label: || {
        let expr = words.next();
        ...
    },
};

match words.next() {
    Some("mv") => Mv(parser.parse_val()?, parser.parse_val()?),
    Some("j") => J(parser.parse_label()?),
    ...
}

Having parse_val and parse_label methods of parser makes it possible for the borrow checker to use its regular reasoning instead of adding complicated behind-the-scenes analysis rules for how they can be moved/borrowed relative to each other.

I also see value in being able to use this to implement traits with all the niceties of closures:

function_that_accepts_impl_trait(impl {
    some_method: |argument_that_has_its_type_inferred| {
        argument_that_has_its_type_inferred + value_from_closure_i_dont_need_to_define_a_field_for
    },
    method_that_consumes_self: move || ...,
    type NeedToSupport = AssociatedTypes,
    const AlsoNeed = AssociatedConstants,
});
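
For comparison, a rough sketch of what the same thing takes in today's Rust. The trait Transform, the struct AddOffset and the functions run and demo are all hypothetical stand-ins; the point is only that the captured value has to become a named field of a named type.

```rust
// Hypothetical trait standing in for whatever the callee expects.
trait Transform {
    fn apply(&self, x: i32) -> i32;
}

// The "captured" value must be declared as an explicit field.
struct AddOffset {
    offset: i32,
}

impl Transform for AddOffset {
    fn apply(&self, x: i32) -> i32 {
        x + self.offset
    }
}

// Stand-in for a function that accepts `impl Trait`.
fn run(t: impl Transform) -> i32 {
    t.apply(10)
}

fn demo() -> i32 {
    let offset = 5;
    run(AddOffset { offset })
}
```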

That's a good point; having anonymous trait implementers would be more generally useful.

For the record, passing words as an argument turned out to be hard because the compiler couldn't infer the parameter's type. I guess the call site was far enough away from the closure definition that it couldn't be used in type inference. So what I did instead was move the mutability out of the closures:

match words.next() {
    Some("mv") => Mv(parse_val(words.next())?, parse_val(words.next())?),
    Some("j") => J(parse_label(words.next())?),
    ...
}
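
A self-contained sketch of that approach might look like the following. The Inst enum, the i64 value type, the owned String label and the error messages are assumptions made for the example; the original snippet does not define them.

```rust
// Assumed instruction type; the original post does not show it.
#[derive(Debug, PartialEq)]
enum Inst {
    Mv(i64, i64),
    J(String),
}

fn parse_line(line: &str) -> Result<Inst, String> {
    let mut words = line.split_whitespace();

    // Neither closure captures `words`; each receives the next token instead.
    let parse_val = |expr: Option<&str>| -> Result<i64, String> {
        let expr = expr.ok_or("missing value")?;
        expr.parse().map_err(|_| format!("not a number: {}", expr))
    };
    let parse_label = |expr: Option<&str>| -> Result<String, String> {
        expr.map(str::to_owned)
            .ok_or_else(|| "missing label".to_string())
    };

    match words.next() {
        Some("mv") => Ok(Inst::Mv(parse_val(words.next())?, parse_val(words.next())?)),
        Some("j") => Ok(Inst::J(parse_label(words.next())?)),
        Some(other) => Err(format!("unknown instruction: {}", other)),
        None => Err("empty line".to_string()),
    }
}
```

Since the closures no longer touch words at all, only the function body itself ever borrows it mutably.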

Well, at that point, why not make "struct holding references" really, really explicit?

use std::str::SplitWhitespace;

// NB: annoyingly, you need two lifetimes here, because `&'b mut T` is
// invariant in `T`: with a single lifetime, the borrow of the local `words`
// would have to last as long as `line` itself.
struct Parser<'a, 'b> {
    words: &'b mut SplitWhitespace<'a>,
}

impl<'a, 'b> Parser<'a, 'b> {
    // Return types and bodies elided.
    fn parse_val(&mut self) -> _;
    fn parse_label(&mut self) -> _;
}

let mut parser = Parser { words: &mut words };
// ... 

When terse syntax is introduced, I worry if the terseness comes primarily at the cost of clarity and explicitness.
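
Filled in, the explicit struct could look roughly like this. The Inst enum, the i64 value type and the error strings are assumptions added to make the sketch compile; only the Parser shape comes from the snippet above.

```rust
use std::str::SplitWhitespace;

// Assumed instruction type, not part of the original snippet.
#[derive(Debug, PartialEq)]
enum Inst<'a> {
    Mv(i64, i64),
    J(&'a str),
}

struct Parser<'a, 'b> {
    words: &'b mut SplitWhitespace<'a>,
}

impl<'a, 'b> Parser<'a, 'b> {
    fn parse_val(&mut self) -> Result<i64, String> {
        let expr = self.words.next().ok_or("missing value")?;
        expr.parse().map_err(|_| format!("not a number: {}", expr))
    }

    // The label borrows from the input line ('a), not from the parser.
    fn parse_label(&mut self) -> Result<&'a str, String> {
        self.words.next().ok_or_else(|| "missing label".to_string())
    }
}

fn parse_line(line: &str) -> Result<Inst<'_>, String> {
    let mut words = line.split_whitespace();
    let op = words.next();
    let mut parser = Parser { words: &mut words };
    match op {
        Some("mv") => Ok(Inst::Mv(parser.parse_val()?, parser.parse_val()?)),
        Some("j") => Ok(Inst::J(parser.parse_label()?)),
        Some(other) => Err(format!("unknown instruction: {}", other)),
        None => Err("empty line".to_string()),
    }
}
```

Both methods go through the same `&mut self`, so the borrow checker sees a single mutable borrow of words at a time.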


With explicit syntax around this it should also be reasonable to provide access from outside the closure itself:

let mut parser = impl {
    // Bikeshedding the syntax
    parse_val: || {
        let expr = words.next();
        ...
    },
    parse_label: || {
        let expr = words.next();
        ...
    },
};

consume_line(&mut parser);
parser.words = the_next_line.split_whitespace();
consume_line(&mut parser);

With a likely different syntax direction, it would also be fun to do this for generators, with the caveat that explicit attributes must not be borrowed across suspension points (they would live inside an UnsafePinned). Then we could pass a context value that the environment can modify between re-entries, which gives a model for effects. This can currently be achieved by misusing the Context argument, but only in a frankly insane manner, and I'd really like a similar interface with sane language support.

And for generators this has a much more obvious answer: having the compiler generate those Future impls through the transformation is the point of the async syntax.


I think that could be implemented with a macro.

I’m pretty sure this can’t be implemented with a macro; at least not without having to manually specify all captured variables for the macro… and specifying the ways in which each captured variable was captured (mutable borrow / immutable borrow / move)… and the type signatures of all the methods and/or the definition of the trait (or have a separate macro per trait).



Tangential to the discussion, you can achieve what OP wants without turning these into bare functions by exploiting a Cell<Option<T>>.

use std::cell::Cell;

struct LazyMut<T>(Cell<Option<T>>);

impl<T> LazyMut<T> {
    fn new(value: T) -> Self {
        Self(Cell::new(Some(value)))
    }

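    // Temporarily moves the value out of the cell, so nothing is borrowed
    // while `f` runs. Panics if `f` re-enters `with` on the same `LazyMut`,
    // because the cell is empty for the duration of the call.
    fn with<U>(&self, f: impl FnOnce(&mut T) -> U) -> U {
        let mut value = self.0.take().expect("LazyMut::with called re-entrantly");
        let out = f(&mut value);
        self.0.set(Some(value));
        out
    }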
}

Then your access turns from words.next() to words.with(Iterator::next) and your code looks like

fn parse_line(line: &str) -> Result<Inst<'_>, &'static str> {
    let words = LazyMut::new(line.split_whitespace());

    let parse_val = || {
        let expr = words.with(Iterator::next).ok_or("missing value")?;
        expr.parse().map_err(|_| "value is not a number")
    };
    let parse_label = || words.with(Iterator::next).ok_or("missing label");
    let parsed = match words.with(Iterator::next) {
        Some("mv") => Inst::Mv(parse_val()?, parse_val()?),
        Some("j") => Inst::J(parse_label()?),
        Some(_) => return Err("unknown instruction"),
        _ => return Err("empty line"),
    };
    Ok(parsed)
}

More on topic, I like @197g's explicit syntax suggestion, since it maps conceptually onto a struct that has several &mut self methods, in exactly the way a regular closure maps onto a struct with a single call(&mut self) method. There's not much additional magic beyond what a closure already produces, so it should be easy to teach.
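
To make that mapping concrete, here is a small hand-written sketch. All names (Add, Tally, sum_with_closure and so on) are invented for the example, and the struct is only an approximation of what the compiler generates for a closure.

```rust
// A closure that mutably captures `total`...
fn sum_with_closure(items: &[i32]) -> i32 {
    let mut total = 0;
    let mut add = |x: i32| {
        total += x;
    };
    for &x in items {
        add(x);
    }
    total
}

// ...and a hand-written struct playing roughly the same role.
struct Add<'t> {
    total: &'t mut i32,
}

impl<'t> Add<'t> {
    fn call(&mut self, x: i32) {
        *self.total += x;
    }
}

fn sum_with_struct(items: &[i32]) -> i32 {
    let mut total = 0;
    let mut add = Add { total: &mut total };
    for &x in items {
        add.call(x);
    }
    total
}

// The multi-method version: one captured reference, several `&mut self` methods.
struct Tally<'t> {
    total: &'t mut i32,
}

impl<'t> Tally<'t> {
    fn add(&mut self, x: i32) {
        *self.total += x;
    }
    fn sub(&mut self, x: i32) {
        *self.total -= x;
    }
}

fn tally_demo() -> i32 {
    let mut total = 0;
    let mut t = Tally { total: &mut total };
    t.add(5);
    t.sub(2);
    total
}
```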

Maybe a fleshed-out suggestion helps to demonstrate what I'd like to be able to write explicitly:

let coro_with_attr = async type {
    pub input: Cursor<Arc<[u8]>>,
    // non-move from the environment
    pub ref borrowed_name: &'_ str,
    .. // mandatory to say: here is more compiler generated things
} {
    // coroutine body
    input.set_position(..);
    make_env_do_something().await;
    input.set_position(..);

    let _not_valid = &input;
    something_else().await;
    act_on(_not_valid) // error: borrowed across await. something with lifetime
        // bounds mentioning that the lt of `_not_valid` must end at the await.
}

// valid:
coro_with_attr.await;
// also valid:
coro_with_attr.input = Cursor::new(…);

Then the reference captured by the closure would alias with the local variable words. So you are basically proposing to allow aliased mutable references in Rust without interior mutability. This is probably unsound, and also breaks a fundamental principle of the language.

That's why you can implement this yourself with Cell or RefCell, but the compiler won't do it for you implicitly.
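
A minimal sketch of the RefCell variant, with made-up names (first_two, upper, lower): both closures capture words by shared reference, and the exclusive access is checked at run time instead of at compile time.

```rust
use std::cell::RefCell;

fn first_two(line: &str) -> Option<(String, String)> {
    let words = RefCell::new(line.split_whitespace());

    // Each call takes the mutable borrow and releases it before returning,
    // so the two closures can coexist.
    let upper = || words.borrow_mut().next().map(|w| w.to_uppercase());
    let lower = || words.borrow_mut().next().map(|w| w.to_lowercase());

    Some((upper()?, lower()?))
}
```

The cost is that an overlapping borrow (for example, one closure calling the other while holding borrow_mut) would panic at run time rather than fail to compile.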
