Pre-RFC: a :value matcher for macro_rules!

CAD97 · March 28, 2021, 12:52am

Summary

Add a new macro_rules matcher, $name:value, with semantics nearly^[1] identical to those of a function argument.

Motivation

Value arguments to function-like macros are tricky to deal with. While macro_rules macros don't suffer from the common and most egregious pitfalls of C-style preprocessor macros, such as misnested brackets and operator precedence, using an $:expr capture more than once still evaluates the expression more than once, duplicating side effects.

There is also the wrinkle of the drop timing of temporaries, which matters if your intent is to write a macro invocation with equivalent-to-function-call semantics. Suffice to say, let arg = $arg; has the incorrect drop behavior, and the current best practice is to expand to

match ( $arg0, $arg1, ) {
    ( arg0, arg1, ) => { /* macro body */ }
}

instead. We can simplify this and make getting the correct behavior easier on macro authors.
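As a minimal sketch of the match trick in today's Rust: the macro name `min_once!` and the helper `count_evaluations` below are hypothetical, chosen only for illustration. Both arguments are bound through a single `match`, so each argument expression runs exactly once.

```rust
use std::cell::Cell;

// Hypothetical macro for illustration: binds both arguments through a
// single `match`, so each argument expression is evaluated exactly once.
macro_rules! min_once {
    ( $a:expr, $b:expr ) => {
        match ($a, $b) {
            (a, b) => {
                if a <= b { a } else { b }
            }
        }
    };
}

// Hypothetical helper: returns the minimum and how many times the
// argument expressions were evaluated.
fn count_evaluations() -> (i32, u32) {
    let calls = Cell::new(0u32);
    // Each call to `next` bumps the counter, standing in for a side effect.
    let next = |v: i32| {
        calls.set(calls.get() + 1);
        v
    };
    let smallest = min_once!(next(3), next(7));
    (smallest, calls.get())
}

fn main() {
    let (smallest, calls) = count_evaluations();
    println!("min = {}, evaluations = {}", smallest, calls);
}
```

Here the counter ends at 2, one evaluation per argument, which is the behavior a function call would give.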

Guide-level explanation

(in a section explaining macro_rules matchers)

While $:expr is good for capturing an expression and copying that expression into the macro-expanded code, it does exactly that: it duplicates the captured expression to every expansion of the capture. If, for example, you wrote the trivial min! macro,

macro_rules! min {
    ( $a:expr, $b:expr ) => {
        if $a <= $b { $a } else { $b }
    };
}

then $a and $b are each expanded twice, once in the comparison and once in a branch, so whichever argument is selected is evaluated a second time, as opposed to a single time, as would be the case if min were a function. If you want the arguments to the macro to be evaluated a single time, as if they were simple function arguments, you can use the $:value matcher:

macro_rules! min {
    ( $a:value, $b:value ) => {
        if $a <= $b { $a } else { $b }
    };
}

This time, $a and $b are evaluated a single time upon invoking the macro, and each expansion of the capture refers to the same value, just like function arguments.
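To make the duplication concrete, here is a small sketch that instruments the $:expr version with a counter. The names `min_expr!` and `count_expr_evaluations` are hypothetical and exist only for this illustration; the macro body is the same as the $:expr min! above.

```rust
use std::cell::Cell;

// Same body as the `$:expr` version of `min!`, renamed for this sketch.
macro_rules! min_expr {
    ( $a:expr, $b:expr ) => {
        if $a <= $b { $a } else { $b }
    };
}

// Hypothetical helper: returns the minimum and how many times the
// argument expressions were evaluated.
fn count_expr_evaluations() -> (i32, u32) {
    let calls = Cell::new(0u32);
    // Each call to `next` bumps the counter, standing in for a side effect.
    let next = |v: i32| {
        calls.set(calls.get() + 1);
        v
    };
    // Expands to: if next(3) <= next(7) { next(3) } else { next(7) }
    let smallest = min_expr!(next(3), next(7));
    (smallest, calls.get())
}

fn main() {
    let (smallest, calls) = count_expr_evaluations();
    println!("min = {}, evaluations = {}", smallest, calls);
}
```

The counter ends at 3 rather than 2: both arguments run for the comparison, and the selected one runs again in the branch. Under the proposed $:value matcher the intent is that the count would be 2.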

Reference-level explanation

A new macro matching mode, $:value, is added. It captures the same grammar, has the same follow set, and can be expanded in the same positions as $:expr.

A macro_rules macro capturing an expression as $:value can only be used in expression position, not any other position (item, type, etc.). As such, extra information is provided to the compiler that it MAY use for nicer error messages. (When expanding an expression-position-only macro in item position, the current 1.51 rustc says "the usage of mac! is likely invalid in item context" (emphasis mine), which could be strengthened if all macro arms capture $:value.)

For a given capture $name:value, the captured expression is evaluated a single time upon entry into the macro expansion, whether $name is mentioned in the macro expansion zero, one, or any number of times. Every expansion of $name within the macro expansion refers to the name of the temporary where the captured expression was evaluated. If more than one $:value capture is present, they are evaluated from left to right. The intent is that this has identical semantics to that of a function argument capture.
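These rules can be approximated in today's Rust with the match trick. The sketch below is an emulation, not the proposed feature: `first!` and `evaluation_order` are hypothetical names, and the tuple scrutinee stands in for two $:value captures. The second argument is never used in the body, yet it is still evaluated, and after the first.

```rust
use std::cell::RefCell;

// Hypothetical macro emulating two `$:value` captures with the match trick.
// The body only uses the first binding; the second is evaluated regardless.
macro_rules! first {
    ( $a:expr, $b:expr ) => {
        match ($a, $b) {
            (a, _b) => a,
        }
    };
}

// Hypothetical helper: returns the macro result and the order in which
// the argument expressions ran.
fn evaluation_order() -> (i32, Vec<&'static str>) {
    let log = RefCell::new(Vec::new());
    // Records its label when evaluated, then yields the value.
    let traced = |label: &'static str, v: i32| {
        log.borrow_mut().push(label);
        v
    };
    let result = first!(traced("a", 1), traced("b", 2));
    let order = log.borrow().clone();
    (result, order)
}

fn main() {
    let (result, order) = evaluation_order();
    println!("result = {}, order = {:?}", result, order);
}
```

Tuple fields are evaluated left to right, so the log records "a" before "b" even though only the first value is returned.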

A compiler MAY implement this by expanding to a match expression:

macro_rules! mac {
    /* other arms */
    ( /* other captures */ $name:value /* other captures */ ) => {
        /* macro body */
    };
    /* other arms */
}
// "desugars" to
macro_rules! mac {
    /* other arms */
    ( /* other captures */ $name:expr /* other captures */ ) => {
        match $name {
            name => {
                /* macro body, $name replaced with name (hygienically) */
            }
        }
    };
    /* other arms */
}

but the compiler is expected to also handle the case where $:value is inside of a macro repetition, which cannot be directly implemented by just a desugaring of the macro_rules! invocation.

Drawbacks

$:value is another thing that macro_rules authors have to learn, but it is a small one, and it replaces having to learn the match trick to get correct drop timing. The drawbacks of $:value seem to be exclusively in (potential) complexity of implementation, and in just adding more to the language.

Rationale

This simplifies the authoring of macro_rules macros, as authors now no longer need to learn and remember to use the match trick to bind macro value arguments, and instead can just use the $:value matcher to get function-argument semantics. Thus, while adding to the semantics provided by the Rust compiler, it reduces the needed complexity to write correct macro_rules!.

Additionally, it is impossible to have a repetition of expr captures that has function-argument-like drop timing through use of the match trick alone, as it requires knowing the arity of the captures ahead of time to name each capture. $:value directly unlocks properly and fully variadic macros that act like function calls w.r.t. temporary lifetimes.

Alternatives

`$:place`

For exposition, we use macro_rules! m { ( $x:value ) => ( &$x ); } to explain functionality.

$:value as described above generates a new named temporary for the captured value, to match the behavior of function arguments exactly. That is, m!(array[0]) would call Index::index and return a reference to a copy of the indexed value (which would be dropped immediately, causing a borrowck error).

An alternative semantic, which I call $:place, captures the place in this situation, not the value. That is, m!(array[0]) would call Index::index and return that reference directly. If the capture were used as &mut $x, then IndexMut::index_mut would be called. If the capture is expanded only once, it behaves identically to an $:expr capture, except for the evaluation timing of side effects.

This adds a new concept to Rust, that of capturing a place directly. This is entirely impossible in surface Rust today. This form of capturing may be more intuitive to macro authors (who are already used to and use similar behavior from $:expr). However, as this is much more complicated on the implementation side than a simple $:value, and $:value offers most of the benefit of $:place without the extra implementation complexity, $:place is just offered here as an alternative.

`macro fn`

Another possibility that's been discussed is macro fn. Basically, these would be fn, and have the semantics of fn, but be duck typed (like macros) and semantically copy/pasted into the calling scope. This is basically the exact feature that "macro_rules! with function-like captures" is trying to serve, except for one important thing: a macro fn is likely still a fn in that it has one fixed arity, and can't be overloaded like a macro can. Basically, macro fn is asking for "macros 2.0," which is still desirable, but still a long way off. $:value offers a small improvement in the status quo without adding a completely new system into the compiler.

Prior art

TBD: do you know any other rich macro systems with similar evaluated-once capture semantics? Maybe Kotlin's inline fun?

Unresolved questions

- Do we want $:value or $:place semantics?
  - How much more complicated would $:place captures be to implement in the compiler than $:value? If the difference in effort is reasonably small, $:place becomes more appealing.
- How should $:value/$:place expansions show up in cargo expanded code? cargo expand isn't required/guaranteed to maintain semantics exactly due to e.g. hygiene which isn't represented in the expanded code, but it's currently accurate so long as name clashes are avoided, with minimal manual cleanup ($crate paths, mostly). Ideally, we should preserve the meaningfulness of cargo expanded code in the face of $:value/$:place.
  - $:value is simple enough in theory: just use the match expansion described above. That reduces the inaccuracy to name collisions, which is already expected.
  - $:place is much more complicated (maybe even impossible in general), due to "capturing a place" not being a surface Rust semantic (and my own loose spec). Probably the best available option would be to expand it as a $:value capture of x, &x, or &mut x depending on usage.
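One possible reading of that last option, sketched in today's Rust: a $:place-style capture that is only ever used mutably could be emulated by binding `&mut` of the argument once through `match`. The names `add_twice!` and `place_demo` are hypothetical, and this is an approximation under that assumption, not a specification of $:place.

```rust
use std::cell::Cell;

// Hypothetical macro: emulates a `$:place`-style capture by binding
// `&mut $p` once through `match`, then using that binding twice.
macro_rules! add_twice {
    ( $p:expr, $n:expr ) => {
        match (&mut $p, $n) {
            (p, n) => {
                *p += n;
                *p += n;
            }
        }
    };
}

// Hypothetical helper: returns the updated vector and how many times
// the index expression was evaluated.
fn place_demo() -> (Vec<i32>, u32) {
    let index_calls = Cell::new(0u32);
    // Stands in for an index computation with a side effect.
    let index = || {
        index_calls.set(index_calls.get() + 1);
        1usize
    };
    let mut v = vec![10, 20, 30];
    // The place `v[index()]` is resolved once, then written through twice.
    add_twice!(v[index()], 5);
    (v, index_calls.get())
}

fn main() {
    let (v, calls) = place_demo();
    println!("v = {:?}, index evaluations = {}", v, calls);
}
```

The element at index 1 is updated in place twice while the index expression runs only once. A plain $:expr capture expanded as `$p += $n; $p += $n;` would instead evaluate `index()` twice.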

Future possibilities

The $:place matcher (mentioned in the alternatives section) could potentially be added later and live alongside $:value as two options with different applicability. The author believes $:place offers a strict superset of the use-cases that $:value serves, but also that it may be more subtle than desired, thus its place as an alternative.

The oft-mentioned possibility of postfix macros (e.g. value.unwrap_or!(expr) as an alternative to value.unwrap_or_else(|| expr) that is TCP preserving for e.g. ?) also benefits from and can extend $:value (or $:place) semantics. It's (potentially) desirable that expr.mac!( ... ) cannot rewrite or otherwise impact the evaluation of expr, and only refer to the receiver's value as $self. Having this functionality already available to macros in the form of $:value (or $:place) would smooth the on-ramp for postfix macros.

(NOTE: this pre-RFC is not about postfix macros. Please do not bikeshed them here.)

Footnotes

[1]: I haven't gone through and verified that the described semantics are actually identical to function arguments, and instead rely on the fact that match is the current practice. The intent is for $:value to actually be equivalent to a function argument. However, it is not unreasonable that macro authors may instead want/expect $:place semantics (see the alternatives section), in which case the capture would differ w.r.t. capturing places, not values.

(Also, pre-RFC note from the author: I use place and value here, but I don't know if these are actually correct terminology as used by the rustc compiler. I hope that my explanation here is clear enough, and if it's not, please suggest ways to improve it for the real RFC.)

josh · March 28, 2021, 3:25am

I like the idea of the :value matcher, as a convenient shorthand to avoid multiple evaluation.

As you observed, :value does work the same way that RFC 2442 proposed postfix macros could handle their receiver. If this is accepted, I'd likely rework RFC 2442 to use :value. (I would suggest not tying the two together, though.)

CAD97 · April 3, 2021, 6:58am

If there are no further objections or notes, I'll go about filling out the last bits and putting this up as a proper RFC early next week.

jjpe · April 3, 2021, 10:05am

CAD97:

Additionally, we have the additional wrinkle of the drop timing of temporaries complicating matters further, if your intent is to write a macro invocation with equivalent-to-function-call semantics. Suffice to say, let arg = $arg; has the incorrect drop behavior, and the current best practice is to instead expand to
match ( $arg0, $arg1, ) {
    ( arg0, arg1, ) => { /* macro body */ }
}
instead. We can simplify this and make getting the correct behavior easier on macro authors.

On the whole I think this is a desirable feature to have. I just have a question about the snippet above. As someone who's written his fair share of fn-like macros that do more or less what the quote above says (including the let arg = $arg; trick), I've never had any (observable) issues with drop timing. Could you explain how that could go wrong when using this tactic, and how the match expression version improves on that?

felix.s · April 3, 2021, 11:47am

I feel like this kind of breaks the mental model of macros as a purely syntax-tree transformation. While the match trick does demonstrate it may be possible to explain in those terms, it becomes somewhat more involved.

Right now, if I have a macro definition like

macro_rules! min {
    ( $a:expr, $b:expr ) => {
        if $a <= $b { $a } else { $b }
    };
}

I can think of it as working in two phases: first the invocation of the macro is matched against the pattern in its definition, then the captured tokens are substituted into the macro body. Matchers are only needed to define which grammar production they capture, and can be forgotten afterwards. Hygiene aside, I can think of macro substitution as simply splicing the captured grammar productions into the macro body as-is; with this change in place, this would no longer be the case. I would have to remember whether an expression was captured ‘as an expression’ or ‘as a value’, which adds to the cognitive overhead of reading macro code.

So I think that would make it harder to explain how pattern-macros work in full generality.

(I would also like some elaboration on why let arg = $arg is insufficient.)

But not to be entirely negative: I think I’d prefer a ‘syntactic inlining’ solution analogous to inlining in Kotlin in the long term (what I believe macro fn refers to). And in the meantime we can simply have a warn-by-default lint triggered when an expr-matched capture is used twice in a macro, which I believe should catch most of the pitfall cases.

CAD97 · April 3, 2021, 5:47pm

There are a number of subtle differences IIRC, not all of which I can recount off the top of my head, but the one that's easy to demonstrate is temporary lifetime extension (which is an oft-overlooked convenience feature of Rust):

[playground]

macro_rules! mlet {
    ( $e:expr, $x:expr ) => {{
        let e = $e;
        LoudDrop($x);
        e
    }};
}

macro_rules! mmatch {
    ( $e:expr, $x:expr ) => {
        match $e {
            e => {
                LoudDrop($x);
                e
            }
        }
    };
}

fn mfn<T>(e: T, x: &'static str) -> T {
    LoudDrop(x);
    e
}

struct LoudDrop(&'static str);
impl Drop for LoudDrop {
    fn drop(&mut self) {
        println!("{}", self.0);
    }
}

// borrowck error
fn a() {
    let _a = mlet!(&LoudDrop("a"), "b");
}

// compiles
fn b() {
    let _b = mmatch!(&LoudDrop("a"), "b");
}

// compiles
fn c() {
    let _c = mfn(&LoudDrop("a"), "b");
}

The more involved ones include the lifetime and drop timing of temporaries involved in a big().method().chain(), if my memory serves me well.
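For the two variants that compile, the drop order can also be observed directly. The sketch below is an adaptation, not the original playground: it swaps `println!` for a thread-local log so the order can be checked, and the names `Logged`, `DROPS`, `drops_of`, `match_order`, and `fn_order` are hypothetical.

```rust
use std::cell::RefCell;

thread_local! {
    // Records drop order so it can be inspected instead of printed.
    static DROPS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

// Like LoudDrop, but logs its label on drop instead of printing it.
struct Logged(&'static str);
impl Drop for Logged {
    fn drop(&mut self) {
        DROPS.with(|d| d.borrow_mut().push(self.0));
    }
}

macro_rules! mmatch {
    ( $e:expr, $x:expr ) => {
        match $e {
            e => {
                Logged($x);
                e
            }
        }
    };
}

fn mfn<T>(e: T, x: &'static str) -> T {
    Logged(x);
    e
}

// Clears the log, runs `f`, and returns the labels in drop order.
fn drops_of(f: impl FnOnce()) -> Vec<&'static str> {
    DROPS.with(|d| d.borrow_mut().clear());
    f();
    DROPS.with(|d| d.borrow().clone())
}

fn match_order() -> Vec<&'static str> {
    drops_of(|| {
        let _r = mmatch!(&Logged("a"), "b");
    })
}

fn fn_order() -> Vec<&'static str> {
    drops_of(|| {
        let _r = mfn(&Logged("a"), "b");
    })
}

fn main() {
    println!("match: {:?}", match_order());
    println!("fn:    {:?}", fn_order());
}
```

In both cases the inner value ("b") is dropped first and the argument temporary ("a") is dropped at the end of the enclosing let statement, so the match version lines up with the function call.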

SlightlyOutOfPhase · April 3, 2021, 10:34pm

This is a great idea. The more powerful macro_rules! macros can be made without introducing any of the trade-offs proc macros have to make (long compile times, complete separation from "normal code", additional dependencies, etc.), the better, I'd say.

CAD97 · April 8, 2021, 3:00am

I opened the RFC at

system · July 7, 2021, 3:00am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
