Macro metavariables matching an empty fragment?

I want to implement a macro something like:

foo! {
    fn foo(&self)
    fn bar(&mut self)
}
// expands to:
fn foo(&self) -> &'static str {
    "foo"
}
fn bar(&mut self) -> &'static str {
    "bar mut"
}

AFAIK the only way to implement this today is to capture the parameters with a tt metavariable and dispatch to separate branches:

macro_rules! foo {
    ( $( fn $fn_name:ident ( & $($params:tt)+ ) )* ) => {
        $(
            fn $fn_name (& $($params)+) -> &'static str {
                foo!(@params $fn_name $($params)+)
            }
        )*
    };
    (@params $fn_name:ident self) => {
        stringify!($fn_name)
    };
    (@params $fn_name:ident mut self) => {
        concat!(stringify!($fn_name), " mut")
    };
}

which is verbose and makes it hard to track errors.

I think introducing an empty fragment specifier, whose metavariable matches an empty token sequence, would help reduce branches and improve readability in such cases. For example, the macro above could then be implemented as follows:

macro_rules! foo {
    // `empty` metavariable must come after the macro pattern to prevent local ambiguity
    ( $( fn $fn_name:ident ( & $(mut $mut_used:empty )? self ) )* ) => {
        $(
            fn $fn_name (& $(mut $mut_used)? self) -> &'static str {
                concat!(stringify!($fn_name), $(" mut" $mut_used)?)
            }
        )*
    };
}

Thanks to $mut_used, the compiler knows that the fragments mut or " mut" must be expanded only when the token mut is given. Also, since we used mut directly instead of $($params:tt)+, it is also easier to know that this macro optionally needs mut.

Any feedback is appreciated!


Unironically, I planned on starting work on an RFC for an :empty matcher on Monday! It is something I have wanted more times than I can count.


I've sometimes achieved this by placing a dummy variable within a further optional fragment, but it's far from ideal (it isn't particularly clear, can be fussy about preceding or following tokens, can break in nested repetitions, etc):

macro_rules! foo {
    // `$mut_used` is a dummy `lifetime` metavariable that never actually matches;
    // it only gives the outer `$(mut ...)?` repetition something to repeat on
    ( $( fn $fn_name:ident ( & $(mut $($mut_used:lifetime)? )? self ) )* ) => {
        $(
            fn $fn_name (& $(mut $($mut_used)?)? self) -> &'static str {
                concat!(stringify!($fn_name), $(" mut" $($mut_used)?)?)
            }
        )*
    };
}
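
For anyone who wants to try this on stable Rust, here is a minimal sketch of the workaround in use inside an impl block. The `Demo` struct is a hypothetical name chosen only for illustration; the macro pattern is the same as above.

```rust
macro_rules! foo {
    // `$mut_used` is a dummy: no lifetime ever appears in the input, so the
    // inner `$(...)?` always matches zero times. It exists only so that the
    // outer `$(mut ...)?` contains a metavariable the transcriber can repeat on.
    ( $( fn $fn_name:ident ( & $(mut $($mut_used:lifetime)? )? self ) )* ) => {
        $(
            fn $fn_name (& $(mut $($mut_used)?)? self) -> &'static str {
                concat!(stringify!($fn_name), $(" mut" $($mut_used)?)?)
            }
        )*
    };
}

// Hypothetical type used only to host the generated methods.
struct Demo;

impl Demo {
    foo! {
        fn foo(&self)
        fn bar(&mut self)
    }
}
```

Calling `foo()` on a `Demo` value should give the bare method name, and `bar()` should give the name with the " mut" suffix, since only the second signature triggers the outer repetition.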

An empty matcher sounds confusing. What should $($e: empty)? match, for example? (EDIT: and $($e: empty)* is an immediate infinite loop) IMHO it should be a "token matcher". What you really want is to bind mut to some matcher and splice it in if it was matched. But there is no way to capture a match of literal tokens. You could use $tt: tt, but of course that matches far more than a literal mut.

So maybe a better alternative would be something like $m: tokens(mut)? Like this:

macro_rules! foo {
    ( $( fn $fn_name:ident ( & $( $mut_used: tokens(mut) )? self ) )* ) => {
        $(
            fn $fn_name (& $( $mut_used )? self) -> &'static str {
                concat!(stringify!($fn_name), $(" ", stringify!($mut_used) )?)
            }
        )*
    };
}

Why tokens and not token? Because you may want to match several specific tokens, and introducing a separate matcher for each one, when they must be treated as an indivisible multi-token sigil, is just error-prone syntax noise.

And yes, a feature like that would massively simplify certain macros (e.g. if you just want to forward any async, const, or mut qualifiers from a function-shaped input to the transcribed functions).

This is exactly what I wanted. I always tried this with something like $($dummy:tt)? other, and after several failures I concluded it was not possible in the current version. Nevertheless, I believe an empty matcher would improve the usability of macros.

Thanks for the feedback! This is actually similar to something I initially came up with (my version was $var_name:(...), without tokens), but I was led to the conclusion that the empty matcher would be better.

In my example, some occurrences of $mut_used are to be expanded to mut, while others are to be expanded to " mut". If the matcher binds the fragment mut itself, we need another macro branch to convert mut to " mut", as follows:

macro_rules! foo {
    ( $( fn $fn_name:ident ( & $( $mut_used:tokens(mut) )? self ) )* ) => {
        $(
            fn $fn_name (& $( $mut_used )? self) -> &'static str {
                concat!(stringify!($fn_name), $( foo!(@stringify $mut_used ) )?)
            }
        )*
    };
    (@stringify mut) => { " mut" };
    (@stringify ) => { "" };
}

which defeats my original goal of introducing syntax that removes macro branches.

Of course, $var_name:(...) would be better and less verbose if this feature is used more to forward the matched sequence, but I personally encountered more cases that need some conversion from the match result, like the following:

// Details omitted for brevity
bar! { fn foo() -> i32 } // expands to CallMethod("()I")
bar! { static fn foo() -> i32 } // expands to CallStaticMethod("()I")

Another reason why I think the empty matcher is better is that the token matcher needs more changes to the macro syntax. Admittedly, $($e:empty)? and $($e:empty)* are problematic, but this can be resolved by making the compiler emit the same error it already emits for $()? or $()*. The current compiler does not allow repetition operators to be applied to an empty list.

The original macro branches are rather complex, and it's not even the most complex code that needs to be written due to the lack of something like the proposed feature. Worst case one needs to resort to a recursive parser based on tt-munching.

The token replacement macros, on the other hand, are trivial to write and maintain. It's just literal tokens in - literal tokens out.
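
As a rough illustration of what such a token replacement macro can look like, here is a minimal sketch. The names `mut_suffix` and `describe` are hypothetical and not from the posts above; the sketch only relies on `concat!` expanding macro calls in its arguments.

```rust
// Hypothetical helper: maps an optional literal `mut` to a string suffix.
// Literal tokens in, literal tokens out.
macro_rules! mut_suffix {
    () => { "" };
    (mut) => { " mut" };
}

// Hypothetical caller: forwards whatever qualifier tokens follow the name
// to the replacement macro instead of branching on them itself.
macro_rules! describe {
    ($name:ident $($qualifier:tt)*) => {
        concat!(stringify!($name), mut_suffix!($($qualifier)*))
    };
}

fn describe_foo() -> &'static str {
    describe!(foo)
}

fn describe_bar() -> &'static str {
    describe!(bar mut)
}
```

The caller still has to capture the qualifier as tt, so this does not remove the ambiguity problems discussed earlier; it only shows that the conversion step itself stays small.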

The specific example with a string is also easy to solve using concat! and token matching. I edited my previous post with a code snippet fix.

Thing is, the only reason to use the proposed empty matcher is that you want to splice in a repetition matcher $(..)?, $(..)* or $(..)+ whose contents consist only of literal tokens, so there is nothing to repeat on. Introducing a new ambiguous matcher just to hack around this issue doesn't seem like a good solution. We should solve the problem directly.

In fact, the token matcher is probably not a good solution either, for the same reasons. Maybe add a way to capture the entire contents of a repetition?

You don’t always want the contents of the repetition; sometimes you want to do something based on its presence or absence but not actually use any of its tokens. So even if you could name the whole repetition, I’d still want some way to refer to it without actually using its tokens. :empty immediately has the “right” behavior by analogy with the other matchers, and the cases where it doesn’t make sense are as easy to detect and diagnose as, say, the tokens that aren’t allowed to follow :expr matchers.

EDIT: I think you could limit :empty matchers to being the last element in an otherwise non-empty repetition, which does suggest they’re “part of the repetition” rather than something independent. But I’d still want a way to iterate through a repetition without including its tokens, so…


Sounds like you want a no-op splicer instead of a no-op matcher. An RFC for that was already accepted (${ignore(foo)}).


Not “instead of”, but it could be two composable features, yeah.

With regard to repetition, it's worth noting there is precedent with :vis, as that can also match an empty token stream.
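
A minimal sketch of that :vis behavior, with a hypothetical `make_fn` macro and function names made up for the example: the same arm accepts input with and without a visibility qualifier.

```rust
macro_rules! make_fn {
    // `$v:vis` matches `pub`, `pub(crate)`, and so on, or nothing at all.
    ($v:vis fn $name:ident) => {
        $v fn $name() -> &'static str {
            stringify!($name)
        }
    };
}

// Visibility present: `$v` binds `pub`.
make_fn!(pub fn with_vis);
// Visibility absent: `$v` binds an empty fragment and the arm still matches.
make_fn!(fn without_vis);
```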

In the proposal I plan to post a few days from now, :empty is ignored when determining whether something can be repeated. So $($x:empty)* is obviously not allowed, as it would be equivalent to $()* for repetition purposes.

