Pattern matching by delimiters in macro definitions and invocations

rin · March 1, 2023, 9:07am

Currently, Rust ignores which delimiters are used in both macro definitions and invocations. This is a waste, since the distinction could be a syntactic asset, especially when defining domain-specific languages or parsers. For example, a macro could match on its delimiters like this:

macro_rules! delimited {
    ($tt:tt) => {
        f('(', $tt, ')')
    };
    {$tt:tt} => {
        f('{', $tt, '}')
    };
    [$tt:tt] => {
        f('[', $tt, ']')
    };
    // TBD
    "$tt:tt" => {
        f('"', $tt, '"')
    };
}

delimited!(pattern);
delimited!{pattern};
delimited![pattern];
delimited!"pattern";

Hi, to clarify: I'm not talking about any harm in having an extra set of delimiters. Quite the opposite: I'm talking about the waste in Rust's macros being indifferent to the difference between them. The pattern matching in my example macro always resolves to the first rule, whichever delimiters you invoke it with. I want to exploit the extra delimiters.
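The behaviour described here can be checked on stable Rust with a minimal sketch. The thread never defines `f`, so the `f` below is a hypothetical stand-in that just joins its arguments, and `stringify!` is used only so that `pattern` does not need to be a real variable. The proposed `"$tt:tt"` rule is left out because it is not valid syntax today.

```rust
// Hypothetical stand-in for the thread's undefined `f`: joins an opening
// character, the stringified pattern, and a closing character.
fn f(open: char, body: &str, close: char) -> String {
    format!("{open}{body}{close}")
}

// On current Rust, a rule's outer delimiters are not compared with the
// delimiters used at the call site, so the first rule matches every time.
macro_rules! delimited {
    ($tt:tt) => { f('(', stringify!($tt), ')') };
    {$tt:tt} => { f('{', stringify!($tt), '}') };
    [$tt:tt] => { f('[', stringify!($tt), ']') };
}

// The same input invoked with each delimiter style.
fn expansions() -> [String; 3] {
    [delimited!(pattern), delimited!{pattern}, delimited![pattern]]
}
```

Calling `expansions()` yields three identical strings, each produced by the first rule.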

afetisov · March 1, 2023, 3:03pm

That's a breaking change, since macros can currently be used with arbitrary bracket style.

josh · March 1, 2023, 6:00pm

It wouldn't be a breaking change to allow new macros to opt into distinguishing between the kind of brackets they were invoked with. Whether we want to do that is another question, but it wouldn't be a breaking change.

(Changing that for an existing macro would be a breaking change, but that's up to the semver handling of the crate providing the macro.)

rin · March 1, 2023, 7:49pm

Generalising delimiters in macros has another extreme, namely using no delimiters at all:

macro_rules! undelimited {
    $tt:tt => {
        ..
    }
}

undelimited!pattern;

Just to show how we may use it:

u!000A;  // '\u{000A}' or '\n'

But this is also prone to ambiguity, as in:

undelimited!(0, 1, 2);  // Is this (0, 1, 2) for $tt:tt or 0, 1, 2 for ($e:expr, $f:expr, $g:expr)?
undelimited![0, 1, 2];  // Same
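For comparison, here is a small sketch of how current Rust resolves the same question. The macro name `which_rule` and the returned labels are made up for illustration. Today the outer delimiters are always consumed by the invocation, so `(0, 1, 2)` is seen as the three tokens-with-commas `0, 1, 2`, and an extra layer of parentheses is needed to pass a single token tree.

```rust
// Hypothetical macro that reports which rule matched.
macro_rules! which_rule {
    ($tt:tt) => { "one token tree" };
    ($e:expr, $f:expr, $g:expr) => { "three expressions" };
}

fn classify() -> [&'static str; 3] {
    [
        which_rule!(0, 1, 2),   // outer parens are the invocation delimiters
        which_rule![0, 1, 2],   // same tokens, so the same rule matches
        which_rule!((0, 1, 2)), // the inner parens form a single token tree
    ]
}
```

Under the undelimited proposal, the first two calls would need a rule for deciding whether the brackets belong to the invocation or to the argument.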

Ratatouille · March 2, 2023, 12:18pm

This is not only a non-breaking change but also a very useful one.

One example of its use is the ability to create a macro that mimics Python's comprehensions using different delimiters, such as '[]' for a vector, '{}' for a set/map, '()' for a plain iterator, and possibly '""' for a string (although this could potentially cause issues).
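As a rough illustration of the idea, the sketch below approximates such a comprehension macro on stable Rust. Because outer delimiters cannot be distinguished today, it relies on an inner layer of brackets to pick the container. The macro name `comp`, the `expr; for x in iter` syntax, and the choice of `BTreeSet` for the set case are all assumptions made for this example, not something taken from an existing crate.

```rust
// Hypothetical comprehension macro: the inner delimiter selects the container.
// A `;` separates the element expression from `for`, since an `expr` fragment
// may only be followed by `=>`, `,` or `;`.
macro_rules! comp {
    // [expr; for x in iter] collects into a Vec
    ([$e:expr; for $x:ident in $iter:expr]) => {
        $iter.into_iter().map(|$x| $e).collect::<::std::vec::Vec<_>>()
    };
    // {expr; for x in iter} collects into a BTreeSet
    ({$e:expr; for $x:ident in $iter:expr}) => {
        $iter.into_iter().map(|$x| $e).collect::<::std::collections::BTreeSet<_>>()
    };
}

fn squares_and_parities() -> (Vec<i32>, std::collections::BTreeSet<i32>) {
    (
        comp!([x * x; for x in 1..4]),
        comp!({x % 2; for x in 1..4}),
    )
}
```

With delimiter-sensitive matching, the outer `comp!(...)` layer could be dropped and the calls written as `comp![...]` and `comp!{...}`.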

Personally, I find this feature incredibly helpful, and I am not alone in this opinion. crates.io is full of comprehension macros that perform similar tasks, and this change would make them even more useful for writing logic compactly (for those of us who appreciate this kind of macro).

afetisov · March 2, 2023, 1:34pm

It doesn't make sense to say "sometimes you can use whatever brackets with a macro, sometimes different brackets cause entirely different results". From a practical perspective, this is a breaking change. Suddenly I can no longer write macros in whichever way is more convenient, but must look up, for every single macro in existence, how brackets affect its behaviour. All documentation, including printed books, currently claims that macro brackets don't matter. All of it would become invalid overnight if this feature were stabilized. It would just be endlessly confusing to users, for no good reason.

It certainly doesn't make sense to leave bracket dependence to each individual macro. Either it matters for all of them or for none of them; anything else will just be a source of errors. Currently both macros and their consuming code assume that bracket style doesn't matter, so changing it to "it always matters" is a breaking change.

The claims about usefulness are overblown. It's trivial to distinguish different cases: make macros with different names, introduce a distinguishing sigil at the beginning of the token stream, or enclose all contents in another layer of brackets. All of these solutions are simple and common, and don't require major syntactic overhead.
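Two of these workarounds can be sketched on stable Rust as follows. This is only an illustration: the `@quoted` sigil is an arbitrary choice, and the macro returns a plain tuple in place of the original post's undefined `f`.

```rust
// Distinguishing cases today: an inner layer of brackets, or a leading sigil.
macro_rules! delimited {
    // Inner brackets: the matcher looks one level inside the invocation.
    (($($tt:tt)*)) => { ('(', stringify!($($tt)*), ')') };
    ([$($tt:tt)*]) => { ('[', stringify!($($tt)*), ']') };
    ({$($tt:tt)*}) => { ('{', stringify!($($tt)*), '}') };
    // Leading sigil: `@quoted` (a made-up marker) selects the string case.
    (@quoted $($tt:tt)*) => { ('"', stringify!($($tt)*), '"') };
}

fn workarounds() -> [(char, &'static str, char); 4] {
    [
        delimited!((pattern)),
        delimited!([pattern]),
        delimited!({pattern}),
        delimited!(@quoted pattern),
    ]
}
```

The cost compared with the proposal is one extra pair of delimiters (or one extra token) at every call site.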

Ratatouille · March 2, 2023, 3:19pm

Changing documentation to account for a new rule is not a breaking change. Otherwise, every rule change is, by definition, a breaking change.

This makes sense because it allows differentiation between types of macros, in the sense that there is a default rule (written for {}, for example), and every other bracket falls back to that default unless it has its own rule. This is the same idea as default implementations (for traits, for example), which we all use already.

The compactness of logic is crucial for some applications. For instance, lambda functions in Java are significantly more concise than writing a separate class just to provide a function as an argument. While the two are technically equivalent, the reduced boilerplate can make the logic much easier to comprehend.

Ratatouille · March 3, 2023, 7:05am

Changed the name. Also, yes, this is why half of the proposals are to remove a bracket or something similar; it's very important.

system · June 1, 2023, 7:05am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
