Pattern matching by delimiters in macro definitions and invocations

Currently, Rust ignores which delimiters are used in both macro definitions and invocations. This is quite a waste, as the distinction between them is a syntactic asset, especially when defining domain-specific languages or parsers:

macro_rules! delimited {
    ($tt:tt) => {
        f('(', $tt, ')')
    };
    {$tt:tt} => {
        f('{', $tt, '}')
    };
    [$tt:tt] => {
        f('[', $tt, ']')
    };
    // TBD
    "$tt:tt" => {
        f('"', $tt, '"')
    };
}

delimited!(pattern);
delimited!{pattern};
delimited![pattern];
delimited!"pattern";

What's the harm in having an extra set of parentheses, braces, or brackets? It's only two characters.

This has already been rejected.

Hi, to clarify, I'm not talking about the harm of having an extra set of delimiters but the very opposite: the waste in Rust's macros being indifferent to which delimiter is used. In my example macro, pattern matching always resolves to the first rule, whatever delimiters you invoke it with. I want to exploit the extra information they carry.
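To make the current behaviour concrete, here is a minimal sketch that runs on stable Rust today. It reuses the delimited! macro from the first post, minus the proposed string-delimited rule; the returned string labels and the selected_rules helper are made up for illustration.

```rust
// Today's macro_rules! ignores the outermost delimiter of both the rule and
// the invocation, so the three rules below describe the same matcher and
// only the first one can ever be selected.
macro_rules! delimited {
    ($tt:tt) => { "paren rule" };
    {$tt:tt} => { "brace rule" };
    [$tt:tt] => { "bracket rule" };
}

/// Hypothetical helper: reports which rule each invocation style selected.
fn selected_rules() -> [&'static str; 3] {
    [delimited!(pattern), delimited![pattern], delimited! {pattern}]
}
```

Calling selected_rules() yields "paren rule" three times: the brace and bracket rules are unreachable.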

That's a breaking change, since macros can currently be used with arbitrary bracket style.

It wouldn't be a breaking change to allow new macros to opt into distinguishing between the kind of brackets they were invoked with. Whether we want to do that is another question, but it wouldn't be a breaking change.

(Changing that for an existing macro would be a breaking change, but that's up to the semver handling of the crate providing the macro.)

Generalising delimiters in macros also has an opposite extreme, which is to use no delimiters at all:

macro_rules! undelimited {
    $tt:tt => {
        ..
    }
}

undelimited!pattern;

Just to show how we may use it:

u!000A;  // '\u{000A}' or '\n'

But this is also ambiguity-prone as in:

undelimited!(0, 1, 2);  // Is this (0, 1, 2) for $tt:tt or 0, 1, 2 for ($e:expr, $f:expr, $g:expr)?
undelimited![0, 1, 2];  // Same

This is not only a non-breaking change but also a very useful one.

One example of its use is a macro that mimics Python's comprehensions and picks the collection from the delimiter: '[]' for a vector, '{}' for a set/map, '()' for a plain iterator, and possibly '""' for a string (although this last one could cause issues).

Personally, I find this feature incredibly helpful, and I am not alone in this opinion. crates.io is full of comprehension macros that perform similar tasks, and this change would make them even more useful for writing logic compactly (for those of us who appreciate this kind of macro).
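For readers who haven't seen one, a comprehension macro that works today might look roughly like the sketch below. The name comp! and its exact syntax are hypothetical and not taken from any particular crate, and it always builds a Vec because the invocation's delimiter is not visible to the macro.

```rust
// Hypothetical comprehension macro. It can only produce a Vec, since
// macro_rules! cannot see which delimiter the invocation used.
macro_rules! comp {
    // comp![expr; for pat in iter]
    ($e:expr; for $p:pat in $it:expr) => {
        ($it).into_iter().map(|$p| $e).collect::<Vec<_>>()
    };
    // comp![expr; for pat in iter; if cond]
    ($e:expr; for $p:pat in $it:expr; if $cond:expr) => {
        ($it)
            .into_iter()
            .filter_map(|$p| if $cond { Some($e) } else { None })
            .collect::<Vec<_>>()
    };
}

/// Squares of 1..4, written as a comprehension.
fn squares() -> Vec<i32> {
    comp![x * x; for x in 1..4]
}

/// Squares of the odd numbers in 1..6.
fn odd_squares() -> Vec<i32> {
    comp![x * x; for x in 1..6; if x % 2 == 1]
}
```

With delimiter matching, the same macro name could in principle return a set for comp!{..} and a lazy iterator for comp!(..); today those need separate macros or an extra marker.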

It doesn't make sense to say "sometimes you can use whatever brackets with a macro, sometimes different brackets cause entirely different results". From a practical perspective, this is a breaking change. Suddenly I can no longer write macro invocations in whichever way is more convenient, but must look up, for every single macro in existence, how brackets affect its behaviour. All documentation, including printed books, currently claims that macro brackets don't matter. All of it would become invalid overnight if this feature were stabilized. It would just be endlessly confusing to users, for no good reason.

It certainly doesn't make sense to leave bracket dependence to each individual macro. Either it matters for all of them or for none of them; anything else will just be a source of errors. Currently both macros and their consuming code assume that bracket style doesn't matter, so changing it to "it always matters" is a breaking change.

The claims about usefulness are overblown. It's trivial to distinguish different cases: make macros with different names, introduce a distinguishing sigil at the beginning of the token stream, or enclose all contents in another layer of brackets. All of these solutions are simple and common, and none requires major syntax overhead.
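As a rough sketch of the last two workarounds (the macro name coll!, the @set sigil, and the demo helper are all made up for illustration), inner brackets and leading tokens are matched literally by macro_rules! today, unlike the outermost delimiter:

```rust
// Hypothetical macro showing two of the workarounds listed above.
macro_rules! coll {
    // Extra layer of brackets: coll!([..]) builds a Vec.
    ([$($x:expr),* $(,)?]) => { vec![$($x),*] };
    // Extra layer of braces: coll!({..}) builds a BTreeSet.
    ({$($x:expr),* $(,)?}) => {
        ::std::collections::BTreeSet::from([$($x),*])
    };
    // Leading sigil: coll!(@set ..) also builds a BTreeSet.
    (@set $($x:expr),* $(,)?) => {
        ::std::collections::BTreeSet::from([$($x),*])
    };
}

/// Returns the Vec form plus both set forms (flattened to sorted Vecs).
fn demo() -> (Vec<i32>, Vec<i32>, Vec<i32>) {
    let v = coll!([1, 2, 2]);
    let by_braces: Vec<i32> = coll!({1, 2, 2}).into_iter().collect();
    let by_sigil: Vec<i32> = coll!(@set 3, 3, 4).into_iter().collect();
    (v, by_braces, by_sigil)
}
```

The cost is the two extra characters or the sigil at every call site, which is exactly the overhead the proposal wants to remove.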

Changing documentation to account for a new rule is not a breaking change. Otherwise, every rule change is, by definition, a breaking change.

This makes sense because it allows differentiating between kinds of invocation: there is a default rule (written for {}, for example), and every other bracket runs that default unless it has its own rule. This is the same idea as a default implementation (for traits, for example), which we all use already.

The compactness of logic is crucial for some applications. For instance, lambda functions in Java are significantly more concise than writing a separate class just to provide a function as an argument. While the two are technically equivalent, the reduced boilerplate makes the logic much easier to comprehend.

Changed the name. Also, yes, this is why half of the proposals are to remove a bracket or something similar; it's very important.
