[Pre-RFC] Macro-rules FIRST sets future-proofing


#1

Hello everyone,

This is a follow up to my previous post about future proofing in which I present the results of my work so far. I’m reaching a point where I think I cannot improve much the existing analysis so I started writing this pre-RFC about it, in order to have the comments of the other people of the core team about the state of the thing, the things that we could address, and how we could land it.

It’s far from being ready I think, a lot of corner cases are still to be precised…

  • Feature Name: first_sets_future_proofing
  • Start Date: 2016-08-09
  • RFC PR:
  • Rust Issue:

Summary

Extend the restriction on macro_rules! macros using the concept of FIRST sets to be able to check more complex macros (that is, multiple arm macros or macros using sequence repetitions).

Motivation

macro_rules! are not future-proof, which means there exist valid programs where macro invocations could change behaviour if some grammar productions were added to the language.

An existing solution based has been described and implemented. It aims at avoiding such macros by restricting what macros can and can’t rely upon on one side, and what kind of grammar productions we can add on the other side.

It relies on the concept of FOLLOW sets. It restricts macros matchers by specifying what tokens can occur after a non-terminal matcher in a macro pattern.

There are complex problems with this solution that make it insufficient because it fails to check macros that feature two “parallel” parsing possibilites, that is:

  • Multiple arm macros
  • Single-arm macros with sequence repetitions, where an input could match either the matchers inside the sequence or what comes after. This is currently blocking to be able to “backtrack“ when parsing macro-rules.

The problem with multiple arm macros have been known for a long time. An example of an issue with sequence repetitions can be found here.

Detailed design

This new solution is intended to be added as a complement to the current solution based on FOLLOW sets, not replace it.

It uses FIRST sets, that is, for each non-terminal, the set of tokens that are allowed to begin a sequence that matches this non-terminal, or will possibly be in the future (this is the same, by computing the difference with the set of all Rust tokens, as specifying a set of tokens that will never be allowed to start a sequence that matches the non-terminal).

Using this, we can look for “obvious disambiguations”, that is, places in the macro matchers where the langages accepted by each (or that may me accepted in the future) are mutually exclusive (their intersection is empty).

[Insert a description of the algorithm here. For now, you can look here. The current version has more special cases builtin and a few bugs fixed but is still quite close.]

Since this will generate a lot of false positives, and that even legitimate errors will probably be hard to understand and to fix, we will first land it as a warning. We could then periodically use crater to gather breakage statistics to see how fast people are adapting and decide when to turn it into an error (if we ever do, the other option being to wait for macro_rules! to be replaced by something else).

Drawbacks

  • This analysis is potentially slow. It’s quadratic with the number of arms in a multiple-arms macros since it must compare them two-by-two. When encountering a sequence repetition, it becomes exponential in the number of tokens in the other matcher (until the end of the matcher).

  • It’s very conservative, prefectly valid macros will be rejected. The first results tell that approximately a third of the regressions are false positives.

Alternatives

  • Not do this and either:

    • Keep the status quo, that is, keep macro-rules not future proof.
    • Search another solution, based on the knowledge of grammar productions by the algorithm (see thoughts here).
  • Only add this after investigating the FOLLOW set trick (see below).

  • Another interesting possibility to reduce the number of regressions is to look to the $(foo:tt)* pattern. Similar to the _ pattern in match and let, this pattern effectively matches all the possible inputs. Therefore, when it’s used in the second arm, it works as a “catch-all” and seems to be used like that in most the cases when we can encouter it. For example, take the following macro (taken from the log crate):

    macro_rules! foo(
        (target: $e:expr, $(tt:tt)*) => ...
        ($(tt:tt)*) => ...
    

    If expr ever starting matching new input, then macro invocations containing the new form will indeed switch from the second arm to the first. So this macro is legitimately not future-proof and reported as such by the algorithm. But clearly, it’s the intention here that the second arm acts as a “catch-all” rule and that if new forms are added to expr, then they should indeed start matching the first arm.

    We could thus add a builtin knowledge of this pattern to the algorithm, to specially recognize it and accept a macro such as the above. The main problem with that is that it moves the responsibility of ensuring that the macro is future-proof from the compiler to the user who must know check that such behaviour is indeed its intention.

    This in turns brings a new unresolved question which is: where to recognize this pattern? Do we only recognize the pattern when a whole arm solely consists of a catch-all matcher, such as the foo macro above? Or can we also recognize it in the middle of an arm, provided that what comes before is accepted, for example:

    macro_rules! foo(
        (foo bar baz, target: $e:expr, $(tt:tt)*) => ...
        (foo bar baz, $(tt:tt)*) => ...
    )
    

    This looks fine but probably has a few corner cases. Same question regarding single-TT catch-all, for example, can this:

    macro_rules! foo(
        (foo bar ( a lot of complex matchers ) baz) => ...
        (foo bar $(tt:tt) baz ) => ...
    )
    

    be accepted to, considering that there is a ”local catch-all” of what’s inside the parenthesized delimited-TT matcher in the first arm (and provided that what comes before and after is unambiguous). This has a few corner cases, then again when combined with sequence repetitions, but I think they’re easy to address. This too needs a bit of thinking. It can still be added to the analysis afterwards, though.

    Another option is to not specially-recognize $tt but introduce a new matcher explicitly designed for this goal, such as $all. This is exactly the same, but cleaner maybe, and this way we’ll be sure that the behaviour we will give it will be intended in every use case. We can’t be sure of that with $(tt). The problem, though, is that while it gives a simple easy-fix for existing use cases, it will still cause breakage and require user intervention.

  • Land it as an opt-in analysis pass that must be invoked explicitly some -option. This however reduces greatly the impact of such an analysis since most users probably won’t invoke it.

  • Land it as an error. This will cause a lot of breakage.

Unresolved questions

  • In order to turn this RFC into a true RFC, we must address the question of what we do with the FIRST sets. Currently, the FIRST sets are defined to be the exact sets of tokens that can start a given non-terminal, as computed from the grammar. So for any token that cannot start right now a non-terminal, it is considered that it will never be allowed to start this NT. It means we cannot restrict the FIRST sets, only relax them by saying “maybe we will reserve the possibility of expanding the language in the future by saying that this token will maybe allowed to start such NT”. This will add more regressions, though few I think. The T vs NT pattern doesn’t represent a huge share of the regressions. Of course we could also leave the things as they are, but this means that we cannot add new prefix operators in the future, for example. It’s not up to me to decide wether this is a restriction worth considering or not. That being said, it should be possible to restrict a bit the FIRST sets: currently, whenever a NT can start by an identifier, then it’s also considered as being allowed to start with any keyword. Maybe we could state that while expr (for example) can start with any arbitrary ident, it will never be allowed to start by some reserved keyword. However, the current Crater test results show that only half a dozen crates could benefit from this.

  • Investigate the trick imagined by @nmatsakis here to be able to recover after a “T vs NT” or “NT vs T” pattern, and, presumably, eliminate all remaining false positives. However, I still need to convince myself that there are no tricky cases (especially regarding sequence repetitions) that should be taken care of if we do not want this trick to introduce unsoundness in the algorithm.

    Also, this trick relies on specific parser semantics that should be implemented. I do not think such changes in the parser implementation would be breaking but then again, this needs a bit more investigation.

  • What to do in the future. I think @nrc wants to implement a new system of macros-by-example from scratch to eventually replace (and deprecate) macro_rules!. It’s still unclear wether this new system will actually suffer the same problems as macro_rules! regarding future-proofing. It might just use tt or other simpler systems instead of complexe NT-matchers. But if it was to use a similar matching system, then it would probably be better to directly embed this analysis in it right from the beginning, as an error, to avoid more future breakage. But then again, @nrc is probably the right person to ask.


#2

None of the links work. I think the markup is incompatible.

Are you trying to detect potential ambiguities the way LL parsers detect them? Why is the analysis quadratic? I can’t understand it without links to code.

This is currently blocking to be able to “backtrack“ when parsing macro-rules.

Are you saying that the ability to backtrack is blocked because it wouldn’t be future-proof?


#3

None of the links work. I think the markup is incompatible.

Oops sorry, I will fix that.

Are you trying to detect potential ambiguities the way LL parsers detect them?

Not really. It’s an overly conservative approximation and it’s not really the same goal, either: I don’t care if two arms match the same input, I want to be sure that there are no arms so that the second matches and the first doesn’t but should in the future (making the macro behaving differently). This is a complex problem, hence the very approximate solution.

Why is the analysis quadratic? I can’t understand it without links to code.

It’s quadratic in the number of arms because it can only compare arms one against the other, two by two, that is: 1 against 2, 1 against 3, … 1 against N, 2 against 3, … 2 against N, … N-1 against N. The number of runs of the pairwise comparison algorithm is simply N(N-1)/2, which is obviously O(n²). Fortunately, no macros have much more that 10 arms, and the huge majority of them have only 1 or 2 arms.

Are you saying that the ability to backtrack is blocked because it wouldn’t be future-proof?

Yes, totally. It would add a future-proofing issue to single-arms macros that are currently considered future-proof. See this post.


#4

In retrospect, I think we should not have had matchers that reference the rust grammar – instead, we should only have had tt matches and perhaps some of token-based regular expression (e.g., some way to say "suck up everything until the next , or unpaired )). For macros 2.0 I think we should consider something like that.

Of course, it’s not an option right now.

As for the pre-RFC itself, I think it makes a lot of sense, though I feel like I want to understand better the alternatives and how wide-spread the warnings would be. Do you have any statistics on the latter point?


#5

In retrospect, I think we should not have had matchers that reference the rust grammar – instead, we should only have had tt matches and perhaps some of token-based regular expression (e.g., some way to say "suck up everything until the next , or unpaired )). For macros 2.0 I think we should consider something like that.

This is also more or less my opinion on the topic.

As for the pre-RFC itself, I think it makes a lot of sense, though I feel like I want to understand better the alternatives and how wide-spread the warnings would be. Do you have any statistics on the latter point?

I only have statistics for the analysis as it’s implemented right now, that is: with the FIRST sets already restricted at their maximum, without the $(tt)* thing and without the FOLLOW set trick to recover in case of T vs NT errors. With this setup, crater reports ~275 root regressions (the messages are hard errors for now because it’s the only way to have crater report them but obviously they should be turned into warnings if we were to land this), of which ~200 are not actually root regressions, that is, they’re regressions because of a regressed dependency (no clue why they’re flagged as root, version issue maybe). On the actual remaining ~75 regressions, about 25 are false positives (that is, macros that should be fixed using the FOLLOW sets trick, for example, and other non-trivial examples which I have no idea how to make the analysis accept), half a dozen that could be accepted if we restricted the FIRST sets to reserve some keywords, and the rest of them (~35) are legit failures, that is: macros for which I can actually produce an example of changes to the Rust grammar that would make them behave differently.

Of course, there are potentially much more regressions that we cannot see because the dependencies of the packages are broken… If there is something to learn from these results, it’s that macros are used very pervasively in Rust code. Everyone uses them, for very various purposes. Much more than I expected.