[Pre-RFC] Macro improvements


#1

The macro system is at something of a dead end at the moment: everyone seems to agree that there needs to be an overhaul at some point, but no one has suggested one yet.

This is my attempt to describe the macro system I’d like to see:

Summary

There are three parts to this RFC: the first deals with solving the problems with the import, export and namespacing of macros; the second deals with name resolution and hygiene within macros; the third adds several missing features whose absence makes the current system unnecessarily restrictive. In combination, they result in a cleaner, more structured macro system, which is also significantly more useful.

Motivation

A reform of the macro system has been needed for a long time, with prior RFCs stating the need for more comprehensive changes post 1.0. Specific motivations are listed below:

Lack of proper namespacing for macros

While the grammar does accept qualified names during parsing (eg. std::panic!(...)), they are rejected later on by the compiler. As such, macros can only be imported at the top level.

Unhygienic context for macro definitions

Names used in a macro definition are resolved relative to the expansion site. This has resulted in the need for work-arounds such as $crate.

Current hygiene rules only apply to local variables

Items, lifetimes, and anything other than a local variable can be shadowed or otherwise conflict with similarly named items used within a macro. The current hygiene system isn’t living up to its expectations.
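A minimal sketch of this limitation in today's macro_rules!, using hypothetical names (call_helper, helper, other): the item name written in the macro body is looked up where the macro is expanded, not where it is defined, so a same-named function at the expansion site silently takes over.

```rust
// `helper` in the macro body is an item name, so hygiene does not protect it.
macro_rules! call_helper {
    () => {
        helper()
    };
}

// The function the macro author had in mind.
fn helper() -> i32 {
    1
}

mod other {
    // An unrelated function that happens to share the name.
    fn helper() -> i32 {
        2
    }

    pub fn run() -> i32 {
        // Expands to `helper()`, which resolves to `other::helper`.
        call_helper!()
    }
}

fn run_at_definition_site() -> i32 {
    call_helper!()
}

fn main() {
    println!("definition site: {}", run_at_definition_site());
    println!("inside `other`:  {}", other::run());
}
```

The same macro gives different results depending on where it is invoked, which is the kind of conflict the proposed tagging scheme is meant to rule out.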

Insufficient specification of the macro system

It’s often difficult to determine whether problems with the macro system are problems with the specification or just bugs in the implementation.

Macro system is less useful than it should be

The macro concat_idents! is essentially unusable, as macros cannot appear in identifier positions. In combination with the lack of hygiene, this makes it impossible to generate new items which won’t conflict with any outside the macro.

Detailed design

The changes described in this RFC are intended to be implemented in a backwards compatible way by using macro! instead of macro_rules! to define macros under the new scheme. macro_rules! would be left unchanged.

Import, export and namespacing of macros

The aim is for macros to be namespaced in the same way as other items. To achieve this, the current name resolution algorithm is modified:

  1. Name resolution is run on the un-expanded AST, only gathering enough information to resolve macro names.
  2. The macro expansion pass is run. Importantly, the expansion of a macro is defined to not affect name resolution of macros outside the macro, ie. a use statement within a macro cannot bring a new macro into scope outside it. This allows name resolution to remain mostly separate from macro expansion: each macro expansion runs a new pass of macro name resolution on the AST it generates, before checking if any inner macros need expanding.
  3. Name resolution is run on the fully expanded AST, resolving everything other than macros.

The attributes which are currently used to control the import and export of macros will continue to work as before, but will be deprecated in favour of either qualifying the macro name, path::macro!(...), or importing the macro into the local scope, use path::macro!;. Glob imports will not import macros in order to preserve backwards compatibility, although that doesn’t preclude the addition of new syntax for macro glob imports, if that’s desirable, eg. use path::*!;.

It’s not expected for there to be any difficulty parsing qualified macros, since the grammar rules already allow for it.

Name resolution and hygiene

When expanding a macro, all named tokens (ie. identifiers and lifetimes) are tagged with the context in which they were written in the original source code. For example:

mod Foo {
    macro_rules! my_macro {
        ($name: ident) => (
            // The identifiers `myvar`, `println` and `$name` are tagged with the
            // current context (scoped to Foo), but are not yet resolved
            let myvar = $name;
            println!("{}", myvar);
        )
    }
}
fn main() {
    // The identifiers `bar`, `myvar`, `Foo` and `my_macro` are tagged with this context
    let myvar = 1;
    let bar = 42;
    Foo::my_macro!(bar);
}

During the expansion of Foo::my_macro!, bar is substituted for $name, and since it is tagged with the outer context, it is resolved to the local variable bar within fn main(). On the other hand, the definition of myvar within the macro is tagged with the macro definition’s context, and so does not resolve to the same myvar within fn main(), and further uses of myvar within fn main() will not be affected.

Whenever a declaration such as this occurs (let myvar), where the context of the identifier doesn’t match the context into which it’s being declared, the variable or item being declared is treated as completely fresh, and is anonymous (and thus inaccessible) outside the particular macro expansion to which it belongs. In this case, it means that fn main() will contain an anonymous local variable.
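For comparison, the local-variable part of this example already behaves this way under macro_rules! and can be tried today. The sketch below is only an approximation, not the proposed macro! system: it assumes Rust 2018 or later, the module is renamed to foo, the pub(crate) use line is a present-day idiom that makes the path-qualified call possible, and the wrapper function outer_myvar_after_expansion is a hypothetical name added for illustration.

```rust
mod foo {
    macro_rules! my_macro {
        ($name:ident) => {
            // `myvar` here belongs to the macro definition's context.
            let myvar = $name;
            println!("{}", myvar);
        };
    }
    // Present-day idiom: re-export the macro so it can be named by path.
    pub(crate) use my_macro;
}

fn outer_myvar_after_expansion() -> i32 {
    let myvar = 1;
    let bar = 42;
    // `bar` is written in this context, so it resolves to the local above.
    foo::my_macro!(bar);
    // The macro's `let myvar` did not shadow this one.
    myvar
}

fn main() {
    println!("outer myvar is still {}", outer_myvar_after_expansion());
}
```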

An implementation detail: while useful conceptually, it’s not required in practice to tag all identifiers with a full context. Instead, they can be tagged with the recursion depth to which they belong, and a stack of active contexts can be maintained separately. For all identifiers not created as a result of macro expansion, the correct value for this index is zero.
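One way to picture this bookkeeping is the toy model below. It is a sketch under assumptions, not compiler code: the types TaggedIdent and ContextStack and all of their methods are hypothetical, and contexts are represented as plain strings.

```rust
/// An identifier tagged only with the expansion depth at which it was written.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TaggedIdent {
    name: String,
    depth: usize,
}

/// Stack of active contexts; index 0 is ordinary (non-macro) source code.
struct ContextStack {
    contexts: Vec<String>,
}

impl ContextStack {
    fn new(root: &str) -> Self {
        ContextStack { contexts: vec![root.to_string()] }
    }

    /// Begin expanding a macro that was defined in `definition_context`.
    fn enter(&mut self, definition_context: &str) {
        self.contexts.push(definition_context.to_string());
    }

    /// Finish the innermost expansion; the root context is never popped.
    fn leave(&mut self) {
        if self.contexts.len() > 1 {
            self.contexts.pop();
        }
    }

    /// Tag an identifier written by the innermost active macro body
    /// (or by plain source code when no expansion is active).
    fn tag(&self, name: &str) -> TaggedIdent {
        TaggedIdent { name: name.to_string(), depth: self.contexts.len() - 1 }
    }

    /// Recover the full context for a tag by indexing into the stack.
    fn context_of(&self, ident: &TaggedIdent) -> Option<&str> {
        self.contexts.get(ident.depth).map(|s| s.as_str())
    }
}

fn main() {
    let mut stack = ContextStack::new("main");
    let outer = stack.tag("myvar");
    stack.enter("Foo");
    let inner = stack.tag("myvar");
    println!("{} at depth {} -> {:?}", outer.name, outer.depth, stack.context_of(&outer));
    println!("{} at depth {} -> {:?}", inner.name, inner.depth, stack.context_of(&inner));
    println!("same binding? {}", outer == inner);
    stack.leave();
}
```

In this model two identifiers can only refer to the same binding if both the name and the depth match, so the two myvar tokens from the earlier example stay distinct.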

New features

The addition of a lifetime fragment specifier

Current fragment specifiers are ident, block, stmt, expr, pat, ty, path, meta, tt and item. A new fragment specifier life will be added, which will allow macros to accept lifetimes as parameters. It has the same semantics as ident, in that it matches a single token. The name life was chosen because lifetime is set to become a keyword as a result of the associated items RFC.
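As a rough illustration of what a lifetime parameter allows: present-day Rust provides a fragment specifier for this purpose, spelled lifetime rather than life, and the sketch below uses that spelling so that it compiles. The macro and type names (borrowed_wrapper, Wrapper) are hypothetical.

```rust
macro_rules! borrowed_wrapper {
    // `$lt` matches a single lifetime token such as `'a`.
    ($name:ident, $lt:lifetime) => {
        struct $name<$lt> {
            value: &$lt str,
        }

        impl<$lt> $name<$lt> {
            fn get(&self) -> &$lt str {
                self.value
            }
        }
    };
}

// Generates `struct Wrapper<'a> { value: &'a str }` plus its impl.
borrowed_wrapper!(Wrapper, 'a);

fn main() {
    let text = String::from("hello");
    let w = Wrapper { value: &text };
    println!("{}", w.get());
}
```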

Macro invocation as substitution

Macros such as concat_idents! are essentially useless, because macros can only appear in expression, item or statement position. Rather than complicate parsing by allowing macros in additional positions, a new construct is added, $!(...), which may be used like this:

struct $!(concat_idents!(foo, bar)) { ... }

This construct expands the macro early, at the same time as macro substitution happens, to produce a token stream which is spliced into the RHS of the macro before it gets parsed into an AST and substituted into the expansion site. The number of brackets is probably excessive, but this was the simplest option.

Since the parser already handles $substitutions specially, it is not expected to cause any ambiguities when parsing to also detect $!(...).

This construct is limited to being used within a macro: this prevents non-macro code becoming obfuscated by its use, allows the construct to be treated as a “normal” substitution, and means it can be implemented by the same syntax extension as macro!, avoiding further complexity within the compiler itself (other than an additional parsing rule).

Tag manipulation

It will be useful to have more explicit control over how named tokens are tagged. The following may be added to provide this functionality:

  • with_tag!(token, tagged_token) - returns token tagged with the same context as tagged_token
  • $caller - returns the path used to invoke the macro, tagged with the caller’s context

It should be noted that in general these are not required, as it’s usually bad practice to generate items visible in the caller’s scope, whose names were not passed in as parameters to the macro. However, there may be exceptions where the macro writer knows better.

Concatenation of identifiers

concat_idents! will tag its output with the outermost context of its inputs’ tags.

Drawbacks

The largest drawbacks are the increased complexity of the macro system and the significant amount of work needed to implement it. There’s also the potential to negatively impact compiler performance as a result of name resolution being split into multiple passes.

Alternatives

Alternatives include taking only a subset of this RFC, or doing nothing and waiting for a different RFC to improve the macro system; there appears to be universal agreement that an overhaul is needed at some point.

All names and syntactic details are of course bikesheddable.

Unresolved questions


#2

One of the reasons people hate macros in C++ so much is because they’re used in random places to create unintelligible DSLs. I’m personally okay with having macros in expression-land, it’s pretty clear what they can do there (modify passed-in variables, return/break, and stuff like that). But having macros in function names and whatnot might affect readability. Similarly with item-land it’s usually still readable, though DSLs do happen.

However, I do see the need for stuff like this. But most of the use for this will be inside macros, why not restrict $! to only work inside an existing macro_rules?


#3

However, I do see the need for stuff like this. But most of the use for this will be inside macros, why not restrict $! to only work inside an existing macro_rules?

I think I agree with you here: that would also make it less ‘special’ (its behaviour could be implemented by the macro! syntax extension as part of expansion, rather than being intrinsic to the compiler).

edit: I’ve updated the first post to add that.


#4

My first reaction upon reading this was that some of it sounds non-controversial, but other parts are probably impossible to add in a 100% backwards-compatible manner (I’m thinking in particular of the proposed fixing of “Unhygienic context for macro definitions”; some macros today may well be relying on this artifact, for better or worse).

Is the goal to actually propose changes to be made to macro_rules! itself? Or is it to develop the criteria that we would like for the next macro system, which may well choose not to use the name macro_rules in order to not be hindered by compatibility constraints?


A completely orthogonal question, about something that does not work and that I would like to see corrected: AFAICT, one cannot define a macro-defining macro via macro_rules! (and thus cannot do so in stable Rust today), or at least not a macro-defining macro where the defined-macro takes arguments.

(The essence of the problem is that there is no way to escape the dollar-signs, and so attempts to introduce bindings in the inner (generated) macro are treated as substitutions by the outer (generating) macro.)

Any interest in adding this feature to your list?


#5

Is the goal to actually propose changes to be made to macro_rules! itself?

At the moment, I’m not proposing any changes to the existing macro_rules! system: the two systems would exist in parallel, and that includes the hygiene for macro definitions. However, several of the features (import/export of macros and some of the extra features) could be added backwards compatibly to macro_rules! at some point.


#6

AFAICT, one cannot define a macro-defining macro via macro_rules! (and thus cannot do so in stable Rust today), or at least not a macro-defining macro where the defined-macro takes arguments.

Under this scheme it’s only possible in a limited way: because of the way name resolution works, macro expansion cannot result in a new macro definition visible outside that expansion, as otherwise you can’t separate the name resolution and macro expansion passes at all. You can define new macros, as long as they are only used within the macro which defines them. (Reminds me a bit of the Scoped API)

Is that sufficient for your use-case?

Also, with regard to escaping dollar-signs, @eddyb has said there’s no intrinsic problem with adding something like \$, or more generally \token for escaping.
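For reference, a commonly used workaround in today's macro_rules! is to pass a literal $ into the outer macro as a token and use that binding wherever the inner macro needs a dollar sign. A minimal sketch, with hypothetical names (make_adder, add_ten, add_ten_to):

```rust
macro_rules! make_adder {
    // `$d` is bound to a literal `$` token supplied by the caller.
    ($d:tt, $name:ident, $n:expr) => {
        macro_rules! $name {
            // After substitution this reads `($x:expr) => { $x + 10 }`.
            ($d x:expr) => {
                $d x + $n
            };
        }
    };
}

// Defines a macro `add_ten!` that takes one argument.
make_adder!($, add_ten, 10);

fn add_ten_to(v: i32) -> i32 {
    add_ten!(v)
}

fn main() {
    println!("{}", add_ten_to(5));
}
```

This works but is awkward to read, which is part of why a dedicated escape syntax is attractive.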