The macro system is at something of a dead end at the moment: everyone seems to agree that there needs to be an overhaul at some point, but no one has suggested one yet.
This is my attempt to describe the macro system I’d like to see:
Summary
There are three parts to this RFC: the first deals with solving the problems with the import, export and namespacing of macros; the second deals with name resolution and hygiene within macros; the third adds several missing features whose absence makes the current system unnecessarily restrictive. In combination, they result in a cleaner, more structured macro system, which is also significantly more useful.
Motivation
A reform of the macro system has been needed for a long time, with prior RFCs stating the need for more comprehensive changes post 1.0. Specific motivations are listed below:
Lack of proper namespacing for macros
While the grammar does accept qualified names during parsing, they are rejected later on by the compiler, e.g. std::panic!(...).
As such, macros can only be imported at the top level.
Unhygienic context for macro definitions
Names used in a macro definition are resolved relative to the expansion site. This has resulted in the need for work-arounds such as $crate.
Current hygiene rules only apply to local variables
Items, lifetimes, and anything other than a local variable can be shadowed or otherwise conflict with similarly named items used within a macro. The current hygiene system isn’t living up to its expectations.
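A minimal sketch of the item case, using today's macro_rules!. The names define_helper, helper and call_helper are hypothetical and chosen only for illustration. The function defined inside the macro becomes an ordinary item at the expansion site, so code outside the macro can call it, and a caller that already had its own helper would get a duplicate-definition error.

```rust
// Hypothetical macro that defines an item with a fixed name.
macro_rules! define_helper {
    () => {
        // Items are not covered by hygiene: `helper` lands in the
        // namespace of whoever invokes the macro.
        fn helper() -> i32 {
            7
        }
    };
}

// Expands to `fn helper() -> i32 { 7 }` right here.
define_helper!();

// Code written outside the macro can see the generated item by name.
fn call_helper() -> i32 {
    helper()
}
```

Calling call_helper() reaches the macro-generated function, which is exactly the kind of leakage the hygiene changes in this RFC are meant to prevent.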
Insufficient specification of the macro system
It’s often difficult to determine whether problems with the macro system are problems with the specification or just bugs in the implementation.
Macro system is less useful than it should be
The macro concat_idents! is essentially unusable, as macros cannot appear in identifier positions. In combination with the lack of hygiene, this makes it impossible to generate new items which won’t conflict with any outside the macro.
Detailed design
The changes described in this RFC are intended to be implemented in a backwards compatible way by using macro! instead of macro_rules! to define macros under the new scheme. macro_rules! would be left unchanged.
Import, export and namespacing of macros
The aim is for macros to be namespaced in the same way as other items. To achieve this, the current name resolution algorithm is modified:
- Name resolution is run on the un-expanded AST, only gathering enough information to resolve macro names.
- The macro expansion pass is run. Importantly, the expansion of a macro is defined to not affect name resolution of macros outside the macro, i.e. a use statement within a macro cannot bring a new macro into scope outside it. This allows name resolution to remain mostly separate from macro expansion: each macro expansion runs a new pass of macro name resolution on the AST it generates, before checking if any inner macros need expanding.
- Name resolution is run on the fully expanded AST, resolving everything other than macros.
The attributes which are currently used to control the import and export of macros will continue to work as before, but will be deprecated in favour of either qualifying the macro name (path::macro!(...)) or importing the macro into the local scope (use path::macro!;). Glob imports will not import macros, in order to preserve backwards compatibility, although that doesn’t preclude the addition of new syntax for macro glob imports if that’s desirable, e.g. use path::*!;
No difficulty is expected in parsing qualified macros, since the grammar rules already allow for them.
Name resolution and hygiene
When expanding a macro, all named tokens (ie. identifiers and lifetimes) are tagged with the context in which they were written in the original source code. For example:
mod Foo {
    macro_rules! my_macro {
        ($name: ident) => (
            // The identifiers `myvar`, `println` and `$name` are tagged with the
            // current context (scoped to Foo), but are not yet resolved
            let myvar = $name;
            println!("{}", myvar);
        )
    }
}

fn main() {
    // The identifiers `bar`, `myvar`, `Foo` and `my_macro` are tagged with this context
    let myvar = 1;
    let bar = 42;
    Foo::my_macro!(bar);
}
During the expansion of Foo::my_macro!, bar is substituted for $name, and since it is tagged with the outer context, it is resolved to the local variable bar within fn main(). On the other hand, the definition of myvar within the macro is tagged with the macro definition’s context, and so does not resolve to the same myvar within fn main(), and further uses of myvar within fn main() will not be affected.
Whenever a declaration such as this occurs (let myvar), where the context of the identifier doesn’t match the context into which it’s being declared, the variable or item being declared is treated as completely fresh, and is anonymous (and thus inaccessible) outside the particular macro expansion to which it belongs. In this case, it means that fn main() will contain an anonymous local variable.
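For local variables, today's macro_rules! already behaves this way, so the effect can be observed without the proposed Foo::my_macro! path syntax. The following is a sketch that reuses the names from the example above; the function name caller_myvar_after_expansion is hypothetical.

```rust
macro_rules! my_macro {
    ($name:ident) => {
        // This `myvar` carries the macro definition's context, so it is a
        // fresh variable that the caller cannot name.
        let myvar = $name;
        println!("{}", myvar);
    };
}

fn caller_myvar_after_expansion() -> i32 {
    let myvar = 1;
    let bar = 42;
    // `bar` was written by the caller, so `$name` resolves to the local above.
    my_macro!(bar);
    // The caller's own `myvar` is untouched by the `let` inside the macro.
    myvar
}
```

The expansion prints the value of bar through the macro's own myvar, while the function still returns the caller's myvar.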
An implementation detail: while useful conceptually, it’s not required in practice to tag all identifiers with a full context. Instead, they can be tagged with the recursion depth to which they belong, and a stack of active contexts can be maintained separately. For all identifiers not created as a result of macro expansion, the correct value for this index is zero.
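As a rough illustration of the depth-tagging idea, the sketch below models a scope whose bindings are keyed by name and recursion depth together. It is a toy model under stated assumptions, not the compiler's data structures: the types Tagged and Scope, the helper tagged, and the use of plain integers as values are all hypothetical, and a real implementation would also need the separate stack of active contexts mentioned above.

```rust
use std::collections::HashMap;

/// Hypothetical tagged identifier: a name plus the macro recursion depth at
/// which it was written (0 means ordinary, non-macro source code).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Tagged {
    name: String,
    depth: u32,
}

fn tagged(name: &str, depth: u32) -> Tagged {
    Tagged { name: name.to_string(), depth }
}

/// Toy scope: bindings are keyed by the whole tagged identifier, so two
/// identifiers with the same name but different depths never collide.
#[derive(Default)]
struct Scope {
    bindings: HashMap<Tagged, i64>,
}

impl Scope {
    fn declare(&mut self, ident: Tagged, value: i64) {
        self.bindings.insert(ident, value);
    }

    fn resolve(&self, ident: &Tagged) -> Option<i64> {
        self.bindings.get(ident).copied()
    }
}

/// Replays the body of `fn main()` from the example above.
fn simulate_main() -> (Option<i64>, Option<i64>) {
    let mut scope = Scope::default();
    scope.declare(tagged("myvar", 0), 1); // let myvar = 1;
    scope.declare(tagged("bar", 0), 42); // let bar = 42;

    // Expansion of Foo::my_macro!(bar): `bar` keeps depth 0 because the
    // caller wrote it, while the macro's `myvar` is tagged with depth 1.
    let arg = scope.resolve(&tagged("bar", 0)).expect("bar is declared above");
    scope.declare(tagged("myvar", 1), arg);

    (scope.resolve(&tagged("myvar", 0)), scope.resolve(&tagged("myvar", 1)))
}
```

In this model the caller's myvar and the macro's myvar coexist in one scope, which corresponds to the anonymous local variable described earlier.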
New features
The addition of a lifetime fragment specifier
Current fragment specifiers are ident, block, stmt, expr, pat, ty, path, meta, tt and item. A new fragment specifier life will be added, which will allow macros to accept lifetimes as parameters. It has the same semantics as ident in that it matches a single token. The name life was chosen because lifetime is set to become a keyword as a result of the associated items RFC.
Macro invocation as substitution
Macros such as concat_idents! are essentially useless, because macros can only appear in expression, item or statement position. Rather than complicate parsing by allowing macros in additional positions, a new construct is added, $!(...), which may be used like this:
struct $!(concat_idents!(foo, bar)) { ... }
This construct expands the macro early, at the same time as macro substitution happens, to produce a token stream which is spliced into the RHS of the macro before it gets parsed into an AST and substituted into the expansion site. The number of brackets is probably excessive, but this was the simplest option.
Since the parser already handles $substitutions specially, also detecting $!(...) is not expected to cause any parsing ambiguities.
This construct is limited to being used within a macro: this prevents non-macro code becoming obfuscated by its use, allows the construct to be treated as a “normal” substitution, and means it can be implemented by the same syntax extension as macro!, avoiding further complexity within the compiler itself (other than an additional parsing rule).
Tag manipulation
It will be useful to have more explicit control over how named tokens are tagged. The following may be added to provide this functionality:
- with_tag!(token, tagged_token) - returns token tagged with the same context as tagged_token
- $caller - returns the path used to invoke the macro, tagged with the caller’s context
It should be noted that in general these are not required, as it’s usually bad practice to generate items visible in the caller’s scope, whose names were not passed in as parameters to the macro. However, there may be exceptions where the macro writer knows better.
Concatenation of identifiers
concat_idents! will tag its output with the outermost context of its inputs’ tags.
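One way to picture this rule, assuming the recursion-depth representation of tags described earlier and assuming that "outermost" corresponds to the smallest depth. Both are assumptions made for this sketch, and the type Tagged and the function concat_idents_tagged are hypothetical names.

```rust
/// Hypothetical tagged identifier: a name plus the macro recursion depth at
/// which it was written (0 means ordinary, non-macro source code).
#[derive(Clone, Debug, PartialEq)]
struct Tagged {
    name: String,
    depth: u32,
}

/// Joins the names of all inputs and tags the result with the smallest
/// depth among them. Returns None when there are no inputs.
fn concat_idents_tagged(parts: &[Tagged]) -> Option<Tagged> {
    let depth = parts.iter().map(|p| p.depth).min()?;
    let name: String = parts.iter().map(|p| p.name.as_str()).collect();
    Some(Tagged { name, depth })
}
```

Under this reading, concatenating an identifier written inside a macro with one passed in by the caller yields a name that is visible to the caller.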
Drawbacks
The largest drawbacks are the increased complexity of the macro system and the significant amount of work needed to implement it. There’s also the potential to negatively impact compiler performance as a result of name resolution being split into multiple passes.
Alternatives
Alternatives include taking only a subset of this RFC, or doing nothing and waiting for a different RFC to improve the macro system; there appears to be universal agreement that an overhaul is needed at some point.
All names and syntactic details are of course bikesheddable.