The future of syntax extensions and macros

I'm not sure where you want feedback on this, but assuming here is OK:

Actually, this would fall out naturally from having support for eager expansion and tt-producing macros. With this, you wouldn't need to actually support macros in ident position, and the ident-generating macros would be invalid outside of macro expansions by virtue of not expanding to grammar elements.

As a bonus, this would also make supporting certain constructs a hell of a lot easier, and hugely reduce the need for things like push-down accumulators.
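For readers who have not met the term, here is a minimal sketch of a push-down accumulator in today's `macro_rules!`. The macro name `reversed!` and the internal `@acc` marker are made up for illustration; the point is only that the partial result has to be carried along as an extra argument, because one macro cannot consume another macro's output.

```rust
// Push-down accumulation: each recursive step moves one input expression
// into a bracketed accumulator that is threaded through the recursion.
// `reversed!` and `@acc` are hypothetical names used only for this sketch.
macro_rules! reversed {
    // Move the head of the remaining input to the front of the accumulator.
    (@acc [$($acc:expr),*] $head:expr $(, $tail:expr)*) => {
        reversed!(@acc [$head $(, $acc)*] $($tail),*)
    };
    // Input exhausted: emit the accumulated expressions as an array.
    (@acc [$($acc:expr),*]) => {
        [$($acc),*]
    };
    // Entry point: start with an empty accumulator.
    ($($x:expr),* $(,)?) => {
        reversed!(@acc [] $($x),*)
    };
}

fn main() {
    let r = reversed!(1, 2, 3);
    assert_eq!(r, [3, 2, 1]);
}
```

With eager expansion, the intermediate `@acc` plumbing is the kind of boilerplate that could presumably go away.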

Is there a reason that the syntactic macro builder shouldn't be implemented as a procedural macro? It seems like the best practice would be to have an interface for syntax extensions which will become a defined and stable part of the language and implement the syntactic builder using that interface, making it a component that's more analogous to std than a language feature.

This would "prove out" the syntex interface, increase the severability of macros from the rest of Rust, and reduce the scope of the definition of Rust the language. The benefits of the first seem self-evident and the latter two are important to making Rust changeable and reimplementable (I assume that it would be in Rust's long term interest to have multiple compilers someday).

Would it be possible for format_args! to similarly be moved to a preluded syntax extension instead of a compiler builtin?

An overview of plans: http://www.ncameron.org/blog/macro-plans-overview/

My intention is to flesh these out in further blog posts and eventually an RFC. Feedback would be appreciated!

I'm not actually sure of the technical issues preventing implementing macro_rules as a procedural macro. I believe it should be theoretically possible, but the implementation is currently so built-in to the compiler that it is hard to tell.

I would like to be able to implement pattern based macros as procedural ones, but it would likely be a lot of work and require a unique macro form, which I am not sure we should support (a macro with an identifier before its parameters).

I agree it would be a good test of the power of the procedural macro system. However, I don't think it is useful to consider macros by example as a library, rather than a part of the language - they are widely used and it feels like it would lead to fragmentation.

I would hope that any compiler built-in macro could be moved to a procedural macro with enough effort. Whether that is a worthwhile use of that effort or not is debatable.

This is new, but not necessarily unique. A lot of DSLs (rspec's describe, context, and it features come to mind) take something analogous to an ident and a delimited token tree.

My thoughts are the opposite. In the long term, when there will be more implementations of Rust than just rustc, they will be able to share the same standard library. Anything else implemented as a pre-imported library they will also be able to share. So if macro_rules! or its successor are implemented as libraries, they can be shared by different Rust compilers, reducing portability issues. In general, anything that makes the core language smaller seems like it will help with fragmentation in the long term.

I would like a syntactic form for macro_rules macros which only matches a single pattern and is more lightweight than the current syntax.

What about macros without parameters? Usually you just want a const or a static, but maybe not. Could we have macro! foo => { ... }, invoked as foo! ?

This proposal sounds great, also!

So if macro_rules! or its successor are implemented as libraries, they can be shared by different Rust compilers, reducing portability issues.

I agree. Scheme's portable syntax-case is a good example of this.

I would like a syntactic form for macro_rules macros which only matches a single pattern and is more lightweight than the current syntax. The current syntax would still be used where there are multiple patterns. Something like,

macro! foo(...) => {  
    ...
}

Can you please go into more details on the proposed new macro meta-syntax? Will it still be possible to skip to the end of a macro without having seen its definition?

I will, but I haven't worked out the details yet :slight_smile: It should still be possible to skip macro definitions - you just need to balance the {} for that. The difference I see is just that you skip a little boilerplate:

macro! foo(...) {
    ...
}

desugars to

macro! foo {
    (...) => {
        ...
    }
}
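For comparison, here is what the long form looks like in today's `macro_rules!` for a concrete single-pattern macro. The name `square!` is just an example; the short `macro! square(...) { ... }` spelling is only the proposal above and does not compile today.

```rust
// Today's equivalent of the desugared form: one macro, one arm.
// The proposed shorthand would let the `(...) => { ... }` wrapper be dropped.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    // `$x` is captured as a single expression, so `1 + 2` stays grouped.
    assert_eq!(square!(3), 9);
    assert_eq!(square!(1 + 2), 9);
}
```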

At the moment macro syntax looks something like this:

delim_seq ::= '{' ... '}' | '(' ... ')' | '[' ... ']'
macro ::= ident '!' delim_seq
        | ident '!' ident delim_seq

Sounds like you want to add another option:

        | ident '!' ident delim_seq delim_seq

I'm just wondering if it will be possible to disambiguate the second and the third forms in all cases.

fn nope() -> some_ty_macro!() { ... }

Ta-da! Parsing ambiguity! :smiley:
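To make the shape of the problem concrete, here is a small sketch in current Rust, reusing the `some_ty_macro!` name from the line above with an assumed expansion to `u32`. It only shows that a macro invocation immediately followed by a brace block is already valid syntax, which is what any trailing-block macro form would have to be told apart from.

```rust
// A macro that expands to a type. The expansion to `u32` is an assumption
// made for this sketch.
macro_rules! some_ty_macro {
    () => {
        u32
    };
}

// Valid today: `some_ty_macro!()` is the return type, and the braces that
// follow are the function body, not a second delimited macro argument.
fn nope() -> some_ty_macro!() {
    7
}

fn main() {
    let n: u32 = nope();
    assert_eq!(n, 7);
}
```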

But that's not an issue if you introduce a macro with macro rather than macro!; i.e. it should be a first-class syntax. If anything, it should be macro foo! ..., so that the ! can be considered part of the macro's name.

That would prevent using a macro expanding to an ident as the name of the macro to define. Are macros that define other macros a valid use case? :wink:

Sure, why not? Might as well question whether functions that return functions are a valid use case. :stuck_out_tongue:

Also, it should have been some_ty_macro! foo() { ... } but close enough. To be clear, I'd love a syntax that allowed for trailing blocks, but I'm not sure how to do it unambiguously at this point.

This case at least is solved by not dropping the requirement for =>. Which gives us a grammar:

macro ::= ident '!' delim_seq
        | ident '!' ident delim_seq
        | ident '!' ident delim_seq '=>' delim_seq

This, though, would introduce an ambiguity for a macro in the pattern position of a match statement.

There may not be a way around this, and maybe it's for the best. Pattern-matching macros have a very different language from the rest of Rust, and it makes sense that that language be enclosed in its own scope.

I don't necessarily subscribe to the position that the syntax for declaring macros must be the syntax for using them, which I believe makes the problem easier (since macros can only be declared in item position). Though I haven't thought through the details of this.

If macro! is a procedural macro (something I still like a lot; to be clear td_ was my account on this forum), then macro! is an invocation of a macro. But even then, I think this would probably be solved by only allowing the ident '!' ident delim_seq delim_seq macros in the item position.

Something like this?

delim_seq  ::= '{' ... '}' | '(' ... ')' | '[' ... ']'
macro_expr ::= ident '!' delim_seq
             | ident '!' ident delim_seq
macro_item ::= macro_expr
             | ident '!' ident delim_seq delim_seq

Another blog post, on the hygiene algorithm: http://www.ncameron.org/blog/sets-of-scopes-macro-hygiene/

And another blog post, on syntax - http://www.ncameron.org/blog/macro-plans-syntax/

Thanks to everyone in this thread for this one - it helped shape my ideas here a lot

I don't understand what the advantage of not having a bang after macro is. Just that it's one character shorter? I get that implementing syntactic macros as procedural macros would not be a high priority, but I just don't see any argument in favor of foreclosing on it.

I guess probably people feel like it is syntactically more consistent, but in contrast: procedural macros will require an attribute to declare, a kind of syntax which can be defined using macros. It seems appropriate to me that a macro declaration looks like a macro, even if it isn't.

I'm not sure the ! matters much either way. The important thing is whether a macro declaration behaves like an item (function, struct, enum, etc.) or behaves like a macro use. If the latter then we have to be more restrictive about the syntax because it can appear in more contexts. Picking the former means we can have nicer syntax for the declaration.

I think macro makes sense if a decl is like an item (following from fn, struct, etc.) and macro! makes sense if a decl behaves like a macro use. It seems reasonable to disagree on that though.

The $m! syntax means expand the macro m! eagerly, ...

That already has a meaning: substitute the capture $m and, assuming it expands to an identifier, expand the result as a macro. Changing this would make callback macros impossible. Please don't.
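A minimal sketch of the callback pattern in question, in current `macro_rules!`. The macro names `call_with_pair!`, `add!` and `mul!` are invented for the example; the relevant part is that `$callback!` substitutes the captured identifier first and then invokes whatever macro it names.

```rust
// Callback macro: the caller passes the *name* of another macro, and
// `$callback!(...)` invokes it after substitution.
// All macro names here are hypothetical.
macro_rules! call_with_pair {
    ($callback:ident) => {
        $callback!(3, 4)
    };
}

macro_rules! add {
    ($a:expr, $b:expr) => {
        $a + $b
    };
}

macro_rules! mul {
    ($a:expr, $b:expr) => {
        $a * $b
    };
}

fn main() {
    assert_eq!(call_with_pair!(add), 7);
    assert_eq!(call_with_pair!(mul), 12);
}
```

If `$m!` were redefined to mean "expand `m!` eagerly", this substitution-then-invoke reading would no longer be available.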

I'm totally in favour of some kind of eager expansion syntax, mind you.

Regarding parsing of the new form, why not simply require either macro $name:ident {$($body:tt)*} or macro $name:ident ($($pat:tt)*) {$($body:tt)*} exactly. That is, don't allow arbitrary matchers, require braces or parens followed by braces.
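As a rough illustration, the two shapes can be written as ordinary `macro_rules!` matchers today. The sketch below does not define any macros; it only reports which of the two forms its input has. The name `decl_form!` and the returned labels are assumptions made for the example.

```rust
// Recognises exactly the two declaration shapes suggested above and
// nothing else. `decl_form!` and its string labels are hypothetical.
macro_rules! decl_form {
    // macro $name:ident { body }
    (macro $name:ident { $($body:tt)* }) => {
        "no-parameter form"
    };
    // macro $name:ident ( pattern ) { body }
    (macro $name:ident ( $($pat:tt)* ) { $($body:tt)* }) => {
        "single-pattern form"
    };
}

fn main() {
    assert_eq!(decl_form!(macro foo { 1 + 1 }), "no-parameter form");
    assert_eq!(decl_form!(macro foo (a, b) { a + b }), "single-pattern form");
    // Any other shape, e.g. `macro foo [a] { a }`, is rejected at compile time.
}
```

Because the first token tree after the name is either a brace group or a paren group, the two arms never overlap, which is the property the suggestion relies on.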

Also, is this the extent of the syntactic changes you have in mind? Because there are a few things more I'd want to see changed, given the chance.