Evolution of procedural macros


#1

I have been using procedural macros for more than a year now and will keep using them for other projects. As far as I know there is not much information on procedural macros, and even less on how they will evolve. I know that even Rust core developers have little idea of the future, but since I rely on this feature for my research work, I would like to discuss what we should expect of it.

(1) If I understand it correctly, a procedural macro is expanded before any analysis or transformation of the surrounding code, is that right? This implies that code inside the procedural macro cannot query information about Rust code declared outside the macro.
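To illustrate (1): even a declarative macro only ever sees its input as a sequence of tokens, never as typed program fragments. The sketch below (the macro name `show_tokens` is made up for illustration) can echo an expression back verbatim, but it has no way to ask for the type of `queens` — which does not even have to exist for the code to compile, precisely because expansion happens before name resolution and type checking.

```rust
// A macro receives tokens, not typed program fragments.
macro_rules! show_tokens {
    ($e:expr) => {
        // The macro can reproduce its input as a string...
        stringify!($e)
    };
}

fn main() {
    // ...even though `queens` and `i` are not declared anywhere:
    // the names are never resolved, only the tokens are captured.
    let s = show_tokens!(queens[i] + i);
    assert_eq!(s, "queens[i] + i");
}
```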

Let me demonstrate that with a project I am working on: a parser generator. Currently, semantic actions are Rust functions declared inside the procedural macro, and I need the return type of these functions. Therefore, my first question is:

  • Given a function name, can I access its type signature even if it is declared outside the procedural macro?

Of course, I can generalize this question to any named item. If my statement in (1) is true, then this should not be easily possible.
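The distinction can be made concrete with a declarative macro: when the whole function item is part of the macro's own input, its return type is available syntactically via a `$ret:ty` capture; for a function declared outside the invocation, the macro would only see an opaque name. The sketch below is hypothetical (the names `declare_action`, `double`, and `return_type_of` are made up), not the actual parser generator.

```rust
// The return type is accessible only because the entire function item
// is written inside the macro invocation and captured as `$ret:ty`.
macro_rules! declare_action {
    (fn $name:ident($($arg:ident : $argty:ty),*) -> $ret:ty $body:block) => {
        fn $name($($arg: $argty),*) -> $ret $body

        // The macro can also emit the captured return type as a string,
        // something it could never do for an externally declared function.
        fn return_type_of() -> &'static str { stringify!($ret) }
    };
}

declare_action! {
    fn double(x: i32) -> i32 { x * 2 }
}

fn main() {
    assert_eq!(double(21), 42);
    assert_eq!(return_type_of(), "i32");
}
```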

Now, for another project, I have a different problem. I am trying to design a constraint modelling language as an EDSL within Rust. It would be doable if I defined an entirely new language completely independent of Rust; however, what I want is to reuse Rust constructs such as loops or alternatives inside this language so I do not have to re-implement them. Consider the following example:

let mut space = FDSpace::default();
let mut queens = vec![];
// ...
tell! space {
  queens[i] + i != queens[j] + j
};

Right now, from inside the `tell` macro there is no way to choose the correct constraint, since we do not know the types of the variables. Actually, this is not too bad, because I compile these constraints into generic structures, and picking the right implementation is left to Rust later. However, it is a big problem for optimizations: we could imagine the macro rewriting this constraint into a more efficient version if we were sure we were manipulating integers. Of course, this could probably be done at the Rust level through trait overloading and trait specialization. However, coming from the C++ world, I know very well that using genericity for meta-programming leads to code that is difficult to write, read, and maintain… This is why I believe that syntax extensions are a smart replacement for “hard” meta-programming. My second question is:

  • In addition to the syntax tree, could we imagine receiving a list of the currently accessible (local) variables along with their types?
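The “generic structures” approach mentioned above can be sketched as follows, under assumed names (`NotEqual`, `Propagate` are hypothetical): the macro emits a type-parametric constraint node, and trait resolution — not the macro — picks the implementation once the variable types are known. An integer-specialized, more efficient implementation alongside the generic one would need trait specialization, which is the part the post finds unsatisfying.

```rust
// A constraint like `queens[i] + i != queens[j] + j` could be compiled
// by the macro into this generic node, with T still unknown.
struct NotEqual<T> { left: T, right: T }

trait Propagate {
    fn satisfied(&self) -> bool;
}

// Generic implementation: Rust selects it at monomorphization time,
// after inference has fixed T. An integer-only optimized version
// would additionally require (unstable) specialization.
impl<T: PartialEq> Propagate for NotEqual<T> {
    fn satisfied(&self) -> bool { self.left != self.right }
}

fn main() {
    // Only here does Rust learn that T = i32; the macro never knew.
    let c = NotEqual { left: 3, right: 4 };
    assert!(c.satisfied());
}
```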

It is possible that types have not been inferred yet when the macro is expanded. There are several possible strategies for this, but I think the simplest is to give the type `_` to the variable (as is done now) and let the macro cope with it. After macro expansion, Rust's inference will give the correct type to that variable. This would require an “expansion on demand” mechanism; I think expansion is currently done before (nearly) everything else.
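The “type `_`, inference fills it in later” strategy is already how today's macros behave from the expansion's point of view — a minimal sketch (the macro name `bind` is hypothetical):

```rust
macro_rules! bind {
    ($v:ident = $e:expr) => {
        // No type annotation: as far as the macro is concerned,
        // the binding has type `_`.
        let $v = $e;
    };
}

fn main() {
    bind!(x = 1 + 2);
    // Inference decided `x: i32` only after the macro was expanded;
    // the macro itself never saw or chose a type.
    assert_eq!(x, 3);
}
```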

Maybe what I described is a kind of procedural macro working on MIR instead of HIR?

If I missed RFCs or interesting documentation/discussion about procedural macros, please let me know.


#2

I think what you want is a customised compiler, which is a use case we want to support, but not with procedural macros. You seem to want two things that are outside the scope of procedural macros:

  • access to the whole program - macros are inherently local and context-free. It is possible to imagine loosening this restriction, but IMO the costs (in complexity of the system) outweigh the benefits.

  • semantic information - i.e., the results of name resolution and type analysis. Macros are syntactic and happen before these phases of the compiler. We’ve thought about integrating name resolution and macro expansion to some extent, but doing all of name resolution would be complex. Likewise, I think it would be impossible to do type analysis at the same time as macro expansion.

So, on both counts it seems that procedural macros aren’t a good fit for what you want to do, and they probably won’t be in the future. However, we do support integration with the compiler at a deep level and this is probably what you want (i.e., to create a custom compiler). This tutorial explains how to do this - it is aimed at creating a tool, but creating a full compiler would use the same techniques. You get full (and easy) access to type and name info via the save-analysis API, see src/librustc_trans/save/mod.rs. This stuff is all work in progress and there are no concrete plans at the moment. But your use cases are exactly what these things should help with, so if it doesn’t fulfil your needs, then let me know and we’ll try to figure something out.


#3

Many thanks for this answer and for your work on syntax extensions and related features.

I understand better now: macros will stay context-free, and you propose a mechanism to “patch” the Rust compiler for context-dependent features. I read the tutorial; as I understand it, I can extend the Rust compiler with my own features. My question is:

  • How does it compose? I might want to use a parser generator together with the EDSL constraint modelling language in the same crate.

Also, I have the feeling that syntax extension is a bit harder this way than with procedural macros. Say I add a new construct to the language: I can only handle it before parsing (otherwise a parse error will occur). However, that means I also need to parse regular Rust code. Maybe we could overcome this problem by allowing a procedural macro to return arbitrary string data along with the Rust AST, so it can be processed later with the custom compiler API. I do not really know, but I think what you propose is better suited for context-dependent AST transformations than for syntax extension.