[Pre-RFC] Generation of item idents in macros


#1

One long-standing problem of the Rust macro system is the inability to generate item identifiers inside macro expansions (1 2 3 4 5). So far, macro reforms have failed to address this issue, but I think it is important enough that we should try again before v1.0 is finalized.

Here are a few solutions that I’ve seen proposed over the years (plus one that I came up with). I would like to see if we can reach a consensus about the best way to proceed. Other ideas are welcome too, of course!

  1. Add a token-pasting operator to MBEs, for example “+”. An MBE RHS like “... => ( foo_ + $x )” would then expand into the identifier “foo_bar” (with $x being “bar”).
    A significant drawback of this scheme, IMO, is that it wouldn’t be possible to abstract away identifier generation into a separate macro; all concatenation has to be inline.
    Also, this doesn’t cover any operations other than concatenation. What if somebody wants to write a syntax extension to uppercase idents?
  2. Change the Rust parser and AST to allow macros to appear in ident positions.
    There was an RFC and an implementation PR for that last year, but they were rejected on the grounds that this might interfere with a possible future “macro methods” syntax. Admittedly, this extension was broader than absolutely needed, because it would allow macros to be used in place of idents even outside the context of macro expansion, which could easily be confusing.
  3. Add a new standard syntax extension for eager expansion of macros. For example:
... => ( eager_expand!( concat_idents!(foo_, $x) as #x, 
                        concat_idents!(foo_, $y) as #y in
                        mod #x {
                             fn #y (...) {}
                        }
                      ) )
The idea here is to eagerly expand the macros on the left-hand side of "in" and substitute the results into the right-hand side, in place of the corresponding identifiers. The right-hand side, after substitution, is then the result of expanding eager_expand!.

The # prefix is to prevent the outer macro from messing with eager_expand’s internal identifiers. Another way to achieve the same would be to create an escape for $'s in Rust MBEs, say $$. In this case, these idents would be spelled as $$x and $$y.


#2

Looks like somebody already proposed a more restricted version of my idea:

  4. Create a macro invoke_with_concatted_idents!(other_macro, ident1 + _ident2, arg1, arg2, ...), which invokes other_macro!(ident1_ident2, arg1, arg2, ...).
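To make the callback shape of this idea concrete, here is a minimal sketch that runs on stable Rust. It only shows the dispatch half: the identifier is passed in already concatenated, because macro_rules! cannot paste tokens itself. The names invoke_with_ident! and make_const_fn! are hypothetical and not part of any proposal.

```rust
// Hypothetical callback-style macro: forwards an identifier plus extra
// arguments to another macro chosen by the caller. The proposed macro would
// build the identifier by concatenation; this sketch takes it ready-made.
macro_rules! invoke_with_ident {
    ($callback:ident, $name:ident $(, $arg:tt)*) => {
        $callback!($name $(, $arg)*);
    };
}

// Hypothetical callback: defines a function that returns a constant.
macro_rules! make_const_fn {
    ($name:ident, $value:tt) => {
        fn $name() -> u32 {
            $value
        }
    };
}

// Expands to make_const_fn!(foo_bar, 42), which defines fn foo_bar().
invoke_with_ident!(make_const_fn, foo_bar, 42);

fn main() {
    assert_eq!(foo_bar(), 42);
}
```

The appeal of this shape is that the receiving macro sees a plain ident, so nothing new is needed in the parser; the cost is that every use has to be routed through a callback.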


#3

A more conservative variation of option 1 would be to create a token-pasting syntax extension, so that the “token-paste” operator would only need to be defined in the context of this extension. For example:

macro_rules! make_func {
    ($a:ident) => {
        paste_tokens! { fn aaa_ ## $a ## _bbb () {} }
    }
}

make_func!(xxx); // expands to ‘fn aaa_xxx_bbb () {}’ (1.1)


#4

A couple more thoughts on option 3:

  • Binding expansions to identifiers is probably unnecessary.
  • Macro expanders need to know what kind of a fragment they are being expanded to.

Here’s a possible alternative syntax: #frag#macro!(...), where ‘frag’ may be one of: ident, expr, pat, items, methods, stmt.

Example:

macro_rules! make_func {
    ($a:ident) => {
        eager_expand! {
            fn #ident#concat_idents!(aaa_, $a, _bbb) () {}
        }
    }
}

(3.1)

#5

cc @nikomatsakis @sfackler

Is there any way to do this backwards compatibly later?


#6

The current plan is to deprecate macro_rules and make a new system, so we could always add this kind of facility to the new macro system and remain 100% backwards compatible. Since hygiene is so important and it is unclear how exactly hygiene would work with this kind of system, plus the obvious abundance of design work required, I would recommend postponing this until post-1.0 and thinking of it as part of macros 2.0.


#7

Oh, that’s a pity. I am trying to write a library for interop with Windows COM, and for each COM interface I need to generate a bunch of code artifacts based on a single user-provided identifier. The inability to create new idents in macros is maddening! I could have the user provide all idents, but that’d be awfully un-ergonomic…
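For illustration, the "user provides all idents" fallback looks roughly like the sketch below. The macro declare_interface! and the generated items are hypothetical stand-ins, not real COM bindings; the point is only that every derived name has to be spelled out at the call site.

```rust
// Hypothetical macro: each generated item needs its own name, and since
// macro_rules! cannot derive IFooVtbl or new_ifoo from IFoo, the caller
// must pass all three identifiers explicitly.
macro_rules! declare_interface {
    ($iface:ident, $vtbl:ident, $ctor:ident) => {
        struct $vtbl {
            method_count: usize,
        }

        struct $iface {
            vtbl: $vtbl,
        }

        fn $ctor() -> $iface {
            $iface {
                vtbl: $vtbl { method_count: 3 },
            }
        }
    };
}

// Ideally this would be just declare_interface!(IFoo).
declare_interface!(IFoo, IFooVtbl, new_ifoo);

fn main() {
    let obj = new_ifoo();
    assert_eq!(obj.vtbl.method_count, 3);
}
```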

re hygiene: I think that for items (i.e. functions, structs and modules), hygiene will be unwanted more often than the other way round. After all, what use is a generated item if you cannot refer to it from outside?
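A small sketch of how macro_rules! behaves today, as far as I can tell: items defined by a macro are visible to the surrounding code, while local variables introduced by a macro are kept separate from the caller's. The names make_item!, generated_helper and add_one! are made up for the example.

```rust
// An item defined inside a macro is not hygienic: the function name is
// visible to code outside the expansion.
macro_rules! make_item {
    () => {
        fn generated_helper() -> u32 {
            7
        }
    };
}

make_item!();

// A local variable introduced by a macro is hygienic: the macro's `x`
// does not capture or shadow the caller's `x`.
macro_rules! add_one {
    ($e:expr) => {{
        let x = 1; // macro-local, distinct from any `x` at the call site
        $e + x
    }};
}

fn main() {
    let x = 10;
    // The generated item can be called from outside the macro.
    assert_eq!(generated_helper(), 7);
    // `$e` is the caller's x (10); the macro adds its own x (1).
    assert_eq!(add_one!(x), 11);
}
```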


#8

I had the same problem with my COM bindings, which was why I switched to using a procedural macro for the generation. Made the parsing part much more flexible, too.

But, with procedural macros being unusable in stable Rust, I’d probably just switch to a separate code generator program. Before I ran out of time, I was looking at just generating Rust interfaces from either IDL or TLB files. Sadly, I got distracted by COM requiring so much of Win32, so then I tried to modify rust-bindgen to work properly with windows.h… and I never quite got to the bottom of the rabbit-hole. :stuck_out_tongue:


#9

Hygiene doesn’t necessarily mean that you can’t refer to things created in the macro from outside. I’m afraid I can’t give you a good definition of what hygiene is right now. I read a lot about it and understood it a few months ago, but since we fixed up macros and put off the ‘proper’ fix till post-1.0, I’ve paged it out a bit.

Anyway, I think even items must be hygienic, but that the rules of hygiene will be different due to the different rules around naming and name resolution. I believe that there can be a hygienic solution to creating idents; you just need some way for the ‘caller’ of the macro to signal to the macro that it is ‘aware’ of the names being created. Sorry that is a bit hand-wavy, I really don’t have a precise solution, only an intuition.