Macro expansion points in attributes

This is an accompanying text to PR 67121.

Motivating examples.

People want things looking roughly like this

#[doc = include_str!("my_doc.md")]
struct S;

#[path = concat!(env!("OUT_DIR"), "/generated.rs")]
mod m;

to work.

General description of the problem.

To make this work we need to somehow expand macros in arguments of attributes.
To expand them we need to

  • first identify macro calls in the arguments, because the arguments are an arbitrary token stream in general, then
  • expand the identified macro calls.

Note that the attributes in question are not macro attributes, they are inert, so they don't perform expansion themselves.

Macro attributes could process their input themselves, in ways specific to each individual macro, either through some kind of eager expansion interface for proc macros or with the full power of compiler APIs, as some built-in macros currently do. (So let's forget about macro attributes for now.)

Inert attributes cannot do that in general.
For built-in attributes we could in theory hard-code this kind of expansion individually, but that's somewhat horrifying from the implementation point of view.
It would be good to have some more general mechanism.

Current situation.

There's currently a way to perform macro expansion in attribute arguments, but it's using an outrageous hack.
Expansion infrastructure currently identifies interpolated expr tokens (from macro_rules) with a macro call inside them as expansion points.

So code like this works:

macro_rules! define_s_with_doc {
    ($expr: expr) => {
        #[doc = $expr]
        struct S;
    }
}

define_s_with_doc!(concat!("a", "b"));

The history behind this feature is pretty usual - "let's support this specific use case without thinking about the larger picture".
I would like to remove this hack from the compiler if at all possible, but its most common use with key-value attributes must continue working.

Possible solutions: general solution.

The general solution would be to define some "signaling sequence" of tokens that would have a special meaning in all token streams (attribute arguments are token streams too, as you have probably noticed) and would signal the start of a macro call to the macro expansion infrastructure. The expansion infrastructure would then expand the identified macros and put the resulting tokens into their parent token stream.

E.g. if the signaling sequence is $$ (for illustrative purposes only), then these examples would work:

#[doc = $$include_str!("my_doc.md")]
struct S;

#[allow($$lint_list!())] // Non key-value attributes are supported
struct Z;

#[clippy::foo(a, b(c($$bar!()), d))] // The macro expansion can be identified arbitrarily deeply inside the arguments

(Perhaps some escaping would also be added to pass the signaling sequences without attempting to identify them as macro calls.)
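To make the identify-then-expand idea concrete, here is a minimal sketch over a toy token model. Everything in it is an assumption made for illustration: the `Tok` enum, `expand_macro`, and `expand_signaled` are hypothetical names, the only "macro" known is a toy `concat`, and none of this reflects the compiler's real token representation or expansion infrastructure.

```rust
/// A toy token tree (hypothetical; not the compiler's real representation).
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    Punct(char),
    Lit(String),
    Group(Vec<Tok>), // a delimited group such as `( ... )`
}

/// Hypothetical stand-in for the real expander: it only knows a toy `concat`
/// that joins the string literals among its arguments.
fn expand_macro(name: &str, args: &[Tok]) -> Vec<Tok> {
    match name {
        "concat" => {
            let joined: String = args
                .iter()
                .filter_map(|t| match t {
                    Tok::Lit(s) => Some(s.as_str()),
                    _ => None, // separators such as `,` are skipped
                })
                .collect();
            vec![Tok::Lit(joined)]
        }
        // Unknown macros expand to nothing in this toy model.
        _ => Vec::new(),
    }
}

/// Finds `$$ name ! ( args )` anywhere in the stream, at any nesting depth,
/// and splices the expansion into the parent token stream.
fn expand_signaled(tokens: &[Tok]) -> Vec<Tok> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        if let [Tok::Punct('$'), Tok::Punct('$'), Tok::Ident(name), Tok::Punct('!'), Tok::Group(args), ..] =
            &tokens[i..]
        {
            // Expand signaled calls inside the arguments first.
            let args = expand_signaled(args);
            out.extend(expand_macro(name, &args));
            i += 5;
            continue;
        }
        match &tokens[i] {
            // Recurse into groups so deeply nested calls are found too.
            Tok::Group(inner) => out.push(Tok::Group(expand_signaled(inner))),
            other => out.push(other.clone()),
        }
        i += 1;
    }
    out
}
```

In this model a macro call without the signaling sequence is passed through untouched, which is also why the approach cannot cover existing #[doc = $expr] code.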

Actually, this approach can be used for proc macro attributes as well - if your proc macro doesn't need some highly custom treatment of the input tokens it can just use $$ for delegating "eager expansion" in its arguments to the compiler.
If it does need custom treatment, then some eager expansion API a la RFC 2320 is a possible solution.

Note that this solution does not provide compatibility with the #[doc = $expr] hack because existing code using it doesn't use any signaling sequences.

Possible solutions: less general solution.

Identify some specific position as a macro expansion point, for example the value position in key-value attributes #[KEY = VALUE].

This solution doesn't introduce any new syntax, so it does provide compatibility with the #[doc = $expr] hack, and also covers a common use case. It also doesn't prevent introduction of a more general solution with some new syntax in the future.

The catch is that it restricts what we can do with key-value attributes in the future.

Right now the VALUE in #[KEY = VALUE] is either a literal or an interpolated expression token $expr (which is a subset of the expression grammar), with a restriction that it must be a literal after expansion.

PR 67121 extends it to full expression grammar (while keeping the "literal after expansion" restriction).
This makes it impossible to extend the VALUE in some other way in the future, for example to accept types or arbitrary token streams after the =.
It may be acceptable though, since arbitrary token forms are still available with #[KEY(VALUE)], #[KEY[VALUE]] and #[KEY{VALUE}].
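From the user's side, the less general solution could look roughly like the sketch below. It assumes a compiler that accepts a macro call in the value position of a key-value attribute, which is the extension discussed here; `S` and `DOC_TEXT` are illustrative names only.

```rust
// The value is an expression (a macro call) at parse time,
// but it must be a literal after expansion.
#[doc = concat!("Documentation ", "assembled at compile time.")]
pub struct S;

// The same expression in ordinary expression position, to show the literal
// that the attribute value is expected to become after expansion.
pub const DOC_TEXT: &str = concat!("Documentation ", "assembled at compile time.");
```

No new syntax is involved, so an interpolated $expr in the same position keeps its current meaning.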

Conclusions

So, what do you think about all this stuff?

Personally, I think accepting an expression in value position of #[key = value] attributes is about the best solution we have available.

Ultimately, I don't think this matters too much (to non-std macro authors) in practice, as most macros seem to prefer using #[mac(..)] "pseudo namespacing" for inert helper attributes in derives anyway. (Do we even allow key-value form for proc macros yet? I'm not sure, and I know I've not seen one.)

Though I'd at least like to record another option: make e.g. #[doc] not an inert attribute, and use whatever (unstable/internal) APIs for eager expansion, re-emitting the annotated item with a new inert #[__doc_internal] attribute.

#[macro_attr = "value"] is not currently allowed.
(Mostly because the proc macro author cannot currently discern between macro_attr = "value" and macro_attr("value"), not due to some fundamental reasons.)

That's kind of similar to "hardcoding expansion points for built-in attributes" mentioned above.
The downside is that we'd need a new delegating macro attribute for every built-in attribute we'd want to support; it would be nicer to have some more general mechanism dealing with all of them at once.

With a less general solution, more IDE features might be supported inside attributes. If the RHS is an expression, the IDE can provide completions, assists, syntax checks, etc. for it. If it's something arbitrary, the IDE can only check that parentheses are balanced. That is the reason why I would have slightly preferred to stick with the old "meta item" grammar of attributes and avoid arbitrary token trees.

But that is probably not an important consideration, as having IDE features inside attributes themselves is not that helpful.

One possible tweak to make this more conservative is to accept LITERAL | MACRO_CALL during parsing and LITERAL after expansion.
Then expressions become an implementation detail.

(This variant still restricts what we can do with key-value attributes in the future though.)

A general alternative that doesn't modify syntax, cf. cfg_attr (but IMO uglier and less intuitive):

#![expand_attr(doc = include_str!("README.md"))]
#[expand_attr(path = concat!(...))]
mod s;

That said, given the prior art of #[doc(include(…))], perhaps we could also introduce a very specialized solution like

#[path = "generated.rs", in_dir_by_env("OUT_DIR")]

I didn't mention this in the post, only in the GitHub PR, but eliminating #[doc(include(…))] while it's still unstable is one of my primary reasons for working on this (https://github.com/rust-lang/rust/issues/44732#issuecomment-526560063).
