Procedural macro access to dependency items

I understand that procedural macros cannot retrieve information about types used in the macro input because they are run halfway during parsing, before typeck is performed.

But what about items declared in dependencies? When a procedural macro runs, the compiler can already resolve paths referencing items from dependency crates, the same way procedural macros themselves are referenced. And since the metadata of the dependency crates is already known (even in pipelined compilation) before a crate is compiled, it can indeed be reliably exposed to the procedural macro.

The macro can pass the path in its parsed input to some proc_macro API, which could resolve the path into type information using the macro call context. In the case of nested macro calls, this behaviour should use the first macro call in the stack, in consistency with other type resolutions.
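To make the proposal concrete, here is a rough model of what such a lookup might look like. Nothing below exists in proc_macro: `TraitInfo`, `CallSite`, `ExpansionStack`, `resolve` and the paths `upstream::Shape` / `helper::Internal` are all made up for illustration, and I am assuming "first macro call in the stack" means the outermost one.

```rust
use std::collections::HashMap;

/// Hypothetical summary of a trait from a dependency crate.
#[derive(Debug, Clone, PartialEq)]
struct TraitInfo {
    methods: Vec<String>,
}

/// Hypothetical model of one macro call site: the dependency paths that
/// name resolution can already see there, keyed by their textual form.
struct CallSite {
    visible: HashMap<String, TraitInfo>,
}

/// Hypothetical expansion stack; index 0 is the outermost macro call.
struct ExpansionStack {
    calls: Vec<CallSite>,
}

impl ExpansionStack {
    /// Resolve `path` in the scope of the outermost macro call only,
    /// ignoring whatever nested calls happen to have in scope.
    fn resolve(&self, path: &str) -> Option<&TraitInfo> {
        self.calls.first()?.visible.get(path)
    }
}

/// Build a two-level stack: an outer call that can see `upstream::Shape`
/// and a nested call that can see `helper::Internal`.
fn example_stack() -> ExpansionStack {
    let mut outer = HashMap::new();
    outer.insert(
        "upstream::Shape".to_string(),
        TraitInfo { methods: vec!["area".to_string()] },
    );
    let mut inner = HashMap::new();
    inner.insert(
        "helper::Internal".to_string(),
        TraitInfo { methods: vec![] },
    );
    ExpansionStack {
        calls: vec![CallSite { visible: outer }, CallSite { visible: inner }],
    }
}
```

With this model, `example_stack().resolve("upstream::Shape")` finds the trait, while `resolve("helper::Internal")` returns `None` because that path is only visible from the nested call.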

This is useful for crates like auto_enums, which need to derive the current item in conjunction with some upstream trait, etc.

Most of the metadata can't be parsed until TyCtxt is created. Creating TyCtxt requires the already expanded AST.

Isn't that information already available for dependency crates?

On top of other issues, you can't do name resolution until all macros are expanded, because macros can (and do) emit use directives and new items which impact name resolution.
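A small illustration of that point, using a declarative macro as a stand-in for a proc macro (the names `geometry`, `unit_area`, `doubled_area` and `bring_in_geometry` are made up). Neither `unit_area` nor `doubled_area` is nameable in the enclosing scope until the macro has been expanded:

```rust
mod geometry {
    pub fn unit_area() -> u32 {
        1
    }
}

// The expansion contributes both an import and a brand-new item to the
// scope where the macro is invoked.
macro_rules! bring_in_geometry {
    () => {
        use geometry::unit_area;

        fn doubled_area() -> u32 {
            unit_area() * 2
        }
    };
}

bring_in_geometry!();

// Both names below only resolve because of what the macro emitted.
fn area_report() -> (u32, u32) {
    (unit_area(), doubled_area())
}
```

Any path lookup performed before `bring_in_geometry!()` is expanded would see a scope without `unit_area` and `doubled_area`.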

What about resolving them in the same scope in which the macro itself is expanded? That would be fair since you need to resolve the path of the macro itself in the first place.

While they are available in serialized form, they can't be deserialized until TyCtxt is created, as types are stored in TyCtxt. Pretty much only the things necessary for name resolution, macro expansion and other things happening before TyCtxt is created can be deserialized without TyCtxt.

Is it very challenging for the compiler to pre-initialize TyCtxt at macro expansion time, even though no type information from the current crate needs to be populated?

I don't know how hard it would be, but I am pretty sure that doing so would take a non-negligible amount of time, as it would have to be recreated every time a new crate is discovered to be used due to macro expansion. TyCtxt is read-only.

Why does it need to be recreated? In the context of the desired functionality in this thread, the deserialized metadata is most likely going to be used at compile time after the macro is expanded, since the procedural macro would generate code related to those types (e.g. generate implementations of those traits).

The TyCtxt is by definition read-only. This is necessary to make it possible to cache queries. This means that if new crates are found as dependencies, the TyCtxt would have to be recreated to accommodate them.
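As a toy model of why caching and mutation don't mix (this is not how rustc is actually structured; `ToyCtxt`, `knows_crate` and `with_extra_crate` are invented for the example): a memoized answer is only trustworthy while the inputs stay fixed, so learning about a new crate means building a fresh context with an empty cache.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

/// Toy stand-in for a read-only context: the crate list is fixed at
/// construction, which is what makes the query cache safe to trust.
struct ToyCtxt {
    crates: Vec<String>,
    cache: RefCell<HashMap<String, bool>>,
    misses: Cell<usize>,
}

impl ToyCtxt {
    fn new(crates: &[&str]) -> Self {
        ToyCtxt {
            crates: crates.iter().map(|c| c.to_string()).collect(),
            cache: RefCell::new(HashMap::new()),
            misses: Cell::new(0),
        }
    }

    /// Toy query: is `name` a known crate? Computed once, then served
    /// from the cache on every later call.
    fn knows_crate(&self, name: &str) -> bool {
        let cached = self.cache.borrow().get(name).copied();
        if let Some(answer) = cached {
            return answer;
        }
        self.misses.set(self.misses.get() + 1);
        let answer = self.crates.iter().any(|c| c == name);
        self.cache.borrow_mut().insert(name.to_string(), answer);
        answer
    }

    /// The crate list cannot be changed in place without invalidating
    /// cached answers, so a new crate produces a new context whose
    /// cache starts out empty.
    fn with_extra_crate(&self, name: &str) -> ToyCtxt {
        let mut crates: Vec<&str> = self.crates.iter().map(String::as_str).collect();
        crates.push(name);
        ToyCtxt::new(&crates)
    }
}
```

If the original context had already cached "no" for `bar` and `bar` were then pushed into its crate list in place, the cached answer would be stale. Rebuilding avoids that, at the cost of redoing every query.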

Note that a proc-macro could expand to mod foo { extern crate bar; } or use bar; (which would add bar as a dependency if it is a crate).
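For a version of this that compiles without any external crate, here is a sketch where `alloc` plays the role of `bar` and a declarative macro plays the role of the proc macro (`pull_in_alloc`, `late_dep` and `boxed_answer` are made-up names). The crate cannot name `alloc` until an `extern crate` item declares it, and here that item only appears after expansion:

```rust
// Stand-in for a proc macro whose output introduces a crate that the
// surrounding code never mentioned before expansion.
macro_rules! pull_in_alloc {
    () => {
        mod late_dep {
            // `alloc` ships with the toolchain, but this crate cannot name
            // it until an `extern crate` item like this one declares it.
            pub extern crate alloc;
        }
    };
}

pull_in_alloc!();

// This path only resolves once the macro above has been expanded.
fn boxed_answer() -> late_dep::alloc::boxed::Box<u32> {
    late_dep::alloc::boxed::Box::new(42)
}
```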

I think a lot of this talk of compiler implementation details is missing the more fundamental concerns. We haven't even established that we want a feature like this in the language, however easy or hard it might be to implement.

At a language design level, the issue is that if procedural macros have access to type information, that creates a new kind of circular dependency where some types depend on macros depending on other types. The language already has some "design cycles" of this sort, to my knowledge 1) name resolution and macros, see RFC 1560 and 2) compile-time function evaluation (CTFE) in general. Adding another is not out of the question, but it is a very high cost with huge, non-trivial design questions that aren't being discussed here at all (as is the norm with requests for "types in proc macros").

In particular, both of the features I just mentioned involved an awful lot of design headaches over what to do with order dependencies and detecting infinite cycles and the like. The simplest example is probably RFC 1560's "Avoid 'time-travel' ambiguities". The possibility of similar ambiguities at the type level is... well honestly my type fu isn't good enough to judge it, but it feels potentially terrifying, and I'd need an awfully thorough technical argument to convince me we'd managed to make it harmless, like the one in that RFC about iterating to a fixed point of stable resolutions.

And then we have the motivation question. Because CTFE is coming, and CTFE obviously has "access to types" in some of the relevant senses, it's not clear to me that there's anything "types in proc macros" would unlock that macros plus full-blown CTFE wouldn't also be capable of. Or at least, I've yet to see anyone post a concrete use case with an argument that CTFE will never be good enough for it. Note that I'm assuming "full-blown CTFE" would eventually include some compile-time reflection APIs. I think that's more than enough for the auto_enums case cited in the OP, but I'm not familiar with that crate so if I'm wrong please do explain what's missing.

