Allow proc. macros to query limited contextual information

It might be advantageous to some procedural macros to have limited knowledge about their context. So far a procedural macro can only inspect the AST that it is allowed to operate on without any contextual information.

Limited contextual information, for example the module_path! at which the procedural macro has been invoked, could considerably expand what a procedural macro can achieve.

In one of my code bases, being able to query the call-site module_path! would solve an ambiguity problem that we currently have to work around with "hacks". On an abstract level, we try to find unique, serializable identifiers for calling trait methods when communicating between different endpoints.

Now imagine we had a TraitMethodInfo<const ID: u64> trait with which we can query some useful (reflection-like) information about a trait method. The problem is that we need to calculate this unique ID: u64 when the proc. macro is expanded.

pub trait TraitMethodInfo<const ID: u64> {
    /// Is true if the trait method has a `&mut self` receiver.
    const IS_MUT: bool;
    /// The number of inputs without the `self` receiver.
    const NUM_INPUTS: usize;
}

Now let us consider the following code:

mod foo1 {
    #[derive_trait_method_info]
    pub trait Foo {
        fn foo(&self);
    }
}
mod foo2 {
    #[derive_trait_method_info]
    pub trait Foo {
        fn foo(&self);
    }
}

Note: Using a workaround, we can indirectly implement traits on trait definitions. (At least in our own use case.)

In this example, both foo1::Foo and foo2::Foo have absolutely the same internal layout. So a procedural macro without this contextual information about their module path would not be able to properly disambiguate them and would end up generating the same non-unique identifiers ID for both foo1::Foo::foo and foo2::Foo::foo trait methods.
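To make the collision concrete, here is a minimal sketch. It assumes, purely for illustration, that the ID is computed by hashing a textual path with 64-bit FNV-1a; the hash choice, the `fnv1a_64` helper, and the constant names are hypothetical and not part of any existing macro.

```rust
/// 64-bit FNV-1a hash, written as a `const fn` so it can feed a const generic.
/// (Illustrative choice of hash; any stable hash would do.)
pub const fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    let mut i = 0;
    while i < bytes.len() {
        hash ^= bytes[i] as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        i += 1;
    }
    hash
}

// Without the module path, both trait methods hash the same input.
pub const FOO1_ID_NO_PATH: u64 = fnv1a_64(b"Foo::foo");
pub const FOO2_ID_NO_PATH: u64 = fnv1a_64(b"Foo::foo");

// With the module path available, the inputs (and therefore the IDs) differ.
pub const FOO1_ID: u64 = fnv1a_64(b"foo1::Foo::foo");
pub const FOO2_ID: u64 = fnv1a_64(b"foo2::Foo::foo");
```

The first pair of constants is identical by construction, which is exactly the ambiguity described above; only the second pair could serve as distinct `ID` values.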

This is a problem for us. I would happily welcome other ideas for potential solutions to our problem, or a discussion of whether a feature like this would be useful to other Rust projects. If it is deemed useful, I'd be willing to write an RFC.

We could solve this by adding another parameter to procedural macros that is similar to syn::Path in order to provide the proc. macro with this kind of limited context information.

If you read until this point, thank you a lot for your attention!

As one of the people most active in the #macros channel of the Rust community Discord (which, besides these forums, is the most "official" / centralised place where people discuss macros), I can attest that there is a real need for eager expansion of macros (or, at least, of some specific subset of them, such as include…!, env!, module_path!). One much-desired use case, for instance, is letting macros know whence they are called (module path and file path), of which the OP is one example, as well as being able to cleanly / properly load certain file contents.

To see why, try implementing println! yourself without magic compiler help, and you will notice that you can't reproduce the following behavior:

macro_rules! my_template {() => (
    "{greeting}, {name}!"
)}
println!(my_template!(), greeting = "Hello", name = "World");

// or:

macro_rules! em {( $fmt:expr $(,)? ) => (
    concat!("<em>", $fmt, "</em>")
)}
println!(em!("Hello, {}!"), "World");
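For reference, a small check of the built-in behavior itself: the standard formatting macros expand a macro call in format-string position before parsing the format string, which is the part a user-written macro cannot replicate. The sketch uses `format!` instead of `println!` so the result can be inspected; the function names `greet` and `emphasized` are just for illustration.

```rust
macro_rules! my_template {() => (
    "{greeting}, {name}!"
)}

macro_rules! em {( $fmt:expr $(,)? ) => (
    concat!("<em>", $fmt, "</em>")
)}

/// The format string comes from `my_template!()`, which is expanded
/// before the named arguments are matched against it.
pub fn greet() -> String {
    format!(my_template!(), greeting = "Hello", name = "World")
}

/// The format string comes from `em!(..)`, which itself expands to a
/// `concat!` call that is also resolved eagerly.
pub fn emphasized() -> String {
    format!(em!("Hello, {}!"), "World")
}
```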

As to a possible API for this, I created a PoC a few months ago:

Basically, by featuring a special callback / preprocessor pattern (much like paste! does), it is possible to let macro authors operate on "eagerly expanded macros" without requiring any special language support, other than that info being available and such callback macros having been written.

So, for instance, in the case of the OP, they could have the attribute expand to:

with_builtin!( let $module_path = module_path!() in {
    their_real_proc_macro!( ($module_path), $trait_def );
});

Or some other syntactic variations / ideas:

  • magic match! macro (to eagerly expand macros):

    match! module_path!() {
        ( $($module_path:tt)* ) => (
            their_real_proc_macro!( ( $($module_path)* ), $trait_def );
        );
    }
    
  • callback-style (CPS):

    with_module_path! {( $($module_path:tt)* ) => (
        their_real_proc_macro!( ( $($module_path)* ), $trait_def );
    )}
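
The callback idea can be tried out today with plain macro_rules!, as long as something supplies the tokens. The sketch below is a simplified variation that takes a named callback rather than an inline pattern. Its `with_module_path!` is only a stand-in that hard-codes the hypothetical path `my_crate::foo1`; a real version would need the compiler to provide the path. The `qualified_name!` callback and `foo_path` function are likewise made up for the example.

```rust
// Stand-in for a hypothetical compiler-provided macro: it hard-codes the
// module path and forwards it, as a parenthesized token group, to a callback.
macro_rules! with_module_path {
    ( $callback:ident ! ( $($args:tt)* ) ) => {
        $callback!( (my_crate::foo1), $($args)* )
    };
}

// Callback that receives the path tokens plus an item name and builds a
// fully qualified name as a string literal.
macro_rules! qualified_name {
    ( ( $($module_path:tt)* ), $item:ident ) => {
        concat!(stringify!($($module_path)*), "::", stringify!($item))
    };
}

/// Spaces are stripped because the exact spacing produced by `stringify!`
/// is not guaranteed.
pub fn foo_path() -> String {
    with_module_path!(qualified_name!(Foo)).replace(' ', "")
}
```

The point of the pattern is that the callback sees ordinary tokens, so it can be a macro_rules! macro or a proc-macro such as their_real_proc_macro!.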
    

And so on and so forth. The key idea is that this could be used for things such as the include! family of macros, as well as env!, etc.

  • This would solve many "impurity" problems of macros trying to access the environment or the filesystem and having their expansion be "incorrectly" cached: by using the built-in include…! macros from the language (which can be guarded not to access paths outside the CARGO_MANIFEST_DIR or something along those lines), we get to have a guarded and tracked access to these environmental elements.

    That being said, for the case of env! and include…! specifically, another (not incompatible!) approach seems to be underway: special proc-macro APIs, such as proc_macro::tracked_env (no equivalent for the fs, though). But given how unergonomic setting up proc-macros is and will keep being for the foreseeable future, I think that macro_rules!-accessible APIs ought not to be overlooked (although, in the same fashion that I've written ::with_builtin_macros, it is not hard to offer macro_rules!-targeted proc-macro helpers, such as paste!).


Regarding the OP: @Robbepop, since your objective is to hash "macro call location information" so as to generate unique ids, it so happens that your actual problem might be XY-ed if you embed the Span::call_site() among the hashed stuff. It's not pretty, but could get the job done while waiting for some of these APIs to be implemented (if they ever are!).

Thanks for this input! I never thought about this. It probably won't produce deterministic output across compilations, but it is worth a try.

I don't see any Hash implementation for Span, or a way to write one, though.

Yep, unfortunately. It seems like the only way to hack your way around it is probably to use its Debug output. But even if that works, it would not be a recommendable option.
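A rough sketch of what that hack could look like, written generically over any `Debug` value so it runs outside a proc-macro crate. Inside a proc-macro one would pass `Span::call_site()`; here a made-up `FakeSpan` stands in for it. The `Fnv1a` adapter and `id_from_debug` are hypothetical helpers, and since the Debug output of Span is not a stable format, the resulting IDs should not be expected to be stable either.

```rust
use std::fmt::{self, Debug, Write};

/// FNV-1a state that hashes text as it is written, so the Debug output
/// never has to be collected into a String first.
struct Fnv1a(u64);

impl Write for Fnv1a {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for byte in s.bytes() {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
        Ok(())
    }
}

/// Derives an ID from whatever `{:?}` prints for `value`.
pub fn id_from_debug<T: Debug>(value: &T) -> u64 {
    let mut hasher = Fnv1a(0xcbf2_9ce4_8422_2325); // FNV offset basis
    write!(hasher, "{:?}", value).expect("the hasher itself never returns an error");
    hasher.0
}

/// Made-up stand-in for a span; only here to make the sketch runnable.
#[derive(Debug)]
pub struct FakeSpan {
    pub lo: u32,
    pub hi: u32,
}
```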

Or this could be solved via the unstable SourceFile API in proc_macro2.

Span and SourceFile both seem like they would have issues when your proc-macro invocation is itself generated by another macro, e.g. something like

macro_rules! mkfoo {
  () => {
    #[derive_trait_method_info]
    pub trait Foo {
        fn foo(&self);
    }
  };
}

mod foo1 {
    mkfoo!();
}
mod foo2 {
    mkfoo!();
}

Also, since SourceFile is affected by --remap-path-prefix, it can't really be relied on for much of anything except diagnostic prints: it may have no relation to the filesystem, and two different files may have the same value.
