Proc_macro in an existing library crate

I'm not talking specifically about const: the proc macro here is useful simply for offloading work from runtime to build time, and can be used like this:

fn find_matching(locales: Vec<Locale>) {
    // The locale tag is parsed and validated at build time by the proc macro.
    let en_us = locale!("en-US");

    for loc in locales {
        if loc == en_us {
            println!("Match found!");
        }
    }
}

It's true that this doesn't work as a full substitute right now, but as the const implementation matures it could one day be the solution to your multi-crate issue. Ideally, in the future, even in this example you could replace the macro call with a call to a const function that does the parsing and validation.
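
To illustrate, a hypothetical sketch of that future (Locale::parse_const is a made-up name, not an existing API):

// Hypothetical: a const fn does the parsing and validation at compile time,
// so an invalid tag becomes a compile error rather than a runtime failure.
const EN_US: Locale = Locale::parse_const("en-US");

fn find_matching(locales: Vec<Locale>) {
    for loc in locales {
        if loc == EN_US {
            println!("Match found!");
        }
    }
}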

This is a technical limitation that we would be happy to see lifted, but the technical details are hairy. I'm not sure who has the technical expertise to help drive the feature, but I think everyone would be excited to see progress here.

5 Likes

Would it be possible for someone with an understanding of the context to provide a list of the blockers that have already been filed? It would be good to understand what has to happen before this can be tackled.

This requires splitting source code for something that looks like a single crate into multiple parts that are compiled (or interpreted) separately, possibly for different targets if we are cross-compiling.

This is a major change to the compiler, but it would also be useful for other things (SYCL, for example) where you need to generate code for different targets from a single source.

4 Likes

How difficult is it to enumerate the data-structure and component boundaries within the compiler where target-architecture annotation would need to be added? Presumably proc-macros would be just a virtual target-architecture, though probably a distinguished member of the target-architecture enumeration, along with the compiler's host target-architecture.

This same effort could also lay the groundwork for the modular ABI proposal that is in a concurrent thread, as in some sense those different ABIs can be considered to be alternate target-architectures.

This requires splitting source code for something that looks like a single crate into multiple parts that are compiled (or interpreted) separately, possibly for different targets if we are cross-compiling.

But this is already the case with build.rs; how is that different?

The Cargo.toml could then contain something like:

[proc-macro]
path = "macro/macro.rs"

# Or can it be merged with [build-dependencies]?
[proc-macro.dependencies]
syn = "1"
quote = "1"
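
For concreteness, a minimal sketch of what the file behind path might contain; it would just be an ordinary proc-macro crate root (reusing the hypothetical locale! macro from the top of the thread):

// macro/macro.rs -- built for the host, much like build.rs is today
use proc_macro::TokenStream;

#[proc_macro]
pub fn locale(input: TokenStream) -> TokenStream {
    // A real implementation would parse and validate the locale tag here and
    // emit tokens constructing the value; this sketch just echoes its input.
    input
}
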
11 Likes

That's an interesting middle-ground proposal! Previously, we've imagined just writing them anywhere in your library and having rustc figure it out. But having it work like build.rs means that cargo knows how to compile a separate crate as part of the build step for your library. That would be much simpler to implement.

3 Likes

This would be fantastic. It'd be nice if the path were treated as a whole module (e.g., src/macros/mod.rs for 2015-edition or src/macros.rs for 2018-edition modules) so you could structure your code such that everything for proc macros lives in a particular (multi-file) module tree.

It would have to be treated as a crate root, not a module root, so it would probably want to be src/macros/lib.rs. And then it would need its own set of dependencies, as @ogoffart already identified.
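
Roughly, the layout could then look like this (purely illustrative):

 ├── Cargo.toml     # would gain a [proc-macro] section with its own dependencies
 └── src
     ├── lib.rs     # the library itself
     └── macros
         ├── lib.rs     # proc-macro crate root
         └── helpers.rs # any further modules of the proc-macro crate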

1 Like

Would it be feasible for said macro root crate to depend on the normal crate it's a part of? For example, the crate mentioned in this thread looks like this:

facade dependencies:

  • macro crate
  • impl crate

macro dependencies:

  • impl crate

impl dependencies: none (well, except third party ones)
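
Concretely, the facade then boils down to re-exports, something like this (crate names made up):

//! facade crate: src/lib.rs
pub use impl_crate::*;          // the actual implementation
pub use macro_crate::my_macro;  // the proc macro, re-exported for users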

It can go one way or the other. Either the impl crate can use the proc macro, or the proc macro crate can use the impl library.

I think that, if one direction has to be chosen as the default, making the impl crate have access to the macro as crate::macro_name is better than the inverse (where the macro would be available to the outside world but not within the impl).

In your case you'd still be able to reduce the number of crates from three to two.

I have a personal project where the dependency is actually circular (the macro directly depends on itself, even), but because I use watt for the macro impl, this isn't exposed to the user. Your example could easily reduce to one crate if the macro is precompiled to wasm and run via watt.
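
For anyone who hasn't seen watt: the published proc-macro crate becomes a thin shim that embeds the precompiled wasm, roughly like this (based on watt's documented pattern; details may vary between versions):

use proc_macro::TokenStream;
use watt::WasmMacro;

// The real implementation is compiled to wasm ahead of time and embedded
// here, so users never build the macro's heavy dependencies themselves.
static MACRO: WasmMacro = WasmMacro::new(WASM);
static WASM: &[u8] = include_bytes!("my_macro.wasm");

#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
    MACRO.proc_macro("my_macro", input)
}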

The proc-macro crate cannot depend on the crate itself.

Note that, already today, you can re-use code between the macro crate and the implementation crate.

Imagine a directory structure like this:

 ├── Cargo.toml
 ├── src
 │   └── lib.rs
 └── macro
     ├── Cargo.toml
     ├── src
     │   └── lib.rs
     └── common
         └── mod.rs

In the main crate's src/lib.rs you have:

// re-export the macro
pub use my_macro_crate::my_macro;

// import the common code shared with the macro crate
#[path = "../macro/common/mod.rs"]
mod common;

// optionally re-export it
pub use common::*;

And macro/src/lib.rs can also make use of the shared code with:

// shared code, compiled into the macro crate as well
#[path = "../common/mod.rs"]
mod common;

But one must be careful inside the common code: referring to crate:: might not resolve to the crate you expect.

This works because the macro crate is within a subdirectory of the other crate, so you only need to publish two crates on crates.io instead of three.

But ideally, if we could have a [proc-macro] section in the root Cargo.toml, then we would only need to publish a single crate, which would be much more convenient.

Under this new model, we could also redefine what it means to write quote!(crate::foobar) so that crate refers to the actual parent crate, perhaps depending on the span of the call site. But that might actually be more difficult.

One use case is making your life easier when implementing a crate, i.e. just abstraction, but done with a proc macro. The other use case is extending the public API of a crate with a proc macro.

I have encountered both use cases, i.e. both are extant in the wild. So having to pick at most one of them is a bit like having to choose between sync and async support in rustc: they serve different use cases, so having to choose between them is not really acceptable.

As for the proc macro as a build.rs-like construct: that's an interesting idea, but I would like to add that I would want to be able to declare multiple proc macros in one crate (for either use case outlined above, in any mixture), so it shouldn't be too much like build.rs, which is limited to just one per crate.

Perhaps it can't in actuality, but could we pull a sleight of hand to make it look like it can? E.g. by using another hidden crate or some such?

I was not aware this existed. That could indeed be useful when implementing a proc macro given the current infrastructure, but ultimately it is still a hack compared to the conceptual ideal, where I simply don't have to care about proc-macro crates and impl crates; rather, I would be able to write a proc macro as easily as a function, i.e. without extra crate shenanigans. Anything less than that will still leave people pining for that ideal.

You can already declare multiple proc-macro entry points in a single proc-macro crate. A "build.rs-like proc macro" would still be able to declare multiple proc macros from that single crate.

As I said, there are two possibilities.

  1. The proc-macros defined are available within the library as crate::macro and usable within the library.
  2. The proc-macros do not exist within the library. They can be accessed by someone using the library as ::your_crate::macro, but the library cannot access or use the proc macros.

A dependency from the proc macro onto the library is only possible in the second case. That's because in the first case, the dependency is circular: the library uses the proc macro uses the library uses [snip]

I think it makes sense to handle "build-rs like proc_macro" like build-rs: it can't depend on the library directly, because it's used to build the library.

There exist simple (enough) hacks to get around this bootstrap problem:

  • Use mod to mount (parts of) the library into the buildscript/proc_macro (requires consumers to compile that subset of the library twice)
  • Use an extra crate to factor out the shared code (I still want workspace-private crates that don't have to be published to crates-io)
  • The buildscript/proc_macro uses an older version of the library from crates-io (requires consumers to compile at least two versions of the library; be careful not to depend on the same version (infinite recursion doesn't work) or grow the bootstrap chain too long; see the Cargo.toml sketch after this list)
  • Use WASM or other precompilation to drive the buildscript/proc_macro (requires some setup, but provides the best buildscript/proc_macro compile time to users, as only the developer compiles your library more than once)
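
For the older-version approach, the proc-macro crate's manifest would pin a previously published release, along these lines (crate name and version made up):

# Cargo.toml of the proc-macro crate: depend on an already-published release
# of the library rather than the one currently being built, avoiding the cycle.
[dependencies]
my-library = "0.9"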

Somewhat obviously, I'm personally in favor of the "extra crates" version (especially if we get workspace private crates eventually, please?) and/or the WASM version (especially if we can get support baked into cargo for uploading/using a WASM precompiled version).

1 Like

For the particular case that I listed at the top of this issue, I don't think that would work.

I need the proc_macro to be able to use my library, and then I need to expose it for users of my library (my library does not have to use the proc_macro).

I'd like to be able to do the same for the time crate. Having the ability to use macros would be nice, but not strictly necessary.

Right now there's too much overhead (given the lack of diagnostics and hygiene), but down the road I'd like to be able to treat a macro as an integral, lightweight part of a crate.

The crate mentioned in the OP is indeed using 3 crates, when 2 would suffice (the #[proc_macro_hack] pub use crate can be inlined within the frontend crate).


I think that having a:

[proc-macro]
path = "src/lib/proc_macros/mod.rs"

as sugar for the shenanigans that are currently required to achieve that effect (which include having to use something like cargo release or cargo workspaces to handle the versions of the two Cargo.toml files) would indeed be a nice ergonomic bump, especially now that Rust 1.45.0 (to be released in July) is expected to support procedural macros in expression and statement position. Both changes will really help make procedural macros a first-class citizen in the ecosystem.

  • I would be willing to help with the actual implementation effort in that regard
Aside

In the case of function-like procedural macros, this annoying limitation can be circumvented by having a macro_rules! macro wrap the procedural one:

//! facade crate
#[doc(hidden)] /** Not part of the public API **/ pub
use proc_macros::my_macro as __my_macro__;

#[macro_export]
macro_rules! my_macro { ($($input:tt)*) => (
    $crate::__my_macro__! {
        #![crate = $crate]
        $($input)*
    }
)}
3 Likes

Would this need a proper RFC to move forward? It seems like a good idea, and I'd hate to see it fall by the wayside.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.