Procedural macros should have access to some basic compilation information

I don't know if this has been requested by others, but I personally want procedural macros to know what target architecture the user's code is being compiled for, so that I can generate platform-specific code for the user (for performance reasons).

This behavior could be achieved by adding a bunch of #[cfg(...)] conditions to the generated code, but that is inefficient, and the complexity of the cfg expressions grows exponentially as I consider more architectures, so it is far from ideal.

This feature seems to have been requested by others on Stack Overflow as well; see here and here.

I think it makes sense to give procedural macros access, during user-code compilation, to some common information, including:

  • target type (binary/library/example/doc/etc.)
  • target architecture (x86, x64, etc.)
  • optimization level (debug/release)

Since a procedural macro behaves like a compiler plugin, I haven't seen any reason it shouldn't be allowed access to this. If this ability were supported, the power of procedural macros could, in my opinion, be pushed even further.

In terms of how to support it, I guess providing some kind of environment variable to the procedural macro (only during user-code compilation) could be feasible?
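As a sketch of what that could look like: if the compiler exported such a variable to the proc-macro process, the macro could select an implementation with ordinary Rust control flow at expansion time. The variable name `RUSTC_TARGET_ARCH` below is invented for illustration, not a real variable:

```rust
use std::env;

// Hypothetical: branch on a target-architecture variable exported to the
// proc-macro process. RUSTC_TARGET_ARCH is an invented name for this sketch.
fn select_impl() -> &'static str {
    match env::var("RUSTC_TARGET_ARCH").as_deref() {
        Ok("x86_64") => "impl_x86_64",
        Ok("aarch64") => "impl_aarch64",
        // Unknown or unset: fall back to the generic implementation.
        _ => "impl_generic",
    }
}

fn main() {
    println!("selected: {}", select_impl());
}
```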

2 Likes

Surely, from the rest of your post, you must mean "compiled for", right? With cross-compilation they aren't necessarily the same.

3 Likes

It's a technique I use myself, so I'll ask: what's inefficient about it? All that gated code is compiled out if the predicate doesn't match, and thus doesn't take up any space or CPU cycles at runtime. So what am I missing?

Again, I don't see how? Would you mind providing an example?

The dev and release profiles don't guarantee a specific optimization level. In addition, you can define your own profiles.

The Rust compiler doesn't have any such concept. It is Cargo that distinguishes between them; rustc doesn't know.
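To illustrate that profiles don't pin an optimization level, here is a hedged Cargo.toml sketch (the `fastdebug` profile name is made up for this example):

```toml
# A profile does not imply one fixed optimization level -- any profile
# can override it, and custom profiles can be defined.
[profile.release]
opt-level = "s"      # optimize for size instead of the default level 3

[profile.fastdebug]  # custom profile, hypothetical name
inherits = "dev"
opt-level = 2        # optimized, but debug assertions stay on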

If this is supported at all, I don't think it should be done using an env var. Env vars are global to the entire process while multiple compilation sessions with different configurations may run in the same process. (rustc doesn't do this, but custom drivers can)

2 Likes

There are some existing per-compilation-unit env vars exposed to proc macros, e.g. OUT_DIR if the current compilation unit has a build script.
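For example, a proc macro can already read that variable with `std::env::var`, keeping in mind it is only present when the compilation unit has a build script:

```rust
use std::env;

// OUT_DIR is set by Cargo for crates with a build script, so code running
// inside a proc macro must treat it as optional.
fn out_dir() -> Option<String> {
    env::var("OUT_DIR").ok()
}

fn main() {
    match out_dir() {
        Some(dir) => println!("build-script output dir: {dir}"),
        None => println!("no build script for this compilation unit"),
    }
}
```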

Those are set by Cargo, not rustc. In addition, they too prevent e.g. rust-analyzer from running proc macros in parallel.

1 Like

Yes, I meant "compiled for", thanks!

The cfg! macro works okay if the generated code is valid in all cases; that is, one can generate:

if cfg!(target_arch = "x86_64") {
    // platform specific code A
} else if cfg!(target_arch = "x86") {
    // platform specific code B
} else if cfg!(target_arch = "aarch64") {
    // platform specific code C
} else {
    // generic code
}

However, if the platform-specific code only compiles on that platform, and the conditions are combined with other flag checks, then the code becomes more bothersome. (My mistake earlier: the growth won't be exponential in general, but it can be if the conditions are independent; for example, I have several other conditions that must be checked together.)

#[cfg(all(target_arch = "x86", override_64bits = "true", override_32bits = "true"))]
{
    // specific code A
}
#[cfg(all(target_arch = "x86", override_64bits = "false", override_32bits = "true"))]
{
    // specific code B
}
#[cfg(all(target_arch = "x86", override_64bits = "true", override_32bits = "false"))]
{
    // specific code C
}
#[cfg(all(target_arch = "x86", override_64bits = "false", override_32bits = "false"))]
{
    // specific code D
}

(these are not exactly what I'm doing, just for explanations)

My point is that if we can move these conditions from the user's code into the macro's code, we can then use cfg! and switch on them, which is more elegant and not limited by the syntax of the #[cfg] attribute.

And my guess about the "inefficiency" is that it affects compilation time: there are more tokens for the macro to generate, and more compilation conditions to check while compiling the user's code.
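Inside the macro, the same selection could be done with an ordinary match, so the independent conditions compose with plain boolean logic rather than one #[cfg(all(...))] attribute per combination. A minimal sketch, with the inputs as hypothetical stand-ins for the flags above:

```rust
// Sketch under the proposal: select which code to generate with ordinary
// Rust expressions inside the macro, instead of emitting one #[cfg(all(...))]
// block per combination in the user's code.
fn pick_variant(arch: &str, override_64: bool, override_32: bool) -> &'static str {
    match (arch, override_64, override_32) {
        ("x86", true, true) => "specific code A",
        ("x86", false, true) => "specific code B",
        ("x86", true, false) => "specific code C",
        ("x86", false, false) => "specific code D",
        _ => "generic code",
    }
}

fn main() {
    println!("{}", pick_variant("x86", true, false));
}
```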

1 Like

I thought there was somehow a distinction between debug and release builds, because we have a debug_assert! macro that is only evaluated in debug builds. But I just found out that it's based on a compilation flag. I wish there were a standard way to distinguish debug from release, just as some compilers define NDEBUG in C++.

Then how about letting Cargo set these environment variables or compilation flags? Maybe a compilation flag is a better way to pass this information? But I don't know whether it's currently feasible to read rustc flags while a procedural macro is running.
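For reference, that flag is already queryable in ordinary generated code: cfg!(debug_assertions) evaluates to a plain bool at compile time, which is the closest Rust analogue to checking NDEBUG:

```rust
// debug_assertions is on by default in the dev profile and off in release,
// but it is its own flag (-C debug-assertions), not the optimization level.
fn is_debug_build() -> bool {
    cfg!(debug_assertions)
}

fn main() {
    println!("debug_assertions enabled: {}", is_debug_build());
}
```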

Though, at least with a proc macro you can write code that generates these sets for you. I would expect the code for generating all the sets to look very similar to the code for choosing one set to generate.

1 Like

Yes. Nevertheless, the difference is that you can use regular Rust expressions to select the set in the macro's code, but you have to rely on #[cfg] expressions in the user's code.

I just encountered another piece of information that would be great to have access to: target endianness.

While it's not necessary to expose this information directly to procedural macros, I do agree that it's definitely useful to be able to do so.

I think it might be useful to design a general way to let you wrap procedural macros to add extra tokens to their input. The biggest thing that macro authors want/need currently is a way for the procedural macro to know the (hygienic) path to a runtime crate. You can do this information passing for function-like macros (though it can be annoying for cfg matrices), e.g.

cfg_if! {
    if #[cfg(target_arch = "x86_64")] {
        #[macro_export]
        macro_rules! m {($($args:tt)*) => {
            $crate::proc_macros::m! {
                #cfg {
                    crate: $crate,
                    target_arch: x86_64,
                }
                $($args)*
            }
        }}
    } else if #[cfg(target_arch = "x86")] {
        #[macro_export]
        macro_rules! m {($($args:tt)*) => {
            $crate::proc_macros::m! {
                #cfg {
                    crate: $crate,
                    target_arch: x86,
                }
                $($args)*
            }
        }}
    } else if #[cfg(target_arch = "aarch64")] {
        #[macro_export]
        macro_rules! m {($($args:tt)*) => {
            $crate::proc_macros::m! {
                #cfg {
                    crate: $crate,
                    target_arch: aarch64,
                }
                $($args)*
            }
        }}
    } else {
        compile_error!("unsupported target architecture");
        macro_rules! m {($($args:tt)*) => {}}
    }
}

I'd personally like a way to pass extra tokens like this as an additional argument to any procedural macro. Combined with eager macro expansion, this makes passing arbitrary macro-time information into a library concern.

(That said, perhaps it'd be better to avoid pushing for macro_rules! metaprogramming?)


That said, just providing access to the cfg configuration via environment variables, the way it's exposed to build scripts, is a lot simpler. Plus, rustc is already considering isolating procedural macro invocations into separate processes. And if we ever run them on WASM, we fully define the RPC, including environment access.
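For comparison, this is the view build scripts already get: Cargo exports the target configuration as CARGO_CFG_* environment variables. The sketch below just echoes two of them; when run outside a Cargo build, the variables are absent and the fields come back empty:

```rust
use std::env;

// In a real build.rs, Cargo sets CARGO_CFG_TARGET_ARCH, CARGO_CFG_TARGET_ENDIAN,
// and friends; the idea here is roughly to give proc macros the same view.
fn cfg_summary() -> String {
    let arch = env::var("CARGO_CFG_TARGET_ARCH").unwrap_or_default();
    let endian = env::var("CARGO_CFG_TARGET_ENDIAN").unwrap_or_default();
    format!("arch={arch} endian={endian}")
}

fn main() {
    println!("{}", cfg_summary());
}
```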

3 Likes

That's great! Invoking procedural macros in a WASM-like way definitely makes this problem easier to solve!

1 Like

Note that NDEBUG is part of the language standard and is not a "some compilers" thing.

NDEBUG seems to be directly equivalent to cfg(not(debug_assertions)) (if you consider C assert to be equivalent to Rust debug_assert!). It's not directly related to debug/release profiles or optimization levels and can be enabled/disabled in either of them.
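A minimal illustration of that equivalence, assuming nothing beyond the standard library:

```rust
// Like C's assert() when NDEBUG is defined, debug_assert! compiles to
// nothing when debug_assertions is disabled; the check only exists in
// builds that enable the flag, regardless of optimization level.
fn checked_div(a: i32, b: i32) -> i32 {
    debug_assert!(b != 0, "caught only when debug_assertions is on");
    a.checked_div(b).unwrap_or(0)
}

fn main() {
    println!("{}", checked_div(10, 2));
}
```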

3 Likes