Idea: stateful procedural macros

I mentioned this idea in the global registration proposal as a more general alternative to introducing built-in "distributed slices". I decided it would not hurt to describe it in a bit more depth and discuss its feasibility in a separate thread.

This idea is heavier than hypothetical built-in distributed slices, but it's more general and fully covers use cases of distributed slices, so we probably should not introduce both of them.

Proposal

Introduce Stateful Procedural Macros (SPM), defined using the following attributes, which are usable in proc-macro = true crates: #[proc_macro_state], #[proc_macro_method], #[proc_macro_init], and #[proc_macro_finalize].

Example of defining an SPM:

// A type marked with `proc_macro_state` must have exactly one method marked
// with `proc_macro_init` and exactly one marked with `proc_macro_finalize`.
#[proc_macro_state]
pub struct DistributedSlice { ... }

impl DistributedSlice {
    #[proc_macro_init]
    pub fn init(ts: TokenStream) -> Self { ... }

    /// Push new entry to the distributed slice and return its position.
    #[proc_macro_method]
    pub fn push_ds_entry(&mut self, ts: TokenStream) -> TokenStream { ... }

    /// Returns token stream for accumulated slice.
    #[proc_macro_finalize]
    pub fn finalize(self) -> TokenStream { ... }
}

The #[proc_macro_method], #[proc_macro_init], and #[proc_macro_finalize] attributes can be used only on inherent methods of a type marked with #[proc_macro_state].

Downstream crates can use this SPM to create statics:

use distributed_slice::DistributedSlice;
pub use distributed_slice::push_ds_entry;

// `DistributedSlice::init` will be called with token stream equal to `&[u64]`
#[DistributedSlice]
pub static U64_LIST: &[u64];

// The macro evaluates to `0`.
// With postfix macros we could write it as `U64_LIST.push_ds_entry!(13)`.
const IDX13: usize = push_ds_entry!(U64_LIST, 13);

fn foo() {
    // The macro evaluates to `1`.
    let idx42: usize = push_ds_entry!(U64_LIST, 42);
}

push_ds_entry can be used only with statics annotated with the associated procedural macro.

If a static associated with an SPM is public, it can be manipulated in downstream crates:

use u64_list::{U64_LIST, push_ds_entry};

// The macro evaluates to `2`.
const IDX64: usize = push_ds_entry!(U64_LIST, 64);

If no other crates push entries to U64_LIST, it will evaluate to &[13, 42, 64] when compilation finishes.

The order of execution of SPM methods is arbitrary but reproducible if codegen-units is equal to 1, and nondeterministic otherwise.

SPMs can be used to create not only slices, but other complex static types, e.g. they can be used to implement static distributed perfect hash tables.

Low-level implementation

Procedural macro crates compile down to shared libraries (in the future they may be WASM modules). When the compiler encounters an SPM static, it calls the associated initialization function, which creates a heap-allocated SPM state protected by a mutex. The static itself is handled like an extern static.

When the compiler encounters an SPM method "call", it calls the associated method from the shared library. This method locks the associated SPM state, executes the method's code, unlocks the state, and returns a TokenStream, which gets pasted into the compiled crate. If codegen-units is not equal to 1, we get a race condition, and the order of SPM method calls may change between compilation runs.

After all crates have been compiled, the compiler finalizes all SPMs by calling the associated finalization methods. Based on the resulting token streams, the compiler generates statics and links them against the rest of the project.
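
To make the lock/execute/unlock cycle above concrete, here is a minimal sketch in plain Rust. It is not a proc macro and not how rustc would do it: SliceState and simulate are hypothetical names, entries are u64 values rather than token streams, and threads stand in for parallel codegen units.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the SPM state; entries are plain `u64` values
/// instead of token streams.
#[derive(Default)]
pub struct SliceState {
    entries: Vec<u64>,
}

impl SliceState {
    /// Models `push_ds_entry`: appends the value and returns its position.
    pub fn push(&mut self, value: u64) -> usize {
        self.entries.push(value);
        self.entries.len() - 1
    }

    /// Models `finalize`: consumes the state and yields the slice contents.
    pub fn finalize(self) -> Vec<u64> {
        self.entries
    }
}

/// Simulates several codegen units pushing into one mutex-protected state.
/// Returns the (value, index) pairs handed out and the finalized contents.
pub fn simulate(values: &[u64]) -> (Vec<(u64, usize)>, Vec<u64>) {
    let state = Arc::new(Mutex::new(SliceState::default()));
    let handles: Vec<_> = values
        .iter()
        .map(|&v| {
            let state = Arc::clone(&state);
            // Each "method call" locks the state, runs, and unlocks on drop.
            thread::spawn(move || {
                let idx = state.lock().unwrap().push(v);
                (v, idx)
            })
        })
        .collect();
    let assigned: Vec<(u64, usize)> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();
    // All workers are done, so this is the only remaining reference.
    let state = Arc::try_unwrap(state)
        .ok()
        .expect("all workers have finished")
        .into_inner()
        .unwrap();
    (assigned, state.finalize())
}
```

Every index handed out matches the position of its value in the finalized list, but which value gets which index depends on thread scheduling. That is the same nondeterminism described above for codegen-units greater than 1.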

Unresolved questions

  • Interaction with incremental compilation. The compiler would need to cache SPM states for parts of the compilation tree.
  • How would postfix macro syntax work with SPMs? Do we need to explicitly import "method" macros?
  • Interaction with cross-compilation.
  • Maybe we should not tie SPM to statics? How would it work otherwise?

This seems to expand the required implementation a lot: now the compiler needs to propagate the proc-macro state across invocations and somehow go back and fix up the upstream crate's static after the final crate has been compiled.

You misunderstood the proposal a bit. As described in the second section, in the proposed model the compiler does not retroactively "fix" the upstream crate. During its compilation, the SPM static in it is effectively an extern static with an "undefined" value. It gets defined only during linking, which is deferred to the later compilation stages.

The compiler keeps the SPM state in separate memory until all crates are compiled, then it generates a phantom codegen unit which defines all SPM statics in the project and links them with the previously declared extern statics. Because the compiler may compile crates in parallel, it needs to protect SPM states with a mutex. In other words, in the worst case, compilation of one crate may block compilation of another crate. It's certainly a disadvantage, but I don't think it should be a problem in practice with sanely implemented SPMs.

Each crate is compiled with a separate compiler though.

That does not mean they cannot use shared memory.

Crates in the same dependency graph are compiled with different rustc processes and those can run at wildly different moments. Think for example of running cargo build, then stopping it, then running it again after rebooting your pc; it won't recompile crates that it had already compiled, so whatever shared state you want will need to be persisted on disk.

I would require the shared data to be serializable (in order to store it on disk and also possibly allow incremental compilation) and the proc macro to provide an associative way to combine two copies (to allow compiling crates independently).
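
A rough sketch of what "serializable plus an associative combine" could look like, in plain Rust. Everything here is an assumption for illustration: PartialState, merge, to_bytes and from_bytes are hypothetical names, the byte format is hand-rolled, and a set is used as the state, which makes the combine trivially associative but also drops duplicate entries.

```rust
use std::collections::BTreeSet;

/// Hypothetical per-crate SPM state: the set of values one crate registered.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct PartialState {
    entries: BTreeSet<u64>,
}

impl PartialState {
    pub fn from_values(values: &[u64]) -> Self {
        Self { entries: values.iter().copied().collect() }
    }

    /// Combines two independently produced states. Set union is associative
    /// (and commutative), so the grouping of merges does not matter.
    pub fn merge(mut self, other: Self) -> Self {
        self.entries.extend(other.entries);
        self
    }

    /// Hand-rolled serialization: each entry as 8 little-endian bytes.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(self.entries.len() * 8);
        for value in &self.entries {
            out.extend_from_slice(&value.to_le_bytes());
        }
        out
    }

    /// Inverse of `to_bytes`; returns `None` for a malformed buffer.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() % 8 != 0 {
            return None;
        }
        let mut entries = BTreeSet::new();
        for chunk in bytes.chunks_exact(8) {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            entries.insert(u64::from_le_bytes(buf));
        }
        Some(Self { entries })
    }

    pub fn into_sorted_vec(self) -> Vec<u64> {
        self.entries.into_iter().collect()
    }
}
```

With this shape each crate's state can be written to disk next to its other artifacts and merged later in any grouping, at the price of not being able to hand out stable indices while a crate is still being compiled.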

Yes, this is why I mentioned incremental compilation in the last section.

It may be possible for it to work without serialization. Because we know that the proc macro was compiled with the same version of the Rust compiler, we can dump a snapshot of the shared memory directly onto disk. Of course, for this to work we would need to ensure the stability of virtual addresses in the restored shared memory. We would also need to be careful about which allocator is used. It may be better for this to work through an RPC protocol instead of direct shared library calls.

The associative property is also not needed: a shared memory snapshot can record the crates (dependent on the associated SPM) which finished compiling before its creation. If during a new compilation we recompile only crates which are not in the list, we can start from this snapshot.

Incremental compilation only happens within a single crate. With multiple crates, that is just how it always works; there's no alternative.

At that point it risks going way out of scope.

A somewhat more limited-in-scope way to do this is to consider a two phase approach:

  1. The proc-macro produces serializable data. (Possibly just [u8] buffers, letting the proc-macro deal with the serialisation/deserialisation itself.)
  2. At the end we invoke another method on the proc-macro, providing all the collected data so far, and have it produce the final data that should be put into the binary (be it as a token stream or otherwise).

This is sort of akin to a map-reduce way of structuring things.
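
The two phases above could be sketched like this in plain Rust. The names emit_entry and reduce are hypothetical, the buffers are just 8 little-endian bytes per entry, and the final output is a String of source text standing in for a token stream.

```rust
/// Phase 1 (hypothetical): run once per macro invocation. It produces an
/// opaque byte buffer that the compiler could store alongside the crate.
pub fn emit_entry(value: u64) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

/// Phase 2 (hypothetical): run once at the end with every collected buffer.
/// Returns the source text of the final static, or an error for a buffer
/// that does not decode.
pub fn reduce(buffers: &[Vec<u8>]) -> Result<String, String> {
    let mut values = Vec::with_capacity(buffers.len());
    for buf in buffers {
        if buf.len() != 8 {
            return Err(format!("expected 8 bytes, got {}", buf.len()));
        }
        let mut raw = [0u8; 8];
        raw.copy_from_slice(buf);
        values.push(u64::from_le_bytes(raw));
    }
    // Sort the decoded values so the output does not depend on the order
    // in which the buffers were collected.
    values.sort_unstable();
    let items: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    Ok(format!("pub static U64_LIST: &[u64] = &[{}];", items.join(", ")))
}
```

Sorting in the reduce step is one possible design choice for determinism; it also means a position cannot be handed back at the point where an entry is emitted.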

Serialization could be simplified by forbidding SPM methods from returning a TokenStream. In this case, the compiler only needs to remember (and serialize) the method names and the token streams used to call them, without executing the methods themselves. Execution of SPM code can then be deferred to the very end of project compilation. It could also help with determinism, since we could sort all method calls based on method names and serialized arguments. This is effectively the global registration proposal, but with the additional capability to process registered items.

The lack of a TokenStream return value is a noticeable restriction, but since it can significantly simplify the implementation, I think it would be reasonable to start with it and leave the door open for a potential future extension.
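
A small sketch of the deferred variant in plain Rust, under the assumption that a recorded call is just a method name plus its arguments serialized as a string. RecordedCall and replay are hypothetical names, and push_ds_entry is the only method this toy replay understands.

```rust
/// Hypothetical record of one SPM method call: nothing is executed when it
/// is recorded, only the name and the serialized arguments are kept.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct RecordedCall {
    pub method: String,
    pub args: String,
}

/// Runs at the very end of compilation: sorts the recorded calls by
/// (method name, serialized arguments) and only then executes them.
pub fn replay(mut calls: Vec<RecordedCall>) -> Result<Vec<u64>, String> {
    calls.sort();
    let mut entries = Vec::new();
    for call in &calls {
        match call.method.as_str() {
            "push_ds_entry" => {
                let value: u64 = call
                    .args
                    .trim()
                    .parse()
                    .map_err(|e| format!("bad argument {:?}: {}", call.args, e))?;
                entries.push(value);
            }
            other => return Err(format!("unknown SPM method: {}", other)),
        }
    }
    Ok(entries)
}
```

Because the calls are sorted before execution, two builds that record the same calls in a different order produce the same result. Note that the order is the lexicographic order of the serialized arguments, not a numeric one.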

Using shared memory to bypass intentional crate encapsulation does not sound like a good idea.

What happens when a finalized registry itself produces a macro call? Or, better yet, subscribes to another registry?

I don't know what you mean by a registry "subscribing"; you will have to define exactly what that means. I would expect macros producing calls to macros to work like they do today (and I don't actually know how that works).

It works by feeding a piece of syntax (which is what a token stream is) to a separately compiled lib to get another piece of syntax. The produced syntax is then parsed itself, and all macro calls in it also get executed.

The quirk is that when the proposed stateful macro is expanded, it can pile up another entry to be registered elsewhere, or even in itself, potentially causing infinite loops in macro expansion (and registering in itself is the easiest case).

To fix this, we can add a special mark to the token stream produced by such a macro finalization; the mark inhibits any use of a stateful macro push attribute inside it and emits a warning.

This way we retain the ability to add calls to stateful macros from other macros, as well as vice versa; we only prohibit the problematic cases.
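
As a toy model of that mark, here is a sketch in plain Rust. Origin and Registry are hypothetical names, and the "mark" is reduced to an enum value passed along with each push rather than anything attached to real tokens.

```rust
/// Hypothetical mark describing where a push attempt comes from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Origin {
    /// Ordinary code, including output of other (non-finalization) macros.
    UserCode,
    /// Tokens produced by a stateful macro's finalization step.
    FinalizeOutput,
}

/// Hypothetical registry that refuses pushes from finalization output.
#[derive(Debug, Default)]
pub struct Registry {
    pub entries: Vec<u64>,
    pub warnings: Vec<String>,
}

impl Registry {
    /// Registers `value` and returns its position, unless the push comes
    /// from marked finalization output. In that case nothing is registered
    /// and a warning is recorded instead.
    pub fn push(&mut self, value: u64, origin: Origin) -> Option<usize> {
        if origin == Origin::FinalizeOutput {
            self.warnings
                .push(format!("ignored push of {} from finalization output", value));
            return None;
        }
        self.entries.push(value);
        Some(self.entries.len() - 1)
    }
}
```

Since finalization output can never add an entry, expanding it cannot trigger another round of finalization, which is what rules out the self-registration loop.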

Prior art: Zig's now-removed comptime var.