I've mentioned this idea in the global registration proposal as a more general alternative to introducing built-in "distributed slices". I decided it would not hurt to describe it in a bit more depth and discuss its feasibility in a separate thread.
This idea is heavier than hypothetical built-in distributed slices, but it's more general and fully covers the use cases of distributed slices, so we probably should not introduce both of them.
Proposal
Introduce Stateful Procedural Macros (SPM), defined using the following attributes usable in proc-macro = true crates: #[proc_macro_state], #[proc_macro_method], #[proc_macro_init], and #[proc_macro_finalize].
Example of defining an SPM:
// A type marked with `proc_macro_state` must have exactly one method marked
// with `proc_macro_init` and exactly one marked with `proc_macro_finalize`.
#[proc_macro_state]
pub struct DistributedSlice { ... }

impl DistributedSlice {
    #[proc_macro_init]
    pub fn init(ts: TokenStream) -> Self { ... }

    /// Pushes a new entry to the distributed slice and returns its position.
    #[proc_macro_method]
    pub fn push_ds_entry(&mut self, ts: TokenStream) -> TokenStream { ... }

    /// Returns the token stream for the accumulated slice.
    #[proc_macro_finalize]
    pub fn finalize(self) -> TokenStream { ... }
}
The #[proc_macro_method], #[proc_macro_init], and #[proc_macro_finalize] attributes can be used only on inherent methods of a type marked with #[proc_macro_state].
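To make the three roles concrete, here is a minimal plain-Rust model of what the elided method bodies might do. It is only a sketch under assumptions: strings stand in for TokenStream (the proposed attributes do not exist, so they are omitted), the `ty` accessor is a hypothetical helper, and the compiler is assumed to have already resolved the target static and to pass only the entry expression to `push_ds_entry`.

```rust
/// Plain-Rust model of the SPM state. Strings stand in for `TokenStream`.
pub struct DistributedSlice {
    /// Type text of the annotated static, e.g. "&[u64]" (assumption).
    ty: String,
    /// Source text of every entry pushed so far, in push order.
    entries: Vec<String>,
}

impl DistributedSlice {
    /// Models `#[proc_macro_init]`: receives the type of the static.
    pub fn init(ty: &str) -> Self {
        Self {
            ty: ty.trim().to_string(),
            entries: Vec::new(),
        }
    }

    /// Hypothetical accessor, present only so the stored type is observable.
    pub fn ty(&self) -> &str {
        &self.ty
    }

    /// Models `#[proc_macro_method]`: stores the entry and returns the
    /// text that would replace the macro call, i.e. the entry's index.
    pub fn push_ds_entry(&mut self, entry: &str) -> String {
        let idx = self.entries.len();
        self.entries.push(entry.trim().to_string());
        idx.to_string()
    }

    /// Models `#[proc_macro_finalize]`: consumes the state and returns
    /// the slice expression that would initialize the static.
    pub fn finalize(self) -> String {
        format!("&[{}]", self.entries.join(", "))
    }
}
```

Taking `self` by value in `finalize` mirrors the signature in the proposal: once the slice expression has been produced, no further entries can be pushed.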
Downstream crates can use this SPM to create statics:
use distributed_slice::DistributedSlice;
pub use distributed_slice::push_ds_entry;
// `DistributedSlice::init` will be called with token stream equal to `&[u64]`
#[DistributedSlice]
pub static U64_LIST: &[u64];
// The macro evaluates to `0`.
// With postfix macros we could write it as `U64_LIST.push_ds_entry!(13)`.
const IDX13: usize = push_ds_entry!(U64_LIST, 13);
fn foo() {
    // The macro evaluates to `1`.
    let idx42: usize = push_ds_entry!(U64_LIST, 42);
}
push_ds_entry can be used only with statics attributed with the associated procedural macro.
If a static associated with an SPM is public, it can be manipulated in downstream crates:
use u64_list::{U64_LIST, push_ds_entry};
// The macro evaluates to `2`.
const IDX64: usize = push_ds_entry!(U64_LIST, 64);
If no other crates push entries to U64_LIST, it will evaluate to &[13, 42, 64] when compilation finishes.
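For illustration, the fully expanded result could look roughly like the following ordinary Rust. This is a sketch of the intended end state, not actual compiler output. It assumes the pushes ran in the order 13, 42, 64, as in the comments above, and it turns the local `idx42` into a constant named `IDX42` (a hypothetical name) so that all three indices are visible in one place.

```rust
// What `#[DistributedSlice] pub static U64_LIST: &[u64];` would amount to
// once `finalize` has produced the accumulated slice expression.
pub static U64_LIST: &[u64] = &[13, 42, 64];

// Each `push_ds_entry!` call site is replaced by the index it returned.
pub const IDX13: usize = 0;
pub const IDX42: usize = 1;
pub const IDX64: usize = 2;

/// Looks up an entry by the index that a push returned.
pub fn entry(idx: usize) -> u64 {
    U64_LIST[idx]
}
```

With a different execution order the same three values would appear, but the indices baked into each call site would differ accordingly.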
The order in which SPM methods execute is arbitrary but deterministic (the same on every run) if codegen-units is equal to 1, and nondeterministic otherwise.
SPMs can be used to create not only slices but also other complex static types, e.g. static distributed perfect hash tables.
Low-level implementation
Procedural macro crates compile down to shared libraries (in the future they may be WASM modules). When the compiler encounters an SPM static, it calls the associated initialization function, which creates heap-allocated SPM state protected by a mutex. The static itself is handled like an extern static.
When the compiler encounters an SPM method "call", it calls the associated method from the shared library. This method locks the associated SPM state, executes the method's code, unlocks the state, and returns a TokenStream, which gets pasted into the compiled crate. If codegen-units is not equal to 1, we get a race condition, and the order of SPM method calls may change between compilation runs.
After all crates have been compiled, the compiler finalizes all SPMs by calling the associated finalization methods. Based on the resulting token streams, the compiler generates the statics and links them against the rest of the project.
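The lock, execute, unlock cycle and the resulting ordering race can be modelled outside the compiler with ordinary threads. The sketch below is only an analogy under assumptions: `SliceState` and `compile` are hypothetical names, plain `u64` values replace token streams, and each thread stands in for whatever parallel unit of work ends up invoking an SPM method.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the heap-allocated SPM state.
#[derive(Default)]
pub struct SliceState {
    entries: Vec<u64>,
}

impl SliceState {
    /// Models an SPM method call: appends a value and returns its index.
    pub fn push(&mut self, value: u64) -> usize {
        self.entries.push(value);
        self.entries.len() - 1
    }

    /// Models finalization: yields the accumulated entries.
    pub fn finalize(self) -> Vec<u64> {
        self.entries
    }
}

/// Simulates one compilation run. Every value is pushed from its own
/// thread, so the index each value receives depends on scheduling.
/// Returns the (value, index) pairs and the finalized slice.
pub fn compile(values: &[u64]) -> (Vec<(u64, usize)>, Vec<u64>) {
    let state = Arc::new(Mutex::new(SliceState::default()));

    let handles: Vec<_> = values
        .iter()
        .map(|&value| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                // Lock, run the method, unlock when the guard is dropped
                // at the end of this statement.
                let idx = state.lock().unwrap().push(value);
                (value, idx)
            })
        })
        .collect();

    let assigned: Vec<(u64, usize)> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();

    // All "method calls" are done: take the state out and finalize it.
    let finished = std::mem::take(&mut *state.lock().unwrap());
    (assigned, finished.finalize())
}
```

Calling `compile(&[13, 42, 64])` always yields a slice containing the three values, and every returned index points at its own value, but which value lands at which index may differ from run to run. That is the nondeterminism described above.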
Unresolved questions
- Interaction with incremental compilation. The compiler would need to cache SPM states for parts of the compilation tree.
- How would postfix macro syntax work with SPMs? Do we need to explicitly import "method" macros?
- Interaction with cross-compilation.
- Maybe we should not tie SPMs to statics? How would it work otherwise?