Idea to improve compile-times of derive macros. Maybe by a lot

I've realized that derive macros often do a lot of unnecessary work. Consider serde's Serialize and Deserialize derive macros, both of which go through the following steps:

  1. The derive input is turned into a TokenStream
  2. Rust calls the function in the serde proc-macro dylib, passing it the TokenStream
  3. The derive macro in serde parses this TokenStream into syn::DeriveInput
  4. The derive macro then does a bunch of validation on it and parses it into some internal AST like serde::SerdeAst
  5. Each derive macro takes this serde::SerdeAst and generates macro-specific code, depending on whether we are inside Serialize or Deserialize
  6. The derive macro turns the result into a TokenStream, which Rust receives and parses again to verify that it is correct.

Steps 1-4 don't need to happen more than once. So why do they?

Consider these 2 derives:

#[derive(Serialize, Deserialize)]
struct Something { /* ... */ } 

First, Rust does all 6 steps for the Serialize macro, and we end up with this:

#[derive(Deserialize)]
struct Something { /* ... */ } 

impl serde::Serialize for Something { /* ... */ }

Then, Rust does all 6 steps again for the Deserialize macro, and now we have this:

struct Something { /* ... */ } 

impl serde::Serialize for Something { /* ... */ }
impl serde::Deserialize for Something { /* ... */ }

This seems pretty wasteful. What if there was a better way? What if we only needed to do the first 4 steps once, and subsequent derives could reuse the result?

Note, specifically for serde: this is not exactly how serde_derive works today, but it should be possible to adapt it to a single shared AST where, for example, the serialize_with attribute is only allowed if serialize == true in the input.
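To make this concrete, here is a minimal sketch of what such a unified AST could look like; SerdeAst, SerdeField and the validate method here are hypothetical stand-ins, not serde_derive's real internals:

struct SerdeAst {
    ident: syn::Ident,
    fields: Vec<SerdeField>,
}

struct SerdeField {
    ident: syn::Ident,
    serialize_with: Option<syn::Path>,
}

impl SerdeAst {
    // Validation depends on which derives were actually requested.
    fn validate(&self, serialize: bool) -> syn::Result<()> {
        for field in &self.fields {
            if let Some(path) = &field.serialize_with {
                if !serialize {
                    return Err(syn::Error::new_spanned(
                        path,
                        "`serialize_with` is only allowed when deriving Serialize",
                    ));
                }
            }
        }
        Ok(())
    }
}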

Proposal

Let's have a #[proc_macro_derive_group] attribute which allows us to place Serialize and Deserialize under a single function that receives the item. Then we don't have to repeat those 4 steps a second time.

With #[proc_macro_derive]

#[proc_macro_derive(Serialize, attributes(serde))]
pub fn derive_serialize(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as syn::DeriveInput);
    let serde_ast = SerdeAst::parse(input);
    // serialize-specific logic producing `impl_serialize`
    quote! { #impl_serialize }.into()
}

#[proc_macro_derive(Deserialize, attributes(serde))]
pub fn derive_deserialize(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as syn::DeriveInput);
    let serde_ast = SerdeAst::parse(input);
    // deserialize-specific logic producing `impl_deserialize`
    quote! { #impl_deserialize }.into()
}

I showed how Rust will use these 2 functions in an earlier code block.

With #[proc_macro_derive_group]

This new attribute lets you avoid doing extra work.

#[proc_macro_derive_group(Serialize, Deserialize, attributes(serde))]
pub fn derive_serde(input: TokenStream, serialize_called: bool, deserialize_called: bool) -> TokenStream {
    let input = parse_macro_input!(input as syn::DeriveInput);
    let serde_ast = SerdeAst::parse(input);
    let impl_serialize = if serialize_called {
        // serialize-specific logic
        Some(quote! { #impl_serialize })
    } else {
        None
    };
    let impl_deserialize = if deserialize_called {
        // deserialize-specific logic
        Some(quote! { #impl_deserialize })
    } else {
        None
    };
    quote! {
        #impl_serialize
        #impl_deserialize
    }
    .into()
}

Let's dissect it.

For this code:

#[derive(Serialize, Debug, Deserialize)]
struct Something { /* ... */ } 

Rust knows that Serialize and Deserialize belong to a single group, so it calls serde::derive_serde exactly once:

serde::derive_serde(quote! { struct Something { /* ... */ } }, true, true)

After a single expansion, we now have this:

#[derive(Debug)]
struct Something { /* ... */ } 

impl serde::Serialize for Something { /* ... */ }
impl serde::Deserialize for Something { /* ... */ }

Then the other derives can kick in.

Benefit: compile times should improve, because we no longer repeat the first 4 steps for every derive in the group.

Of course, you are still allowed to use just one of the derives, #[derive(Serialize)] or #[derive(Deserialize)], in which case the respective argument is set to false.
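For example, a lone #[derive(Serialize)] would result in a single call like this (same hypothetical signature as above):

serde::derive_serde(quote! { struct Something { /* ... */ } }, true, false)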

Where this idea came from

I just made an RFC for the #[ignore] attribute: https://github.com/rust-lang/rfcs/pull/3869

In order to test this RFC out, I want to create a crate that re-implements the standard library's derives: PartialEq, Hash, PartialOrd, Ord and Debug.

But each derive macro will have a helper attribute ignore_derives so you can write stuff like this:

use derives_with_ignore_derives_attribute::{Debug, PartialEq, Hash};

#[derive(Clone, Debug, PartialEq, Hash)]
pub struct Var<T> {
    pub ns: Symbol,
    pub sym: Symbol,
    #[ignore_derives(PartialEq, Hash)]
    meta: RefCell<protocols::IPersistentMap>,
    #[ignore_derives(PartialEq, Hash)]
    pub root: RefCell<Rc<Value>>,
    #[ignore_derives(PartialEq, Hash)]
    #[ignore_derives(fmt::Debug)]
    _phantom: PhantomData<T>
}

Essentially, all the logic for the 5 derives will be the same; they'll all just process the DeriveInput slightly differently and emit different code.

There should be zero reason why I'd have to parse the TokenStream into a DeriveInput 5 times instead of once.
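Under this proposal, those 5 derives could share a single parse. A sketch using the proposed (not real) #[proc_macro_derive_group] API from above, with gen_debug and friends standing in for the per-derive codegen:

#[proc_macro_derive_group(Debug, PartialEq, Hash, PartialOrd, Ord, attributes(ignore_derives))]
pub fn derive_std_traits(
    input: TokenStream,
    debug: bool,
    partial_eq: bool,
    hash: bool,
    partial_ord: bool,
    ord: bool,
) -> TokenStream {
    // Parse once, share across all 5 derives.
    let input = parse_macro_input!(input as syn::DeriveInput);
    let impl_debug = debug.then(|| gen_debug(&input));
    let impl_partial_eq = partial_eq.then(|| gen_partial_eq(&input));
    // ...and likewise for Hash, PartialOrd and Ord...
    quote! { #impl_debug #impl_partial_eq }.into()
}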

This feature has the potential to improve the compilation performance of derive macros.


New people, same issues 😄

Nice idea, though I'd want the API to do more to encourage the derive macros not to interact with each other in weird ways. Maybe the API could be split into 2 steps -- a parse step and a derive step -- where the parse step is run once, producing a user-defined type, and its output is then cloned (or maybe just passed by shared reference) into each of the derive steps. E.g.:

pub struct SerdeAst { /* ... */ } // any type that impls TryFrom<TokenStream>

impl TryFrom<TokenStream> for SerdeAst {
    type Error = syn::Error; // any type that impls Into<TokenStream>
    fn try_from(input: TokenStream) -> Result<Self, Self::Error> {
        let input = syn::parse(input)?;
        Self::parse(input)
    }
}

#[proc_macro_derive(Serialize, attributes(serde))]
pub fn derive_serialize(input: &SerdeAst) -> TokenStream {
    // serialize-specific logic
    quote! { #impl_serialize }
}

#[proc_macro_derive(Deserialize, attributes(serde))]
pub fn derive_deserialize(input: &SerdeAst) -> TokenStream {
    // deserialize-specific logic
    quote! { #impl_deserialize }
}
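
Presumably, rustc would then run the TryFrom conversion once per item, hand a shared &SerdeAst to each derive in the group, and render an Err as a compile error at the derive site.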

Another version of this would be to go further and say that dealing in tokens as input at all was the major issue here.

For the common case of derives that just implement a trait, imagine a version where the body is generated by something that is passed a parsed version of the whole item, with type information too! No more re-parsing at all, plus it could tell the difference between the normal u8 and someone's evil type u8 = String;. That would need to come with the restriction that such a derive can't add any visible items, but that would be well worth it most of the time.

(Insert asterisks here for things like auto trait leakage that would need to be figured out.)
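As an aside, the evil alias really is legal Rust today, and any derive working purely on tokens is blind to it:

#[allow(non_camel_case_types)]
type u8 = String;

#[derive(Clone)] // a token-level derive only ever sees the *name* u8
struct Wrapper {
    data: u8, // actually a String
}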


Could you expand on what you mean here? Isn't the whole point of derives to add impls? Are those not items?

You could look at C# source generators for some inspiration.

Those, for example, mean you write something like

[GeneratedRegex("abc|def", RegexOptions.IgnoreCase, "en-US")]
private static partial Regex AbcOrDefGeneratedRegex();

and the source generator fills in the body of the method -- but notably it's not just "add the attribute and that generates a new method".

So maybe in Rust it could be something like

#[derived] impl Serialize for MyType;

which generates the body of the impl, but doesn't get to add any items (no other types, no other impls, etc.) so that name resolution results can't be changed by whatever the generator emits.
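For concreteness, here's a sketch of what the expansion might look like, using serde's real Serialize signature but an entirely illustrative generated body:

use serde::{Serialize, Serializer};

struct MyType {
    field: u32,
}

// The generator may only fill in this impl's body -- no extra items.
impl Serialize for MyType {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // Illustrative body; real generated code would be more involved.
        serializer.serialize_newtype_struct("MyType", &self.field)
    }
}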


Hm, I'm pretty sure I have seen derives that do more than that. For example the unstable SmartPtr (or whatever it got renamed to) in std caused two traits to get implemented iirc.

I'm pretty sure I have seen other examples, including some adding private helper mods, but maybe those were not derives but full on attr macros.

But I suppose for the vast majority of derives, all they do is add a single impl for a trait with the same name as the derive.

Yeah, absolutely. Part of the power of working on tokens is that it's early enough that things like that are possible without the user needing to list out the items. Generating alternate versions of the types too, for example.

But that's why it's not "for every possible proc macro".

Keep both, letting macro authors pick between more power (but more annoying to work with) and more restrictions (but much easier to write, because way more structured information is available).


Or have the token-based derive generate several type-based derives, allowing you to actually use types but also keep the simple usage of a single derive.


This looks similar to what Template Haskell does. It gives you access to a type-checked AST with all the information (and a lot more, via the Q monad). The problem is that every time they change the AST representation, you have to modify your derive code. #[non_exhaustive] might help to avoid breaking stuff, but this usually results in lots of _ => panic!("This could have been a compile time error, but here we are..."),

Maybe Rust could provide a coarse-grained, partial AST? Something between a fully-opaque TokenStream and a fully-detailed AST for every leaf.

AST nodes like Item or Expr don't have to be an enum that exposes everything Rust has. They could be objects with private fields, and only provide getters like as_struct(&self) -> Option<StructView>, and a to_token_stream() fallback.
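A sketch of what such a getter-only API could look like; Item, StructView and these method bodies are all hypothetical:

pub struct Item { /* private, version-dependent representation */ }

pub struct StructView { /* private */ }

impl Item {
    // Some(view) if this item is a struct, None otherwise.
    pub fn as_struct(&self) -> Option<StructView> { unimplemented!() }

    // Fallback for anything the structured API doesn't cover.
    pub fn to_token_stream(&self) -> proc_macro2::TokenStream { unimplemented!() }
}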

TH doesn't give a full internal AST either. I guess if none of this API exposes fields and everything is a getter...

In Haskell you support different versions using the C preprocessor 🙂

newtypeInstD' :: Name -> [Type] -> Con -> Dec
newtypeInstD' name args con = 
#if MIN_VERSION_template_haskell(2,15,0)
     NewtypeInstD [] Nothing (foldl AppT (ConT name) args) Nothing con []
#else
     NewtypeInstD [] name args Nothing con []
#endif 

That is really not the correct reaction to non_exhaustive... If the user does that, then the issue is squarely on the user. They get to keep all the pieces.


Rust's stability guarantees will also help with stuff like that: we don't arbitrarily break stable APIs unless they're unsound, or unless we have verified with crater (which tests all of crates.io and public repos on GitHub) that basically no one will be affected. Other languages such as Python and, from what you're describing, Haskell don't have those guarantees to the same extent.

This is roughly what macro fragment fields are trying to do: exposing a subset of the AST to macro_rules! in a way that avoids having to recreate the grammar in a macro just to parse out pieces of it.


Steps 1-4 are the ones we're trying to optimize. But doesn't rustc already cache the result of step 1? And as for step 4, serde has field attributes like skip_serializing and skip_deserializing, which means the validation step isn't the same for Serialize and Deserialize. So the only thing we can optimize is the parsing of the token stream into the syn struct. We need benchmarks to find out whether it's worth it.

According to ChatGPT, rustc already caches the results of proc macro invocations, so if that isn't a hallucination (I doubt it; it looks like such an easy thing to cache), this optimisation would only help a clean build.

Rustc does not currently cache proc macro invocations. There is a PR open for doing it, but doing it is non-trivial due to spans.

SerdeAst will support parsing both skip_serializing and skip_deserializing. If we're doing validation for just Deserialize, we emit a compile error for the skip_serializing attribute.

Essentially, you parse the structure into a superset of what Serialize and Deserialize support, then do extra validation on top of that if only one of them is present.

WRT the future-compatibility issue of proc_macro: it's already the case for declarative macros, so as long as the AST is treated mostly opaquely, I don't think it should add any issues. In particular, we have examples of using editions to alter what gets parsed when extending the syntax.