Macros 1.1 Requires Derives to be Ordered

So I’ve been working on implementing macros 1.1 for a library of mine and have run into a case where the original struct in the token stream given to the macro is transformed, so my macro then recieves a different input to what the user gave. At least that’s what I think is going on, I may be wrong.

That means suddenly the order of derives becomes important, because the input actually received depends on who transforms the token stream first.

Personally I’m not a fan of that state of affairs, and think each derive should recieve the same input, meaning that original input would have to be added to the token stream output independently of the macro (or it could mutate it and mess with other macros).

That of course limits the power of the implementation.

Is this state of affairs by design? Or am I waving my arms around over a bug? If it is by design then how do we ensure a reasonable set of derives will behave together?

2 Likes

I’m guessing the real problem is that there is no stable implementation of custom attributes, so the proc_macros_derive spec was forced to include the “implementer strips custom attributes” rule (since attributes are almost critical for this feature).

I’t s a known about issue, and there has been some discussion of how to fix it in the macros 1.1 tracking issue.

2 Likes

This is precisely the reason why I never allowed custom_derive! macros to modify the input. My intention was always that if people really wanted that, then what they should use is a non-derive attribute. It makes no sense for derive, which should be producing new constructs, to mutate the input.

It’s also why I included support for arguments on the derive attribute itself, so that all the derive-specific stuff was contained to one place.

The solution I wanted for this was to have a meta attribute that everything (including the compiler) propagates, but otherwise ignores.

2 Likes

Derive arguments don’t help when you require configurable behaviour for each individual struct field, or enum variant.

True, but they’re good enough for the majority of cases, the rest of which would’ve been covered by a #[meta(..)] attribute.

Are you sure about that? The majority of attributes I have interacted with apply per-field - for example #[serde(skip_serializing)] or #[postgres(name = "foo")].

I’d contend that’s only because people don’t think much about deriving simple traits, which is the overwhelming majority in my experience. Complex stuff like serde is important, but I don’t believe it’s actually the bulk of what you can derive.

To put it another way: with something like custom_derive!, it’s way easier to derive simple stuff that takes a few basic arguments to define their behaviour, so you do exactly that. Most people used to plugin-supported custom derive only go through that rigmarole for the complex stuff like serde which does legitimately need attributes sprinkled all over the place like dandruff.

1 Like

Why? What are the benefits of that approach?

Is this state of affairs by design? Or am I waving my arms around over a bug? If it is by design then how do we ensure a reasonable set of derives will behave together?

It is by design, although it is a part of the design that is controversial and I'm not 100% is the right approach. I think that any responsible set of derives should behave - as far as I can see, the only way derives don't combine well is if one of the derives either assumes too much about its input or does something unexpected to its output.

A key point is that you cannot assume the input to your derive will be as written in the source code - it is always possible for it to be transformed by some other macro before you get it. From your point of view, it doesn't matter whether it was affected by another macro, another custom derive, or the user wrote 'weird' (possibly non-compiling) code - your derive ought to handle this gracefully.

The motivation for letting custom derives modify their input is:

  • simplicity - all macros are basically the same whether they are custom derive or not. This makes it easy to change a macro to be custom derive or not, to share code between custom derives and regular macros, makes them easier to learn, and simplifies the implementation.
  • it is actually desirable - removing attributes is the common case, adding attributes also seems reasonable. It also seems possible that a derive that adds doc comments or other decoration are legitimate use cases.

On the other add, allowing custom derive to only add code fits better with user expectations. So, I think the trade-off is between that expectation and the above. In particular, we must (I believe) at least allow a way for derive to remove attributes. My feeling has been that it is better to allow general purpose modification, rather than provide a special case mechanism for removing attributes. In particular, it is not at all clear what a special case mechanism for removing attributes should look like.

I didn’t follow macros 1.1 too closely, but the fact that the behavior of the derive macros can be dependent on the order they’re listed in sounds like a very unfortunate consequence, and it would be better if there were a way to get the effects we want without that consequence.

I’ve just read the RFC again, and though this aspect of the system is mentioned, the reasoning behind it isn’t well explained. Why do derives need to be able to remove attributes? Why couldn’t those attributes just be no-ops? Is this only because there’s no way to register a no-op attribute right now?

1 Like

I understand your point from the point of view of the macro, but I don’t think that’s how users will typically think about these things. Or at the very least it isn’t how I think of it. I view each derive as a function of the input I’ve typed and given it, independent of other functions. Maybe that’s not shared by other devs and I need to change my mental model.

My concern with unbounding what derives can put out is that in isolation it seems like a great feature. You can add fields, remove them, change their types, do whatever! It’s only when you chain them, with no visibility of the input to other macros that it janks out.

If it’s a feature that’s allowed, then surely it will get used, and we’ll run into this problem. And if it’s not ‘recommended’ and its only use case is stripping attributes then why is it there?

1 Like

sounds like a very unfortunate consequence

I agree this kind of sounds bad when you look at the derive syntax. But if you want to allow sophisticated derives this seems inevitable, and I'm not sure why this is actually bad, rather than just slightly unexpected.

Why do derives need to be able to remove attributes? Why couldn't those attributes just be no-ops? Is this only because there's no way to register a no-op attribute right now?

Yeah, basically, there is no such thing as a no-op attribute - the compiler can assume its a macro or a built-in attribute, so you'd get an error about a custom attribute. There is an RFC to address this, but it doesn't aim to allow no-op attributes like this.

I view each derive as a function of the input I've typed and given it, independent of other functions.

This is wrong, you must assume that the code could have been altered by other macros, independent of what we decide with derives.

It's only when you chain them, with no visibility of the input to other macros that it janks out.

Why? The input as typed is irrelevant to the macro, the only way I can see this causing a problem is when debugging a macro.

It occurs to me that there are two different ways to think of custom derive:

  • as a special thing that works just like built-in derive, but is not part of the compiler
  • just like any other macro, but applied in a different way.

Depending on which of these views you subscribe to, I think being able to modify the source, and thus order-dependency seems completely wrong or completely natural.

There is also a perverse incentive here in that custom derive is going to be stable a long time before other procedural macros, so people are incentivised to do as much as possible with custom derive, rather than using a regular procedural macro where it would be more appropriate.

I think the macros 1.1 RFC was written very much from the second perspective (indeed, originally it was not meant to be limited purely to custom derive). Partly because it was intended to be a solution for macros rather than for derive, we have not really discussed what custom derive should look like (indeed, in the past we have tossed around ideas about declarative custom derive, rather than a procedural macro solution).

3 Likes

To brainstorm a library solution to the specific problematic behavior around attribute stripping, how about the following?

#[derive(Serialize, ElasticType)]
struct S {
    #[serde(...)]
    x: X
}

gets expanded to ->

impl Serialize /* ... */

#[derive(ElasticType)]
#[derive(Strip_serde_derive)]
#[strip = "serde"]
struct S {
    #[serde(...)]
    x: X
}

gets expanded to ->

impl Serialize /* ... */
impl ElasticType /* ... */

#[derive(Strip_serde_derive)]
#[strip = "serde"]
#[derive(Strip_elastic_types_derive)]
#[strip = "serde"]
struct S {
    #[serde(...)]
    x: X
}

gets expanded to ->

impl Serialize /* ... */
impl ElasticType /* ... */

struct S {
    x: X
}

Explanation

This is basically equivalent to registering post-expansion passes in Syntex. The two custom derives each append their own cleanup action after all other attributes.

It can’t be a shared #[derive(Strip)] because:

  • If it is provided by a separate strip-attrs crate, the user’s code would need to be responsible for importing it: #[macro_use] extern crate strip_attrs.
  • If it is provided separately by serde_derive and elastic_types_derive that would be a conflict.

So they each provide a #[derive(Strip)] tagged with their own crate name. I can throw together a helper crate that provides the implementation so all we would need is:

#[proc_macro_derive(Strip_elastic_types_derive)]
pub fn strip(input: TokenStream) -> TokenStream {
    strip_attrs::strip(input)
}
1 Like

I think that's the key point.

I've been viewing derives as just like the internal ones, that derive behaviour from the input. It seems more intuitive to new users. You might not need to worry about what you type being what a macro gets unless you do need to debug them. But it could be a hard job to try and untangle a derived mess if you do end up with one.

1 Like

@dtolnay that seems like a good solution to the attributes problem with the tools we have now.

So to clarify my position, I’m not really a fan of derives being ordered because I found it unexpected and think it will be difficult to debug. I could live with it if it was clearly communicated that this is how they work (in places the average new user would look), and macros that mutated their input instead of just deriving new behaviour were recommended to put that in their docs.

This leads me to two questions:

  1. Is it worth having a separate syntax for derive macros that may modify the type?

This would provide more transparency to people using #[derive(…)] about what may, or may not be happening under the covers. It also means the order of simple derive macros is no issue (something which being used to the current built-in derives, I currently assume).

  1. Should there be a unique syntax for ‘marker’’ attributes, if it’s currently intended for custom attributes to not simply be no-ops?

The #[strip] attribute is the best idea I’ve seen for dealing with attributes required for multiple derive functions. However having a separate syntax for marker attributes would mean that they no longer technically need to be stripped.

The library I’m writing will have three optional derives, that all use a single attribute. Besides making all three required, and used in a specific order, I’m not sure how I’ll handle the attribute stripping (for now I’m testing with the custom_attributes feature on nightly, so I don’t need to think about it).

@KodrAus I put together a crate for the post-expansion solution to attribute stripping.

https://github.com/dtolnay/post-expansion

I would love a code review and I may have time after work to transition Serde over and benchmark.

I don't want to disrupt the more general "derives are ordered" discussion so let's follow up in GitHub issues if necessary.

1 Like

You’re a legend @dtolnay :+1:

I’ll give it a whirl and take a poke through.

Thank you so much - I wholehartedly second that.

I think the second view is just another pandoras box. And calling a mutating derive "sophisticated" sounds like a bad joke - at least to me. I would rather call it a nightmare the derive that eats your socks or potential PITA.

Although I am repeating a lot of my arguments from the macros 1.1 discussion. The thing is, I expect way more decorating ops on the average struct/enum than mutating ops.

What I really would love to see (at least in the far far future) is a clear separation of decorating operations which behave in a read-only way and are independent of order on one side and mutating big-caliber stuff on the other side. .. and derive is just the perfect candidate for this read-only/decorating stuff. I imagine this separation should lead to much more composable "macro" ecosystem or whatever you want to call it - derives that don't bite each other.

Allowing big-caliber stuff in custom derives only to clean up some attributes and keeping things simplistic seems like a short-term shortcut with quite some long-term burden.

A concerned Rustacean.