Pre-RFC: Generic parameters in derive

Here's something to ponder on between your RustFest sessions.

Summary

Add the ability to pass generic parameters of the impl to derive macros, greatly increasing the flexibility of the derive attribute.

Motivation

Derive macros are a very convenient way to generate trait impls based on the definition item of a type. However, #[derive(Trait)] cannot be used when the impl must have generic parameters that need to be defined and bound in a more customized way than what the derive macro could generate automatically based on the definition item of the Self type.

Consider The Most Annoying Problem of #[derive(Clone)]:

#[derive(Clone)]
pub struct WaitingForGodot<T> {
    // ...
    _phantom_godot: PhantomData<T>
}

The use of derive here is often a convenient pitfall that generates this impl:

impl<T: Clone> Clone for WaitingForGodot<T> {
    //  ^---- Oops, did not really need this bound
    // ...
}

This can be easily solved by customizing the impl parameter:

#[derive(<T> Clone)]
pub struct WaitingForGodot<T> {
    // ...
    _phantom_godot: PhantomData<T>
}
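
For comparison, here is a hand-written sketch of the impl that such a derive would presumably have to produce. The `act` field and the `NotClone` type are made up for illustration (the original fields are elided), and this is only one plausible expansion, not what any existing macro generates:

```rust
use std::marker::PhantomData;

// A type that deliberately does not implement Clone.
struct NotClone;

pub struct WaitingForGodot<T> {
    act: u32, // hypothetical field standing in for the elided ones
    _phantom_godot: PhantomData<T>,
}

// No `T: Clone` bound: the body never clones a value of type T.
impl<T> Clone for WaitingForGodot<T> {
    fn clone(&self) -> Self {
        WaitingForGodot {
            act: self.act,
            _phantom_godot: PhantomData,
        }
    }
}

fn main() {
    let a: WaitingForGodot<NotClone> = WaitingForGodot {
        act: 2,
        _phantom_godot: PhantomData,
    };
    // Compiles even though NotClone is not Clone.
    let b = a.clone();
    assert_eq!(b.act, 2);
}
```

With the standard #[derive(Clone)], the call to `a.clone()` would be rejected because `NotClone: Clone` does not hold.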

More traits could be made conveniently derivable with custom generics than is feasible now:

use derive_unpin::Unpin;

#[derive(<St: Unpin, F> Unpin)]
pub struct MyFold<St, F> {
    #[unsafe_pinned]
    stream: St,
    #[unsafe_unpinned]
    op: F,
}
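
As a rough idea of what the generated item might look like, the impl below is written by hand with exactly the generics given in the attribute. The helper `assert_unpin` is made up for the check, and the field attributes are omitted because they belong to the not-yet-existing macro:

```rust
use std::marker::PhantomPinned;

pub struct MyFold<St, F> {
    stream: St,
    op: F,
}

// Only `St` is bounded; `F` is left unbounded on purpose.
// An explicit impl replaces the automatically inferred one.
impl<St: Unpin, F> Unpin for MyFold<St, F> {}

// Hypothetical helper: compiles only if T is Unpin.
fn assert_unpin<T: Unpin>() {}

fn main() {
    // PhantomPinned is !Unpin, yet MyFold is still Unpin
    // because the impl does not require `F: Unpin`.
    assert_unpin::<MyFold<u32, PhantomPinned>>();
}
```

Writing this impl is safe by itself, but it is only sound if no code ever pin-projects to the `op` field, which is what the `#[unsafe_unpinned]` marker is meant to convey.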

In tandem with more elaborate helper attributes, it could be even more powerful:

// A not-yet-written library providing the derive macro
use async_state_machine::Future;
use futures::future::{TryFuture, IntoFuture, MapOk};

#[derive(
    <
        Fut1: TryFuture,
        Fut2: TryFuture<Error = Fut1::Error>,
        F: FnOnce(<Fut1 as TryFuture>::Ok) -> Fut2,
    > Future
)]
enum AndThen<Fut1, Fut2, F> {
    First(MapOk<Fut1, F>),
    #[after(First)]
    #[future(output)]
    Then(IntoFuture<Fut2>),
}

Guide-level explanation

The trait name in a derive attribute can be adorned with generic parameters that specify the generics of the generated impl item:

#[derive(<T: Bound1, U: Bound2> Frob<T>)]
struct Foo<U> {
    // ...
}

The derive macro for Frob is expected to generate an implementation item with these generic parameters:

impl<T: Bound1, U: Bound2> Frob<T> for Foo<U> {
    // ...
}
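
To make the shape of that impl concrete, here is a compilable sketch. Only the names Frob, Foo, Bound1 and Bound2 come from the text above; the trait methods, the `inner` field and the impls for String and u32 are assumptions for illustration:

```rust
// Hypothetical trait definitions; the proposal does not specify them.
trait Bound1 {
    fn label(&self) -> String;
}
trait Bound2 {
    fn weight(&self) -> u32;
}
trait Frob<T> {
    fn frob(&self, with: &T) -> String;
}

struct Foo<U> {
    inner: U,
}

// T appears only in the trait arguments, U only in the Self type.
impl<T: Bound1, U: Bound2> Frob<T> for Foo<U> {
    fn frob(&self, with: &T) -> String {
        format!("{}:{}", with.label(), self.inner.weight())
    }
}

impl Bound1 for String {
    fn label(&self) -> String {
        self.clone()
    }
}
impl Bound2 for u32 {
    fn weight(&self) -> u32 {
        *self
    }
}

fn main() {
    let foo = Foo { inner: 7u32 };
    assert_eq!(foo.frob(&String::from("godot")), "godot:7");
}
```

Note that T is not a parameter of Foo at all, so a derive macro looking only at the struct definition has no way to know about it.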

Reference-level explanation

The syntax of an item in the derive attribute is extended to a subset of the language that can occur in a trait implementation item between the keywords impl and for:

DeriveItem :
   Generics? TypePath

The procedural macro can optionally support generic parameters to derive by defining an entry point annotated with the proc_macro_derive_with_generics attribute:

extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro_derive_with_generics(Frob)]
pub fn derive_frob_with_generics(
    generics: TokenStream,
    trait_args: Option<TokenStream>,
    item: TokenStream,
) -> TokenStream {
    // ...
}

Invoked in the example above, the function will receive the token stream of <T: Bound1, U: Bound2> as the first argument, a Some value with the token stream of <T> as the second argument, and the token stream with the struct Foo item as the third.
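
How the three arguments fit back together can be mocked with plain strings. This is not the proc_macro API: `impl_header` is a made-up helper, and a real macro would work on token streams, but the assembly order is the same:

```rust
// Hypothetical string-based stand-in for the three macro inputs.
fn impl_header(
    generics: &str,
    trait_name: &str,
    trait_args: Option<&str>,
    self_ty: &str,
) -> String {
    // A trait without arguments contributes nothing after its name.
    let args = trait_args.unwrap_or("");
    format!("impl{} {}{} for {}", generics, trait_name, args, self_ty)
}

fn main() {
    let header = impl_header(
        "<T: Bound1, U: Bound2>",
        "Frob",
        Some("<T>"),
        "Foo<U>",
    );
    assert_eq!(header, "impl<T: Bound1, U: Bound2> Frob<T> for Foo<U>");
    println!("{}", header);
}
```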

If the compiler does not find a matching proc_macro_derive_with_generics symbol in the procedural macro crate that it has resolved for a derive item that features generics, an error is reported stating that the macro does not support generics. A plain old derive item can be processed with a function annotated as proc_macro_derive_with_generics if no function is annotated as proc_macro_derive for the same trait, otherwise the other function gets called.
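
The selection rule in the paragraph above can be sketched as a small decision function. The names `Dispatch` and `select_entry_point` are invented for this sketch, and the `NoMacro` case (neither entry point exists) is an assumption not covered by the text:

```rust
#[derive(Debug, PartialEq)]
enum Dispatch {
    Plain,               // call the proc_macro_derive function
    WithGenerics,        // call the proc_macro_derive_with_generics function
    UnsupportedGenerics, // error: the macro does not support generics
    NoMacro,             // assumption: no entry point found at all
}

fn select_entry_point(
    item_has_generics: bool,
    has_plain: bool,
    has_with_generics: bool,
) -> Dispatch {
    match (item_has_generics, has_plain, has_with_generics) {
        // A derive item with generics needs the extended entry point.
        (true, _, true) => Dispatch::WithGenerics,
        (true, _, false) => Dispatch::UnsupportedGenerics,
        // A plain derive item prefers the plain entry point if present.
        (false, true, _) => Dispatch::Plain,
        (false, false, true) => Dispatch::WithGenerics,
        (false, false, false) => Dispatch::NoMacro,
    }
}

fn main() {
    assert_eq!(select_entry_point(true, true, false), Dispatch::UnsupportedGenerics);
    assert_eq!(select_entry_point(false, true, true), Dispatch::Plain);
    assert_eq!(select_entry_point(false, false, true), Dispatch::WithGenerics);
}
```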

Drawbacks

This extension complicates the syntax of the derive attribute.

Rationale and alternatives

Extending derive this way, we can solve its current shortcomings and open it to more uses and experimentation.

Everything proposed here is also possible to implement with custom attribute macros. But this would unnecessarily multiply mechanisms for generating a trait implementation for a type. Plugging into a well-defined syntax of the derive attribute would make the macro more convenient for the users and may be more friendly to automatic analysis than freeform attribute macros.

Prior art

The author is not aware of metaprogramming facilities in other languages that are sufficiently similar to Rust procedural macros and derive in particular.

Within Rust, the author has implemented support for generic trait impl parameters in a custom attribute macro, before realizing that being able to plug it into derive would make it more intuitive to the users and take away some of the parsing complexity from the macro.

Unresolved questions

  • Is it advisable, or even possible syntactically, to extend the general derive syntax with optional generics for each comma-separated item, or should this be only permitted as an alternative form of derive with a single item? An alternative combining syntax #[derive(<T: Bound> Trait1 + Trait2 + Trait3)] is also possible, either as a single item or part of a comma-separated list.
  • Should it be permitted to have two derive macros in scope for the same trait, one with a proc_macro_derive_with_generics entry point and the other with a plain proc_macro_derive?

Future possibilities

Extending derive with generics would open this language extension mechanism to far wider use and experimentation than what is possible today; the motivation section provides only a few profitable examples.

Cc @dtolnay

I would prefer not to do this. Once we're talking about invocations like the following, you should just handwrite the impl. Not every impl needs to be derived, and we shouldn't aspire to that.

#[derive(
    <
        Fut1: TryFuture,
        Fut2: TryFuture<Error = Fut1::Error>,
        F: FnOnce(<Fut1 as TryFuture>::Ok) -> Fut2,
    > Future
)]
enum AndThen<Fut1, Fut2, F> {
    First(MapOk<Fut1, F>),
    #[after(First)]
    #[future(output)]
    Then(IntoFuture<Fut2>),
}

For something like:

#[derive(<St: Unpin, F> Unpin)]
pub struct MyFold<St, F> {

I would instead expect some kind of inert attribute to override whatever bounds are inferred by the macro.

#[derive(Unpin)]
#[unpin(where St: Unpin)]
pub struct MyFold<St, F> {

FWIW, this is pretty close to the syntax I suggested here:

Although apart from syntax, you've also chosen a bit different meaning in how to apply those generics. I'm not sure if that would be more or less palatable to the language team.

Regarding fanciful attribute syntax, the following would be the most readable to me:

#[derive(Future) where
    Fut1: TryFuture,
    Fut2: TryFuture<Error = Fut1::Error>,
    F: FnOnce(<Fut1 as TryFuture>::Ok) -> Fut2,
)]
enum AndThen<Fut1, Fut2, F> {

#[derive(Unpin) where St: Unpin]
pub struct MyFold<St, F> {

Why not fix #[derive] so that it generates this instead?

impl<T> Clone for WaitingForGodot<T>
where /* each field type: Clone, */ PhantomData<T>: Clone {
    // ...
}

After all, this is what the impl body relies on, not on T itself being Clone.
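
For what it's worth, a bound on the field type is accepted by the compiler today when written by hand. A minimal sketch, with a made-up `act` field and `NotClone` type, and only the PhantomData field bounded to keep it short:

```rust
use std::marker::PhantomData;

// A type that deliberately does not implement Clone.
struct NotClone;

pub struct WaitingForGodot<T> {
    act: u32, // hypothetical field standing in for the elided ones
    _phantom_godot: PhantomData<T>,
}

// The bound is on the field type, not on T itself.
impl<T> Clone for WaitingForGodot<T>
where
    PhantomData<T>: Clone,
{
    fn clone(&self) -> Self {
        WaitingForGodot {
            act: self.act,
            _phantom_godot: self._phantom_godot.clone(),
        }
    }
}

fn main() {
    let a: WaitingForGodot<NotClone> = WaitingForGodot {
        act: 3,
        _phantom_godot: PhantomData,
    };
    // PhantomData<NotClone> is Clone, so this is accepted.
    assert_eq!(a.clone().act, 3);
}
```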

You would still sometimes want custom bounds even if derive did what you suggest, so this is mostly off topic for this thread. In any case the answer is:

  • That potentially violates private-in-public rules
  • That breaks mutually recursive types
  • That doesn't behave well when lifetimes are involved.

More info: Correct bounds processing for field types · Issue #370 · dtolnay/syn · GitHub

The last closing parenthesis should not be there, I think.

The current where syntax does not allow unbounded parameters (Edit: wrong, see below), which are needed for the Clone fix (where you need to erase a bound that would otherwise be assumed by the macro) or, potentially, an unbounded parameter on the trait alone.

#[derive(<T> Clone)]
#[derive(<T, U> Frob<U>)]
pub struct WaitingForGodot<T> {
    // ...
    _phantom_godot: PhantomData<T>
}

The parameter could be considered unbounded if it is omitted from the where clause, but otherwise present on the type (where it might have a different bound?) or in the trait arguments. I'm not sure that would be clear enough.

There will be an "anything goes" bound when the never type is stabilized, but I think it would be more of a wart than the angle bracket syntax: T: From<!>

In this particular example, the Future impl written by hand would be mostly tedious and somewhat error prone, as it involves first branching over variants under a Pin to poll them, then maybe replacing the pinned value with something that comes out of the poll. With the extended derive, you just write the bounds, sprinkle attributes around the state variants, and maybe define a state transition helper trait impl where an attribute wasn't enough. I have a library in the making that does it with a custom attribute macro, and the results are marvellous; I should publish it when it's ready enough.

That helper attribute would have to be implemented for many traits in the same or nearly same way. I think the opt-in support in the derive macro itself is a better alternative from the maintainability POV.

Strange how relevant RFCs do not pop out on you in the search results when you are looking (with one eye, in a rush to publish) for them... This proposal is an alternative to adding attributes for the type or its fields, which set bound rules for all derive macros:

This proposal avoids some of the issues that are currently unresolved in RFC 2353, and solves others: extending the derive syntax instead would enable case-by-case control over the derived impl's bounds in a way that is, IMHO, more readable, while (if the compact form #[derive(<T: Bound> Trait1 + Trait2 + Trait3)] is allowed) not being much more verbose.

Cc @Centril

In discussion on @cuviper's proposal, it's been suggested that proc_macro_derive could be flexible enough to deal with the extended function signature as well. Perhaps this should be used in this proposal, though I'm not entirely comfortable with the signature of two-maybe-three TokenStream parameters by itself meaning "it does generics".

I was wrong about this one, but it would look a little unusual:

#[derive(Unwrap) where St: Unwrap, F:]
struct MyFold<St, F> {
    // ...
}

I have posted the RFC, incorporating some of the ideas brought up in the discussion. Thanks to everyone who contributed.
