Pre-RfC: #[auto_derive] for generating #[deriving(...)] decorators


#1

Summary

Add an #[auto_derive] attribute that allows the programmer to generate custom #[deriving(...)] implementations. In most cases the compiler will try to figure out the “scheme” for deriving on its own; the programmer can specify it explicitly if necessary.

Motivation

#[deriving] is a very powerful tool in Rust, one that is used all over the place. However, it’s restricted to a handful of library traits at the moment. There is some reusable code for defining your own ones (as done here), but it requires one to write libsyntax code and is rather patchy.

Detailed design

Mostly, this will be used by simply tacking #[auto_derive] onto a trait definition. At compile time, the compiler will try to figure out how to auto derive the trait and make a #[deriving(MyTrait)] available to the rest of the code.

For example (assuming Clone didn’t already have built-in auto-deriving functionality), the following would work:

#[auto_derive]
pub trait Clone {
  fn clone(&self) -> Self;
}

// Elsewhere in the code
#[deriving(Clone)]
struct Foo {bar: u8} // Assuming `Clone` is already implemented on u8
let foo = Foo {bar:1u8};
foo.clone();

Of course, for auto-deriving to work all fields of the datatype should implement that trait.

Now, there are many ways to fold up the results of methods called on child fields to get the definition of the method on the parent type. How does the compiler know which one to pick?

We solve this by introducing deriving schemes. There are three schemes: the constructor scheme, the visitor scheme, and the general “folding scheme”.

Schemes apply to trait methods and define how they expand. A single trait with multiple methods can have a different scheme for each one, e.g.:

trait Foo {
 fn create() -> Self; // Constructor scheme used
 fn visit(a: uint); // Visitor scheme used
 #[derive_scheme(...)]
 fn encode(); // custom folding scheme used
 fn bar(&self) {baz(self)} // Implementation exists, no need for a scheme
}

## The visitor scheme

This is the simplest scheme, and will be used for methods with the signature fn foo(/* zero or more arguments */) -> (). The arguments may include self, but not an argument with a type containing Self in its path[1]. The implementation of the method for the parent datatype simply calls the method on each of its fields with the same arguments.

This works for the library traits Hash and Eq (Eq has a hidden, inline, no-op method assert_receiver_is_total_eq(&self) -> () used for deriving), as well as JSTraceable in Servo.

Auto-deriving for empty traits like Copy/Send/Sync can also be made to work with this (even though the concept of schemes is meant to apply to individual methods, not traits as a whole).

Example:

#[auto_derive]
trait JSTraceable {
 fn trace(&self, trc: *mut JSTracer);
}

#[deriving(JSTraceable)]
struct Element {id: u8, name: String}

generates

impl JSTraceable for Element {
  fn trace(&self, trc: *mut JSTracer) {
    self.id.trace(trc);
    self.name.trace(trc)
  }
}
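
To make the expansion concrete, here is a rough, runnable sketch in present-day Rust syntax. Everything in it is an assumption made for illustration: Trace is a hypothetical stand-in for JSTraceable, the Vec<String> log replaces the real tracer, and the impl for Element is written by hand to mimic what the visitor scheme would generate.

```rust
// Hypothetical stand-in for JSTraceable: each leaf appends a line to a log
// instead of talking to a real tracer.
trait Trace {
    fn trace(&self, log: &mut Vec<String>);
}

impl Trace for u8 {
    fn trace(&self, log: &mut Vec<String>) {
        log.push(format!("u8:{}", self));
    }
}

impl Trace for String {
    fn trace(&self, log: &mut Vec<String>) {
        log.push(format!("String:{}", self));
    }
}

struct Element {
    id: u8,
    name: String,
}

// Hand-written version of what the visitor scheme would generate:
// forward the same arguments to every field, in declaration order.
impl Trace for Element {
    fn trace(&self, log: &mut Vec<String>) {
        self.id.trace(log);
        self.name.trace(log);
    }
}

// Small helper so the behaviour is easy to inspect.
fn trace_element(id: u8, name: &str) -> Vec<String> {
    let element = Element { id, name: name.to_string() };
    let mut log = Vec::new();
    element.trace(&mut log);
    log
}

fn main() {
    println!("{:?}", trace_element(7, "div"));
}
```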

## The constructor scheme

This scheme is for methods which return Self (it can be easily extended to methods which return Option<Self> and Result<Self,E> as well, though for now I don’t think that is necessary). Such a method can have any number of arguments. The arguments may include self, but not an argument with a type containing Self in its path[1].

### For non-static methods

Call the method on all of the fields (with the arguments if they exist), then construct the result using the individual results.

Example:

#[auto_derive]
trait Incrementable {
  fn add_one(&self) -> Self;
}

#[deriving(Incrementable)]
struct Element {id: u8, large_id: u64}
 // u8/u64 already have an implementation of Incrementable

#[deriving(Incrementable)]
enum Foo {
 Bar(u8),
 Baz(u64)
}

will generate

impl Incrementable for Element {
 fn add_one(&self) -> Element {
  Element {
    id: self.id.add_one(),
    large_id: self.large_id.add_one()
  }
 }
}

impl Incrementable for Foo {
 fn add_one(&self) -> Foo {
  match *self {
   Bar(i) => Bar(i.add_one()),
   Baz(s) => Baz(s.add_one())
  }
 }
}

This could be used for the existing library trait Clone.
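
For anyone who wants to play with this, the following is a self-contained sketch of the same expansion in present-day Rust syntax, with the leaf impls filled in. The leaf impls are my assumption (wrapping on overflow is an arbitrary choice), and the impls for Element and Foo are hand-written to match what the constructor scheme would generate.

```rust
trait Incrementable {
    fn add_one(&self) -> Self;
}

// Assumed leaf implementations; wrapping on overflow is an arbitrary choice.
impl Incrementable for u8 {
    fn add_one(&self) -> u8 {
        self.wrapping_add(1)
    }
}

impl Incrementable for u64 {
    fn add_one(&self) -> u64 {
        self.wrapping_add(1)
    }
}

#[derive(Debug, PartialEq)]
struct Element {
    id: u8,
    large_id: u64,
}

#[derive(Debug, PartialEq)]
enum Foo {
    Bar(u8),
    Baz(u64),
}

// Struct case: call the method on every field, then rebuild the struct.
impl Incrementable for Element {
    fn add_one(&self) -> Element {
        Element {
            id: self.id.add_one(),
            large_id: self.large_id.add_one(),
        }
    }
}

// Enum case: `self` tells us which variant to rebuild.
impl Incrementable for Foo {
    fn add_one(&self) -> Foo {
        match self {
            Foo::Bar(i) => Foo::Bar(i.add_one()),
            Foo::Baz(s) => Foo::Baz(s.add_one()),
        }
    }
}

fn main() {
    println!("{:?}", Element { id: 1, large_id: 41 }.add_one());
    println!("{:?}", Foo::Baz(9).add_one());
}
```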

### For static methods

This will mostly work the same way as the scheme for non-static methods; however, it won’t be allowed on enums. The reason is that for non-static methods we can pick an enum variant with match *self, but a static method has no self to match on, so we can’t do anything similar here.[2]

This can be used for the existing library traits Bounded, Default, Zero and Rand.
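
A minimal sketch of the static case, again in present-day Rust syntax. Zeroed is a hypothetical trait invented for this example. The struct impl is what the scheme would generate; the enum impl shows why the compiler cannot pick a variant on its own, with Bar chosen by hand the way a #[default_variant] marker (footnote 2) might.

```rust
// Hypothetical trait with a static constructor-style method.
trait Zeroed {
    fn zeroed() -> Self;
}

impl Zeroed for u8 {
    fn zeroed() -> u8 {
        0
    }
}

impl Zeroed for u64 {
    fn zeroed() -> u64 {
        0
    }
}

#[derive(Debug, PartialEq)]
struct Element {
    id: u8,
    large_id: u64,
}

// Struct case: call the static method once per field type and assemble
// the result. The field types drive which impl gets called.
impl Zeroed for Element {
    fn zeroed() -> Element {
        Element {
            id: Zeroed::zeroed(),
            large_id: Zeroed::zeroed(),
        }
    }
}

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Foo {
    Bar(u8),
    Baz(u64),
}

// Enum case: there is no `self` to match on, so a variant has to be chosen
// by some other means. Here `Bar` is picked by hand, as if it had been
// marked with the #[default_variant] attribute floated in footnote 2.
impl Zeroed for Foo {
    fn zeroed() -> Foo {
        Foo::Bar(Zeroed::zeroed())
    }
}

fn main() {
    println!("{:?}", Element::zeroed());
    println!("{:?}", Foo::zeroed());
}
```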

## Folding schemes

We’ve covered schemes for most of the existing auto-derived library traits; however, a couple are left which don’t yet have any way to easily generate auto-deriving code. (The motivation: for a design to be a good one, it should at least work with most of the existing use cases.) The major ones are Encodable, PartialEq, PartialOrd, and Show.

We (@eddyb and I) haven’t really come up with any concrete proposal for this; hopefully we can hammer something out here on Discourse.

At its simplest, a custom folding scheme provides a #[derive_scheme(Fold(...))] annotation on the method in question. It can be applied to a method which would normally pick up the visitor/constructor scheme, as well as to an arbitrary method. The compiler will throw an error if one uses #[auto_derive] on a trait where it is unable to figure out the scheme for one or more methods (and no scheme is provided for them).

For example, PartialEq could be as follows:

trait PartialEq {
  #[derive_scheme(Fold($_ &&))]
  fn eq(&self, other: &Self) -> bool;
  #[derive_scheme(Fold($_ ||))]
  fn ne(&self, other: &Self) -> bool;
}

The $_ means “the result of calling the method on a child field”. The results are folded using the stuff inside Fold, and placed in the method body.
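
As a rough sketch of what Fold($_ &&) and Fold($_ ||) might expand to for a struct, written in present-day Rust syntax. The trait and methods are renamed FieldEq, field_eq and field_ne here (hypothetical names) purely so the example does not clash with the standard PartialEq; the impl for Point is hand-written, not generated.

```rust
// Hypothetical stand-in for the PartialEq trait from the example above.
trait FieldEq {
    fn field_eq(&self, other: &Self) -> bool;
    fn field_ne(&self, other: &Self) -> bool;
}

// Assumed leaf implementation.
impl FieldEq for u8 {
    fn field_eq(&self, other: &u8) -> bool {
        *self == *other
    }
    fn field_ne(&self, other: &u8) -> bool {
        *self != *other
    }
}

struct Point {
    x: u8,
    y: u8,
}

impl FieldEq for Point {
    // Fold($_ &&): per-field results joined with `&&`.
    fn field_eq(&self, other: &Point) -> bool {
        self.x.field_eq(&other.x) && self.y.field_eq(&other.y)
    }

    // Fold($_ ||): per-field results joined with `||`.
    fn field_ne(&self, other: &Point) -> bool {
        self.x.field_ne(&other.x) || self.y.field_ne(&other.y)
    }
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = Point { x: 1, y: 3 };
    println!("eq: {}, ne: {}", a.field_eq(&b), a.field_ne(&b));
}
```

Note that the Self argument is destructured field by field (other.x, other.y), which is exactly the step that has no obvious counterpart for enums.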

The example above makes a pitfall obvious: this, again, has no way of dealing with enums when there is a Self argument. While Self struct arguments can be destructured, enum variants may not all have the same contents. We could provide an EnumNonMatchingVal(...) option to #[derive_scheme(...)] to solve this, but it may not address all cases.
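
To illustrate the enum problem, here is a hand-written sketch (present-day Rust syntax) of what an expansion with something like EnumNonMatchingVal(false) could look like. FieldEq, field_eq and Shape are hypothetical names, and the choice of false for mismatched variants is my assumption about how such an option would be used for equality.

```rust
// Hypothetical equality trait, reduced to one method for brevity.
trait FieldEq {
    fn field_eq(&self, other: &Self) -> bool;
}

// Assumed leaf implementation.
impl FieldEq for u8 {
    fn field_eq(&self, other: &u8) -> bool {
        *self == *other
    }
}

enum Shape {
    Circle(u8),
    Rect(u8, u8),
}

impl FieldEq for Shape {
    fn field_eq(&self, other: &Shape) -> bool {
        match (self, other) {
            // Same variant on both sides: fold the per-field results with `&&`.
            (Shape::Circle(a), Shape::Circle(b)) => a.field_eq(b),
            (Shape::Rect(a1, a2), Shape::Rect(b1, b2)) => {
                a1.field_eq(b1) && a2.field_eq(b2)
            }
            // Different variants: nothing to fold, so fall back to the value
            // an option like EnumNonMatchingVal(false) would supply.
            _ => false,
        }
    }
}

fn main() {
    let same = Shape::Rect(2, 3).field_eq(&Shape::Rect(2, 3));
    let mixed = Shape::Circle(2).field_eq(&Shape::Rect(2, 3));
    println!("same: {}, mixed: {}", same, mixed);
}
```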

This solution is also not sufficient for more complex cases like Show and Encodable. These handle different layouts differently. Show would be something like

pub trait Show {
    #[derive_scheme(
        TupleLike { try!(fmt.write_str(concat!($_name, "("))); try!($_); fmt.write(b")") }
        StructLike { try!(fmt.write_str(concat!($_name, "{"))); try!($_); fmt.write(b"}") }
        UnitLike { fmt.write_str($_name) }
        AnonymousList { try!($_);try!(fmt.write_str(",")) }
        NamedList { try!(fmt.write_str(concat!($_name, ":")));try!($_);try!(fmt.write_str(",")) }
    )]
    fn fmt(&self, fmt: &mut Formatter) -> Result;
}
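
To get a feel for what that scheme would produce, here is a hand expansion of the StructLike and NamedList rules for a small struct. This is only a sketch under assumptions: it uses present-day Rust (std::fmt::Display and the ? operator in place of Show and try!), and Point is a made-up type.

```rust
use std::fmt;

struct Point {
    x: u8,
    y: u8,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // StructLike, opening part: type name followed by "{".
        f.write_str(concat!("Point", "{"))?;

        // NamedList, once per field: field name, ":", the field itself, ",".
        f.write_str(concat!("x", ":"))?;
        fmt::Display::fmt(&self.x, f)?;
        f.write_str(",")?;

        f.write_str(concat!("y", ":"))?;
        fmt::Display::fmt(&self.y, f)?;
        f.write_str(",")?;

        // StructLike, closing part.
        f.write_str("}")
    }
}

fn main() {
    println!("{}", Point { x: 1, y: 2 });
}
```

The trailing comma after the last field falls straight out of the NamedList rule as written, which is one of the small things a real design would have to decide on.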

One may also need FoldWrapper(...) to do some setup/cleanup outside of the folding.

As we can see, this gets complicated fast. Hoping to see some better designs or improvements to the existing one here. (Alternatively, only support constructor and visitor schemes, which drastically simplifies things, but then we need to keep using libsyntax magic for Encodable et cetera.)

Drawbacks

  • Might be unnecessarily complicated. (?)
  • The folding schemes bit has lots of pitfalls which might not be possible to solve

Alternatives

Nothing yet

Unresolved questions

  • Namespacing. We might need to specify full paths in #[deriving(...)], however I’m told that nmatsakis has some stuff in mind for resolving paths during macro expansion
  • The entire folding schemes bit

1. This is because while we can destructure Self for structs and feed it to the method calls on the children, there’s nothing we can do like this for enums. The proposal could be extended to support auto-deriving for methods with Self arguments which errors if the auto-deriving thing is used for anything other than a struct or enum without fields.

2. Similar reasons as (1). If we want to, we can extend this RfC to provide a #[default_variant] attribute to be placed on enum variants to make it easier to choose enum variants for static constructor-scheme auto deriving.


#2

Copying my prior-art comment from Reddit:


Haskell supports a form of generics that allows the following (with some effort on the part of the trait author, as in this RFC):

class Serialize a where
  put :: a -> [Bit]
 
  default put :: (Generic a, GSerialize (Rep a)) => a -> [Bit]
  put a = gput (from a)

  get :: [Bit] -> (a, [Bit])
 
  default get :: (Generic a, GSerialize (Rep a)) => [Bit] -> (a, [Bit])
  get xs = (to x, xs')
    where (x, xs') = gget xs

-- ...stuff...
 
data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
  deriving (Generic, Show)
 
instance (Serialize a) => Serialize (UserTree a)

which ‘derives’ an impl of the custom Serialize trait for the custom UserTree type. (In Rust terminology.)


#3

Also, having a pseudo-Rust derive_scheme sublanguage just for defining derivings seems rather suboptimal; in my mind something like this would preferably either reuse the macro-by-example functionality, or just be proper Rust.
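
For what it's worth, a very rough sketch of the macro-by-example route for the visitor scheme, in present-day macro_rules! syntax. All names here (Trace, traced_struct!, the Vec<String> log) are hypothetical, and the macro has to wrap the whole struct definition because a plain macro cannot inspect a type declared elsewhere.

```rust
// Hypothetical visitor-style trait.
trait Trace {
    fn trace(&self, log: &mut Vec<String>);
}

impl Trace for u8 {
    fn trace(&self, log: &mut Vec<String>) {
        log.push(format!("u8:{}", self));
    }
}

impl Trace for String {
    fn trace(&self, log: &mut Vec<String>) {
        log.push(format!("String:{}", self));
    }
}

// Declares the struct and, alongside it, a visitor-scheme impl that
// forwards the call to every field in declaration order.
macro_rules! traced_struct {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        struct $name {
            $($field: $ty),*
        }

        impl Trace for $name {
            fn trace(&self, log: &mut Vec<String>) {
                $( self.$field.trace(log); )*
            }
        }
    };
}

traced_struct! {
    struct Element { id: u8, name: String }
}

fn main() {
    let element = Element { id: 7, name: "div".to_string() };
    let mut log = Vec::new();
    element.trace(&mut log);
    println!("{:?}", log);
}
```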


#4

Yeah, the macro comment on Reddit seems like a good way to do it.