Pre-RFC: `#[derive(Default)]` on enums with a `#[default]` attribute

I've extracted and slightly modified some text from Centril's draft RFC from a while back that proposed this and much more. He has granted permission for me to use the text.


  • Feature Name: derive_enum_default
  • Start Date: 2021-04-07
  • RFC PR: TODO
  • Rust Issue: TODO

Summary

An attribute #[default], usable on enum variants, is introduced, thereby allowing enums to work with #[derive(Default)].

#[derive(Debug, Default, PartialEq)] // Debug and PartialEq are needed by assert_eq!
enum Foo {
    #[default]
    Alpha(u8),
    Beta,
    Gamma,
}

assert_eq!(Foo::default(), Foo::Alpha(0));

The #[default] attribute may not be used on a variant that is also declared #[non_exhaustive].

Motivation

#[derive(Default)] in more cases

Currently, #[derive(Default)] is not usable for enums. To rectify this situation, a #[default] attribute is introduced that can be attached to variants. This allows you to use #[derive(Default)] on enums, so you can now write:

// from time
#[derive(Default)]
enum Padding {
    Space,
    Zero,
    #[default]
    None,
}

Clearer documentation and more local reasoning

Providing good defaults when such exist is part of any good design that makes a physical tool, UI design, or even data-type more ergonomic and easily usable. However, that does not mean that the defaults provided can just be ignored and that they need not be understood. This is especially the case when you are moving away from said defaults and need to understand what they were. Furthermore, it is not too uncommon to see authors writing in the documentation of a data-type that a certain value is the default.

All in all, the defaults of a data-type are therefore important properties. By encoding the defaults right where the data-type is defined, gains can be made in terms of readability, particularly with regard to the ease of skimming through code. In particular, it is easier to see what the default variant is if you can directly look at the rustdoc page and read:

#[derive(Default)]
enum Foo {
    #[default]
    Bar {
        alpha: u8,
    },
    Baz {
        beta: u16,
        gamma: bool,
    }
}

This way, you do not need to open up the code of the Default implementation to see what the default variant is.

Guide-level explanation

The ability to add default values to fields of enum variants does not mean that you can suddenly #[derive(Default)] on the enum. A Rust compiler will still have no idea which variant you intended as the default. This RFC adds the ability to mark one variant with #[default]:

#[derive(Default)]
enum Ingredient {
    Tomato,
    Onion,
    #[default]
    Lettuce,
}

Now the compiler knows that Ingredient::Lettuce should be considered the default and will accordingly generate an appropriate implementation of Default for Ingredient:

impl Default for Ingredient {
    fn default() -> Self {
        Ingredient::Lettuce
    }
}

Note that after any cfg-stripping has occurred, it is an error to have #[default] specified on more than one variant.

Due to the potential of generated bounds becoming more restrictive with an additional field, the #[default] and #[non_exhaustive] attributes may not be placed on the same variant.

Reference-level explanation

#[default] on enums

A built-in attribute #[default] is provided by the compiler and may be legally placed solely on exhaustive enum variants. The attribute has no semantics on its own. Placing the attribute on anything else will result in a compilation error. Furthermore, if the attribute occurs on more than one variant of the same enum data-type after cfg-stripping and macro expansion is done, this will also result in a compilation error.

#[derive(Default)]

Placing #[derive(Default)] on an enum named $e is permissible iff that enum has some variant $v with #[default] on it. In that event, the compiler shall generate an implementation of Default where the function default is defined as (where $f_i denotes a vector of the fields of $e::$v):

fn default() -> Self {
    $e::$v { $f_i: Default::default() }
}
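
As a concrete illustration of the expansion above, here is a minimal sketch of what the derive would be expected to produce for an enum whose tagged variant has named fields. The impl is written by hand, so it only approximates the proposed output; the names Shape, Circle, and Rect are hypothetical and do not come from the RFC text.

```rust
// Hypothetical enum; under this proposal `Circle` would carry `#[default]`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Shape {
    Circle { radius: u32, filled: bool },
    Rect { width: u32, height: u32 },
}

// Hand-written equivalent of the proposed expansion:
// every field of the tagged variant is set to `Default::default()`.
impl Default for Shape {
    fn default() -> Self {
        Shape::Circle {
            radius: Default::default(),
            filled: Default::default(),
        }
    }
}
```

With this impl, Shape::default() yields the Circle variant with each field at its own default value.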

Generated bounds

To avoid needlessly strict bounds, only the types present in the tagged variant's fields shall be bound by Default in the generated code.

#[derive(Default)]
enum Option<T> {
    #[default]
    None,
    Some(T),
}

would generate:

impl<T> Default for Option<T> {
    fn default() -> Self {
        Option::None
    }
}

while placing the #[default] attribute on Some(T) would instead generate:

impl<T> Default for Option<T> where T: Default {
    fn default() -> Self {
        Option::Some(Default::default())
    }
}
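
The practical difference between the two sets of bounds can be sketched with hand-written impls. This is only an approximation of the proposed derive output, and the names NoDefault, Maybe, and Wrapped are hypothetical stand-ins rather than anything from the RFC.

```rust
// A type that deliberately does not implement `Default`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct NoDefault;

// Hypothetical stand-in for `Option<T>` with `Nothing` as the tagged variant.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Maybe<T> {
    Nothing,
    Just(T),
}

// No `T: Default` bound: the tagged variant has no field of type `T`.
impl<T> Default for Maybe<T> {
    fn default() -> Self {
        Maybe::Nothing
    }
}

// Hypothetical enum whose tagged variant holds a `T`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Wrapped<T> {
    Value(T),
    Empty,
}

// Here `T: Default` is required, because the field itself must be defaulted.
impl<T> Default for Wrapped<T>
where
    T: Default,
{
    fn default() -> Self {
        Wrapped::Value(Default::default())
    }
}
```

Maybe::<NoDefault>::default() compiles because the impl carries no bound on T, whereas Wrapped::<NoDefault>::default() would be rejected since NoDefault does not implement Default.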

Interaction with #[non_exhaustive]

The Rust compiler shall not permit #[default] and #[non_exhaustive] to be present on the same variant. Any variant not designated #[default] may be #[non_exhaustive], as can the enum itself.

Drawbacks

The usual drawback of increasing the complexity of the language applies. However, the degree to which complexity is increased is not substantial. One notable change is the addition of an attribute for a built-in #[derive], which has no precedent.

Rationale

The inability to derive Default on enums has been noted on a number of occasions, with a common suggestion being to add a #[default] attribute (or similar) as this RFC proposes.

Bounds being generated based on the tagged variant is necessary to avoid overly strict bounds. If this were not the case, the previous example of Option<T> would require T: Default even though it is unnecessary because Option::None does not use T.

Prohibiting #[non_exhaustive] variants from being tagged with #[default] is necessary to avoid the possibility of a breaking change when additional fields are added. If this were not the case, the following could occur:

A definition of

#[derive(Default)]
enum Foo<T> {
    #[default]
    #[non_exhaustive]
    Alpha,
    Beta(T),
}

which would not have any required bounds on the generated code. If this were changed to

#[derive(Default)]
enum Foo<T> {
    #[default]
    #[non_exhaustive]
    Alpha(T),
    Beta(T),
}

then any code where T: !Default would now fail to compile.

Alternatives

One alternative is to permit the user to declare the default variant in the derive itself, such as #[derive(Default(VariantName))]. This has the disadvantage that the variant name is present in multiple locations in the declaration, increasing the likelihood of a typo (and thus an error).

Another alternative is assigning the first variant to be default when #[derive(Default)] is present. This may prevent a #[derive(PartialOrd)] on some enums where order is important (unless the user were to explicitly assign the discriminant).
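
To illustrate the ordering concern, here is a small sketch using hypothetical enums named Priority and ReorderedPriority. Without explicit discriminants, the derived PartialOrd follows declaration order, so moving a variant to the front purely to make it the default also changes how the variants compare.

```rust
// Declaration order defines the derived ordering: Low < Medium < High.
#[derive(Debug, PartialEq, PartialOrd)]
enum Priority {
    Low,
    Medium,
    High,
}

// Same variants, but `Medium` was moved first so that it could serve as
// an implicit "first variant" default. The derived ordering changes too.
#[derive(Debug, PartialEq, PartialOrd)]
enum ReorderedPriority {
    Medium,
    Low,
    High,
}
```

In the second enum, Medium now compares as less than Low, which is unlikely to be what the author intended.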

Prior art

Procedural macros

There are a number of crates which to varying degrees afford macros for default field values and associated facilities.

#[derive(Derivative)]

The crate derivative provides the #[derivative(Default)] attribute. With it, you may write:

#[derive(Derivative)]
#[derivative(Default)]
enum Foo {
    #[derivative(Default)]
    Bar,
    Baz,
}

Contrast this with the equivalent in the style of this RFC:

#[derive(Default)]
enum Foo {
    #[default]
    Bar,
    Baz,
}

Like this RFC, derivative allows you to derive Default for enums. The syntax used in the macro is #[derivative(Default)], whereas this RFC provides the more ergonomic and direct notation #[default].

#[derive(SmartDefault)]

The smart-default crate provides the #[derive(SmartDefault)] custom derive macro. It functions similarly to derivative but is specialized for the Default trait. With it, you can write:

#[derive(SmartDefault)]
enum Foo {
    #[default]
    Bar,
    Baz,
}
  • The same syntax #[default] is used both by smart-default and by this RFC. While it may seem that this RFC was inspired by smart-default, this is not the case. Rather, this notation has been independently thought of on multiple occasions. That suggests that the notation is intuitive and a solid design choice.

  • There is no trait SmartDefault even though it is being derived. This works because #[proc_macro_derive(SmartDefault)] is in fact not tied to any trait. That #[derive(Serialize)] refers to the same trait as the name of the macro is from the perspective of the language's static semantics entirely coincidental.

    However, for users who aren't aware of this, it may seem strange that SmartDefault should derive for the Default trait.

Unresolved questions

  • Should the generated bounds be those required by the tagged variant or those of the union of all variants? This matters for enums similar to Option<T>, where the default is Option::None — a value that does not require T: Default.

    Resolved in favor of requiring only the types in the tagged variant to be bound by Default.

Future possibilities

The #[default] attribute could be extended to override otherwise derived default values, such as

#[derive(Default)]
struct Foo {
    alpha: u8,
    #[default = 1]
    beta: u8,
}

which would result in

impl Default for Foo {
    fn default() -> Self {
        Foo {
            alpha: Default::default(),
            beta: 1,
        }
    }
}

being generated.

Alternatively, dedicated syntax could be provided as proposed by @Centril:

#[derive(Default)]
struct Foo {
    alpha: u8,
    beta: u8 = 1,
}

If consensus can be reached on desired bounds, there should be no technical restrictions on permitting the #[default] attribute on a #[non_exhaustive] variant.

How does this interact with generics, and in particular, generics that are not the type of any field of the default variant?

If generics work as they do with existing derive macros then the Option example does not work as intended, requiring a T: Default precondition on the impl.

This makes me think that this RFC would significantly benefit from some way to manually specify the generic bounds for the derived implementation. And perhaps one could also discuss a relaxation of those bounds to take the fields into consideration when the type is not pub (since in that case there are no concerns of inadvertently breaking semver compatibility or leaking implementation details).

On second thought, leaking implementation details or making breaking changes is not a concern for structs or enums without any private fields or non_exhaustive attributes. Since enums never have private fields, the field types of enums could reasonably be considered for Default implementations, at least in the absence of non_exhaustive attributes. Such a change could also be made to deriving Default on structs without private fields and to other derive macros.


And some things that I noticed:

  • your Option example is missing the derive attribute
  • you don't clarify whether the default attribute can be used on enums that don't have a #[derive(Default)]

I would expect the bounds to be the same as how they are currently on structs. That is to say that the bounds would exist even if not strictly necessary for the declared default. I am aware that this means the Option example isn't ideal — I should change it.

Differentiating between pub and non-pub enums would be quite confusing. Making something public in and of itself should not open the door to potential breakage, in my opinion.

Nice catch. I'll be sure to add the #[derive(Default)] when I'm on my laptop next. The attribute should absolutely only be allowed with #[derive(Default)], as it would be that derive macro providing the attribute (as "normal" proc macros are able to do).

Yeah, that’s the strongest argument against something like this. My answer you quoted also included the suggestion of some explicit way to provide generic bounds. Similarly, one could also require an explicit annotation for a field-based implementation (and that could even be limited to the cases without leakage or potential for breakage that I explained above). Some code examples are in this comment from an earlier thread

One alternative worth mentioning, though I don't know if it's an improvement: rather than marking one enum variant with #[default], you could write the name of that variant in the derive: #[derive(Default(SomeVariant))].

Either way, this seems reasonable to me.

I thought of that as well, but given the strong precedent for having a #[default] attribute (and multiple people independently coming up with it), it just feels right. I'll add it as a possible alternative, noting a drawback that there's a higher potential for failure as typos are a thing.

imo that's probably best suited for a separate proposal, as it would very much be applicable to structs as well. My goal was to keep this RFC a small ergonomic improvement that pretty much everyone could agree on. Having a way to explicitly declare bounds would be great, though! I'll add it to future possibilities.

Is there any particular reason or specific use case for using the #[default] attribute on any variant besides the first in the enum?

That is to say: perhaps it's sufficient for the first variant to always be the implicit default and an explicit annotation isn't necessary.

There can be, for ergonomic reasons:

#[derive(Default)]
enum Foo {
    Bar,
    Baz,
    #[default]
    Quux,
    Quuz,
}

in order to write it using the first variant, one would have to write:

#[derive(Default)]
enum Foo {
    Quux = 2,
    Bar = 0,
    Baz,
    Quuz = 3, // (or move all the variants that were after `Quux` back after the new location of `Quux`)
}

which is also fragile to refactoring (and, imho, not necessarily the most intuitive / readable syntax)

Sometimes enum variants are "grouped" by purpose, and the first one isn't what's desired. Another instance is where enum variants have an inherent order.

I have an enum SubsecondDigits which has variants for 1-9 (written out) and "one or more". While I decided to have one or more be the default, choosing nine would have been more than reasonable. Having SubsecondDigits::One be placed before SubsecondDigits::Nine seems logical, even if the former isn't the default.

How would enum variants with payload work?

As another piece of prior art, in the num_enum crate, used for converting between primitive enums and primitives, we added a default attribute to allow for exhaustive From/Into implementations - it would be lovely to use a standard one which can be shared across crates, rather than each implement our own distinct from each other.

In the obvious way, by applying #[derive(Default)] semantics to the logical struct around the payload. (IOW each field is defaulted via Default::default().)

The order is also relevant for derived Ord implementations.

I have updated the original post to account for the feedback so far. You can see the changes by viewing the edit history.

I wouldnʼt have thought of it as a “union of variants” but more of a “just put bounds on all of the generic parameters no matter what the variants and their fields are”. Isnʼt that whatʼs happening for structs as well? The fields/variants simply donʼt matter at all. Notably, this always also includes other problems, e.g. if you have an Option<T> field and wouldnʼt need the T: Default for that reason, or if you are using an associated type of T through a trait and thus the generated Default implementation doesnʼt even compile successfully.
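
The Option<T> field case can be sketched as follows. The names NoDefault and Slot are hypothetical, and the impl is hand-written to show what a bound-free implementation looks like; the built-in derive on a generic struct adds a T: Default bound regardless of the field types.

```rust
// A type that deliberately does not implement `Default`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct NoDefault;

// Hypothetical generic struct whose only field is an `Option<T>`.
// `#[derive(Default)]` would add a `T: Default` bound here, although
// `Option<T>: Default` holds for every `T`.
#[derive(Debug, PartialEq)]
struct Slot<T> {
    value: Option<T>,
}

// Hand-written impl without any bound on `T`.
impl<T> Default for Slot<T> {
    fn default() -> Self {
        Slot { value: Default::default() }
    }
}
```

With the manual impl, Slot::<NoDefault>::default() is accepted even though NoDefault has no Default impl of its own.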

Why? The given argument (to allow altering the default variant) is very weak.

Personally I think it would be useful to solve derive(Default) for generic structs first (rust#26925; yes it's nearly six years old)!

Changing the wording is not a bad idea. Ultimately it's just me trying to figure out a way to word the behavior formally.

How so? Being able to change the default variant without potential breakage is a reasonable thing to do imo. I'm not necessarily opposed to it, but I think keeping this RFC small is best so as to hasten implementation and stabilization.

I'm not interested in discussing that issue here, as it's off-topic.

I think it's somewhat (but not completely) relevant because one of the options for solving it would be allowing one to specify bounds for the derive implementation... and you'd likely do that with another annotation. Just as this RFC would introduce an accompanying attribute for a built-in derive, so might the solution to the incorrect bounds issue.

More generally though, adding the ability to specify a default enum variant definitely seems like it's an invitation to add other annotations to help make the default derive more powerful.

I'm very much on board for adding the ability to specify a default enum value though.

Yes, and that should be brought up there. It's listed under future possibilities, which is the extent to which this is relevant for this RFC. Discussing the possibility of declaring explicit bounds is fine, but bringing up the issue regarding structs has nothing to do with this.