Pre-RFC: User-provided default field values

The text of this RFC was taken in large part from @Centril's larger RFC here. Permission was previously granted to use the text as part of RFC 3107; I presume that the permission would apply here as well.

Aside from the ordinary feedback expected when a pre-RFC is posted, I am additionally looking for any prior art that may exist, such as from other languages.


  • Feature Name: derive_manual_default
  • Start Date: 2021-12-29
  • RFC PR: TODO
  • Rust Issue: TODO

Summary

Users can provide default values for individual fields when deriving Default on structs, thus avoiding the need to write a manual implementation of Default when the type's default is insufficient.

#[derive(Default)]
struct Window {
    width: u16 = 640,
    height: u16 = 480,
}

assert_eq!(Window::default(), Window { width: 640, height: 480 });

Motivation

Boilerplate reduction

The #[derive(..)] ("custom derive") mechanism works by defining procedural macros. Because they are macros, they operate on abstract syntax and don't have more information available. Therefore, when you #[derive(Default)] on a data type definition such as:

#[derive(Default)]
struct Foo {
    bar: u8,
    baz: String,
}

it only has the immediate "textual" definition available to it.

Because Rust currently does not have an in-language way to define default values, you cannot #[derive(Default)] in the cases where you are not happy with the default values that each field's type provides. By extending the syntax of Rust such that default values can be provided, #[derive(Default)] can be used in many more circumstances and thus boilerplate is further reduced.

Usage by other #[derive(..)] macros

Custom derive macros exist that have a notion of or use default values.

serde

For example, the serde crate provides a #[serde(default)] attribute that can be used on structs and fields. This will use the field's or type's Default implementations. This works well with field defaults: serde can either continue to rely on Default implementations, in which case this RFC facilitates specification of field defaults, or it can directly use the default values provided in the type definition.

structopt

Another example is the structopt crate with which you can write:

#[derive(StructOpt)]
#[structopt(name = "example", about = "An example of StructOpt usage.")]
struct Opt {
    #[structopt(short = "s", long = "speed", default_value = "42")]
    speed: f64,
}

By having default field values in the language, structopt could let you write:

#[derive(StructOpt)]
#[structopt(name = "example", about = "An example of StructOpt usage.")]
struct Opt {
    #[structopt(short = "s", long = "speed")]
    speed: f64 = 42.0,
}

derive_builder

A third example comes from the crate derive_builder. As the name implies, you can use it to #[derive(Builder)] for your types. An example is:

#[derive(Builder)]
struct Lorem {
    #[builder(default = "42")]
    pub ipsum: u32,
}

This can similarly be simplified to:

#[derive(Builder)]
struct Lorem {
    pub ipsum: u32 = 42,
}

Conclusion

As the previous sections show, allowing default field values in the language itself, rather than making #[derive(Default)] more magical, lets user-space custom derive macros make use of them as well.

Guide-level explanation

Deriving Default

Previously, you might have instead implemented the Default trait like so:

impl Default for Probability {
    fn default() -> Self {
        Self(0.5)
    }
}

However, since you can specify f32 = 0.5 in the definition of Probability, you can take advantage of that to write the simpler and more idiomatic:

#[derive(Default)]
pub struct Probability(f32 = 0.5);

Having done this, a Default implementation equivalent to the former will be generated for you.

Default field values are const contexts

When you provide a default value field: Type = value, the given value must be a constant expression such that it is valid in a const context. Therefore, you cannot write something like:

fn launch_missiles() -> Result<(), LaunchFailure> {
    authenticate()?;
    begin_launch_sequence()?;
    ignite()?;
    Ok(())
}

struct BadFoo {
    bad_field: u8 = {
        launch_missiles().unwrap();
        42
    },
}

Since launching missiles interacts with the real world and has side effects in it, it is not possible to do that in a const context since it may violate deterministic compilation.
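For contrast, here is a minimal sketch of what a const context already accepts in today's Rust, assuming field defaults would be checked by the same rules as a const item. The names default_area and DEFAULT_AREA are hypothetical and only serve the illustration.

```rust
// A `const fn` can be evaluated at compile time, so a call to it is a
// valid constant expression. (`default_area` is a hypothetical helper.)
const fn default_area(width: u32, height: u32) -> u32 {
    width * height
}

// A `const` item is a const context in current Rust. The assumption here
// is that `field: u32 = default_area(640, 480)` would be held to the
// same rules under this proposal.
const DEFAULT_AREA: u32 = default_area(640, 480);

fn main() {
    assert_eq!(DEFAULT_AREA, 307_200);
}
```

Calling a non-const function such as launch_missiles() in the same position is rejected by the compiler, which is the behavior the BadFoo example relies on.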

Reference-level explanation

Grammar

The following grammars shall be extended to support default field values:

TupleField = attrs:OuterAttr* vis:Vis? ty:Type { "=" def:Expr }?;
RecordField = attrs:OuterAttr* vis:Vis? name:IDENT ":" ty:Type { "=" def:Expr }?;

Defining defaults

Given a field where the default is specified, i.e. either:

RecordField = attrs:OuterAttr* vis:Vis? name:IDENT ":" ty:Type "=" def:Expr;
TupleField = attrs:OuterAttr* vis:Vis? ty:Type "=" def:Expr;

both of the following rules apply when type-checking:

  1. The expression def must be a constant expression.
  2. The expression def must unify with the type ty.

When lint attributes such as #[allow(lint_name)] are placed on a field, they also apply to def if it exists.

#[derive(Default)]

When generating an implementation of Default, the compiler shall emit an expression where the user-provided default value is used to initialize the field, rather than the default value of the field's type. For example,

#[derive(Default)]
struct Window {
    width: u16 = 640,
    height: u16 = 480,
}

shall generate

impl ::core::default::Default for Window {
    fn default() -> Self {
        Self {
            width: 640,
            height: 480,
        }
    }
}

All fields where the default is not specified will remain initialized to their type's default value.
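The mixed case can be sketched with a hand-written implementation in current Rust. This is only an approximation of what the derive might emit; the title field is a hypothetical addition to the Window example, and the proposed syntax appears only in comments because it does not compile today.

```rust
#[derive(Debug, PartialEq)]
struct Window {
    width: u16,    // proposed: `width: u16 = 640`
    height: u16,   // proposed: `height: u16 = 480`
    title: String, // no default specified (hypothetical extra field)
}

// Hand-written stand-in for the implementation the derive would generate.
impl ::core::default::Default for Window {
    fn default() -> Self {
        Self {
            // User-provided defaults are used verbatim.
            width: 640,
            height: 480,
            // Fields without a specified default fall back to the
            // `Default` implementation of their type.
            title: ::core::default::Default::default(),
        }
    }
}

fn main() {
    let w = Window::default();
    assert_eq!(w.width, 640);
    assert_eq!(w.height, 480);
    assert_eq!(w.title, String::new());
}
```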

Bounds

The bounds of the generated Default implementation are not affected by this RFC.
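To make this concrete: #[derive(Default)] currently adds a Default bound on every type parameter of the struct, and that stays the same whether or not some field has a user-provided default. A hand-expanded sketch, where Buffer and its fields are hypothetical names:

```rust
#[derive(Debug, PartialEq)]
struct Buffer<T> {
    value: T,
    capacity: usize, // proposed: `capacity: usize = 16`
}

// Hand-written stand-in for the derived implementation. The `T: Default`
// bound is present exactly as it would be without any field defaults.
impl<T: Default> Default for Buffer<T> {
    fn default() -> Self {
        Self {
            value: T::default(),
            capacity: 16,
        }
    }
}

fn main() {
    let b: Buffer<String> = Buffer::default();
    assert_eq!(b.value, String::new());
    assert_eq!(b.capacity, 16);
}
```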

Drawbacks

  • This integrates the concept of a default value into the language itself, rather than solely existing in the standard library. Note that this does not necessitate making the Default trait a lang item.

Rationale and alternatives

This proposed syntax is, to an extent, expected: there is currently a diagnostic in place indicating that this syntax isn't supported. One alternative is to extend the #[default] attribute introduced in RFC 3107 to accept default values. The drawback of this is that the interaction with a future possibility from RFC 3107 is not clear:

#[derive(Default)]
enum Foo {
    #[default = …] // What should this be? `Foo::Bar { .. }`? Just `Bar { .. }`? Neither?
    Bar {
        a: u8,
        b: u8,
    },
    Baz {
        a: u8,
        b: u8,
    },
}

Provided associated items as precedent

While Rust does not have any support for default values for fields or for runtime parameters of functions, the notion of defaults is not foreign to Rust as a language feature.

Indeed, it is possible to provide default function bodies for fn items in trait definitions. For example:

pub trait PartialEq<Rhs: ?Sized = Self> {
    fn eq(&self, other: &Rhs) -> bool;

    fn ne(&self, other: &Rhs) -> bool { // A default body.
        !self.eq(other)
    }
}

In traits, const items can also be assigned a default value. For example:

trait Foo {
    const BAR: usize = 42; // A default value.
}
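As a small runnable illustration of how such a default behaves, an implementation may either inherit the provided value or override it. The names Limits, UsesDefault and Overrides are hypothetical.

```rust
trait Limits {
    // A default value, used by any impl that does not override it.
    const MAX: usize = 42;

    // A default body that reads whichever value the impl ends up with.
    fn max(&self) -> usize {
        Self::MAX
    }
}

struct UsesDefault;
struct Overrides;

// Inherits `MAX = 42` and the default `max` body.
impl Limits for UsesDefault {}

// Overrides only the constant; the default `max` body picks it up.
impl Limits for Overrides {
    const MAX: usize = 7;
}

fn main() {
    assert_eq!(UsesDefault.max(), 42);
    assert_eq!(Overrides.max(), 7);
}
```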

Thus, to extend Rust with a notion of field defaults is not an entirely alien concept.

Prior art

???

Unresolved questions

  • None so far.

Future possibilities

  • The derived Default implementation could be made const so long as types without a specified default value impl const Default.
  • Support could be extended to the #[default] variant of enums once non-unit variants are supported.
18 Likes

I feel like there's no rationale given here for why the limitation to constant expressions only is in place. You mention this simplifies things like derive_builder or structopt, yet those do currently of course support side-effects in those default values, so it won't be able to replace their API entirely. They would probably need to support multiple ways then: the new, neater-looking field: Type = value and their previous approach using an attribute. The derive_builder default value expressions even support fallible expressions using ? to propagate an error.


I'm wondering what putting these default values on a struct does when there aren't any derive macros (that care about those default values). If they're only for derive macros, then perhaps a derive macro would need to opt in to allowing this new syntax, similar to how it currently works with derive macro helper attributes? Then the Rust compiler could emit an error for code like

#[derive(Debug)]
struct Foo {
    x: i32 = 0,
}

where the = 0 part would be completely useless. A derive macro would opt into this by using some attribute, e.g. something like

//                          vvvvvvvvvvvvvvvvvvvv--- explicit opt-in
#[proc_macro_derive(Foobar, default_field_values, attributes(baz, qux))]
pub fn derive_foobar(_item: TokenStream) -> TokenStream {
    ....
}

Slight caveat: if these default values ever do get any meaning beyond just being usable for macros, then such an opt-in would become entirely irrelevant (though I suppose keeping a deprecated and meaningless default_field_values argument for proc_macro_derive is not too bad).

4 Likes

In what situation is a default value having side effects desirable? A default should also never be fallible I would imagine; if it can then it hardly has a sensible default.

I've no doubt you realize this, but for those that don't care enough to read the full RFC, as currently written it's permitted without any semantic meaning. I wouldn't be opposed to emitting an unused code warning in the same manner as anything else; it's certainly desirable. I don't see why it needs to be a hard error, though.

I think the warning is based on spans, which would mean any macro emitting code with a span of the default field value would mark the code "used". This would avoid the need to have an explicit annotation on the proc macro. I could be mistaken and haven't checked a proc macro in this scenario on the code I have locally.

1 Like

I was mainly just pointing out something I spotted in the documentation of derive_builder; I cannot comment on how useful this is.

struct Lorem {
    ipsum: String,
    // Custom defaults can delegate to helper methods
    // and pass errors to the enclosing `build()` method via `?`.
    #[builder(default = "self.default_dolor()?")]
    dolor: String,
}

Well, (at least currently) const doesn't support allocations; default values for Box<...> fields, for example, can make a lot of sense. In any case, AFAICT the const constraint made sense in the original RFC, where these values were used to allow constructing these structs without explicitly populating the fields that have default values. However, as long as the use-case is only in macros, this limitation becomes questionable, at least in principle. One potential reason I can see to keep it const is future-compatibility, should something like the RFC from Centril be wanted in the future. In any case, it feels to me like this RFC needs to add at least some explanation for why the constraint to const expressions is made, and maybe also highlight the possibility of not placing this restriction as an alternative under "Alternatives".

6 Likes

If it is not required to be const, the following questions arise in my mind (they were forming, then were resolved when I read that bit, but I agree that it should be explicit):

  • In what context is the default value evaluated?
    • That is, if it panics, what does the backtrace look like at runtime?
    • Would struct LogCtx { file: &'static str = file!(), line: u32 = line!() } "make sense" or be fairly useless? If the former, what rules are deferring the macro expansion? If the latter, would this be desirable behavior to have?
  • What names are in scope in these expressions? If macros are deferred so that LogCtx works as one would desire, does the caller need to import names expanded by these macros? Do the macros themselves need to be manually use'd?

There are pieces of others bouncing around in my head, but they haven't coalesced.

1 Like

In case we would eventually get more features from the proposed RFC of Centril, then I'd imagine that we might eventually want that code like

#[derive(Debug)]
struct Foo {
    field: i32 = panic!(),
}

causes a compilation error, similar to

const FOO: i32 = panic!();

assuming that the default values would be treated like actual constants.

With the RFC as proposed here, I suppose we'd be deciding that code like this does not cause a compilation error? Or, actually, I'm not sure, since you aren't all that specific about what enforcing the rule that "The expression def must be a constant expression" actually entails.


This thought also just brought me to the question of how this feature is going to interact with generics. I'd assume that generic arguments are... probably... not in scope, so e.g. something like

#[derive(Default)]
struct Foo<T> {
    value: T,
    size: usize = std::mem::size_of::<T>(),
}

won't work. Or would it? This deserves some clarification in the RFC though, I guess.


Edit: Taking constants in traits for comparison, something like

trait Foo {
    const X: usize;
}

impl Foo for ()  {
    const X: usize = panic!();
}

with the constant never used doesn't actually fail, and generic arguments can be used as in

trait Foo<T> {
    const SIZE: usize = std::mem::size_of::<T>();
}
1 Like

...and also requiring const is a smaller more incremental step, right? Centril's RFC was good btw..

It's unspecified in the same way that const _: T = panic!(); is not actually guaranteed to abort compilation (though in practice it always does).


After some minor discussion on Zulip in addition to here, I'm going to go ahead and change the RFC to accept an arbitrary expression (that needn't be const). A potential future RFC for #[derive(const Default)] or similar is where we're headed I believe.

As for generics, it seems sensible to permit it. I'll make an explicit note of this.

4 Likes

Please make sure to (at least indirectly) also address whether or not (ignoring usages in derived trait implementations, or when there isn't a derive in the first place)

  • the expression is checked for whether everything is in scope,
  • the expression is type-checked,
  • the expression allows for control-flow stuff like ? or return or break (without a containing function/try-block/loop).
2 Likes

I just want to note that allowing = def as dead code with no effect is almost certainly not forward compatible with using the values by anything other than newly added macros, as it would turn dead code into functional code. (This includes macros adding new support for the syntax.)

As such, I'm really against this form; I expect something closer to Centril's RFC, and would prefer to stay forwards compatible with it.

5 Likes

In that situation how could this be made forward compatible? @steffahn mentioned the possibility of requiring an explicit annotation that a macro uses it, which would presumably permit rejecting the new syntax if nothing needed it. That doesn't seem ideal, so is there anything else?

Personally, I'm +½ on fully adopting the Foo { x: 3, .. } syntax for construction. It certainly wouldn't be as easy to implement, but it's still doable. If others are on board with that, then I can update the RFC to that effect. And if this is desired, is maintaining the const context important given the implicit nature of the syntax?

1 Like

In my opinion, yes; any side effects originating in a .. seem somewhat questionable. At least const is a very reasonable restriction here, and in case it turns out it's too strict and not needed, it can always be lifted later.

(If it's lifted, note that a crate adding a field without a const-compatible default value to one of its structs would be a new instance of (possibly easy to miss) semver-breaking changes that one would need to look out for, since a user initializing a const item with something like const X: Foo = Foo { field: 42, .. }; would break over such a change.)

6 Likes

I don't see that as a problem. And also, if we will adopt Centril's RFC, won't it be the same?

I don't see why we should typecheck it at all. It can become valid syntax with no meaning that is open for macro use (though I agree with @steffahn that it's better if we require macro authors to specify a dependency on it).

But I have another problem with this proposal: the syntax can be used only once. If you use it for #[derive(Default)], for example, you can't use it for anything else, say #[serde(default)].

And it means one of two things:

  • You fall back to existing mechanisms like attributes.
  • You don't need different expressions, because they should be the same. But then there isn't much of advantage in this RFC, since you can just express all of them by a manually-implemented Default, and manually implementing one trait is not such a big deal.

It took me a minute to realize that this is not the case as well. No, it's not the same, because a derive has to be explicitly included whereas Centril's RFC is native syntax that's always present.

It's semantically a default value for a field of a given type. Isn't that reason enough?

Why?

It will make it possible to use them for Centril's RFC, but that's only a symptom. The general problem will stay: we cannot change an existing feature to support default values. That is only possible until we stabilize, or for new features.

Semantically.

And the context varies. When #[derive(Default)] pastes it into the body of fn default() -> Self, it will be typechecked there. But there can be other contexts. For example, what if some framework allows you to provide an async/fallible default?

I don't understand your question. Suppose I want to provide default value of 1 to the Default trait and 2 for serde. I can't.

Serde could still have an attribute for that special case, while using the common syntax for the easy case where all default-using derives need the same value. (Serde would still need the attribute for backward compatibility anyway, they simply need to pick the correct priority rule.)


That said, starting to support the common syntax might be a breaking change for a crate, since it could be present for another derive and subtly change behavior.

#[derive(changing_crate::WasUsingDefaultTrait, other_crate::AlreadyUsingSyntax)]
struct X {
    x: u8 = 1,
}

Whoops, WasUsingDefaultTrait now uses 1 instead of 0. (The above case was misleading before the change in changing_crate, but I believe it is not considered to be important for the definition of “breaking change”?)

I explicitly pointed that out:


This was already said:

cc @ekuber , since I'd still like to see exactly what was worked out in their previous thread

2 Likes

I don't see what the value of this is. Why is there need for custom syntax if attributes can achieve the same thing?

AFAIK non-string attributes are already stable, so the author of structopt could just update the crate so that it accepts non-string expressions for defaults as well, and the implementation of #[derive(Default)] could do the same. There is absolutely zero justification for separate syntax here.

3 Likes