Pre-RFC: User-provided default field values

Responding en masse; no individual part of this response is directed towards a particular user unless someone is directly mentioned.

NB: I have not updated the pre-RFC since its posting. I am discussing what should happen before updating the text, which takes a fair amount of time and effort.


After reading all responses, I am leaning further in favor of the Foo { .. } style initialization present in Centril's RFC and will be responding as such. I suggest others follow suit as that's the direction I'm headed. The RFC text will be updated once there is general agreement on the main principles.

My current intention is to have the default field value be semantically the same as a const block, such that control flow is not supported, everything must be in scope, and the expression is type-checked against the declared field type. With Foo { .. } construction, there is no concern about the expression potentially being dead code. The requirement to be a const context implicitly prohibits async code, though I suppose a block_on call could potentially be supported in const contexts in the future.
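For comparison, the const-context requirement can be approximated on stable Rust today with an associated constant. This is only a sketch under my own assumptions: `Buffer`, `kib`, `named`, and `Buffer::DEFAULT` are hypothetical names, and the proposed `field: Ty = expr` syntax is deliberately not used.

```rust
/// Hypothetical helper; a `const fn` can be called from a const context.
const fn kib(n: usize) -> usize {
    n * 1024
}

#[derive(Debug, PartialEq)]
pub struct Buffer {
    pub capacity: usize,
    pub label: &'static str,
}

impl Buffer {
    // Each field expression is evaluated at compile time and
    // type-checked against the declared field type.
    pub const DEFAULT: Buffer = Buffer {
        capacity: kib(4),
        label: "scratch",
    };
}

impl Default for Buffer {
    fn default() -> Self {
        Self::DEFAULT
    }
}

/// Overrides one field and takes the rest from the constant.
pub fn named(label: &'static str) -> Buffer {
    Buffer { label, ..Buffer::DEFAULT }
}
```

The proposal would roughly move the expressions in `DEFAULT` onto the field declarations themselves, so that `Buffer { label, .. }` could stand in for the functional-update form.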

With regard to the discussion of how to support different defaults for different derives: why? What situation would necessitate a field to have a different default value depending on what is being derived? And is this a use case we want to encourage, or is requiring manual implementations or an attribute macro in such a situation (as is the status quo) preferable?

For syn supporting this syntax, it's unfortunate that the library doesn't appear to be forward compatible in this situation. That said, I don't believe we should be making decisions about what to include in a language primarily on the reason that a library can't support it, particularly when said library is capable of creating a breaking release while the language cannot (barring edition-related things). This wouldn't be the first syntax introduced that syn is incapable of handling, as #[foo = expr] currently cannot be parsed. While certainly relevant to the topic at hand, how crates such as syn, serde, etc. would handle the transition (if any) from existing attributes to built-in syntax doesn't seem related to the RFC. Examples are provided in the RFC as to what would theoretically be possible, not what the crates must or even should do; it's a decision for those crates to make on their own.

The syntax for default field values and #[derive(Default)] using them need to land on stable (but not necessarily nightly) at the same time. If syntax lands first, it would be a breaking change to implement the others. Foo { .. } construction would not need to land at the same time. That's why changes to #[derive(Default)] are (and will continue to be) included in the RFC.

One thing I thought of that is not covered in Centril's RFC is the construction of structs that have private fields (or that are non-exhaustive): there needs to be a way to guarantee that future fields have a default value (for external users). I'm not sure what form this should take, so ideas are welcome.

  • is this another change?
  • to allow construction of non_exhaustive structs outside of their crate?

Would it perhaps be better to leave things as they are?

  • non_exhaustive?
  • can't construct from outside, sorry

It's a specific question that is also noted in Centril's RFC. As #[non_exhaustive] is stable, it's a question that will need to be answered as part of this RFC. It's marked as an unresolved question because it's not addressed in the section on interaction with #[non_exhaustive].

Yes, and

is one possible answer. An author who wishes to permit such construction can drop non_exhaustive.

Keep in mind that #[non_exhaustive] was only part of the question I posed. There's still the related question of a struct like

pub struct Window {
    width = 640,
    height = 480,
}

It needs to be determined how a field without a defaulted value (like foo: T rather than foo: T = bar) can be added to a struct like this in a backwards-compatible manner. Barring an attribute that opts into this syntax, possible solutions likely won't be thought of in a minute or two; it'll need a fair amount of deliberate thought and consideration. Prohibiting situations like this would be extremely restrictive to the point that only exhaustive structs with all fields being public would be permissible, a.k.a. effectively nothing.

FYI, I brought this up mainly because you mentioned specific crates such as serde in the RFC, and also because the RFC was only about derive macros at that point. I didn't mean to say that this is a blocker in any form. It might be worth considering the point about syn if we're only talking about adding the = value syntax for use in derive macros, because then the feature is only about "hey, let's improve macros", so forcing a breaking change in syn (i.e. not an "improvement" for either macro authors or users of older, unmaintained macros) just for "improving" macros is, at least in principle, a bit questionable. But if the feature of using Foo { .. } construction is also on the table, a breaking change of syn is completely irrelevant for the debate on this language feature and its syntax.

Maybe syn itself could eventually even benefit from this feature, making use of default-initialized fields for structs such as syn::Field in order to avoid further breaking changes when new optional syntax is introduced to the language that requires new fields in such structs.

It was just brought to my attention that there was an RFC that was postponed a few years ago; it appears as though this was actually the basis for Centril's RFC.

This was the crux of that part of my response. I think in any situation it's important to keep crates in mind, but I don't think it should be a significant consideration.

To be perfectly honest, the order of things is coincidental.

Another point that's relevant here: syn itself says that breaking changes are still intended to happen at some point after 1.0:

This 1.0 release signifies that Syn is a polished library with a stable design and role and that it offers a user experience we can stand behind as the way we recommend for all Rustaceans to write procedural macros.

Be aware that the underlying Rust language will continue to evolve. Syn is able to accommodate most kinds of Rust grammar changes via the nonexhaustive enums and Verbatim variants in the syntax tree, but we will plan to put out new major versions on a 12 to 24 month cadence to incorporate ongoing language changes as needed.

(source)


I think I finally understood: the language needs three kinds of structs:

  1. cannot be constructed outside own crate so that

    • new private or
    • pub fields w/o defaults can be added
      and this is not a breaking change
  2. can be constructed outside own crate
    using the current S{a : b} syntax
    but adding new fields is a breaking change

  3. can be constructed outside own crate
    using the new S{a : b, ..} syntax
    but adding new fields without defaults is a breaking change

Currently Rust only has (1) and (2), which are distinguished via non_exhaustive. But there is no way around it, is there? Kind number (3) has to be provided, right? And it has to be marked somehow.


As I described above, in my view #[non_exhaustive] should be equivalent to having a private field without default value. So #[non_exhaustive] struct Foo {} or #[non_exhaustive] struct Bar { bar: i32 = 0 } does not permit being constructed via Foo { .. } or Bar { .. } or Bar { bar: 2, .. } outside of the crate that defines the type.

One reason: Currently a #[non_exhaustive] pub struct Foo {} is basically equivalent to a pub struct Foo { pub(crate) _dummy: () }; external users of the crate cannot construct this struct through its constructor. This fact can - in principle - even be relied on for soundness.
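A minimal sketch of the dummy-field pattern, using module privacy in a single file to stand in for the crate boundary (the module `registry`, the type `Initialized`, and both functions are hypothetical names of mine):

```rust
mod registry {
    /// The private unit field is the only thing stopping code outside
    /// this module from writing an `Initialized { .. }` literal.
    pub struct Initialized {
        _dummy: (),
    }

    /// The sole way for outside code to obtain an `Initialized`.
    pub fn init() -> Initialized {
        // Any one-time setup would run here.
        Initialized { _dummy: () }
    }

    /// May assume setup has happened, because a caller can only hold
    /// an `Initialized` that came from `init`.
    pub fn answer(_proof: &Initialized) -> u32 {
        42
    }
}
```

If `Initialized { .. }` became constructible from outside, `answer` could no longer rely on `init` having run, which is the soundness concern in a nutshell.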

Following my argument above, the way to do this would then be to add #[non_exhaustive] or, equivalently, a private field without a default value to the struct.


Either that, or by the presence of private fields.

Not necessarily marked by an attribute though; it could just be:

All structs that aren't #[non_exhaustive] and where all private fields have default values, but there is at least one private field (otherwise we're back in case (1)).

AFAICT; the main benefit of #[non_exhaustive] on a struct is that you don't need to have the overhead of explicitly initializing the otherwise necessary private pub(crate) _marker: () field. However when you'd add a private _marker: () = () field to get your struct from kind number (2) to kind number (3), then the overhead is just that you'll have to initialize it using S{a : b, ..} syntax yourself, too. Far less syntactical overhead.


All your points are true. Explicit marking however might

  • enhance learnability
  • improve error messages
  • lower cognitive load

I smell moving goalposts here. I am talking about providing defaults, and not about enforcing non-defaults.


From a bird's eye view, there is a more general pattern emerging here, too. People are apparently trying to jam so much arbitrary, diverse (should I call it incoherent?), and niche semantics into a single feature that it's becoming hard to see how throwing in everyone's needs and trying to please them all would work.

If one has complex requirements regarding the initialization of fields, then express the requirements explicitly in code. Make every field private, implement Default manually if applicable, or don't implement it at all, and provide a constructor which requires all non-defaultable fields to be passed.
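As a sketch of that approach on stable Rust (the `Connection` type, its fields, and the default values 5432 and 30 are made up for illustration):

```rust
pub struct Connection {
    host: String,      // required: no sensible default exists
    port: u16,         // defaultable
    timeout_secs: u64, // defaultable
}

impl Connection {
    /// The constructor demands every non-defaultable field up front.
    /// `Default` is intentionally not implemented, since `host` has
    /// no meaningful default.
    pub fn new(host: impl Into<String>) -> Self {
        Connection {
            host: host.into(),
            port: 5432,
            timeout_secs: 30,
        }
    }

    /// Optional override for a defaulted field.
    pub fn with_port(mut self, port: u16) -> Self {
        self.port = port;
        self
    }

    pub fn host(&self) -> &str {
        &self.host
    }

    pub fn port(&self) -> u16 {
        self.port
    }

    pub fn timeout_secs(&self) -> u64 {
        self.timeout_secs
    }
}
```

Usage would look like `Connection::new("db.local").with_port(6543)`; all fields stay private, so new ones can be added later without breaking callers.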

Don't try to force the semantics of arbitrarily complex initialization into the territory of "defaults". "Fields are required but not all of them, but we have a kinda-sorta default for some of them some of the time, and there is a Default impl that may or may not make sense semantically, and sometimes I really want a Builder" is a really, really poor design for Default.

Default was meant to be simple, like, "solves 95% of baseline problems with a derive macro" simple. But beyond that, there is the rest of the entire Rust language at everyone's disposal, too! Honestly, I simply cannot fathom why it is not good enough to derive Default (with explicit attributes on fields) in the majority of the cases, and then use the full power of the language when it gets more complicated than that. We have struct construction and FRU syntax, we have privacy, we have functions, we have methods, we have macros. What do they not solve beyond trivial shortening of 3 lines of code to 1 line?


non_exhaustive currently removes a ton of guarantees. I could see a future, though, where it has variants which remove fewer. For example (placeholder syntax!), we could have #[non_exhaustive(pub)] which works with FRU because it makes it a compile-time error to add any fields which aren't pub. Or #[non_exhaustive(default)] which means that any added fields will have defaults, and thus construction via struct literals with .. would be allowed.

I rather like that latter one, as it nicely parallels how you can pattern match non_exhaustive structs today.
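For reference, this is the pattern-matching side as it works on stable today (`Settings` and `retries_of` are hypothetical names). Within the defining crate the trailing `..` is optional; in a downstream crate it is mandatory for a #[non_exhaustive] struct:

```rust
#[non_exhaustive]
#[derive(Debug, Default)]
pub struct Settings {
    pub verbose: bool,
    pub retries: u32,
}

pub fn retries_of(settings: &Settings) -> u32 {
    // The `..` skips any fields not named here, including ones that
    // a future version of the defining crate might add.
    let Settings { retries, .. } = settings;
    *retries
}
```

The #[non_exhaustive(default)] placeholder would let the same `..` appear on the construction side, e.g. `Settings { retries: 3, .. }`.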


You know what, I've changed my opinion. Explicit marking seems rather beneficial both as

  • a way to opt-out of Foo { a, b } construction in a struct where all fields are public (i.e. as a way to avoid the pub(crate) _marker: () = () workaround)
  • a way to opt-in to usage of Foo { a, .. } in a struct that has some private fields (in order to avoid any possible confusion/mistake if someone adds default values to those fields just with the intention of either using a custom value for the derive(Default) implementation, or in order to use .. internally)

I would personally propose that the new marker would simply be a .. in the struct declaration itself. I.e.

pub struct Foo {
    pub a: u32,
    pub b: u32,
    ..
}

would no longer permit initialization with Foo { a, b } outside of the current crate (only Foo { a, b, .. } allowed), and

pub struct Bar {
    pub a: u32,
    b: u32 = 0,
}

would not allow initialization as Bar { a, .. } outside of the current module (due to the visibility of b), unless you change the declaration to

pub struct Bar {
    pub a: u32,
    b: u32 = 0,
    ..
}

Both could render nicely in rustdoc, too, the latter could just render as

pub struct Bar {
    pub a: u32,
    ..
}

because the private fields don't matter in the documentation.


Adding the .. marker to a struct with private fields without default value results in a compilation error.

This also means, it becomes impossible to accidentally break API by adding a private field without default value.


Note that that was what was originally proposed for non_exhaustive, but it became an attribute. So my default assumption here is that this desire should also be an attribute instead of the .. syntax.

But it's still possible to break it by adding a public field without a default value, so I'm unsure how much extra certainty that check would add.

  1. I don't appreciate the implication you're projecting on me.
  2. I sinned by reading this thread way too quickly while trying to catch up with the current conversation, accidentally assuming that the underlying desire was also to have partially defaultable structs, as in the previous thread I opened back in 2020.

The conclusion from the pre-pre-RFC for partial initialization was that traits in general are a poor fit for partial initialization; if it is to be accomplished, it must be a lang feature. Leveraging the existing .. syntax without an expr seemed to me the best way of introducing surface-level syntax for that potential behavior, because it is the same concept to an end user. The conversation about Default back then was around whether, if the field: Ty = default syntax was introduced, it should affect what derive(Default) does.

For all "magic" ergonomic features there's always a cliff: a point where you've strayed far enough from the "happy supported path" that you have to abandon it, roll up your sleeves, and implement the functionality yourself without relying on the ergonomic feature. This is one case. If you have #[derive(Default)] struct Foo { a: Option<i32> } and add a b: i32 field, you suddenly go from a single-line change to having to write a whole new impl. If you want to make the user always explicitly supply b and optionally supply a, you need to write a whole builder from scratch, potentially with a type-level state machine to ensure that no one writes BuildFoo::with_a(a).with_b(b).with_a(x);. Not only does it make writing the API more verbose, it also makes the use of that API more verbose.
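To make the size of that cliff concrete, here is a minimal typestate-builder sketch for the Foo above. `FooBuilder`, `Unset`, and `Set` are hypothetical names; real builders tend to grow further with error handling and documentation.

```rust
use std::marker::PhantomData;

#[derive(Debug, PartialEq)]
pub struct Foo {
    pub a: Option<i32>,
    pub b: i32,
}

// Marker types recording, at the type level, which fields were supplied.
pub struct Unset;
pub struct Set;

pub struct FooBuilder<A, B> {
    a: Option<i32>,
    b: Option<i32>,
    _state: PhantomData<(A, B)>,
}

impl FooBuilder<Unset, Unset> {
    pub fn new() -> Self {
        FooBuilder { a: None, b: None, _state: PhantomData }
    }
}

// `with_a` exists only while `a` is unset, so calling it twice
// is a compile-time error.
impl<B> FooBuilder<Unset, B> {
    pub fn with_a(self, a: i32) -> FooBuilder<Set, B> {
        FooBuilder { a: Some(a), b: self.b, _state: PhantomData }
    }
}

// Likewise, `with_b` exists only while `b` is unset.
impl<A> FooBuilder<A, Unset> {
    pub fn with_b(self, b: i32) -> FooBuilder<A, Set> {
        FooBuilder { a: self.a, b: Some(b), _state: PhantomData }
    }
}

// `build` exists only once `b` has been supplied; `a` stays optional.
impl<A> FooBuilder<A, Set> {
    pub fn build(self) -> Foo {
        Foo {
            a: self.a,
            b: self.b.expect("typestate guarantees `b` was set"),
        }
    }
}
```

That is roughly forty lines to express "b is required, a is optional", which under the proposal would be something like a single `a: Option<i32> = None` annotation on the field.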

Now, is moving the cliff a bit further worth it by complicating the language further, as slight as it might be? Well, that's why we come to internals instead of jumping straight to an RFC, isn't it?

I personally feel that adding partial default struct declaration and expression will remove the need to explicitly write the vast majority of builders I've had to write or seen in the past. I also think that if the const default syntax is adopted, it should affect derive(Default), which would also remove the need to write a significant number of manual impl Defaults. I think where we significantly disagree is that it's not 3 lines vs 1 line of code: the changes balloon and compound very quickly.

From your positions, I'm also somewhat picturing that if derive(Default) didn't currently exist you would also be against its introduction: the same arguments apply to it to begin with.


Interesting point. However, in my view, there's a strong point in favor of using .. in this case: it mirrors the syntax that's being used to construct values of the type.


True, I guess I didn't consider that case :)

I'm personally also interested in language features (or proposals) that more generally help address the problem of unintentionally breaking API changes. E.g. there could be tooling that compares the whole API of a crate with a different, nominally semver-compatible, version of the same crate and reports back whether there are any breaking API changes (to be used as a check you can do before publishing a new version of a crate). This could even cover things like implicit Send/Sync information on impl Trait return types or async fn futures, so you don't have to write tests for those things anymore. But that's unrelated to this topic.


I received a private message from a lang team member to discuss this; I won't name them as I'm not sure they want that. After discussion, we agreed that the best place to start is to:

allow .. on a struct if and only if every omitted field has an explicit = value default and every omitted field is visible to the user of .., whether or not #[non_exhaustive]

(reworded slightly for readability to others)

Note that this means private fields (or more specifically fields not visible in the current scope) are not supported. We weren't certain as to the utility of doing so, and it's forwards compatible if we want to support that situation in the future.

We agreed that #[non_exhaustive] structs will need to opt in to this syntax, as some libraries will want to support it and others won't. Being opt-out would impose a requirement on existing use cases, which is not desirable (and is, in a roundabout way, a breaking change). The team member's suggestion (which I support) was to consider the presence of a defaulted field (via = value) the opt-in, as it meant that the struct was modified after the introduction of the new syntax, indicating that the author intends to support this. This implicitly means that adding a defaulted field to a #[non_exhaustive] struct prohibits the author from adding a private or non-defaulted field in the future. This is still desirable in situations like this struct of mine, where I am willing to guarantee all fields will be public (and defaulted).

How do others feel about this? I understand some might want private fields, but understand that having this in some situations is still better than none. Perhaps you might be able to convince me and the lang member that it's desirable from the start.

Currently crate authors can declare:

#[non_exhaustive] - means exactly that; today.

But the suggested scheme diminishes this ability.

If I write an all-pub struct with a single default
I can no longer declare that.

I'm then obliged to never add a private field,
nor can I add a pub field without a default.
unless I declare it a breaking change.

Is that really the best way to go?


I still think it's worth giving authors those three choices.
The above is an argument against taking away choice number (1).


Update: the suggested scheme is a little awkward to support choice number (3) too. Suppose I write an all-pub struct such that I want to be able to add pub fields with defaults to in the future. So I want users to write S{a : 1, ..}. But I don't yet have any fields with defaults. I'm then forced to add _marker : () = (). And make it pub. It works.. but is a little awkward.


Comparing this to what I proposed above, this sounds like the proposal is:

  • Except for structs without any = value defaults, (the exception is for backwards compatibility), #[non_exhaustive] plays the role as the

  • There is no

    at all, but this may become possible in the future.

  • For structs without any = value, #[non_exhaustive] keeps playing the role as a way to allow future addition of any fields (private or public, defaulted or not).


I don't think there's any disagreement that

  • for structs where

    • all fields are public
    • the struct is not #[non_exhaustive]

    a constructor call in the Foo { bar: value, baz: value2, .. } style should be allowed, if all the omitted fields (which could be no fields at all) have default values.

And this point is identical between what you propose and my last proposal.


I don't like the dual role that #[non_exhaustive] plays above. If it didn't already exist as a way to explicitly opt into support for adding new fields (private or public, defaulted or not) to structs, then the #[non_exhaustive] attribute could be a good syntactic alternative to my proposed .. notation above. Since it already has a meaning, I think it's questionable to give it a second meaning, depending on arbitrary concerns like "is there any = value field", that exist in this form merely to ensure backwards compatibility. Put differently, if we were so keen on repurposing #[non_exhaustive] in this way, we should probably also start issuing warnings on every existing use of #[non_exhaustive] on structs, in order to encourage authors to add a private (dummy) field instead to reduce confusion. I would however personally prefer introducing a new syntax (be it an attribute or dedicated syntax) for this feature, to avoid the need for "stealing" (repurposing) the #[non_exhaustive] attribute.

Regarding private fields, to name a theoretical use case, note that (whether or not that's desired), if private fields are supported, all structs with a const fn new() constructor could, technically, also start supporting initialization via StructName{..}. So we could make the standard library support things like

let s = String{..};
let v = Vec::<i32>{..};
let m = BTreeMap::<i32, i32>{..};
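What makes this plausible is that the constructors involved are already usable in const contexts on stable Rust (`BTreeMap::new` has been a `const fn` since Rust 1.66). A sketch with a hypothetical `Index` type:

```rust
use std::collections::BTreeMap;

pub struct Index {
    pub name: String,
    pub hits: Vec<i32>,
    pub by_id: BTreeMap<i32, i32>,
}

impl Index {
    // All three standard-library constructors are `const fn`, so this
    // constructor can be one as well.
    pub const fn new() -> Self {
        Index {
            name: String::new(),
            hits: Vec::new(),
            by_id: BTreeMap::new(),
        }
    }
}

// Evaluated at compile time; each use of `EMPTY` yields a fresh value.
pub const EMPTY: Index = Index::new();
```

Under the hypothetical private-field support, each of these `new()` calls is the kind of expression that could serve as a field default.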