Request: IsZero trait and #[derive(IsZero)]

My situation

When I use serde to serialize/deserialize data, I want to skip serializing empty fields. I do this:

#[derive(Default, Serialize, Deserialize)]
struct MyData {
  pub mandatory_field: i32,
  #[serde(default, skip_serializing_if = "String::is_empty")]
  pub optional_string: String,
  #[serde(default, skip_serializing_if = "Vec::is_empty")]
  pub optional_array: Vec<i32>,
  #[serde(default, skip_serializing_if = "Option::is_none")]
  pub optional_value: Option<i32>,
}

It is manageable with built-in data structures, but with custom types, I have to make my own is_empty:

#[derive(Default, Serialize, Deserialize)]
struct MyData {
  pub mandatory_field: i32,
  #[serde(default, skip_serializing_if = "OptionalData::is_empty")]
  pub optional_data: OptionalData,
}

#[derive(Default, Serialize, Deserialize)]
struct OptionalData {
  #[serde(default, skip_serializing_if = "String::is_empty")]
  pub optional_string: String,
  #[serde(default, skip_serializing_if = "Vec::is_empty")]
  pub optional_array: Vec<i32>,
  #[serde(default, skip_serializing_if = "Option::is_none")]
  pub optional_value: Option<i32>,
}

impl OptionalData {
  pub fn is_empty(&self) -> bool {
    self.optional_string.is_empty() && self.optional_array.is_empty() && self.optional_value.is_none()
  }
}

Proposal

Add a trait named IsZero and a derive macro named IsZero. The above code could be simplified into:

#[derive(Default, Serialize, Deserialize)]
struct MyData {
  pub mandatory_field: i32,
  #[serde(default, skip_serializing_if = "IsZero::is_zero")]
  pub optional_data: OptionalData,
}

#[derive(Default, Serialize, Deserialize, IsZero)]
struct OptionalData {
  #[serde(default, skip_serializing_if = "IsZero::is_zero")]
  pub optional_string: String,
  #[serde(default, skip_serializing_if = "IsZero::is_zero")]
  pub optional_array: Vec<i32>,
  #[serde(default, skip_serializing_if = "IsZero::is_zero")]
  pub optional_value: Option<i32>,
}
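
For illustration, here is a minimal sketch of what the proposed trait and the output of the derive might look like. The trait `IsZero`, its impls, and the hand-written impl standing in for `#[derive(IsZero)]` are all assumptions; nothing like this exists in std, and the serde attributes are left out so the snippet has no dependencies.

```rust
// Hypothetical trait from the proposal; it is not part of std.
pub trait IsZero {
    fn is_zero(&self) -> bool;
}

// Assumed impls for a few built-in types.
impl IsZero for i32 {
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

impl IsZero for String {
    fn is_zero(&self) -> bool {
        self.is_empty()
    }
}

impl<T> IsZero for Vec<T> {
    fn is_zero(&self) -> bool {
        self.is_empty()
    }
}

impl<T> IsZero for Option<T> {
    fn is_zero(&self) -> bool {
        self.is_none()
    }
}

#[derive(Default)]
pub struct OptionalData {
    pub optional_string: String,
    pub optional_array: Vec<i32>,
    pub optional_value: Option<i32>,
}

// What a #[derive(IsZero)] could plausibly expand to:
// the struct is "zero" when every field is "zero".
impl IsZero for OptionalData {
    fn is_zero(&self) -> bool {
        self.optional_string.is_zero()
            && self.optional_array.is_zero()
            && self.optional_value.is_zero()
    }
}
```

Under these assumed impls, `Some(0)` does not count as zero, because only `None` does. A real design would have to decide that case explicitly.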

Alternatively, instead of IsZero, we could have a trait named Zero that both creates a "zero" value and checks whether a value is "zero" (I am assuming that the Default trait doesn't always return "zero").

3 Likes

You could use num_traits::Zero. This used to be in the standard library before Rust 1.0, but was pushed out to a crate in the final days of stabilization.

But that's only for numeric zero, not containers and such.

1 Like

Naming aside ("zero" doesn't make much sense for a string or many other data types), is there any use case for this aside from serde? Having to implement an is_empty method or similar doesn't seem like too big of a hurdle, imo.

4 Likes

I would have to make N is_emptys for N optional structs.

Required question: can this be a userland derive macro? What benefit is there in having this in std rather than a crate? What about just the trait in std, and the derive in a crate?

Last I saw discussion around it, we might end up having a "trait IsEmpty" in std. We currently have an unstable ExactSizeIterator::is_empty, but (last I saw) general consensus was that it doesn't really fit there, because an Iterator can definitively know if it's empty without knowing its exact size. Plus, things that are not iterators, just iterable, can be empty (or not). However, while none of the existing iterator traits seemed the "right" place to put is_empty, general temperature was that a trait for is_empty also felt not quite "right" either.

Personally, though, I find that this kind of aggressive "don't serialize if 'empty'" approach isn't all that great.

It's meaningful to be able to tell the difference in a configuration between "unset; use the default" and "set (to the current default)," especially if there's any chance of the default changing in the future. Additionally, skip_serializing_if just straight doesn't work for non-self-describing formats (such as bincode). (Thus this is for human-understandable serialization, even if not intended to be human edited.) Obviously sometimes you just have to match a JSON web payload with very weakly typed semantics (and often legacy data with the wrong shape (cough trello >.>)), so you can't always take the "moral" path.

4 Likes

It can be.

Convenience, widespread usage, no ecosystem split.

My proposal is broader than just containers and iterators. It includes all numbers, things that act like numbers (such as complex numbers, vectors, matrices, etc.), anything that has an "add" operation (n + 0 = n, text + "" = text, set + [] = set, etc.).

1 Like

You can define this:

trait IsDefault: Default + PartialEq {
    fn is_default(&self) -> bool;
}

impl<T: Default + PartialEq> IsDefault for T {
    fn is_default(&self) -> bool {
        *self == T::default()
    }
}
6 Likes

Comparison is not the most efficient operation, not to mention that calling T::default() can be expensive.

2 Likes

That's technically true, but for all the examples you've given it would be optimally efficient. E.g. Vec::default doesn't allocate, the call would presumably get inlined, and is_default would reduce to a check that len is zero.
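
A small sketch of that blanket impl in use, under the assumption that the struct also derives `PartialEq` (required by the trait bound). The `OptionalData` definition here is a stripped-down copy without the serde attributes.

```rust
// The blanket trait suggested in the thread, repeated so the snippet stands alone.
trait IsDefault: Default + PartialEq {
    fn is_default(&self) -> bool;
}

impl<T: Default + PartialEq> IsDefault for T {
    fn is_default(&self) -> bool {
        *self == T::default()
    }
}

// PartialEq is needed in addition to Default for the blanket impl to apply.
#[derive(Default, PartialEq)]
struct OptionalData {
    optional_string: String,
    optional_array: Vec<i32>,
    optional_value: Option<i32>,
}

// Vec::default() creates an empty vector without allocating,
// which shows up as a capacity of zero.
fn default_vec_capacity() -> usize {
    Vec::<i32>::default().capacity()
}
```

Whether the comparison actually compiles down to a length check depends on inlining and optimization level, so that part remains an expectation rather than a guarantee.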

3 Likes

It can be, but in the overwhelming majority of cases, it isn't. Numeric types, standard collections, even things like Box<str> and anything that is built on these by means of #[derive(Default)] implement Default at the cost of a couple of register moves.

2 Likes

Naming nit: If you want to use it on Vec, it shouldn't be called IsZero, since that could easily be confused with non-zero optimizations in enums like Option, where any empty Vec is non-zero from that perspective.
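
To make the possible confusion concrete, here is a short sketch of the other sense of "zero". It relies on the documented guarantee that a Vec's pointer is never null. The helper function names are made up for this example.

```rust
use std::mem::size_of;

// An empty Vec still holds a non-null (dangling) pointer,
// so it is "non-zero" at the representation level.
fn empty_vec_pointer_is_null() -> bool {
    Vec::<i32>::new().as_ptr().is_null()
}

// Because that pointer is never null, Option<Vec<T>> can use the null
// pattern for None and needs no separate discriminant.
fn option_vec_has_no_extra_tag() -> bool {
    size_of::<Option<Vec<i32>>>() == size_of::<Vec<i32>>()
}
```

So a Vec that an `IsZero` trait would call "zero" is exactly the kind of value the niche optimization treats as non-zero.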

Something along the lines of "is default" may be better.

(Said without taking a position on whether the trait should exist.)

Because it's tied to serde(default), comparison with an actual Default value seems most correct (assuming that Default is deterministic, which theoretically it doesn't have to be). Otherwise it would create subtle bugs when a Default implementation initializes values to something that isn't "zero".
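
A sketch of the kind of subtle bug meant here, without pulling in serde. The `Retries` type and the `write_*`/`read` helpers are hypothetical stand-ins: `None` plays the role of a field omitted by skip_serializing_if, and `read` mimics `#[serde(default)]`.

```rust
// Hypothetical config value whose default is deliberately not zero.
#[derive(Debug, PartialEq)]
struct Retries(u32);

impl Default for Retries {
    fn default() -> Self {
        Retries(3)
    }
}

// Stand-in for skip_serializing_if with an "is zero" check.
fn write_skipping_zero(r: &Retries) -> Option<u32> {
    if r.0 == 0 { None } else { Some(r.0) }
}

// Stand-in for skip_serializing_if with an "is default" check.
fn write_skipping_default(r: &Retries) -> Option<u32> {
    if *r == Retries::default() { None } else { Some(r.0) }
}

// Stand-in for deserializing with #[serde(default)]:
// a missing field is filled in from Default.
fn read(field: Option<u32>) -> Retries {
    field.map(Retries).unwrap_or_default()
}
```

With the zero-based check, `Retries(0)` is omitted on write and comes back as `Retries(3)`. With the default-based check, the round trip preserves the value.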

Another solution could be to copy Golang's zero initialization concept, with a pair like Zero::zero_value() and Zero::is_zero(). But I'm not keen on that in Rust, since in other languages zero/falsy values are used as a less-precise Option alternative.

Zero value is supposed to be a constant or a pure function.
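
One way to express "a constant" in Rust would be an associated const. This is only a sketch of a hypothetical `Zero` trait; it is not the one from num_traits, and the choice of impls is an assumption.

```rust
// Hypothetical trait: the zero value is an associated constant,
// so producing it cannot run arbitrary code at runtime.
trait Zero: Sized {
    const ZERO: Self;
    fn is_zero(&self) -> bool;
}

impl Zero for i32 {
    const ZERO: Self = 0;
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

impl Zero for String {
    // String::new is a const fn, so it can initialize a constant.
    const ZERO: Self = String::new();
    fn is_zero(&self) -> bool {
        self.is_empty()
    }
}

impl<T> Zero for Vec<T> {
    // Vec::new is a const fn as well.
    const ZERO: Self = Vec::new();
    fn is_zero(&self) -> bool {
        self.is_empty()
    }
}
```

The const form only works for types whose zero value can be built in a const context. A `fn zero() -> Self` variant would be more permissive but could no longer promise purity.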

You are talking about nullable types (e.g. number | undefined in TypeScript)? Types that implement Zero are very different from nullable types; they are just normal types (such as i32, where "zero" is 0, or String, where "zero" is "").

I think you misunderstood kornel since this is effectively agreeing with them despite being phrased as a disagreement.

Yes, zero values are different from nullable types. That was actually kornel's point. Zero/falsy values are not a kind of nullable type; they are a "less-precise alternative" to true nullable types.

Typescript's type | undefined counts as true nullable types, although the benefits are a bit muddled by not having a single standard Option/Maybe/etc type like other languages with such types. Golang doesn't have nullable types in general (iiuc nil only applies to interface types), which I assume is because it doesn't have generics, so that's why it often ends up using a type's zero value to deal with nullability in a generic way (albeit a flawed / "less precise" way since the zero value could easily be a legitimate value as well as an absence).

3 Likes

For what it's worth, in formal terms such a structure is called a monoid: a tuple (S, •, 0̸) such that • is an associative binary operation over the set S and 0̸ ∈ S is the (unique) neutral element with respect to •. In other words, it is a semigroup with a neutral element. In Haskell:

class Semigroup a => Monoid a where
    mempty :: a
    mappend :: a -> a -> a

This is certainly a useful and very natural abstraction to have, but strictly speaking also stronger than what your serde use case requires.
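
A rough Rust counterpart of that class, as a sketch only. The trait name `Monoid`, the method names `empty`/`combine`, and the `is_neutral` helper are assumptions made for this example; the serde use case would need only `is_neutral`, not `combine`.

```rust
// Hypothetical Rust rendering of a monoid: a neutral element
// plus an associative binary operation.
trait Monoid: Sized {
    fn empty() -> Self;
    fn combine(self, other: Self) -> Self;

    // The check the serde use case needs; it only requires the
    // neutral element and equality, not the binary operation.
    fn is_neutral(&self) -> bool
    where
        Self: PartialEq,
    {
        *self == Self::empty()
    }
}

impl Monoid for i32 {
    fn empty() -> Self {
        0
    }
    fn combine(self, other: Self) -> Self {
        self + other
    }
}

impl Monoid for String {
    fn empty() -> Self {
        String::new()
    }
    fn combine(mut self, other: Self) -> Self {
        self.push_str(&other);
        self
    }
}

impl<T> Monoid for Vec<T> {
    fn empty() -> Self {
        Vec::new()
    }
    fn combine(mut self, mut other: Self) -> Self {
        self.append(&mut other);
        self
    }
}
```

Associativity and the neutral-element law are not enforced by the compiler; implementors would have to uphold them by convention.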

2 Likes

Thank you.

My biggest problem with the OP's proposal is the use of the term "Zero" for non-numerics such as the empty string (""). Your note above, giving background on monoids, also provides a non-numeric name for the trait and derive: IsNeutral and #[derive(IsNeutral)]. I could support such names, whereas I vehemently oppose using the term "Zero" for non-numerics.

1 Like

I mean treating an empty string as a sentinel value. Option<String> can tell a difference between value not being set at all, and a value intentionally set to an empty string.
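
A tiny sketch of that difference. The function names and the returned labels are made up for the example.

```rust
// With Option<String>, "never set" and "set to empty" are distinct states.
fn describe(value: &Option<String>) -> &'static str {
    match value.as_deref() {
        None => "not set",
        Some("") => "set to the empty string",
        Some(_) => "set",
    }
}

// With a bare String and the empty string as sentinel,
// the first two states collapse into one.
fn describe_sentinel(value: &str) -> &'static str {
    if value.is_empty() {
        "not set (or set to the empty string)"
    } else {
        "set"
    }
}
```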

That distinction may be useless in your case. But when it's built into the language, it becomes more widely used, and becomes something that needs to be considered when defining structs; that's my impression from Golang.

JS has null/undefined that can be set even on string fields, but in this case Rust's IsZero would be closer to Golang. Golang can't set string to nil, so the zero value of a string is an empty string.

But it's not "the zero value"; it's the additive identity under + as the string concatenation operator.

1 Like

What did you have in mind when you said "ecosystem split"? In my mind, an "ecosystem split" happens when you have two crates that do something similar, but are mutually incompatible, or prevent composition in some important way.

I don't see that happening if this feature is implemented in a crate. If both you and I implement a crate that provides an IsZero or IsEmpty trait, downstream crates would be able to pick either one (or even use both) without risking any incompatibility.

JavaScript does not have static typing. TypeScript has static typing, and when "strictNullChecks" is set to true (recommended practice, but sadly not the default due to backward compatibility), a variable of type string cannot be set to undefined or null.

Go does not have Option or nullable types, so developers have to resort to using zero values. But Rust has Option, so why would a Rust developer treat zero values as "unset"?

(to clarify, in my use case, I don't treat zero value as null either, I merely see that explicitly setting something to zero is redundant)