Better enums

Enums in Rust are very pleasant to work with because case handling is exhaustive by default and you have to opt out of it explicitly. They let you express polychotomies, which occur naturally in many situations. For me, they are a perfect reification of sets in the mathematical sense; arguably, the second most important concept in mathematics after (proper) classes. Their instances, in turn, represent choices of elements from the sets.

However, as a reification of sets they lack some useful features [=sugar].

  1. types for enum variants

  2. adhoc trait implementations

// Will work if every type in all enum items implements MakeSound
adhoc impl MakeSound for Animal;
  3. orders on enums as sets
enum Author {
CharlesDickens,
GeorgeOrwell,
WilliamShakespeare,
GeorgeEliot,
//...
}

order Alphabetic;

// Implementation of the order can come from #[derive(Alphabetic)] or std can offer
// Lexicographic order instead
order Alphabetic on Author {
CharlesDickens,
GeorgeEliot,
GeorgeOrwell,
WilliamShakespeare,
//..
}

// Iter GAT should be defined if every type in all enum items implements Default
let authors_iter = Alphabetic::Iter<Author>::new();
  4. enum sum (union in the mathematical sense)
enum IOError {
// ...
}

enum FormatError {
// ...
}

// Any error in this case is either IOError or FormatError.
// Name collisions are expected to be caught by the compiler
enum Error = IOError | FormatError;

fn handle_req() -> Result<(), Error> {
  let req = recv_req().map_err(Into::<Error>::into)?;
  req.handle().map_err(Into::<Error>::into)?;
  Ok(())
}
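
For comparison, here is a minimal sketch of how the same flow can be written in today's Rust, assuming a hand-written wrapper enum instead of the proposed sum syntax. The variants ConnectionReset and BadHeader, the Req struct, and the boolean parameters are hypothetical and exist only to make the example self-contained. Note that the wrapper nests the summands rather than merging their variants, which is exactly the boilerplate the proposal wants to avoid.

```rust
// Hypothetical stand-ins for the two error enums from the post.
#[derive(Debug, PartialEq)]
enum IOError {
    ConnectionReset,
}

#[derive(Debug, PartialEq)]
enum FormatError {
    BadHeader,
}

// Hand-written "sum": one wrapping variant per summand enum.
#[derive(Debug, PartialEq)]
enum Error {
    Io(IOError),
    Format(FormatError),
}

impl From<IOError> for Error {
    fn from(e: IOError) -> Self {
        Error::Io(e)
    }
}

impl From<FormatError> for Error {
    fn from(e: FormatError) -> Self {
        Error::Format(e)
    }
}

// Hypothetical request type; `well_formed` decides whether handling succeeds.
struct Req {
    well_formed: bool,
}

// Hypothetical receive function; `connected` decides whether receiving succeeds.
fn recv_req(connected: bool, well_formed: bool) -> Result<Req, IOError> {
    if connected {
        Ok(Req { well_formed })
    } else {
        Err(IOError::ConnectionReset)
    }
}

impl Req {
    fn handle(&self) -> Result<(), FormatError> {
        if self.well_formed {
            Ok(())
        } else {
            Err(FormatError::BadHeader)
        }
    }
}

fn handle_req(connected: bool, well_formed: bool) -> Result<(), Error> {
    // `?` converts each summand error into `Error` through the `From` impls,
    // so no explicit `map_err` is needed here.
    let req = recv_req(connected, well_formed)?;
    req.handle()?;
    Ok(())
}
```

With the From impls in place, the ? operator performs the conversion on its own; the proposed sum syntax would additionally remove the need to write the wrapper enum and the impls.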

Update: July 11, 2022

The enum Error = {IOError::*, FormatError::*}; syntax seems to be more flexible because of how it interacts with the as keyword.

  1. For cases where only some syntactically identical expressions with enum items are of the same type, adhoc can be used to avoid boilerplate as well.

  2. Multiple layers of inclusion should be traversed with a Ker::<...Kernel>::ker() function.

// 2 in 1 example
use crate::{Result, MainChef};

// derive defines MenuItemKer enum and implements Ker::<MenuItemKer>
#[derive(Ker)]
enum MenuItem = {
  MainCourse::*,
  Salad::*,
  Drink::*,
};

#[accountable_personnel("Pierre Gagnaire <...@gmail.com>")]
enum MainCourse {
  ButterChicken,
  #[dish_of_the_day]
  PalakPaneer,
  RoganJosh,
}

#[accountable_personnel("Marco Pierre White <...@gmail.com>")]
enum Salad {
  Green,
  Michigan,
  Caesar,
  Ambrosia,
}

#[derive(Ker)]
enum Drink = {
  Wine::*,
  Pop::*,
  Juice::*,
};

#[accountable_personnel("Josh Peck <...@gmail.com>")]
enum Wine {
  Barbera,
  CabernetFranc,
  CabernetSauvignon,
  Carignan,
}

// No one is accountable for pop
enum Pop {
  Pepsi,
  Coke,
}

fn handle_complaint(c: &Complaint) -> Result<()> {
  // Complaint { id, menu_item, text, time }
  match c.menu_item.ker() {
    MenuItem::MainCourse(mc) => {
      if matches!(mc, MainCourse::DISH_OF_THE_DAY) {
        MainChef::notify_about_complaint(c)?;
      }
      MainCourse::notify_about_complaint(c)
    }
    MenuItem::Salad(_) => Salad::notify_about_complaint(c),
    MenuItem::Drink(d) => {
      match d.ker() {
        Drink::Pop(_) => Ok(()),
        // drink: D should be another form of irrefutable pattern
        _drink: D => adhoc D::notify_about_complaint(c),
      }
    }
  }
}

Any ideas?

Enum variants as distinct types have actually been discussed a few times: 1 2 3. Currently this has a "maybe someday" status, without any specific plans. In the meantime you can create a struct wrapped by each enum variant, or use something like the enum_variant_type crate.

Adhoc impls and orderings: I didn't understand what it is you want or what your proposal is. Note that ordering in Rust is generally managed via the Ord trait, which is easy to both derive and implement manually on enums. However, a type can implement Ord only once, so there is no way to have two trait-based orderings on the same type. That is a fundamental limitation of the trait system and it's unlikely to change in the foreseeable future.
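
As a rough sketch of the usual workaround, a second ordering can be attached to a newtype wrapper while the enum itself keeps its derived (declaration-order) Ord. The Author enum is taken from the original post; the name() helper, the Alphabetic wrapper struct, and the two sorting functions are hypothetical names chosen for this illustration.

```rust
use std::cmp::Ordering;

// Derived Ord follows declaration order of the variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Author {
    CharlesDickens,
    GeorgeOrwell,
    WilliamShakespeare,
    GeorgeEliot,
}

impl Author {
    // Hypothetical helper giving each variant a display name.
    fn name(self) -> &'static str {
        match self {
            Author::CharlesDickens => "Charles Dickens",
            Author::GeorgeOrwell => "George Orwell",
            Author::WilliamShakespeare => "William Shakespeare",
            Author::GeorgeEliot => "George Eliot",
        }
    }
}

// Hypothetical newtype that carries the second, alphabetic ordering.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Alphabetic(Author);

impl Ord for Alphabetic {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.name().cmp(other.0.name())
    }
}

impl PartialOrd for Alphabetic {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Sorts using the enum's own derived ordering.
fn sorted_by_declaration(mut authors: Vec<Author>) -> Vec<Author> {
    authors.sort();
    authors
}

// Sorts using the ordering defined on the wrapper.
fn sorted_alphabetically(mut authors: Vec<Author>) -> Vec<Author> {
    authors.sort_by_key(|a| Alphabetic(*a));
    authors
}
```

The wrapper works, but every additional ordering needs its own type and impls, which is the kind of ceremony the order syntax in the proposal appears to be aiming at.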

Sum and union are very different in the mathematical sense. A sum is essentially what an enum already is. It combines two types (which are possibly the same) in an ordered and uniquely distinguishable way. I.e. the sum of A and B is an enum

enum Either<A, B> {
    Left(A),
    Right(B),
}

It may be nice to have anonymous sums of types; personally, I would like to have them. However, there are design issues, and it is also not something currently on the radar. E.g. see 1.

Union types, on the other hand, are something like what is implemented in TypeScript. A union A | A is A itself. This is a much more contentious (though also sometimes desired) feature.
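
A small sketch of the difference, using the Either enum from above with both sides set to the same type. The describe function is a hypothetical helper added for the illustration. In a sum, the side a value came from is part of the value, so Either<u8, u8> is not the same thing as u8; in a union, u8 | u8 would collapse to u8 and that information would not exist.

```rust
#[derive(Debug, PartialEq)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

// Hypothetical helper: reports which side of the sum a value lives on.
fn describe(e: &Either<u8, u8>) -> &'static str {
    match e {
        Either::Left(_) => "left",
        Either::Right(_) => "right",
    }
}
```

Here Either::Left(1) and Either::Right(1) carry the same payload but compare as different values, because the discriminant is part of the type's identity.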

TypeScript gets union types basically for free due to JS object semantics, but I imagine they would be much less ergonomic to implement in Rust.

You might be interested in the discussions on enum impl Trait and "anonymous enums", ideas similar to some of what you're proposing.

How would something like adhoc impl Default for Enum work?

How about safety? An enum of bytemuck::Pod types can't be Pod due to the discriminant part of the enum not satisfying the Pod contract.

That one is simple enough: adhoc impls can't be provided for unsafe traits. Or just require stating unsafe again when adhocing the impl, which puts the burden on the developer to check that adhoc is fine for this trait.

Generally you'd restrict adhoc implementation to object-safe traits, so the enum is just acting as a static form of dyn Trait for an adhoc impl.
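
To make the "static form of dyn Trait" reading concrete, here is a sketch of what adhoc impl MakeSound for Animal; might amount to if written by hand today. MakeSound and Animal are the names from the original post; the Dog and Cat types, the method make_sound, and the chorus functions are assumptions made up for this example, since the post does not define them.

```rust
// Assumed shape of the trait; the post only names it.
trait MakeSound {
    fn make_sound(&self) -> String;
}

// Hypothetical payload types, one per enum variant.
struct Dog;
struct Cat;

impl MakeSound for Dog {
    fn make_sound(&self) -> String {
        "Woof".to_string()
    }
}

impl MakeSound for Cat {
    fn make_sound(&self) -> String {
        "Meow".to_string()
    }
}

enum Animal {
    Dog(Dog),
    Cat(Cat),
}

// Hand-written delegation: each arm forwards to the payload's impl.
// This is the boilerplate an adhoc impl would presumably generate.
impl MakeSound for Animal {
    fn make_sound(&self) -> String {
        match self {
            Animal::Dog(d) => d.make_sound(),
            Animal::Cat(c) => c.make_sound(),
        }
    }
}

// Static dispatch through the enum.
fn chorus(animals: &[Animal]) -> Vec<String> {
    animals.iter().map(|a| a.make_sound()).collect()
}

// The dynamic counterpart, for comparison.
fn chorus_dyn(animals: &[Box<dyn MakeSound>]) -> Vec<String> {
    animals.iter().map(|a| a.make_sound()).collect()
}
```

Both functions produce the same output; the enum version resolves the call with a match on the discriminant instead of a vtable lookup, and it only works because every method takes self, which is where the object-safety restriction comes from.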

Your "ad-hoc impls" are what derive macros are for.

Union types sound good in theory but they are highly non-trivial to define in the presence of generics.

I don't think general union types are being suggested in this proposal. The syntax was enum MyUnion = A | B where A and B are existing enum types and it just merges the variants from the provided enums. This syntax does not imply that A | B is a type; in fact it is slightly incompatible with A | B being a type since then this construct would become ambiguous.

It sounds very close to something a proc-macro could do, except that I think a macro implementation would need to be provided the definitions of A and B to work.

Derive macros are a syntactic feature. They need to have visibility into all variants of the enum at the time of macro expansion.

If an adhoc trait implementation is provided for an enum sum,

// The derive macro was replaced with an attribute macro to accept an input, the name
// of the trait for which adhoc implementation must be provided
#[adhoc_trait_impl(std::error::Error)]
enum Error = IOError | FormatError;

the attribute macro must be able to query the token streams of the summand enums, or otherwise retrieve the much-needed variants and, finally, the methods of the trait. By design, attribute macros have a very limited scope: they only see the item they are attached to.

One possible solution is synchronization of information between possibly different macros from one proc-macro crate. I thought of "stateful" macros earlier (macros with shared state within proc-macro). But I haven't had a clear idea how to implement them. AFAIK, there are no proc-macro crates that utilize this approach.

There's another approach to querying the data about the summand enums. The attribute macro can identify the place of macro invocation and locate the referents "std::error::Error", "IOError", "FormatError" in the source code. If one of them is a sum enum, the process can continue recursively.

These two approaches are difficult and slow (though can be useful for other, less common features). I believe the problem must be resolved "somewhere" at type level. It is true that adhoc trait implementation can be easily "projected" onto existing features. That's not a bug, that's a feature.

There is one more problem with this being a procedural macro: it is a commonly used thing. At the time of writing, syn is the 2nd most downloaded crate with 131,106,901 all-time downloads. If you look at the source code, you will find that it is largely macro-generated, notably using ast_struct!, ast_enum!, and ast_enum_of_structs! from syn/macros.rs (master branch of dtolnay/syn on GitHub). These are macro_rules! because syn is usually the tool people use to create procedural macros (together with proc-macro2 and quote). It's hard to imagine how frequently syn gets compiled. Every time the crate compiles, these terrific macro_rules get compiled too, and the token stream they produce must be processed as well. That's a lot of work that could be made easier.
