Generic Enum and Struct Matching Syntax Idea

Problem

Consider the following enum:

enum Items{
      A(FileItemTreeId<i32>),
      B(FileItemTreeId<String>),
      C(FileItemTreeId<FunctionNodeType>),
      D ... 
}

One might want to execute the same code for multiple variants of the enum, e.g.:

fn print(it: Items) {
    match it {
        Items::A(it) => println!("{it}"),
        Items::B(it) => println!("{it}"),
        ...
    }
}

An example can be found in the rust-analyzer source in "crates/hir-def/src/item_tree.rs", where ModItem implements ast_id with this recurring pattern. In this case the repetition can be moved into a macro, but a more general mechanism might be helpful.
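For reference, here is a minimal sketch of the macro workaround that works today. The payload types (i32, String, char) and the names for_each_variant and describe are hypothetical stand-ins, not the rust-analyzer code; the macro simply expands to one match arm per listed variant.

```rust
// Hypothetical stand-in for the enum above, with simple payload types.
enum Items {
    A(i32),
    B(String),
    C(char),
}

// Expands to a `match` with one arm per listed variant; every arm binds the
// single tuple field to `$binding` and evaluates the same `$body`.
macro_rules! for_each_variant {
    ($value:expr, [$($variant:ident),+ $(,)?], |$binding:ident| $body:expr) => {
        match $value {
            $( Items::$variant($binding) => $body, )+
        }
    };
}

// The body is type-checked once per variant, so it only has to compile for
// each concrete payload type (here: all of them implement `Display`).
fn describe(it: &Items) -> String {
    for_each_variant!(it, [A, B, C], |x| x.to_string())
}
```

The drawback is visible in the call: every variant still has to be listed by hand, and the macro is tied to one enum.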

Looking through the rust-analyzer source code, I also found other repetitive code related to structs that group multiple elements together, e.g.:

#[derive(Default, Debug, Eq, PartialEq)]
struct ItemTreeData {
    imports: Arena<Import>,
    extern_crates: Arena<ExternCrate>,
    extern_blocks: Arena<ExternBlock>,
    functions: Arena<Function>,
    ...
}
fn shrink_to_fit(&mut self) {
        if let Some(data) = &mut self.data {
            let ItemTreeData {
                imports,
                extern_crates,
                extern_blocks,
                functions,
                ...
            } = &mut **data;

            imports.shrink_to_fit();
            extern_crates.shrink_to_fit();
            extern_blocks.shrink_to_fit();
            functions.shrink_to_fit();
            rules.shrink_to_fit();

Although you could simplify this code with a custom derive macro, a more general, simpler method might be helpful.
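As a point of comparison, the same effect can be approximated today without a proc macro by letting a declarative macro define the struct. This is only a sketch: arena_struct is a hypothetical name, and Vec is used in place of Arena so the example has no external dependencies.

```rust
// Declares a struct and generates a `shrink_to_fit` method that visits
// every field, so the field list is written exactly once.
macro_rules! arena_struct {
    (struct $name:ident { $($field:ident : $ty:ty),+ $(,)? }) => {
        #[derive(Default, Debug, PartialEq, Eq)]
        struct $name {
            $( $field: $ty, )+
        }

        impl $name {
            fn shrink_to_fit(&mut self) {
                // One call per declared field.
                $( self.$field.shrink_to_fit(); )+
            }
        }
    };
}

// Hypothetical, shortened version of ItemTreeData using Vec instead of Arena.
arena_struct! {
    struct ItemTreeData {
        imports: Vec<String>,
        functions: Vec<u32>,
    }
}
```

This keeps the field list in one place, but the struct definition has to live inside the macro invocation, which is part of what the proposed syntax tries to avoid.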

Possible Syntax - Idea:

To reduce these repetitions, one could add new syntax.

For the first enum matching case:

match it {
    // Override with custom behavior for Items::A.
    Items::A(it) => println!("{}", it.index),

    // Match all variants with exactly one tuple field of generic type T:
    impl<T: ToString> for Items::$name(T) => println!("{}:{}", stringify!($name), name.0.to_string()),
    // name would be like a local variable with the type of the matched variant, in this case an enum variant,
    // so name.0 might be a FileItemTreeId<*> object.
    // $name would expand to the name of the variant, so here it could be the ident A of Items::A.

    // Match all struct variants with exactly two fields named x and y of generic type T:
    impl<T: ToString> for Items::$name{x: T, y: T} => println!("value:{}", name.x.to_string()),
    // Match all variants whose first tuple field is of generic type T:
    impl<T: ToString> for Items::$name(T, _) => println!("value:{}", name.0.to_string()),
    // Match all struct variants with at least two fields named x and y of generic type T:
    impl<T: ToString> for Items::$name{x: T, y: T, _} => println!("value:{}", name.y.to_string()),
}

The impl keyword is a bit strange here, but in the end we implement a generic inline function for the pattern-matched type.

Another idea would be to use similar syntax or other macro expressions to iterate over all enum variants or struct fields and generate code for each entry:

fn test(v1: ItemTreeData, v2: Items) {
    // This might iterate over all fields of ItemTreeData and call the inline function for every matching field.
    impl<T: ToString> for ItemTreeData::$name: T in v1 => println!("{}:{}", stringify!($name), name.to_string());
    // This might statically go through all the enum variants of Items and call the inline function for each matching variant.
    impl<T: ToString> for Items::$name(T) => println!("{}", stringify!($name));
}

Or you could even add fields or variants based on other fields or variants of the type.

struct S {
    s1: i32,
    s2: i32,
    s3: i32,
    // Create a field for every variant in Items matching the pattern, with the name of the variant.
    // "s1", "s2" and "s3" would not match as these names are already used above.
    impl<T> for Items::$name(T) => $name: T,
}

What do you think about such an impl syntax? Would it be useful? Would it be possible to implement? In the end, too many complicated syntaxes for special cases are not good either.

This seems like it somehow starts to involve traits in the match. Is that correct?

The idea is to match multiple enum variants with a given structure and allow a trait-bound generic parameter:

match(value){
    //Match all MyEnumType variants with a FileItemTreeId<T> type tuple field. 
    //where T is a generic type specified e.g. by an impl statement 
    MyEnumType::$name(FileItemTreeId<T>)  => { }
}

and using a special syntax with impl or for to specify the generic type.

I like the concept in theory, but for the implementation it probably needs to wait for TAIT and "variable impl trait" to cook a bit more; I think the best way to handle unifying multiple match bindings with disparate types will look very similar to impl trait.

Personally, I think I'd suggest spelling it something along the lines of

fn print(it: Items) {
    match it {
        Items::A(it) | Items::B(it) | Items::C(it)
        where it: impl Debug => println!("{it:?}"),
        _ => unimplemented!()
    }
}

(You could emulate this today (without the opaque type) with a proc macro that duplicates the match arm for each alternative separately.) The where on a pattern would offer a place to ascribe the type of bound names, and we'd treat impl Trait in this position as a universal type (or "input existential") like APIT, monomorphizing the match arm as necessary for the different concrete types bound.
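To make the intended desugaring concrete, here is a hand-written sketch of what such an arm could expand to in today's Rust. The payload types and the helper names show and render are hypothetical; the shared arm body is pulled into a function taking impl Debug in argument position, which is monomorphized once per payload type.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for the enum under discussion.
enum Items {
    A(i32),
    B(String),
    C(Vec<u8>),
    D,
}

// Plays the role of the shared arm body. `impl Debug` in argument position
// makes the function generic, so one copy is instantiated per concrete type.
fn show(it: &impl Debug) -> String {
    format!("{it:?}")
}

// Each alternative of the or-pattern becomes its own arm, because the
// bindings have different types and cannot share one arm today.
fn render(it: &Items) -> Option<String> {
    match it {
        Items::A(it) => Some(show(it)),
        Items::B(it) => Some(show(it)),
        Items::C(it) => Some(show(it)),
        Items::D => None,
    }
}
```

A macro would only automate writing the three duplicated arms; the type checking per arm is the same as in this manual version.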

Richer structural matching over multiple variants is significantly more involved and a substantial new kind of feature. At a minimum, your use of $name as a variable relies on the conceptual feature of "enum variant types," which are still purely a (reasonably well-liked) idea at this point in time AIUI.

Not to focus too much on the (still very bike-sheddable) syntax, but shouldn't it be where it: Debug? If it isn't a typo, that would introduce a second variant of the where clause mini-language, as currently only traits can be used there.

As an aside, I do like this syntax better than the originally proposed one, as it's much clearer to me and much less wordy, and it seems to achieve much the same effect.

Either way, it's necessarily a new variant, because it is a binding name, not a type name.

So yes, I do mean $binding_name : $type. It's introducing effective lets for the bound names where a type can be given. where let it: impl Debug would perhaps be "nicer," but only really in the way that people generally want new things to have noisier syntax because they're not used to them yet.

But here you have to specify each variant manually. On the other hand, automatic variant name matching, as I suggested, could introduce too many "dependencies" between pieces of code: changing the enum's variants could change the behavior of other code that uses the enum.

While this is ultimately a different problem, I also thought about a way to "generate code" or "generate matching patterns", like a macro expanding over all variants of a type with a given structure, without needing a derive on the type. A language feature that generates code for matching enum variants (or even matching struct fields) could then be used to match multiple variants, call a function for all variants, generate fields, etc. But I must admit that

TYPE::$name(FileItemTreeId<T>)  => { } 

doesn't feel like anything is being generated here.

With macro_rules you could do something like this (very bike-sheddable syntax):

// A macro that can iterate through the variants (or fields) of a type
macro_rules! myCustomMacro for MyEnum::$name(T) where T: impl Debug {
    $x:expr => {
        // treat $name as a list of identifiers for the enum
        // where could one specify that $name
        $(
            MyEnum::$name(it) => { $x },
        )*
    };
}

You would have to either allow match arms generated from macros or put the match expression inside the macro, and you may not be able to override the behavior of custom variants above the generated code.

match items{
   myCustomMacro!(println!("{it}"), MyEnum)
}
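For comparison, here is a sketch of how far today's macro_rules gets when the macro wraps the whole match expression. Everything here is an assumption for illustration: the match_variants macro, its overrides/rest keywords, and the payload types are hypothetical, and the variant list still has to be spelled out because a macro cannot inspect the enum definition.

```rust
// Hypothetical enum with three single-field tuple variants.
#[derive(Debug)]
enum MyEnum {
    A(i32),
    B(String),
    C(bool),
}

// Wraps the entire `match`: hand-written override arms come first, then one
// generated arm per listed variant, all sharing the same body.
macro_rules! match_variants {
    (
        $value:expr;
        overrides { $($pat:pat => $over:expr),* $(,)? }
        rest [$($variant:ident),* $(,)?] => |$binding:ident| $body:expr
    ) => {
        match $value {
            $( $pat => $over, )*
            $( MyEnum::$variant($binding) => $body, )*
        }
    };
}

fn describe(value: &MyEnum) -> String {
    match_variants!(
        value;
        // Custom behavior for one variant, placed above the generated arms.
        overrides { MyEnum::A(n) => format!("number {}", n) }
        // Shared behavior for the remaining, explicitly listed variants.
        rest [B, C] => |it| format!("{:?}", it)
    )
}
```

Because the macro owns the match, overrides are possible in this form, at the cost of a custom mini-syntax at every call site.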

I'm not entirely sure about the syntax but I definitely see myself using this feature. I have plenty of code that uses macros for that case.
