Keyword idea: Boosting enum functionality

MrFaul · January 19, 2024, 9:17pm

One major goal for me is to make this stuff more accessible for newcomers.
By anchoring this in actual Rust syntax it becomes far easier to conceptually grasp than macros and derived stuff.

These three cases are fairly common tasks.
This way autocomplete can handle a lot of the boilerplate.
The structure alone gives a lot of hints how to use it.

When I started to learn Rust a while back, I was stoked to get exhaustive matches, but also confused why this is pretty much the only place where this happens.

Sure when you write your own enums you get warnings galore if you have variants that are not used.
But if you use stuff you haven't written yourself it's easy to miss details.

idanarye · January 20, 2024, 1:27am

I'm not sure what you mean by "putting in the correct default/dummy data". Not all types have a Default, and not all types have reasonable dummy data (what would be the one for std::fs::File?). But whatever you think I meant - that's probably not it, because "the compiler yells at you if the field-type changes" is a property of your suggested syntax (where you put the types themselves) and not mine (where you use _ and .. to ignore the values). I also don't see the virtue of that property. I want the compiler to yell at me if a variant is added or removed because the point of an enum is that either the entire enum is handled as a single "black box" or (which is the case for every) different variants are handled differently. But if the type of a field inside a variant is changed, and it doesn't otherwise break the block, then why should I care? And why should I pay the price of having to redundantly repeat the type inside the every statement?

No. That was not my intention here.

The idea behind my suggested syntax is to allow nested "pattern matching" on the fields of the variants. If a variant has a field that is also an enum, then you can (but don't have to) have a block for each of its variants, without using a nested every. This also means that you can use every on a tuple of enums.

And I don't think this suggestion adds complexity to every, because this exact thing can already be done, with the exact same syntax, with match. The only difference is that the values must be ignored, which also means that guards cannot be used.

Even if my suggestion gets rejected, I don't think forcing the user to write the types of the variants inside every arms is a good idea. It's syntactic salt, and one that does not offer valuable protection. Also, it's a new syntax that is not already used somewhere else - so it does add more complexity. It's better to just ignore the variants' payload completely in every statements' arms.

zackw · January 20, 2024, 4:19am

I needed case 3 for value-like enums (i.e. none of the variants holds any data) for tests a couple months ago. I asked about it over on URLO and someone came up with this macro:

macro_rules! exhaustive_list {
    ($E:path; $($variant:ident),* $(,)?) => {
        {
            use $E as E;
            let _ = |dummy: E| {
                match dummy {
                    $(E::$variant => ()),*
                }
            };
            [$(E::$variant),*]
        }
    }
}

used like so:

enum Bool {
	False,
	True,
}

let all_variants = exhaustive_list![Bool; False, True];
// => let all_variants = [Bool::False, Bool::True];

let all_variants = exhaustive_list![Bool; False];
// => error: `Bool::True` not covered

In principle this could be extended to any enum for which one can write an expression that constructs a value of each variant, but I don't quite see how to do it with macro_rules! because I'm not aware of any way to adjust the expansion of each arm of the match dummy based on whether the variant being matched contains data.
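One possible way around this, as a rough sketch: have the caller spell out both a pattern and a constructor expression for each variant. The patterns feed the exhaustiveness check and the expressions build the list. The names exhaustive_list_with! and Shape are made up for illustration and are not part of the macro from URLO. Note that nothing checks that each expression actually constructs the variant its pattern matches.

```rust
// Hypothetical variant of exhaustive_list! taking `pattern => constructor` pairs.
macro_rules! exhaustive_list_with {
    ($E:ty; $($pat:pat => $ctor:expr),* $(,)?) => {
        {
            // Never called: it exists only so the compiler checks
            // that the listed patterns cover every variant.
            let _ = |dummy: $E| {
                match dummy {
                    $($pat => ()),*
                }
            };
            [$($ctor),*]
        }
    };
}

// Illustrative enum with unit, tuple, and struct variants.
#[derive(Debug, PartialEq)]
enum Shape {
    Point,
    Circle(u32),
    Rect { w: u32, h: u32 },
}

fn all_shapes() -> [Shape; 3] {
    exhaustive_list_with![Shape;
        Shape::Point => Shape::Point,
        Shape::Circle(_) => Shape::Circle(1),
        Shape::Rect { .. } => Shape::Rect { w: 1, h: 2 },
    ]
}
```

Removing any of the three pairs makes the hidden match non-exhaustive, so the compiler rejects the call.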

MrFaul · January 20, 2024, 8:01am

Ohhhh, well that makes much more sense now than what I initially thought.
Good point.

And I guess you're right, any respectable code editor will tell you anyway what's supposed to be in there.
So you might as well discard it when not needed.

dlight · January 20, 2024, 10:51am

Not sure if this has been discussed or proposed here or as an RFC, but this particular feature would be very nice, like the dual of exhaustive pattern matching.

zackw · January 20, 2024, 4:20pm

Your concrete syntax suggestion for case 2 is no good because match input as Enum { ... } already means something -- it's the same as match (input as Enum) { ... }.
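To illustrate the existing meaning, here is a small sketch in current Rust (the Level enum and describe function are made up for illustration). The scrutinee is a cast expression, so the arms match on the resulting u8, not on the enum's variants:

```rust
// Illustrative field-less enum with explicit discriminants.
enum Level {
    Low = 1,
    High = 2,
}

fn describe(level: Level) -> &'static str {
    // Parsed as `match (level as u8) { ... }`: cast first, then match on the u8.
    match level as u8 {
        1 => "low",
        2 => "high",
        _ => "unknown",
    }
}
```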

Also, I think limiting case 2 to match expressions and values of the enum type itself is too restrictive. Here's some actual code of mine that could benefit from case 2 support:

use std::str::FromStr;

#[non_exhaustive]
pub enum ChecksumAlg { Blake3, SHA256, }
pub struct ParseChecksumAlgError;
impl FromStr for ChecksumAlg {
    type Err = ParseChecksumAlgError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept "blake3" / "blake-3", case-insensitively.
        if s.is_char_boundary(5) {
            let (prefix, suffix) = s.split_at(5);
            if prefix.eq_ignore_ascii_case("blake")
                && (suffix == "3" || suffix == "-3")
            {
                return Ok(Self::Blake3);
            }
        }
        // Accept "sha256" / "sha-256", case-insensitively.
        if s.is_char_boundary(3) {
            let (prefix, suffix) = s.split_at(3);
            if prefix.eq_ignore_ascii_case("sha")
                && (suffix == "256" || suffix == "-256")
            {
                return Ok(Self::SHA256);
            }
        }
        Err(ParseChecksumAlgError)
    }
}

There might be a better way to write this but that's not important right now; the point is that a series of "if this, return that" checks with fallthrough on failure is a totally natural way to write a parser. Also that what you're returning might be a derived value, e.g. Result<Enum, Failure> and the error case(s) should not interfere with the check.

I'm thinking that a better way to express case 2 would be with an attribute, applicable to both functions and single expressions:

impl FromStr for ChecksumAlg {
    type Err = ParseChecksumAlgError;
    #[exhaustive_value]
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        ...
    }
}

With no arguments, #[exhaustive_value] requires there to be at least one live control flow path from entry to exit of the function (or expression) that produces each possible variant of the function's (expression's) return type, recursively. In this case, Ok(Self::Blake3), Ok(Self::SHA256), and Err(ParseChecksumAlgError) must all be possible returns.

#[exhaustive_value] takes optional keyword arguments, there currently being only one, exclude = <pattern>. Variants that match the pattern are required not to be returnable. For instance

impl FromStr for ChecksumAlg {
    type Err = ParseChecksumAlgError;
    #[exhaustive_value(exclude = Err(_))]
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        ...
    }
}

the body of from_str must have live control flow paths that return all variants of Ok(Self) but must not return any Err() values. (Once type Err = ! works in this context, #[exhaustive_value] should automatically exclude variants that are or contain the never type.)

("Live control flow path" needs to be given a precise definition that does not depend on optimization level.)

zackw · January 20, 2024, 4:27pm

Not everyone uses IDEs, for concrete, legitimate reasons such as:

- they can't keep up with people's typing rate
- they drain laptop batteries too fast (last time I tried rust-analyzer it halved the battery runtime of the machine I'm typing this on)
- they take up too much disk space on shared servers
- bad experiences with buggy older IDEs, possibly decades in the past (I tried Eclipse once, circa 2003; I fed it all of GCC and it crashed; I uninstalled it)

zackw · January 20, 2024, 4:45pm

Thinking about case 3 some more: for value-like enums

enum Foo { A, B, C, }

the ideal thing, IMHO, would be to let you write

for variant in Foo {
    match variant {
        A => ...,
        B => ...,
        C => ...,
    }
}

No ceremony and the only new syntax is to allow a type by itself as the in expression of a for-loop. If this is not feasible, then perhaps a built-in derive macro that allows for variant in Foo::into_iter(), or something like that, on enums that opt in.

For enums with variants that hold data, you need to be able to construct a value of each variant, which isn't always possible. I do not have a good idea, but I think something like the exhaustive_list! macro I posted earlier is a better approach than anything that involves adding a new keyword and a new expression structure.
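For comparison, here is a minimal sketch of what can be written by hand today for a value-like enum. The associated constant ALL and the names function are made up for illustration; an opt-in derive like the one suggested above would presumably generate something similar. The list itself is not checked for exhaustiveness, only the match is.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Foo {
    A,
    B,
    C,
}

impl Foo {
    // Hypothetical hand-maintained list of all variants.
    // Nothing forces this to be updated when a variant is added.
    const ALL: [Foo; 3] = [Foo::A, Foo::B, Foo::C];
}

fn names() -> Vec<&'static str> {
    let mut out = Vec::new();
    for variant in Foo::ALL {
        // The match is still checked for exhaustiveness by the compiler.
        out.push(match variant {
            Foo::A => "a",
            Foo::B => "b",
            Foo::C => "c",
        });
    }
    out
}
```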

CAD97 · January 20, 2024, 11:14pm

Something like [playground]

macro_rules! every {{
    $Enum:ident {
        $($Variant:ident => $handler:expr),* $(,)?
    } 
} => {
    if false {
        // ensure exhaustiveness
        match (|| -> $Enum {unreachable!()})() {
            $($Enum::$Variant { .. } => unreachable!(),)*
        }
    } else {
        // actually do work
        $($handler;)*
    }
}}

The Record { .. } pattern syntax works for any variant kind (unit, tuple, struct). Alternatively, take $:pat in the macro and use the full language of pattern exhaustiveness. (Uses the immediately invoked closure to avoid undesired unreachable code warnings.)

This is pretty much a perfect application of macros IMHO.
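A usage sketch for this macro, with the macro repeated so the snippet is self-contained. The Event enum and the run_all function are made up for illustration. Every handler runs once, in the order written, and leaving out a variant makes the hidden match fail to compile:

```rust
macro_rules! every {{
    $Enum:ident {
        $($Variant:ident => $handler:expr),* $(,)?
    }
} => {
    if false {
        // ensure exhaustiveness
        match (|| -> $Enum {unreachable!()})() {
            $($Enum::$Variant { .. } => unreachable!(),)*
        }
    } else {
        // actually do work
        $($handler;)*
    }
}}

// Illustrative enum with one variant of each kind.
#[allow(dead_code)]
enum Event {
    Start,
    Data(u32),
    Stop { code: i32 },
}

fn run_all() -> Vec<&'static str> {
    let mut log = Vec::new();
    every!(Event {
        Start => log.push("start"),
        Data => log.push("data"),
        Stop => log.push("stop"),
    });
    log
}
```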

FZs · January 21, 2024, 1:30pm

I agree. I actually proposed a very similar macro but it got buried under the other discussion.

FZs:

macro_rules! every{
    ($enum_type:ty { $($pattern:pat => $code:expr),* $(,)? }) => {{
        fn _check_exhaustive(v: $enum_type) {
            match v {
                $( $pattern => () ),*
            }
        }
        
        $( $code; )*
    }}
}

IMHO, this is a niche enough use case that new syntax would not be worth the cost. Whether std should have this macro may be up for debate, although I think it falls in the "perfectly fine as a crate" category.

toc · January 23, 2024, 2:36pm

This is the most interesting option here to me, and one obviously not covered by iteration [1]. I think it would be interesting to try and write down the semantics beyond toy examples, though:

every Foo {
    Bar(Some(0..10)) => ...,
    Bar(Some(10..)) => ...,
    Bar(None) => ...,
    Baz(_) => ...,
}

This kind of makes it seem like this would be a thing similar to match, but divergent in many ways.

[1] I don't want to iterate a bajillion things

jpleyer · February 20, 2024, 9:32am

Sorry if this is a bit of a necro-post but I just thought about this problem again.

Motivation

To me the main difference between the use case of a regular enum and the approach here is that we are looking for a "unique (ordered) list" of possibly different types. The enum uses the name "variant", but I do not think that this necessarily fits here. I can imagine a world where we want a collection that does not hold "the same type of object in different variations" but rather actually distinct types.

I believe we should think of it in terms of "iterating over types".

Problems

Implementation

The problem with these lines

is that they generate a huge amount of code. Consider that for the first variant alone we have u32::MAX + 1 possible values, and variants such as Bar(i32, i32) yield a combinatorial explosion. Even if there is only one line on the right-hand side, the compiler will emit u32::MAX + 1 lines of code (in the first case).

Generics

Consider this enum

enum Foo<T> {
    Bar(T)
}

how do we use the every keyword here? We do not know how to exhaustively match the generic parameter T. The syntax is not easily generalizable.

Scopes

If we want to use the variables created in the every block, they would have to be in scope for the remaining code. However, unlike with match, we cannot define one unique output value for all our blocks, since every single one of them is executed. Still, even if items are in a different scope than our current program, the following should work

let i;
every Foo {
    Bar => {i=0}
}

but this would fail

let i;
every Foo {
    Bar1 => {i=0},
    Bar2 => {i=0},
}

because the immutable binding i would be assigned twice.
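The same behaviour can be observed in current Rust if one assumes that the arms of every simply run as sequential blocks (an assumption about the proposal, not something it specifies). The function names below are made up for illustration:

```rust
// One block: deferred initialization of an immutable binding is fine.
fn one_block() -> i32 {
    let i;
    {
        i = 0;
    }
    i
}

// Two blocks: the binding has to be `mut`, otherwise the second
// assignment is rejected (error E0384: cannot assign twice to immutable variable).
#[allow(unused_assignments)]
fn two_blocks() -> i32 {
    let mut i;
    {
        i = 0;
    }
    {
        i = 1;
    }
    i
}
```

So under that assumption the two-arm case would need let mut i, and the value left in i would be the one assigned by the last arm.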

Iterating over Types

Some of the problems above could be solved by this approach. I am not sure that the syntax is very nice, so feel free to critique. This is how I imagine this iteration would look.

enum State<const T: usize> {
    Number<T>,
    V2,
}

typelist MyTypeList = [
    u8,
    State::Number<1>,
    State::Number<42>,
    State::V2,
    bool,
    Messenger
];

fn main() {
    every MyTypeList {
        u8 => {...},
        State::V2 => {...},
        _ => {/* Does the same for the remaining stuff */}
    }
}

The scoping and assignment problems still persist.

idanarye · February 21, 2024, 10:44pm

Why should it? The purpose of the patterns in an every statement (or should it be an expression?) is typechecking only. So:

every Foo {
    Bar(Some(0..10)) => { println!("1"); }
    Bar(Some(10..)) => { println!("2"); }
    Bar(None) => { println!("3"); }
    Baz(_) => { println!("4"); }
}

Will be desugared (after typechecking) to:

{ println!("1"); }
{ println!("2"); }
{ println!("3"); }
{ println!("4"); }

It won't have to do { println!("1"); } 10 times, and it certainly won't have to do { println!("2"); } four billion times. Each block will only have to be emitted once.
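This behaviour can be approximated today with the every! macro FZs posted earlier in the thread, repeated here so the snippet is self-contained. The field types of Foo and the run_every function are assumptions made for illustration, variants are written with their full path because the check lives in a nested fn, and 0..=9 stands in for 0..10 to avoid relying on exclusive range patterns. Each handler runs exactly once, regardless of how many values its pattern covers:

```rust
macro_rules! every {
    ($enum_type:ty { $($pattern:pat => $code:expr),* $(,)? }) => {{
        // Never called: only there so the patterns are checked for exhaustiveness.
        fn _check_exhaustive(v: $enum_type) {
            match v {
                $( $pattern => () ),*
            }
        }

        $( $code; )*
    }}
}

// Assumed shape of Foo; the thread does not spell out the field types.
#[allow(dead_code)]
enum Foo {
    Bar(Option<u32>),
    Baz(bool),
}

fn run_every() -> Vec<u32> {
    let mut out = Vec::new();
    every!(Foo {
        Foo::Bar(Some(0..=9)) => out.push(1),
        Foo::Bar(Some(10..)) => out.push(2),
        Foo::Bar(None) => out.push(3),
        Foo::Baz(_) => out.push(4),
    });
    out
}
```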

jpleyer · February 23, 2024, 12:23am

Okay, I misunderstood your example. But this yields a new type of problem: I thought of this new every keyword as making every variant of an enum produce some kind of code. For an enum with N variants, we expect to have N code blocks. With this approach, we can have a variable amount of generated code. This code

every Foo {
    Bar(Some(0..10)) => { println!("1") },
    Bar(Some(10..20)) => { println!("2") },
    Bar(Some(20..)) => { println!("3") },
    Bar(None) => { println!("4") },
    Baz(_) => { println!("5") },
}

will produce 5 blocks (compared to the 4 from before and 3 if we only consider different types).

To me this is not what the initial design was aiming for. We want to enforce that every variant of an enum is covered exactly once (in my opinion) with the every keyword. But now we cover the same variant 3 times. This is also not conditional since the compiler will emit every statement irrespective of the actual values for the range x..y. I do not really see what the every keyword provides in terms of language benefits.

Conceptually we are mixing information known at compile time (such as types) with runtime information (such as the exact value of i in Bar(Some(i))), but in the end the generated code does not depend on this mixture. Thus the following code will produce results equivalent to the one above.

every Foo {
    Bar(Some(0)) => { println!("1") },
    Bar(Some(1)) => { println!("2") },
    Bar(Some(2..)) => { println!("3") },
    Bar(None) => { println!("4") },
    Baz(_) => { println!("5") },
}

But when reading the ranges, one would assume that they differ. I find this problematic and I think that inline comments serve a similar purpose in this case.

The problem described in my earlier response when considering generics also still persists. These are all not problems if we consider iteration over a list of types.

system · May 23, 2024, 12:24am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
