[Pre-RFC] Multiple if-let guards (not chaining)

Summary

The if-let guard in match should be extended so that each alternative of an or-pattern can carry its own if-let guard.

// Existing if-let-guard
match foo {
    _ if let x = bar => {}
}

// Proposed if-let-guard-in-or-patterns
match foo {
    (A if let x = bar) | (B if let x = baz) => {}
}

Note the use of (...) for compatibility with existing programs/patterns.

Motivation

Given:

/// A pattern element.
pub(crate) enum PatternElement<T: PatternTypes> {
    Arrow,

    Identifier(usize),

    StringKey(usize),
    RegexKey(usize),
    ParameterKey(usize),
    KeySubtree(usize),
    ValueSubtree(usize),
    ApplyPredicate(usize, PhantomData<fn(&PatternConstants<T>) -> &Predicate<T>>),

    SkipStringKey(usize),
    SkipRegexKey(usize),
    SkipParameterKey(usize),
    SkipKeySubtree(usize),
    SkipValueSubtree(usize),
    SkipApplyPredicate(usize, PhantomData<fn(&PatternConstants<T>) -> &Predicate<T>>),

    End
}

// These use more memory than the above. They're easier to work with tho.
// TODO replace with (StringKey(x) if let skip = false) | (SkipStringKey(x) if let skip = true)
// (if those ever become a thing)
enum PatternElementHelper {
    Arrow,

    Identifier(usize),

    StringKey(usize, bool),
    RegexKey(usize, bool),
    ParameterKey(usize, bool),
    KeySubtree(usize, bool),
    ValueSubtree(usize, bool),
    ApplyPredicate(usize, bool),

    End
}

impl<T: PatternTypes> From<PatternElement<T>> for PatternElementHelper {
    fn from(a: PatternElement<T>) -> PatternElementHelper {
        match a {
            PatternElement::Arrow => PatternElementHelper::Arrow,

            PatternElement::Identifier(x) => PatternElementHelper::Identifier(x),

            PatternElement::StringKey(x) => PatternElementHelper::StringKey(x, false),
            PatternElement::SkipStringKey(x) => PatternElementHelper::StringKey(x, true),
            PatternElement::RegexKey(x) => PatternElementHelper::RegexKey(x, false),
            PatternElement::SkipRegexKey(x) => PatternElementHelper::RegexKey(x, true),
            PatternElement::ParameterKey(x) => PatternElementHelper::ParameterKey(x, false),
            PatternElement::SkipParameterKey(x) => PatternElementHelper::ParameterKey(x, true),
            PatternElement::KeySubtree(x) => PatternElementHelper::KeySubtree(x, false),
            PatternElement::SkipKeySubtree(x) => PatternElementHelper::KeySubtree(x, true),
            PatternElement::ValueSubtree(x) => PatternElementHelper::ValueSubtree(x, false),
            PatternElement::SkipValueSubtree(x) => PatternElementHelper::ValueSubtree(x, true),
            PatternElement::ApplyPredicate(x, _) => PatternElementHelper::ApplyPredicate(x, false),
            PatternElement::SkipApplyPredicate(x, _) => PatternElementHelper::ApplyPredicate(x, true),

            PatternElement::End => PatternElementHelper::End,
        }
    }
}

Allowing the syntax in that TODO comment would allow getting rid of PatternElementHelper entirely, with no additional runtime cost (either processing or memory). This feature would be an adaptation of explicit fallthrough for use with pattern matching.

By contrast, storing PatternElementHelper in place of PatternElement would cause a 33% increase in memory usage for the structs that contain PatternElements.
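
For comparison, here is a minimal sketch of how the shared-body behaviour can be approximated in today's Rust without the helper enum. `Element`, `handle_string_key` and `describe` are hypothetical names, reduced from the code above for illustration. Each alternative still needs its own arm to supply `skip`, which is the duplication the proposed syntax is meant to remove.

```rust
// Hypothetical, reduced stand-in for PatternElement (no generics, one key kind).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Element {
    StringKey(usize),
    SkipStringKey(usize),
    End,
}

// Shared body. Under the proposal, a single arm such as
//   (StringKey(x) if let skip = false) | (SkipStringKey(x) if let skip = true) => ...
// would run this logic with `x` and `skip` bound.
fn handle_string_key(x: usize, skip: bool) -> String {
    format!("string key {} (skippable: {})", x, skip)
}

fn describe(e: Element) -> String {
    match e {
        // One arm per alternative, each passing its own constant for `skip`.
        Element::StringKey(x) => handle_string_key(x, false),
        Element::SkipStringKey(x) => handle_string_key(x, true),
        Element::End => String::from("end"),
    }
}

fn main() {
    println!("{}", describe(Element::StringKey(3)));
    println!("{}", describe(Element::SkipStringKey(3)));
    println!("{}", describe(Element::End));
}
```

This keeps the compact enum and has no extra allocation, but the arm count still doubles for every Skip* pair.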

Guide-level explanation

[TODO]

Reference-level explanation

[TODO]

Drawbacks

This proposal introduces new syntax. It also requires look-ahead, as the proposed syntax is somewhat ambiguous with tuples.

Rationale and alternatives

This seems to be the simplest adaptation of fallthrough for use with pattern matching, with a good cost/benefit tradeoff. It expands on existing syntax instead of introducing completely novel syntax, avoids the pitfalls of implicit fallthrough, avoids introducing non-lexical flow control (such as goto), and correctly fulfills the need it is meant to fulfill.

A possible alternative could have been unreferenceable folded fields. However, being able to fold bools into the enum variant would be problematic for the same reason we don't support packed structs anymore. More importantly, there is no good way to support mutating folded fields through &mut Enum. For example,

enum Foo {
  A(fold bool, Mutex<()>),
  B(fold bool, Mutex<()>),
}
fn bar(foo: &mut Foo) {
  // how would you change A(false, mutex) into A(true, mutex), without unsafe code?
}
let mut x = Foo::A(false, Mutex::new(())); // so far so good
bar(&mut x);

Prior art

  • C and C++ have implicit fallthrough. It's such an issue that gcc and clang have a flag -Wimplicit-fallthrough to warn on implicit fallthrough, such that the programmer is required to add a // fall through comment where fallthrough is desired.
  • [TODO]

Unresolved questions

  • Are there other alternatives to this?
  • [TODO]

Future possibilities

Chaining if-let and/or allowing (...) around arbitrary parts of patterns is out of scope for this proposal.

Rust's match is not really the same as C's switch. Especially not in philosophy. Fallthrough has been proven to cause so much pain with little benefit that I really don't think it should be in Rust (or any modern programming language, really.)

It's not clear what goal PatternElementHelper achieves. What are you actually trying to do, and what goes wrong? Perhaps there is a simpler, already doable approach for achieving what you want – provided that the above piece of code is a real-life example.

Note that the comment/thread linked below is where this thread's proposal came from; it helps explain the context a bit more.

We don't know how to address this more clearly in the rationale. It is precisely because of those issues that we're not proposing fallthrough, but a "philosophical alternative" (to borrow your words) to it.

This thing is part of a VM/interpreter. The goal is to keep the function objects small (by keeping the opcodes small), while being able to inherit behaviour across opcodes.

The original code looks something like this:

class StringKey(PatternElement):
    """The 'literal' token."""

    def __init__(self, toks):
        self.key = toks[0]
        self.skippable = toks[1] == '?'

    def on_in_key(self, frame, path, defs):
        return self.on_not_in_key(frame, path, defs)

    def on_not_in_key(self, frame, path, defs):
        path[-1].iterator = self._extract(path[-1].parent)
        path[-1].empty = False
        return True

    def _extract(self, obj):
        try:
            yield (self.key, obj[self.key])
        except (TypeError, IndexError, KeyError):
            if not self.skippable:
                raise exceptions.ValidationError

For memory usage reasons we're converting it to use enums + a constant pool. This means matching the whole thing. It's less than ideal having to duplicate the code between the Skip* and the non-Skip* variants.

The claimed 33% memory increase doesn't appear to hold, at least in your particular example. Both enums have the same size:

size_of::<PatternElement>() = 16
size_of::<PatternElementHelper>() = 16

Wait, bools aren't usize?

No, bool has one-byte width and alignment.
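
A small sketch to check this, using hypothetical reduced stand-ins (`Plain`, `WithFlag`) for the two enums in the thread. The usize payload forces the whole enum to usize alignment, so the one-byte bool can sit in space that would otherwise be padding next to the tag. The layout of default-repr enums is not guaranteed by the language, so treat the printed enum sizes as what current rustc happens to do.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for PatternElement: separate Skip* variants, no flag.
#[allow(dead_code)]
enum Plain {
    Arrow,
    StringKey(usize),
    SkipStringKey(usize),
    End,
}

// Hypothetical stand-in for PatternElementHelper: one variant plus a bool flag.
#[allow(dead_code)]
enum WithFlag {
    Arrow,
    StringKey(usize, bool),
    End,
}

fn main() {
    // bool is one byte wide with one-byte alignment.
    println!("bool: size {}, align {}", size_of::<bool>(), align_of::<bool>());
    // Both enums are rounded up to a multiple of usize's alignment.
    println!("Plain: {}", size_of::<Plain>());
    println!("WithFlag: {}", size_of::<WithFlag>());
}
```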

This feels like just a logical combination of RFC 2535 + RFC 2294 and moving guards from being something match specific to being part of general patterns.

And enum tags aren't usize?

By default, an enum is tagged with the smallest-sized integer that is large enough to represent every variant. (For example, u8 if there are fewer than 256 variants.)

The compiler can also pack the enum tag into unused bit patterns in types like char or bool or &T, in some cases, though this doesn't affect the types in your example.
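
Both points can be observed with size_of. This is a minimal sketch; `Small` is a hypothetical enum introduced only for illustration, and the results reflect current rustc behaviour for default-repr types rather than a language guarantee.

```rust
use std::mem::size_of;

// Hypothetical fieldless enum with far fewer than 256 variants.
#[allow(dead_code)]
enum Small {
    A,
    B,
    C,
}

fn main() {
    // The tag alone fits in a single byte.
    println!("Small: {}", size_of::<Small>());
    // bool only uses the bit patterns 0 and 1, so None can reuse a spare one.
    println!("bool: {}, Option<bool>: {}", size_of::<bool>(), size_of::<Option<bool>>());
    // char has unused bit patterns above 0x10FFFF, so None fits there as well.
    println!("char: {}, Option<char>: {}", size_of::<char>(), size_of::<Option<char>>());
}
```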

As an example of this optimization, this code:

enum EnumRef<'a, T> {
    Foo,
    Bar(&'a T),
}

enum EnumOwned<T> {
    Foo,
    Bar(T),
}

fn main() {
    println!("usize: {}", core::mem::size_of::<usize>());
    println!("EnumRef<usize>: {}", core::mem::size_of::<EnumRef<usize>>());
    println!("EnumOwned<usize>: {}", core::mem::size_of::<EnumOwned<usize>>());
}

prints this:

usize: 8
EnumRef<usize>: 8
EnumOwned<usize>: 16

As you can see, even though EnumRef<usize> has two variants and thus is conceptually "bigger" than usize, it actually takes up the same amount of space in memory. That's because Rust references are known to never be NULL – the compiler knows that and can use 8 bytes of zeros to represent the EnumRef::Foo variant and all other memory states to represent EnumRef::Bar(some_val).

On the other hand, EnumOwned<usize> can't be optimized this way because then you wouldn't be able to represent values like EnumOwned::Bar(0).

If anyone wants to close this or if anyone has another use-case for this and wants to take it over, feel free.