Mini idea: slice length patterns

In today's Rust, you can use pattern-matching to check that a slice has a specific length:

match slice {
    &[_, _, _] => println!("slice has length 3"),
    _ => (),
}

Or check that it has a minimum length:

match slice {
    &[_, _, .., _] => println!("slice has length at least 3"),
    _ => (),
}

But you can't constrain its maximum length without an if guard (that exhaustiveness checking doesn't understand). And if the length you want to check for is large, the syntax becomes unwieldy:

match slice {
    &[_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _] => println!("slice has length 21"),
    _ => (),
}

To remedy this, the syntax for specifying array lengths could be adapted to work with slice patterns:

match slice {
    &[..; 21] => println!("slice has length 21"),
    _ => (),
}
match slice {
    &[..; 13..=23] => println!("slice has length between 13 and 23"),
    _ => (),
}
match slice {
    &[.., 0; 13..=23] => println!("slice has length between 13 and 23, with last element 0"),
    _ => (),
}
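
For comparison, here is a rough sketch of how the three checks above can be spelled in stable Rust today with guards on len(). The function name `describe` and the returned strings are made up for illustration; the guards are exactly the part that exhaustiveness checking cannot see through.

```rust
/// Hypothetical helper: classifies a slice using today's guard-based checks.
fn describe(slice: &[i32]) -> &'static str {
    match slice {
        // Proposed spelling: &[.., 0; 13..=23]
        [.., 0] if (13..=23).contains(&slice.len()) => "13..=23 long, last element 0",
        // Proposed spelling: &[..; 21]
        _ if slice.len() == 21 => "length 21",
        // Proposed spelling: &[..; 13..=23]
        _ if (13..=23).contains(&slice.len()) => "13..=23 long",
        _ => "something else",
    }
}
```

Because every length check lives in a guard, the compiler treats each of these arms as possibly failing, which is why the final wildcard arm is always required.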

One question I have is how can this compose with normal slice patterns?

Can the following work?

match slice {
    &[first, ..; 30, last] => { ... }
}

I would expect that to be written [first, .., last; 31].
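
For what it's worth, a pattern like [first, .., last; 31] can be approximated today by converting the slice to an array reference first. This is only a sketch; `ends_of_31` is a hypothetical name, and the element type i32 is an arbitrary choice.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: returns the first and last element
/// if the slice has exactly 31 elements.
fn ends_of_31(slice: &[i32]) -> Option<(i32, i32)> {
    // The conversion to `&[i32; 31]` performs the length check...
    let array = <&[i32; 31]>::try_from(slice).ok()?;
    // ...so the pattern on the array itself is irrefutable.
    let [first, .., last] = array;
    Some((*first, *last))
}
```

The length lives in the type here, so no guard is needed once the conversion has succeeded.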


Is the .. required? If not, what's the behavior of a length mismatch?

In particular, what is the behavior or error here?

fn hmm(arr: &[i32; 256]) {
    match arr {
        [0; 256] => {}
        [1; 256] => {}
        _ => {}
    }
}

And would it preclude this from ever working:

fn hmm<const N: usize>(arr: &[i32; N]) {
    match arr {
        [0; N] => {}
        [1; N] => {}
        _ => {}
    }
}

I would expect any length mismatch to be a compile error. So [1; 17] is an error, and so are [1, 2; 12..] and [1, 2, .., 3, 7; ..=3].

I suppose [1; 256] having a different meaning in expression and pattern position would be confusing, though... Good catch! (It wouldn't be the first such case; 0 | 1 also has different meaning in expression vs pattern position.)
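
A small illustration of the existing 0 | 1 case, with hypothetical function names:

```rust
/// Expression position: `|` is bitwise OR, so `0 | 1` is just the value 1.
fn as_expression() -> u8 {
    0 | 1
}

/// Pattern position: `0 | 1` is an or-pattern that matches either 0 or 1.
fn as_pattern(x: u8) -> bool {
    matches!(x, 0 | 1)
}
```

So comparing a value with the expression 0 | 1 only accepts 1, while matching against the pattern 0 | 1 accepts both 0 and 1.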


With generic constants, this would still be expressible (though far more verbose):

fn hmm<const N: usize>(arr: &[i32; N]) {
    const ZEROS<const N: usize>: [i32; N] = [0; N];
    const ONES<const N: usize>: [i32; N] = [1; N];
    match arr {
        ZEROS<N> => {}
        ONES<N> => {}
        _ => {}
    }
}
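
Since generic constants are not stable, a sketch of how the same three-way check can be written on stable Rust is to compare against repeat expressions instead of matching. The string return values are an assumption added so the result can be observed.

```rust
/// Stand-in for `hmm`: reports which arm would have been taken.
fn hmm<const N: usize>(arr: &[i32; N]) -> &'static str {
    if *arr == [0; N] {
        "all zeros" // corresponds to the `[0; N]` arm
    } else if *arr == [1; N] {
        "all ones" // corresponds to the `[1; N]` arm
    } else {
        "other"
    }
}
```

This loses the pattern-matching form, but [0; N] keeps its usual expression meaning, so there is no ambiguity to resolve.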

I'm confused. How is the meaning different?

In expression position, [1; 256] is an array of 256 1's. In pattern position under this proposal, it's an array of length 256 that contains a single element equal to 1 (this being an impossibility, the pattern would be rejected).


Right, it precludes shorthand literal array syntax in patterns directly.[1] Perhaps more of a diagnostic issue than anything. At least, if the "unsatisfiable length is an error" approach is taken.

(The current diagnostics are already funky, as they assume you meant , instead of ; and soldier on mumbling about length 2 mismatches and potentially type mismatches.)


I was thinking it's somewhat nicer if the length and slice portions are independent, in a trivial bounds sort of way. But I also think it would be pretty foot-gunny to allow overlap between array shorthand and slice length patterns without the unsatisfiable length errors; worthy of a default deny or at least warning lint.

Or just require .. if there's a length pattern.

But it turns out there's some precedent for erroring on unsatisfiable lengths already. So perhaps the thought is mostly "Alternatives" fodder.


  1. Unless they happen to be length 1, sorta.


As I've written elsewhere, variadic generics could potentially benefit from relaxing that restriction, so I don't think it's a crazy idea.

What about allowing the "semicolon repeat" number after any element, but it only applies to the immediately preceding element?

Then [1; 256] means the same thing in pattern or expression position. The only issue is that the number after the semicolon is no longer always the length of the whole array / slice.

But I think that opens things up a bit because then you can do things like [start @ _; 16, ..]
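
One plausible reading of [start @ _; 16, ..] is "bind the first 16 elements as an array, ignore the rest". Under that assumption, the following sketch shows a rough equivalent in today's Rust; `split_first_16` and the u8 element type are hypothetical choices.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: splits off the first 16 elements as an array reference.
fn split_first_16(slice: &[u8]) -> Option<(&[u8; 16], &[u8])> {
    if slice.len() < 16 {
        return None; // `split_at` would panic on a too-short slice
    }
    let (head, rest) = slice.split_at(16);
    // `head` has exactly 16 elements, so this conversion succeeds.
    let start = <&[u8; 16]>::try_from(head).ok()?;
    Some((start, rest))
}
```

Recent standard library versions also offer slice::split_first_chunk, which covers the same ground in a single call.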

Related comment about the interaction of (exclusive/from) slice patterns and range patterns. (Though ..= is stable in slice patterns and doesn't mean "ranging slice".)


Nesting ; "inside" , isn't consistent with English or intuition (the less prominent separator should nest inside the more prominent one).

Other possible syntaxes would be [..[..start; 16], ..] or [start @ ..[..; 16], ..], though those are pretty ugly...


I don't think it's that bad. Colons are usually only allowed before a list in English. Plus, in most cases it's going to be at the end anyways. But if people hate it that much, another alternative is putting them in parens:

match slice {
    &[1; 10] => ...
    &[(2; 11), last] => ...
}

I think making .. mandatory when there's a length mismatch would help to disambiguate it.

Mandatory .. would be more in line with structs. Not having to write [_, _, _, _, _] is a different feature from a syntax sugar for matching against [1, 1, 1, 1, 1]. In structs you have .. to skip fields, but there isn't any syntax to match all fields against a value.


It could be a monomorphisation-time error, e.g. [1,2,3,4,5,..; N] where N < 5. Or rather, if it wasn't an error, it'd be weird to silently allow an impossible pattern.
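
To make "monomorphisation-time error" concrete, here is a sketch of how such a check can already be expressed today with an associated constant. The names `AtLeastFive` and `starts_with_1_to_5` are hypothetical, and this only imitates what the compiler might do for the proposed pattern.

```rust
/// Hypothetical marker type carrying a per-`N` compile-time check.
struct AtLeastFive<const N: usize>;

impl<const N: usize> AtLeastFive<N> {
    // Evaluated once per concrete `N`; fails the build when N < 5.
    const OK: () = assert!(N >= 5, "pattern needs at least 5 elements");
}

/// Rough equivalent of matching `[1, 2, 3, 4, 5, ..; N]`.
fn starts_with_1_to_5<const N: usize>(arr: &[i32; N]) -> bool {
    // Referencing the constant forces the check for this `N`.
    let () = AtLeastFive::<N>::OK;
    arr.starts_with(&[1, 2, 3, 4, 5])
}
```

Calling this with an array shorter than 5 is rejected when the function is instantiated for that N, not when the generic function is type-checked.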


(Edit: to be clear, I'd be fine if patterns like [1; 4] just failed to compile; slice len patterns would be valuable in any case.)

I’d expect that [1; 4] would mean the same in pattern and expression positions, i.e. shorthand for [1, 1, 1, 1]. [1, ..; 4] as a pattern could mean any 4-len array/slice with first element 1. The [x; N] pattern would then admittedly be a special case, being the "repeat x" pattern, whereas [1, 2; 4] would fail to compile. [1, 2, 3, 4; 4] could compile but would be redundant.

More generally, the [<pat>; 4] pattern could mean a 4-len array/slice each element of which matches <pat>: [1..; 4] would mean a 4-len array/slice with every element at least one, but it might be too confusing/error-prone :< Also, it would be unclear what [a; 4] (or [a @ <pat>; 4]) means – what would a bind to and what would its type be?
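
As a sketch of the "every element matches the pattern" reading, this is roughly what [1..; 4] would check, written with tools that exist today. The name `four_positive` and the u32 element type are assumptions for the example.

```rust
/// Emulates the hypothetical `[1..; 4]`:
/// length is exactly 4 and every element is at least 1.
fn four_positive(slice: &[u32]) -> bool {
    slice.len() == 4 && slice.iter().all(|&x| matches!(x, 1..))
}
```

Note that this form has no obvious place for a binding, which mirrors the open question about what a in [a; 4] would bind to.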

(On the expression side it would be nice if splats like [1, ..Default::default(); N] worked and mirrored the corresponding pattern, but ..x of course conflicts with the RangeTo constructor – this is why I’d like ... to be repurposed to a splatting operator now that it’s free. It could also be confusing why [2, ...1; 10] doesn’t work and you’d have to write something like [2, ...[1; _]; 10] although it’s consistent with how struct update syntax works.)


If you just want to check a slice's length, just call its len() method. I don't see the point of this feature. The use cases where one uses slice patterns usually involve extracting the respective elements from the slice, and the proposed feature doesn't have any affordances for that. What would be the use cases where your new patterns would actually be the best solution?


The main benefit is being able to convert a slice to an array if the length matches.

fn foo(array: &[i32; 4]) {
    // do stuff
}

fn baz(slice: &[i32]) {
    match slice {
        array @ [..; 4] => foo(array),
        _ => { /* do something else */ }
    }
}

Though in many cases you can also use the TryFrom/TryInto &[T] -> &[T; N] conversion.
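
A minimal sketch of that conversion, reusing the foo/baz names from the example above. The return values are an assumption added so the outcome is observable.

```rust
use std::convert::TryFrom;

fn foo(array: &[i32; 4]) -> i32 {
    // Stand-in for "do stuff": sum the four elements.
    array.iter().sum()
}

fn baz(slice: &[i32]) -> Option<i32> {
    // The conversion succeeds only when the slice has exactly 4 elements.
    match <&[i32; 4]>::try_from(slice) {
        Ok(array) => Some(foo(array)),
        Err(_) => None, // do something else
    }
}
```

This covers the exact-length case, but unlike the proposed patterns it cannot be combined with other slice patterns in the same match arm.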