Better integral matching


#1

In the pipeline since some time there is a nice “Exhaustive integer matching” PR (that does something I assumed was done by the first Rust 1.0 I used):

It’s the first step to put the correctness of Rust pattern matching code in line with the level of correctness and reliability you expect from the rest of Rust code. Now you don’t need a catch-all clause in integral-only match expressions, allowing code like:

#![feature(exclusive_range_pattern)]

fn foo1(x: u8) {
    match x {
        0 .. 32 => {},
        32 => {},
        33 ..= 255 => {},
    }
}

A future RFC will hopefully support smarter cases like this:

#![feature(exclusive_range_pattern)]

fn foo2(x: u32) {
    match x % 3 {
        0 => {},
        1 => {},
        2 => {},
    }
}

Supporting functions like foo2() should be within the compile-time capabilities of the current DMD D language compiler (that is fast enough and doesn’t require abnormal amounts of RAM to work). Elsewhere I wrote some examples of what D Value Range Analysis does:

// temp.d(2,33): Error: cannot implicitly convert expression (x) of type const(uint) to ubyte
ubyte foo01(in uint x) { return x; }

ubyte foo02(in uint x) { return x % 300; } // Cannot implicitly convert
ubyte foo03(in uint x) { return x % 200; } // OK
ubyte foo04(in int x) { return x % 200; } // Cannot implicitly convert
ubyte foo05(in int x) { return x % 128 + 128; } // OK
byte foo06(in int x) { return x % 127; } // OK
byte foo07(in ubyte x) { return x - 128; } // OK
int foo08(in ubyte x) { return x - 128; } // OK
ubyte foo09(in uint x) { return x & 0b_1111; } // OK
ubyte foo10(in uint x) { return x / 10_000_000; } // Cannot implicitly convert
ubyte foo11(in uint x) { return x / 100_000_000; } // OK
ubyte foo12(in ubyte x, in ubyte y) { return x + y; } // Cannot implicitly convert
uint foo13(in ubyte x, in ubyte y) { return x + y; } // OK

I’d like also the same Value Range Analysis to allow like this in Rust (that uses into() instead of TryInto()):

fn foo07(x: u8) -> i8 { (x - 128).into() }

A harder case to support:

#![feature(euclidean_division)]

fn foo3(x: i32) {
    match x.mod_euc(3) - 1 {
        -1 => {},
        0 => {},
        1 => {},
    }
}

Supporting the foo3() case is useful, but it’s harder because it may require some support of annotations in a kind of limited dependent typing to tell the compiler what range of values mod_euc() returns.


While this PR (and future RFC) is meant to avoid the catch-all and spot missing cases, still it’s not strict because it allows code like this, where the x=100 value is covered two times:

#![allow(dead_code)]
#![feature(exclusive_range_pattern)]

fn foo4(x: u8) -> bool {
    match x {
        0 .. 101 => true,
        100 ... 255 => false,
    }
}

Generally I prefer my Rust compiler to be strict, but varkor has shown code like this, to remind me that overlapping patterns that are still reachable do not usually produce warnings, for convenience:

match Some(0) {
    Some(0) => {}
    Some(_) => {} // ok
    None => {}
}

To increase the strictness perhaps an attribute could be introduced to allow the overlapping:

fn foo5(x: u8) -> bool {
    #[overlapping]
    match x {
        0 .. 101 => true,
        100 ... 255 => false, // OK
    }
}

Or if that’s not possible, to disallow it:

fn foo6(x: u8) -> bool {
    #[not_overlapping]
    match x {
        0 .. 101 => true,
        100 ... 255 => false, // Compile error here
    }
}

Casting integers when bit-twiddling
#2

Feels like the overlap cleanup warnings should just be a lint that could be allowed or denied as normal, no need for a special attribute. Then it can fit in the normal “RLS can fix this for you” stuff, saying something like

        0 ..= 100 => true,
              ^ values `99 ..= 100` are already covered by this arm
        99 ..= 255 => false,
        ~~ so consider replacing this with `101`

If it can be done by just changing one end. (If the overlap is in the middle I’m not sure the warning is valuable enough, since | isn’t a general pattern feature.)


#3

This doesn’t seem to offer much value in the short term. I wouldn’t ask anyone to go through the trouble of implementing some subset of dependent types or compile time range analysis just to avoid having to type _ => unreachable() in the occasional match arm; that solution already provides a neat and general solution to the problem.

So I think there’d really have to be some other driving factor behind such a set of features.


#4

That solution means writing bug-prone code. Value range analysis is useful for other purposes too.


#5

A few, maybe. The band of calculations you can do where value range analysis is useful is quite marginal, and it wouldn’t seem to be worth the compile-time costs IMO.

Look at something like this:

let x: u8 = A_NUMBER;
if x == 6 {x += 1;}
match x {
    1..=5 => (),
    7..   => (),
    _     => unreachable!(),
}

Why bother implementing something this fragile? Most of the time you’re doing all this analysis only to find that you can’t constrain the range at all.