Inferbo, Rust roadmap, correctness, interval analysis


#1

In the Rust roadmap I’d like a statement reminding us that one of Rust’s core values is care for code correctness. A smart compiler lets you write code that’s both flexible and very reliable.

Inferbo is an inference-based buffer overrun analyzer for C code:

This is how it scales to large codebases:

Rather than starting analysis of a program from the main procedure, you can start anywhere. As you analyze, you gather information that is needed for error-free execution to proceed. This information forms a precondition for the procedure. The consequence of this is that you can analyze little program parts without analyzing the entire program. Infer’s infrastructure stitches together the results of the many little program parts to give a picture of the whole. This opens the way for the analyzer to keep pace with a large, fast-moving codebase.

It performs interval analysis, it’s one of the features I’d like in the Rust compiler:

Symbolic intervals. Inferbo uses the conventional program analysis technique of intervals for the range of index values and buffer sizes. Normally, interval analysis works with intervals [low,high] consisting of constants. For modular analysis of procedures we made an analysis where the bounds are instead symbolic. The bound expressions are restricted to be from linear arithmetic (+ but no *) with additional min and max operators.
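To make the idea concrete, here is a minimal Rust sketch of such symbolic bounds. The representation and all names are my own invention for illustration, not Inferbo’s actual code; it keeps only the restriction the excerpt describes (linear expressions, + but no *):

```rust
// A bound is a linear expression c + k*s over a single symbolic
// variable s (e.g. an unknown buffer size).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Bound {
    konst: i64, // constant part c
    sym: i64,   // coefficient k of the symbolic variable s
}

impl Bound {
    // Linear bounds are closed under + (no * is allowed).
    fn add(self, other: Bound) -> Bound {
        Bound {
            konst: self.konst + other.konst,
            sym: self.sym + other.sym,
        }
    }
}

#[derive(Clone, Copy, Debug)]
struct Interval {
    low: Bound,
    high: Bound,
}

impl Interval {
    // [a, b] + [c, d] = [a + c, b + d]
    fn add(self, other: Interval) -> Interval {
        Interval {
            low: self.low.add(other.low),
            high: self.high.add(other.high),
        }
    }

    fn constant(n: i64) -> Interval {
        let b = Bound { konst: n, sym: 0 };
        Interval { low: b, high: b }
    }
}

fn main() {
    // In a buffer of symbolic size s, valid indices are [0, s - 1].
    // An index i in 0..9 has interval [0, 8], so i + 1 has [1, 9]:
    // too big for a buffer where s = 9, which flags an overrun.
    let i = Interval {
        low: Bound { konst: 0, sym: 0 },
        high: Bound { konst: 8, sym: 0 },
    };
    let idx = i.add(Interval::constant(1));
    println!("{:?}", idx);
}
```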

Such interval analysis should be able to spot this bug at compile-time (Rust translation of the C code):

fn main() {
    let mut arr = vec![0; 9];
    for i in 0 .. 9 {
        arr[i] = 1;
        arr[i + 1] = 1;
    }
}

Later the article shows a similar inter-procedural analysis.

In Rust direct indexing is a bit less common than in C code (here Clippy suggests using iter().enumerate().take()), but it’s still present. On the other hand, an interval analysis is going to be useful in Rust also because it allows you to mix and match arrays and slices in a more flexible way.
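For reference, once the off-by-one in the loop above is fixed (iterating over 0 .. 8), the two writes together touch every position, so the iterator style can express it with no indexing at all. This is my own rewrite, not Clippy’s exact suggestion:

```rust
fn main() {
    let mut arr = vec![0; 9];
    // For i in 0..8, arr[i] = 1 and arr[i + 1] = 1 jointly cover
    // positions 0..9, so the corrected loop is simply a fill.
    for x in arr.iter_mut() {
        *x = 1;
    }
    assert_eq!(arr, vec![1; 9]);
}
```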

This is valid D language code (the D int[N] type is equivalent to the Rust [i32; N] type, while the D int[] type is similar to the Rust &[i32] type):

int[3] foo1(int[] a) pure nothrow @safe @nogc {
    return a[0 .. 3];
}
int[3] foo2(int[] a) pure nothrow @safe @nogc {
    return a[2 .. 5];
}
int[3] foo3(int[5] a) pure nothrow @safe @nogc {
    return a[2 .. 5];
}
void main() {}

You can see the D compiler verifies the slice bounds statically; this gives a compilation error:

int[3] foo4(int[10] a) pure nothrow @safe @nogc {
    return a[2 .. 6];
}
void main() {}

But currently the slice interval analysis of the D compiler isn’t good: this range violation is caught only at run-time:

int[3] foo5(int[2] a) pure nothrow @safe @nogc {
    return a[0 .. 3];
}
void main() {
    cast(void)foo5([1, 2]);
}

And it statically refuses this simple reasonable code:

int[3] foo6(int[5] a, in size_t i) pure nothrow @safe @nogc {
    return a[i .. i + 3];
}
void main() {}

foo1(), foo2(), and foo6() are correct, but they still need run-time bounds tests, while in foo3() the run-time bounds test should be absent.

So in Rust I’d like interval analysis and a succinct syntax (or a compiler-blessed function) that allows array-to-slice and slice-to-array conversions where they are statically proven to be correct.
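The closest thing Rust offers today is the TryFrom conversion from slices to fixed-size arrays, which is checked at run-time even when both lengths are statically known. The equivalent of D’s foo3() looks like this; an interval analysis could prove the range 2 .. 5 fits in an array of length 5 and compile it down to a plain copy with no check and no possible panic:

```rust
use std::convert::TryFrom;

fn foo3(a: &[i32; 5]) -> [i32; 3] {
    // try_from performs a run-time length check even though a[2..5]
    // always has length 3 here; interval analysis could elide it.
    <[i32; 3]>::try_from(&a[2..5]).unwrap()
}

fn main() {
    assert_eq!(foo3(&[1, 2, 3, 4, 5]), [3, 4, 5]);
}
```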