I really like the philosophy behind Rust and feel excited to use it in future projects. However, bounds checking when randomly accessing arrays/slices can be a real performance hog. As has been pointed out before, this is not really a solvable problem: if one wants memory-safe code, bounds must be checked. One can use iterators when access is sequential, and unsafe code for the truly critical parts, to speed things up.
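For context, those two workarounds look roughly like this (iter() and get_unchecked() are the real standard-library APIs; the function itself is just my illustration):

fn sum_with_peek(data: &[i32]) -> i32 {
    // Sequential access: iterators avoid bounds checks entirely.
    let mut total = 0;
    for &x in data.iter() {
        total += x;
    }

    // Random access on a hot path: opt out of the check with unsafe.
    if !data.is_empty() {
        let mid = data.len() / 2;
        // Safe here because we just ensured mid < data.len().
        total += unsafe { *data.get_unchecked(mid) };
    }
    total
}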
But I figured that there must be some middle ground: some type of optimization that ensures within-bounds access but keeps checking to a minimum. In the spirit of Rust, this would need to happen entirely at compile time, lead to guaranteed safe code, and lead to higher performance. The idea I had builds on the observation that an immutable index into a fixed-length (at run time) vector-type object only needs to be checked once. As variables are immutable by default in Rust and arrays/slices are fixed-size once initialized, this should be a quite common situation and could therefore yield some real performance improvements.
I’ll explain the idea with some examples. Consider the following code:
fn main() {
    let mut data = vec![1, 2, 3]; // user supplied
    let req = 2; // user supplied
    let slice = &mut data;
    let i = req;
    for n in 0..10 {
        slice[i] = n; // Bounds check i in slice
        println!("Value: {}", slice[i]); // BC i in slice
    }
}
At compile time it is impossible to know the length of slice or the value of i – bounds must be checked. However, the compiler can infer that neither the length of slice nor the value of i will ever change after they are initialized. There is then no need to bounds check at each iteration of the loop. The check can instead be done at the first point where both slice and i exist, and never again after that. I.e.:
fn main() {
    let mut data = vec![1, 2, 3]; // user supplied
    let req = 2; // user supplied
    let slice = &mut data;
    let i = req; // BC i in slice
    for n in 0..10 {
        slice[i] = n; // No BC
        println!("Value: {}", slice[i]); // No BC
    }
}
We have reduced the number of checks from 20 to just one!
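As a side note, something close to this can already be coaxed out of the current compiler by hoisting the check manually with an assert!; whether LLVM then actually drops the per-iteration checks is not guaranteed and depends on the optimizer, but the pattern is at least expressible today:

fn main() {
    let mut data = vec![1, 2, 3]; // user supplied
    let req = 2; // user supplied
    let slice = &mut data;
    let i = req;
    assert!(i < slice.len()); // one explicit up-front check
    for n in 0..10 {
        // The optimizer can often elide these checks, since `i' and
        // the length of `slice' are provably unchanged after the assert.
        slice[i] = n;
        println!("Value: {}", slice[i]);
    }
}

"Bounds safety" could also easily be inherited if the new object is immutable: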
fn main() {
    let mut data = vec![1, 2, 3]; // user supplied
    let slice = &mut data;
    let i = 2; // BC i in slice
    let j = i; // j inherits i's bounds safety for slice
    let mut k = i; // mutable, no inheritance
    for n in 0..3 {
        slice[i] = n; // No BC
        println!("Value: {}", slice[j]); // No BC
        k = n;
        println!("Value: {}", slice[k]); // Do BC for k in slice
    }
}
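Incidentally, this kind of inherited bounds safety can be approximated in today's Rust with a small wrapper type that performs the check once at construction; the sketch below is hypothetical, and its unsafe blocks are only sound because the index is used with the very slice it was checked against:

/// Hypothetical helper: the check happens once, in `new'.
#[derive(Clone, Copy)]
struct CheckedIdx(usize);

impl CheckedIdx {
    fn new(i: usize, slice: &[i32]) -> Option<CheckedIdx> {
        if i < slice.len() { Some(CheckedIdx(i)) } else { None }
    }
}

fn main() {
    let mut data = vec![1, 2, 3];
    let slice = &mut data[..];
    let i = CheckedIdx::new(2, slice).unwrap(); // the single check
    let j = i; // copies inherit the checked property
    for n in 0..3 {
        // Sound only because the length of `slice' cannot change
        // and `i'/`j' were checked against this very slice.
        unsafe { *slice.get_unchecked_mut(i.0) = n; }
        println!("Value: {}", unsafe { slice.get_unchecked(j.0) });
    }
}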
Bounds safety could also be inherited through function calls if parameters are immutable:
// The compiler infers that `index' must be within bounds
// of `data_slice' and requires the argument to be checked
// before it is supplied to the function.
fn print_data(data_slice: &mut [i32], index: usize, set: i32) {
    data_slice[index] = set; // No BC
    println!("Value: {}", data_slice[index]); // No BC
}

fn main() {
    let mut data = vec![1, 2, 3]; // user supplied
    let i = 1;
    let slice = &mut data; // do BC i in slice
    for n in 1..10 {
        // `print_data()' requires its index argument to be
        // checked, which was already done above.
        print_data(slice, i, n); // No BC
    }
}
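A special case of this can in fact already be expressed in today's Rust: borrow the element once and pass the reference through the call, so that the single check happens at the borrow site (my own sketch, not the proposed mechanism):

// Passing a reference instead of (slice, index) moves the one
// and only bounds check to the borrow in the caller.
fn print_elem(elem: &mut i32, set: i32) {
    *elem = set; // no bounds check possible or needed
    println!("Value: {}", elem);
}

fn main() {
    let mut data = vec![1, 2, 3];
    let elem = &mut data[1]; // the single bounds check
    for n in 1..10 {
        print_elem(elem, n); // no check; the borrow proves validity
    }
}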
The proposed inheritance could of course apply to several nested levels of functions. If some subroutine does a lot of random access over many iterations, the checking could be done only once, when the subroutine is called. All without compromising safety, needing unsafe code, or requiring extra effort from the coder.
With scalar indices and simple control flow, inference is quite simple: one can easily infer at compile time whether a variable needs bounds checking. In more complex settings this might not be possible. Then there is a trade-off: bounds check early to eliminate all future checks, at the cost of possibly unnecessary checks, or check as close to the access as possible, thereby re-checking often but never checking unnecessarily. Note that safety is never compromised either way; it is purely a question of performance. Consider for example:
fn main() {
    let data = vec![1, 2, 3, 4, 5]; // user supplied
    let index = 1; // user supplied
    let b1 = true; // user supplied
    let b2 = true; // user supplied
    let d_slice = &data;
    let i = index; // Bounds check `i' in d_slice here?
    if b1 {
        println!("Value: {}", d_slice[i]); // Or bounds check here?
    }
    if b2 {
        println!("Value: {}", d_slice[i]); // And possibly again here?
    }
}
If i is bounds checked at initialization, there will be one unnecessary check when b1 = b2 = false. However, if bounds are checked on access, they will be checked twice when b1 = b2 = true.
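One conceivable resolution is a hoisted but conditional check. Conceptually, the compiler could rewrite the example into something like this (purely illustrative; get_unchecked() stands in for "access without a check"):

fn main() {
    let data = vec![1, 2, 3, 4, 5]; // user supplied
    let index = 1; // user supplied
    let b1 = true; // user supplied
    let b2 = true; // user supplied
    let d_slice = &data;
    let i = index;
    // One combined check, executed only on paths that will access the slice.
    if (b1 || b2) && i >= d_slice.len() {
        panic!("index out of bounds: the len is {} but the index is {}",
               d_slice.len(), i);
    }
    if b1 {
        println!("Value: {}", unsafe { d_slice.get_unchecked(i) });
    }
    if b2 {
        println!("Value: {}", unsafe { d_slice.get_unchecked(i) });
    }
}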
There are probably many smart ways for the compiler to infer the likely control flow and make good guesses. In a few exceptional cases it could introduce conditional checks of exactly this kind, when early checking is deemed very beneficial in some settings but very costly in others. There could also be keywords so the coder can indicate code that is likely to need early or late bounds checking:
0: fn main() {
1:     let mut data = vec![1, 2, 3, 4, 5]; // user supplied
2:     let indices = vec![0, 2, 3]; // user supplied
3:     let order = vec![1, 0, 2]; // user supplied
4:
5:     let d_slice = &mut data;
6:     let i_slice = &indices;
7:     let o_slice = &order;
8:
9:     bounds_check(i_slice, d_slice);
A:     for &o in o_slice.iter() {
B:         let i = i_slice[o];
C:         d_slice[i] = 2;
D:     }
E:     ...
}
In this program there will never be any bounds checking on line B (where o accesses i_slice). The compiler can infer that the loop will iterate over the elements of o_slice, so all elements of o_slice need bounds checking in i_slice, and this can be done as early as possible, i.e., at the first state where both i_slice and o_slice exist: line 7. As a benefit, if elements of o_slice are used to access i_slice beyond line E, this can be done safely without further bounds checking.
However, the problem here is line C. Should all elements of i_slice be checked in d_slice at line 6, or should each single access be checked at line C? If indices is shorter than order, the first alternative is best; otherwise the second is. But when compiling, this is not known! In this case the coder has added bounds_check(i_slice, d_slice); to indicate that checking the complete i_slice up front is preferable, so the compiler will add the check at line 6 and the accesses at line C (and beyond) are free.
If the inference mechanism is good and coders keep to simple and immutable data structures when possible (which I guess is good style anyway), it feels like this type of optimization could lead to large improvements in many cases. Without unsafe code! But I'm quite new to Rust and not yet completely familiar with its internals, so I hope this idea is not incompatible with some of the language's inner workings, and that I have not re-invented a wheel that is already implemented. So, does the idea have any merit?