It would be nice to finally use some strong properties of rust’s type system in order to do some aggressive high-level optimizations. I came across something like this when profiling code - >90% of the time was spent computing a function that only took immutable references as arguments, and so it returned the same result each time. It was at the bottom of a stack of many loops. LLVM does do loop hoisting, but it seems to be very hit-and-miss, especially buried so deep in code that does a bunch of other stuff.
So I had an idea. Maybe the reason LLVM cannot optimize this code is because it doesn’t know enough about rust’s type system to know when values are immutable. At a higher level, we can reason about return values using the fact that references are immutable, values are immutable unless specified etc. - and then hoist the values out of the loop before it ever gets to LLVM.
Here’s an example I came up with where rustc produces inefficient code:
struct Struct {
data: Vec<i32>,
}
extern {
fn use_value(input: i32);
}
impl Struct {
fn compute(&self) -> i32 {
let mut ret = 0;
for _ in 0..10 {
ret += self.data[0] * 0b110110 ^ ret / 7 & self.data[0] + ret;
}
ret
}
}
#[no_mangle]
pub fn compute(data: i32, n: usize) {
let s = Struct { data: vec![data] };
for _ in 0..n {
unsafe {
use_value(s.compute());
}
}
}
Here, the generated assembly computes the value each time, even though from the immutability of s (and the lack of unsafe code in Struct::compute) we can reason that s.compute() will return the same value each time. Note that with a constant n the value is only computed once - I had to add complexity until it wasn’t.
So the transformation would turn the compute function into the following:
#[no_mangle]
pub fn compute(data: i32, n: usize) {
let s = Struct { data: vec![data] };
let value = s.compute();
for _ in 0..n {
unsafe {
use_value(value);
}
}
}
I will admit, I do not know very much about this kind of stuff, so I guess the key questions are:
- Will this be genuinely useful?
- What are the precise conditions under which this transformation is valid? I mentioned “lack of unsafe code” - but this is not strictly true - there is unsafe code when accessing an element of a
Vec, and other “hidden” unsafe code (like accessing a Mutex) could result in different results from the function each time. (Even a for loop calls mem::swap deep in its bowels, which is implemented with unsafe code)
- Can this be generalized to values in general rather than just return values from functions?
- Where/how should this optimization be applied? Is MIR the right place for this kind of thing? I notice MIR seems to add a lot of mutability where none is needed - this might be a problem for this kind of reasoning