Consider this simple function which creates an array containing the Fibonacci sequence:
const LENGTH: usize = 80;
fn fibonacci_array() -> [u64; LENGTH] {
let mut output = [0; LENGTH];
output[0] = 1;
output[1] = 1;
for i in 2..LENGTH {
output[i] = output[i - 1] + output[i - 2];
}
output
}
As it turns out, the compiler is not smart enough to realize that none of the zeros exist in the final array and does unnecessary work to zero out the array (as seen on Godbolt). Therefore, this function can be optimized using MaybeUninit
like so:
fn fibonacci_array_unsafe() -> [u64; LENGTH] {
use std::mem::{MaybeUninit, transmute};
let mut output = [MaybeUninit::<u64>::uninit(); LENGTH];
output[0].write(1);
output[1].write(1);
for i in 2..LENGTH {
output[i].write(unsafe { output[i - 1].assume_init() + output[i - 2].assume_init() });
}
unsafe { transmute(output) }
}
This runs about 20% faster on my machine. But is it actually better code? It's extremely ugly and uses copious amounts of unsafe
. If I make a logic error like forgetting to initialize one of the values, it triggers undefined behaviour. My proposed safe abstraction looks like:
use std::mem::arbitrary;
fn fibonacci_array2() -> [u64; LENGTH] {
let mut output: [u64; LENGTH] = arbitrary();
// At this point, the contents of `output` are well-defined but unspecified.
output[0] = 1;
output[1] = 1;
for i in 2..LENGTH {
output[i] = output[i - 1] + output[i - 2];
}
output
}
arbitrary()
tells the compiler: I don't care what the value is. The compiler can put anything it likes into output
at this point. 0, 42, the current timestamp, whatever. But the hope is that the compiler will choose a value such that the resulting program is as fast as possible. At this point, it would make sense to grab some uninitialized memory, call LLVM's freeze
on it (cf. 1, 2), and whatever happens to be there is a valid output of arbitrary()
.
If we accidentally forget to initialize output
, the result is merely garbage data rather than undefined behaviour. Thus we get the full speed of the unsafe version without having to use any unsafe
blocks.
Note that arbitrary()
can only be used for types that can hold arbitrary bit patterns (think bytemuck's Pod
trait):
let a: bool = arbitrary(); // ERROR
let b: String = arbitrary(); // ERROR
let c: i128 = arbitrary(); // this is ok
let d: [f32; 50] = arbitrary(); // this is ok
This includes signed and insigned integers, floats, and arrays or structs containing these. As always, assigning an invalid bit pattern to a type is undefined behaviour.
Anyway, I just wanted to say my piece, but maybe this can be implemented someday.