So far I've seen this Idx idea as a toy, but for fun I've tried to use it in more realistic code, short and self-contained. This is a sufficiently efficient solution to an Euler Project problem (https://projecteuler.net/problem=96 with a link to the test data), this solution is efficient also because avoids most heap-allocated data and works with arrays as much as possible (also thanks to the array_chunks feature):
#![feature(default_free_fn, array_chunks)]
use std::default::default;
use std::fs::read;
fn e96() -> u32 {
const L: usize = 3;
const SIDE: usize = L * L;
const SIDE_SQ: usize = SIDE * SIDE;
const SIDE_8: u8 = SIDE as _;
const SIDE_32: u32 = SIDE as _;
type Grid = [[u8; SIDE]; SIDE];
type V = [u32; SIDE];
type Squares = [[u32; L]; L];
type Zeroes = [(usize, usize); SIDE_SQ];
fn remove_title_lines(bytes: &mut [u8]) -> &[u8] {
let mut current = 0;
let mut i = 0;
while i < bytes.len() {
match bytes[i] {
b'G' => {
// Skip the title line.
i += 1;
while i < bytes.len() && bytes[i] != b'\n' {
i += 1;
}
},
b'0' ..= b'9' => {
bytes[current] = bytes[i];
current += 1;
},
_ => {},
}
i += 1;
}
&bytes[.. current]
}
struct Sudoku {
grid: Grid,
rows: V,
cols: V,
squares: Squares,
zeroes: Zeroes,
n_zeroes: usize,
}
impl Sudoku {
fn new(data: &[u8; SIDE_SQ]) -> Option<Self> {
let mut grid: Grid = default();
for i in 0 .. SIDE {
for j in 0 .. SIDE {
if let d @ b'0' ..= b'9' = data[i * SIDE + j] {
grid[i][j] = d - b'0';
} else {
return None;
}
}
}
let mut rows: V = default();
let mut cols: V = default();
let mut squares: Squares = default();
let mut zeroes: Zeroes = [default(); SIDE_SQ];
let mut n_zeroes = 0;
for i in 0 .. SIDE {
for j in 0 .. SIDE {
if grid[i][j] > 0 {
rows[i] |= 1 << (grid[i][j] - 1);
cols[j] |= 1 << (grid[i][j] - 1);
squares[i / L][j / L] |= 1 << (grid[i][j] - 1);
} else {
zeroes[n_zeroes] = (i, j);
n_zeroes += 1;
}
}
}
Some(Self { grid, rows, cols, squares, zeroes, n_zeroes })
}
fn solve(&mut self) -> Option<u32> {
let mut best_mask = 0;
let mut best_index = 0;
let mut best_num_bits = None;
for (index, &(i, j)) in self.zeroes[.. self.n_zeroes].iter().enumerate() {
let mask = self.rows[i] | self.cols[j] | self.squares[i / L][j / L];
let num_bits = Some(mask.count_ones());
if num_bits > best_num_bits {
best_index = index;
best_mask = mask;
best_num_bits = num_bits;
}
}
if best_num_bits.is_none() {
return Some(u32::from(self.grid[0][0]) * 100 +
u32::from(self.grid[0][1]) * 10 +
u32::from(self.grid[0][2]));
} else if let Some(SIDE_32) = best_num_bits {
return None;
}
let (i, j) = self.zeroes[best_index];
self.zeroes[best_index] = self.zeroes[self.n_zeroes - 1];
self.n_zeroes -= 1;
for candidate_minus_1 in 0 .. SIDE_8 {
if (best_mask & (1 << candidate_minus_1)) == 0 {
let old_rows = self.rows[i];
let old_cols = self.cols[j];
let old_squares = self.squares[i / L][j / L];
self.rows[i] |= 1 << candidate_minus_1;
self.cols[j] |= 1 << candidate_minus_1;
self.squares[i / L][j / L] |= 1 << candidate_minus_1;
self.grid[i][j] = candidate_minus_1 + 1;
if let result @ Some(_) = self.solve() {
return result;
}
self.rows[i] = old_rows;
self.cols[j] = old_cols;
self.squares[i / L][j / L] = old_squares;
self.grid[i][j] = 0;
}
}
self.zeroes[self.n_zeroes] = (i, j);
self.n_zeroes += 1;
None
}
}
remove_title_lines(&mut read("p096_sudoku.txt").unwrap())
.array_chunks()
.map(|ch| Sudoku::new(ch).unwrap().solve().unwrap())
.sum()
}
fn main() {
assert_eq!(e96(), 24_702);
}
Even compiling with aggressive compilation options this code contains a lot (about 15) array bound tests. And many of those lines seem a very good fit for Idx:
let (i, j) = self.zeroes[best_index];
...
let old_rows = self.rows[i];
let old_cols = self.cols[j];
let old_squares = self.squares[i / L][j / L];
So I've applied the Idx (that's a poor's man implementation of a built-in feature of Ada language, that allows to define the type of variables as indexes of specific arrays):
#![feature(
array_chunks,
const_assume,
const_evaluatable_checked,
const_generics,
const_option,
const_panic,
const_trait_impl,
core_intrinsics,
default_free_fn,
nonzero_leading_trailing_zeros,
step_trait,
step_trait_ext,
)]
#![allow(incomplete_features)]
use std::default::default;
use std::fs::read;
use std::ops::{Index, IndexMut, RangeInclusive};
use std::iter::Zip;
use std::slice::Iter;
use std::intrinsics::assume;
const fn is_nonzero(n: usize) -> usize {
assert!(n != 0);
n
}
mod idx {
use std::iter::Step;
use super::is_nonzero;
use super::assume;
#[derive(Debug, Copy, Clone, PartialOrd, PartialEq, Default)]
#[repr(transparent)]
pub struct Idx<const N: usize>(usize);
impl<const N: usize> Idx<N> where [(); is_nonzero(N)]: {
#[inline(always)]
pub const fn new(i: usize) -> Option<Idx<N>> {
if i < N {
Some(Idx(i))
} else {
None
}
}
#[inline(always)]
pub const unsafe fn new_unchecked(i: usize) -> Idx<N> {
Idx(i)
}
#[inline(always)]
pub const fn get(&self) -> usize {
unsafe { assume(self.0 < N); }
self.0
}
}
unsafe impl<const N: usize> Step for Idx<N> where [(); is_nonzero(N)]: {
fn steps_between(start: &Self, end: &Self) -> Option<usize> {
Some(end.0 - start.0)
}
fn forward_checked(start: Self, count: usize) -> Option<Self> {
Self::new(start.0 + count)
}
fn backward_checked(start: Self, count: usize) -> Option<Self> {
Some(Self(start.0.checked_sub(count)?))
}
}
}
use idx::Idx;
const fn is_contained(n: usize, m: usize) -> usize {
assert!(n <= m);
n
}
impl<T, const N: usize, const M: usize> Index<Idx<N>> for [T; M]
where [(); is_nonzero(N)]: ,
[(); is_contained(N, M)]: {
type Output = T;
#[inline(always)]
fn index(&self, i: Idx<N>) -> &Self::Output {
unsafe { self.get_unchecked(i.get()) }
}
}
impl<T, const N: usize, const M: usize> IndexMut<Idx<N>> for [T; M]
where [(); is_nonzero(N)]: ,
[(); is_contained(N, M)]: {
#[inline(always)]
fn index_mut(&mut self, i: Idx<N>) -> &mut Self::Output {
unsafe { self.get_unchecked_mut(i.get()) }
}
}
trait ArraySpan<const N: usize> {
fn span(&self) -> RangeInclusive<Idx<N>>;
}
impl<T, const N: usize> const ArraySpan<N> for [T; N]
where [(); is_nonzero(N)]: {
#[inline(always)]
fn span(&self) -> RangeInclusive<Idx<N>> {
unsafe {
Idx::new_unchecked(0) ..= Idx::new_unchecked(N - 1)
}
}
}
trait ArrayEnumerate<T, const N: usize> {
fn enumerate(&self) -> Zip<RangeInclusive<Idx<N>>, Iter<'_, T>>;
}
impl<T, const N: usize> ArrayEnumerate<T, N> for [T; N]
where [(); is_nonzero(N)]: {
fn enumerate(&self) -> Zip<RangeInclusive<Idx<N>>, Iter<'_, T>> {
self.span().zip(self.iter())
}
}
fn e96() -> u32 {
const L: usize = 3;
const SIDE: usize = L * L;
const SIDE_SQ: usize = SIDE * SIDE;
const SIDE_8: u8 = SIDE as _;
const SIDE_32: u32 = SIDE as _;
type Grid = [[u8; SIDE]; SIDE];
type V = [u32; SIDE];
type Squares = [[u32; L]; L];
type IdxS = Idx<SIDE>;
type Zeroes = [(IdxS, IdxS); SIDE_SQ];
fn remove_title_lines(bytes: &mut [u8]) -> &[u8] {
let mut current = 0;
let mut i = 0;
while i < bytes.len() {
match bytes[i] {
b'G' => {
// Skip the title line.
i += 1;
while i < bytes.len() && bytes[i] != b'\n' {
i += 1;
}
},
b'0' ..= b'9' => {
bytes[current] = bytes[i];// 1/5 ##########
current += 1;
},
_ => {},
}
i += 1;
}
&bytes[.. current]// 2/5 ##########
}
struct Sudoku {
grid: Grid,
rows: V,
cols: V,
squares: Squares,
zeroes: Zeroes,
n_zeroes: usize,
}
impl Sudoku {
fn new(data: &[u8; SIDE_SQ]) -> Option<Self> {
let mut grid: Grid = default();
for i in 0 .. SIDE {
for j in 0 .. SIDE {
if let d @ b'0' ..= b'9' = data[i * SIDE + j] {
grid[i][j] = d - b'0';
} else {
return None;
}
}
}
let mut rows: V = default();
let mut cols: V = default();
let mut squares: Squares = default();
let mut zeroes: Zeroes = [default(); SIDE_SQ];
let mut n_zeroes = 0;
for i in rows.span() {
for j in cols.span() {
if grid[i][j] > 0 {
rows[i] |= 1 << (grid[i][j] - 1);
cols[j] |= 1 << (grid[i][j] - 1);
squares[i.get() / L][j.get() / L] |= 1 << (grid[i][j] - 1);
} else {
zeroes[n_zeroes] = (i, j);// 3/5 ##########
n_zeroes += 1;
}
}
}
Some(Self { grid, rows, cols, squares, zeroes, n_zeroes })
}
fn solve(&mut self) -> Option<u32> {
let mut best_mask = 0;
let mut best_index = Idx::<SIDE_SQ>::default();
let mut best_num_bits = None;
for (index, &(i, j)) in self.zeroes.enumerate().take(self.n_zeroes) {//SLOW
let mask = self.rows[i] | self.cols[j] | self.squares[i.get() / L][j.get() / L];
let num_bits = Some(mask.count_ones());
if num_bits > best_num_bits {
best_index = index;
best_mask = mask;
best_num_bits = num_bits;
}
}
if best_num_bits.is_none() {
return Some(u32::from(self.grid[0][0]) * 100 +
u32::from(self.grid[0][1]) * 10 +
u32::from(self.grid[0][2]));
} else if let Some(SIDE_32) = best_num_bits {
return None;
}
let (i, j) = self.zeroes[best_index];
self.zeroes[best_index] = self.zeroes[self.n_zeroes - 1];// 4/5 ##########
self.n_zeroes -= 1;
for candidate_minus_1 in 0 .. SIDE_8 {
if (best_mask & (1 << candidate_minus_1)) == 0 {
let old_rows = self.rows[i];
let old_cols = self.cols[j];
let old_squares = self.squares[i.get() / L][j.get() / L];
self.rows[i] |= 1 << candidate_minus_1;
self.cols[j] |= 1 << candidate_minus_1;
self.squares[i.get() / L][j.get() / L] |= 1 << candidate_minus_1;
self.grid[i][j] = candidate_minus_1 + 1;
if let result @ Some(_) = self.solve() {
return result;
}
self.rows[i] = old_rows;
self.cols[j] = old_cols;
self.squares[i.get() / L][j.get() / L] = old_squares;
self.grid[i][j] = 0;
}
}
self.zeroes[self.n_zeroes] = (i, j);// 5/5 ##########
self.n_zeroes += 1;
None
}
}
remove_title_lines(&mut read("p096_sudoku.txt").unwrap())
.array_chunks()
.map(|ch| Sudoku::new(ch).unwrap().solve().unwrap())
.sum()
}
fn main() {
assert_eq!(e96(), 24_702);
}
This was a success, the code is still simple and readable as before, but now I think there are about only 5 array bound tests left. Unfortunately this line slows down the code:
for (index, &(i, j)) in self.zeroes.enumerate().take(self.n_zeroes) {//SLOW
So I have had to restore index
as a normal usize
, so the code has now about 7 array bound tests. And it's about 5% faster than the original version. A better built-in implementation of this idea should allow to leave only 5 array bound tests. Such tests slow down code and introduce possible panics. And no-panic code is a worthy goal to desire, even if those tests are invisible it's good to have code that has being mechanically verified as safer.
The remaining 5 array bound tests can be removed with preconditions/invariants/postconditions that someday I'd like to see as built-in features of Rust core language. In particular it should be sufficiently easy to write down invariants for the remove_title_lines
to verify that the current
index is always <= i
, and i
is always in-bound.
Currently LLVM is able to remove the first array bound test inside remove_title_lines
if you replace the loop test with:
while i < bytes.len() and j < bytes.len() {
But currently this is not an improvement because it slows down code. In future LLVM will probably be able to remove that test even with this loop test (see https://github.com/rust-lang/rust/issues/80075 https://bugs.llvm.org/show_bug.cgi?id=48544 ):
while i < bytes.len() and j <= i {
Also this line is probably easy to demonstrate panic-less because n_zeroes
gets incremented at most every time the Cartesian product of the two loops:
zeroes[n_zeroes] = (i, j);// 3/5 ##########
In this code there's also an example of code with an index that can't go past the array length, so it can't be memory-unsafe, but it could be still logically wrong, because zeroes
is an array that has meaning only for its first n_zeroes
items (this was done to avoid heap-allocated memory). So here verifying both lack of out of bound indexing and logical out of bounds is a bit different. If you use a Vec and you grow it only as much as necessary, the logically used part of the sequence is as long as the Vec.
Later there are instructions to increment and decrement self.n_zeroes
in the recursive calls, this makes the in-bound demonstration a bit more elaborate. In the end you can decide how much you want to work to remove array bound tests in your code.