I'm afraid the program will not be very useful, as it requires a long preparation step and then audits the results of that preparation.
Essentially I have this trait:
pub trait ReadAtSync: Send + Sync {
    /// Fill the buffer by reading bytes at a specific offset
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<()>;
}
It is implemented for [u8] (so it can be called on any &[u8]):
impl ReadAtSync for [u8] {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<()> {
        if buf.len() + offset > self.len() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "Buffer length with offset exceeds own length",
            ));
        }
        buf.copy_from_slice(&self[offset..][..buf.len()]);
        Ok(())
    }
}
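As a minimal sketch of how the slice implementation can be exercised, the block below repeats the trait so that it compiles on its own. Two things here are my assumptions rather than part of the original code: the bounds check uses `checked_add` so that `offset + buf.len()` cannot overflow, and `read_vec` is a hypothetical helper added only for illustration.

```rust
use std::io;

/// Same trait as above, repeated so the sketch is self-contained.
pub trait ReadAtSync: Send + Sync {
    /// Fill the buffer by reading bytes at a specific offset
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<()>;
}

impl ReadAtSync for [u8] {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<()> {
        // Assumption: `checked_add` is used here (unlike the original) so that
        // a huge offset yields an error instead of an arithmetic overflow.
        let end = offset
            .checked_add(buf.len())
            .filter(|&end| end <= self.len())
            .ok_or_else(|| {
                io::Error::new(
                    io::ErrorKind::InvalidInput,
                    "Buffer length with offset exceeds own length",
                )
            })?;
        buf.copy_from_slice(&self[offset..end]);
        Ok(())
    }
}

/// Hypothetical helper (not in the original): read `len` bytes at `offset`
/// into a freshly allocated vector.
pub fn read_vec<R: ReadAtSync + ?Sized>(
    source: &R,
    offset: usize,
    len: usize,
) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    source.read_at(&mut buf, offset)?;
    Ok(buf)
}
```

Calling `read_vec` on a byte slice returns the requested window, and a request that runs past the end of the slice comes back as an `InvalidInput` error rather than a panic.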
And for File:
impl ReadAtSync for File {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<()> {
        self.read_exact_at(buf, offset as u64)
    }
}
And then I created the following wrapper, which opens the file multiple times (once per rayon thread) and implements the same trait by leveraging the File implementation above:
pub struct RayonFiles {
    files: Vec<File>,
}

impl ReadAtSync for RayonFiles {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<()> {
        let thread_index = rayon::current_thread_index().ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::Other,
                "Reads must be called from rayon worker thread",
            )
        })?;
        let file = self.files.get(thread_index).ok_or_else(|| {
            io::Error::new(io::ErrorKind::Other, "No files entry for this rayon thread")
        })?;

        file.read_at(buf, offset)
    }
}
impl RayonFiles {
    pub fn open(path: &Path) -> io::Result<Self> {
        let files = (0..rayon::current_num_threads())
            .map(|_| {
                let file = OpenOptions::new()
                    .read(true)
                    .advise_random_access()
                    .open(path)?;
                file.advise_random_access()?;

                Ok::<_, io::Error>(file)
            })
            .collect::<Result<Vec<_>, _>>()?;

        Ok(Self { files })
    }
}
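For anyone who wants to play with the idea without rayon, here is a rough std-only sketch of the same pattern: one handle per worker, selected by worker index. Everything in it is hypothetical and only for illustration (`PerWorkerFiles`, `read_chunks_in_parallel`, the explicit `worker` argument, the 4-byte chunk size). It also leaves out the random-access hints and uses seek + read instead of a positioned read, so it should not be expected to perform like the real code.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;
use std::thread;

/// Hypothetical std-only analogue of `RayonFiles`: one handle per worker.
pub struct PerWorkerFiles {
    files: Vec<File>,
}

impl PerWorkerFiles {
    /// Open the same path `workers` times, once for each worker.
    pub fn open(path: &Path, workers: usize) -> io::Result<Self> {
        let files = (0..workers)
            .map(|_| File::open(path))
            .collect::<io::Result<Vec<_>>>()?;
        Ok(Self { files })
    }

    /// Read at `offset` through the handle reserved for `worker`.
    pub fn read_at(&self, worker: usize, buf: &mut [u8], offset: u64) -> io::Result<()> {
        let mut file = self.files.get(worker).ok_or_else(|| {
            io::Error::new(io::ErrorKind::Other, "No files entry for this worker")
        })?;
        // `&File` implements `Read` and `Seek`. Seek + read is only race-free
        // here because no other worker touches this particular handle.
        file.seek(SeekFrom::Start(offset))?;
        file.read_exact(buf)
    }
}

/// Each scoped thread reads its own 4-byte chunk through its own handle.
pub fn read_chunks_in_parallel(path: &Path, workers: usize) -> io::Result<Vec<Vec<u8>>> {
    let files = PerWorkerFiles::open(path, workers)?;
    thread::scope(|scope| {
        // Spawn all workers first, then join them in order.
        let handles: Vec<_> = (0..workers)
            .map(|worker| {
                let files = &files;
                scope.spawn(move || {
                    let mut buf = vec![0u8; 4];
                    files.read_at(worker, &mut buf, (worker * 4) as u64)?;
                    Ok::<_, io::Error>(buf)
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}
```

The point of the design is the same as in `RayonFiles`: no two threads ever issue reads through the same handle, which is the variable whose effect on Windows is being discussed here.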
All file operations above provide the OS with hints about random file reads using these cross-platform utility traits (they also implement a cross-platform version of read_exact_at, as described before): https://github.com/subspace/subspace/blob/c14b3ff0d6a2547f4d9155f1acf231f3181dc68d/crates/subspace-farmer-components/src/file_ext.rs
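To give an idea of what a cross-platform read_exact_at can look like, here is a minimal sketch using only the standard library. It is not copied from the linked file and may differ from it: I wrote it as a free function (the linked code uses extension traits), and the Windows branch is my own loop around `seek_read`, which can return fewer bytes than requested.

```rust
use std::fs::File;
use std::io;

/// Hypothetical free-function version; on Unix std already provides this.
#[cfg(unix)]
pub fn read_exact_at(file: &File, buf: &mut [u8], offset: u64) -> io::Result<()> {
    std::os::unix::fs::FileExt::read_exact_at(file, buf, offset)
}

/// Hypothetical Windows version built on `seek_read`.
#[cfg(windows)]
pub fn read_exact_at(file: &File, mut buf: &mut [u8], mut offset: u64) -> io::Result<()> {
    use std::os::windows::fs::FileExt;

    // `seek_read` may do a short read, so keep going until `buf` is full.
    while !buf.is_empty() {
        match file.seek_read(buf, offset) {
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "failed to fill whole buffer",
                ));
            }
            Ok(n) => {
                // Advance past the bytes that were just read.
                let rest = buf;
                buf = &mut rest[n..];
                offset += n as u64;
            }
            // Retry if the call was interrupted.
            Err(error) if error.kind() == io::ErrorKind::Interrupted => {}
            Err(error) => return Err(error),
        }
    }
    Ok(())
}
```

One difference worth keeping in mind: per the std docs, `seek_read` on Windows also moves the file cursor, while the Unix positioned read leaves it alone.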
Now what I actually do with that is run some CPU-intensive work interleaved with random file reads (~20 KiB each per 1 GB of space used) using rayon on a large file (2.7 TB in the above case).
So the above numbers are not just disk reads; they represent the workload I actually care about, where reads are only one component. Still, it is clear that there is a massive difference depending on how files are read on Windows, and the real difference is even larger than the above results show, because CPU is a significant contributor to the performance.