The specific use case I have in mind is an indexed BinaryHeap in a very performance-sensitive context. The BinaryHeap consists of entries with both a unique id and a priority value, like so:
```rust
struct KeyedBinaryHeapEntry {
    id: u32,       // unique id
    priority: u32, // implement ordering on only this, ignoring id
    /* other fields */
}
```
`soa_derive` doesn't have the means to make `priority` an arbitrary collection (such as a heap, a queue, etc.), nor does it offer finer-grained control for optimising cache friendliness.
This could possibly be resolved by annotating with `#[derive(StructOfBinHeap)]`. This, however, might be problematic when you try to couple together two different structs.
```rust
type Id = NonZeroU32;
type Priority = NonZeroUsize;
type HeapPos = usize;

struct IndexedBinHeap {
    // push() will sort with this first; the movements can be cached for updating the table.
    // Holds an Id so it can update the entries table, allowing lookup of a heap entry by Id in ~O(1).
    #[collected_field = "BinaryHeap"]
    id: Id,
    #[couple_with(id)]
    priority: Priority,
    // By coupling the two collections at the language-implementation level, I'm thinking some
    // of the workload can be optimised, such as opportunistic updating of HeapPos.
    #[collected_field = "HashMap"]
    entries: Entry<Id, HeapPos>,
}
```
```rust
#[bench]
fn split_test_idx_bin_heap_layout() {
    // Repeats the bench with two of the member Vecs fused into one Vec of a tuple,
    // and compares against the bench run with the default layout.
    #[bench_layout(Vec<(id, priority)>)]
    let testee = IndexedBinHeap::new();
    /* performance torture test */
}
```
As an alternative to the `#[bench]` split testing, you could start giving hints to a run-time profiler/optimiser, allowing it to seek out ways to play around with the layout to maximise cache affinity.
Another limitation of `soa_derive` is the set of restrictions that come with proc macros. Having the ergonomics of this crate become seamlessly integrated with the language is something that I think is worth discussing.