I investigated an issue with small structs: "[ER] Optimization of tuple equality vs array equality" (Issue #83585, rust-lang/rust on GitHub).
It turns out that a struct with 4 or 8 u8 fields and a derived PartialEq isn't zero-cost: LLVM generates branching code with shifts for it. A much better option is to transmute such a struct into a byte array such as [u8; 8], so that the entire struct comparison becomes a single assembly instruction.
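To make the idea concrete, here is a minimal sketch of the manual workaround. The struct `Small` and the function `bytes_eq` are names made up for this illustration; they are not taken from the linked issue. It assumes a struct made only of u8 fields, so there is no padding and every byte is initialized.

```rust
use std::mem::size_of;

// Hypothetical example struct: eight u8 fields, so no padding is possible.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Small {
    a: u8,
    b: u8,
    c: u8,
    d: u8,
    e: u8,
    f: u8,
    g: u8,
    h: u8,
}

// Compile-time check that the struct really is 8 bytes.
const _: () = assert!(size_of::<Small>() == 8);

/// Compares two `Small` values as raw byte arrays instead of field by field.
fn bytes_eq(x: &Small, y: &Small) -> bool {
    // SAFETY: `Small` consists of exactly eight u8 fields, so it has no
    // padding and any byte pattern is a valid `[u8; 8]`.
    let a: [u8; 8] = unsafe { std::mem::transmute(*x) };
    let b: [u8; 8] = unsafe { std::mem::transmute(*y) };
    a == b
}
```

For this struct `bytes_eq(&x, &y)` should always agree with the derived `x == y`; whether it actually compiles down to a single comparison depends on the compiler version and optimization level, so it is worth checking the generated assembly.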
I think we could do something like bit equality types or a bit equality optimization. The first one may be simpler and could possibly be done by changing #[derive(PartialEq)].
We could add a static method to the trait, something like:
// default in trait
#[inline(always)]
unsafe fn is_bit_equaliable() -> bool {
    false
}

// in derive macro
#[inline(always)]
unsafe fn is_bit_equaliable() -> bool {
    // Check padding: with no padding, the struct size equals the sum of field sizes
    std::mem::size_of::<Self>() == sum_of_field_sizes()
        // for every field
        && type0::is_bit_equaliable()
        ...
        && type_last::is_bit_equaliable()
}

fn eq(&self, other: &Self) -> bool {
    // is_bit_equaliable and transmute are both unsafe, so they need an unsafe block
    unsafe {
        if Self::is_bit_equaliable() {
            let a: &[u8; std::mem::size_of::<Self>()] = std::mem::transmute(self);
            let b: &[u8; std::mem::size_of::<Self>()] = std::mem::transmute(other);
            a == b
        } else {
            // old field enumeration
        }
    }
}

// for primitives like u8, i32, bool, char, BUT not for f32, pointers, etc.
#[inline(always)]
unsafe fn is_bit_equaliable() -> bool {
    true
}
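The pseudocode above can't be compiled as written, so here is a hand-expanded sketch of what the derive might produce for one concrete struct. Several things are assumptions made for illustration: the check lives in a separate helper trait (called `BitEquatable` here) rather than in PartialEq itself, the method is a safe fn, the structs `Packed` and `Padded` are invented, and the byte view uses `slice::from_raw_parts` instead of a transmute to `[u8; size_of::<Self>()]`, since that array type isn't expressible in generic code on stable Rust. `#[repr(C)]` is used only to make the layouts predictable.

```rust
use std::mem::size_of;
use std::slice;

/// Hypothetical helper trait standing in for the proposed static method.
trait BitEquatable {
    /// Defaults to false, so types must opt in explicitly.
    fn is_bit_equaliable() -> bool {
        false
    }
}

// Primitives whose equality matches byte equality.
macro_rules! impl_bit_eq {
    ($($t:ty),*) => {
        $(impl BitEquatable for $t {
            fn is_bit_equaliable() -> bool { true }
        })*
    };
}
impl_bit_eq!(u8, u16, u32, u64, i8, i16, i32, i64, bool, char);

// Floats keep the default: NaN != NaN and 0.0 == -0.0 break byte equality.
impl BitEquatable for f32 {}

#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct Packed {
    a: u16,
    b: u8,
    c: u8,
    d: u32,
}

// What the derive would generate: padding check plus a check of every field type.
impl BitEquatable for Packed {
    fn is_bit_equaliable() -> bool {
        size_of::<Self>()
            == size_of::<u16>() + size_of::<u8>() + size_of::<u8>() + size_of::<u32>()
            && u16::is_bit_equaliable()
            && u8::is_bit_equaliable()
            && u32::is_bit_equaliable()
    }
}

impl PartialEq for Packed {
    fn eq(&self, other: &Self) -> bool {
        if Self::is_bit_equaliable() {
            // SAFETY: the check above guarantees there are no padding bytes
            // and that all fields are plain integers, so every byte is initialized.
            let (a, b) = unsafe {
                (
                    slice::from_raw_parts(self as *const Self as *const u8, size_of::<Self>()),
                    slice::from_raw_parts(other as *const Self as *const u8, size_of::<Self>()),
                )
            };
            a == b
        } else {
            // Fallback: the usual field-by-field comparison.
            self.a == other.a && self.b == other.b && self.c == other.c && self.d == other.d
        }
    }
}

// A struct with padding between `a` and `b`: the size check rejects it.
#[repr(C)]
#[allow(dead_code)]
struct Padded {
    a: u8,
    b: u32,
}

impl BitEquatable for Padded {
    fn is_bit_equaliable() -> bool {
        size_of::<Self>() == size_of::<u8>() + size_of::<u32>()
            && u8::is_bit_equaliable()
            && u32::is_bit_equaliable()
    }
}
```

Because `is_bit_equaliable` only depends on sizes and constants, the branch in `eq` should be resolved at compile time after inlining. The padding check is what keeps the byte view sound: reading padding bytes as u8 would be undefined behavior, which is why `Padded` has to take the field-by-field path.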
Another option is to do this optimization at the MIR level. We would need to somehow detect cases where all bytes of one struct are compared with the corresponding bytes of another struct, and replace those comparisons with a direct byte comparison. That would be nice because it would work everywhere, but I can't even imagine how complex such a feature would be to implement.