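Just make a type that is a 64-byte-aligned array of 16 i32s (or just 64 u8s, although if you are using it as i32s it’s more elegant to use i32s), e.g. with #[repr(align(64))], make a Vec of that, and then a newtype over that Vec that implements Deref with Target = [i32] by reinterpreting the Vec's buffer as a slice of i32s (a pointer cast plus slice::from_raw_parts is enough, you don't need a literal transmute).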
Add an extra length field to the newtype if you need to keep track of the length in individual i32s rather than in whole 16-element blocks.
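
Here is a minimal sketch of that idea, not a finished implementation. The names `Block`, `LANES`, `AlignedI32s`, `new` and `push` are made up for illustration, and it assumes you only need `push` plus slice access:

```rust
use std::ops::{Deref, DerefMut};

/// Number of i32s per 64-byte block (hypothetical name).
const LANES: usize = 16;

/// One 64-byte block: 16 i32s, aligned to 64 bytes.
/// repr(C) plus align(64) gives size 64 with no padding.
#[derive(Clone, Copy)]
#[repr(C, align(64))]
struct Block([i32; LANES]);

/// Newtype over Vec<Block> that tracks its length in i32s.
struct AlignedI32s {
    blocks: Vec<Block>,
    len: usize, // number of i32s in use, always <= blocks.len() * LANES
}

impl AlignedI32s {
    fn new() -> Self {
        Self { blocks: Vec::new(), len: 0 }
    }

    fn push(&mut self, value: i32) {
        // Grow by one zeroed block whenever the last block is full.
        if self.len == self.blocks.len() * LANES {
            self.blocks.push(Block([0; LANES]));
        }
        self.blocks[self.len / LANES].0[self.len % LANES] = value;
        self.len += 1;
    }
}

impl Deref for AlignedI32s {
    type Target = [i32];

    fn deref(&self) -> &[i32] {
        // SAFETY: the Vec's buffer is blocks.len() * LANES contiguous,
        // initialized i32s, the pointer is aligned for i32 (even when the
        // Vec is empty), and len never exceeds that count.
        unsafe { std::slice::from_raw_parts(self.blocks.as_ptr().cast::<i32>(), self.len) }
    }
}

impl DerefMut for AlignedI32s {
    fn deref_mut(&mut self) -> &mut [i32] {
        // SAFETY: same layout argument as in deref; &mut self guarantees
        // exclusive access to the buffer.
        unsafe { std::slice::from_raw_parts_mut(self.blocks.as_mut_ptr().cast::<i32>(), self.len) }
    }
}
```

With this, the usual slice methods (`len()`, indexing, `iter()`, `as_ptr()`) work on an `AlignedI32s` through auto-deref, and the start of the slice is always 64-byte aligned because the allocation is made with `Block`'s alignment.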