- It is not in Rust, so it is hard to tell how well it would port over.
- I only see microbenchmarks. Those are often not very useful: in a real program, an algorithm that is slower on paper but uses significantly less instruction cache can be faster if the program as a whole is cache bound or memory bandwidth bound.
This last point is especially important: lookup tables tend to perform badly in real code, where they compete with everything else for cache, but look good in benchmarks, where the table stays hot. And you have some really big lookup tables. No thanks!
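
To make the trade-off concrete, here is a minimal sketch using a made-up example (byte-wise popcount), not the code under discussion. The names `POPCOUNT_TABLE`, `popcount_table`, and `popcount_computed` are hypothetical. Both functions return the same result; the first depends on a table being resident in cache, the second touches no memory at all.

```rust
/// Build a 256-entry popcount table at compile time.
/// Each entry reuses the count for the value shifted right by one bit.
const fn build_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 1;
    while i < 256 {
        table[i] = table[i >> 1] + (i & 1) as u8;
        i += 1;
    }
    table
}

/// Hypothetical lookup table: 256 bytes that must be in data cache to be fast.
static POPCOUNT_TABLE: [u8; 256] = build_table();

/// Table-based variant: four dependent memory loads per call.
pub fn popcount_table(x: u32) -> u32 {
    x.to_le_bytes()
        .iter()
        .map(|&b| POPCOUNT_TABLE[b as usize] as u32)
        .sum()
}

/// Computed variant: a handful of ALU operations, no memory traffic.
pub fn popcount_computed(x: u32) -> u32 {
    let x = x - ((x >> 1) & 0x5555_5555);
    let x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333);
    let x = (x + (x >> 4)) & 0x0F0F_0F0F;
    x.wrapping_mul(0x0101_0101) >> 24
}
```

A microbenchmark that calls `popcount_table` in a tight loop keeps the table hot the whole time, so it never pays for a miss. In a real program the surrounding code may evict the table between calls, and that cost grows with table size. (In practice you would just use `u32::count_ones` here; the point is only the shape of the comparison.)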