Vectorisation for min/max sorting

drdozer · November 22, 2019, 1:02pm

Hi - I'm playing with optimised sort routines. I have compared the code generated from some C++ and some Rust that both compute a 32-element sorting network. The C++ code benchmarks about 1/3 faster than the Rust code (approx 70ns vs 100ns). The assembly for the C++ uses vectorised ops, whereas the Rust does not.

c++: https://rust.godbolt.org/z/prJRnu rust: https://rust.godbolt.org/z/mQxUgk

The main difference that I can see is that the C++ code 'cheats' by providing a hand-rolled implementation of min/max. Rustc ends up producing something that ultimately looks pretty similar to the hand-rolled assembly, but in the Rust case it doesn't then go on to vectorise it.
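The linked sources are not reproduced in this thread, so the following is only a rough sketch of what a hand-rolled min/max compare-exchange might look like on the Rust side. It assumes `i32` elements and uses a 4-input network rather than the 32-element one; `min_i32`, `max_i32`, `compare_exchange` and `sort4` are hypothetical names, not taken from the original code.

```rust
/// Hand-rolled min written as a plain comparison (assumed element type: i32).
#[inline(always)]
fn min_i32(a: i32, b: i32) -> i32 {
    if a < b { a } else { b }
}

/// Hand-rolled max, the mirror image of `min_i32`.
#[inline(always)]
fn max_i32(a: i32, b: i32) -> i32 {
    if a < b { b } else { a }
}

/// One comparator of the network: afterwards v[i] <= v[j].
#[inline(always)]
fn compare_exchange(v: &mut [i32; 4], i: usize, j: usize) {
    let (lo, hi) = (min_i32(v[i], v[j]), max_i32(v[i], v[j]));
    v[i] = lo;
    v[j] = hi;
}

/// 4-input sorting network: 5 comparators in a fixed, data-independent order.
pub fn sort4(v: &mut [i32; 4]) {
    compare_exchange(v, 0, 1);
    compare_exchange(v, 2, 3);
    compare_exchange(v, 0, 2);
    compare_exchange(v, 1, 3);
    compare_exchange(v, 1, 2);
}
```

Calling `sort4(&mut v)` on a `[i32; 4]` sorts it in place. Whether the compiler turns the independent comparators into packed min/max instructions depends on the compiler version and target flags, so the generated assembly still needs checking for any given build.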

My target is an i7-8700, skylake, -msse4.2.

So my question is two-fold. Why is the C++ code faster? Why is rustc not able to achieve this?

Ixrec · November 22, 2019, 2:34pm

https://users.rust-lang.org/ would be the better forum for questions like this. There are loads of other "why is this code slower?" threads there. Of course, the obligatory first question is "Did you build with --release?", so either retry with it or state in your new post that you did use it.

Tom-Phinney · November 22, 2019, 2:41pm

To extend the above reply, this forum is for discussion of proposed changes to the Rust language and its compiler and related tooling. Queries such as yours belong in the Users forum, whose URL is in the prior post.

drdozer · November 25, 2019, 11:18am

Thanks

system · March 3, 2020, 12:58am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
