EDIT: The scope of this pre-RFC is much smaller than it originally was.
This is a proposal to provide
slice::unstable_sort in libcore, implemented as pattern-defeating quicksort (pdqsort).
There would also be variants
slice::sort is by contract a stable sort. This is a decision that can’t be changed now, but stable sorting is a good default.
Rust users have expressed interest in unstable sorts and explained that they very rarely (some even say “never”) need stable sorting, and that it unnecessarily constrains the potential for faster sorting.
Another dissatisfaction is expressed towards libcore, which does not include a sorting method at all.
Wait, what is stable sorting and why would I care?
What does “stable” mean when we talk about sorting? A sorting algorithm is stable if it guarantees that it will not reorder equal elements. For example:
[(0, 5), (0, 4)].sort_by_key(|p| p.0); // Will preserve the order. [(0, 5), (0, 4)].unstable_sort_by_key(|p| p.0); // May or may not preserve the order.
Why would anyone care about stability? Take this table as an example.
You want to have a table of albums sorted by release year, while sorting by number of sales within the same year. To do that, first you click the “Claimed sales” column to sort by sales, and then click the “Year” column to sort by year. The original order within the same year will be preserved because that wiki page uses a stable sorting algorithm.
If the second sort were unstable, then the albums within the same year may have been unpredictably shuffled. This is a classic example of stable sorting being useful in practice.
Let’s reiterate why users don’t want stable sorting in most cases:
- Because stable sorting is generally slower than unstable sorting.
- Because stable sorting usually allocates a lot of memory. Technically, it doesn’t have to, but then it would be even slower!
This creates a fairly clear split between two kinds of sort functions:
- Stable sort, which is allowed to allocate.
- Unstable sort, which is not allowed to allocate.
In order to make an educated decision on API design, we must first understand what kinds of sorting algorithms are offered, and what are their advantages and deficiencies. The following algorithms are where the state of art is.
| Algorithm | Stable? | Allocates? | Comparison-based? | Speed | Complexity | |-----------|---------|---------------|-------------------|------------|------------| | timsort | Yes | Yes | Yes | 1 .. 1 | 1 | | wikisort | Yes | No | Yes | 1.7 .. 1.7 | 3 | | introsort | No | No | Yes | 1 .. 1 | 1.5 | | pdqsort | No | No | Yes | 0.6 .. 1 | 2 | | radixsort | Yes | ~2kB on stack | No | 0.4 .. N/A | 1.5 |
- Speed is my rough estimate, where lower means faster. The first number indicates how fast the algorithm is when sorting primitive data types (e.g. integers), and the second how fast it is when sorting using a non-trivial comparison function.
- Complexity is roughly how much code is required and how difficult it is to implement & understand. Higher number means more complex.
- Timsort is the current implementation of
- Timsort and wikisort are similar - the difference is that wikisort does not allocate.
- Introsort is a typical fusion of quicksort with fallback on heapsort to guarantee
O(n log n)worst-case.
- Pdqsort is a strict improvement over introsort. When sorting primitive integers or using branchless comparison functions, speed is fantastic - that is the estimated
0.6. In other cases, the performance is comparable to introsort.
- Pdqsort and radixsort are not always faster than timsort. Timsort wins when sorting few concatenated sorted lists, but other than that timsort is generally slower.
All right, so which algorithms of the bunch should we use?
- Timsort was a good choice for
- Now I propose pdqsort for
Why not use radix sort?
- It’s difficult to make it perform as well as pdqsort on partially sorted inputs, and it’s not much faster than pdqsort on random inputs anyway
- It’s not applicable as a general purpose comparison-based sort, so it would be used only as a specialization for sorting integers. This further complicates implementation.
How We Teach This
In the documentation for
slice::sort just mention that there’s a faster non-allocating alternative:
More complexity and code to review. However, pdqsort really isn’t that much more code.
slice::unstable_sort might be slower than
slice::sort, which could be surprising. An example of such cases would be several concatenated sorted lists. However, such cases are not too common and I’m not aware of any others.
Another possibility would be to simply provide pdqsort in an external crate for users who need it.
Since stable and unstable sorting are indistinguishable from each other when sorting integers, it might be a good idea to specialize
slice::sort for those cases and switch to pdqsort. But I’m not too fond of this idea. I’d prefer to have more consistent and predictable performance.