Update (2018-05-17)
- I published the crate
binary-heap-plus
for those who need custom ordered heap NOW. crates.io: Rust Package Registry FnMut
based solution might be better thanCompare<T>
for RFC inclusion, but it requires some unstable features to be written nicely.- I won't (can't) write the
FnMut
based RFC for the time being because I'm not familiar with the related unstable features.
Thanks for all suggestions!
Pre-RFC
- Feature Name: binary_heap_flexivility
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)
Summary
Extend BinaryHeap such that it supports max/min/custom heap in a backward-compatible way.
Motivation
The current BinaryHeap
implementation only supports max-heap. It is useful if heap could be used in arbitrary orders.
In C++ version of the same library, priority_queue
can pop elements in arbitrary orders. The following C++ codes are excerpt from std::priority_queue - cppreference.com
[C++] Max heap
std::priority_queue<int> q;
for(int n : {1,8,5,6,3,4,0,9,7,2})
q.push(n);
print_queue(q); // prints: 9 8 7 6 5 4 3 2 1 0
[C++] Min heap
std::priority_queue<int, std::vector<int>, std::greater<int> > q2;
for(int n : {1,8,5,6,3,4,0,9,7,2})
q2.push(n);
print_queue(q2); // prints: 0 1 2 3 4 5 6 7 8 9
[C++] Custom heap using lambda
// Using lambda to compare elements.
auto cmp = [](int left, int right) { return (left ^ 1) < (right ^ 1);};
std::priority_queue<int, std::vector<int>, decltype(cmp)> q3(cmp);
for(int n : {1,8,5,6,3,4,0,9,7,2})
q3.push(n);
print_queue(q3); // prints: 8 9 6 7 4 5 2 3 0 1
Guide-level explanation
[Rust] Max heap (Backward Compatible)
Existing code should work as max heap like before.
let heap = BinaryHeap::from(vec![1, 8, 5, 6, 3, 4, 0, 9, 7, 2]); // Max heap
assert_eq!(
itertools::unfold(heap, |heap| heap.pop()).collect::<Vec<_>>(),
vec![9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
);
[Rust] Max heap
You can explicitly declare it as a max heap.
let heap: BinaryHeap<_, MaxComparator> = BinaryHeap::from(vec![1, 8, 5, 6, 3, 4, 0, 9, 7, 2]); // Max heap
assert_eq!(
itertools::unfold(heap, |heap| heap.pop()).collect::<Vec<_>>(),
vec![9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
);
[Rust] Min heap
You can explicitly declare it as a min heap.
let heap: BinaryHeap<_, MinComparator> = BinaryHeap::from(vec![1, 8, 5, 6, 3, 4, 0, 9, 7, 2]); // Min heap
assert_eq!(
itertools::unfold(heap, |heap| heap.pop()).collect::<Vec<_>>(),
vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
);
[Rust] Custom heap
You can declare custom heap.
let heap: BinaryHeap<_, CustomComparator> = BinaryHeap::from(vec![1, 8, 5, 6, 3, 4, 0, 9, 7, 2]);
assert_eq!(
itertools::unfold(heap, |heap| heap.pop()).collect::<Vec<_>>(),
vec![8, 9, 6, 7, 4, 5, 2, 3, 0, 1]
);
In order to implement CustomComparator
, you need to read the reference (later in this document).
It requires more setup (defining CustomComparator
) compared to C++ version, but the Rust version is potentially faster because compare
function could be statically resolved and therefore inlined.
[Edit]
[/Edit]
Reference-level explanation
The Compare<T>
trait
Compare<T>
is a light-weight version of Ord
that only carries one compare()
function.
The difference than Ord
is:
- Much simpler:
- To implement
Ord
, you need to implement all of the family traits (Ord
,PartialOrd
,Eq
andPartialEq
). Compare<T>
requires only one function.Compare<T>
is nothing to do with operator overloading.
- To implement
- Support multiple ordering:
Ord
can only define one ordering per typeT
.Compare<T>
can define arbitrary ordering for typeT
.
We'll provide the default Compare<T>
for T: Ord
in std
:
pub struct MaxComparator;
impl<T: Ord> Compare<T> for MaxComparator {
fn compare(a: &T, b: &T) -> Ordering {
a.cmp(&b)
}
}
pub struct MinComparator;
impl<T: Ord> Compare<T> for MinComparator {
fn compare(a: &T, b: &T) -> Ordering {
b.cmp(&a)
}
}
In the user code, you'd implement Compare<T>
for CustomComparator
like this:
pub struct CustomComparator;
impl<T: Ord> Compare<T> for CustomComparator {
fn compare(left: &T, right: &T) -> Ordering {
(left ^ 1).cmp(right ^ 1)
}
}
Change to BinaryHeap<T>
The current type signature is:
pub struct BinaryHeap<T> { /* fields omitted */ }
We'll change this to:
pub struct BinaryHeap<T, C = MaxComparator> where C: Compare<T> { /* fields omitted */ }
Drawbacks
- It does not support the following convenient method:
let mut heap = BinaryHeap::with_cmp(|&a, &b| (a ^ 1).cmp(b ^ 1)); // custom heap creation using lambda
Rationale and alternatives
[Alt1] Use type alias
My experimental crate used slightly different design and achieved good backward compatibility. After implemented it, I read the API revolution RFC (RFC 1105) and concluded using the default type parameter is a better option.
pub struct BinaryHeapPlus<T, C> where C: Compare<T>;
/// A backward-compatibility type that can be used as a max heap.
pub type BinaryHeap<T> = BinaryHeapPlus<T, MaxComparator>;
[Edit] simplified to illustrate more clearly. [/Edit]
See GitHub - sekineh/binary-heap-plus-rs: Enhancement over Rust's `std::collections::BinaryHeap`. Supports other than max heap. for the details.
[Alt2] Include function pointer as a struct member
This would enable BinaryHeap::with_cmp(|&a, &b| expr)
use case, but we need to give up the optimization opportunities such as inlining.
It's fine in most other general-purpose languages, but Rust favors zero-cost abstraction. I'm not sure if it is acceptable in embedded, for example.
[Edit]
The above is wrong.
Using F: Fn
trait, the implementation could be monomorphized into concrete f
.
[/Edit]
[Alt3] Do nothing
Use new type pattern and implement all of Ord
trait families.
The code below is modified from a real-world example:
/// QueryOnNode orderd by frequency
#[derive(Debug, Clone, Eq)]
pub struct NodeRangeByFrequency(QueryOnNode);
impl Ord for NodeRangeByFrequency {
fn cmp(&self, other: &Self) -> Ordering {
// implement this
}
}
impl PartialOrd for NodeRangeByFrequency {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(&other)) // boilerplate
}
}
impl PartialEq for NodeRangeByFrequency {
fn eq(&self, other: &Self) -> bool {
self.cmp(other) == Ordering::Equal // boilerplate
}
}
Prior art
- C++ priority_queue
- Rust:
heapless::BinaryHeap
- URL: https://japaric.github.io/heapless/heapless/binary_heap/index.html
- supports Max and Min heap depending on type args.
- Rust:
BinaryHeapPlus
- URL: GitHub - sekineh/binary-heap-plus-rs: Enhancement over Rust's `std::collections::BinaryHeap`. Supports other than max heap.
- Imcomplete prototype for this RFC
Unresolved questions
Is it really backward compatible?
RFC 1105 described that "adding defaulted type parameters" is a minor change, but is it always possible to implement in a compatible way?
// MINOR CHANGE
// Before
struct Foo { .. }
// After
struct Foo<A = u8> { .. }
In my experiment, BinaryHeap::from() was broken when I tried to follow this pattern.
https://github.com/sekineh/binary-heap-plus-rs/tree/eliminate_binaryheapplus/src produces the following error:
sekineh@MX3-sekineh MINGW64 /c/workmx3/01WaveletMatrix/binary-heap-plus (eliminate_binaryheapplus)
$ cargo test
Compiling binary-heap-plus v0.1.0 (file:///C:/workmx3/01WaveletMatrix/binary-heap-plus)
error[E0282]: type annotations needed
--> src\lib.rs:25:20
|
25 | let heap = BinaryHeap::from(data);
| ---- ^^^^^^^^^^^^^^^^ cannot infer type for `C`
| |
| consider giving `heap` a type
error: aborting due to previous error
For more information about this error, try `rustc --explain E0282`.