Summary
Add a new trait Comparator
to std::cmp
, as a generics argument to all ordered collections.
All codes of RFC can be found at Rust Playground
I prefer the feature gate comparator_api
.
Motivation
There are several ordered collections in std library, such as BTreeMap
, BTreeSet
, BinaryHeap
.
While we want to customize the behaviour of the ordering in the collection, the only way is to add a wrapper type on top of original entry type, and implement Ord
for them. The solution works in many scenarios (although tedious), but introduce heavy overhead while we need runtime generated comparator.
A typical usecase is database, while we compare two rows in database, the order depends on some runtime schemas and rules, e.g. ORDER BY (x ASC, y DESC, z ASC)
. We may introduce a dynamic rule struct OrderRule
to represents the order.
struct Row {
x: i32,
y: i32,
z: i32,
}
struct OrderRule {
// true represents ASC, and false represents DESC
x_is_asc: bool,
y_is_asc: bool,
z_is_asc: bool,
}
The only way to store Row in the current BTreeMap is that, create a wrapper type:
struct OrderedRow {
inner: Row,
rule: Rc<OrderRule>,
}
impl Ord for OrderedRow {
fn cmp(&self, rhs: &self) -> Ordering {
assert!(self.rule == other.rule);
...
}
}
The implementation have two problems:
- We have to store extra 16 bytes for Rc in the entry, but the original Row is only 24 bytes, which significantly wastes the space.
- We have to assert
self.rule == other.rule
during run time, although we know they will be always equal.
In many other languages, we can define runtime comparator for ordered collections (of course, they also allow customization of <
on entry type), e.g.:
- C++: Compare in std::map
- Java: Comparator in SortedMap
The RFC want to introduce similar thing in rust.
Interface
We can introduce a new trait Comparator
in std::cmp
.
pub trait Comparator<T> {
fn compare(&self, lhs: &T, rhs: &T) -> Ordering;
}
The most importent and required method is compare
, which receives two items and returns Ordering, and the other methods are trivial.
For compatibility and usability, we also need to introduce a default implementation DefaultComparator
:
// A default Comparator implementation, which call Ord directly.
pub struct DefaultComparator<T>(PhantomData<T>);
impl<T> Comparator<T> for DefaultComparator<T>
where
T: Ord,
{
fn compare(&self, lhs: &T, rhs: &T) -> Ordering {
lhs.cmp(rhs)
}
}
DefaultComparator
is ZST, and call Ord::cmp
directly, which means that it will not introduce any overhead, then we can add a generics argument for BTreeMap
with default implementation:
pub struct BTreeMap<K, V, C = DefaultComparator<K>> {
// If C is DefaultComparator, then c will be zero-sized.
c: C,
// TODO
_phantom: PhantomData<(K, V)>,
}
impl<K, V, C> BTreeMap<K, V, C> {
pub fn new_in_comparator(c: C) -> Self {
Self {
c,
_phantom: PhantomData,
}
}
pub fn method_without_order(&self) {
todo!()
}
}
impl<K, V, C> BTreeMap<K, V, C>
where
C: Comparator<K>,
{
pub fn method_with_order(&self) {
let k1 = (todo!()) as K;
let k2 = (todo!()) as K;
self.c.compare(&k1, &k2);
}
}
We omit allocator_api
here, which is an unstable feature.
The BTreeMap keeps the original behaviour without any break changes:
- The default implementation is trivial and zero-overhead.
- We can still create a BTreeMap with
K: !Ord
, and calls a part of methods.
Use Case
We can handle the case above easily without any overhead, all we need to do is impl Comparator for OrderRule
:
impl Comparator<Row> for OrderRule {
fn compare(&self, lhs: &Row, rhs: &Row) -> Ordering {
match (self.x_is_asc, lhs.x.cmp(&rhs.x)) {
(true, Ordering::Less) | (false, Ordering::Greater) => Ordering::Less,
(false, Ordering::Less) | (true, Ordering::Greater) => Ordering::Greater,
(_, Ordering::Equal) => match (self.y_is_asc, lhs.y.cmp(&rhs.y)) {
(true, Ordering::Less) | (false, Ordering::Greater) => Ordering::Less,
(false, Ordering::Less) | (true, Ordering::Greater) => Ordering::Greater,
(_, Ordering::Equal) => match (self.z_is_asc, lhs.z.cmp(&rhs.z)) {
(true, Ordering::Less) | (false, Ordering::Greater) => Ordering::Less,
(false, Ordering::Less) | (true, Ordering::Greater) => Ordering::Greater,
(_, Ordering::Equal) => Ordering::Equal,
},
},
}
}
}
We can use it easily:
fn f() {
let rule = OrderRule {
x_is_asc: true,
y_is_asc: false,
z_is_asc: true,
};
let map = BTreeMap::<Row, (), _>::new_in_comparator(rule);
map.method_with_order();
}
Others
We can create some util methods to simplify the usage, but i prefer implementing them in other feature gates, so we can discuss the details later.
pub struct CompareFn<T, F> {
inner: F,
_phantom: PhantomData<T>,
}
impl<T, F> Comparator<T> for CompareFn<T, F>
where
F: Fn(&T, &T) -> Ordering,
{
fn compare(&self, lhs: &T, rhs: &T) -> Ordering {
(self.inner)(lhs, rhs)
}
}
pub fn compare_fn<T, F>(f: F) -> CompareFn<T, F>
where
F: Fn(&T, &T) -> Ordering,
{
CompareFn {
inner: f,
_phantom: PhantomData,
}
}
fn f2() {
BTreeMap::<i32, (), _>::new_in_comparator(compare_fn(|_x: &i32, _y: &i32| Ordering::Equal));
}
Alternatives
Some languages use key-based customization instead of compare-based one, e.g. Python key functions, which is easier to use in some scenarios, but the compare-based one is more generic.
I prefer to add key_fn
and KeyFn impl Comparator
, which are similar to compare_fn
and CompareFn
, and keep the compare-based core design.
TBD
-
compare
orcmp
? - The order of
Comparator
andAllocator
, I prefer<K, V, C, A>