`OrderBy` struct in `std::cmp`?

A few times I've had the desire to sort objects in a list or tree by a custom key/lambda, but currently that is a bit cumbersome. You have to define a new-type wrapper and impl Ord for it. It would be nicer if the std::cmp module had a helper for this. It already has one for std::cmp::Reverse...

This is what I propose:

#[derive(Debug, Copy, Clone)]
pub struct OrderBy<T, F, U>
where
    F: Fn(&T) -> U,
    U: Ord,
{
    inner: T,
    by: F,
}

impl<T, F, U> PartialEq for OrderBy<T, F, U>
where
    F: Fn(&T) -> U,
    U: Ord,
{
    fn eq(&self, other: &Self) -> bool {
        (self.by)(&self.inner).eq(&(other.by)(&other.inner))
    }
}

impl<T, F, U> Eq for OrderBy<T, F, U>
where
    F: Fn(&T) -> U,
    U: Ord,
{
}

impl<T, F, U> PartialOrd for OrderBy<T, F, U>
where
    F: Fn(&T) -> U,
    U: Ord,
{
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl<T, F, U> Ord for OrderBy<T, F, U>
where
    F: Fn(&T) -> U,
    U: Ord,
{
    fn cmp(&self, other: &Self) -> Ordering {
        (self.by)(&self.inner).cmp(&(other.by)(&other.inner))
    }
}

impl<T, F, U> OrderBy<T, F, U>
where
    F: Fn(&T) -> U,
    U: Ord,
{
    pub fn new(inner: T, by: F) -> Self {
        Self { inner, by }
    }

    pub fn into_inner(self) -> T {
        self.inner
    }
}
2 Likes

What's the benefit of having this in the standard library vs. an external crate?

1 Like

Minor nit: This would be OrderByKey, not just OrderBy.

I think this would be difficult to use for now, because without abstract type it's hard to name the type it's using to store anywhere. I can store a BinaryHeap<Reverse<T>> in a struct easily, but storing the BinaryHeap<OrderByKey<T, ???, U>> would be awkward.

7 Likes

This is awkward as written, because each object stores a different instance of the comparison (or key extraction) closure. Ideally the function generic would be a pure compile-time concept, either as a const generic function pointer, or just as a F: Default where it's always a new instance getting called. (Are stateless closure/function types Default?)

1 Like

I think the answer is, “no, but it has been discussed before”.

2 Likes

cc Call Site Dependency Injection

Storing a closure/using Ord isn’t the most flexible API possible, it doesn’t workout lifetimes in some cases (main example is storing a tree of indices, which are compared by the Ord of an object they point to). A maximally flexible API would take an Ord closure at the point where you use the tree.

It’d be cool if we can come up with something ergonomic and more general here!

An interesting example of a flexible API is hashbrown raw entry API. One drawback of it I’ve recently learned in a painful way is that it’s possible to accidentally mixup natural ordering and he one from provided closures: https://github.com/rust-analyzer/rowan/pull/111

1 Like

It's simple and is directly part of the standard library's interface, while also being just complex enough to be difficult to get exactly right (like dbg). Assuming it also shakes out to have a "definitely correct" API, it sounds like a great candidate for a one struct crate that eventually just gets merged to std.

1 Like

Hmm... thanks all! There are some good objections mentioned, to which I don't have good answers. I guess I'm back to the drawing board....

update: this is my fault that assume F could not be optimized by rustc.


that could be a totally waste of memory. std::cmp::Reverse<T>'s size is actually the size of T, but here you must store F in each OrderBy struct, which could waste a lot of memory. even if you're using &F or Box

#[derive(Debug, Copy, Clone)]
pub struct OrderBy<T, F:'static+?Sized>
where
    F: Fn(&T) -> std::cmp::Ordering,
{
    inner: T,
    by: &'static F,
}
use std::mem::size_of;
fn main(){
  println!("{}",size_of::<i32>());
  println!("{}",size_of::<std::cmp::Reverse<i32>>());
  println!("{}",size_of::<OrderBy<i32,dyn Fn(&i32)->std::cmp::Ordering>>())
}

output:

4
4
24

maybe, 24==size_of::()+padding+fat pointer(2*size_of::()) here.

you may using extra 16 bytes for a struct converts to OrderBy<>, when you want to swap two different struct, you must swap extra 16 bytes regradless that these 16 bytes are actually the same for each other.

I have no good idea, maybe you could using marco to impl things like OrderByName<T>,OrderByGender<T>,...

I could not see any benefit just write down a OrderBy<T,F> or OrderBy<T,F,U>

But if it's using a closure, F can be a ZST:

let x = std::cmp::Reverse(123);
println!("{}", std::mem::size_of_val(&x));
let y = OrderBy {
    inner: 123,
    by: |x| -(x / 2),
};
println!("{}", std::mem::size_of_val(&y));

output

4
4

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a0fd4c5661711190641c1c1a68f4fb33

(But you'll notice I side-stepped the "how do I write the type?" problem with size_of_val.)

1 Like

It is better to write let f=|x:&i32|->i32{-(x / 2)};and let y = OrderBy {...by:f}

otherwise we may meet such error:

error[E0308]: mismatched types
  --> src/main.rs:77:22
   |
70 |         by: |x| -(x / 2),
   |             ------------ the expected closure
...
74 |         by: |x| -(x / 2),
   |             ------------ the found closure
...
77 |     println!("{}", y<z);
   |                      ^ expected closure, found a different closure
   |
   = note: expected struct `OrderBy<{integer}, [closure@src/main.rs:70:13: 70:25], {integer}>`
              found struct `OrderBy<{integer}, [closure@src/main.rs:74:13: 74:25], {integer}>`
   = note: no two closures, even if identical, have the same type
   = help: consider boxing your closure and/or using it as a trait object

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` due to previous error
1 Like

I'd rather like to see Compare Trait in std. It supports broader use cases than just OrderByKey. In that crate, compare::Extract struct is an equivalent of the OrderByKey struct.

https://crates.io/crates/compare doc: compare - Rust