`<[T]>::sort_by_index` and `<[T]>::sort_by_key_and_index`

What do you think about chances of getting something like sort_by_index into std?

pub fn sort_by_index<K, F>(&mut self, mut f: F)
where
    F: FnMut(usize) -> K,
    K: Ord;

pub fn sort_by_key_and_index<K, F>(&mut self, mut f: F)
where
    F: FnMut(usize, &T) -> K,
    K: Ord;

pub fn sort_unstable_by_index<K, F>(&mut self, mut f: F)
where
    F: FnMut(usize) -> K,
    K: Ord;

pub fn sort_unstable_by_key_and_index<K, F>(&mut self, mut f: F)
where
    F: FnMut(usize, &T) -> K,
    K: Ord;

// and `cached` too?

One possible use case is to sort one array by keys from some other array. It comes up very frequently for me, and for data oriented programming in general. permutation - Rust can do this, but it requires computing intermediate permutation and additional heap space.

I don't see a good way for it to be an external crate, as it would basically have to reimplement the sorting form std.

1 Like

Any implementation of this needs to allocate a temporary vector of indices and sort that. There is no reasonable way of doing this in place.

Why? You can just subtract pointers and pass the index to caller, isn't it?

The indices will be changing during sorting as you're swapping elements.

1 Like

Oh :sweat_smile:

Well, then unstable case is impossible Indeed.

To avoid internal allocation you could oblige caller to supply a scratch array of [usize; N].

EDIT: Concretely, suppose we added this function to slices

/// for 0 <= i < self.len() move item i to position permutation[i]
/// panics if self.len() != permutation.len()
/// (if possible make that a type constraint)
/// or if permutation is not actually a permutation of 0..self.len()
pub fn permute(&mut self, permutation: &[usize])

then I think the existing <[usize]>::sort_(unstable_)by_key could do the rest of the job?

Note that https://doc.rust-lang.org/std/primitive.slice.html#method.sort_by_cached_key does that internally, so maybe you could find a way to encode it in that?

Though that doesn't guarantee the order of the calls to the key generator (just that it's at most once per item), so you can't just calculate the index inside the closure.

Maybe some API may be composed in order to sort multiple slices with one leader? Structure of arrays basically.

First thing coming to mind is something like

let slice_a;
let slice_b;

slice_a.manual_sort_unstable(|slice, i, j| {
    mem::swap(&mut slice[i], &mut slice[j]);
    mem::swap(&mut slice_b[i], &mut slice_b[j]);
});
1 Like

I would love something like that too. I end up with some variations of parallel arrays / structure-of-arrays quite often.

That's technically quite close to sort_by_cached_key, but I almost never end up using actual sort_by_cached_key, because typically I have to build all the keys first using some elaborate process that can't make keys on demand, so I have to make my own "cached_key" storage, but still need the "sort" part.

I'm not sure if |slice, i, j| could optimize out bounds checks. I've tried to make LLVM remove bounds checks in select_nth_unstable and found that length checks fail to propagate across non-trivial conditions. But something that takes a slice of keys and a slice of values to sort, performs a one-time length check, and then swaps with unsafe indexing under the hood if necessary, would be fine for me.

[values].sort_by_keys(&mut [keys])
// or
[keys].sort_values(&mut [values])

The latter could work with multiple homogeneous parallel arrays via [keys].sort_values(&mut [[values]; N]). With some hypothetical variadic generics, or enough macro hacks, maybe it could support multiple different parallel arrays [keys].sort_values((&mut [T], &mut [U])).