Is this just an omission or is there a technical reason why it is hard (or incorrect?) to implement PartialEq
for BinaryHeap
?
I'm guessing it would be very unperformant because two observably indentical heaps could have very different internal structure.
Annoying though, when I need to write tests that compare structs which happen to contain heaps!
Agreed, the problem is probably that equality testing is not as easy as just comparing the two array elements.
However, I don't think it should be too bad: it could be done in O(N log N)
time as far as I understand: an O(1)
length check, and N lookups from the first heap into the second, each taking log(N) time.
The bigger problem, I would assume, is that it requires effectively random lookup for larger heaps compared to sequential iteration in the case of plain slices, which means it's not very cache-friendly.
If you only need it in tests, you could consider just comparing a.into_vec() == b.into_vec()
instead.
I don't think you can do an O(log(N))
lookup in a heap, because it's just in heap order (which takes O(n)
time to produce) not in binary-search-tree order (which takes O(n*log(n))
time to produce, since it's a sort).
Not when the BinaryHeap is nested somewhere in the struct I want to compare.
Now you made me wonder, and that might very well be right, I'm not sure anymore.
Would it instead be possible to produce a copy of both heaps such that only the subarrays representing each level are sorted? That sounds like strictly less than log(N) * N / 2 * log(N / 2)
operations, plus O(N)
for the comparison, so something like O(N log^2 N)
.
However, this analysis is just off the top of my head, and of course making two allocations sounds entirely inappropriate in a PartialEq
impl.
Trying to help convince myself is why I included the sorting complexity note Basically, if there was an O(n) way to arrange data such that one could do O(log n) contains checks, it'd probably be a standard technique by now. But binary search depends on fully-sorted, and there's a proof for comparison sorts being at best O(n log n).
I think if this were acceptable, then we'd just sort the copy, which brings things back to O(n log n) overall.
Then I think the easiest solution is to manually implement PartialEq
for the struct(s) that contain the BinaryHeap.
Ideally it should be possible to just use derivative (or other custom derive crate) and derive a (cfg(test) only?) comparison function with a custom one specified for the binary heap field of the struct. Though I'm not sure if I'd myself judge the dependency to pull its weight.
I did something similar. I created a wrapper for BinaryHeap
:
pub struct ComparableBinaryHeap<T: Ord>(BinaryHeap<T>);
with Deref
and DerefMut
impls and then defined a PartialEq
impl only for tests:
#[cfg(test)]
impl<T: PartialEq + Ord> PartialEq for ComparableBinaryHeap<T> {
fn eq(&self, other: &Self) -> bool {
// ...
}
}
And, in the outermost struct, derived PartialEq
only for tests:
#[cfg_attr(test, derive(PartialEq))]
pub struct StructToTest {
heap: ComparableBinaryHeap<Foo>,
// fields omitted...
}
It's ugly, but makes the tests a lot less fragile.
I haven't used derivative
before but it looks like that approach would be much better, as I wouldn't have to define a mostly useless wrapper.
Yep. Much better! Thanks for the tip.
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.