Is 6 the right B for BTtree? how to benchmark?

Eh2406 · September 13, 2016, 6:30pm

However in the future we would like to further explore choosing the optimal search strategy based on the choice of B, and possibly other factors.

I'd like to start some of that discussion.

Currently B = 6 and we use linear search. In the Python world good performance has been found with 100 < B < 10000 and binary search. (and depth limited to 2)

I'd like to do some benchmarking to ground the decision on what should be the default. But benchmarking Binary search is not always straightforward. I.E. I am jumping in somewhat over my head.

(Note, also this link)

What evidence do the libs team want to see? What pitfalls do I need to watch out for in benchmarking? Am I redoing work someone else is doing?

sfackler · September 13, 2016, 9:34pm

I believe this was fairly heavily benchmarked when configurability of the B value was removed before 1.0 and it turned out to not really matter much if at all. I can’t find that PR right now but it would be worth tracking down for background.

Eh2406 · September 14, 2016, 3:25pm

This is the PR where const B: usize = 6; was added.

I have not found benchmarks of changing the size of B, nor changing to binary or adaptive search. Links of course welcome.

Eh2406 · October 7, 2016, 12:59am

I have been doing some research. https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/ is releavent.

matklad · October 7, 2016, 12:20pm

There is an interesting variation of binary search, where you lay out elements of the array in a cache friendly manner. That is, if text book binary search will look at the middle element first, and than at .25 and .75 quantiles (which may be far away cache wise), this variation will look on the first and second or third element: http://bannalia.blogspot.ru/2015/06/cache-friendly-binary-search.html

bluss · October 8, 2016, 2:41pm

The old with_b constructor used a runtime-level B, there was some doubt whether it would be compatible with future extensions of BTreeMap (generic parameter-selected B?).

Eh2406 · October 9, 2016, 1:55am

Yes, The move from a runtime-level B to a constant was important for many performance improvements. Definitely a win! I hope we can get with_b back when we get type level numbers, but we can have that conversation when they arrive. In the meantime I just wonder if 6 is the best choice, and on what evidence we can decide.

I love how open rust is to enthusiastic respectful new contributors. I’d love to help do the benchmarking to decide these things. I just want to make sure that I am being respectful of the team’s time, by bringing up evidence that they will find useful.

system · March 25, 2019, 8:26am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Suggestion: BTreeMap/BTreeSet O(log(n)) n-th element access libs	4	2101	May 26, 2019
BTreeMap: searching with a predicate? libs	10	2589	March 25, 2019
Interface improvements for BTreeMap libs	8	2127	October 16, 2022
BTreeMap nth() improvements	1	1025	December 7, 2021
Adding a const `B` parameter to BTreeMap? libs	10	1770	May 17, 2021

Is 6 the right B for BTtree? how to benchmark?

Related topics