
I think you might be missing the real performance characteristics of the hardware. The number of comparisons might be the same, but comparisons are only a proxy for the expensive part of a lookup: the latency of asking RAM for data.

If the whole tree is in CPU cache then the difference is likely noise, but if the tree is large enough that some of it lives in RAM, then the cost of copying it into cache scales not with the size but with the number of round trips from the CPU to RAM (unless the size is way out of proportion). RAM latency is generally an order of magnitude worse than CPU cache, and fast contemporary storage is generally an order of magnitude worse still.

There are many places you can find "Latency numbers every programmer needs to know" https://fullstackengineer.pro/blog/latency-numbers-programme...
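Roughly, per those tables: an L1 cache hit is on the order of 1ns, a main-memory reference ~100ns, and an NVMe random read tens of microseconds. The exact figures shift with each hardware generation, but the ratios between the tiers barely move.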

But this trend has held for about 30 years and seems fundamental to the economics of building hardware: CPUs will be fastest, cache slower, RAM slower still, and fast storage slower yet.

From your example: if it takes 100ns to fetch the node containing a~f, then the whole B-tree search for f might take 150ns (1 lookup from RAM + a few CPU instructions), but the strictly binary tree will need 6 lookups to RAM, and we can ignore the CPU time because it is so small. That could take 600ns (or 1300ns if each branch is a node holding only pointers, so the data has to be dereferenced too). A smart cache prefetcher might guess that those were pointers and pre-cache a and b once n was referenced, and might even cache things pointed to by loaded pointers, but even in that highly optimistic scenario it could still take 300ns to load everything.
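To make the two layouts concrete, here's a minimal C++ sketch (my own illustration, with a made-up fan-out of 8, not from any particular implementation):

    // Binary tree: every comparison chases a pointer, i.e. a dependent
    // load that can be a full RAM round trip.
    struct BinNode {
        int key;
        BinNode* left;
        BinNode* right;
    };

    const BinNode* find(const BinNode* n, int k) {
        while (n && n->key != k)
            n = (k < n->key) ? n->left : n->right;  // one load per level
        return n;
    }

    // Wide (B-tree-style) node: 8 sorted keys sit contiguously, so one
    // fetch pulls the whole node into cache and the comparisons then
    // run out of L1/registers.
    struct WideNode {
        int keys[8];
        WideNode* children[9];
    };

    int child_slot(const WideNode* n, int k) {
        int i = 0;
        while (i < 8 && k > n->keys[i]) ++i;  // cheap once node is cached
        return i;
    }

The comparison counts are similar; the number of RAM round trips is not.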

Consider filesystems on storage, which have had this same problem, only worse, for a long time. Tree structures with many children per node beat binary trees in proportion to the latency between the thing processing the tree and the place the tree is stored.

EDIT - And don't get me started on clever use of SIMD instructions. Imagine if a node in a tree had a binary layout matching an AVX register; then the comparisons looking for the searched item could check a whole node in a single instruction! I am curious whether there are any implementations that do this?

EDIT 2 - It does exist and I am late to the show, people have been doing this for like 15 years: https://stackoverflow.com/questions/20616605/using-simd-avx-...
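For anyone who wants to see the shape of it, here is a minimal sketch of the single-instruction node compare (my own illustration, not taken from that thread; assumes AVX2, 8 sorted 32-bit keys per node, and GCC/Clang builtins; compile with -mavx2):

    #include <immintrin.h>

    // Compare the search key against all 8 keys of a node at once.
    // Returns how many keys are less than k, which for a sorted node
    // is exactly the child slot to descend into.
    int node_rank(const int* keys, int k) {
        __m256i node = _mm256_loadu_si256((const __m256i*)keys);
        __m256i key  = _mm256_set1_epi32(k);
        __m256i lt   = _mm256_cmpgt_epi32(key, node);  // keys[i] < k
        int mask = _mm256_movemask_ps(_mm256_castsi256_ps(lt));
        return __builtin_popcount(mask);
    }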



Like the article, I'm only concerned with comparison counts, trying to explain why B-trees' comparison counts look like a balanced BST's, and why the match gets tighter and tighter as the number of children per node increases.

It's simply because the nodes themselves contain the equivalent of a balanced binary tree, in the form of a binary-searched array.

In the ultimate case, we set the number of children large enough to hold all the data, so there is only one B-tree node, which is binary-searched to find the element. Then we have the same number of comparisons as a well-balanced binary search tree.
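As a sketch of that equivalence (illustrative only, assuming sorted 32-bit keys): binary-searching one node's key array costs the same ~log2(M) comparisons that a balanced BST over those M keys would.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Binary search inside a node: ~log2(M) comparisons, the same as
    // a balanced BST built over the node's M keys.
    std::size_t slot(const std::vector<int>& keys, int k) {
        return std::lower_bound(keys.begin(), keys.end(), k) - keys.begin();
    }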


Where M is the amount of data per node and N is the total amount of data, each node you descend skips a proportion (M-1)/M of the remaining data, as long as N is large enough to require multiple levels of nesting.

So for practical sizes of M, as in a few cache lines, and sufficiently large datasets of size N, a binary tree eliminates 1/2 of the remaining data at each node checked, while a B-tree with M set to 8 skips 7/8ths at each node. Counting comparisons they are both in Log2(N) time, but counting node descents this hypothetical B-tree needs only Log8(N).

So yeah, if N and M are close then they are similar, but those aren't interesting cases: datasets that small are fast to iterate over even unsorted, and large unwieldy values of M are silly and impractical. Once N reaches thousands or millions of items, the savings from never touching down-tree data at all become real. Put another way, both average ~Log2(N) comparisons, but a binary tree also makes ~Log2(N) node visits while a B-tree makes only ~LogM(N), which is lower in most cases and so close in the edge cases as to not matter.
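To put illustrative numbers on that: with N = 1,000,000 and M = 8, both structures do about log2(1,000,000) ≈ 20 comparisons, but the binary tree makes ~20 dependent loads while the B-tree makes only log8(1,000,000) ≈ 7 node loads.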

But again, I say that counting comparisons is silly. On any modern hardware (the past 20 years) load time dominates comparison time, and B-trees simply issue fewer loads; memory complexity is the "interesting" part of this problem unless you are on a C64 or at some zany future company with hyper-fast RAM.


Just because it was first discovered 15 years ago does not mean it isn't still a good idea.

Keep on thinking. You never know what else you will come up with.


They measured a speedup. SIMD is a super common way to speed stuff up and good compilers do it automatically.

A lot of string comparisons use SIMD now. In C++, if you use std::string on GCC, it has done this for years. I bet Rust and Python (in the compiled runtime) are doing it too.

No sense leaving easy performance gains on the table.



