Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This is really cool. We are doing something similar with CruzDB [0], but we are using red-black trees. What made you choose the radix tree? We've looked at several persistent data structures, and continue to look for alternatives.

[0]: https://nwat.xyz/blog/2018/02/15/cruzdb-architecture-part-1-...



> What made you choose the radix tree?

Library was initially written for use with Netflow collector. More precisely for storing filtered and aggregated traffic information. Typical load for such applications is: a lot of updates in "hot" dataset, fewer inserts, deletions are rare and mostly batched, no needs for fast sequential scan (user can wait for a few seconds or even more to get report). Radix trees are perfectly fits this requirements.

Later we made some tests and decided to use it for IP (plus some other information) lookups without underlying file ("RAM-only" mode).

We have to support some old network hardware (for example almost abandoned Tilera TILE-Gx), that's why we're trying to avoid using OS and platform-specific features in library.

Radix trees has disadvantages, the most notable is that it's hard to implement user-defined sorting order for keys and they required more memory for sparse nodes compared to B-trees (however, this statement is not true for on-disk TKVDB database, we don't store empty pointers)


Red black trees have crap performance due to pointer chasing.


For small maps (compressed) radix trees (patricia, judy) are even faster than hash maps, esp. concurrent.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: