vmxdev's comments

vmxdev · 2025-07-06T21:50:09 1751838609

I opened shownew just to make sure, 10 out of 30 posts there are AI-related. I caught myself thinking that because of this AI-hype I read HN less and less. Sure, it is clear that AI is interesting to people (or at least someone wants more hype around AI), but it would be nice to read about real hacker news and projects somewhere.

randomNumber7 · 2025-07-06T23:33:11 1751844791

I think the problem isn't AI, but the hype attracted a lot of low IQ script kids that think they are programmers because chatgpt can write simple code.

vmxdev · on Aug 26, 2018

> What made you choose the radix tree?

Library was initially written for use with Netflow collector. More precisely for storing filtered and aggregated traffic information. Typical load for such applications is: a lot of updates in "hot" dataset, fewer inserts, deletions are rare and mostly batched, no needs for fast sequential scan (user can wait for a few seconds or even more to get report). Radix trees are perfectly fits this requirements.

Later we made some tests and decided to use it for IP (plus some other information) lookups without underlying file ("RAM-only" mode).

We have to support some old network hardware (for example almost abandoned Tilera TILE-Gx), that's why we're trying to avoid using OS and platform-specific features in library.

Radix trees has disadvantages, the most notable is that it's hard to implement user-defined sorting order for keys and they required more memory for sparse nodes compared to B-trees (however, this statement is not true for on-disk TKVDB database, we don't store empty pointers)

vmxdev · on Aug 26, 2018

Yes, the first macro is the common trick, and the second is the mistake (wondering, but it works). I update the code. Thanks a lot for pointing it out!

vmxdev · on Aug 25, 2018

> Can you point out the pros and cons vs above projects?

Briefly speaking, the above projects are way more mature, full-featured and use different data structures for storing data. BDB, LMDB are B-tree-based, LevelDB (and forks), SQLite4 LSM are using log-structured merge-trees.

TKVDB is a lot simpler, a bit more low-level, uses Radix trees and not tested as well as other key-value engines.

> I read this as: only use it single threaded.

If you protect access to data with your OS locks, everything will be fine.

We're using TKVDB in multi threaded environment (in RAM-only mode) for IP lookups with Netmap and DPDK. On 40Gbe full line rate there is only few hundred CPU cycles for decision on network packet, so the locking tricks becomes important.

Intel has guarantees about writing properly aligned data - if one core is writing 1, 2, 4 or 8 bytes, it will be written atomically. We're exploiting this feature when updating our tree - final update operation is 8 byte (4 byte for 32 bit CPU's) aligned pointer update.

So, in some cases you don't need to use locks at all (you may add HW memory barrier (__sync_synchronize()) after tree update, but it will slow down writer). If only one core is writing data, another cores can safely (well, almost) read it. Warning in docs was about this lockless case.