> Simultaneous fetches of thousands (sometimes tens of thousands) of objects started becoming inefficient, especially when fetching from collections of tens of millions of objects
Why not try LevelDB? We're doing random reads of 170,000 vectors a second, out of 130M. No startup needed.
RocksDB started as a fork of LevelDB (Dean/Ghemawat).
The performance imperatives have changed with the hardware, but NAND flash at that time had an asymmetry between reads and writes in the amount of data touched: writing even one byte effectively cost you a whole block. That gave you an incentive to get your money’s worth by appending to a “log”, which would later be “merged” in some “structured” way, hence LSM (log-structured merge).
This is old news these days but it was quite the novelty at the time!
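For anyone who hasn't seen the idea spelled out, here is a rough sketch of the write-to-a-log-then-merge pattern. It is a toy and not how LevelDB or RocksDB actually implement it: `TinyLSM`, `memtable_limit`, and the method names are made up for illustration, and everything lives in memory instead of on disk.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store (hypothetical, in-memory only)."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}  # recent writes, still mutable
        self.runs = []      # immutable sorted runs, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Stand-in for one large sequential write of a sorted run,
        # instead of many small in-place updates.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key, default=None):
        if key in self.memtable:
            return self.memtable[key]
        # Search newest run first so later writes shadow earlier ones.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return default

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Reads get more expensive as runs pile up, which is why the merge step exists. Real engines add a write-ahead log, levels, bloom filters, and tombstones for deletes, none of which are shown here.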