If I have fairly fixed documentation and documents (won't be updated in months),...

fzliu · on March 25, 2023

If you have a small number of fixed documents (say, under 100k or so), then I agree that pickling the vectors or storing them as bytearrays would work better.
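For the small, static case, a minimal sketch of the "just pickle the vectors" approach might look like the following. It assumes embeddings are plain lists of floats keyed by a document ID; the function names (`save_vectors`, `load_vectors`, `top_k`) and the choice of cosine similarity are illustrative, not taken from any particular library.

```python
import heapq
import math
import os
import pickle
import tempfile


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def save_vectors(path, vectors):
    """Pickle a {doc_id: vector} dict to disk (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(vectors, f)


def load_vectors(path):
    """Load the pickled {doc_id: vector} dict back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)


def top_k(query, vectors, k=3):
    """Brute-force scan: score every stored vector, keep the k best."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real ones would come from a model.
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    path = os.path.join(tempfile.mkdtemp(), "vectors.pkl")
    save_vectors(path, docs)
    print(top_k([1.0, 0.1], load_vectors(path), k=2))
```

Every query scans all stored vectors, so the cost grows linearly with the number of documents. That is the part that stops being practical at larger scale.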

Once you reach a certain scale, distributed querying and/or different index types can help, even if your dataset is fairly static. You can check out a billion-scale search benchmark we recently did here: https://zilliz.com/resources/milvus-performance-benchmark (you'll need to supply your email, unfortunately). Here's the framework we used as well: https://github.com/zilliztech/vectordb-benchmark
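To make "distributed querying" a bit more concrete, here is a rough sketch of the scatter-gather pattern: split the vectors into shards, ask each shard for its own top k, then merge the partial results. This is an assumption-laden toy, not how Milvus or the benchmark framework above implements it; the threads merely stand in for calls to separate nodes, and names like `search_shard` and `scatter_gather` are hypothetical.

```python
import heapq
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_shard(query, shard, k):
    """Local top-k over one shard (a {doc_id: vector} dict)."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in shard.items())
    return heapq.nlargest(k, scored)


def scatter_gather(query, shards, k=3):
    """Query every shard in parallel, then merge the partial top-k lists."""
    # Threads are a stand-in for network calls to separate nodes.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda s: search_shard(query, s, k), shards))
    candidates = [hit for part in partials for hit in part]
    # The global top-k is always contained in the union of per-shard top-k.
    return heapq.nlargest(k, candidates)
```

Because each shard returns its own k best hits, merging them gives the same result as a single brute-force scan over all the data, while each node only has to hold and scan its own slice.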