
If I have fairly fixed documentation and documents (won't be updated in months), what's the benefit of using a vector database (e.g. Pinecone or Supabase with vectors) rather than just saving the vectors to a pickle (.pkl) file and loading that for every lookup?

Shouldn't using the pickle file be much faster/more efficient?



If you have a small number of fixed documents (under 100k or so), then I agree that pickling the vectors or storing them as bytearrays would work better.
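
For the small, static case, here is a minimal sketch of the pickle approach: normalize the vectors once, pickle them, and answer queries with an exact linear scan. The function names (`save_index`, `load_index`, `search`) and the dict layout are made up for illustration, and it uses only the standard library; in practice you'd probably keep the vectors in a NumPy array and do the scan as one matrix product.

```python
import heapq
import math
import pickle


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def save_index(path, ids, vectors):
    """Normalize once and pickle the ids alongside their vectors."""
    payload = {"ids": list(ids), "vectors": [normalize(v) for v in vectors]}
    with open(path, "wb") as f:
        pickle.dump(payload, f)


def load_index(path):
    """Load the pickled index; do this once at startup, not once per query."""
    with open(path, "rb") as f:
        return pickle.load(f)


def search(index, query, k=3):
    """Exact top-k by cosine similarity via a linear scan over every vector."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, vec)), doc_id)
        for doc_id, vec in zip(index["ids"], index["vectors"])
    )
    return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]
```

Every query touches every vector, so cost grows linearly with the number of documents. That's fine at this size, and it's the part that stops being fine as the collection grows. Also keep in mind that unpickling runs arbitrary code, so only load pickle files you wrote yourself.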

Once you reach a certain scale, it can help to use distributed querying and/or different index types, even if you have a fairly static dataset. You can check out a billion-scale search benchmark we recently did here: https://zilliz.com/resources/milvus-performance-benchmark (you'll need to supply your email, unfortunately). Here's the framework we used as well: https://github.com/zilliztech/vectordb-benchmark
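
To give a rough idea of what a "different index type" buys you, here is a toy partition-based (IVF-style) sketch in plain Python. It is not how Milvus or any particular database implements it: the centroids are just randomly sampled vectors rather than trained with k-means, the function names are made up, and the vectors are assumed to be unit-normalized already so that a dot product is the cosine similarity.

```python
import heapq
import random


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_partitions(vectors, n_lists=4, seed=0):
    """Sample n_lists vectors as centroids and bucket each vector under its closest one."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_lists)
    buckets = [[] for _ in centroids]
    for i, vec in enumerate(vectors):
        best = max(range(n_lists), key=lambda c: dot(vec, centroids[c]))
        buckets[best].append(i)
    return centroids, buckets


def search_partitions(vectors, centroids, buckets, query, k=3, n_probe=1):
    """Scan only the n_probe buckets whose centroids score highest against the query."""
    probe = heapq.nlargest(
        n_probe, range(len(centroids)), key=lambda c: dot(query, centroids[c])
    )
    candidates = (i for c in probe for i in buckets[c])
    # Returns indices into `vectors`, best match first.
    return heapq.nlargest(k, candidates, key=lambda i: dot(query, vectors[i]))
```

With `n_probe` equal to `n_lists` this degenerates to the full scan and returns exact results. Lowering `n_probe` scans fewer vectors per query at the cost of possibly missing a true neighbor that landed in an unprobed bucket, which is the speed/recall trade-off that approximate indexes expose.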



