We managed 200M long- and short-form embeddings (patents), indexed in ScaNN at runtime, with a metadata layer on LevelDB. Some simple murmur hash sharding and a stable K8s cluster on GCP were all we needed. That gave us low-millisecond retrieval and reranking to augment a primary search.
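For anyone wondering what "simple murmur hash sharding" can look like, here is a rough sketch, not our actual code. It assumes the 32-bit MurmurHash3 variant, a fixed shard count, and a string document ID as the key. The names `murmur3_32`, `shard_for`, and `NUM_SHARDS` are made up for illustration, and the hash is written out in pure Python only so the snippet has no dependencies.

```python
NUM_SHARDS = 16  # hypothetical shard count, not a figure from our setup


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit) of `data`, returned as an unsigned int."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    n = len(data)
    body_end = n - (n % 4)

    # Mix the input four bytes at a time, read as little-endian words.
    for i in range(0, body_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        k = (k * c2) & mask
        h ^= k
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask

    # Mix the 1-3 leftover bytes, if any.
    tail = data[body_end:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        k = (k * c2) & mask
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document ID to a shard index in [0, num_shards)."""
    return murmur3_32(doc_id.encode("utf-8")) % num_shards


if __name__ == "__main__":
    for doc_id in ("US1234567B2", "EP7654321A1"):
        print(doc_id, "-> shard", shard_for(doc_id))
```

The same ID always lands on the same shard, so the query router and the index builder only need to agree on the hash and the shard count. The catch with plain modulo is that changing the shard count remaps most keys, which is tolerable when the cluster is stable.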
I think in zero cases would we go back and use vector DBs or managed services if they were available to us (including Lucene or relational DB add-ons).