We are optimizing for latency and vector search is sufficient in 80-90% of cases...

7thpower · on May 23, 2024

Latency of search isn’t much of a concern; I was speaking to quality but did not word it well.

We have found that vector search does not play well with numbers and does not return consistent results, so we end up needing more chunks. That compounds token usage, slows responses, and raises the chance of incorrect answers because the customer-facing model gets confused by similar results. I’m sure we could optimize our approach, but full text has worked far more reliably than expected, so we have invested more resources into how we handle documents, latency reduction, and pulling in structured data.
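
To illustrate the point about numbers, here is a toy sketch of exact-token full-text matching. It is not the setup described above: the documents, the `tokenize`, `build_index`, and `search` helpers, and the AND-style matching are all assumptions made up for illustration. The idea is only that a keyword index treats "4417" and "4471" as different tokens, whereas an embedding-based search may rank them as near neighbours.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lowercase and keep alphanumeric runs, so "4417" stays one token.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    # Hypothetical toy inverted index: token -> set of document ids.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    # AND semantics: return only documents containing every query token.
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = None
    for token in tokens:
        postings = set(index.get(token, set()))
        result = postings if result is None else result & postings
    return result

# Made-up documents with near-identical numbers (illustrative only).
DOCS = {
    "a": "Invoice 4417 total due 1250.00",
    "b": "Invoice 4471 total due 1520.00",
    "c": "Invoice 4417 was refunded",
}
INDEX = build_index(DOCS)

print(search(INDEX, "invoice 4417"))
print(search(INDEX, "4471"))
```

With exact matching, a query for invoice 4417 never pulls in invoice 4471, so fewer near-duplicate chunks need to be passed to the model. A real full-text engine adds ranking, stemming, and phrase queries on top of this.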