farsa's comments

farsa · 2025-11-04T23:37:43 1762299463

The distinction is clearer when you index actual text and apply tokenization. A "typical" index on a database column looks like "column:(value => rows)". When people mention inverted indexes, it's usually in the context of full-text search, where the column value goes through tokenization and you build an index entry for each of its N tokens: "column:(token 1 => rows)", "column:(token 2 => rows)", ... "column:(token N => rows)".
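To make that concrete, here is a minimal Python sketch of the two shapes. The sample rows and the whitespace `tokenize` helper are made up for illustration; real full-text engines use much richer analyzers.

```python
from collections import defaultdict

# Hypothetical table: row id => text column value
rows = {
    1: "quick brown fox",
    2: "lazy brown dog",
    3: "quick brown fox",
}


def tokenize(text):
    # Toy tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


# "Typical" index: the whole column value maps to the rows holding it.
value_index = defaultdict(set)
for row_id, text in rows.items():
    value_index[text].add(row_id)

# Inverted index: every token of the value maps to the rows holding it.
token_index = defaultdict(set)
for row_id, text in rows.items():
    for token in tokenize(text):
        token_index[token].add(row_id)
```

With this data, `value_index` can only answer exact-value lookups such as "quick brown fox", while `token_index` can answer "which rows contain brown".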

farsa · 2025-08-10T00:48:06 1754786886

What was your experience like putting such a thing together?

farsa · 2025-08-10T00:45:56 1754786756

Not the person you asked, but at work (we are a CRM platform) we allow our clients to run arbitrary queries against their user base to find matching users for marketing campaigns (email, SMS, WhatsApp). These campaigns can sometimes target a few hundred thousand people. We are on a really ancient version of ES, and it sucks at this job in terms of throughput. Some experimenting with BigQuery indicates it is much better at mass exporting.

whakim · 2025-08-10T01:23:43 1754789023

Fair; my question was mostly in the context of ANN, since that was the discussion point - I have to assume ES (as a search engine) would not necessarily be the right tool for data warehousing types of workloads.

farsa · on Dec 15, 2024

Hey, you have some silly thing rendering on your product's landing page that is chewing CPU.

downrightmike · on Dec 16, 2024

It's not mine, I just learned about it is all.

farsa · on Aug 6, 2024

It's still in technical preview.

farsa · on Sept 22, 2023

It's not exactly simple, as it involves some Postgres-specific knowledge, but you can make it reliable by working with transaction IDs (see https://event-driven.io/en/ordering_in_postgres_outbox/).
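As a rough illustration of the transaction ID idea, here is a pure-Python model, not real Postgres code. It assumes each outbox row records the ID of the transaction that wrote it, and that the reader knows `xmin`, the oldest transaction ID still in progress. The names `OutboxRow`, `poll`, and `xmin` are hypothetical, and the rule (only read rows below `xmin`, ordered by transaction ID and then position) is my reading of the approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboxRow:
    position: int  # value from a sequence, assigned at insert time
    txid: int      # id of the transaction that inserted the row
    payload: str


def poll(visible_rows, xmin, last_txid, last_position, limit=100):
    """Return the next batch in (txid, position) order.

    Rows whose txid is >= xmin are held back, because an older
    transaction could still commit rows that sort before them.
    """
    eligible = [
        row for row in visible_rows
        if row.txid < xmin
        and (row.txid, row.position) > (last_txid, last_position)
    ]
    eligible.sort(key=lambda row: (row.txid, row.position))
    return eligible[:limit]


# Transaction 11 took position 2 but has not committed yet,
# so a reader only sees the rows from transactions 10 and 12.
visible = [OutboxRow(1, 10, "a"), OutboxRow(3, 12, "c")]
first = poll(visible, xmin=11, last_txid=0, last_position=0)

# Transaction 11 commits; nothing is in progress, so xmin moves past 12.
visible.append(OutboxRow(2, 11, "b"))
second = poll(visible, xmin=13, last_txid=10, last_position=1)
```

In this model the first poll returns only the row from transaction 10, and the second returns the rows from transactions 11 and 12 in that order. A poller that tracked only `position` would have read positions 1 and 3 on the first pass and never gone back for 2.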