Hacker Newsnew | past | comments | ask | show | jobs | submit | farsa's commentslogin

The distinction is more clear when indexing actual text and applying tokenization. A "typical" index on a database column goes like "column(value => rows)". When people mention inverted indexes its usually in the context of full text search, where "column value" usually goes through tokenization and you build an index for all N tokens of a column "column:(token 1 => rows)", "column:(token 2 => rows)",... "column:(token N => rows)".


What was your experience like putting such thing together?


Not the person you have asked but at work (we are a CRM platform) we allow our clients to arbitrarily query their userbase to find matching users for marketing campaigns (email, sms, whatsapp). These campaigns can some times target a few hundred thousand people. We are on a really ancient version of ES, but it sucks at this job in terms of throughput. Some experimenting with bigquery indicates it is so much better at mass exporting.


Fair; my question was mostly in the context of ANN, since that was the discussion point - I have to assume ES (as a search engine) would not necessarily be the right tool for data warehousing types of workloads.


Hey, you have some silly thing rendered at your product's landing page chewing CPU.


Its not mine, just learned about it is all


It's still in technical preview.


It's not exactly simple as it involves some postgres specific knowledge, but you can make it reliable when working with transaction ids (see https://event-driven.io/en/ordering_in_postgres_outbox/).


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: