Hacker News | constantinum's comments

Posthog


Thanks for sharing — that article points at the same core tension. Determinism isn’t about rejecting probabilistic systems, it’s about deciding where uncertainty is allowed to live.

What keeps breaking in practice is when probabilistic reasoning leaks into places that expect reproducibility and accountability.


MonicaHQ

Audiobookshelf

Ghost CMS

Immich for photo hosting

Mattermost

Matomo


Vimeo’s editor’s picks are my go-to place for getting and staying inspired…


Pulitzer Prize-winning non-fiction books, specifically investigative journalism, are always a great read.


What matters most is how well OCR and structured data extraction tools handle documents with high variation at production scale. In real workflows like accounting, every invoice, purchase order, or contract can look different. The extraction system must still work reliably across these variations with minimal ongoing tweaks.

Equally important is how easily you can build a human-in-the-loop review layer on top of the tool. This is needed not only to improve accuracy, but also for compliance—especially in regulated industries like insurance.

Other tools in this space:

LLMWhisperer/Unstract (AGPL)

Reducto

Extend AI

LlamaParse

Docling
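
To make the human-in-the-loop point above concrete, here is a minimal sketch of how a review layer might route extracted fields. It assumes the extraction tool reports a per-field confidence score between 0 and 1, which not every tool does; `ExtractedField`, `route_for_review`, the threshold, and the sample invoice are all hypothetical names and values chosen for illustration, not any listed tool's API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed 0.0-1.0 score from the extraction tool (hypothetical)


def route_for_review(fields, threshold=0.9, always_review=()):
    """Split fields into auto-accepted ones and ones a human must check.

    A field goes to review if its confidence is below the threshold, or if
    its name is in always_review (e.g. fields a compliance policy says a
    person must sign off on regardless of score).
    """
    accepted, review = [], []
    for field in fields:
        if field.name in always_review or field.confidence < threshold:
            review.append(field)
        else:
            accepted.append(field)
    return accepted, review


# Made-up sample data for illustration only.
invoice = [
    ExtractedField("invoice_number", "INV-1042", 0.99),
    ExtractedField("total_amount", "1,250.00", 0.72),
    ExtractedField("policy_holder", "J. Doe", 0.97),
]

accepted, review = route_for_review(
    invoice, threshold=0.9, always_review={"policy_holder"}
)
```

Here `total_amount` lands in the review queue because of its low score, and `policy_holder` lands there because the policy says so, even though the score is high. The second rule is the part that matters for compliance: it does not depend on the model being right about its own confidence.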


- Permutation City — Greg Egan

- She Said — Jodi Kantor & Megan Twohey

- Conclave — Robert Harris

- First You Write a Sentence — Joe Moran

- Eleanor Oliphant is Completely Fine — Gail Honeyman

- The Master and Margarita — Mikhail Bulgakov

More books here -> https://pastebin.com/XVeacpHM


The Selfish Gene; Thinking, Fast and Slow; Pale Blue Dot


Where data accuracy is paramount, I think a hybrid route is the safest path: non-LLM OCR for parsing the document, and LLMs for structured data extraction. I've seen better results with LLMWhisperer (OCR)[1] and the latest Gemini.

[1] - https://pg.llmwhisperer.unstract.com/
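
A rough sketch of what that hybrid route could look like, assuming the OCR step and the LLM step are each just a function you plug in. `extract_structured`, `fake_ocr`, and `fake_llm` are hypothetical stand-ins written for this example; they are not the LLMWhisperer or Gemini APIs. The one design idea it shows: because the OCR text comes from a non-LLM step, you can check every value the LLM returns against it and flag anything that does not literally appear in the text.

```python
import json
from typing import Callable


def extract_structured(
    document: bytes,
    ocr: Callable[[bytes], str],
    llm: Callable[[str], str],
    fields: list[str],
):
    """Run OCR first, then ask the LLM for JSON, then ground-check the result."""
    text = ocr(document)
    prompt = (
        "Return a JSON object with the keys: " + ", ".join(fields) + "\n\n" + text
    )
    data = json.loads(llm(prompt))

    # Grounding check: a value that is missing, or that does not appear
    # verbatim in the OCR text, is flagged rather than trusted.
    ungrounded = []
    for key in fields:
        value = data.get(key)
        if value is None or str(value) not in text:
            ungrounded.append(key)
    return data, ungrounded


# Stubs standing in for a real OCR service and a real LLM call.
def fake_ocr(document: bytes) -> str:
    return "Invoice No: INV-1042\nTotal: 1250.00"


def fake_llm(prompt: str) -> str:
    # Deliberately returns a wrong total to show the check firing.
    return '{"invoice_number": "INV-1042", "total": "1280.00"}'


data, ungrounded = extract_structured(
    b"...", fake_ocr, fake_llm, ["invoice_number", "total"]
)
```

The verbatim match is crude (it would flag a reformatted date or amount), so in practice you would likely normalize values before comparing. It still catches the case that hurts most: a plausible-looking number that was never on the page.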


Sentry's homepage has a toggle called "no-marketing mode" which, when turned on, removes all the marketing fluff.

