
The criticism I have of Kafka is the "at-least-once" semantics.

How do you manage exactly-once semantics? Since Kafka's performance comes from reading messages in small batches of, say, 50, a consumer crash mid-batch means some of them will be processed twice on restart. Depending on your business logic this may be fine, or it may create a new problem that has to be solved further down the pipeline by adding an external data store.
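To make the gap concrete, here is a minimal sketch of that batch-then-commit loop, assuming the kafka-python client, a hypothetical "orders" topic, and a placeholder process() function; nothing here is specific to the parent's setup:

    from kafka import KafkaConsumer

    def process(value: bytes) -> None:
        pass  # stand-in for the real business logic

    # Manual commits: offsets are only advanced when we say so.
    consumer = KafkaConsumer(
        "orders",                          # hypothetical topic name
        bootstrap_servers="localhost:9092",
        group_id="order-processor",
        enable_auto_commit=False,
    )

    while True:
        # Fetch a small batch -- roughly the "50 messages" case above.
        batch = consumer.poll(timeout_ms=1000, max_records=50)
        for tp, records in batch.items():
            for record in records:
                process(record.value)
        # If the process dies here, after processing but before the commit,
        # the whole batch is redelivered on restart: at-least-once.
        consumer.commit()

Committing before processing just flips the failure mode to at-most-once (dropped messages instead of duplicates); there is no ordering of these two steps that gives exactly-once on its own.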



You can't have exactly-once in a distributed system. Crashes happen, the network or power goes down, and your ack gets lost [0].

The best you can do is accept that and make your processing idempotent.
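One common way to get idempotency is to deduplicate on a stable message key before applying the side effect. A rough sketch, assuming each message carries a unique ID and using a hypothetical SQLite "processed" table as the dedup store (for this to be airtight, the business-logic write should land in the same store and transaction as the dedup row):

    import sqlite3

    # Hypothetical dedup store: one row per message ID already handled.
    db = sqlite3.connect("dedup.db")
    db.execute("CREATE TABLE IF NOT EXISTS processed (msg_id TEXT PRIMARY KEY)")

    def apply_business_logic(payload: bytes) -> None:
        pass  # stand-in for the real side effect (e.g. an upsert)

    def handle(msg_id: str, payload: bytes) -> None:
        """Process a message at most once per msg_id, even if redelivered."""
        cur = db.execute(
            "INSERT OR IGNORE INTO processed (msg_id) VALUES (?)", (msg_id,)
        )
        if cur.rowcount == 0:
            return                     # duplicate delivery: already done, skip
        apply_business_logic(payload)
        db.commit()                    # dedup row and work persist together

With this in place, the at-least-once redeliveries from the consumer loop above become harmless no-ops.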

[0] The most interesting case I've witnessed was the power going down because a hot air balloon crashed into a power line.



