
Regarding eventual consistency, a CQRS/ES system can also be synchronous, or partially so. You could have listeners for events that need to supply a strongly consistent model, and other events that feed parts of the system that don't need strong consistency.

"However the events in a event store are immutable and can’t be deleted, to undo an action means sending the command with the opposite action"

Well they don't have to be immutable. I don't see why you can't update/migrate events.
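The synchronous/asynchronous listener split can be sketched in a few lines. This is a toy illustration (the `EventBus` class, listener lists, and event shapes are all invented for the example): listeners that feed the strongly consistent model run in-line with the command, while everything else drains from a queue later.

```python
from collections import deque

class EventBus:
    """Toy bus: some listeners run synchronously, the rest are queued."""
    def __init__(self):
        self.sync_listeners = []    # feed the strongly consistent model
        self.async_listeners = []   # feed eventually consistent parts
        self.queue = deque()

    def publish(self, event):
        for listener in self.sync_listeners:
            listener(event)         # runs before the command returns
        self.queue.append(event)    # async side sees it on the next drain

    def drain(self):
        while self.queue:
            event = self.queue.popleft()
            for listener in self.async_listeners:
                listener(event)

bus = EventBus()
balances, audit_log = {}, []

def apply_deposit(e):
    balances[e["acct"]] = balances.get(e["acct"], 0) + e["amount"]

bus.sync_listeners.append(apply_deposit)      # must be strongly consistent
bus.async_listeners.append(audit_log.append)  # auditing can lag

bus.publish({"acct": "a1", "amount": 50})
# balances["a1"] is 50 immediately; audit_log fills in when drain() runs
bus.drain()
```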




The author has explained the problem with eventual consistency and the CAP theorem, but he is trying to blame the problem at least partly on CQRS/ES - the problem will exist no matter what pattern you use if your system is distributed.

CQRS/ES is a pattern and doesn't need to be shoe-horned into every solution.

I work at Transport for London and we have a CQRS/ES system for managing data-driven design data. It is synchronous and is incredibly useful for ensuring it has both strong business logic and a fast read side, while providing auditing for free.

We still have problems with EC and CAP, and they are not due to CQRS/ES. Make the system distributed and you will have to handle all the new scenarios. Those are the trade-offs.


> can also be synchronous

Then you're giving up several of the benefits of CQRS, and might as well just not bother with the additional complexity.

> Well they don't have to be immutable. I don't see why you can't update/migrate events.

The events are your source of truth. When you "migrate" your source of truth, you're in somewhat dangerous waters (and if you're using CQRS for scale, a 1% failure to migrate data might be millions of records). You also now have the issue that your source of truth is being migrated and probably can't accept writes (or you need to emit writes in both the new format and the old format until your changeover happens).


>> can also be synchronous

> Then you're giving up several of the benefits of CQRS, and might as well just not bother with the additional complexity.

Even without asynchronous read/write, there are still benefits worth the (arguably, small) additional complexity. For instance, the ability to add new functionality without having to migrate existing data is amazing.


You still have migrations, but they exist on the read store, which means they can be done semi-transparently to clients (pause writes and let them queue up, migrate existing read store to new instance, point read calls and indexers to new store, resume writes).
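That pause/migrate/repoint/resume dance can be sketched as a single-threaded toy (the `ReadStoreMigrator` class and the projection functions are invented for illustration; a real system would do this across processes):

```python
from collections import deque

class ReadStoreMigrator:
    """Toy model of migrating a read store by replaying the event log."""
    def __init__(self, events, project):
        self.events = list(events)
        self.project = project
        self.paused = False
        self.pending = deque()
        self.read_store = {}
        for e in self.events:
            project(self.read_store, e)

    def write(self, event):
        if self.paused:
            self.pending.append(event)     # writes queue up during migration
        else:
            self.events.append(event)
            self.project(self.read_store, event)

    def migrate(self, new_project):
        self.paused = True                 # 1. pause writes, let them queue
        new_store = {}
        for e in self.events:              # 2. rebuild read store from events
            new_project(new_store, e)
        self.read_store = new_store        # 3. point reads at the new store
        self.project = new_project
        self.paused = False                # 4. resume writes
        while self.pending:
            self.write(self.pending.popleft())

def count_events(store, e):
    store["count"] = store.get("count", 0) + 1

def sum_values(store, e):
    store["sum"] = store.get("sum", 0) + e["v"]

m = ReadStoreMigrator([{"v": 1}, {"v": 2}], count_events)
m.write({"v": 3})            # read_store is {"count": 3}
m.migrate(sum_values)        # read_store rebuilt as {"sum": 6}
```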

CQRS adds a lot of complexity. Sometimes it's absolutely worth it, especially if you've already invested in the expertise and tooling to support it. It drastically changes the scaling math, both on the low end (at the very least you need a write store, a read store, a queue, an indexer and an api) and on the high end (you can scale any part of the system as needed in relative isolation). You add in timing issues, rollbacks, asynchronous error handling, and delays between reads/writes.


I don't think you necessarily have to have all this in a useful CQRS system.

E.g., here's a real-world example of a general pattern in which CQRS is pretty simple and useful:

A walking/running app which tracks distance, time, and other relevant information over the course of a workout.

It collects a series of events like "location changed at time X", "user started workout", "user paused workout" etc., over the course of the workout period (actually starting before the user officially starts the workout), and converts these to time, distance and other stats.

This fits CQRS really well since the input is inherently a series of events and the output is information gleaned from processing those events.

You get a full CQRS system by simply fully logging the events.

The advantage is that you can go back and reprocess the sequence if you want to glean new/different information from the sequence. E.g., in the walking/running app you could, after the fact and at the user's discretion, detect and fix the case where the user forgets to start or stop a workout. Or recalc distance in the case where you detect a bug or misapplication of your smoothing algorithm, etc. Or draw a pace graph or whatever.

In all these cases you can process the events synchronously.
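As a concrete sketch of that synchronous processing (event names and shapes are invented; real GPS smoothing is omitted), folding the event sequence into stats is just a loop:

```python
from math import hypot

def project_stats(events):
    """Fold raw workout events into distance and active-time stats."""
    distance, active, last_loc, last_t, running = 0.0, 0.0, None, None, False
    for e in events:
        if e["type"] == "start":
            running, last_t = True, e["t"]
        elif e["type"] == "pause" and running:
            active += e["t"] - last_t
            running = False
        elif e["type"] == "location" and running:
            if last_loc is not None:
                distance += hypot(e["pos"][0] - last_loc[0],
                                  e["pos"][1] - last_loc[1])
            last_loc = e["pos"]
    return {"distance": distance, "active_seconds": active}

events = [
    {"type": "location", "t": -5, "pos": (9, 9)},  # before the official start
    {"type": "start", "t": 0},
    {"type": "location", "t": 10, "pos": (0, 0)},
    {"type": "location", "t": 20, "pos": (3, 4)},
    {"type": "pause", "t": 30},
]
stats = project_stats(events)
```

Reprocessing the same logged events with a different fold function is exactly what buys the after-the-fact fixes described above.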

I put this all in terms of a workout app, but there is a general pattern of an event-driven, session-based activity or process, where you may want/need to derive new information from the events, and the cost is to log the events (in a high-fidelity form so they can be fully replayed).

Whether or not you need to use queues and distributed data stores is an independent decision.


That's not "Command Query Responsibility Segregation" (CQRS). That's modeling your data as a time series - which is a totally valid and perfectly useful model in many cases, but has nothing to do with the architectural pattern known as CQRS.

Martin Fowler gives the following simple definition of CQRS:

> At its heart is the notion that you can use a different model to update information than the model you use to read information.

CQRS takes data (events, time series, plain old records, whatever), stores it into a write store which acts as a singular source of truth, and queues that data to be stored in a (usually eventually consistent) read store as a projection - not simply duplicating the write store, but building a read record meant to be consumed directly by some client - with potentially several projections, one for each set of client needs.


But what is the distinction between what Fowler describes and what I describe? There's not really a contradiction with what you describe either.

I have a distinct write store and read store, with very different models. As you say, the write store is the source of truth. Since the read store is updated synchronously with the write store there's no need for a queue between them. Indeed there are also multiple projections for different client needs (e.g. the pace chart, vs. workout progress), though in this case I don't generally need them until after the sequence of events is complete, which simplifies things.

Maybe we're just having a pointless debate on semantics, in which case, never mind.

It's just that I see this as a quite valuable pattern without necessarily bringing distributed stores into it. Indeed, part of its value is that you can start simple and later extend it to a scaling distributed system without disrupting the whole pattern.


You're right. Nothing about CQRS demands any kind of asynchrony or distributed system. It is quite simply a system where you have different models for updating and querying the system. You don't even need an event store for it to be a CQRS system, but it makes so much sense that I have a hard time imagining using CQRS without an ES of some kind.


No need to pause writes or migrate stores. You can have the old and new versions of the read model co-exist and read from the same event stream without disabling writes. Once the new version is deployed and running, you shut down the old version (isn't this already your deployment model?)
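A sketch of what that looks like (event and field names invented): both projection versions fold the same immutable stream, so they can serve reads side by side until the old one is retired.

```python
events = [{"type": "order_placed", "amount": 10},
          {"type": "order_placed", "amount": 25}]

def project_v1(evts):
    """Old read model: just an order count."""
    return {"orders": sum(1 for e in evts if e["type"] == "order_placed")}

def project_v2(evts):
    """New read model: count plus revenue -- built from the same stream."""
    placed = [e for e in evts if e["type"] == "order_placed"]
    return {"orders": len(placed), "revenue": sum(e["amount"] for e in placed)}

# Both versions read the same immutable event stream, so no write pause
# is needed; once v2 is caught up and deployed, v1 is simply shut down.
old_view, new_view = project_v1(events), project_v2(events)
```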

It does change the scaling math, but don't automatically assume that every ES+CQRS system is intended for thousand-writes-per-second, terabytes-per-day kind of scales. If the system stays at ten-writes-per-second, megabytes-per-day, a read store (beyond in-memory projections), a queue or an indexer are not necessary.

As for asynchronous error handling and delays between reads/writes: why would that be a consequence of ES+CQRS? It would be a consequence of implementing ES+CQRS with asynchronous or eventually consistent behaviour...


I also share the same opinion based on my experience. Events can be modified and deleted, but that must be an exceptional situation (GDPR and other compliance requirements, etc.). And even if it's exceptional, you have to provide a clear and easy way to do so, and that increases the complexity of the solution by a lot.

Another thing is strongly consistent models: there may be valid requirements in some problem areas to have a strongly consistent, normalized model and use it for command validation. This helps especially well when all requirements are not known up front and/or the business domain changes very frequently. A small change in the business may require completely redoing the aggregate roots and logic if you follow the standard approach, which is very expensive. A better decision could be to use a normalized SQL database instead of an aggregate root. Such an approach may be more flexible in certain cases and has its own benefits as well as costs and drawbacks.
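A sketch of that approach, with `sqlite3` standing in for the normalized SQL database (the schema and command names are invented): the command handler validates against the strongly consistent model, applies the change, and returns the event to append to the store.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO accounts VALUES ('a1', 100)")

def handle_withdraw(conn, acct, amount):
    """Validate the command against the normalized model, then apply it."""
    (balance,) = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (acct,)).fetchone()
    if amount > balance:
        raise ValueError("insufficient funds")     # command rejected
    conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                 (amount, acct))
    # the resulting event would be appended to the event store
    return {"type": "withdrawn", "acct": acct, "amount": amount}

event = handle_withdraw(db, "a1", 30)
```

The point is that a schema change here is an `ALTER TABLE` plus a tweaked query, not a rework of aggregate roots.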


Encryption + lose the key policies seem to satisfy the GDPR. So you don't have to actually delete an event, you can just give up your ability to read its payload.
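A toy illustration of the shape of that technique (sometimes called crypto-shredding). The XOR keystream below is NOT real encryption - use a vetted AEAD cipher such as AES-GCM in practice; this only shows how deleting the per-user key makes the still-immutable payload unrecoverable.

```python
import hashlib
import os

def keystream(key, n):
    """Deterministic byte stream derived from the key (toy construction)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_crypt(data, key):
    """XOR with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

keys = {"user42": os.urandom(32)}                  # per-user key store
payload = b'{"email": "x@y.z"}'
stored = xor_crypt(payload, keys["user42"])        # event payload at rest

readable = xor_crypt(stored, keys["user42"])       # fine while the key exists
del keys["user42"]                                 # GDPR erasure: shred the key
# 'stored' remains in the immutable log, but nothing can decode it now
```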


If the decision to abandon strong consistency involved careful analysis of the performance/maintenance trade-offs, then by definition the lack of consistency is less expensive than keeping a consistent but low-performance model, and you're just paying the price of having to solve a Hard Problem.

But if strong consistency was abandoned because someone wrote general statements in favor of eventual consistency...





