CRDTs (and I guess MRDTs are no different in this respect) are often touted as an enabling technology for collaborative local-first applications (see e.g. recent discussion: https://news.ycombinator.com/item?id=21581444) but I fail to see how they are useful beyond the simplest of cases: the emphasis is on automatic convergence, but it seems almost impossible to preserve non-trivial application-specific invariants without manual merging (I am thinking of such applications as collaborative source code editing or vector graphics editing). Am I missing something?
The point is often you don't need to. If you use Google docs, the chances are you will be editing it disjointly most of the time and occasionally you'll edit the same bit of the document. Then you don't really care if what comes out is a meaningful sentence, but that all parties involved (regardless in which order they see the operations) see the same version of the document so that they can fix it. That's what CRDTs buy you.
Without them, you have to use either explicit synchronisation which is a bad idea if you want local-first software (doesn't work offline, needs to be done even when editing disjoint bits of the file, etc.) or something like CRDTs. Now there is a great paper by Victor Gomes, Martin Kleppmann, and Dominic Mulligan that proves (in a proof assistant!) that pretty much all other alternatives that do not use strong consistency is flawed in edge cases.
As for triviality of the operations, I know for a fact that Martin Kleppmann has a paper in the works that show (and prove) how to do tree move efficiently as a CRDT. That means you can basically do arbitrary edits on JSON. Hopefully, that should clear up any confusion about for which applications it might be useful.
> The point is often you don't need to. If you use Google docs, the chances are you will be editing it disjointly most of the time and occasionally you'll edit the same bit of the document. Then you don't really care if what comes out is a meaningful sentence, but that all parties involved (regardless in which order they see the operations) see the same version of the document so that they can fix it. That's what CRDTs buy you.
Ok, so the hope is that semantic conflicts are rare and if they do happen, users can easily see them and promptly correct them. I guess this works well if the invariants are local and synchronization is fairly frequent (i.e. on the scale of seconds, not weeks).
> That means you can basically do arbitrary edits on JSON. Hopefully, that should clear up any confusion about for which applications it might be useful.
Not quite. Of course you can model everything in JSON, but the question remains as to how much additional implicit structure is needed atop the explicit tree-like structure that JSON provides and how easily this implicit structure is disturbed by merges.
I think the subtlety with CRDTs is that how you choose to represent your data model via CRDT data structures encodes the semantics of how conflicts are resolved. As a simple example there are multiple versions of Set (remove once, add only, remove observed, etc.) that have different semantics.
As a more concrete example, Apple Notes edits tables and resolves conflicts via CRDTs. Now for editing tables in a word processor, you can have multiple users moving, adding, and deleting columns, in addition to editing the text within a cell. (One user adds a row, another adds a column. One user deletes a column and another moves it. etc.)
If you just naively represent this as a list of lists, the semantics expected by the user breaks. (One user adds a row, another adds a column, and you have a short row with cells in the wrong place.)
Apple ended up representing the columns as an ordered set of ids (piecing this together from other CRDTs) the rows as a second ordered set of ids, and then the text blobs as a map of column id, to a map of row id to a text crdt object. With the additional assumption that any "missing" text is in the blank, initial state. (I might have this transposed, it's from memory, but that's the basic idea.)
It works, and works well, but a lot of thought and effort was put into modeling the domain as CRDTs to capture the required semantics.
(Here, for ordered set, you can use a list CRDT, drop duplicates, and tack on an add-only set of tombstones to keep move/delete conflicts from resurrecting rows/columns.)
"often" isn't that useful to me as an application developer. If I'm going to build local-first software and put it out on hundreds of thousands of desktops, once it's deployed, the database is outside my control.
Yes, I need convergance, but I also need to ensure application invariants are maintained.
I agree with parent that a CRDT isn't that useful for local-first software unless you can encode arbitrarily complex programmatic logic in the transitions. That way I as a developer have control over how merges happen globally and all invariants. It's too complex to reason about otherwise.
It's not clear to me if this paper gives me what I'm looking for -- I'm looking forward to digging into it.
There are other approaches to CRDTs that I'm more excited about: Statebox (https://github.com/mochi/statebox) from years ago is a CRDT that encodes arbitrary transitions and converges by rewinding and replaying in a consistent order. This is basically the same trick blockchains typically do to merge forks. Radicle (https://radicle.xyz/) is another related approach.
I think things like this are the better path forward for local-first software, personally.