The linked projects (https://vlcn.io and https://github.com/vlcn-io/cr-sqlite) are both extremely impressive, and open up the possibility of adding "multiplayer" to apps that rely on SQLite for local storage—including web apps, via a WASM build of SQLite.
Yep! That was one plan for deploying cr-sqlite to "the cloud."
D1, unfortunately, doesn't support loading extensions into SQLite.
fly.io, and their sqlite solution, does however!
I've been using fly.io to run cr-sqlite in the cloud and act as a sync server for non-peer-to-peer use cases. Eventually I'll have finished up a rewrite of strut.io[1] atop vlcn/cr-sqlite, which will serve as a good end-to-end example of how to make applications with this sort of architecture.
I believe that the entire point of using CRDTs in your database is eventual consistency. (At least, to some degree. The definition/requirements of "consistent" may change from app to app.)
Could you elaborate? What problem would using CRDTs solve in the context of a web-based app, especially one built with WASM? I've heard of offline-first web apps but am not aware of many use cases for CRDTs there.
It's basically an offline-first flashcard webapp. CR-Sqlite allows for incremental syncing.
With Anki (the app from which I'm taking my inspiration), syncing is _not_ incremental - basically it just copies SQLite files around. So for example, the app on an iPhone could have a card `A` reviewed, but the app on an iPad could make changes to the template on which card `A` is based, and that's enough to cause a conflict - you must take changes from only the iPad or only the iPhone. (To be clear - Anki does have some incremental syncing capabilities - I'm picking an intentionally pathological example.) CR-SQLite will mean that everything is incremental, however.
Basically makes 3 way merges a breeze (or n-way merges, really).
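To make the n-way merge idea concrete, here is a minimal sketch of a per-cell last-writer-wins merge for the iPhone/iPad example. The data layout (`(row, column)` keys mapped to `(version, site_id, value)` tuples), the function names, and the sample values are all assumptions made for illustration; this is not cr-sqlite's actual schema or algorithm.

```python
from functools import reduce

# Hypothetical replica state: {(row_id, column): (version, site_id, value)}.
# This layout is an assumption for illustration, not cr-sqlite's schema.

def merge(a, b):
    """Merge two replica states cell by cell, keeping the entry with the
    higher (version, site_id) pair. Ties on version are broken by site_id."""
    merged = dict(a)
    for cell, entry in b.items():
        if cell not in merged or entry[:2] > merged[cell][:2]:
            merged[cell] = entry
    return merged

def merge_all(*replicas):
    """n-way merge is just a fold of the pairwise merge."""
    return reduce(merge, replicas, {})

base = {
    ("card_A", "template"): (1, "server", "basic"),
    ("card_A", "last_review"): (1, "server", None),
}

# The iPhone reviews the card; the iPad edits the template.
iphone = dict(base)
iphone[("card_A", "last_review")] = (2, "iphone", "2023-01-05")
ipad = dict(base)
ipad[("card_A", "template")] = (2, "ipad", "cloze")

merged = merge_all(base, iphone, ipad)
```

Because each cell is resolved independently, the review and the template edit both survive, and the result does not depend on the order in which the replicas are merged.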
Very cool. CRDTs seem to be in the air. I was searching for something unrelated and bumped into https://mycelial.com/ which is also doing CRDTs with sqlite. Very helpful intro to CRDTs video on their youtube channel.
Interesting that it uses sqlite as a file format as an example for rich documents. Probably would work reasonably well, but hadn't thought to do that until now. Have tables for pages and items on the page? Would that make document search more robust?
Wonder if/how it would combine with the distributed litestream stuff recently touted on fly.io? I love postgres but have to admit these features are tempting.
It's kind of a mindset shift, but with logical clocks you kind of have to let go of the concept of ordering with one global universal time - it's just not feasible to do accurately with clock skew.
They're purely used to provide a partial order of events - partial because the system cannot say for sure whether some events happened before or after each other.
Even with terrestrial applications that have very low clock skew, it can be very helpful to have a notion of which events could have been affected by previous events and which couldn't have.
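One way to see the partial order in code is with vector clocks, a close relative of the Lamport clocks discussed here (named plainly as a substitute, since a single Lamport counter cannot detect concurrency on its own). The sketch below assumes a clock is a plain dict from node name to counter; the function and node names are hypothetical.

```python
def compare(a, b):
    """Compare two vector clocks (dicts of node -> counter).

    Returns "before" if a happened before b, "after" if b happened
    before a, "equal" if they are identical, and "concurrent" if
    neither could have influenced the other.
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"

e1 = {"x": 1}            # first event on node x
e2 = {"x": 2, "y": 1}    # node y's event after it has seen x's state
e3 = {"y": 1}            # an event on node y that never saw x

r12 = compare(e1, e2)    # e1 could have affected e2
r13 = compare(e1, e3)    # no way to order these two
```

The "concurrent" result is the "cannot say for sure" case: the system only orders events that are causally related.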
But you're bang on the money with the physics analogy. Lamport himself says that his knowledge of special relativity helped him come up with the concept.
What about: you place a clock somewhere in the universe that emits its value with light, then you define time for a given event as this value? This seems to be well defined, and to have the property that if t(B) >= t(A) then B can't happen before A, and conversely, if AB is timelike, then t(B) > t(A).
This seems to be such a nice construction, but I don't know its name.
If it is emitting its value by light, as you move away or closer to it, its value will shift up/down (red/blue shift), and its rate of 'ticking' will change up/down. You can't prove that something far away happened in the past or the future using your construction.
Its value won't shift down. Yes, you don't measure it with a wristwatch, but you measure it with a device that receives the value on the clock and displays it.
Just like any other physical measures. You don't measure nearly any of them with a wristwatch (only proper time/aging) but with a device that is constructed according to the definition of that physical measure.
If you are sending pulses, the pulses will get further apart as you accelerate away and closer together as you slow down. This will cause your clocks to desynchronize and you'll find that a different amount of time has passed on the flashing clock than on your own clock. This is called relativity. You can say that your flashing clock is The One True Clock and if everything in the universe agrees with you, then I suppose it's true. But one person's clock won't be flashing at the same speed as another's. This includes Earth, where one season we will be moving away (slower) and another we will be getting closer (speeding up). That'd be pretty annoying.
Think of it this way: If we say that we perceive the world in 3D space and 1D time (x,y,z,t), the t dimension is special: You can't change x,y,z if you don't advance t. Inversely, if x,y,z is constant, t doesn't need to advance. That's why it's enough for the logical clocks to count the number of committed transactions.
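A minimal Lamport-style clock shows the "time only advances when something changes" idea. This is a sketch under the assumption that each committed transaction counts as one event; the class and method names are made up for illustration.

```python
class LamportClock:
    """Logical clock that advances only when an event occurs."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Record a local event, e.g. a committed transaction."""
        self.time += 1
        return self.time

    def receive(self, remote_time):
        """Record receipt of a message stamped with the sender's time.

        Jumping past the sender's timestamp keeps the rule
        'if A caused B, then t(A) < t(B)'.
        """
        self.time = max(self.time, remote_time) + 1
        return self.time


a = LamportClock()
b = LamportClock()

t1 = a.tick()          # a commits a transaction
t2 = a.tick()          # a commits another
t3 = b.receive(t2)     # b applies a's changes
idle = b.time          # nothing happened on b since, so no time has passed
```

If no state changes, the counter stays put, which is the logical-clock version of "when time stops, everything else stops".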
This said, in a truly decentralized setting, where CRDTs make most sense, logical clocks are not useful as they are not Byzantine fault tolerant. You need Merkle trees.
Imagine there is an object recorded at location x,y,z at time t. This is all we know.
If I later tell you the object was recorded at location x,y,z, you have no idea what time it was captured.
If I later tell you the object was recorded at any location other than x,y,z (even if only z changed by some epsilon) you know that it was captured at some time other than t.
If you guarantee the second recording I'm showing you was not taken before time t, you know it is a more recent recording, because it would be not earlier than and not equal to time t.
I'll try to explain with an example: If you have an object at position (x1,y1,z1), at some time t1, and you want to move this object to a different position so that at time t2, it's at position (x2,y2,z2), you must have t2>t1. If t2==t1 then (x1,y1,z1)==(x2,y2,z2).
So when time stops, everything else stops. It is tautological when you think about it, but I find that it's an easy way to explain why logical clocks work.
In a client-server sync setup you can drop a client's deletes (for certain crdts like observed-remove sets) once they send their changes to the server. The server, of course, needs to retain them.
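Here is a rough sketch of that idea with an observed-remove set. Everything in it (the `ORSet` class, the `compact` method, the device names) is hypothetical, and `compact` is only safe under the stated assumption that the server already holds the tombstones and never echoes cancelled adds back to this client.

```python
import uuid


class ORSet:
    """Observed-remove set: each add gets a unique tag, and a remove
    tombstones the tags it has observed."""

    def __init__(self):
        self.adds = set()      # (element, tag) pairs
        self.removed = set()   # tombstones: tags of removed adds

    def add(self, element):
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Tombstone only the adds this replica has actually seen.
        self.removed |= {tag for (e, tag) in self.adds if e == element}

    def merge(self, other):
        self.adds |= other.adds
        self.removed |= other.removed

    def __contains__(self, element):
        return any(e == element and tag not in self.removed
                   for (e, tag) in self.adds)

    def compact(self):
        """Client-side only: drop tombstones and the adds they cancel.
        Assumes the server has already received them."""
        self.adds = {(e, t) for (e, t) in self.adds if t not in self.removed}
        self.removed = set()


server, phone, tablet = ORSet(), ORSet(), ORSet()

phone.add("card")
server.merge(phone)
tablet.merge(server)      # everyone now has the card

phone.remove("card")
server.merge(phone)       # the server learns about the delete
phone.compact()           # the phone can now forget the tombstone

server.merge(tablet)      # a stale tablet re-sends the old add
```

The server's retained tombstone is what stops the stale tablet from resurrecting the card, which is why only the client gets to drop its deletes.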
At my current window size the table of contents on the side overlaps with the text. Otherwise really nice and readable, I agree. I especially like the font.
wonder if there is an estimate to quantify what “eventual” means as a function of the variables of a peer ecosystem … e.g., the number of nodes, the frequency of updates, etc. …
It'd depend on the network topology and rate of syncing.
For a simple case of two nodes, they'll be divergent until each node sends the other their current state.
For client-server, the "whole system" wouldn't be in the same state until all clients report all their changes to the server and the server re-broadcasts all those changes to all clients.
A pathological case (but it happens when people leave an iPad in a drawer or something) -- a node could be offline for 6 months and the rest of the system never gets those changes until that node comes back online. There's technically some divergence: the other nodes are missing whatever state the iPad had, but does it really matter if that node was unimportant enough to be left offline?
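A toy simulation of the client-server case, using grow-only sets as the simplest possible CRDT state. The device names, the `sync` and `converged` helpers, and the round-based schedule are assumptions made for illustration only.

```python
def sync(client, server):
    """One round trip: the client uploads its state, then downloads
    everything the server currently knows."""
    server.update(client)
    client.update(server)


server = set()
clients = {"laptop": {"a"}, "phone": {"b"}, "ipad": {"c"}}
online = ["laptop", "phone"]          # the iPad is in a drawer


def converged(names):
    return all(clients[n] == server for n in names)


# Round 1: the laptop syncs before the phone has uploaded "b",
# so it is still behind when the round ends.
for name in online:
    sync(clients[name], server)
after_one = converged(online)

# Round 2: the server re-broadcasts, and the online nodes agree.
for name in online:
    sync(clients[name], server)
after_two = converged(online)

ipad_missing = "c" not in server      # the drawer iPad's state is unknown

# Months later the iPad comes back, then everyone syncs once more.
sync(clients["ipad"], server)
for name in online:
    sync(clients[name], server)
all_converged = converged(clients)
```

In this setup "eventual" works out to roughly one upload plus one re-broadcast per change, bounded below by however long the slowest node stays offline.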