Looks very promising! We've looked at Cockroach for a particular project, and we've been concerned that performance wasn't good enough.
Cockroach performance seems to scale linearly, but single-connection performance, especially for small transactions, seems rather dismal. Some casual stress testing against a 3-node cluster on Kubernetes showed that small transactions modifying a single row could take as much as 7-8 seconds, where Postgres would take just a few milliseconds.
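For the curious, the test was roughly of this shape (a hedged sketch, not our exact harness; it assumes a hypothetical `kv(k INT PRIMARY KEY, v INT)` table, and psycopg2 works against both Postgres and CockroachDB since they speak the same wire protocol):

```python
# Rough single-connection latency test against a hypothetical kv table.
# Connection parameters are placeholders; port 26257 is CockroachDB's default.
import time
import psycopg2

conn = psycopg2.connect(host="localhost", port=26257, dbname="test", user="root")
conn.autocommit = True          # one implicit transaction per statement

latencies = []
for i in range(100):
    t0 = time.monotonic()
    with conn.cursor() as cur:
        cur.execute("UPDATE kv SET v = %s WHERE k = 1", (i,))
    latencies.append(time.monotonic() - t0)

latencies.sort()
print(f"p50={latencies[50]*1000:.1f}ms  max={latencies[-1]*1000:.1f}ms")
```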
The documentation recommends that you batch as many updates as possible, but obviously that doesn't work for low-latency applications like web frontends that need to be able to do small, fine-grained modifications.
7-8 seconds? Something definitely sounds misconfigured. I've been running a 1.1.x cluster for quite a while and I've never seen a single-row transaction take that long. And even the slowest queries took at most ~500ms, and that was with:
- Replication factor increased to 5x (rather than the 3x default)
- 8 indexes on the table being modified which also needed to be updated
- Nodes spread across North America, incurring higher RTT latency between nodes
- Relatively high contention on the data triggering client-side retries
- HDDs as the storage medium (RocksDB is optimized for SSDs)
7-8 seconds seems extremely long. Human beings performing the Raft consensus algorithm using paper and pencil over Skype wouldn't be much slower than that. Are you sure everything was working correctly?
" small transactions modifying a single row could take as much as 7-8 seconds"
That's surprising. I wasn't expecting CockroachDB to be really fast, given the constraints they work within. But that sounds more like a bug or config error. Unless perhaps you mean a really high number of processes trying to update the same row at the same time? Like a global counter or something?
Indeed, the stress test updates just one row, which mirrors certain write patterns in our application. I just started this testing, so we'll see what happens when I extend it to more than one row.
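For context, the next step will look something like this sketch (same hypothetical `kv` table and placeholder connection parameters as above), comparing all writers hammering one hot row against each writer getting its own row:

```python
# Hedged sketch: contended (hot row) vs. uncontended (spread keys) updates.
# Each thread gets its own connection since psycopg2 connections aren't
# safe to share across threads.
import threading
import time
import psycopg2

def worker(key, n=50):
    conn = psycopg2.connect(host="localhost", port=26257, dbname="test", user="root")
    conn.autocommit = True
    with conn.cursor() as cur:
        for i in range(n):
            cur.execute("UPDATE kv SET v = %s WHERE k = %s", (i, key))
    conn.close()

def run(keys):
    threads = [threading.Thread(target=worker, args=(k,)) for k in keys]
    t0 = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - t0

print("hot row:", run([1] * 8))        # all writers contend on one row
print("spread: ", run(range(1, 9)))    # one row per writer
```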
[cockroach employee here] Have you hit us up on Gitter or Stack Overflow to help debug and tune? We'd also love to learn more about how you're using K8s, what your setup looks like, surprises you're running into with it, etc.
We have a collaborative, Google Docs-like application that currently issues a write every time someone types into a text field. Now, clearly that's suboptimal and something that should be optimized to batch the updates, but on the other hand, with Postgres we've had zero incentive to make such an optimization, because it can handle thousands of writes per second on a single node in real time with no queuing on the client. I don't expect this from Cockroach, but I would definitely want low latency.
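To illustrate the kind of batching we keep deferring, here's a minimal client-side sketch (names like `EditBatcher` and `flush_fn` are made up for illustration): coalesce keystrokes for a short interval and issue one write per flush instead of one per keypress.

```python
# Hypothetical sketch: coalesce per-keystroke edits and flush them as one
# batched write every `interval` seconds instead of one write per keypress.
import threading

class EditBatcher:
    def __init__(self, flush_fn, interval=0.25):
        self.flush_fn = flush_fn    # e.g. writes one row with the merged patch
        self.interval = interval
        self.pending = []
        self.lock = threading.Lock()
        self.timer = None

    def add(self, patch):
        with self.lock:
            self.pending.append(patch)
            if self.timer is None:   # arm one flush for this batch window
                self.timer = threading.Timer(self.interval, self._flush)
                self.timer.start()

    def _flush(self):
        with self.lock:
            batch, self.pending, self.timer = self.pending, [], None
        self.flush_fn(batch)         # single round trip for many keystrokes
```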
Lordy, relational databases are not the way to go for that problem... With a single shared resource (document), you're going to be encountering write conflicts left and right.
Have you explored implementing a CRDT-based solution like WOOT instead?
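To make that concrete, here's a toy sketch of the WOOT-style data model (hypothetical and heavily simplified; real WOOT also has to integrate remote ops by placing them between their prev/next IDs and deterministically ordering concurrent inserts, which is omitted here):

```python
# Toy sketch of the WOOT idea: every character gets a globally unique ID,
# and deletes are tombstones rather than physical removals.
from dataclasses import dataclass

@dataclass
class WChar:
    wid: tuple          # (site_id, clock): globally unique character ID
    ch: str
    visible: bool = True

class ToyWootDoc:
    def __init__(self, site_id: int):
        self.site_id = site_id
        self.clock = 0
        self.chars: list[WChar] = []    # kept in document order

    def local_insert(self, pos: int, ch: str):
        self.clock += 1
        wc = WChar((self.site_id, self.clock), ch)
        self.chars.insert(pos, wc)
        # prev/next IDs travel with the op so other sites can place it
        prev = self.chars[pos - 1].wid if pos > 0 else None
        nxt = self.chars[pos + 1].wid if pos + 1 < len(self.chars) else None
        return ("insert", wc, prev, nxt)

    def local_delete(self, pos: int):
        self.chars[pos].visible = False  # tombstone; never removed
        return ("delete", self.chars[pos].wid)

    def text(self) -> str:
        return "".join(c.ch for c in self.chars if c.visible)

doc = ToyWootDoc(site_id=1)
doc.local_insert(0, "H")
doc.local_insert(1, "i")
assert doc.text() == "Hi"
```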
Definitely. The application is conceptually a transaction log of field/subfield patches, which would lend itself to something like an LSM, and we're looking at possible alternatives.
CRDTs could be a solution, but from what I gather they require too much context information to be viable for a text editing application. Our app currently uses something similar to OT.
> issues a write every time someone types into a text field
With more than a handful of people, this is getting into conflict territory pretty rapidly, especially if the document is structured as a single row (hopefully it's more granular than that). Time for some back-of-the-envelope maths:
Assuming that an average person types at around 40 words per minute, i.e. roughly 200 characters per minute (ballpark based on https://www.livechatinc.com/typing-speed-test/#/), that's a character every 300ms on average. With 10 people editing the document, that's a character every 30ms on average, which can easily lead to conflicts if they're all trying to update the same resource.
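Spelling the arithmetic out (assuming ~5 characters per word):

```python
wpm = 40                              # assumed average typing speed per person
cpm = wpm * 5                         # ~200 characters per minute
ms_per_char = 60_000 / cpm            # ~300 ms between keystrokes per person
print(ms_per_char, ms_per_char / 10)  # 300.0 30.0 -> 10 editors: ~30 ms/write
```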
Perhaps it's event sourcing based. Every time someone types into a field, it appends a row recording what was typed. Play the log back and you have the full document with no conflicts.
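A minimal sketch of that idea (hypothetical schema; in practice the log would be an append-only table keyed by sequence number):

```python
# Append one immutable event per keystroke; rebuild state by replaying.
events = []   # in practice: an append-only table (seq, field, op, payload)

def record(field, op, payload):
    events.append({"field": field, "op": op, "payload": payload})

def replay():
    doc = {}
    for e in events:
        if e["op"] == "insert":
            pos, ch = e["payload"]
            s = doc.get(e["field"], "")
            doc[e["field"]] = s[:pos] + ch + s[pos:]
    return doc

record("title", "insert", (0, "H"))
record("title", "insert", (1, "i"))
assert replay() == {"title": "Hi"}
```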
With the caveat that the stored events must not conflict with each other. So a central instance needs to check each event for validity before admitting it to the log, and that check can't easily be parallelized without introducing races between event insertions.
No event sourcing architecture I know of advocates that at all.
One of the strengths of event sourcing is that you can fix things after the fact. Say a conflicting event like you describe makes it into the log: you don't check it up front before admitting it. You notice the bad event, delete it, and replay the log.
You can always replay the log to get back to the current point in time.
If you're talking about "insert char X at location Y" leading to undesirable changes: sure. But that has nothing to do with the DB type, and everything to do with how changes are resolved. CRDTs and the like can be built on top of any DB.
Updating a DB every 30ms should be trivial. Heck, you should be able to grab an exclusive row lock, double-check your state, and write your change without coming close to that budget - 100% conflict- and deadlock-free, regardless of the number of writers, simply by using a DB as it is designed to be used. In this case, by using the biggest reasonable hammer available: make everything sequential at the DB level. You can absolutely build other systems on a relational DB that don't have that limitation.
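A sketch of that lock-then-write pattern (assumes a hypothetical `docs` table; shown with psycopg2 against Postgres - note CockroachDB handles contention with transaction retries rather than blocking row locks):

```python
# Lock-then-write: SELECT ... FOR UPDATE serializes concurrent writers on
# the row, so each one reads fresh state before writing.
import psycopg2

conn = psycopg2.connect("dbname=app")
with conn:                       # commits on success, rolls back on error
    with conn.cursor() as cur:
        # Exclusive row lock: other writers queue here instead of conflicting.
        cur.execute("SELECT body FROM docs WHERE id = %s FOR UPDATE", (42,))
        (body,) = cur.fetchone()
        cur.execute("UPDATE docs SET body = %s WHERE id = %s",
                    (body + "!", 42))
```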