Looks very promising! We've looked at Cockroach for a particular project, and we've been concerned that performance wasn't good enough.
Cockroach performance seems to scale linearly, but single-connection performance, especially for small transactions, seems rather dismal. Some casual stress testing against a 3-node cluster on Kubernetes showed that small transactions modifying a single row could take as much as 7-8 seconds, where Postgres would take just a few milliseconds.
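For the curious, the test was roughly of this shape (a hedged sketch, not our exact harness; it assumes a hypothetical `kv(k INT PRIMARY KEY, v INT)` table, and psycopg2 works against both Postgres and CockroachDB since they speak the same wire protocol):

```python
# Rough single-connection latency test against a hypothetical kv table.
# Connection parameters are placeholders; port 26257 is CockroachDB's default.
import time
import psycopg2

conn = psycopg2.connect(host="localhost", port=26257, dbname="test", user="root")
conn.autocommit = True          # one implicit transaction per statement

latencies = []
for i in range(100):
    t0 = time.monotonic()
    with conn.cursor() as cur:
        cur.execute("UPDATE kv SET v = %s WHERE k = 1", (i,))
    latencies.append(time.monotonic() - t0)

latencies.sort()
print(f"p50={latencies[50]*1000:.1f}ms  max={latencies[-1]*1000:.1f}ms")
```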
The documentation recommends that you batch as many updates as possible, but obviously that doesn't work for low-latency applications like web frontends that need to be able to do small, fine-grained modifications.
7-8 seconds? Something definitely sounds misconfigured. I've been running a 1.1.x cluster for quite a while and I've never seen a single-row transaction take that long. And even the slowest queries took at most ~500ms, and that was with:
- Replication factor increased to 5x (rather than the 3x default)
- 8 indexes on the table being modified which also needed to be updated
- Nodes spread across North America, incurring higher RTT latency between nodes
- Relatively high contention on the data triggering client-side retries
- HDDs as the storage medium (RocksDB is optimized for SSDs)
7-8 seconds seems extremely long. Human beings performing the Raft consensus algorithm using paper and pencil over Skype wouldn't be much slower than that. Are you sure everything was working correctly?
" small transactions modifying a single row could take as much as 7-8 seconds"
That's surprising. I wasn't expecting CockroachDB to be really fast, given the constraints they work within. But that sounds more like a bug or config error. Unless perhaps you mean a really high number of processes trying to update the same row at the same time? Like a global counter or something?
Indeed, the stress test updates just one row, which mirrors certain write patterns in our application. I just started this testing, so we'll see what happens when I extend it to more than one row.
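For context, the next step will look something like this sketch (same hypothetical `kv` table and placeholder connection parameters as above), comparing all writers hammering one hot row against each writer getting its own row:

```python
# Hedged sketch: contended (hot row) vs. uncontended (spread keys) updates.
# Each thread gets its own connection since psycopg2 connections aren't
# safe to share across threads.
import threading
import time
import psycopg2

def worker(key, n=50):
    conn = psycopg2.connect(host="localhost", port=26257, dbname="test", user="root")
    conn.autocommit = True
    with conn.cursor() as cur:
        for i in range(n):
            cur.execute("UPDATE kv SET v = %s WHERE k = %s", (i, key))
    conn.close()

def run(keys):
    threads = [threading.Thread(target=worker, args=(k,)) for k in keys]
    t0 = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - t0

print("hot row:", run([1] * 8))        # all writers contend on one row
print("spread: ", run(range(1, 9)))    # one row per writer
```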
[cockroach employee here] Have you hit us up on Gitter or Stack Overflow to help debug and tune? We'd also love to learn more about how you're using K8s, what your setup looks like, surprises you're running into with it, etc.
We have a collaborative, Google Docs-like application that currently issues a write every time someone types into a text field. Now, clearly that's suboptimal and something that should be optimized to batch the updates, but on the other hand, with Postgres we've had zero incentive to make such an optimization, because it can handle thousands of writes per second on a single node in real time with no queuing on the client. I don't expect this from Cockroach, but I would definitely want low latency.
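To illustrate the kind of batching we keep deferring, here's a minimal client-side sketch (names like `EditBatcher` and `flush_fn` are made up for illustration): coalesce keystrokes for a short interval and issue one write per flush instead of one per keypress.

```python
# Hypothetical sketch: coalesce per-keystroke edits and flush them as one
# batched write every `interval` seconds instead of one write per keypress.
import threading

class EditBatcher:
    def __init__(self, flush_fn, interval=0.25):
        self.flush_fn = flush_fn    # e.g. writes one row with the merged patch
        self.interval = interval
        self.pending = []
        self.lock = threading.Lock()
        self.timer = None

    def add(self, patch):
        with self.lock:
            self.pending.append(patch)
            if self.timer is None:   # arm one flush for this batch window
                self.timer = threading.Timer(self.interval, self._flush)
                self.timer.start()

    def _flush(self):
        with self.lock:
            batch, self.pending, self.timer = self.pending, [], None
        self.flush_fn(batch)         # single round trip for many keystrokes
```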
Lordy, relational databases are not the way to go for that problem... With a single shared resource (document), you're going to be encountering write conflicts left and right.
Have you explored implementing a CRDT-based solution like WOOT instead?
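To make that concrete, here's a toy sketch of the WOOT-style data model (hypothetical and heavily simplified; real WOOT also has to integrate remote ops by placing them between their prev/next IDs and deterministically ordering concurrent inserts, which is omitted here):

```python
# Toy sketch of the WOOT idea: every character gets a globally unique ID,
# and deletes are tombstones rather than physical removals.
from dataclasses import dataclass

@dataclass
class WChar:
    wid: tuple          # (site_id, clock): globally unique character ID
    ch: str
    visible: bool = True

class ToyWootDoc:
    def __init__(self, site_id: int):
        self.site_id = site_id
        self.clock = 0
        self.chars: list[WChar] = []    # kept in document order

    def local_insert(self, pos: int, ch: str):
        self.clock += 1
        wc = WChar((self.site_id, self.clock), ch)
        self.chars.insert(pos, wc)
        # prev/next IDs travel with the op so other sites can place it
        prev = self.chars[pos - 1].wid if pos > 0 else None
        nxt = self.chars[pos + 1].wid if pos + 1 < len(self.chars) else None
        return ("insert", wc, prev, nxt)

    def local_delete(self, pos: int):
        self.chars[pos].visible = False  # tombstone; never removed
        return ("delete", self.chars[pos].wid)

    def text(self) -> str:
        return "".join(c.ch for c in self.chars if c.visible)

doc = ToyWootDoc(site_id=1)
doc.local_insert(0, "H")
doc.local_insert(1, "i")
assert doc.text() == "Hi"
```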
Definitely. The application is conceptually a transaction log of field/subfield patches, which would lend itself to something like an LSM, and we're looking at possible alternatives.
CRDTs could be a solution, but from what I gather they require too much context information to be viable for a text editing application. Our app currently uses something similar to OT.
> issues a write every time someone types into a text field
With more than a handful of people, this is getting into conflict territory pretty rapidly, especially if the document is structured as a single row (hopefully it's more granular than that). Time for some back-of-the-envelope maths:
Assuming that an average person types at around 40 words per minute, i.e. roughly 200 characters per minute (ballpark based on https://www.livechatinc.com/typing-speed-test/#/), that's a character every 300ms on average. With 10 people editing the document, that's a character every 30ms on average, which can easily lead to conflicts if they're all trying to update the same resource.
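Spelling the arithmetic out (assuming ~5 characters per word):

```python
wpm = 40                              # assumed average typing speed per person
cpm = wpm * 5                         # ~200 characters per minute
ms_per_char = 60_000 / cpm            # ~300 ms between keystrokes per person
print(ms_per_char, ms_per_char / 10)  # 300.0 30.0 -> 10 editors: ~30 ms/write
```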
Perhaps it's event sourcing based. Every time someone types into a field, it appends a row recording what was typed. Play the log back and you have the full document with no conflicts.
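A minimal sketch of that idea (hypothetical schema; in practice the log would be an append-only table keyed by sequence number):

```python
# Append one immutable event per keystroke; rebuild state by replaying.
events = []   # in practice: an append-only table (seq, field, op, payload)

def record(field, op, payload):
    events.append({"field": field, "op": op, "payload": payload})

def replay():
    doc = {}
    for e in events:
        if e["op"] == "insert":
            pos, ch = e["payload"]
            s = doc.get(e["field"], "")
            doc[e["field"]] = s[:pos] + ch + s[pos:]
    return doc

record("title", "insert", (0, "H"))
record("title", "insert", (1, "i"))
assert replay() == {"title": "Hi"}
```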
With the caveat that the stored events must not conflict with each other. So a central instance needs to check each event for validity before admitting it to the log, and that check can't easily be parallelized without introducing races between event insertions.
No event sourcing architecture I know of advocates that at all.
One of the strengths of event sourcing is that you can fix things after the fact. Say a conflicting event like you describe makes it into the log: you don't check it up front before admitting it. You notice the bad event, delete it, and replay the log.
You can always replay the log to get back to the current point in time.
If you're talking about "insert char X at location Y" leading to undesirable changes: sure. But that has nothing to do with the DB type, and everything to do with how changes are resolved. CRDTs and the like can be built on top of any DB.
Updating a DB every 30ms should be trivial. Heck, you should be able to grab an exclusive row lock, double-check your state, and write your change without coming close to that budget - 100% conflict- and deadlock-free, regardless of the number of writers, simply by using a DB as it is designed to be used. In this case, by using the biggest reasonable hammer available: make everything sequential at the DB level. You can absolutely build other systems on a relational DB that don't have that limitation.
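A sketch of that lock-then-write pattern (assumes a hypothetical `docs` table; shown with psycopg2 against Postgres - note CockroachDB handles contention with transaction retries rather than blocking row locks):

```python
# Lock-then-write: SELECT ... FOR UPDATE serializes concurrent writers on
# the row, so each one reads fresh state before writing.
import psycopg2

conn = psycopg2.connect("dbname=app")
with conn:                       # commits on success, rolls back on error
    with conn.cursor() as cur:
        # Exclusive row lock: other writers queue here instead of conflicting.
        cur.execute("SELECT body FROM docs WHERE id = %s FOR UPDATE", (42,))
        (body,) = cur.fetchone()
        cur.execute("UPDATE docs SET body = %s WHERE id = %s",
                    (body + "!", 42))
```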