
The log segments committed by FaunaDB contain batches of transactions, which means our throughput is constrained not by our consensus protocol, but rather by conflicts as transactions are resolved. The benchmarked 3300 transactions/second mentioned is for complex transactions with dozens of reads and writes. Additionally, read-only transactions are not run through the transaction pipeline, since they can be served consistently with low latency from replicas using FaunaDB's snapshot storage.
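To make the shape of this concrete, here is a rough Python sketch of the two paths described above; the names and structure are illustrative assumptions, not FaunaDB's actual implementation. Read-only transactions are answered from a replica's snapshot and skip the pipeline, while read-write transactions are buffered and committed to the log as a single batch, so one consensus round is amortised over many transactions and throughput is bounded by conflicts among them.

    # Rough sketch of the two paths described above; names and structure are
    # illustrative assumptions, not FaunaDB's actual implementation.

    class Replica:
        def __init__(self):
            self.snapshot = {}   # key -> value at this replica's snapshot
            self.pending = []    # read-write txns waiting for the next batch
            self.log = []        # durably committed log segments

        def read_only(self, keys):
            # Read-only transactions never enter the transaction pipeline;
            # they are answered consistently from local snapshot storage.
            return {k: self.snapshot.get(k) for k in keys}

        def submit(self, txn):
            # Read-write transactions are buffered into the current batch.
            self.pending.append(txn)

        def flush_batch(self):
            # One (elided) consensus round commits the whole batch as a log
            # segment, so consensus cost is amortised over many transactions;
            # throughput is then bounded by conflicts among them.
            segment, self.pending = self.pending, []
            self.log.append(segment)
            return segment

    r = Replica()
    r.submit({"writes": {"x": 1}})
    r.submit({"writes": {"y": 2}})
    print(len(r.flush_batch()))     # 2 -- one log segment, two transactions
    print(r.read_only(["x"]))       # served from snapshot, no consensus round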

More important for most applications than theoretical benchmarks is taking a sound approach and using a database you can depend on. Asking your development team to manage isolation is going to introduce more costs than benefits, except in the most specialized of circumstances.

FaunaDB's model is always ACID, so you never have to guess what's in the database, even in the presence of region failures or other anomalies.




We built HopsFS on NDB, not CalvinDB, because we needed performance for cross-partition transactions. Some workloads need it. In the filesystem workload, when FS path components are normalized and stored on different partitions, practically all transactions cross partitions. So if you serialize them, then writing a file in /home/jchan will block writing a file in /home/jim. This is what most distributed filesystems already do - have a global write lock on the filesystem metadata. I like having the freedom to build my own concurrency model on top of READ_COMMITTED, as I can squeeze out 16X performance improvements by doing so (ref: https://www.usenix.org/conference/fast17/technical-sessions/... )
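A toy illustration of why that happens (the hash partitioning below is an assumption for illustration, not HopsFS's actual schema): once path components are normalized into their own rows and spread across partitions, creating a file has to touch the partitions of all its ancestors, so even unrelated writes under /home become cross-partition transactions.

    # Toy illustration of the partitioning described above; the hashing scheme
    # is an assumption for illustration, not HopsFS's actual schema.
    import zlib

    NUM_PARTITIONS = 4

    def partition_of(component):
        # Each normalized path component (inode row) lives on the partition
        # its name hashes to.
        return zlib.crc32(component.encode()) % NUM_PARTITIONS

    def partitions_touched(path):
        # Creating a file reads/locks every ancestor component plus the new
        # inode, so the transaction spans all of these partitions.
        return {partition_of(c) for c in path.split("/") if c}

    print(partitions_touched("/home/jchan/report.txt"))
    print(partitions_touched("/home/jim/photo.jpg"))
    # Both creates almost certainly span multiple partitions; if cross-partition
    # transactions are serialized globally, these two unrelated writes cannot
    # proceed in parallel.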


Sure, if you don't care about serializability, you don't need to pay consensus costs to enforce it, but you will be subject to read and write skew.
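For anyone unfamiliar with the anomaly, here is a minimal write-skew sketch under a READ_COMMITTED-style scheme, where each transaction validates an invariant against the state it read and the writes are then applied independently; the on-call scenario is just a stock illustration.

    # Minimal illustration of write skew: each transaction checks the
    # invariant ("at least one doctor on call") against the state it read,
    # and the writes are then applied independently.

    oncall = {"alice": True, "bob": True}

    def request_leave(snapshot, doctor):
        # Allowed only if some other doctor is still on call in the snapshot.
        others_on_call = sum(1 for d, v in snapshot.items() if d != doctor and v)
        return {doctor: False} if others_on_call >= 1 else {}

    snap = dict(oncall)                  # both txns read before either commits
    w1 = request_leave(snap, "alice")    # sees bob on call -> allowed
    w2 = request_leave(snap, "bob")      # sees alice on call -> allowed

    oncall.update(w1)
    oncall.update(w2)
    print(oncall)   # {'alice': False, 'bob': False} -- invariant violated
    # A serializable system would order the two and abort or re-check one.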


> This is what most distributed filesystems already do - have a global write lock on the filesystem metadata

Absolutely untrue. Stop lying about other systems.


I should add that when you don't serialize cross-partition transactions, you have the hard problem of system recovery - transaction coordinators need to reach consensus on a consistent set of transactions to recover. NDB's solution to this problem was only published this year, in Mikael Ronstrom's book on MySQL Cluster: https://drive.google.com/file/d/1gAYQPrWCTEhgxP8dQ8XLwMrwZPc...


I should add that it is not correct to say 'our throughput is constrained not by our consensus protocol'. A trivial counterexample is a workload where each transaction has at least two non-conflicting writes on different partitions. FaunaDB will serialize those transactions, and you will bottleneck on the consensus protocol, compared to NDB.
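Here is a sketch of that workload (illustrative only): each transaction writes two private keys on two different partitions, so no pair of transactions conflicts, yet a design that totally orders all read-write transactions through one replicated log still pays that ordering cost for every one of them.

    # Sketch of the workload described above (illustrative only): every
    # transaction writes two private keys on two different partitions, so no
    # pair of transactions conflicts.

    def make_workload(n):
        # txn i writes its own keys on partition 0 and partition 1.
        return [{f"p0:key{i}", f"p1:key{i}"} for i in range(n)]

    write_sets = make_workload(1000)
    conflicts = sum(1 for i, a in enumerate(write_sets)
                    for b in write_sets[i + 1:] if a & b)
    print(conflicts)   # 0 -- yet all 1000 transactions are still totally ordered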


What he's saying is that the throughput number they mention is constrained by transaction conflicts. The limit for non-conflicting transactions allowed by the consensus protocol is no doubt much higher.


Can you partition your batch transactions so that everything up to the conflict succeeds?


My understanding is yes: we commit to the log in batches, but abort conflicts at the transaction level. So only the conflicts have to suffer the retry loop; everything else is durably committed.
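A minimal sketch of that behaviour, assuming an optimistic validation scheme (illustrative, not FaunaDB's code): the segment is already durably in the log, each transaction is checked against the versions it read, and only the ones that lost a conflict go back around the retry loop.

    # Minimal sketch, assuming an optimistic validation scheme (illustrative,
    # not FaunaDB's code): the segment is already durably committed, each
    # transaction is validated against the versions it read, and only the
    # losers go back around the retry loop.

    versions = {}   # key -> last committed version

    def apply_segment(segment):
        retry = []
        for txn in segment:
            stale = any(versions.get(k, 0) != v for k, v in txn["reads"].items())
            if stale:
                retry.append(txn)          # only conflicting txns retry
                continue
            for k in txn["writes"]:
                versions[k] = versions.get(k, 0) + 1
        return retry

    segment = [
        {"reads": {"a": 0}, "writes": ["a"]},
        {"reads": {"a": 0}, "writes": ["a"]},   # conflicts with the first txn
        {"reads": {"b": 0}, "writes": ["b"]},   # independent: applies fine
    ]
    print(len(apply_segment(segment)))          # 1 -- only the conflict retries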



