Twitter Drops MySQL For Cassandra

jbellis · on March 1, 2010

I tried to explain to the reporter why Cassandra's data model (particularly that it supports an arbitrary number of columns per row) makes it support denormalization better than traditional rdbmses, and somehow that got turned into "cassandra supports an arbitrary number of rows."

This and a few other poor explanations make me wince reading this. So I don't particularly recommend this article, although I've seen worse. :)

TFA does at least link the original interview w/ Ryan King of Twitter (http://nosql.mypopescu.com/post/407159447/cassandra-twitter-...) which is much better for people at HN level.

My own article at http://www.rackspacecloud.com/blog/2010/02/25/should-you-swi..., although high level, also has some useful links for those who want to drill down for more details.
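
To make the data-model point above concrete (an arbitrary number of columns per row makes denormalization easier), here is a minimal sketch in plain Python. It is not Cassandra's API: the `timelines` dict, `post_tweet`, and `read_timeline` are hypothetical names, and an in-memory dict stands in for a column family. The idea is that each follower's timeline is a single row that simply gains one more column per tweet, so a read touches one row and needs no join.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a column family:
# row key -> {column name: value}. Real Cassandra keeps columns sorted and
# spreads rows across nodes; this only illustrates the shape of the data.
timelines = defaultdict(dict)

def post_tweet(author, followers, tweet_id, text):
    """Denormalize on write: copy the tweet into every follower's row."""
    for user in followers:
        # Each tweet becomes one more column in the follower's row.
        timelines[user][tweet_id] = f"{author}: {text}"

def read_timeline(user, limit=10):
    """Read a single row, newest (highest id) columns first. No join."""
    row = timelines[user]
    return [row[k] for k in sorted(row, reverse=True)[:limit]]

post_tweet("alice", ["bob", "carol"], 1, "hello")
post_tweet("dave", ["bob"], 2, "hi there")
print(read_timeline("bob"))    # bob's row now holds two columns
print(read_timeline("carol"))  # carol's row holds one column
```

The trade-off is the usual one for denormalization: writes fan out to many rows so that reads stay cheap.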

JulianMorrison · on March 1, 2010

I wish Ryan King had said why the other DBs named were rejected. That could be useful info.

larrywright · on March 2, 2010

This was actually previously discussed by Twitter's Evan Weaver. I can't find the article I'm thinking of, but this is close: http://blog.evanweaver.com/articles/2009/07/06/up-and-runnin...

My understanding is that it came down to Cassandra being the only one that was well tested in multi-datacenter configurations. I think there were other reasons, but that was the big one.

socratees · on March 1, 2010

Sites similar to Twitter are worried about reads/writes and uptime rather than concurrency and locking mechanisms. All flavors of NoSQL databases are best suited for this. Sites like amazon.com, or any transaction-processing site, can't think of using NoSQL. tl;dr: NoSQL is good until transaction processing and concurrency come into the picture.

rbranson · on March 2, 2010

Amazon pioneered Dynamo, a NoSQL-style scalable distributed key-value store, on which many of their site's internal services are built. It's backed by BDB and can be tuned to guarantee disk writes across multiple nodes if needed. See the paper below.

http://www.allthingsdistributed.com/2007/10/amazons_dynamo.h...
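
As a rough illustration of what "tuned to guarantee disk writes across multiple nodes" can mean, the sketch below accepts a write only when at least `w` replicas acknowledge it. This is a simplification in the spirit of the Dynamo paper's write quorum, not Amazon's code: `Replica`, `quorum_put`, and the `up` flag are hypothetical, and a dict assignment stands in for a durable disk write.

```python
class Replica:
    """Hypothetical replica; `up` simulates whether the node is reachable."""

    def __init__(self, up=True):
        self.up = up
        self.store = {}

    def put(self, key, value):
        if not self.up:
            return False
        self.store[key] = value  # stands in for a durable disk write
        return True

def quorum_put(replicas, key, value, w):
    """Report success only if at least `w` replicas acknowledged the write."""
    acks = sum(1 for r in replicas if r.put(key, value))
    return acks >= w

replicas = [Replica(), Replica(), Replica(up=False)]
print(quorum_put(replicas, "cart:42", ["book"], w=2))  # two of three nodes ack
print(quorum_put(replicas, "cart:42", ["book"], w=3))  # one node is down
```

Raising `w` buys durability at the cost of availability: with `w=3` a single unreachable node is enough to fail the write.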

andrewtj · on March 2, 2010

Scalaris (http://code.google.com/p/scalaris/) is consistent and has transactions.

As an aside, three sentences don't really warrant a "tl;dr".

z8000 · on March 2, 2010

Not being snarky here but how can anyone seriously consider Scalaris for any data that's even remotely important?

http://code.google.com/p/scalaris/wiki/FAQ#Is_the_store_pers...?

andrewtj · on March 2, 2010

It certainly limits it from being useful if you don't have dollars to throw at high-end infrastructure, but it's not quite as terrible as it sounds. Effectively all it means is that you can't cold start it. How big a problem this is depends like anything else on the application in question.

z8000 · on March 2, 2010

As I write this comment I just know that in 5 years I will look at it and chuckle at myself but... it boggles my mind to consider a pure-RAM system that relies on at least one copy of the dataset being available forever in some system's "memory grid".

http://highscalability.com/are-cloud-based-memory-architectu...

JulianMorrison · on March 2, 2010

MongoDB does implicit transactions around single queries/commands on single documents, that can be complex enough to be used to create always-recoverable or always-resumable protocols.

jbellis · on March 2, 2010

It doesn't take a _whole_ lot of familiarity with the other systems to see which of his criteria they failed.

justinsb · on March 1, 2010

As Jonathan has pointed out, the article has lots of inaccuracies, but it also has a very good explanation of the NoSQL Faustian pact: "Cassandra doesn't do joins... doesn't guarantee referential integrity, where the user knows the data being used reflects the latest updates... can't process transactions, with a guarantee that the transaction will either be completed or discarded, the way relational systems do" because it focuses on "more immediate goals than the pristine data handling rules of relational systems."

As the expression goes: at the poker table, if you don't know who the sucker is, it's you. Guess who's responsible for those fuddy-duddy, old-fashioned things like querying your data, or making sure that you've written data, or that your database isn't corrupted... it's you.

If you're Twitter, and you're fail-whaling every day, then maybe the work required to make this trade-off work makes sense. But I can't help but feel there's got to be a better way.

raganwald · on March 1, 2010

I won't defend Cassandra's design choices here, but I will say that the relational model is only about joins and normal forms. The other things you mention happen to be built into all modern relational databases but are orthogonal concepts.

A distributed hash table can be built with transactions, isolation, and so forth. Such a system would offer a different set of trade-offs that might satisfy a different set of users.

justinsb · on March 1, 2010

Completely agree with you on the theoretical level; it's a (lazy) shorthand to contrast the typical NoSQL trade-offs with the ACID model that most relational databases employ.

In practice though, I think if you introduce (multi-'row') ACID into any of today's NoSQL databases, you'd just end up with a bad traditional database (a 'relational' database without a strong theoretical grounding, without the ability to do joins and without a powerful query language.) This whole NoSQL movement feels like a re-run of the evolution of the relational database - those that don't know their history are doomed to repeat it.

jbellis · on March 1, 2010

> I think if you introduce (multi-'row') ACID into any of today's NoSQL databases, you'd just end up with a bad traditional database

Strongly disagree, although there is truth to the converse (adding sharding to a traditional database yields a bad nosql implementation :).

There's a sketch of how Google added multirow transactions to bigtable for app engine (via a layer above bigtable called megastore): http://perspectives.mvdirona.com/2008/07/10/GoogleMegastore....

The key idea is that "a per-entity-group transaction log is used. One of the rows that stores the entity group is the entity group’s root. The log is stored with the root, which is replicated like all rows in Big Table."

This is basically a version of the approach advocated by Pat Helland in his paper on "Life Beyond Distributed Transactions" -- http://www.cidrdb.org/cidr2007/papers/cidr07p15.pdf -- namely, that the most sane approach to distributed transactions is to redefine the problem, and restrict transactions to be within "entities" that fit on a single machine. Which, as App Engine demonstrates, turns out to be enough to do an awful lot.
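
A toy sketch of the idea in this comment, under heavy assumptions: transactions are confined to one "entity group" that lives on a single machine and carries its own log. `EntityGroup`, `transact`, and `transfer` are hypothetical names, not Megastore's or App Engine's API, and the log here is just an in-memory list with no replication.

```python
import copy

class EntityGroup:
    """Hypothetical entity group: a few rows plus one log, on one machine."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.log = []  # per-group transaction log, kept with the group

    def transact(self, fn):
        """Run fn against a private copy; apply all of its changes or none."""
        working = copy.deepcopy(self.rows)
        fn(working)               # may raise, leaving self.rows untouched
        self.log.append(working)  # record the committed state first
        self.rows = working       # then make it visible

def transfer(amount):
    """Build a transaction that moves `amount` between two rows."""
    def fn(rows):
        if rows["checking"] < amount:
            raise ValueError("insufficient funds")
        rows["checking"] -= amount
        rows["savings"] += amount
    return fn

account = EntityGroup({"checking": 100, "savings": 0})
account.transact(transfer(30))
try:
    account.transact(transfer(500))
except ValueError:
    pass  # the aborted transaction leaves the group unchanged
print(account.rows, len(account.log))
```

Because everything a transaction touches sits in one group, no coordination with other machines is needed to commit it.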

justinsb · on March 1, 2010

Interesting links - thanks. There's definitely a continuum here - the full ACID model at one end; key-value stores at the other. It looks like Google is moving in the right direction - adding more ACID - which certainly plays to my personal bias :-)

cx01 · on March 1, 2010

> Strongly disagree

Would you mind explaining why you disagree?

> the most sane approach to distributed transactions is to redefine the problem, and restrict transactions to be within "entities" that fit on a single machine.

How is disallowing distributed transactions a sane approach? There's no harm in allowing distributed transactions using 2PC and letting the developer decide if he wants to incur the performance penalty or wants to colocate the data-structures to the same node to avoid 2PC overhead.
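
For readers unfamiliar with 2PC, here is a minimal single-process sketch of the protocol being discussed. `Participant` and `two_phase_commit` are hypothetical names; a real implementation would also need durable logging, timeouts, and recovery when the coordinator fails, none of which is modeled here.

```python
class Participant:
    """Hypothetical node holding part of the data."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. A real node would persist its intent and hold locks.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: one round per node
    if all(votes):
        for p in participants:                   # phase 2: a second round
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False

ok = two_phase_commit([Participant("a"), Participant("b")])
failed = two_phase_commit([Participant("a"), Participant("b", can_commit=False)])
print(ok, failed)
```

The two rounds of messages per transaction, with participants holding their prepared state in between, are the overhead the comment refers to.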

jbellis · on March 1, 2010

[replying to cx01]

> Would you mind explaining why you disagree?

I did, that's what the rest of the comment is. :)

> How is disallowing distributed transactions a sane approach?

That's the subject of Pat's entire paper, which I linked.

RyanMcGreal · on March 1, 2010

> in what's becoming a not-uncommon move.

Or as Orwell put it: "A not unblack dog was chasing a not unsmall rabbit across a not ungreen field."

telemachos · on March 1, 2010

Litotes: my favorite figure of speech.

http://en.wikipedia.org/wiki/Litotes

DanielBMarkham · on March 1, 2010

Mine too.

And when combined with a thinly veiled insult, gives the wonderful "back-handed compliment"

"He's not as unintelligent as he looks"

"She's not the liar she might have been given her poor upbringing"

ableal · on March 1, 2010

> my favorite figure of speech

my not-unpreferred figure of speech, please. If you want to make it not unpossible to believe you.

(Thanks, really; I've developed a belated soft spot for rhetorical devices.)

santry · on March 1, 2010

Unlike the litotes in your Orwell quote, "not uncommon" and "common" mean different things.

ryanwaggoner · on March 2, 2010

Namely? And based on what?

jemfinch · on March 2, 2010

If there is an axis of "commonness" upon which things can be measured, with "never happens" on the left side, and "always happens" on the right side, then "uncommon" would be a small section on the left, and "common" would be a small section on the right; everything between these sections could be "not uncommon."

balding_n_tired · on March 1, 2010

The war situation has developed not necessarily to MySQL's advantage.

freshfunk · on March 1, 2010

I'm a fan of the NoSql movement and have been exploring Cassandra as an option for data storage.

I had a conversation with an engineer who works at a pretty well-known company here in SV and their sys admins are dropping Cassandra and pushing all the engineers back to MySql. I don't know the whole story but it seemed to be implied that open-sourced Cassandra had issues and supposedly Facebook had a much different version they were using.

Of course this is all second hand, so I tried to search on the experiences of other people using Cassandra (with decent volume). Unfortunately most of the threads I found had people just like me, at the exploratory stage. Or they hadn't been live with it for long.

If there were any pitfalls or hairy parts with maintaining Cassandra, that would be good to know. Also, examples of clients who have decent load and have been using it for a while.

jbellis · on March 1, 2010

> their sys admins are dropping Cassandra and pushing all the engineers back to MySql

I'm curious what you are thinking of, because I have a better picture of companies using Cassandra than most. :)

I do know of one company that fits your description, where some of the mysql DBAs were very anti-cassandra because, frankly, it's not mysql and that's what they were used to. But that has been resolved (the most vocal DBA left) and the Cassandra migration is continuing.

> If there were any pitfalls or hairy parts with maintaining Cassandra, that would be good to know

Other than the obvious (it's not a relational database), we've documented the main limitations here: http://wiki.apache.org/cassandra/CassandraLimitations

physcab · on March 1, 2010

I'm curious why a lot of people / companies seem to be picking up Cassandra lately. I'm not one to peg one "NoSQL" software system against another, but I've been using HBase for a few months (albeit on a rather limited cluster) and feel that it fits great especially with its compatibility with Hadoop. We use Hadoop + Map/Reduce extensively at Grooveshark. Does anyone have experience in using both systems and can offer a candid account of both?

simonw · on March 1, 2010

I have no experience with either, but here's a recent blog entry about moving from HBase to Cassandra:

http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-wh...

bjclark · on March 1, 2010

More accurate title: Twitter migrates parts of its system away from MySQL to other data stores including Cassandra.

sown · on March 1, 2010

So what kind of performance can you get from Cassandra? How big can the values be? I was thinking of using it as a backend for a mail system.

jbellis · on March 1, 2010

> So what kind of performance can you get from Cassandra?

~10k ops/second per quad-core node. Scales roughly linearly w/ node and core counts. ("Roughly" means, obviously there is network overhead as you move from single node to multiple, that kind of thing.)

> How big can the values be?

2 GB, although you probably don't want to max that out.

Cassandra is mostly used for smaller pieces of data, although I do know at least one person using it as an S3 replacement by chunking files into 64MB chunks; each file is one row consisting of columns that each contain one such chunk.
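
The chunking layout described above (one row per file, one column per 64MB chunk) might look roughly like this. It is a plain-Python sketch, not that user's code: the `chunk-NNNNNNNN` column naming is an assumption made so that sorting the column names restores the original order, and a dict stands in for the row.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size mentioned above

def file_to_row(data, chunk_size=CHUNK_SIZE):
    """Split bytes into a row: zero-padded column name -> one chunk."""
    return {
        f"chunk-{i // chunk_size:08d}": data[i:i + chunk_size]
        for i in range(0, len(data), chunk_size)
    }

def row_to_file(row):
    """Reassemble the file by reading columns in name order."""
    return b"".join(row[name] for name in sorted(row))

# A tiny chunk size so the example runs instantly.
payload = b"abcdefghij"
row = file_to_row(payload, chunk_size=4)
print(sorted(row))
print(row_to_file(row) == payload)
```

Zero-padding the chunk index keeps lexical order and numeric order the same, which matters if columns are read back sorted by name.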

> I was thinking of using it as a backend for a mail system.

It should work fine for that.

davidw · on March 1, 2010

Whoever does a thorough comparison of a number of these new "nosql" systems, including features and some benchmarks, will have him or herself an extremely popular article.

jbellis · on March 1, 2010

One I wrote: http://www.rackspacecloud.com/blog/2009/11/09/nosql-ecosyste...

Another, that goes into even more detail (imo, too much for one article, but a good article all the same): http://www.vineetgupta.com/2010/01/nosql-databases-part-1-la...

Benchmarking systems w/ very different data models is difficult to impossible, which is why you don't see that in this kind of survey piece. You're best off by picking one segment and focusing on that. Yahoo did that with the ColumnFamily stores (cassandra, hbase, and one they wrote internally) here: http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf (note that Cassandra 0.5 results are on page 16 and 17, not inline w/ the rest)

davidw · on March 1, 2010

Maybe one way to do it would be to pick a problem, or several problems, and implement solutions in each one.

Hrm, maybe it's a book more than an article, although I don't think I'd buy it, given that it'd be out of date so soon.

The point being, as a casual observer of these things, I don't yet have a good feel at all for which ones might be good for what.

Roridge · on March 1, 2010

informationLastweek

mahmud · on March 1, 2010

Not everyone is plugged into the "scene" RSSes. A publication's purpose is to discover trends in news signals, and distill them into a coherent format supported by argument.

Roridge · on March 2, 2010

fair point, I cheerfully withdraw my comment.

lallysingh · on March 1, 2010

about...fracking...time