
Our uptime could use some improvement, surely. Bear in mind, though, that it has improved substantially over the last year, largely because of things like Gizzard...

I think one of the compelling things about Gizzard is this: there is an unending list of crazy and obscure failure conditions that cascade in unpredictable ways and take the site down, and all of those that have occurred up till now are encoded into the design of Gizzard so that they do not occur again.

We have not fixed the "unknown unknowns," yet to be revealed, that can cause a Gizzard system to crash. But my guess is that, over the last year, we have built in fail-safes for scenarios that exceed, by a long shot, anything most people have ever experienced or ever will experience.

I love Cassandra; I think it's awesome. Twitter plans to move its Tweet storage to Cassandra. Suffice it to say that Cassandra is young, and its reliability with our throughput and our (very, very large) corpus could use some improvement. Cassandra is not yet ready for production use at Twitter, despite the fact that it has been deployed successfully at Digg and elsewhere. The Cassandra community is improving the database very rapidly. In a year or two I expect Cassandra to be a reasonable option for many popular web sites. But, even then, Cassandra's design has certain limitations that might prompt you to design a custom store (or consider alternatives). And indeed, there is a lot of stuff we have at Twitter that we have no plans to store in Cassandra unless there are substantial design changes.

An advantage of Gizzard here is that it is more flexible. You could, for example, build a document-partitioned inverted index with Gizzard such that you could do local intersections on each shard and merge these intersections in Gizzard itself. You might, in fact, use Gizzard in front of Lucene to do just this. There is no way to do that with Cassandra, though perhaps they will build such features in the future. (To anticipate an objection: Lucandra is not document-partitioned, so it will suffer from fundamental efficiency problems for a certain class of search queries.)
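To make the document-partitioned idea concrete, here is a minimal sketch in Python. It is not Gizzard's or Lucene's API: the `Shard` and `Router` classes, the modulo placement rule, and the whitespace tokenizer are all assumptions made up for illustration. Each shard holds a complete inverted index for its own documents, so the intersection of posting lists happens locally, and the routing layer only has to union the per-shard results.

```python
from collections import defaultdict


class Shard:
    """Hypothetical shard: a full inverted index over its own documents only."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of local docs containing it

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def intersect(self, terms):
        # Local intersection: every posting list involved lives on this shard.
        sets = [self.postings.get(term, set()) for term in terms]
        return set.intersection(*sets) if sets else set()


class Router:
    """Hypothetical stand-in for the routing layer in front of the shards."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Assumed placement rule: partition by document id, not by term.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, query):
        terms = query.lower().split()
        merged = set()
        for shard in self.shards:
            # Shards own disjoint documents, so merging is a plain union.
            merged |= shard.intersect(terms)
        return sorted(merged)


router = Router(num_shards=3)
router.add(1, "gizzard shards data")
router.add(2, "cassandra stores data")
router.add(3, "gizzard fronts lucene")
router.add(4, "gizzard shards lucene data")
results = router.search("gizzard data")
```

With a term-partitioned index, by contrast, the posting lists for "gizzard" and "data" could live on different machines, and one of them would have to be shipped across the network before it could be intersected.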




There's no disputing you guys have massively increased your stability in the last year, and you did it while growing enormously as well, which makes it that much more of a victory.

But it would be good if you could go into more detail on your blog about which parts of Twitter depend on Gizzard, and especially if you could separate it from your remaining problems, rather than talking exclusively about how it works.


I think you've given a good suggestion for upcoming blog posts.

Gizzard is used by two systems at Twitter. One of them is called FlockDB and we are working on open-sourcing it (indeed, FlockDB is why we open-sourced Gizzard). FlockDB stores Twitter's social graphs. I cannot tell you (yet) how many QPS we do or how many edges there are in the various graphs, but suffice it to say it's a lot, and we run lots of complicated queries like "who follows both @aplusk and @oprah but does not want any of @aplusk's retweets," etc.
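For readers wondering what a query like that reduces to, it is set algebra over edge lists: an intersection of two follower sets followed by a difference. The sketch below is plain Python, not FlockDB's query API; the function name, the dictionary layout, and the user ids are all hypothetical.

```python
def followers_of_both_without_retweets(followers, retweets_blocked, a, b):
    """Users who follow both a and b, minus those who have turned off a's retweets.

    followers: account -> set of follower ids (assumed layout)
    retweets_blocked: account -> set of follower ids who hide that account's retweets
    """
    both = followers.get(a, set()) & followers.get(b, set())
    return both - retweets_blocked.get(a, set())


# Made-up data for illustration only.
followers = {"aplusk": {1, 2, 3, 4}, "oprah": {2, 3, 4, 5}}
retweets_blocked = {"aplusk": {3}}
audience = followers_of_both_without_retweets(
    followers, retweets_blocked, "aplusk", "oprah"
)
```

The hard part at scale is not the algebra but that each of these sets is large and spread across many shards.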

Gizzard is NOT perfect. But we think you'll find it is resilient to a large class of failure scenarios (that have, in fact, occurred over the last year, so this is not just the theoretical fault tolerance of a project that's been deployed on a small web site with simple requirements). With your help, it could be even better.



