With all the downtime and seemingly permanent "Older tweets are temporarily unavailable" messages, why would anyone want to build on platforms they've created?
Maybe they have good ideas, but I think part of the reason people are looking at Cassandra etc. is that the high-profile sites using/developing these platforms actually work.
This is such a cynical and short-sighted comment. Did Twitter have scalability problems? Yes. Does this mean that any and all technology that comes out of the company is junk? On the contrary. If anything, battle-tested software that was built to solve massive scalability issues is going to be better than software that was not.
Clearly they've worked through a lot of their problems at this point; I'd imagine this is some of the code that let them do so.
Never mind the fact that a reply like this is disrespectful towards people who are giving you the fruits of their labor for free. Shame on you.
Maybe they nailed everything with Gizzard, but it'd be more convincing if their scalability and other problems were actually solved. I'm not sure why you alluded to those problems in the past tense; see http://status.twitter.com/.
If I wanted to be disrespectful, I'd have said their problems are still very real and very frequent, and for that reason you'd have to be retarded to assume they've gotten anything but marketing right so far.
Would you build on Gizzard right now? Their status blog is full of very recent reasons why it's premature for them to release anything, and even more so why it's premature to trust or be grateful to them for doing so.
Our uptime could use some improvement, surely. Bear in mind, though, that it has improved substantially over the last year. It's largely because of things like Gizzard...
I think one of the compelling things about Gizzard is this: there is an unending list of crazy and obscure failure conditions that cascade in unpredictable ways and take the site down, and all of those that have occurred up till now are encoded into the design of Gizzard so that they do not occur again.
We have not fixed the "unknown unknowns", yet to be revealed, that can cause a Gizzard system to crash. But my guess is that, over the last year, we have built in fail-safes for scenarios that exceed, by a long shot, anything most people have ever experienced or ever will experience.
I love Cassandra, I think it's awesome. Twitter plans to move its Tweet storage to Cassandra. Suffice it to say that Cassandra is young, and its reliability at our throughput and with our (very, very large) corpus could use some improvement. Cassandra is not yet ready for production use at Twitter, despite the fact that it has been deployed successfully at Digg and elsewhere. The Cassandra community is improving the database very rapidly. In a year or two I expect Cassandra to be a reasonable option for many popular web sites. But, even then, Cassandra's design has certain limitations that might prompt you to design a custom store (or consider alternatives). And indeed, there is a lot of stuff we have at Twitter that we have no plans to store in Cassandra unless there are substantial design changes.
An advantage of Gizzard here is that it is more flexible. You could, for example, build a document-partitioned inverted index with Gizzard such that you could do local intersections on each shard and merge these intersections in Gizzard itself. You might, in fact, use Gizzard in front of Lucene to do just this. There is no way to do that with Cassandra, though perhaps they will build such features in the future. (To anticipate an objection: Lucandra is not document-partitioned, so it will suffer from fundamental efficiency problems for a certain class of search queries.)
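To make the document-partitioned idea concrete, here is a minimal in-memory sketch. It is not Gizzard's or Lucene's API; the `Shard` and `ShardedIndex` classes, the modulo routing, and the whitespace tokenizer are all assumptions made up for illustration. Each shard owns a subset of the documents along with its own inverted index, so a multi-term query is intersected locally on every shard and the coordinator only has to merge the per-shard hits.

```python
from collections import defaultdict


class Shard:
    """Hypothetical shard: holds some documents and its own inverted index."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of local docs containing it

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def intersect(self, terms):
        # Local intersection: only this shard's documents are considered,
        # so no posting lists have to cross the network.
        sets = [self.postings.get(term, set()) for term in terms]
        if not sets:
            return set()
        return set.intersection(*sets)


class ShardedIndex:
    """Hypothetical coordinator: routes docs by id and merges shard results."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Assumed routing rule: document id modulo the number of shards.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, *terms):
        terms = [term.lower() for term in terms]
        hits = set()
        for shard in self.shards:
            # Shards hold disjoint documents, so merging is a plain union.
            hits |= shard.intersect(terms)
        return sorted(hits)
```

Because a document lives on exactly one shard, every term of that document is indexed on the same shard, which is what lets the intersection stay local. In a term-partitioned layout the posting lists for different terms could sit on different machines and would have to be shipped around before intersecting.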
There's no disputing you guys have massively increased your stability in the last year, and while growing enormously as well, which makes it that much more of a victory.
But it would be good if you could go into more detail on your blog about which parts of Twitter depend on Gizzard, and especially separate it from your remaining problems, rather than talking exclusively about how it works.
I think you've given a good suggestion for upcoming blog posts.
Gizzard is used by two systems at Twitter. One of them is called FlockDB and we are working on open-sourcing it (indeed, FlockDB is why we open-sourced Gizzard). FlockDB stores Twitter's social graphs. I cannot tell you (yet) how many QPS we do or how many edges there are in the various graphs, but suffice it to say it's a lot, and we run lots of complicated queries like "who follows both @aplusk and @oprah but does not want any of @aplusk's retweets," etc.
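As a rough illustration of what a query like that boils down to, here is a sketch in terms of plain set algebra. This is not FlockDB's API or storage model; the `followers` and `retweets_muted` dictionaries, the numeric user ids, and the function name are all hypothetical, and the query is read literally as "follows both accounts and has opted out of the first account's retweets".

```python
def follows_both_but_mutes_retweets(followers, retweets_muted, a, b):
    """Users who follow both a and b and have opted out of a's retweets.

    followers:      account -> set of user ids following it (made-up data shape)
    retweets_muted: account -> set of user ids who do not want its retweets
    """
    both = followers.get(a, set()) & followers.get(b, set())
    return sorted(both & retweets_muted.get(a, set()))


# Made-up example data, purely for illustration.
followers = {"aplusk": {1, 2, 3, 4}, "oprah": {2, 3, 4, 5}}
retweets_muted = {"aplusk": {3, 9}}

print(follows_both_but_mutes_retweets(followers, retweets_muted, "aplusk", "oprah"))
```

At real scale the interesting part is that each of these sets is itself spread over many shards, so the intersections have to be paged and merged rather than computed in one place.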
Gizzard is NOT perfect. But we think you'll find it is resilient to a large class of failure scenarios (that have, in fact, occurred over the last year, so this is not just the theoretical fault tolerance of a project that's been deployed on a small web site with simple requirements). With your help, it could be even better.
I handle a similar rate but a different flavour of data, in near real time. Not really comparable though.
I'm not saying what Twitter is doing can't or won't be good, just that you should think very hard before you consider it for anything until they've visibly left their own scaling issues behind.
My guess is that a lot of this infrastructure is to support them moving to new technologies while the old ones are in place. Changing the engine while the car is in motion requires a lot of piecemeal engineering work, and I think you're seeing that effort here.