I would tend to say that MongoDB is the new MySQL in a different way - it's the default not-really-suitable hammer that the current generation of amateur software carpenters are using to turn every problem into a nail. It's not super elegant, but most of the time it works, sort of.
On the bright side, one could say that at least half of the PHP/MySQL tag team has been improved significantly by being replaced with Python/MongoDB, as theological issues aside, Python is a lot less broken than PHP :)
"I would tend to say that MongoDB is the new MySQL in a different way - it's the default not-really-suitable hammer that the current generation of amateur software carpenters are using to turn every problem into a nail. It's not super elegant, but most of the time it works, sort of."
that's an excellent - and much shorter - way of putting it.
For what projects (besides enterprise -> Oracle, embedded -> SQLite and Windows -> SQL Server) would you consider MySQL inappropriate? What would you use instead?
I see people angry about not using the "right tool for the job" all the time with regard to web applications. What other tools do you feel aren't used when they should be?
If you don't have need for relational integrity, then MySQL is a liability. MongoDB does not have relational integrity which is one of the reasons why it's faster.
eg Real-time analytics
If you need the database to enforce a pre-planned schema rather than making one up on the fly (even accidentally), then MySQL is better, as MongoDB is schema-less.
eg Rapidly changing input
If you are worried about Oracle's acquisition of Sun, and with it MySQL (though in that case switching to PostgreSQL is the closest alternative).
MySQL sharding is hard; MongoDB can automate it to a certain extent (MySQL requires third-party tools, I believe).
eg Distributing large datasets
Note: Other databases, e.g. Cassandra, can also automate sharding.
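To make the sharding point concrete, here is a minimal sketch of what doing it by hand in the application looks like. It is only an illustration: the shard names are hypothetical, plain dicts stand in for separate database servers, and hash-modulo routing is just one simple scheme, not how MongoDB or Cassandra actually place data.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to servers.
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard by hashing the key (stable across processes, unlike hash())."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# Each shard is just a dict here, standing in for a separate database server.
stores = {name: {} for name in SHARDS}


def put(key: str, value: dict) -> None:
    stores[shard_for(key)][key] = value


def get(key: str):
    return stores[shard_for(key)].get(key)


for user_id in ("alice", "bob", "carol", "dave"):
    put(user_id, {"id": user_id})

# Show which keys ended up on which shard.
print({name: sorted(data) for name, data in stores.items()})
```

With this scheme, changing the number of shards remaps most keys, and any query that spans keys has to be fanned out and merged by the application. That is roughly the work an automated sharding layer takes off your hands.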
>If you don't have need for relational integrity, then MySQL is a liability. MongoDB does not have relational integrity which is one of the reasons why it's faster.
I have a client synchronization utility that uses MongoDB instead of SQLite because it's lightweight but I have all ease of use of MongoDB. Install it as a windows service with our installer and we are good to go.
And in this context, MySQL would be way too heavy.
Your comment is confusing. Can you please explain:
You replaced sqlite with mongodb, as an embedded database? How can mongodb run without a server?
Are you saying mongodb is more lightweight than sqlite? If so I find this very surprising, can you elaborate?
It's not an embedded scenario per se, but we use it like we would use sqlite. We install MongoDB as a service with our installer. But it's on the local machine, and only that machine (as our application runs as the server).
MongoDB isn't as light as sqlite, but for our usage the difference was negligible, with MongoDB running with around a 3MB memory footprint. In the end we get speed and the ease of use/flexibility of not having to deal with SQL or schemas.
Didn't consider it/didn't think of it. Main reason why we went with MongoDB was that it's fast, lightweight, and had a C# driver (unofficial) already written at the time. I'm sure those all apply to BDB also, but we knew more about MongoDB.
I think that one of the reasons that MongoDB is getting traction is that many people don't use relational databases in a relational way for many projects. Lots of data just isn't as relational as we hope it would be and also people are terrible at modeling data in a relational way.
Relational databases were designed around the idea of minimizing storage footprint, and we have nearly infinite storage capacity relative to many databases, so many devs don't care about only having one copy of a piece of data in the DB.
Also, SQL is great as a data retrieval language, but it is awful for inputting data. Yes, it works, but writing data to MongoDB in general has felt more natural than generating SQL to shotgun in data.
You could argue that ORMs solve a lot of the ugliness of inputting data into a DB, and I agree with you, but you still have to deal with table migrations. Being able to just add a field in your code, without having to go hold the database's hand or write a migration to make it work, is incredibly convenient.
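A rough sketch of the difference being described, using SQLite from the Python standard library so it runs anywhere. The table and field names are made up for illustration, and a JSON blob in a text column stands in for a schema-less document; this is not MongoDB's actual storage model.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Relational table: the column list is fixed up front.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# Storing a new attribute requires a migration first.
conn.execute("ALTER TABLE users ADD COLUMN twitter TEXT")
conn.execute("INSERT INTO users (name, twitter) VALUES (?, ?)", ("bob", "@bob"))

# Document-style table: one JSON blob per row, no fixed set of fields.
conn.execute("CREATE TABLE user_docs (id INTEGER PRIMARY KEY, body TEXT)")
docs = [
    {"name": "alice"},
    {"name": "bob", "twitter": "@bob"},  # new field, no migration needed
]
for doc in docs:
    conn.execute("INSERT INTO user_docs (body) VALUES (?)", (json.dumps(doc),))

relational_rows = conn.execute(
    "SELECT name, twitter FROM users ORDER BY id"
).fetchall()
document_rows = [
    json.loads(body)
    for (body,) in conn.execute("SELECT body FROM user_docs ORDER BY id")
]
print(relational_rows)
print(document_rows)
```

The convenience has a cost on the read side: code that consumes the documents has to cope with older records that simply lack the new field, whereas the migrated table gives every row a (possibly NULL) value.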
In the end, MongoDB solves a lot of convenience issues for programs that don't need relational data or programmers who don't want to use an ORM, create SQL strings, or write migrations to get their database to store their data.
It's not for everybody, but if it fits your needs, it solves some problems much more conveniently than MySQL.
I agree with your first comment: a lot of people suck at creating models that are relational. It's also difficult to use relational data outside of a relational system, for example once you retrieve it from your RDBMS and get it as, say, a Python dictionary. This is the big problem with using SQL for the web.
However, I disagree with your other statements:
(1) We do not have infinite storage. Most data stores keep everything in RAM, and that is pretty limited.
(2) It's really easy to input data into SQL. There are in fact a bunch of methods to do so (almost all still rely on SQL).
(3) (More of a comment than a disagreement) ORMs don't solve much. They just remove the need for engineering a data access API, at the cost of performance and cleanliness of code.
Lots of datastores don't keep everything in RAM though. CouchDB uses up a whole ton of disk, because disk is cheap and the way the database is structured the disk reads aren't even a problem for performance.
>also people are terrible at modeling data in a relational way.
While true, I don't think this is quite fair. Not all of our clients/employers are huge entities with business processes and data requirements that are stable for decades. Many, if not most of the projects that come across the desk of an IT developer are temporary projects with indefinite scope.
One of my current projects is modelling a rather complex set of intertwined business processes. Part of the problem is just getting the client to explain their existing processes completely. Another part of the problem is that the processes are always changing. As soon as I think I finally have all the data models figured out, there's one more tweak. I am not great at data modelling or SQL, but tweaking a dozen 50 line joins on a daily basis is not happy fun time for anyone. At some point the specced data model, the model as implemented, and the conceptual model that's in my head get out of sync in some mysterious way, the plot is lost and it all turns to shit. On top of this I'm also trying to manage an object model that sorta maybe corresponds to the data model. In cases where the spec changes frequently, the higher cognitive load of maintaining a decent relational model is simply not worth it.
Minimizing storage footprint was important, but the main selling point was data integrity, or at least data consistency. In large organizations, databases are still the most common point of integration between applications, and even a database that is only accessed by a single application tends to outlive that application as requirements and technologies change.
"Relational databases were designed around the idea of minimizing storage footprint, and we have nearly infinite storage capacity"
Well, foreign keys are not just a way to minimize storage: they also speed up future updates. If you duplicate your data everywhere for faster read access, your write cost just increased.
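A small sketch of that write-cost trade-off, again with SQLite from the standard library. The tables and the 1000-order figure are invented for the example; it only counts rows touched, not actual timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized: each order points at its customer by id.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id))"
)
# Denormalized: every order carries its own copy of the email.
conn.execute(
    "CREATE TABLE orders_flat (id INTEGER PRIMARY KEY, customer_email TEXT)"
)

conn.execute("INSERT INTO customers (id, email) VALUES (1, 'old@example.com')")
for _ in range(1000):
    conn.execute("INSERT INTO orders (customer_id) VALUES (1)")
    conn.execute(
        "INSERT INTO orders_flat (customer_email) VALUES ('old@example.com')"
    )

# Changing the email: one row in the normalized layout...
normalized = conn.execute(
    "UPDATE customers SET email = 'new@example.com' WHERE id = 1"
).rowcount
# ...versus every copy in the denormalized layout.
denormalized = conn.execute(
    "UPDATE orders_flat SET customer_email = 'new@example.com' "
    "WHERE customer_email = 'old@example.com'"
).rowcount

print(normalized, denormalized)
```

The flat table saves a join on every read, which is the appeal of duplicating data; the update is where the bill arrives.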
MongoDB is not the new MySQL, because software with as much inertia and adoption as MySQL will not see easy or even complete replacement. Case-in-point: IE6. IE6 is still with us today, as much as we hate it. You can make all the arguments for a modern browser that you want, but businesses and a few people say, "But I like it better."
Are there easy code migrations to MongoDB? No! Are there easy query migrations? Not really. It isn't a linear transition from one to the other. So not only do you have to rewrite your software, but you've got to pitch your SQL references out the window along with your queries.
I believe that Postgres will replace MySQL. It's mature, SQL-based and similar enough that only tweaks are needed to get a code base running. Oh, and it's free.
New projects may support MongoDB, but I'd be surprised if Wordpress ever came out with a version to support it.
Of course you can measure customer sentiment. You can even measure the correlation between your brand name and a selection of happy words as they appear on Twitter.
What you should probably not do, however, is elide the essential difference between these two measurements. Twitter is not a representative sample of anything but Twitter. Much of Twitter is spammers and shills, not customers. A lot of Tweets aren't even from humans. And people can mention MongoDB without knowing the slightest thing about it, and I am sure many do. I'm doing it now.
where we agree then, i think, is the fact that Twitter's just another datapoint. and from a statistical perspective, it cannot be considered representative of the wider population for any number of reasons, not least the fact that it's observational rather than a random sample.
that said, it's an interesting proxy for examining questions of sentiment, and its predictive ability has been examined several times academically within specific contexts (elections and markets - e.g. this one by Cornell http://arxiv.org/abs/1010.3003) and found to be relatively accurate.
so agreed, it's not representative. but that doesn't mean it's not interesting and potentially useful.
i tried to explicitly make the point that MongoDB is not a replacement for MySQL, it is rather following a similar path from a licensing, usage and market reaction standpoint. and that mongo's course within the nosql world may follow the pattern we saw mysql track w/in the relational database market.
from this comment, it doesn't seem like i was successful on that score.
Apparently I managed to overlook your statement about it not being a replacement. It's just that the title of the post seems to be in contradiction with that.
MongoDB is indeed an interesting piece of software and it's off to a promising start. I'll be interested to see where it's going. Maybe one day the title of your post will be true.
I think it's early to call a NoSQL winner/hammer now but I personally expected one eventually, http://www.quora.com/Will-there-be-a-new-de-facto-standard-o... although others aren't so sure. Just seems more convenient to use one database even if it's not an ideal fit for every task.
Not sure if I totally agree with the article, but MongoDB really is developer friendly: easy to set up and admin, slaves make great sources of analytics data (put a slave on each server that has analytics applications), useful for both large and small projects, etc.
Personally, PostgreSQL and MongoDB meet just about all of my non-graph data store needs. For graph data, I keep switching between Neo4j, Sesame, and AllegroGraph - can't make up my mind since they have different capabilities (Sesame and AG for fast indexed SPARQL queries, Neo4j for graph traversal).
The thing that I've found that makes MongoDB super easy for folks migrating from SQL is that it is easy to add indexes to arbitrary fields, and they work the same way as they do in MySQL. It's just a b-tree, and if you can set up MySQL indexes, then you can build smart, well-indexed structures and systems with Mongo.
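For anyone who hasn't watched a secondary index change a query plan, here is a minimal sketch of the SQL side of that comparison. SQLite is used as a stand-in only because it ships with Python; the table, the index name and the data are hypothetical, and the MongoDB equivalent is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

QUERY = "SELECT id, name FROM users WHERE email = ?"


def plan(sql: str) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + sql, ("user42@example.com",)
    ).fetchall()
    # The last column of each row is the human-readable detail string.
    return "; ".join(row[-1] for row in rows)


before = plan(QUERY)  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(QUERY)   # the planner can now search the index

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first plan is a scan of the table and the second mentions the new index by name.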
Being able to do this without jumping all the way into writing map-reduce bits, but saving the time of setting up a rigid schema, makes it easy to see why MongoDB is so popular. To me, that is why MongoDB is the new MySQL.
I am confused. How hard is it to create an index in MySQL? I don't have much experience with MySQL, but with PostgreSQL, it is very easy.
In addition, there are some "advanced" features that may not be provided by MongoDB: you can choose your index type, rebuild the index, update the stats that will help the query optimizer decide if the index should be used at all, etc.
It's not hard, but it's something no-sql databases aren't terribly good at, making it a differentiator for MongoDB. In SQL everything ends up being an exercise in joins, in No-SQL everything ends up being an exercise in map-reduce. The ability to create indexes, like you can easily do in SQL databases, is a handy feature when you have relational or quasi-relational data.
It's not that the actual command you enter to create an index is hard. It's just that when you have a table with millions or billions of rows, you have to figure out how to create the index without stopping the world for hours.
With PostgreSQL, CREATE INDEX does not block reads, only writes. You can also use the CONCURRENTLY option so that writes/updates aren't blocked either, but the index creation takes longer (the db has to wait until every transaction that could potentially use the index has finished).
Not sure what other strategies exist to reliably build an index.
The irony of this is that it's actually quite easy to build a document store on top of PostgreSQL that performs very well indeed. I tried it. I used a similar approach to the one described at [1]. You get the advantage of years of experience (never underestimate this!), MVCC, transactions, consistency and replication as well.
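A rough sketch of what such a document store can look like, assuming the entity-table-plus-index-table layout associated with the FriendFeed article mentioned elsewhere in this thread. SQLite stands in for PostgreSQL so the example runs with only the standard library, and every table and function name here is hypothetical.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Documents live in one table as opaque JSON blobs.
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
# One hand-maintained index table per field we want to query on.
conn.execute(
    "CREATE TABLE index_user_email (email TEXT NOT NULL, entity_id TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_email ON index_user_email (email)")


def save(doc: dict) -> str:
    """Store a document and its index entry in a single transaction."""
    entity_id = doc.setdefault("id", uuid.uuid4().hex)
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
            (entity_id, json.dumps(doc)),
        )
        # Drop any stale index row before writing the current one.
        conn.execute(
            "DELETE FROM index_user_email WHERE entity_id = ?", (entity_id,)
        )
        if "email" in doc:
            conn.execute(
                "INSERT INTO index_user_email (email, entity_id) VALUES (?, ?)",
                (doc["email"], entity_id),
            )
    return entity_id


def find_by_email(email: str) -> list:
    rows = conn.execute(
        "SELECT e.body FROM index_user_email i "
        "JOIN entities e ON e.id = i.entity_id WHERE i.email = ?",
        (email,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


save({"name": "alice", "email": "alice@example.com", "tags": ["admin"]})
save({"name": "bob"})  # no email, so no index row
print(find_by_email("alice@example.com"))
```

Because the document and its index row are written in the same transaction, a reader never sees one without the other, which is the property you give up when the index tables are updated asynchronously.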
(yes I know it doesn't "perform" as well as the other NoSQL stores but performance is not without tradeoffs [2])
To me, the biggest advantage of MongoDB is autosharding. If you read the friendfeed article, you'll see that they also have to deal with 'eventual consistency' on the indexes.
I'd say that's not necessarily an advantage. Sharding is incredibly complicated to get right considering all factors such as balancing and recovery.
I'd rather partition the data based on function onto distinct clusters, you know like eBay do.
You can update the indexes in a transaction too, so that's not necessarily an issue. MySQL has problems with this due to locking but the MVCC implementation in PostgreSQL allows much better concurrency.
it's not either one of you, it's our setup. that and the fact that i neglected to directly cache the page before it hit, so our server is about to burst into flames.
soon as i can get back in to shut things down and cache everything it'll go back to normal. ish.
A decade later, MySQL – a feature-poor database relative to the commercial alternatives at the time – was the most popular relational database on the planet.
I would have thought the most popular relational database on the planet would be SQLite... since it ships with every Android and iOS device, and with Firefox and Chrome.
I like and use both mongo and mysql. Where I think mysql shines is when you've been working on an application for 5 years, and end up connecting every piece to every other piece in ways you never imagined doing at the start. I think that's a lot trickier to do well in mongo, which favors denormalization since it's a document store. Given that most DBs do not actually grow beyond what you can put on a single box, I suspect a lot of people may be avoiding problems they don't have while picking up problems they don't need by blindly switching to mongo.
That said, when you have a problem that mongo solves well, boy howdy is it nice.
The article doesn't mention it, but one of the biggest differences is scalability. MongoDB supports horizontal scaling as a built-in feature. It's known that MySQL can be scaled out by sharding, but you lose ACID with it.
Of course, scalability is required only when the service goes well. But it's important to consider it if you intend to be successful.
It can be said that a scaled-out MySQL is itself a NoSQL store. From this point of view, MongoDB is the new MySQL.
According to the chart in the linked article (http://www.flickr.com/photos/sog/5909237515/), MySQL has had 73 (I assume distinct) committers in the past 12 months, but the most recent commit was over one year ago.
I'll admit that I'm not familiar with Ohloh, but I don't see how both of those statements can be true.
MySQL and MongoDB have strengths and weaknesses based on use case. MongoDB isn't a drop in replacement for MySQL and you probably wouldn't choose it for something like a message queue.
They're really different tools with different use-cases; I'm not sure it makes sense to directly compare them.
Redis is a fast key-value store with the nice property that values can be things like lists and sets instead of just plain strings.
MongoDB is a full-on document store for JSON-ish documents with support for complex ad hoc javascript queries over those documents, indexing, map-reduce style aggregation, and other fun arbitrarily complex search and collection scenarios that don't map into a simple key-value look up system.
Ah, fair enough. That's pretty cool. I find that if I want to store something complex like a dict/hash in redis, I simply encode it in JSON and store it as a string.
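A minimal sketch of that JSON-in-a-string pattern. To keep it runnable without a server, a plain dict stands in for the Redis connection, so `fake_redis`, `set_json` and `get_json` are made-up names and not part of any Redis client.

```python
import json

# A plain dict stands in for a Redis connection here: keys and values are
# both strings, mirroring basic string SET/GET.
fake_redis = {}


def set_json(key: str, value: dict) -> None:
    fake_redis[key] = json.dumps(value)


def get_json(key: str):
    raw = fake_redis.get(key)
    return None if raw is None else json.loads(raw)


set_json("user:42", {"name": "alice", "langs": ["python", "php"]})

# Changing one field means read, decode, modify, encode, write back.
profile = get_json("user:42")
profile["langs"].append("javascript")
set_json("user:42", profile)

print(get_json("user:42"))
```

This works fine for whole-object reads and writes. What it can't do is let the store look inside the value, so querying or indexing on a field of the encoded dict is where a document store starts to earn its keep.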