We offer a platform for video and audio clips, photos and vector graphics. We started with MySQL as the database backend and recently added MongoDB for storing all meta-information of the files, because MongoDB fits the requirements better. For example: photos may have Exif information, and videos may have audio tracks whose meta-information we want to store, too. Videos and vector graphics don't share any common meta-information, and so on, so I know that MongoDB is perfect for storing this unstructured data and keeping it searchable.
However, we continue developing our platform and adding features. One of the next steps will be providing a forum for our users. The question that now arises is: should we use the MySQL database, which would be a good choice for storing forums, forum posts, etc., or use MongoDB for this, too?
So the question is: when should one use MongoDB and when an RDBMS? Which would you pick, MongoDB or MySQL, if you had the choice, and why?
Note that Mongo essentially stores JSON (strictly speaking BSON, a binary JSON-like format). If your app deals with a lot of JS objects (with nesting) and you want to persist these objects, then there is a very strong argument for using Mongo. It makes your DAL and MVC layers ultra thin, because they are not unpacking all the JS object properties and trying to force-fit them into a structure (schema) that they don't naturally fit into.
We have a system that has several complex JS Objects at its heart, and we love Mongo because we can persist everything really, really easily. Our objects are also rather amorphous and unstructured, and Mongo soaks up that complication without blinking. We have a custom reporting layer that deciphers the amorphous data for human consumption, and that wasn't that difficult to develop.
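To make the "force-fitting" point concrete, here is a minimal sketch in Python using only the standard library. It is not MongoDB or MySQL code: `json` stands in for a document store and an in-memory SQLite database stands in for an RDBMS. The `video` object and all table and field names are hypothetical.

```python
import json
import sqlite3

# Hypothetical metadata for one video; the field names are illustrative only.
video = {
    "file_id": 42,
    "kind": "video",
    "container": "mp4",
    "audio_tracks": [
        {"codec": "aac", "channels": 2, "language": "en"},
        {"codec": "ac3", "channels": 6, "language": "de"},
    ],
}

# Document style: the nested object is stored and read back as one unit.
stored = json.dumps(video)
restored = json.loads(stored)

# Relational style: the same object has to be unpacked into two tables.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE video (file_id INTEGER PRIMARY KEY, container TEXT)")
db.execute(
    "CREATE TABLE audio_track ("
    "file_id INTEGER REFERENCES video(file_id), "
    "codec TEXT, channels INTEGER, language TEXT)"
)
db.execute(
    "INSERT INTO video VALUES (?, ?)", (video["file_id"], video["container"])
)
db.executemany(
    "INSERT INTO audio_track VALUES (?, ?, ?, ?)",
    [
        (video["file_id"], t["codec"], t["channels"], t["language"])
        for t in video["audio_tracks"]
    ],
)

# Reading it back needs a join plus manual reassembly of the nesting.
rows = db.execute(
    "SELECT v.container, a.codec, a.channels, a.language "
    "FROM video v JOIN audio_track a ON a.file_id = v.file_id "
    "ORDER BY a.rowid"
).fetchall()
rebuilt = {
    "file_id": video["file_id"],
    "kind": "video",
    "container": rows[0][0],
    "audio_tracks": [
        {"codec": codec, "channels": channels, "language": language}
        for _, codec, channels, language in rows
    ],
}
```

Both paths end up with the same object, but the relational one needs a schema, two inserts, a join and a reassembly step for every nested structure. That is the extra layer the answer above is glad to skip.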
You know, all this stuff about the joins and the 'complex transactions' -- but it was Monty himself who, many years ago, explained away the "need" for COMMIT / ROLLBACK, saying that 'all that is done in the logic classes (and not the database) anyway' -- so it's the same thing all over again. What is needed is a dumb yet incredibly tidy and fast data storage/retrieval engine, for 99% of what the web apps do.
After two years of using MongoDB for a social app, I have witnessed what it really means to live without a SQL RDBMS.
I believe that 98% of all projects probably are way better with a typical SQL RDBMS than with NoSQL.
I've seen a lot of companies using MongoDB for real-time analytics on application logs. Its schema-freeness is a really good fit for application logs, where the record schema tends to change from time to time. Its capped collection feature is also useful, because a capped collection has a fixed size and automatically discards the oldest data, so the data can be kept small enough to fit into memory.
That is one area where I really think MongoDB fits, but in general I would recommend MySQL or PostgreSQL. There is a lot of documentation and there are many developer resources for them on the web, and they are functional and robust.
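The capped-collection behaviour described above can be pictured with a small in-process analogy. This is not the MongoDB API: it is a sketch using Python's `collections.deque` with a `maxlen`, which drops the oldest entry once the limit is reached. A real capped collection is bounded by a size in bytes, with an optional limit on the number of documents. The log fields here are hypothetical.

```python
from collections import deque

# Keep at most 3 entries; appending a 4th silently drops the oldest one.
capped_log = deque(maxlen=3)

for n in range(1, 6):
    # The fields differ between entries, as application log records often do.
    entry = {"seq": n, "level": "info"}
    if n % 2 == 0:
        entry["user_id"] = 100 + n  # hypothetical extra field on some records
    capped_log.append(entry)

remaining = [e["seq"] for e in capped_log]
print(remaining)  # [3, 4, 5]
```

Note that the entries do not share one shape, and nothing had to be declared up front for the extra `user_id` field. That is the schema-free property the answer refers to.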
As said previously, there are a lot of options to choose from. Take a look at this comparison: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
What I suggest is to find your best combination:
MySQL + Memcache is really great if you need ACID and you want to join some tables.
MongoDB + Redis is perfect for a document store.
Neo4J is perfect for a graph database.
What I do: I start with MySQL + Memcache because I'm used to them, and then I bring in other database systems as needed. In a single project you can combine MySQL and MongoDB, for instance!
The two main reasons why you might prefer Mongo are:
It is suitable for big data applications. A traditional RDBMS is generally harder to scale horizontally to big data volumes.
I would say use an RDBMS if you need complex transactions. Otherwise I would go with MongoDB - more flexible to work with and you know it can scale when you need to. (I'm biased though - I work on the MongoDB project)
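As an illustration of what a "complex transaction" buys you, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for MySQL. The `thread` and `post` tables and the `move_post` helper are hypothetical. The point is that several related updates either all happen or none do.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE thread (id INTEGER PRIMARY KEY, post_count INTEGER NOT NULL)")
db.execute(
    "CREATE TABLE post ("
    "id INTEGER PRIMARY KEY, "
    "thread_id INTEGER NOT NULL REFERENCES thread(id), "
    "body TEXT)"
)
db.execute("INSERT INTO thread VALUES (1, 1)")
db.execute("INSERT INTO thread VALUES (2, 0)")
db.execute("INSERT INTO post VALUES (10, 1, 'hello')")
db.commit()


def move_post(conn, post_id, src, dst, fail=False):
    """Move a post between threads and keep both counters in sync."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE post SET thread_id = ? WHERE id = ?", (dst, post_id)
            )
            conn.execute(
                "UPDATE thread SET post_count = post_count - 1 WHERE id = ?", (src,)
            )
            if fail:
                # Simulate a crash between the two counter updates.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE thread SET post_count = post_count + 1 WHERE id = ?", (dst,)
            )
        return True
    except RuntimeError:
        return False


def snapshot(conn):
    """Return (thread of post 10, post_count of thread 1, post_count of thread 2)."""
    thread_of_post = conn.execute(
        "SELECT thread_id FROM post WHERE id = 10"
    ).fetchone()[0]
    counts = [r[0] for r in conn.execute("SELECT post_count FROM thread ORDER BY id")]
    return (thread_of_post, counts[0], counts[1])


failed = move_post(db, 10, 1, 2, fail=True)
after_failure = snapshot(db)   # the half-finished move was rolled back
moved = move_post(db, 10, 1, 2)
after_success = snapshot(db)   # all three updates were applied together
```

If your forum needs this kind of all-or-nothing update across several records, that is the case where the answer above would reach for an RDBMS.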
Who needs distributed, sharded forums? Maybe Facebook, but unless you're creating a Facebook competitor, just use MySQL, Postgres or whatever you are most comfortable with. If you want to try MongoDB, ok, but don't expect it to do magic for you. It'll have its quirks and general nastiness, just like everything else, as I'm sure you've already discovered if you really have been working with it already.
Sure, MongoDB may be hyped and seem easy on the surface, but you'll run into problems which more mature products have already overcome. Don't be lured so easily, but rather wait until "nosql" matures, or dies.
Personally, I think "nosql" will wither and die from fragmentation, as there are no set standards (almost by definition). So I will not personally bet on it for any long-term projects.
The only thing that can save "nosql" in my book is if it can integrate into Ruby or similar languages seamlessly and make the language "persistent", almost without any overhead in coding and design. That may come to pass, but I'll wait until then, not now, AND it needs to be more mature, of course.
Btw, why are you creating a forum from scratch? There are tons of open source forums which can be tweaked to fit most requirements, unless you really are creating The Next Generation of Forums (which I doubt).
"... to store this unstructured data"
As you said, MongoDB is well suited to storing unstructured data, and it organizes your data into documents. These RDBMS alternatives, called NoSQL data stores (MongoDB, CouchDB, Voldemort), are very useful for applications that scale massively and require fast data access from these big data stores.
The implementation of these databases is also simpler than that of a regular RDBMS, since they store simple key-value pairs or document-style binary objects serialized directly to disk. These data stores typically don't enforce the ACID properties or any schema, and offer little or no transaction support. This is what lets them scale big and achieve faster access (both read and write).
In contrast, an RDBMS enforces ACID and a schema on the data. If you want to work with structured data, you can go ahead with an RDBMS.
I would choose MySQL for creating a forum like this, because it is not going to scale big, and it is a very simple (common) application with structured relations among the data.
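The "structured relations" of a forum might look roughly like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server. The `member`, `forum` and `post` tables and their columns are assumptions made for illustration, not a recommended schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
db.executescript("""
CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE forum  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE post (
    id        INTEGER PRIMARY KEY,
    forum_id  INTEGER NOT NULL REFERENCES forum(id),
    author_id INTEGER NOT NULL REFERENCES member(id),
    body      TEXT NOT NULL
);
""")

db.executemany("INSERT INTO member VALUES (?, ?)", [(1, "alice"), (2, "bob")])
db.execute("INSERT INTO forum VALUES (1, 'Vector graphics')")
db.executemany(
    "INSERT INTO post VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "How do I export SVG?"),
        (2, 1, 2, "Use the export menu."),
        (3, 1, 1, "Thanks!"),
    ],
)

# A join answers "who posted how often in this forum?" in a single query.
posts_per_author = db.execute(
    "SELECT m.name, COUNT(*) "
    "FROM post p JOIN member m ON m.id = p.author_id "
    "WHERE p.forum_id = ? "
    "GROUP BY m.name ORDER BY m.name",
    (1,),
).fetchall()
print(posts_per_author)  # [('alice', 2), ('bob', 1)]

# The schema itself rejects a post that points at a forum that does not exist.
try:
    db.execute("INSERT INTO post VALUES (4, 99, 1, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Every post has the same shape and always belongs to exactly one forum and one author, which is why a fixed schema with joins is a natural fit here.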
Source: Stackoverflow.com