[database] NoSql vs Relational database

Recently NoSQL has gained immense popularity.

What are the advantages of NoSQL over traditional RDBMS?


Just adding to all the information given above

NoSQL advantages:

1) NoSQL is a good fit if you want to be production-ready quickly, because it supports a schema-less, object-oriented data model.

2) Many NoSQL databases are eventually consistent. In simple terms, reads do not take locks on the data (documents) the way they often do in an RDBMS, so a snapshot of the data is always available to read, although it may briefly lag behind the latest write. This reduces the latency of your application.

3) Some document stores, CouchDB for example, use an MVCC (multi-version concurrency control) strategy to create and maintain snapshots of the data (documents).

4) If you want indexed data, you can create a view; the database then automatically indexes the data according to the view definition you provide.
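Points 2 to 4 can be illustrated with a minimal in-memory sketch. `TinyDocStore`, `ConflictError`, and the `based_on` parameter are hypothetical names invented for this example, not the API of any real database. The sketch assumes a CouchDB-like model in which every write appends a new revision, a writer must name the revision it read, and a view is built from a map function. A real store maintains the view index incrementally; this one rebuilds it on each call.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale revision."""


class TinyDocStore:
    """Hypothetical in-memory document store; every write appends a revision."""

    def __init__(self):
        # doc_id -> list of revisions, oldest first; nothing is overwritten
        self._revisions = {}

    def get(self, doc_id):
        """Return (revision_number, copy of the latest document). Takes no lock."""
        history = self._revisions[doc_id]
        return len(history), dict(history[-1])

    def put(self, doc_id, doc, based_on=0):
        """Append a new revision only if the writer saw the latest one."""
        history = self._revisions.get(doc_id, [])
        if based_on != len(history):
            raise ConflictError(
                f"{doc_id}: latest rev is {len(history)}, write was based on {based_on}"
            )
        self._revisions[doc_id] = history + [dict(doc)]
        return len(history) + 1

    def revision_count(self, doc_id):
        """Number of stored revisions for one document."""
        return len(self._revisions[doc_id])

    def view(self, map_fn):
        """Index the latest revision of every document by the keys map_fn emits."""
        index = {}
        for doc_id, history in self._revisions.items():
            for key in map_fn(history[-1]):
                index.setdefault(key, []).append(doc_id)
        return index


store = TinyDocStore()
store.put("user:1", {"name": "Ann", "city": "Oslo"})   # creates revision 1
store.put("user:2", {"name": "Bob", "city": "Rome"})   # creates revision 1

rev, doc = store.get("user:1")
doc["city"] = "Rome"
new_rev = store.put("user:1", doc, based_on=rev)        # creates revision 2

try:
    # A second writer that still holds revision 1 is rejected, not blocked.
    store.put("user:1", {"name": "Ann", "city": "Paris"}, based_on=rev)
    conflict_detected = False
except ConflictError:
    conflict_detected = True

# The "view definition" here is just a function that emits index keys.
by_city = store.view(lambda d: [d["city"]])
```

Readers never wait for writers in this model; the cost is that old revisions stay around until something removes them.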

NoSQL disadvantages:

1) It's definitely not suitable for big, heavily transactional applications, because it is eventually consistent and typically does not provide full ACID guarantees.

2) Because it uses MVCC for concurrency control, it creates multiple snapshots (revisions) of your data (documents). As a result, space is consumed faster, which makes compaction, and hence reindexing, more frequent. This slows down your application's response as its data and transaction volume grow. To counter that you can scale horizontally by adding nodes, but that again costs more than a SQL database.
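The storage growth described in point 2 can be sketched in a few lines. This is an illustration only: `compact` and `stored_revisions` are hypothetical helpers, and the assumption is that compaction simply discards all but the newest revisions of each document.

```python
def stored_revisions(revisions):
    """Total number of revisions currently held for all documents."""
    return sum(len(history) for history in revisions.values())


def compact(revisions, keep=1):
    """Return a new mapping that keeps only the newest `keep` revisions per document."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    return {doc_id: history[-keep:] for doc_id, history in revisions.items()}


# One cart updated five times and one cart written once.
history = {
    "cart:7": [{"items": n} for n in range(1, 6)],
    "cart:8": [{"items": 1}],
}

before = stored_revisions(history)
compacted = compact(history)
after = stored_revisions(compacted)
```

Five updates to one document leave five stored copies until compaction runs, which is why write-heavy workloads need it more often.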


An RDBMS focuses more on the relationships between data, while NoSQL focuses more on storage.

You can consider using NoSQL when your RDBMS reaches its bottlenecks. Used alongside an RDBMS, NoSQL makes the overall system more flexible.


The biggest advantage of NoSQL over an RDBMS is scalability.
NoSQL databases can easily scale out to many nodes, which is much harder for an RDBMS.
Scalability gives you not only more storage space but also much higher performance, since many hosts work at the same time.
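As a rough sketch of what scaling out means, the example below spreads keys over several nodes by hashing each key. `ShardedStore` and `shard_for` are hypothetical names, and each "node" is just a Python dict standing in for a separate host.

```python
import hashlib


def shard_for(key, node_count):
    """Map a key to a node index using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class ShardedStore:
    """Hypothetical key/value store that partitions keys across nodes."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key, value):
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key):
        return self.nodes[shard_for(key, len(self.nodes))][key]


cluster = ShardedStore(node_count=4)
for i in range(1000):
    cluster.put(f"user:{i}", {"id": i})

# Each node holds only part of the data, so storage and load are shared.
keys_per_node = [len(node) for node in cluster.nodes]
```

Simple modulo placement like this moves most keys whenever the node count changes, which is why production systems tend to use consistent hashing or range partitioning instead.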


The history seems to look like this:

  1. Google needs a storage layer for their inverted search index. They figure a traditional RDBMS is not going to cut it, so they implement a NoSQL data store, BigTable, on top of their GFS file system. The key point is that thousands of cheap commodity machines provide the speed and the redundancy.

  2. Everyone else realizes what Google just did.

  3. Brewer's CAP theorem is proven. All RDBMS systems in common use are CA systems. People begin playing with CP and AP systems as well. K/V stores are vastly simpler, so they are the primary vehicle for the research.

  4. Software-as-a-service systems in general do not provide an SQL-like store. Hence, people get more interested in the NoSQL type stores.

I think much of the take-off can be related to this history. Scaling Google took some new ideas at Google and everyone else follows suit because this is the only solution they know to the scaling problem right now. Hence, you are willing to rework everything around the distributed database idea of Google because it is the only way to scale beyond a certain size.

C - Consistency
A - Availability
P - Partition tolerance
K/V - Key/Value


NOSQL has no special advantages over the relational database model. NOSQL does address certain limitations of current SQL DBMSs but it doesn't imply any fundamentally new capabilities over previous data models.

NOSQL means only no SQL (or "not only SQL") but that doesn't mean the same as no relational. A relational database in principle would make a very good NOSQL solution - it's just that none of the current set of NOSQL products uses the relational model.


NoSQL is better than an RDBMS because of the following properties of NoSQL:

  1. It supports semi-structured data and volatile data
  2. It does not have schema
  3. Read/Write throughput is very high
  4. Horizontal scalability can be achieved easily
  5. Supports big data in volumes of terabytes and petabytes
  6. Provides good support for analytic tools on top of big data
  7. Can be hosted in cheaper hardware machines
  8. In-memory caching option is available to increase the performance of queries
  9. Faster development life cycles for developers

EDIT:

To answer "why RDBMS cannot scale", please take a look at the RDBMS Overheads pdf written by Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden and Michael Stonebraker.

RDBMSs have trouble handling huge data volumes of terabytes and petabytes. Even with a Redundant Array of Independent/Inexpensive Disks (RAID) and data sharding, an RDBMS does not scale well to a huge volume of data. You need very expensive hardware.

Logging: Assembling log records and tracking down all changes in database structures slows performance. Logging may not be necessary if recoverability is not a requirement or if recoverability is provided through other means (e.g., other sites on the network).

Locking: Traditional two-phase locking poses a sizeable overhead since all accesses to database structures are governed by a separate entity, the Lock Manager.

Latching: In a multi-threaded database, many data structures have to be latched before they can be accessed. Removing this feature and going to a single-threaded approach has a noticeable performance impact.

Buffer management: A main memory database system does not need to access pages through a buffer pool, eliminating a level of indirection on every record access.

This does not mean that we have to use NoSQL over SQL.

Still, an RDBMS is better than NoSQL because of the following properties of an RDBMS:

  1. Transactions with ACID properties - Atomicity, Consistency, Isolation & Durability
  2. Adherence to a strong schema for the data being written/read
  3. Real-time query management (for data sizes under 10 terabytes)
  4. Execution of complex queries involving join & group by clauses

We have to choose between RDBMS (SQL) and NoSQL (Not only SQL) depending on the business case & requirements.


If you need to process a huge amount of data with high performance

OR

If the data model is not predetermined

then

a NoSQL database is a better choice.
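The "data model is not predetermined" case can be shown with a small contrast. This is only a sketch: a plain dict of JSON strings stands in for a document store, SQLite stands in for a relational database, and the `products` data is invented for the example.

```python
import json
import sqlite3

# Document style: records with different shapes sit side by side, no schema needed.
documents = {}
documents["p1"] = json.dumps({"name": "Lamp", "price": 20})
documents["p2"] = json.dumps(
    {"name": "Desk", "price": 150, "dimensions": {"w": 120, "d": 60}}
)

# Relational style: the table definition comes first.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id TEXT PRIMARY KEY, name TEXT NOT NULL, price INTEGER NOT NULL)"
)
conn.execute("INSERT INTO products VALUES ('p1', 'Lamp', 20)")

try:
    # A field the schema does not know about is rejected.
    conn.execute(
        "INSERT INTO products (id, name, price, width) VALUES ('p2', 'Desk', 150, 120)"
    )
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True

# The schema has to be changed before the new field can be stored.
conn.execute("ALTER TABLE products ADD COLUMN width INTEGER")
conn.execute(
    "INSERT INTO products (id, name, price, width) VALUES ('p2', 'Desk', 150, 120)"
)
rows = conn.execute("SELECT id, name, price, width FROM products ORDER BY id").fetchall()
```

The flip side is that the relational table guarantees every row has a name and a price, while the document side leaves that check to the application.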


From mongodb.com:

NoSQL databases differ from older, relational technology in four main areas:

Data models: A NoSQL database lets you build an application without having to define the schema first unlike relational databases which make you define your schema before you can add any data to the system. No predefined schema makes NoSQL databases much easier to update as your data and requirements change.

Data structure: Relational databases were built in an era where data was fairly structured and clearly defined by their relationships. NoSQL databases are designed to handle unstructured data (e.g., texts, social media posts, video, email) which makes up much of the data that exists today.

Scaling: It’s much cheaper to scale a NoSQL database than a relational database because you can add capacity by scaling out over cheap, commodity servers. Relational databases, on the other hand, require a single server to host your entire database. To scale, you need to buy a bigger, more expensive server.

Development model: NoSQL databases are open source whereas relational databases typically are closed source with licensing fees baked into the use of their software. With NoSQL, you can get started on a project without any heavy investments in software fees upfront.

