[database] Difference between scaling horizontally and vertically for databases

I have come across many NoSQL databases and SQL databases. There are varying parameters to measure the strength and weaknesses of these databases and scalability is one of them. What is the difference between horizontally and vertically scaling these databases?

This question is related to database database-design nosql scalability

The answer is


Scaling horizontally ===> Thousands of minions will do the work together for you.

Scaling vertically ===> One big hulk will do all the work for you.

enter image description here


Yes scaling horizontally means adding more machines, but it also implies that the machines are equal in the cluster. MySQL can scale horizontally in terms of Reading data, through the use of replicas, but once it reaches capacity of the server mem/disk, you have to begin sharding data across servers. This becomes increasingly more complex. Often keeping data consistent across replicas is a problem as replication rates are often too slow to keep up with data change rates.

Couchbase is also a fantastic NoSQL Horizontal Scaling database, used in many commercial high availability applications and games and arguably the highest performer in the category. It partitions data automatically across cluster, adding nodes is simple, and you can use commodity hardware, cheaper vm instances (using Large instead of High Mem, High Disk machines at AWS for instance). It is built off the Membase (Memcached) but adds persistence. Also, in the case of Couchbase, every node can do reads and writes, and are equals in the cluster, with only failover replication (not full dataset replication across all servers like in mySQL).

Performance-wise, you can see an excellent Cisco benchmark: http://blog.couchbase.com/understanding-performance-benchmark-published-cisco-and-solarflare-using-couchbase-server

Here is a great blog post about Couchbase Architecture: http://horicky.blogspot.com/2012/07/couchbase-architecture.html


SQL databases like Oracle, db2 also support Horizontal scaling through Shared disk cluster. For example Oracle RAC, IBM DB2 purescale or Sybase ASE Cluster edition. New node can be added to Oracle RAC system or DB2 purescale system to achieve horizontal scaling.

But the approach is different from noSQL databases (like mongodb, CouchDB or IBM Cloudant) is that the data sharding is not part of Horizontal scaling. In noSQL databases data is shraded during horizontal scaling.


Adding lots of load balancers creates extra overhead and latency and that is the drawback for scaling out horizontally in nosql databases. It is like the question why people say RPC is not recommended since it is not robust.

I think in a real system we should use both sql and nosql databases to utilize both multicore and cloud computing capabilities of today's systems.

On the other hand, complex transactional queries has high performance if sql databases such as oracle being used. NoSql could be used for bigdata and horizontal scalability by sharding.


The accepted answer is spot on the basic definition of horizontal vs vertical scaling. But unlike the common belief that horizontal scaling of databases is only possible with Cassandra, MongoDB, etc I would like to add that horizontal scaling is also very much possible with any traditional RDMS; that too without using any third party solutions.

I know of many companies, specially SaaS based companies that do this. This is done using simple application logic. You basically take a set of users and divide them over multiple DB servers. So for example, you would typically have a "meta" database/table that would store clients, DB server/connection strings, etc and a table that stores client/server mapping.

Then simply direct requests from each client to the DB server they are mapped to.

Now some may say this is akin to horizontal partitioning and not "true" horizontal scaling and they will be right in some ways. But the end result is that you have scaled your DB over multiple Db servers.

The only difference between the two approaches to horizontal scaling is that one approach (MongoDB, etc) the scaling is done by the DB software itself. In that sense you are "buying" the scaling. In the other approach (for RDBMS horizontal scaling), the scaling is built by application code/logic.

Buy vs Build


Traditional relational databases were designed as client/server database systems. They can be scaled horizontally but the process to do so tends to be complex and error prone. NewSQL databases like NuoDB are memory-centric distributed database systems designed to scale out horizontally while maintaining the SQL/ACID properties of traditional RDBMS.

For more information on NuoDB, read their technical white paper.


You have a company and there is only 1 worker but you got 1 new project at that time you hire new candidate -- this is horizontal scaling. where new candidate is new machines and project is new traffic/calls to your api's.

Where as 1 project with an IIT/NIT guy handling all request to your api/traffic. If any time more request to your api's then fire him and replacing him with a high IQ NIT/IIT guy -- this is vertical scaling.


Let's start with the need for scaling that is increasing resources so that your system can now handle more requests than it earlier could.

When you realise your system is getting slow and is unable to handle the current number of requests, you need to scale the system.

This provides you with two options. Either you increase the resources in the server which you are using currently, i.e, increase the amount of RAM, CPU, GPU and other resources. This is known as vertical scaling.

Vertical scaling is typically costly. It does not make the system fault tolerant, i.e if you are scaling application running with single server, if that server goes down, your system will go down. Also the amount of threads remains the same in vertical scaling. Vertical scaling may require your system to go down for a moment when process takes place. Increasing resources on a server requires a restart and put your system down.

Another solution to this problem is increasing the amount of servers present in the system. This solution is highly used in the tech industry. This will eventually decrease the request per second rate in each server. If you need to scale the system, just add another server, and you are done. You would not be required to restart the system. Number of threads in each system decreases leading to high throughput. To segregate the requests, equally to each of the application server, you need to add load balancer which would act as reverse proxy to the web servers. This whole system can be called as a single cluster. Your system may contain a large number of requests which would require more amount of clusters like this.

Hope you get the whole concept of introducing scaling to the system.


There is an additional architecture that wasn't mentioned - SQL-based database services that enable horizontal scaling without the complexity of manual sharding. These services do the sharding in the background, so they enable you to run a traditional SQL database and scale out like you would with NoSQL engines like MongoDB or CouchDB. Two services I am familiar with are EnterpriseDB for PostgreSQL and Xeround for MySQL. I saw an in-depth post by Xeround which explains why scale-out on SQL databases is difficult and how they do it differently - treat this with a grain of salt as it is a vendor post. Also check out Wikipedia's Cloud Database entry, there is a nice explanation of SQL vs. NoSQL and service vs. self-hosted, a list of vendors and scaling options for each combination. ;)


Examples related to database

Implement specialization in ER diagram phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' cannot be loaded Room - Schema export directory is not provided to the annotation processor so we cannot export the schema SQL Query Where Date = Today Minus 7 Days MySQL Error: : 'Access denied for user 'root'@'localhost' SQL Server date format yyyymmdd How to create a foreign key in phpmyadmin WooCommerce: Finding the products in database TypeError: tuple indices must be integers, not str

Examples related to database-design

What are OLTP and OLAP. What is the difference between them? How to create a new schema/new user in Oracle Database 11g? What are the lengths of Location Coordinates, latitude and longitude? cannot connect to pc-name\SQLEXPRESS SQL ON DELETE CASCADE, Which Way Does the Deletion Occur? What are the best practices for using a GUID as a primary key, specifically regarding performance? "Prevent saving changes that require the table to be re-created" negative effects Difference between scaling horizontally and vertically for databases Using SQL LOADER in Oracle to import CSV file What is cardinality in Databases?

Examples related to nosql

Firestore Getting documents id from collection What is Hash and Range Primary Key? Mongodb: Failed to connect to 127.0.0.1:27017, reason: errno:10061 Explanation of JSONB introduced by PostgreSQL DynamoDB vs MongoDB NoSQL Querying DynamoDB by date Delete all nodes and relationships in neo4j 1.8 When to use CouchDB over MongoDB and vice versa Difference between scaling horizontally and vertically for databases NoSQL Use Case Scenarios or WHEN to use NoSQL

Examples related to scalability

Difference between scaling horizontally and vertically for databases Best way to store chat messages in a database? Does Django scale? A beginner's guide to SQL database design What is considered a good response time for a dynamic, personalized web application?