[sql-server] What are the differences between a clustered and a non-clustered index?

What are the differences between a clustered and a non-clustered index?

The answer is


Clustered indexes are stored physically on the table. This means they are the fastest and you can only have one clustered index per table.

Non-clustered indexes are stored separately, and you can have as many as you want.

The best option is to set your clustered index on the most used unique column, usually the PK. You should always have a well selected clustered index in your tables, unless a very compelling reason--can't think of a single one, but hey, it may be out there--for not doing so comes up.


A clustered index is essentially a sorted copy of the data in the indexed columns.

The main advantage of a clustered index is that when your query (seek) locates the data in the index then no additional IO is needed to retrieve that data.

The overhead of maintaining a clustered index, especially in a frequently updated table, can lead to poor performance and for that reason it may be preferable to create a non-clustered index.


Clustered basically means that the data is in that physical order in the table. This is why you can have only one per table.

Unclustered means it's "only" a logical order.


Clustered indexes are stored physically on the table. This means they are the fastest and you can only have one clustered index per table.

Non-clustered indexes are stored separately, and you can have as many as you want.

The best option is to set your clustered index on the most used unique column, usually the PK. You should always have a well selected clustered index in your tables, unless a very compelling reason--can't think of a single one, but hey, it may be out there--for not doing so comes up.


A clustered index actually describes the order in which records are physically stored on the disk, hence the reason you can only have one.

A Non-Clustered Index defines a logical order that does not match the physical order on disk.


Clustered indexes physically order the data on the disk. This means no extra data is needed for the index, but there can be only one clustered index (obviously). Accessing data using a clustered index is fastest.

All other indexes must be non-clustered. A non-clustered index has a duplicate of the data from the indexed columns kept ordered together with pointers to the actual data rows (pointers to the clustered index if there is one). This means that accessing data through a non-clustered index has to go through an extra layer of indirection. However if you select only the data that's available in the indexed columns you can get the data back directly from the duplicated index data (that's why it's a good idea to SELECT only the columns that you need and not use *)


// Copied from MSDN, the second point of non-clustered index is not clearly mentioned in the other answers.

Clustered

  • Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be stored in only one order.
  • The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.

Nonclustered

  • Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and
    each key value entry has a pointer to the data row that contains the key value.
  • The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.

A clustered index is essentially a sorted copy of the data in the indexed columns.

The main advantage of a clustered index is that when your query (seek) locates the data in the index then no additional IO is needed to retrieve that data.

The overhead of maintaining a clustered index, especially in a frequently updated table, can lead to poor performance and for that reason it may be preferable to create a non-clustered index.


Clustered basically means that the data is in that physical order in the table. This is why you can have only one per table.

Unclustered means it's "only" a logical order.


Pros:

Clustered indexes work great for ranges (e.g. select * from my_table where my_key between @min and @max)

In some conditions, the DBMS will not have to do work to sort if you use an orderby statement.

Cons:

Clustered indexes are can slow down inserts because the physical layouts of the records have to be modified as records are put in if the new keys are not in sequential order.


A clustered index actually describes the order in which records are physically stored on the disk, hence the reason you can only have one.

A Non-Clustered Index defines a logical order that does not match the physical order on disk.


Clustered basically means that the data is in that physical order in the table. This is why you can have only one per table.

Unclustered means it's "only" a logical order.


A clustered index actually describes the order in which records are physically stored on the disk, hence the reason you can only have one.

A Non-Clustered Index defines a logical order that does not match the physical order on disk.


Clustered basically means that the data is in that physical order in the table. This is why you can have only one per table.

Unclustered means it's "only" a logical order.


Pros:

Clustered indexes work great for ranges (e.g. select * from my_table where my_key between @min and @max)

In some conditions, the DBMS will not have to do work to sort if you use an orderby statement.

Cons:

Clustered indexes are can slow down inserts because the physical layouts of the records have to be modified as records are put in if the new keys are not in sequential order.


Clustered indexes are stored physically on the table. This means they are the fastest and you can only have one clustered index per table.

Non-clustered indexes are stored separately, and you can have as many as you want.

The best option is to set your clustered index on the most used unique column, usually the PK. You should always have a well selected clustered index in your tables, unless a very compelling reason--can't think of a single one, but hey, it may be out there--for not doing so comes up.


You might have gone through theory part from the above posts:

-The clustered Index as we can see points directly to record i.e. its direct so it takes less time for a search. Additionally it will not take any extra memory/space to store the index

-While, in non-clustered Index, it indirectly points to the clustered Index then it will access the actual record, due to its indirect nature it will take some what more time to access.Also it needs its own memory/space to store the index

enter image description here


Pros:

Clustered indexes work great for ranges (e.g. select * from my_table where my_key between @min and @max)

In some conditions, the DBMS will not have to do work to sort if you use an orderby statement.

Cons:

Clustered indexes are can slow down inserts because the physical layouts of the records have to be modified as records are put in if the new keys are not in sequential order.


A clustered index is essentially a sorted copy of the data in the indexed columns.

The main advantage of a clustered index is that when your query (seek) locates the data in the index then no additional IO is needed to retrieve that data.

The overhead of maintaining a clustered index, especially in a frequently updated table, can lead to poor performance and for that reason it may be preferable to create a non-clustered index.


Clustered Index

  1. There can be only one clustered index for a table.
  2. Usually made on the primary key.
  3. The leaf nodes of a clustered index contain the data pages.

Non-Clustered Index

  1. There can be only 249 non-clustered indexes for a table(till sql version 2005 later versions support upto 999 non-clustered indexes).
  2. Usually made on the any key.
  3. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.

Clustered Index

  1. There can be only one clustered index for a table.
  2. Usually made on the primary key.
  3. The leaf nodes of a clustered index contain the data pages.

Non-Clustered Index

  1. There can be only 249 non-clustered indexes for a table(till sql version 2005 later versions support upto 999 non-clustered indexes).
  2. Usually made on the any key.
  3. The leaf node of a nonclustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.

Clustered Index

  • Only one clustered index can be there in a table
  • Sort the records and store them physically according to the order
  • Data retrieval is faster than non-clustered indexes
  • Do not need extra space to store logical structure

Non Clustered Index

  • There can be any number of non-clustered indexes in a table
  • Do not affect the physical order. Create a logical order for data rows and use pointers to physical data files
  • Data insertion/update is faster than clustered index
  • Use extra space to store logical structure

Apart from these differences you have to know that when table is non-clustered (when the table doesn't have a clustered index) data files are unordered and it uses Heap data structure as the data structure.


An indexed database has two parts: a set of physical records, which are arranged in some arbitrary order, and a set of indexes which identify the sequence in which records should be read to yield a result sorted by some criterion. If there is no correlation between the physical arrangement and the index, then reading out all the records in order may require making lots of independent single-record read operations. Because a database may be able to read dozens of consecutive records in less time than it would take to read two non-consecutive records, performance may be improved if records which are consecutive in the index are also stored consecutively on disk. Specifying that an index is clustered will cause the database to make some effort (different databases differ as to how much) to arrange things so that groups of records which are consecutive in the index will be consecutive on disk.

For example, if one were to start with an empty non-clustered database and add 10,000 records in random sequence, the records would likely be added at the end in the order they were added. Reading out the database in order by the index would require 10,000 one-record reads. If one were to use a clustered database, however, the system might check when adding each record whether the previous record was stored by itself; if it found that to be the case, it might write that record with the new one at the end of the database. It could then look at the physical record before the slots where the moved records used to reside and see if the record that followed that was stored by itself. If it found that to be the case, it could move that record to that spot. Using this sort of approach would cause many records to be grouped together in pairs, thus potentially nearly doubling sequential read speed.

In reality, clustered databases use more sophisticated algorithms than this. A key thing to note, though, is that there is a tradeoff between the time required to update the database and the time required to read it sequentially. Maintaining a clustered database will significantly increase the amount of work required to add, remove, or update records in any way that would affect the sorting sequence. If the database will be read sequentially much more often than it will be updated, clustering can be a big win. If it will be updated often but seldom read out in sequence, clustering can be a big performance drain, especially if the sequence in which items are added to the database is independent of their sort order with regard to the clustered index.


You might have gone through theory part from the above posts:

-The clustered Index as we can see points directly to record i.e. its direct so it takes less time for a search. Additionally it will not take any extra memory/space to store the index

-While, in non-clustered Index, it indirectly points to the clustered Index then it will access the actual record, due to its indirect nature it will take some what more time to access.Also it needs its own memory/space to store the index

enter image description here


Clustered indexes physically order the data on the disk. This means no extra data is needed for the index, but there can be only one clustered index (obviously). Accessing data using a clustered index is fastest.

All other indexes must be non-clustered. A non-clustered index has a duplicate of the data from the indexed columns kept ordered together with pointers to the actual data rows (pointers to the clustered index if there is one). This means that accessing data through a non-clustered index has to go through an extra layer of indirection. However if you select only the data that's available in the indexed columns you can get the data back directly from the duplicated index data (that's why it's a good idea to SELECT only the columns that you need and not use *)


Clustered Index

  • Only one clustered index can be there in a table
  • Sort the records and store them physically according to the order
  • Data retrieval is faster than non-clustered indexes
  • Do not need extra space to store logical structure

Non Clustered Index

  • There can be any number of non-clustered indexes in a table
  • Do not affect the physical order. Create a logical order for data rows and use pointers to physical data files
  • Data insertion/update is faster than clustered index
  • Use extra space to store logical structure

Apart from these differences you have to know that when table is non-clustered (when the table doesn't have a clustered index) data files are unordered and it uses Heap data structure as the data structure.


// Copied from MSDN, the second point of non-clustered index is not clearly mentioned in the other answers.

Clustered

  • Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be stored in only one order.
  • The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.

Nonclustered

  • Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and
    each key value entry has a pointer to the data row that contains the key value.
  • The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key.

Clustered indexes physically order the data on the disk. This means no extra data is needed for the index, but there can be only one clustered index (obviously). Accessing data using a clustered index is fastest.

All other indexes must be non-clustered. A non-clustered index has a duplicate of the data from the indexed columns kept ordered together with pointers to the actual data rows (pointers to the clustered index if there is one). This means that accessing data through a non-clustered index has to go through an extra layer of indirection. However if you select only the data that's available in the indexed columns you can get the data back directly from the duplicated index data (that's why it's a good idea to SELECT only the columns that you need and not use *)


Clustered indexes are stored physically on the table. This means they are the fastest and you can only have one clustered index per table.

Non-clustered indexes are stored separately, and you can have as many as you want.

The best option is to set your clustered index on the most used unique column, usually the PK. You should always have a well selected clustered index in your tables, unless a very compelling reason--can't think of a single one, but hey, it may be out there--for not doing so comes up.


A clustered index actually describes the order in which records are physically stored on the disk, hence the reason you can only have one.

A Non-Clustered Index defines a logical order that does not match the physical order on disk.


Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to indexing

numpy array TypeError: only integer scalar arrays can be converted to a scalar index How to print a specific row of a pandas DataFrame? What does 'index 0 is out of bounds for axis 0 with size 0' mean? How does String.Index work in Swift Pandas KeyError: value not in index Update row values where certain condition is met in pandas Pandas split DataFrame by column value Rebuild all indexes in a Database How are iloc and loc different? pandas loc vs. iloc vs. at vs. iat?

Examples related to clustered-index

Difference between clustered and nonclustered index What do Clustered and Non clustered index actually mean? What are the differences between a clustered and a non-clustered index?

Examples related to non-clustered-index

How to drop a unique constraint from table column? Difference between clustered and nonclustered index What do Clustered and Non clustered index actually mean? What are the differences between a clustered and a non-clustered index?