[sql] sql primary key and index

Say I have an ID row (int) in a database set as the primary key. If I query off the ID often do I also need to index it? Or does it being a primary key mean it's already indexed?

Reason I ask is because in MS SQL Server I can create an index on this ID, which as I stated is my primary key.

Edit: an additional question - will it do any harm to additionally index the primary key?

This question is related to sql sql-server tsql indexing primary-key

The answer is


Declaring a PRIMARY KEY or UNIQUE constraint causes SQL Server to automatically create an index.

An unique index can be created without matching a constraint, but a constraint (either primary key or unique) cannot exist without having a unique index.

From here, the creation of a constraint will:

  • cause an index with the same name to be created
  • deny dropping the created index as constraint is not allowed to exists without it

and at the same time dropping the constraint will drop the associated index.

So, is there actual difference between a PRIMARY KEY or UNIQUE INDEX:

  • NULL values are not allowed in PRIMARY KEY, but allowed in UNIQUE index; and like in set operators (UNION, EXCEPT, INTERSECT), here NULL = NULL which means that you can have only one value as two NULLs are find as duplicates of each other;
  • only one PRIMARY KEY may exists per table while 999 unique indexes can be created
  • when PRIMARY KEY constraint is created, it is created as clustered unless there is already a clustered index on the table or NONCLUSTERED is used in its definition; when UNIQUE index is created, it is created as NONCLUSTERED unless it is not specific to be CLUSTERED and such already does not exist;

Primary keys are always indexed by default.

You can define a primary key in SQL Server 2012 by using SQL Server Management Studio or Transact-SQL. Creating a primary key automatically creates a corresponding unique, clustered or nonclustered index.

http://technet.microsoft.com/en-us/library/ms189039.aspx


primary keys are automatically indexed

you can create additional indices using the pk depending on your usage

  • index zip_code, id may be helpful if you often select by zip_code and id

Primary keys are always indexed by default.

You can define a primary key in SQL Server 2012 by using SQL Server Management Studio or Transact-SQL. Creating a primary key automatically creates a corresponding unique, clustered or nonclustered index.

http://technet.microsoft.com/en-us/library/ms189039.aspx


Declaring a PRIMARY KEY or UNIQUE constraint causes SQL Server to automatically create an index.

An unique index can be created without matching a constraint, but a constraint (either primary key or unique) cannot exist without having a unique index.

From here, the creation of a constraint will:

  • cause an index with the same name to be created
  • deny dropping the created index as constraint is not allowed to exists without it

and at the same time dropping the constraint will drop the associated index.

So, is there actual difference between a PRIMARY KEY or UNIQUE INDEX:

  • NULL values are not allowed in PRIMARY KEY, but allowed in UNIQUE index; and like in set operators (UNION, EXCEPT, INTERSECT), here NULL = NULL which means that you can have only one value as two NULLs are find as duplicates of each other;
  • only one PRIMARY KEY may exists per table while 999 unique indexes can be created
  • when PRIMARY KEY constraint is created, it is created as clustered unless there is already a clustered index on the table or NONCLUSTERED is used in its definition; when UNIQUE index is created, it is created as NONCLUSTERED unless it is not specific to be CLUSTERED and such already does not exist;

NOTE: This answer addresses enterprise-class development in-the-large.

This is an RDBMS issue, not just SQL Server, and the behavior can be very interesting. For one, while it is common for primary keys to be automatically (uniquely) indexed, it is NOT absolute. There are times when it is essential that a primary key NOT be uniquely indexed.

In most RDBMSs, a unique index will automatically be created on a primary key if one does not already exist. Therefore, you can create your own index on the primary key column before declaring it as a primary key, then that index will be used (if acceptable) by the database engine when you apply the primary key declaration. Often, you can create the primary key and allow its default unique index to be created, then create your own alternate index on that column, then drop the default index.

Now for the fun part--when do you NOT want a unique primary key index? You don't want one, and can't tolerate one, when your table acquires enough data (rows) to make the maintenance of the index too expensive. This varies based on the hardware, the RDBMS engine, characteristics of the table and the database, and the system load. However, it typically begins to manifest once a table reaches a few million rows.

The essential issue is that each insert of a row or update of the primary key column results in an index scan to ensure uniqueness. That unique index scan (or its equivalent in whichever RDBMS) becomes much more expensive as the table grows, until it dominates the performance of the table.

I have dealt with this issue many times with tables as large as two billion rows, 8 TBs of storage, and forty million row inserts per day. I was tasked to redesign the system involved, which included dropping the unique primary key index practically as step one. Indeed, dropping that index was necessary in production simply to recover from an outage, before we even got close to a redesign. That redesign included finding other ways to ensure the uniqueness of the primary key and to provide quick access to the data.


primary keys are automatically indexed

you can create additional indices using the pk depending on your usage

  • index zip_code, id may be helpful if you often select by zip_code and id

Primary keys are always indexed by default.

You can define a primary key in SQL Server 2012 by using SQL Server Management Studio or Transact-SQL. Creating a primary key automatically creates a corresponding unique, clustered or nonclustered index.

http://technet.microsoft.com/en-us/library/ms189039.aspx


NOTE: This answer addresses enterprise-class development in-the-large.

This is an RDBMS issue, not just SQL Server, and the behavior can be very interesting. For one, while it is common for primary keys to be automatically (uniquely) indexed, it is NOT absolute. There are times when it is essential that a primary key NOT be uniquely indexed.

In most RDBMSs, a unique index will automatically be created on a primary key if one does not already exist. Therefore, you can create your own index on the primary key column before declaring it as a primary key, then that index will be used (if acceptable) by the database engine when you apply the primary key declaration. Often, you can create the primary key and allow its default unique index to be created, then create your own alternate index on that column, then drop the default index.

Now for the fun part--when do you NOT want a unique primary key index? You don't want one, and can't tolerate one, when your table acquires enough data (rows) to make the maintenance of the index too expensive. This varies based on the hardware, the RDBMS engine, characteristics of the table and the database, and the system load. However, it typically begins to manifest once a table reaches a few million rows.

The essential issue is that each insert of a row or update of the primary key column results in an index scan to ensure uniqueness. That unique index scan (or its equivalent in whichever RDBMS) becomes much more expensive as the table grows, until it dominates the performance of the table.

I have dealt with this issue many times with tables as large as two billion rows, 8 TBs of storage, and forty million row inserts per day. I was tasked to redesign the system involved, which included dropping the unique primary key index practically as step one. Indeed, dropping that index was necessary in production simply to recover from an outage, before we even got close to a redesign. That redesign included finding other ways to ensure the uniqueness of the primary key and to provide quick access to the data.


As everyone else have already said, primary keys are automatically indexed.

Creating more indexes on the primary key column only makes sense when you need to optimize a query that uses the primary key and some other specific columns. By creating another index on the primary key column and including some other columns with it, you may reach the desired optimization for a query.

For example you have a table with many columns but you are only querying ID, Name and Address columns. Taking ID as the primary key, we can create the following index that is built on ID but includes Name and Address columns.

CREATE NONCLUSTERED INDEX MyIndex
ON MyTable(ID)
INCLUDE (Name, Address)

So, when you use this query:

SELECT ID, Name, Address FROM MyTable WHERE ID > 1000

SQL Server will give you the result only using the index you've created and it'll not read anything from the actual table.


Making it a primary key should also automatically create an index for it.


Here the passage from the MSDN:

When you specify a PRIMARY KEY constraint for a table, the Database Engine enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. Therefore, the primary keys that are chosen must follow the rules for creating unique indexes.


I have a huge database with no (separate) index.

Any time I query by the primary key the results are, for all intensive purposes, instant.


NOTE: This answer addresses enterprise-class development in-the-large.

This is an RDBMS issue, not just SQL Server, and the behavior can be very interesting. For one, while it is common for primary keys to be automatically (uniquely) indexed, it is NOT absolute. There are times when it is essential that a primary key NOT be uniquely indexed.

In most RDBMSs, a unique index will automatically be created on a primary key if one does not already exist. Therefore, you can create your own index on the primary key column before declaring it as a primary key, then that index will be used (if acceptable) by the database engine when you apply the primary key declaration. Often, you can create the primary key and allow its default unique index to be created, then create your own alternate index on that column, then drop the default index.

Now for the fun part--when do you NOT want a unique primary key index? You don't want one, and can't tolerate one, when your table acquires enough data (rows) to make the maintenance of the index too expensive. This varies based on the hardware, the RDBMS engine, characteristics of the table and the database, and the system load. However, it typically begins to manifest once a table reaches a few million rows.

The essential issue is that each insert of a row or update of the primary key column results in an index scan to ensure uniqueness. That unique index scan (or its equivalent in whichever RDBMS) becomes much more expensive as the table grows, until it dominates the performance of the table.

I have dealt with this issue many times with tables as large as two billion rows, 8 TBs of storage, and forty million row inserts per day. I was tasked to redesign the system involved, which included dropping the unique primary key index practically as step one. Indeed, dropping that index was necessary in production simply to recover from an outage, before we even got close to a redesign. That redesign included finding other ways to ensure the uniqueness of the primary key and to provide quick access to the data.


Making it a primary key should also automatically create an index for it.


Well in SQL Server, generally, primary key is automatically indexed. This is true, but it not guaranteed of faster query. The primary key will give you excellent performance when there is only 1 field as primary key. But, when there are multiple field as primary key, then the index is based on those fields.

For example: Field A, B, C are the primary key, thus when you do query based on those 3 fields in your WHERE CLAUSE, the performance is good, BUT when you want to query with Only C field in the WHERE CLAUSE, you wont get good performance. Thus, to get your performance up and running, you will need to index C field manually.

Most of the time, you wont see the issue till you hits more than 1 million records.


Here the passage from the MSDN:

When you specify a PRIMARY KEY constraint for a table, the Database Engine enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. Therefore, the primary keys that are chosen must follow the rules for creating unique indexes.


Making it a primary key should also automatically create an index for it.


Here the passage from the MSDN:

When you specify a PRIMARY KEY constraint for a table, the Database Engine enforces data uniqueness by creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. Therefore, the primary keys that are chosen must follow the rules for creating unique indexes.


As everyone else have already said, primary keys are automatically indexed.

Creating more indexes on the primary key column only makes sense when you need to optimize a query that uses the primary key and some other specific columns. By creating another index on the primary key column and including some other columns with it, you may reach the desired optimization for a query.

For example you have a table with many columns but you are only querying ID, Name and Address columns. Taking ID as the primary key, we can create the following index that is built on ID but includes Name and Address columns.

CREATE NONCLUSTERED INDEX MyIndex
ON MyTable(ID)
INCLUDE (Name, Address)

So, when you use this query:

SELECT ID, Name, Address FROM MyTable WHERE ID > 1000

SQL Server will give you the result only using the index you've created and it'll not read anything from the actual table.


a PK will become a clustered index unless you specify non clustered


Making it a primary key should also automatically create an index for it.


primary keys are automatically indexed

you can create additional indices using the pk depending on your usage

  • index zip_code, id may be helpful if you often select by zip_code and id

a PK will become a clustered index unless you specify non clustered


I have a huge database with no (separate) index.

Any time I query by the primary key the results are, for all intensive purposes, instant.


Well in SQL Server, generally, primary key is automatically indexed. This is true, but it not guaranteed of faster query. The primary key will give you excellent performance when there is only 1 field as primary key. But, when there are multiple field as primary key, then the index is based on those fields.

For example: Field A, B, C are the primary key, thus when you do query based on those 3 fields in your WHERE CLAUSE, the performance is good, BUT when you want to query with Only C field in the WHERE CLAUSE, you wont get good performance. Thus, to get your performance up and running, you will need to index C field manually.

Most of the time, you wont see the issue till you hits more than 1 million records.


a PK will become a clustered index unless you specify non clustered


Primary keys are always indexed by default.

You can define a primary key in SQL Server 2012 by using SQL Server Management Studio or Transact-SQL. Creating a primary key automatically creates a corresponding unique, clustered or nonclustered index.

http://technet.microsoft.com/en-us/library/ms189039.aspx


primary keys are automatically indexed

you can create additional indices using the pk depending on your usage

  • index zip_code, id may be helpful if you often select by zip_code and id

As everyone else have already said, primary keys are automatically indexed.

Creating more indexes on the primary key column only makes sense when you need to optimize a query that uses the primary key and some other specific columns. By creating another index on the primary key column and including some other columns with it, you may reach the desired optimization for a query.

For example you have a table with many columns but you are only querying ID, Name and Address columns. Taking ID as the primary key, we can create the following index that is built on ID but includes Name and Address columns.

CREATE NONCLUSTERED INDEX MyIndex
ON MyTable(ID)
INCLUDE (Name, Address)

So, when you use this query:

SELECT ID, Name, Address FROM MyTable WHERE ID > 1000

SQL Server will give you the result only using the index you've created and it'll not read anything from the actual table.


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to sql-server

Passing multiple values for same variable in stored procedure SQL permissions for roles Count the Number of Tables in a SQL Server Database Visual Studio 2017 does not have Business Intelligence Integration Services/Projects ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database How to create temp table using Create statement in SQL Server? SQL Query Where Date = Today Minus 7 Days How do I pass a list as a parameter in a stored procedure? SQL Server date format yyyymmdd

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL

Examples related to indexing

numpy array TypeError: only integer scalar arrays can be converted to a scalar index How to print a specific row of a pandas DataFrame? What does 'index 0 is out of bounds for axis 0 with size 0' mean? How does String.Index work in Swift Pandas KeyError: value not in index Update row values where certain condition is met in pandas Pandas split DataFrame by column value Rebuild all indexes in a Database How are iloc and loc different? pandas loc vs. iloc vs. at vs. iat?

Examples related to primary-key

Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object What is Hash and Range Primary Key? Can I use VARCHAR as the PRIMARY KEY? Can a foreign key refer to a primary key in the same table? UUID max character length MySQL duplicate entry error even though there is no duplicate entry Creating composite primary key in SQL Server What are the best practices for using a GUID as a primary key, specifically regarding performance? Add primary key to existing table Create view with primary key?