[database] What are the differences between a superkey and a candidate key?

What are the differences between a super key and a candidate key? I have already referred to wiki,dotnet spider and also Database Concepts 4th edition book. But I am unable to understand the concept. Can anyone please explain it with proper example?

This question is related to database

The answer is


A super key is any combination of columns that uniquely identifies a row in a table. A candidate key is a super key which cannot have any columns removed from it without losing the unique identification property. This property is sometimes known as minimality or (better) irreducibility.

A super key ? a primary key in general. The primary key is simply a candidate key chosen to be the main key. However, in dependency theory, candidate keys are important and the primary key is not more important than any of the other candidate keys. Non-primary candidate keys are also known as alternative keys.

Consider this table of Elements:

CREATE TABLE elements
(
    atomic_number   INTEGER NOT NULL PRIMARY KEY
                    CHECK (atomic_number > 0 AND atomic_number < 120),
    symbol          CHAR(3) NOT NULL UNIQUE,
    name            CHAR(20) NOT NULL UNIQUE,
    atomic_weight   DECIMAL(8,4) NOT NULL,
    period          SMALLINT NOT NULL
                    CHECK (period BETWEEN 1 AND 7),
    group           CHAR(2) NOT NULL
                    -- 'L' for Lanthanoids, 'A' for Actinoids
                    CHECK (group IN ('1', '2', 'L', 'A', '3', '4', '5', '6',
                                     '7', '8', '9', '10', '11', '12', '13',
                                     '14', '15', '16', '17', '18')),
    stable          CHAR(1) DEFAULT 'Y' NOT NULL
                    CHECK (stable IN ('Y', 'N'))
);

It has three unique identifiers - atomic number, element name, and symbol. Each of these, therefore, is a candidate key. Further, unless you are dealing with a table that can only ever hold one row of data (in which case the empty set (of columns) is a candidate key), you cannot have a smaller-than-one-column candidate key, so the candidate keys are irreducible.

Consider a key made up of { atomic number, element name, symbol }. If you supply a consistent set of values for these three fields (say { 6, Carbon, C }), then you uniquely identify the entry for an element - Carbon. However, this is very much a super key that is not a candidate key because it is not irreducible; you can eliminate any two of the three fields without losing the unique identification property.

As another example, consider a key made up of { atomic number, period, group }. Again, this is a unique identifier for a row; { 6, 2, 14 } identifies Carbon (again). If it were not for the Lanthanoids and Actinoids, then the combination of { period, group } would be unique, but because of them, it is not. However, as before, atomic number on its own is sufficient to uniquely identify an element, so this is a super key and not a candidate key.


In nutshell: CANDIDATE KEY is a minimal SUPER KEY.

Where Super key is the combination of columns(or attributes) that uniquely identify any record(or tuple) in a relation(table) in RDBMS.


For instance, consider the following dependencies in a table having columns A, B, C, and D (Giving this table just for a quick example so not covering all dependencies that R could have).

Attribute set (Determinant)---Can Identify--->(Dependent)

A-----> AD

B-----> ABCD

C-----> CD

AC----->ACD

AB----->ABCD

ABC----->ABCD

BCD----->ABCD


Now, B, AB, ABC, BCD identifies all columns so those four qualify for the super key.

But, B?AB; B?ABC; B?BCD hence AB, ABC, and BCD disqualified for CANDIDATE KEY as their subsets could identify the relation, so they aren't minimal and hence only B is the candidate key, not the others.

One more thing Primary key is any one among the candidate keys.

Thanks for asking


Super Key: A superkey is any set of attributes for which the values are guaranteed to be unique for all possible set of tuples in a table at all time.

Candidate Key: A candidate key is a 'minimal' super key meaning the smallest subset of superkey attribute which is unique.


Basically, a Candidate Key is a Super Key from which no more Attribute can be pruned.

A Super Key identifies uniquely rows/tuples in a table/relation of a database. It is composed by a set of attributes that combined can assume values unique over the rows/tuples of a table/relation. A Candidate Key is built by a Super Key, iteratively removing/pruning non-key attributes, keeping an invariant: the newly created Key still need to uniquely identifies the rows/tuples.

A Candidate Key might be seen as a minimal Super Key, in terms of attributes.

Candidate Keys can be used to reference uniquely rows/tuples but from the RDBMS engine perspective the burden to maintain indexes on them is far heavier.


One candidate key is chosen as the primary key. Other candidate keys are called alternate keys.


Superkey :A set of attributes or combination of attributes which uniquely identify the tuple in a given relation . Superkey have two properties uniqueness and reducible set

Candidate key: Minimal set of superkey which have following two properties: uniqueness and irreducible set or attribute


A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.

A candidate key of an entity set is a minimal super key.

Let's go on with customer, loan and borrower sets that you can find an image from the link == 1

  • Rectangles represent entity sets == 1
  • Diamonds represent relationship set == 1
  • Elipses represent attributes == 1
  • Underline indicates Primary Keys == 1

customer_id is the candidate key of the customer set, loan_number is the candidate key of the loan set.

Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.

Borrower set is formed customer_id and loan_number as a relationship set.


Super key is the combination of fields by which the row is uniquely identified and the candidate key is the minimal super key.


A Super key is a set or one of more columns to uniquely identify rows in a table.

Candidate keys are selected from the set of super keys, the only thing we take care while selecting candidate key is: It should not have any redundant attribute. That’s the reason they are also termed as minimal super key.

In Employee table there are Three Columns : Emp_Code,Emp_Number,Emp_Name

Super keys:

All of the following sets are able to uniquely identify rows of the employee table.

{Emp_Code}
{Emp_Number}
{Emp_Code, Emp_Number}
{Emp_Code, Emp_Name}
{Emp_Code, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}

Candidate Keys:

As I stated above, they are the minimal super keys with no redundant attributes.

{Emp_Code}
{Emp_Number}

Primary key:

Primary key is being selected from the sets of candidate keys by database designer. So Either {Emp_Code} or {Emp_Number} can be the primary key.


Super key: super key is a set of atttibutes in a relation(table).which can define every tupple in the relation(table) uniquely.

Candidate key: we can say minimal super key is candidate key. Candidate is the smallest sub set of super key. And can uniquely define each and every tupple.