Keys Overview
Keys Overview
Keys are used to uniquely identify any record or row of data from the table. It is also
used to establish and identify relationships between tables.
Keys are of different types eg: Super key, Candidate key, Primary Key, Foreign key,
etc.
Keys in DBMS are introduced to avoid redundancy in data. A table represents a big box with
unique records, and keys help to identify those records efficiently. Keys in DBMS can be
single or a group of attributes that uniquely identify the records. Using all attributes as a key
is less efficient than choosing the minimum combination of attributes that provide the same
result.
Keys in DBMS are essential for identifying unique records in a table. In a database
application, data must be organized into separate tables to avoid redundancy, which can
create complexity when identifying common relationships between entities. For instance, in a
user and pizza shop ledger example, each user has a unique identifier along with other
attributes, and each transaction in the ledger contains information about the user and the pizza
they bought. This creates redundancy because the user data is copied into the ledger.
However, it is necessary to have this redundancy to link each transaction to the user who
made it. Keys in DBMS help ensure data integrity and consistency by uniquely identifying
each record in a table, and there are different types of keys, such as primary keys, foreign
keys, candidate keys, alternate keys, and composite keys.
User Table
If we can see from the above table how we’re unnecessarily writing the same data of the user.
Even information related to pizza is redundant (this is where we need normalization concepts
in DBMS).
In this kind of database, one can query about user-specific details, pizza-specific details, or
even query related to both, like fetch all the users who have bought any kind of pizza. And, to
fetch the results, we want KEYS.
So, keys in DBMS come to the rescue. DBMS keys are also used to create relationships
between tables. We can use a type of key to solve this redundancy (see types of keys in the
DBMS section to know more).
Relating a table with another table is a part of the basic to complex level of applications. And,
DBMS keys are the way towards it. Let’s learn the type of keys in DBMS, each with an
example.
Super Key
Candidate Key
Primary Key
Alternate Key
Foreign Key
Composite Key
Compound Key
Surrogate Key
Before discussing these keys, let’s first understand an important nomenclature. Have a look at
the table given below.
From the knowledge we gain from the article until now, can we find the key to uniquely
identify the record? (Hey, don’t scroll down. Please try!)
Yes, it’s SNo. And, if we saw values inside, the Age attribute also has unique values. But do
we think using it is valuable in the long run? Age is a kind of value that has a very chance of
repetition. So, that’s why we avoid using it as a key. And, we use SNo to identify each record
of the given table uniquely.
Note: In real life, we don’t make a table first and then decide which attribute(s) are eligible to
be taken as keys. We decide all these things beforehand (while designing the database), so
here we’re doing it just for learning purposes.
Back to nomenclature. To describe the above example on paper, we can use SNo -> Name,
Age.
(by the way, this is called functional dependency, but we can ignore this term for now)
This means that from SNo, we can get the values of three things, i.e., Name, Age, and SNo
itself.
In general, if we say, A, B, C, D, E are attributes of a table, and we’ve given AB -> CDE.
This means that given values of A and B, we can find values of C, D, and E.
1. Super Key
Everything we learned above is nothing but a super key. A super key is a set of all the keys
(with single or multiple attributes) which can uniquely identify the records of the table.
Table Alpha
A B C
Text Text Text
- - -
- - -
- - -
We’ve 3 attributes, A, B, and C. Since, knowing what data the table has is not the primary
concern for the time being, instead of filling some random data into the table, we define it
with dashes (-).
A possible set :
This is a set of keys, and we can use any of those to identify the record. BUT don’t we think
we should be efficient by choosing a key out of all? What, according to we, is the best option
to go for the above example? If we’re thinking of one of A, B, or C, we’re right. It’s a good
choice because given a single value, we can fetch all other values whereas, in other available
keys, we should know 2 or 3 attributes beforehand to fetch values of all other attributes.
Keeping this in mind, let’s discuss other keys.
2. Candidate key
From all the super keys available, the candidate key is the one whose proper subset is not a
super key. Don’t get it? Let’s see it with an example.
ABC-> ABC
AB -> C
BC -> A
A -> BC
Given ABC key, we’re saying if we find its proper subset, i.e., {A, B, C, AB, AC, BC} and
any of them can be a key, then ABC can’t become a candidate key.
AB, which comes from the proper subset of ABC, is also a key. Therefore, ABC is not a
candidate key.
Note: If x is a proper subset of y, then x must NOT have one element that is included in y.
Example,
X = {1,2,3} Y = {1,2,3,4}
Similarly, AB is also not a candidate key because we have got A in its proper subset which,
can be used as a key. Therefore, at last, we found that A is the ONLY candidate key available
for the above schema.
We should keep in mind that we can also have multiple candidate keys. For the above
example, if we have a set of the super key as,
1. ABC-> ABC
2. AB -> C
3. BC -> A
4. A -> BC
5. B -> AC
6. C -> AB
3. Primary key
We learned to find candidate keys from a set of super keys. Now, the primary key is nothing
but a candidate key which has given the right to be called the primary key. All other
candidate keys can also be used as a primary key, but the database administrator (or we) can
choose a single key out of those to be a primary key. We can use the primary key to uniquely
identify the records of a table.
1. ABC-> ABC
2. AB -> C
3. BC -> A
4. A -> BC
5. B -> AC
6. C -> AB
4. Alternate key
We learned the primary key is nothing but a candidate key given primary key rights. But
what do we call all other candidate keys? Yes, we’re right. We call them alternate keys.
Alternate keys are nothing but candidate keys that don’t get rights of the primary key.
A BC
- - -
- - -
- - -
ABC-> ABC
AB -> C
BC -> A
A -> BC
B -> AC
C -> AB
Out of those, A, B, and C are valid candidate keys. If we choose A as the primary key then B
and C will become alternate keys.
5. Foreign key
In starting off our discussion, we discussed that any basic database includes 2 or more tables
at least. And, in most of the cases, we’ll find some relation between them. So, in order to
relate 2 tables with each other, we use the foreign key. We make the primary key of a table as
a foreign key in another table of interest. Let’s see this in action.
Suppose we’ve table user and order, and the order table wants to use the contents of the table
user. The functionality we’re looking for is that we are able to get each user and their
corresponding order data. There’s one bad way and one good way to design this.
Table user
User_id (primary key) Name Age
1 A 18
2 B 19
3 C 16
Table order
Why is this a bad way? First, we’re unnecessarily copying the entire table of the user inside
the order table. Secondly, if a user changes their name, we’ve to modify the user table and the
order table to reflect this change. And it’s just 2 tables. Imagine if we’ve a bunch of tables
then it becomes cumbersome to reflect all those changes. Isn’t it?
So, what is a good way? Use foreign key. See the design below.
Do we get the point now? Instead of copying the entire table records, we set SNo (primary
key in the user table) as a foreign key in the order table. We are leveraging the power of the
primary key as it’s capable of uniquely identifying the records. So why copy? Instead, refer
to those through its primary key and set it as a foreign key in another table.
Now, if we want to change anything inside the user, only the user table has to be modified,
and all the changes will automatically reflect everywhere.
6. Composite key
A B C
- - -
- - -
- - -
1. ABC-> ABC
2. AB -> C
3. BC -> A
Now, 2 possible candidate keys here would be AB and BC. Suppose we make AB the
primary key. Since AB consists of 2 attributes (A and B), therefore it can be referred to as a
composite key.
7. Unique key
Unique key looks very similar to primary key but they’re not and have a very different usage
altogether. Once we set an attribute as a unique key then its column value should be unique,
although only ONE value could be null. Let’s say we want the name of each user to be
unique (for some reason) then we can use this key.
Also, a unique key can be applied to multiple attributes, not just one like in the primary key.
And we can even modify the unique key attribute, which is not possible in the case of the
primary key.
We set a unique key on the Age attribute, which ensures that no value inside the Age attribute
is duplicated, and at the same time, it can accept one value to be null (if required).
8. Surrogate key
A surrogate key is a unique identifier that uniquely identifies the object or entity. It is used
for representing the existence of data analysis. It represents an outside entity as a database
object but is not visible to the user and application.
example.