Normalization of Relations
If you have been working with databases for a while, chances are you have heard the term
normalization. Perhaps someone's asked you "Is that database normalized?" or "Is that in
BCNF?" All too often, the reply is "Uh, yeah." Normalization is often brushed aside as a
luxury that only academics have time for. However, knowing the principles of
normalization and applying them to your daily database design tasks really isn't all that
complicated, and it can substantially improve the integrity of your data and, in many
workloads, the performance of your DBMS.
Normalization of Relations
The normalization process, as first proposed by Codd (1972a), takes a relation schema
through a series of tests to certify whether it satisfies a certain normal form. The process,
which proceeds in a top-down fashion by evaluating each relation against the criteria for
normal forms and decomposing relations as necessary, can thus be considered as relational
design by analysis. Initially, Codd proposed three normal forms, which he called first,
second, and third normal form. A stronger definition of 3NF—called Boyce-Codd normal
form (BCNF)—was proposed later by Boyce and Codd. All these normal forms are based
on a single analytical tool: the functional dependencies among the attributes of a relation.
Later, a fourth normal form (4NF) and a fifth normal form (5NF) were proposed, based on
the concepts of multivalued dependencies and join dependencies, respectively.
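Since functional dependencies are the tool behind 1NF through BCNF, it can help to see what "X determines Y" means on actual rows. The following is a minimal Python sketch, not part of Codd's formulation: the function name `fd_holds` and the `ENROLLMENT` sample data are hypothetical. Note that sample rows can only refute a dependency. Whether a dependency truly holds is a statement about the meaning of the data, not about one particular instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value;
        # any later mismatch refutes the dependency lhs -> rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample relation, invented for illustration only.
ENROLLMENT = [
    {"student_id": 1, "course_id": "DB101", "student_name": "Ann", "grade": "A"},
    {"student_id": 1, "course_id": "OS201", "student_name": "Ann", "grade": "B"},
    {"student_id": 2, "course_id": "DB101", "student_name": "Bob", "grade": "C"},
]

name_depends_on_student = fd_holds(ENROLLMENT, ["student_id"], ["student_name"])
grade_depends_on_student = fd_holds(ENROLLMENT, ["student_id"], ["grade"])
grade_depends_on_full_key = fd_holds(ENROLLMENT, ["student_id", "course_id"], ["grade"])
```

In this sample, `student_name` is consistent with depending on `student_id` alone, while `grade` needs the full pair (`student_id`, `course_id`).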
So what is Normalization?
Normalization is the process of efficiently organizing data in a database. The
normalization process has two aims: eliminating redundant data (for instance, storing the
same data in more than one table) and ensuring that data dependencies make sense (only
storing related data in a table). Both of these are worthy goals, as they reduce the amount
of space a database consumes and ensure that data is logically stored.
Benefits of Normalization
Normalization provides numerous benefits to a database. Some of the major benefits include
the following:
Organization. The normalization process brings order to the database, making everyone's
job easier, from the user who accesses tables to the database administrator (DBA) who is
responsible for the overall management of every object in the database.
Reduced redundancy. Data redundancy is reduced, which simplifies data structures and
conserves disk space.
Consistency. Because duplicate data is minimized, the possibility of inconsistent data is
greatly reduced. Without normalization, for example, an individual's name could read
STEVE SMITH in one table, whereas the name of the same individual reads STEPHEN R. SMITH
in another table.
Flexibility. Because the database has been normalized and broken into smaller tables, you
have more flexibility when modifying existing structures. It is much easier to modify a
small table with little data than to modify one big table that holds all the vital data in
the database.
Security. The DBA can grant certain users access to a limited set of tables. Security is
easier to control when normalization has occurred.
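The consistency benefit can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `customer` and `orders` tables and their contents are hypothetical, chosen only to mirror the STEVE SMITH example. Because the name is stored once and referenced by key, a single update changes it everywhere it appears.

```python
import sqlite3

# In-memory database with a hypothetical normalized schema:
# the customer's name lives in exactly one row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item        TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'STEVE SMITH');
    INSERT INTO orders VALUES (100, 1, 'keyboard');
    INSERT INTO orders VALUES (101, 1, 'monitor');
""")

# One UPDATE on one row is enough; no order row stores a copy of the name.
conn.execute("UPDATE customer SET name = 'STEPHEN R. SMITH' WHERE customer_id = 1")

# Every order now reports the same, current name via the join.
names_on_orders = [
    row[0]
    for row in conn.execute(
        "SELECT c.name FROM orders AS o "
        "JOIN customer AS c ON c.customer_id = o.customer_id "
        "ORDER BY o.order_id"
    )
]
```

Had the name been copied into each order row instead, the same change would require updating every copy, and missing one would leave the two spellings side by side.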
Data Normalization Process
The database community has developed a series of guidelines for ensuring that databases are
normalized. These are referred to as normal forms and are numbered from one (the lowest
form of normalization, referred to as first normal form or 1NF) through five (fifth normal
form or 5NF). In practical applications, you will often see 1NF, 2NF, and 3NF, along with
the occasional 4NF. Fifth normal form is very rarely seen.
Before we begin our discussion of the normal forms, it's important to point out that they are
guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to
meet practical business requirements. However, when variations take place, it's extremely
important to evaluate any possible ramifications they could have on your system and
account for possible inconsistencies. That said, let's explore the normal forms.
Relational database design can be improved by converting the database into second
normal form (2NF). The conversion takes two steps.
Step 1: Write Each Key Component on a Separate Line
Write each key component on a separate line, then write the original (composite) key on
the last line. Each component will become the key of a new table.
Step 2: Assign Corresponding Dependent Attributes
Determine which attributes depend on which key components, and place each attribute on
the line of the key that determines it. At this point, most anomalies have been eliminated.
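The two steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a general normalization algorithm: the function `decompose_to_2nf`, the project/employee attribute names, and the `DEPENDS_ON` mapping are all hypothetical, and the dependencies are assumed to be known in advance.

```python
def decompose_to_2nf(key, depends_on):
    """Group non-key attributes by the part of the composite key that determines them.

    key        -- tuple of attribute names forming the composite key
    depends_on -- dict mapping each non-key attribute to the set of key
                  components it depends on (assumed to be known)
    """
    # Step 1: one line per key component, then the full composite key last.
    tables = {(component,): [] for component in key}
    tables[tuple(key)] = []

    # Step 2: assign each attribute to the line of the key that determines it.
    for attribute, determinant in depends_on.items():
        # Order the determinant like the key so it matches a line from step 1.
        table_key = tuple(c for c in key if c in determinant)
        tables.setdefault(table_key, []).append(attribute)
    return tables


# Hypothetical relation: ASSIGNMENT(project_id, employee_id, project_name,
# employee_name, hours) with composite key (project_id, employee_id).
KEY = ("project_id", "employee_id")
DEPENDS_ON = {
    "project_name": {"project_id"},          # partial dependency
    "employee_name": {"employee_id"},        # partial dependency
    "hours": {"project_id", "employee_id"},  # full dependency
}

TABLES = decompose_to_2nf(KEY, DEPENDS_ON)
for table_key, attributes in TABLES.items():
    print(table_key, "->", attributes)
```

Each entry in the result corresponds to one new table: the dictionary key is that table's key, and the list holds the attributes that move into it.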
Boyce-Codd normal form (BCNF), also referred to as the "third and a half (3.5) normal
form", adds one more requirement to 3NF: every determinant must be a candidate key.
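A BCNF test can be sketched with the standard attribute-closure technique, here in Python. The sketch assumes the common textbook formulation that the left side of every nontrivial functional dependency must be a superkey. The function names and the student/course/instructor relation are hypothetical, and the dependency list is assumed to be complete.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attributes, fds):
    """Return the nontrivial dependencies whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(all_attributes):
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation TEACHES(student, course, instructor).
ATTRIBUTES = ("student", "course", "instructor")
FDS = [
    (("student", "course"), ("instructor",)),  # left side is a key
    (("instructor",), ("course",)),            # left side is not a key
]

VIOLATIONS = bcnf_violations(ATTRIBUTES, FDS)
```

In this example the dependency from `instructor` to `course` is reported, because `instructor` alone does not determine `student`.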