Chapter 4
Introduction to Normalization
Normalization is a method used to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data. It is the process of decomposing relations with anomalies to produce smaller, well-structured relations.
The normalization process, as first proposed by Codd (1972a), takes a relation schema through a series of tests to certify whether it satisfies a certain normal form. The process, which proceeds in a top-down fashion by evaluating each relation against the criteria for normal forms and decomposing relations as necessary, can thus be considered relational design by analysis.
Initially, Codd proposed three normal forms, which he called first, second, and third normal form. A stronger definition of 3NF, called Boyce-Codd normal form (BCNF), was proposed later by Boyce and Codd. All these normal forms are based on a single analytical tool: the functional dependencies among the attributes of a relation. Later, a fourth normal form (4NF) and a fifth normal form (5NF) were proposed, based on the concepts of multivalued dependencies and join dependencies, respectively.
Normalization of data can be considered a process of analyzing the given relation schemas based on their functional dependencies (FDs) and primary keys to achieve the desirable properties of minimizing redundancy and minimizing the insertion, deletion, and update anomalies.
First Normal Form
First normal form (1NF) is now considered to be part of the formal definition of a relation in the basic (flat) relational model; historically, it was defined to disallow multivalued attributes, composite attributes, and their combinations. It states that the domain of an attribute must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute.
The only attribute values permitted by 1NF are single atomic (or indivisible) values. Consider the DEPARTMENT relation schema shown in the figure above, whose primary key is Dnumber, and suppose that we extend it by including the Dlocations attribute.
We assume that each department can have a number of locations. The figure shows the DEPARTMENT schema and a sample relation state. As we can see, this relation is not in 1NF because Dlocations is not an atomic attribute, as illustrated by the first tuple. There are two ways we can look at the Dlocations attribute:
(a) The domain of Dlocations contains atomic values, but some tuples can have a set of these values. In this case, Dlocations is not functionally dependent on the primary key Dnumber.
(b) The domain of Dlocations contains sets of values and hence is nonatomic. In this case, Dnumber → Dlocations because each set is considered a single member of the attribute domain.
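The 1NF violation in DEPARTMENT, and the remedy of moving Dlocations into a separate relation, can be sketched in a few lines of Python. This is only an illustration: the sample tuples, the helper names (`is_atomic`, `violates_1nf`, `split_locations`), and the choice to model a relation state as a list of dictionaries are all assumptions made for this example, not taken from the figure.

```python
# Hypothetical sample state of DEPARTMENT; values are illustrative only.
department = [
    {"Dname": "Research", "Dnumber": 5,
     "Dlocations": {"Bellaire", "Sugarland", "Houston"}},
    {"Dname": "Administration", "Dnumber": 4, "Dlocations": {"Stafford"}},
    {"Dname": "Headquarters", "Dnumber": 1, "Dlocations": {"Houston"}},
]


def is_atomic(value):
    """Treat Python collections as non-atomic attribute values."""
    return not isinstance(value, (set, frozenset, list, tuple, dict))


def violates_1nf(rows):
    """Return True if any attribute value in any tuple is non-atomic."""
    return any(not is_atomic(v) for row in rows for v in row.values())


def split_locations(rows):
    """Move Dlocations into DEPT_LOCATIONS, keyed on (Dnumber, Dlocation)."""
    dept = [{k: v for k, v in row.items() if k != "Dlocations"} for row in rows]
    dept_locations = sorted(
        (row["Dnumber"], loc) for row in rows for loc in row["Dlocations"]
    )
    return dept, dept_locations


dept, dept_locations = split_locations(department)
print(violates_1nf(department))  # True: Dlocations holds sets
print(violates_1nf(dept))        # False: only atomic values remain
print(dept_locations)
```

Each (Dnumber, Dlocation) pair becomes its own tuple, so every value in both resulting relations is atomic.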
To achieve 1NF, remove the attribute Dlocations that violates 1NF and place it in a separate relation DEPT_LOCATIONS along with the primary key Dnumber of DEPARTMENT. The primary key of this relation is the combination {Dnumber, Dlocation}. Alternatively, expand the key so that there is a separate tuple in the original DEPARTMENT relation for each location of a department; the primary key then becomes {Dnumber, Dlocation}, at the cost of repeating the other department attributes in each tuple.
Second Normal Form
Second normal form (2NF) is based on the concept of full functional dependency. A functional dependency is a relationship that exists when one attribute uniquely determines another attribute. If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X → Y, which specifies that Y is functionally dependent on X. Here X is termed the determinant set and Y the dependent attribute. A functional dependency X → Y is full if removing any attribute from X means that the dependency no longer holds.
Social Security Number determines employee name and project number: SSN → ENAME, PNUMBER
Project Number determines project name and location: PNUMBER → PNAME, PLOCATION
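The difference between a functional dependency and a full functional dependency can be sketched as a check against a relation state. The ENROLL relation, its attribute names, and the helpers `fd_holds` and `is_full_fd` below are hypothetical and exist only for this illustration. Note also that a dependency is a property of the schema: a sample state can refute a dependency, but it cannot prove that the dependency holds in general.

```python
from itertools import combinations

# Hypothetical relation state ENROLL(StudentId, CourseId, Grade, StudentName).
enroll = [
    {"StudentId": 1, "CourseId": "DB", "Grade": "A", "StudentName": "Ann"},
    {"StudentId": 1, "CourseId": "OS", "Grade": "B", "StudentName": "Ann"},
    {"StudentId": 2, "CourseId": "DB", "Grade": "B", "StudentName": "Bob"},
    {"StudentId": 2, "CourseId": "OS", "Grade": "A", "StudentName": "Bob"},
]


def fd_holds(rows, lhs, rhs):
    """Return True if no two tuples agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_full_fd(rows, lhs, rhs):
    """lhs -> rhs is full if it holds and no proper, non-empty subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False
    return True


key = ["StudentId", "CourseId"]
print(is_full_fd(enroll, key, ["Grade"]))        # True: both key attributes are needed
print(is_full_fd(enroll, key, ["StudentName"]))  # False: StudentId alone suffices
```

In this sample, StudentName depends on only part of the key, which is exactly the kind of partial dependency that 2NF rules out.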
A relation schema R is in second normal form (2NF) if every non-prime attribute A in R (an attribute that is not a member of any candidate key) is fully functionally dependent on the primary key. A relation R that is not in 2NF can be decomposed into 2NF relations via the process of 2NF normalization.
Third Normal Form
A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key.
Transitive functional dependency: an FD X → Y is transitive if there is a set of attributes Z that is neither a candidate key nor a subset of any key, and both X → Z and Z → Y hold.
Examples:
SSN → DMGRSSN is a transitive FD, since SSN → DNUMBER and DNUMBER → DMGRSSN hold.
SSN → ENAME is non-transitive, since there is no set of attributes X where SSN → X and X → ENAME.
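The two examples can be checked mechanically with an attribute-closure computation. The sketch below is a simplification under stated assumptions: the helper names `closure` and `transitive_via` are hypothetical, only the explicitly listed FDs are considered as possible intermediate steps, and "Z is not a key" is approximated by requiring that Z does not determine X.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_via(x, a, fds):
    """Return the determinants Z with x -> Z and Z -> a, where Z does not determine x."""
    x = frozenset(x)
    reachable = closure(x, fds)
    return [
        set(lhs)
        for lhs, rhs in fds
        if a in rhs
        and lhs != x
        and lhs <= reachable
        and not x <= closure(lhs, fds)
    ]


# FDs taken from the examples in the text.
fds = [
    (frozenset({"SSN"}), frozenset({"ENAME", "DNUMBER"})),
    (frozenset({"DNUMBER"}), frozenset({"DMGRSSN"})),
]

print(transitive_via({"SSN"}, "DMGRSSN", fds))  # [{'DNUMBER'}]
print(transitive_via({"SSN"}, "ENAME", fds))    # []
```

An empty result means no intermediate determinant was found among the listed FDs, which matches the non-transitive SSN → ENAME case.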
The dependency Ssn → Dmgr_ssn is transitive through Dnumber in EMP_DEPT: the dependencies Ssn → Dnumber and Dnumber → Dmgr_ssn hold, and Dnumber is neither a key itself nor a subset of the key of EMP_DEPT. Intuitively, we can see that the dependency of Dmgr_ssn on Dnumber is undesirable in EMP_DEPT since Dnumber is not a key of EMP_DEPT.
BCNF (Boyce-Codd Normal Form)
A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever a nontrivial FD X → A holds in R, then X is a superkey of R.
Each normal form is strictly stronger than the previous one:
Every 2NF relation is in 1NF
Every 3NF relation is in 2NF
Every BCNF relation is in 3NF
There exist relations that are in 3NF but not in BCNF.
The goal is to have each relation in BCNF (or 3NF).
Reading Assignment
Read about:
BCNF (Boyce-Codd Normal Form)
Fourth Normal Form
Fifth Normal Form
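As a starting point for the BCNF reading, the superkey condition in the BCNF definition can be tested with attribute closure. This is a minimal sketch, not a full normalization algorithm: the helper `bcnf_violations` is hypothetical, it inspects only the FDs it is given, and the split of EMP_DEPT into EMP and DEPT is an assumed decomposition used for illustration.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """List the nontrivial FDs X -> A whose left side X is not a superkey."""
    attributes = set(attributes)
    return [
        (set(lhs), set(rhs))
        for lhs, rhs in fds
        if not rhs <= lhs and not attributes <= closure(lhs, fds)
    ]


# EMP_DEPT with the FDs used in the text.
emp_dept = {"SSN", "ENAME", "DNUMBER", "DMGRSSN"}
emp_dept_fds = [
    (frozenset({"SSN"}), frozenset({"ENAME", "DNUMBER"})),
    (frozenset({"DNUMBER"}), frozenset({"DMGRSSN"})),
]

# Assumed decomposition: EMP(SSN, ENAME, DNUMBER) and DEPT(DNUMBER, DMGRSSN).
emp, emp_fds = {"SSN", "ENAME", "DNUMBER"}, emp_dept_fds[:1]
dept, dept_fds = {"DNUMBER", "DMGRSSN"}, emp_dept_fds[1:]

print(bcnf_violations(emp_dept, emp_dept_fds))  # DNUMBER -> DMGRSSN violates BCNF
print(bcnf_violations(emp, emp_fds))            # []
print(bcnf_violations(dept, dept_fds))          # []
```

In EMP_DEPT the determinant DNUMBER is not a superkey, so the relation fails the test; after the assumed decomposition, every listed determinant is a key of its own relation.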