ch 14-Final-normalization

The document outlines the principles of normalization in relational database design, detailing various normal forms including First, Second, Third Normal Forms, and Boyce-Codd Normal Form (BCNF). It emphasizes the importance of functional dependencies and the criteria for creating 'good' relation schemas, focusing on reducing redundancy and ensuring data integrity. Additionally, it discusses the practical applications of normalization and the potential need for denormalization in certain scenarios.

Uploaded by

dearest.tinu

Chapter Outline (Normalization)

1 Normal Forms Based on Primary Keys
1.1 Normalization of Relations
1.2 Practical Use of Normal Forms
1.3 Definitions of Keys and Attributes Participating in Keys
1.4 First Normal Form
1.5 Second Normal Form
1.6 Third Normal Form
2 General Normal Form Definitions (For Multiple Keys)
3 BCNF (Boyce-Codd Normal Form)
Informal Design Guidelines for Relational Databases (1)
● What is relational database design?
– The grouping of attributes to form "good" relation schemas
● Two levels of relation schemas
– The logical "user view" level (conceptual)
– The storage "base relation" level
● Design is concerned mainly with base relations
● What are the criteria for "good" base relations?

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition
Copyright © 2004 Ramez Elmasri and Shamkant Navathe
Informal Design Guidelines for Relational Databases (2)
● We first discussed informal guidelines for good relational design, followed by FD rules and their derivation
● Now we will discuss the formal concepts of functional dependencies and normal forms
– 1NF (First Normal Form)
– 2NF (Second Normal Form)
– 3NF (Third Normal Form)
– BCNF (Boyce-Codd Normal Form)

Reviewing: Informal Design Guidelines for Relation Schemas
● Measures of quality
– Making sure attribute semantics are clear
– Reducing redundant information in tuples
– Reducing NULL values in tuples
– Disallowing the possibility of generating spurious tuples

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


2.1 Functional Dependencies (1)

● Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs
● FDs and keys are used to define normal forms for relations
● FDs are constraints that are derived from the meaning and interrelationships of the data attributes
● A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y
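
The definition above can be tested against a concrete relation state. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the helper name `fd_holds` and the sample rows are hypothetical, and the attribute names are borrowed from the examples later in these slides. Note that a relation state can only refute an FD: an FD is a constraint on every legal state, so passing this check on one state does not prove that the FD holds.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, otherwise returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample state (values invented for illustration)
emp_proj = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 32.5, "ENAME": "Smith"},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 7.5, "ENAME": "Smith"},
    {"SSN": "222", "PNUMBER": 1, "HOURS": 20.0, "ENAME": "Wong"},
]

ssn_determines_ename = fd_holds(emp_proj, ["SSN"], ["ENAME"])
ssn_determines_hours = fd_holds(emp_proj, ["SSN"], ["HOURS"])
```

In this sample state SSN -> ENAME is not contradicted, while SSN -> HOURS is refuted by the two rows for SSN 111.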
Normal Forms Based on Primary Keys
● Most practical relational database design takes one of the following approaches:
– Make an ER diagram (ERD) and then map it to tables
– Use external knowledge and informal methods to design relation schemas
● Once a schema has been designed using either of these approaches, evaluate it for goodness and decompose it into higher normal forms as needed
Normalization of Relations

● Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations, on the basis of conditions defined using keys and FDs
● It provides a formal framework for analyzing relation schemas based on keys and FDs
● Normal form of a relation: The highest normal form condition that the relation meets, which indicates the degree to which it has been normalized
Normalization of Relations (2)

● 2NF, 3NF, and BCNF are based on the keys and FDs of a relation schema
● When tables are decomposed, additional properties (lossless join, dependency preservation) must also be satisfied for a good relational design

There are two important properties of decompositions:
(a) Non-additive join (lossless join) property
No spurious tuples are generated when the decomposed relations are joined
(b) Dependency preservation property
Each FD in the original R must be represented in some relation of the decomposition
Note that property (a) is extremely important and cannot be sacrificed. Property (b) is less stringent and may be sacrificed.
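
To make property (a) concrete, the sketch below projects a small relation state onto two decompositions and joins the projections back. It borrows the TEACH(student, course, instructor) schema that appears later in these slides; the rows and the helper names `canon`, `project`, and `natural_join` are assumptions made for illustration.

```python
def canon(row):
    """Canonical hashable form of a row given as a dict."""
    return tuple(sorted(row.items()))


def project(rows, attrs):
    """Project rows onto attrs with set semantics (duplicates removed)."""
    return {canon({a: row[a] for a in attrs}) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of canonical rows on their shared attributes."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(canon({**ld, **rd}))
    return joined


# Hypothetical relation state (values invented for illustration)
teach = [
    {"student": "Narayan", "course": "Database", "instructor": "Mark"},
    {"student": "Smith", "course": "Database", "instructor": "Navathe"},
    {"student": "Smith", "course": "Operating Systems", "instructor": "Ammar"},
]
original = {canon(r) for r in teach}

# Joining on course brings back extra (spurious) tuples
lossy = natural_join(project(teach, ["student", "course"]),
                     project(teach, ["course", "instructor"]))
spurious = lossy - original

# Joining on instructor reproduces exactly the original state
lossless = natural_join(project(teach, ["instructor", "course"]),
                        project(teach, ["instructor", "student"]))
```

With these rows, the first decomposition returns every original tuple plus two spurious ones (for example, Narayan paired with Navathe), which is why the property is called non-additive: the join must not add tuples.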

Practical Use of Normal Forms

● Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties
● The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect
● Database designers need not normalize to the highest possible normal form (usually up to 3NF, BCNF, or 4NF)
● Denormalization: the process of storing the join of higher normal form relations as a base relation, which is in a lower normal form

Definitions of Keys and Attributes Participating in Keys (1)
● A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R with the property that no two distinct tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]
● A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more
● If there is more than one key, each is called a candidate key. One is chosen as the primary key and the rest are called secondary keys
Definitions of Keys and Attributes Participating in Keys (2)
● If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys.
● A prime attribute must be a member of some candidate key
● A nonprime attribute is not a prime attribute; that is, it is not a member of any candidate key.

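Example
Given a relation schema R(A, B, C, D) with the FD set
F = {A -> B, B -> C, A -> D, CD -> A}
● Candidate keys: A, BD, CD
(BD is a key because B -> C gives CD and CD -> A gives A, while neither B nor D alone determines all attributes)
● Prime attributes: A, B, C, D
● Non-prime attributes: none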
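
Candidate keys for a small schema can be found mechanically by computing attribute closures. The sketch below is a brute-force search, assuming FDs are written as pairs of Python sets; the function names `closure` and `candidate_keys` are illustrative and the approach is only practical for a handful of attributes. For the FD set in this example the search reports A, BD, and CD (BD qualifies because B -> C and CD -> A), so every attribute is prime.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Enumerate minimal attribute sets whose closure is the whole schema."""
    keys = []
    # Subsets are visited in increasing size, so any superset of a key found
    # earlier can be skipped as non-minimal.
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys


schema = set("ABCD")
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"D"}), ({"C", "D"}, {"A"})]

keys = candidate_keys(schema, fds)
prime = set().union(*keys)
nonprime = schema - prime
```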
First Normal Form

● Disallows composite attributes, multivalued attributes, and nested relations; that is, attributes whose values for an individual tuple are non-atomic
● Considered to be part of the formal definition of a relation in the relational data model; that is, a relation is a flat file
Normalization into 1NF

Note: If DLOCATIONS is a multivalued attribute, then the FD DNUMBER -> DLOCATIONS does not hold
Normalizing Nested Relations into 1NF

Multiple multivalued attributes
● PERSON(SSN, {CAR#}, {MOBILE}) is not in 1NF
● One way to normalize: expand the key to include both multivalued attributes, i.e. the key is (SSN, CAR#, MOBILE)
– Leads to redundancy
● Better way: decompose the relation on the basis of the two multivalued attributes, i.e. PERSON1(SSN, CAR#) and PERSON2(SSN, MOBILE)
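
The redundancy of the single-relation approach can be seen by counting tuples. The sketch below is illustrative only: the SSN, car, and phone values are invented, and it assumes that cars and mobile numbers are independent of each other, so every combination must be stored when both are placed in one key.

```python
from itertools import product

# Hypothetical data: one person with two cars and three mobile numbers
cars = {"123": ["CAR-1", "CAR-2"]}
mobiles = {"123": ["555-0101", "555-0102", "555-0103"]}

# Key (SSN, CAR#, MOBILE): every car is paired with every mobile number
single = [(ssn, car, mobile)
          for ssn in cars
          for car, mobile in product(cars[ssn], mobiles[ssn])]

# Decomposition: one tuple per car and one tuple per mobile number
person1 = [(ssn, car) for ssn, owned in cars.items() for car in owned]
person2 = [(ssn, mobile) for ssn, numbers in mobiles.items() for mobile in numbers]
```

Here the single relation needs 2 x 3 = 6 tuples, while the decomposed relations need 2 + 3 = 5, and the gap widens as either list grows.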
Second Normal Form
● Uses the concepts of FDs and primary key
Definitions:
● Prime attribute: an attribute that is a member of the primary key K
● Full functional dependency: an FD Y -> Z where removal of any attribute from Y means the FD does not hold any more
Examples:
– {SSN, PNUMBER} -> HOURS is a full FD, since neither SSN -> HOURS nor PNUMBER -> HOURS holds
– {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency), since SSN -> ENAME also holds
Second Normal Form (2)

● A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key
● R can be decomposed into 2NF relations via the process of 2NF normalization
Third Normal Form

Definition:
● Transitive functional dependency: an FD X -> Z that can be derived from two FDs X -> Y and Y -> Z
Examples:
– SSN -> DMGRSSN is a transitive FD, since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
– SSN -> ENAME is non-transitive, since there is no set of attributes X where SSN -> X and X -> ENAME

General Normal Form Definitions (For Multiple Candidate Keys)
A relation schema R is in second normal form
(2NF) if every non-prime attribute A in R is fully
functionally dependent on every key of R

General Normal Form Definitions

Definition:
● Superkey of relation schema R: a set of attributes S of R that contains a key of R
● A relation schema R is in third normal form (3NF) if whenever an FD X -> A holds in R, then either:
(a) X is a superkey of R, or
(b) A is a prime attribute of R
NOTE: Boyce-Codd normal form disallows condition (b) above
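
Conditions (a) and (b) translate directly into a check over a list of FDs. The sketch below is one possible encoding, not a definitive implementation: the schema is assembled from the attributes used in the earlier SSN, DNUMBER, and DMGRSSN examples, the set of prime attributes is supplied by the caller, and only the listed FDs are inspected.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(schema, fds, prime):
    """Return (X, A) pairs for listed FDs X -> A that fail both 3NF conditions."""
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == schema:
            continue  # condition (a): X is a superkey
        for attr in sorted(rhs - lhs):  # skip trivial parts of the FD
            if attr not in prime:       # condition (b) fails as well
                violations.append((lhs, attr))
    return violations


# Assumed schema built from the attributes in the earlier examples; SSN is the only key
schema = {"SSN", "ENAME", "DNUMBER", "DMGRSSN"}
fds = [
    ({"SSN"}, {"ENAME"}),
    ({"SSN"}, {"DNUMBER"}),
    ({"DNUMBER"}, {"DMGRSSN"}),
]

violations = third_nf_violations(schema, fds, prime={"SSN"})
```

The check flags DNUMBER -> DMGRSSN: DNUMBER is not a superkey and DMGRSSN is not prime, which is the transitive dependency from the Third Normal Form examples.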
Interpreting the General Definition of Third Normal Form

If an FD X -> A in a relation violates both conditions (a) and (b), then one of the following holds:
– A nonprime attribute determines another nonprime attribute (transitive dependency)
– A proper subset of a key K determines a nonprime attribute (partial dependency)

Alternate definition of 3NF:
Every non-prime attribute is fully functionally dependent on every key of R, and is non-transitively dependent on every key of R
BCNF (Boyce-Codd Normal Form)

● A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X -> A holds in R, then X is a superkey of R
● Each normal form is strictly stronger than the previous one
– Every 2NF relation is in 1NF
– Every 3NF relation is in 2NF
– Every BCNF relation is in 3NF
● There exist relations that are in 3NF but not in BCNF
● The goal is to have each relation in BCNF (or 3NF)

Boyce-Codd normal form

A relation TEACH that is in 3NF but not in BCNF

Achieving the BCNF by Decomposition

● Two FDs exist in the relation TEACH:
fd1: {student, course} -> instructor
fd2: instructor -> course
● {student, course} is a candidate key for this relation, and the dependencies shown follow the pattern in Figure 10.12 (b). So this relation is in 3NF but not in BCNF
● A relation not in BCNF should be decomposed so as to meet this property, while possibly forgoing the preservation of all functional dependencies in the decomposed relations.
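
The BCNF test for TEACH amounts to asking whether the left-hand side of each FD is a superkey. The following sketch encodes fd1 and fd2 as pairs of Python sets; the function names are illustrative, and the check looks only at the listed FDs.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the nontrivial listed FDs whose left-hand side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != schema]


teach = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),  # fd1
    ({"instructor"}, {"course"}),             # fd2
]

violations = bcnf_violations(teach, fds)
```

Only fd2 is reported. It does not break 3NF, because course belongs to the candidate key {student, course} and is therefore prime, which is exactly the case that BCNF excludes.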

Achieving the BCNF by Decomposition

● Three possible decompositions for relation TEACH:
1. {student, instructor} and {student, course}
2. {course, instructor} and {course, student}
3. {instructor, course} and {instructor, student}
● All three decompositions lose fd1, so we have to settle for sacrificing functional dependency preservation. We cannot, however, sacrifice the non-additive join property after decomposition.
● Of the three, only the third decomposition does not generate spurious tuples after a join (and hence has the non-additive join property).
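
The claim about the three decompositions can be verified with the standard test for binary decompositions: {R1, R2} has the non-additive join property if the common attributes R1 ∩ R2 functionally determine either R1 - R2 or R2 - R1. The sketch below applies that test to TEACH; the function names and the set-based encoding are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary non-additive join test: the shared attributes must determine
    all of R1 - R2 or all of R2 - R1."""
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined


fds = [
    ({"student", "course"}, {"instructor"}),  # fd1
    ({"instructor"}, {"course"}),             # fd2
]

decompositions = [
    ({"student", "instructor"}, {"student", "course"}),
    ({"course", "instructor"}, {"course", "student"}),
    ({"instructor", "course"}, {"instructor", "student"}),
]

results = [is_lossless_binary(r1, r2, fds) for r1, r2 in decompositions]
```

Only the third decomposition passes: its shared attribute is instructor, and fd2 gives instructor -> course. In the first two, the shared attribute (student or course) determines nothing beyond itself.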
