Normalization

Dr. Shyamala Sivakumar, Ph.D., P.Eng., SMIEEE


Normalization is a process
 Learn the distinctions between 1st, 2nd, and 3rd normal forms
 Learn to diagram and resolve functional dependencies
   Data → 1NF
   1NF → 2NF
   2NF → 3NF
 Identify and explain 3 types of anomalies
   Insertion
   Deletion
   Update
Data Normalization
 Decomposing relations with anomalies (problems) into smaller,
well-structured relations
 Validates and improves logical design to avoid unnecessary
duplication of data
 Good data modeling makes formal normalization easier
 You may find that your expertly designed tables require no
changes to remove anomalies, because you did such a good
job at improving your design through enterprise,
conceptual, and logical ERDs.
 Use normalization procedures to check your work and to
improve the data models provided to you by other
designers.
First Normal Form (1NF)
 Unique field names
 Unique rows (requires complete key)
 No multi-valued attributes
 Every attribute value is atomic (no composite attributes)
 Order of rows and columns is irrelevant
Table with Multi-Valued Attributes: Not in 1NF

Note: this is NOT a relation


Discussion
 Why isn’t this a relation?
 What is the primary key? Order_ID and Product_ID.
 The 2 “records” represented above have a problem with
repeating values (see that there are 3 different values for
Product_ID, 3 different values for Product_Description,
etc.).
 Also, Customer_Address is a composite field. We can
assume that the only parts of the address to be recorded
are city and state.
First Normal Form (1NF)
 Unnormalized table
   Contains a repeating group
 Table in 1NF
   Contains no repeating groups
 Removal of repeating groups is the starting point in the quest for problem-free tables
Now it is in 1NF

Note: this is a relation, but not a particularly well-structured one


Discussion
 To satisfy 1NF requirements, we duplicate the information contained in the first 5 columns for those rows missing values.
 Note: Customer_Address should be split into two fields (Customer_City and Customer_State). This is an oversight in the graphic.
 Represented in DBDL form as:
   INVOICE (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID, Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)
 But there are still issues (anomalies) in this relation:
   Insertion: if a new product is ordered for order 1007 of an existing customer, the customer data must be re-entered, causing excessive duplication
   Deletion: if we delete the Dining Table from Order 1006, we lose the information about this item's finish and price
   Update: changing the price of product ID 4 requires updates in several records
 Why do these anomalies exist?
   Because there are multiple themes (entity types) in one relation. This results in duplication and an unnecessary dependency between the entities.
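The update and deletion anomalies can be reproduced with a few lines of Python. This is only a sketch: the rows, prices, and the description of product 4 are made-up sample values (the slide's table graphic is not available here), and the helper name `prices_on_record` is hypothetical.

```python
# Hypothetical rows of the flat INVOICE relation (values invented for illustration).
invoice = [
    {"Order_ID": 1006, "Product_ID": 7, "Product_Description": "Dining Table",
     "Unit_Price": 800, "Ordered_Quantity": 2},
    {"Order_ID": 1006, "Product_ID": 4, "Product_Description": "Entertainment Center",
     "Unit_Price": 650, "Ordered_Quantity": 1},
    {"Order_ID": 1007, "Product_ID": 4, "Product_Description": "Entertainment Center",
     "Unit_Price": 650, "Ordered_Quantity": 3},
]


def prices_on_record(rows, product_id):
    """Return the set of distinct prices stored for one product."""
    return {r["Unit_Price"] for r in rows if r["Product_ID"] == product_id}


# Update anomaly: the price of product 4 is stored once per order line.
copies = [r for r in invoice if r["Product_ID"] == 4]

# Changing only one of the copies leaves two different prices on record.
copies[0]["Unit_Price"] = 700
inconsistent = prices_on_record(invoice, 4)

# Deletion anomaly: removing the only order line for product 7
# also removes the only record of its price.
remaining = [r for r in invoice
             if not (r["Order_ID"] == 1006 and r["Product_ID"] == 7)]
lost = prices_on_record(remaining, 7)
```

After the partial update, `inconsistent` holds two prices for the same product, and `lost` is empty because the product's price existed only inside the deleted order line.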
Example: Is this relation in 1NF?
Are the field names unique?
Are the rows unique (is there a primary key)?
Are there any composite fields?
Are there any multi-valued fields?
Is the order of the rows or columns important for the meaning of the data?
Discussion
 This list represents data about employees who have taken professional training courses through their employer.
 What’s the primary key?
   Answer: if an employee can take a course only once, then it is composite: Emp_ID, Course_Title. But one employee doesn’t have a Course_Title, and PK fields are always required.
 What happens if we insert, delete, or modify data?
   Insertion: we can’t enter a new employee without having the employee take a class; we can’t add a new class without having a student enrolled
   Deletion: if we remove employee 140, we lose the information that a Tax Acc class exists
   Modification: giving a salary increase to employee 100 forces us to update multiple records
 These are called anomalies: problems with the structure of a relation
 Why do these anomalies exist?
   Because there are two themes (entity classes) in this one relation. This results in data duplication and an unnecessary dependency between the entities.
Second Normal Form (2NF)
 1NF tables may contain problems
   Redundancy
   Update anomalies
     Update, inconsistent data, additions, deletions
   These occur because a column is dependent on a portion of a multi-column primary key
 2NF table
   In 1NF, and every nonkey column is functionally dependent on the entire primary key (more on the next slide)
Second Normal Form
 1NF plus every non-key attribute is functionally dependent on the ENTIRE primary key
 No partial dependencies
 Hint: If there is no composite key, the relation is automatically in 2NF!
 “…the whole key…”
 When you diagram all dependency sets in the relation, there should be no partial key determinants
 Remember that the determinant is on the left side of the arrow.
 2NF is only in doubt if there is a composite key in the relation.
Functional Dependencies
 Functional Dependency: The value of one attribute (the determinant) allows us to identify the value of other attribute(s)
 Diagram dependencies:
   Determinant field(s) → list of dependent field(s)
   Field A → Field B, Field C
 This means: If I know the value of Field A, I have enough information to find the values of Fields B & C
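The definition can be turned into a small check against sample rows. This is a minimal sketch: `fd_holds` is a hypothetical helper name and the rows are invented. Sample data can only show that a dependency is violated; whether a dependency really holds is decided by the meaning of the fields, not by the rows that happen to be stored.

```python
def fd_holds(rows, determinant, dependents):
    """Return True if no two rows agree on `determinant` but differ on `dependents`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invented sample rows using the Field A / Field B / Field C names from the slide.
rows = [
    {"Field_A": 1, "Field_B": "x", "Field_C": 10},
    {"Field_A": 1, "Field_B": "x", "Field_C": 10},
    {"Field_A": 2, "Field_B": "y", "Field_C": 10},
]

# Field A → Field B, Field C is consistent with these rows.
a_determines_bc = fd_holds(rows, ["Field_A"], ["Field_B", "Field_C"])

# Field C → Field A is violated: Field_C = 10 appears with two Field_A values.
c_determines_a = fd_holds(rows, ["Field_C"], ["Field_A"])
```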
Discussion
 The determinant is a field (or set of fields) that uniquely identifies the value of another field. You must understand the definition and content of the fields to normalize the design accurately.
 Refer to the Employee2 relation on slide 10 (the example “Is this relation in 1NF?”).
 If you find the Emp_ID in one column, you can look in that row to find the employee’s name, department, and salary.
 The Emp_ID is unique to the employee, and every row for that employee has the same department and salary.
 Note that Emp_ID does NOT determine the value of Course_Title or Date-Completed, because the value of these fields may change on different rows for the same Emp_ID.
 That is because the employee may take different courses and complete them on different dates.
Update Anomalies
 Update
   Information is in multiple rows, so it is difficult to update
 Inconsistent data
   Because of the duplication, a row that is not updated causes inconsistency
 Additions
   Dummy records are required to add new, unused dependent rows
 Deletions
   Deleting a row can also remove the only record of other information stored in that row
 Nonkey column (nonkey attribute): a column that is not a part of the primary key
Dependency Diagram
 Dependency diagram: uses arrows to indicate all the functional dependencies present in a table
 Partial dependencies: dependencies on only a portion of the primary key
New Example: Partial Dependencies?
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID, Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)

Removing Partial Dependencies
1. Move partial dependency sets to their own relation, with the determinant as the primary key
2. Leave a copy of the determinant behind as a foreign key
3. Rename all relations to be unique and meaningful
Partial dependencies in ORDER?
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID, Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)

Diagram Partial Dependencies
Start with 1 field from the composite key
Order_ID ➔ Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State
Partial dependencies in ORDER?
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID, Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)

Diagram Partial Dependencies
Then the other field from the composite key
Order_ID ➔ Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State
Product_ID ➔ Product_Description, Product_Finish, Unit_Price
Partial dependencies in ORDER?
ORDER (Order_ID (FK), Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID, Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)

Convert Partial Dependency into its own table & leave new PK behind as a FK (repeat)
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State)
Partial dependencies in ORDER?
ORDER (Order_ID (FK), Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID (FK), Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)

Create New Tables from Partial Dependencies
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State)
PRODUCT (Product_ID, Product_Description, Product_Finish, Unit_Price)
Partial dependencies in ORDER?
ORDER (Order_ID (FK), Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State, Product_ID (FK), Product_Description, Product_Finish, Unit_Price, Ordered_Quantity)

Rename Tables to be Meaningful
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State)
PRODUCT (Product_ID, Product_Description, Product_Finish, Unit_Price)
OD_DETAIL (Order_ID (FK), Product_ID (FK), Ordered_Quantity), renamed from the leftover ORDER relation
Functional dependencies
 Identify the functional dependencies (if you know the full primary key, you know that all of the rest of the fields will be dependent on it)
   Start here: Order_ID, Product_ID ➔ all fields
   Order_ID ➔ Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State
   Product_ID ➔ Product_Description, Product_Finish, Unit_Price
 Therefore, NOT in 2nd Normal Form
 New relations:
   ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State)
   PRODUCT (Product_ID, Product_Description, Product_Finish, Unit_Price)
   OD_DETAIL (Order_ID (FK), Product_ID (FK), Ordered_Quantity)
Relations in 2NF
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State)
PRODUCT (Product_ID, Product_Description, Product_Finish, Unit_Price)
OD_DETAIL (Order_ID (FK), Product_ID (FK), Ordered_Quantity)
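The move from the single 1NF relation to the three 2NF relations amounts to projecting the rows onto each dependency set and dropping the duplicates. The sketch below assumes invented sample rows and uses only a subset of the columns for brevity; `project` is a hypothetical helper name.

```python
def project(rows, attributes):
    """Project rows onto `attributes`, dropping duplicate tuples (first-seen order kept)."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Invented rows of the 1NF ORDER relation (only some columns shown).
order_1nf = [
    {"Order_ID": 1006, "Order_Date": "2024-10-24", "Customer_ID": 2,
     "Product_ID": 7, "Unit_Price": 800, "Ordered_Quantity": 2},
    {"Order_ID": 1006, "Order_Date": "2024-10-24", "Customer_ID": 2,
     "Product_ID": 5, "Unit_Price": 325, "Ordered_Quantity": 2},
    {"Order_ID": 1007, "Order_Date": "2024-10-25", "Customer_ID": 6,
     "Product_ID": 7, "Unit_Price": 800, "Ordered_Quantity": 1},
]

# Order_ID → Order_Date, Customer_ID moves to its own relation.
order = project(order_1nf, ["Order_ID", "Order_Date", "Customer_ID"])

# Product_ID → Unit_Price moves to its own relation.
product = project(order_1nf, ["Product_ID", "Unit_Price"])

# What remains depends on the whole key (Order_ID, Product_ID).
od_detail = project(order_1nf, ["Order_ID", "Product_ID", "Ordered_Quantity"])
```

Each order and each product is now stored once, while the detail relation keeps one row per order line.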
Third Normal Form (3NF)
 2NF tables may still contain problems
   Redundancy and wasted space
   Update anomalies
     Update, inconsistent data, additions, deletions
   These occur because a column is dependent on a nonkey column (a determinant that is not a candidate key)
 3NF table
   In 2NF, and the only determinants contained are candidate keys
Relations in 3NF
 2NF plus every non-key attribute is functionally
dependent only on the ENTIRE primary key
 No transitive dependencies (transitive means field is
determined by a nonkey field)
1. Move transitive dependency sets to their own
relation, with the determinant as the primary key
2. Leave a copy of the determinant behind as a foreign
key
3. Rename all relations to be unique and meaningful
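Relations in 3NF
ORDER (Order_ID, Order_Date, Customer_ID, Customer_Name, Customer_City, Customer_State)
 Transitive dependency: Customer_Name, Customer_City, and Customer_State depend on Customer_ID
PRODUCT (Product_ID, Product_Description, Product_Finish, Unit_Price)
OD_DETAIL (Order_ID (FK), Product_ID (FK), Ordered_Quantity)
CUSTOMER (Customer_ID, Customer_Name, Customer_City, Customer_State)
 With the customer fields moved to CUSTOMER, ORDER keeps Customer_ID as a foreign key: ORDER (Order_ID, Order_Date, Customer_ID (FK))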
Problems with Incorrect Decomposition
 Decomposition must follow the procedure described for 3NF
 Even when you decompose a table, you run the risk of splitting a functional dependency across different tables
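One way to see the risk is to join the decomposed relations back together and compare the result with the original rows. The sketch below is an illustration under assumptions: the rows are invented, and `project`, `natural_join`, and `same_rows` are hypothetical helper names. The second decomposition separates Order_ID from Customer_ID, so the dependency Order_ID → Customer_ID is split across tables and the join produces rows that never existed.

```python
def project(rows, attributes):
    """Project rows onto `attributes`, dropping duplicate tuples."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


def natural_join(left, right):
    """Join two row lists on the attribute names they share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [{**lrow, **rrow} for lrow in left for rrow in right
            if all(lrow[a] == rrow[a] for a in shared)]


def same_rows(a, b):
    """Compare two row lists as sets, ignoring row and column order."""
    def as_set(rows):
        return {frozenset(r.items()) for r in rows}
    return as_set(a) == as_set(b)


# Invented rows: two different customers who happen to share a name.
orders = [
    {"Order_ID": 1, "Customer_ID": 10, "Customer_Name": "Smith"},
    {"Order_ID": 2, "Customer_ID": 20, "Customer_Name": "Smith"},
]

# Decomposition that follows the procedure: determinant Customer_ID is kept as FK.
good = natural_join(project(orders, ["Order_ID", "Customer_ID"]),
                    project(orders, ["Customer_ID", "Customer_Name"]))

# Incorrect decomposition: Order_ID and Customer_ID end up in different tables.
bad = natural_join(project(orders, ["Order_ID", "Customer_Name"]),
                   project(orders, ["Customer_ID", "Customer_Name"]))

good_is_lossless = same_rows(good, orders)
bad_is_lossless = same_rows(bad, orders)
```

With these rows the correct decomposition joins back to the original two rows, while the incorrect one yields four rows because every "Smith" order matches every "Smith" customer.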
Fourth Normal Form (4NF)
 3NF tables may still contain problems
   Dependencies
   Update anomalies
     Update, additions, deletions
   These occur because of multivalued dependencies
 4NF table
   In 3NF and has no multivalued dependencies (we will not do much of this)
Summary: Normal Forms
Summary
 Normalization is a process of optimizing
databases to prevent update anomalies
 Normalization attempts to correct update
issues by eliminating duplication
 Duplication also creates inconsistency
 Insertions can violate database integrity if the
database is not normalized
 Deletions can violate database integrity if the
database is not normalized
Summary (cont.)
 Normal Forms: First (1NF), Second (2NF), Third (3NF), and Fourth (4NF)
   1NF has no repeating groups
   2NF is in 1NF and no non-key field is dependent on only a portion of the primary key
   3NF is in 2NF and the only determinants are candidate keys
Normalization Practice
A Normalization Practice
 Consider the following table structure holding all information about products (developed by someone who didn’t know as much about database design as you do):
   PRODUCT (ProdID, CatID, SupplierID, Color, Qty, Price, Name, Desc, Loc)
 The fields are described below:
   ProdID      Unique product ID number
   CatID       Unique product category ID number
   SupplierID  Unique supplier ID number
   Color       Color of the product (red, yellow, etc.)
   Qty         Number of units of the product in inventory
   Price       The price for each unit of the product
   Name        Company name of the supplier
   Desc        Description of each product category
   Loc         Warehouse location for each category of product
 Consider the following business rules related to this table:
   Each product is supplied by a single supplier, but each supplier may provide multiple products.
   Each product is classified into a single product category, but each product category may include multiple products.
   Each product is in only one warehouse location, but a location will house multiple products.
Answer the following questions
Questions
1. Is the table a relation (in 1NF)? How do you know? If not, resolve.
2. Is the original/corrected relation in 2NF?
    How do you know?
    If not 2NF, diagram only relevant dependencies and resolve to new 2NF relations. DO NOT jump ahead to 3NF!
3. Are the relations created in #2 already in 3NF?
    How do you know?
    If not 3NF, diagram only relevant dependencies and resolve to new 3NF relations.
Answers
1. Is the table a relation (in 1NF)? How do you know? If not, resolve.
    Yes, it is in 1NF. It (presumably) has unique field names, unique rows, no composite or multi-valued fields, and no ordered rows or columns.
2. Is the original/corrected relation in 2NF?
    Yes, it is in 2NF, because it has a simple PK.
3. Are the relations created in #2 already in 3NF?
    No, because there are transitive dependencies.
Removing transitive dependencies
Transitive dependencies
 CatID → Desc, Loc
 SupplierID → Name
3NF Tables
 CATEGORY (CatID, Desc, Loc)
 SUPPLIER (SupplierID, Name)
 PRODUCT (ProdID, CatID (FK), SupplierID (FK), Color, Qty, Price), replacing the original PRODUCT (ProdID, CatID, SupplierID, Color, Qty, Price, Name, Desc, Loc)
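The answers for this exercise can be cross-checked by computing attribute closures from the dependencies. This is a sketch, not part of the course material: `closure` is a hypothetical helper, and the dependency list is an encoding of the business rules and transitive dependencies stated for PRODUCT.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given dependencies."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependents in fds:
            # Apply a dependency once its whole determinant is already known.
            if set(determinant) <= result and not set(dependents) <= result:
                result |= set(dependents)
                changed = True
    return result


# Dependencies assumed from the business rules for PRODUCT.
fds = [
    (("ProdID",), ("CatID", "SupplierID", "Color", "Qty", "Price")),
    (("CatID",), ("Desc", "Loc")),
    (("SupplierID",), ("Name",)),
]
all_fields = {"ProdID", "CatID", "SupplierID", "Color", "Qty",
              "Price", "Name", "Desc", "Loc"}

# ProdID alone reaches every field, so it is a simple (non-composite) key.
prod_closure = closure({"ProdID"}, fds)

# CatID is a nonkey determinant: it reaches Desc and Loc but not the rest.
cat_closure = closure({"CatID"}, fds)
```

Because ProdID alone determines every field, the relation has a simple key and is in 2NF; Desc and Loc are reached only through CatID, which is what makes them transitively dependent.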
B Normalization Practice
 Consider the following table structure holding all of a company’s data about a hotel and reservations:
   RESERVATION (Room_No, Room_Quality, Beds, Arrive_Date, Depart_Date, Mgr_ID, Cust_No, Last_Name, No_Party, Mgr_Name, Mgr_Phone)
 The fields are described below:
   Attribute     Description
   Room_No       Unique hotel room number
   Room_Quality  Standard, Deluxe, Executive
   Beds          1, 2, or 2+
   Arrive_Date   Check-in date
   Depart_Date   Check-out date
   Mgr_ID        Manager on duty on arrival date
   Cust_No       Unique customer identifier
   Last_Name     Customer’s last name
   No_Party      Number of people staying in room
   Mgr_Name      Last name of manager on duty
   Mgr_Phone     Office phone for manager on duty
 Consider the following business rules related to this table:
   Each room can have only one reservation per arrival date and can be reserved by only one customer at a time.
   A customer can make multiple reservations or may have no reservations.
   Each reservation is for exactly one room; each reservation is for exactly one customer.
   Duty managers are assigned based on a schedule: one manager per day
   Each room quality level can have any number of beds.
Answer the following questions
Questions
1. Is the table a relation (in 1NF)? How do you know? If not, resolve.
2. Is the original/corrected relation in 2NF?
    How do you know?
    If not 2NF, diagram only relevant dependencies and resolve to new 2NF relations. DO NOT jump ahead to 3NF!
3. Are the relations created in #2 already in 3NF?
    How do you know?
    If not 3NF, diagram only relevant dependencies and resolve to new 3NF relations.
Answers
1. Is the table a relation (in 1NF)? How do you know? If not, resolve.
    Yes, it is in 1NF. It (presumably) has unique field names, unique rows, no composite or multi-valued fields, and no ordered rows or columns.
2. Is the original/corrected relation in 2NF?
    NO, it is NOT in 2NF, because there are partial dependencies
    Hint: When an entity has a composite PK, check for partial dependencies
3. Are the relations created in #2 already in 3NF?
    No, because there are transitive dependencies.
Removing partial dependencies
Partial dependencies
 Room_No → Room_Quality, Beds
 Arrive_Date → Mgr_ID, Mgr_Name, Mgr_Phone
2NF Tables
 ROOM (Room_No, Room_Quality, Beds)
 SCHEDULE (Arrive_Date, Mgr_ID, Mgr_Name, Mgr_Phone)
 RESERVATION (Room_No (FK), Arrive_Date (FK), Depart_Date, Cust_No, Last_Name, No_Party), replacing the original RESERVATION (Room_No, Room_Quality, Beds, Arrive_Date, Depart_Date, Mgr_ID, Cust_No, Last_Name, No_Party, Mgr_Name, Mgr_Phone)
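Spotting the partial dependencies is mechanical once the dependencies are written down: a dependency is partial when its determinant is a proper subset of the composite primary key. The sketch below assumes the composite key (Room_No, Arrive_Date) implied by the business rules; `partial_dependencies` is a hypothetical helper name.

```python
def partial_dependencies(primary_key, fds):
    """Return the dependencies whose determinant is a proper subset of the primary key."""
    key = set(primary_key)
    return [(det, deps) for det, deps in fds if set(det) < key]


# Dependencies in the original RESERVATION relation, as read from the business rules.
fds = [
    (("Room_No", "Arrive_Date"), ("Depart_Date", "Cust_No", "Last_Name", "No_Party")),
    (("Room_No",), ("Room_Quality", "Beds")),
    (("Arrive_Date",), ("Mgr_ID", "Mgr_Name", "Mgr_Phone")),
    (("Cust_No",), ("Last_Name",)),
    (("Mgr_ID",), ("Mgr_Name", "Mgr_Phone")),
]

partial = partial_dependencies(("Room_No", "Arrive_Date"), fds)
```

Only the Room_No and Arrive_Date dependencies are reported. Cust_No and Mgr_ID are not part of the key, so their dependencies are left for the 3NF step.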
Removing transitive dependencies
Transitive dependencies
 Are there TDs in ROOM? No
 Are there TDs in SCHEDULE?
   Mgr_ID → Mgr_Name, Mgr_Phone
 Are there TDs in RESERVATION?
   Cust_No → Last_Name
3NF Tables
 ROOM (Room_No, Room_Quality, Beds)
 SCHEDULE (Arrive_Date, Mgr_ID (FK)), replacing SCHEDULE (Arrive_Date, Mgr_ID, Mgr_Name, Mgr_Phone)
 MANAGER (Mgr_ID, Mgr_Name, Mgr_Phone)
 RESERVATION (Room_No (FK), Arrive_Date (FK), Depart_Date, Cust_No (FK), No_Party), replacing RESERVATION (Room_No (FK), Arrive_Date (FK), Depart_Date, Cust_No, Last_Name, No_Party)
 CUSTOMER (Cust_No, Last_Name)
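The same questions can be asked programmatically for each 2NF relation. The sketch below is a simplification under stated assumptions: it treats the primary key as the only candidate key and only reports determinants made up entirely of nonkey attributes. `transitive_dependencies` is a hypothetical helper name.

```python
def transitive_dependencies(relation_attrs, primary_key, fds):
    """Return dependencies inside one relation whose determinant is made of nonkey attributes."""
    attrs, key = set(relation_attrs), set(primary_key)
    found = []
    for det, deps in fds:
        det_set = set(det)
        inside = tuple(d for d in deps if d in attrs)
        # The determinant must live in this relation, contain no key attribute,
        # and determine at least one attribute that is also in this relation.
        if det_set <= attrs and not (det_set & key) and inside:
            found.append((det, inside))
    return found


fds = [
    (("Arrive_Date",), ("Mgr_ID",)),
    (("Mgr_ID",), ("Mgr_Name", "Mgr_Phone")),
    (("Cust_No",), ("Last_Name",)),
]

room_td = transitive_dependencies(
    ["Room_No", "Room_Quality", "Beds"], ["Room_No"], fds)
schedule_td = transitive_dependencies(
    ["Arrive_Date", "Mgr_ID", "Mgr_Name", "Mgr_Phone"], ["Arrive_Date"], fds)
reservation_td = transitive_dependencies(
    ["Room_No", "Arrive_Date", "Depart_Date", "Cust_No", "Last_Name", "No_Party"],
    ["Room_No", "Arrive_Date"], fds)
```

The results match the slide: ROOM has none, SCHEDULE has Mgr_ID → Mgr_Name, Mgr_Phone, and RESERVATION has Cust_No → Last_Name.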
C Normalization Practice
 Consider the following data blob (developed by someone who didn’t know as much about database design as you do):
   ITEM (Item_No, Description, {Vendor}, {Vendor_City}, {Item_Cost})
 You may assume that Vendor names are unique

Item_No  Description  Vendor              Vendor_City  Item_Cost
54321    Wagon        Quality Toys, Inc.  Toronto      25.00
                      Acme Toys, Inc.     Winnipeg     23.00
33303    Skateboard   Quality Toys, Inc.  Toronto      19.00
                      Acme Toys, Inc.     Winnipeg     21.00
                      Sporty Toys, Inc.   Toronto      23.00
Answer the following questions
Questions
1. Is the table a relation (in 1NF)? How do you know? If not, resolve.
    DO NOT jump ahead to 2NF!
Answers
1. Is the table a relation (in 1NF)? How do you know? If not, resolve.
    NO, it is NOT in 1NF. It has unique field names, unique rows, and no ordered rows or columns. However, it has multi-valued fields.
Convert to 1NF
Steps
 See modifications to table
 ITEM (Item_No, Description, Vendor, Vendor_City, Item_Cost)
1NF Table

Item_No  Description  Vendor              Vendor_City  Item_Cost
54321    Wagon        Quality Toys, Inc.  Toronto      25.00
54321    Wagon        Acme Toys, Inc.     Winnipeg     23.00
33303    Skateboard   Quality Toys, Inc.  Toronto      19.00
33303    Skateboard   Acme Toys, Inc.     Winnipeg     21.00
33303    Skateboard   Sporty Toys, Inc.   Toronto      23.00

2. Is the corrected 1NF relation in 2NF?
    Hint: When an entity has a composite PK, check for partial dependencies
    NO, it is NOT in 2NF, because there are partial dependencies
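The conversion to 1NF repeats the single-valued fields once for every entry of the repeating group. A minimal sketch using the ITEM data from the slide follows; the nested `Vendors` key and the helper name `to_1nf` are assumptions made for the illustration.

```python
def to_1nf(records, repeating_group):
    """Expand each record's repeating group into one flat row per group entry."""
    rows = []
    for record in records:
        # Single-valued fields are copied into every generated row.
        fixed = {k: v for k, v in record.items() if k != repeating_group}
        for entry in record[repeating_group]:
            rows.append({**fixed, **entry})
    return rows


# The unnormalized ITEM data, with the repeating group held in a nested list.
items = [
    {"Item_No": 54321, "Description": "Wagon", "Vendors": [
        {"Vendor": "Quality Toys, Inc.", "Vendor_City": "Toronto", "Item_Cost": 25.00},
        {"Vendor": "Acme Toys, Inc.", "Vendor_City": "Winnipeg", "Item_Cost": 23.00},
    ]},
    {"Item_No": 33303, "Description": "Skateboard", "Vendors": [
        {"Vendor": "Quality Toys, Inc.", "Vendor_City": "Toronto", "Item_Cost": 19.00},
        {"Vendor": "Acme Toys, Inc.", "Vendor_City": "Winnipeg", "Item_Cost": 21.00},
        {"Vendor": "Sporty Toys, Inc.", "Vendor_City": "Toronto", "Item_Cost": 23.00},
    ]},
]

flat = to_1nf(items, "Vendors")
```

The result has five flat rows, one per item and vendor pair, which is why the primary key of the 1NF relation becomes the composite (Item_No, Vendor).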
Convert to 2NF
Partial dependencies
 Item_No → Description
 Vendor → Vendor_City
1NF relation
 ITEM (Item_No, Description, Vendor, Vendor_City, Item_Cost)
2NF relations
 ITEM (Item_No, Description)
 VENDOR (Vendor, Vendor_City)
 COST (Item_No (FK), Vendor (FK), Item_Cost)
Convert to 3NF
Questions
3. Are the 2NF relations created already in 3NF?
    How do you know?
    If not 3NF, diagram only relevant dependencies and resolve to new 3NF relations.
Answer
3. Yes, all 2NF relations are also in 3NF, because each table has only one non-key field, so there are no transitive dependencies.
