Data Normalization

The document discusses database normalization and design. It begins by outlining the objectives of normalization, including minimizing redundancy and protecting data integrity. It then defines key concepts like tables, attributes, and primary keys. The document explains the goals and basic rules of database normalization, including atomic values and functional dependencies. It provides examples to illustrate normalizing a table to eliminate repeating values and define relationships between tables.

Database Management Systems

Data Normalization

1
Objectives

• Why is database design important?
• What is a table and how do you choose keys?
• What are the fundamental rules of database normalization?
• How do you begin analyzing a form to create normalized tables?
• How do you create a design in first normal form?
• What is second normal form?
• What is third normal form?
• What problems exist beyond third normal form?
• How do business rules change the database design?
• What problems arise when converting a class diagram to normalized tables?
• What tables are needed for Sally’s Pet Store?
• How do you combine tables from multiple forms and many developers?
• How do you record the details for all of the columns and tables?

2
Why Normalization?

• Need standardized data definition
  • Advantages of DBMS require careful design
  • Define data correctly and the rest is much easier
  • It especially makes it easier to expand database later
• Method applies to most models and most DBMS
  • Similar to Entity-Relationship
  • Similar to Objects (without inheritance and methods)
• Goal: Define tables carefully
  • Save space
  • Minimize redundancy
  • Protect data

3
Definitions

• Relational database: A collection of tables.
• Table: A collection of columns (attributes) describing an entity. Individual objects are stored as rows of data in the table.
• Property (attribute): A characteristic or descriptor of a class or entity.
• Every table has a primary key.
  • The smallest set of columns that uniquely identifies any row
  • Primary keys can span more than one column (concatenated keys)
  • We often create a primary key to ensure uniqueness (e.g., CustomerID, Product#, . . .), called a surrogate key.

Class: Employee (EmployeeID is the primary key, the other columns are properties, and each row is one object)

EmployeeID  TaxpayerID   LastName   FirstName  HomePhone       Address
12512       888-22-5552  Cartom     Abdul      (603) 323-9893  252 South Street
15293       222-55-3737  Venetiaan  Roland     (804) 888-6667  937 Paramaribo Lane
22343       293-87-4343  Johnson    John       (703) 222-9384  234 Main Street
29387       837-36-2933  Stenheim   Susan      (410) 330-9837  8934 W. Maple

4
Keys

• Primary key
  • Every table (object) must have a primary key
  • Uniquely identifies a row (one-to-one)
• Concatenated (or composite) key
  • Multiple columns needed for primary key
  • Identify repeating relationships (1 : M or M : N)
• Key columns are underlined
• First step
  • Collect user documents
  • Identify possible keys: unique or repeating relationships

5
Notation

Customer(CustomerID, Phone, Name, Address, City, State, ZipCode)

The table name comes first, followed by the table columns in parentheses. Primary key is underlined.

CustomerID  Phone         LastName    FirstName  Address             City               State  Zipcode
1           502-666-7777  Johnson     Martha     125 Main Street     Alvaton            KY     42122
2           502-888-6464  Smith       Jack       873 Elm Street      Bowling Green      KY     42101
3           502-777-7575  Washington  Elroy      95 Easy Street      Smith’s Grove      KY     42171
4           502-333-9494  Adams       Samuel     746 Brown Drive     Alvaton            KY     42122
5           502-474-4746  Rabitz      Victor     645 White Avenue    Bowling Green      KY     42102
6           616-373-4746  Steinmetz   Susan      15 Speedway Drive   Portland           TN     37148
7           615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
8           615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
9           502-222-4351  Chavez      Juan       673 Industry Blvd.  Caneyville         KY     42721
10          502-444-2512  Rojo        Maria      88 Main Street      Cave City          KY     42127

6
Identifying Key Columns

Orders

OrderID  Date    Customer
8367     5-5-10  6794
8368     5-6-10  9263

Each order has only one customer. So Customer is not part of the key.

OrderItems

OrderID  Item  Quantity
8367     229   2
8367     253   4
8367     876   1
8368     555   4
8368     229   1

Each order has many items. Each item can appear on many orders. So OrderID and Item are both part of the key.
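
The key choice above can be tried out with a small sketch. This one uses Python's built-in sqlite3 module rather than Access, Oracle, or SQL Server, and it names the first table Orders (an assumption, chosen because ORDER is a reserved word in SQL). The composite primary key lets item 229 appear on two orders but not twice on the same order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (
    OrderID   INTEGER PRIMARY KEY,   -- one row per order
    OrderDate TEXT,
    Customer  INTEGER                -- one customer per order, so not part of the key
);
CREATE TABLE OrderItems (
    OrderID  INTEGER REFERENCES Orders(OrderID),
    Item     INTEGER,
    Quantity INTEGER,
    PRIMARY KEY (OrderID, Item)      -- many items per order, many orders per item
);
""")
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(8367, "5-5-10", 6794), (8368, "5-6-10", 9263)])
conn.executemany("INSERT INTO OrderItems VALUES (?, ?, ?)",
                 [(8367, 229, 2), (8367, 253, 4), (8367, 876, 1),
                  (8368, 555, 4), (8368, 229, 1)])

# A second row for item 229 on order 8367 violates the composite key.
try:
    conn.execute("INSERT INTO OrderItems VALUES (8367, 229, 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

items_on_8367 = conn.execute(
    "SELECT COUNT(*) FROM OrderItems WHERE OrderID = 8367").fetchone()[0]
```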

7
Surrogate Keys

• Real world keys sometimes cause problems in a database.
• Example: Customer
  • Avoid phone numbers: people may not notify you when numbers change.
  • Avoid SSN (privacy, and most businesses are not authorized to ask for verification, so you could end up with duplicate values)
• Often best to let the DBMS generate unique values
  • Access: AutoNumber
  • SQL Server: Identity
  • Oracle: Sequences (but require additional programming)
• Drawback: Numbers are not related to any business data, so the application needs to hide them and provide other look up mechanisms.
• Distributed database: GUID
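
As a rough illustration of generated keys, the sketch below uses SQLite's AUTOINCREMENT (standing in for AutoNumber, Identity, or a sequence) and Python's uuid module for a GUID. The Customer columns are taken from the earlier Notation slide; the use of SQLite and the new phone number are assumptions made only for the example.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,  -- value generated by the DBMS
    LastName   TEXT,
    Phone      TEXT
)""")

insert = "INSERT INTO Customer (LastName, Phone) VALUES (?, ?)"
first_id = conn.execute(insert, ("Johnson", "502-666-7777")).lastrowid
second_id = conn.execute(insert, ("Smith", "502-888-6464")).lastrowid

# The phone number can change without touching the key (hypothetical new number).
conn.execute("UPDATE Customer SET Phone = ? WHERE CustomerID = ?",
             ("502-000-0000", first_id))
phone_after_update = conn.execute(
    "SELECT Phone FROM Customer WHERE CustomerID = ?", (first_id,)).fetchone()[0]

# In a distributed database, a GUID avoids coordinating counters across sites.
guid_key = str(uuid.uuid4())
```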

8
Common Order System

[Class diagram: Customer 1 to * Order; Salesperson 1 to * Order; Order 1 to * OrderItem; Item 1 to * OrderItem]

Customer(CustomerID, Name, Address, City, Phone)
Salesperson(EmployeeID, Name, Commission, DateHired)
Order(OrderID, OrderDate, CustomerID, EmployeeID)
OrderItem(OrderID, ItemID, Quantity)
Item(ItemID, Description, ListPrice)

9
Database Normalization Rules

1. Each cell in a table contains atomic (single-valued) data.
2. Each non-key column depends on all of the primary key columns (not just some of the columns).
3. Each non-key column depends on nothing outside of the key columns.

10
Atomic Values for Phone Numbers

CustomerID LastName FirstName Phone Business CellPhone


15023 Jones Mary 222-3034 222-4094 223-0984
63478 Sanchez Miguel 030-9693 403-4094
94552 O’Reilly Madeline 849-4948 292-3332 139-3831
45791 Stein Marta 294-4421
49004 Brise Mer 764-5103

11
Repeating Values for Phone Numbers

CustomerID LastName FirstName Phone


15023 Jones Mary 222-3034
222-4094
223-0984
63478 Sanchez Miguel 030-9693
403-4094
94552 O’Reilly Madeline 849-4948
292-3332
139-3831
339-4040
45791 Stein Marta 294-4421
49004 Brise Mer 764-5103

12
Repeating Values for Phone Numbers

CustomerID  LastName  FirstName
15023       Jones     Mary
63478       Sanchez   Miguel
94552       O’Reilly  Madeline
45791       Stein     Marta
49004       Brise     Mer

CustomerID  PhoneType  Phone
15023       Land       222-3034
15023       Skype      222-4094
15023       Cell       223-0984
63478       Land       030-9693
63478       Skype      403-4094
94552       Land       849-4948
94552       Skype      292-3332
94552       Cell       139-3831
94552       Business   339-4040
45791       Land       294-4421
49004       Land       764-5103
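
A minimal sketch of the split, using Python's sqlite3 module. The table name CustomerPhone and the choice of (CustomerID, PhoneType) as its key are assumptions; the slide does not name the phone table, and its underlining did not survive. Each customer can now have any number of phones without adding columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    LastName   TEXT,
    FirstName  TEXT
);
CREATE TABLE CustomerPhone (           -- hypothetical table name
    CustomerID INTEGER REFERENCES Customer(CustomerID),
    PhoneType  TEXT,
    Phone      TEXT,
    PRIMARY KEY (CustomerID, PhoneType) -- assumed key: one number per type
);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (15023, "Jones", "Mary"), (63478, "Sanchez", "Miguel"),
    (94552, "O’Reilly", "Madeline"), (45791, "Stein", "Marta"),
    (49004, "Brise", "Mer")])
conn.executemany("INSERT INTO CustomerPhone VALUES (?, ?, ?)", [
    (15023, "Land", "222-3034"), (15023, "Skype", "222-4094"),
    (15023, "Cell", "223-0984"), (63478, "Land", "030-9693"),
    (63478, "Skype", "403-4094"), (94552, "Land", "849-4948"),
    (94552, "Skype", "292-3332"), (94552, "Cell", "139-3831"),
    (94552, "Business", "339-4040"), (45791, "Land", "294-4421"),
    (49004, "Land", "764-5103")])

# Count the phone rows for each customer.
phones_per_customer = dict(conn.execute(
    "SELECT CustomerID, COUNT(*) FROM CustomerPhone GROUP BY CustomerID"))
```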

13
Simple Form

Customer ID Company Name


City
Contact LastName, FirstName
Phone

14
Initial Design

Customer(CustomerID, CompanyName, City)


Contact(ContactID, CustomerID, LastName, FirstName)

15
Sample Database for Sales

Sale ID Date

Customer
First Name
Last Name
Address
City, State ZIPCode
ItemID Description List Price Quantity QOH Value

Total

19
Initial Objects

Initial Object  Key                Sample Properties
Customer        Assign CustomerID  Name, Address, Phone
Item            Assign ItemID      Description, List Price, Quantity On Hand
Sale            Assign SaleID      Sale Date
SaleItems       SaleID + ItemID    Quantity

20
Initial Form Evaluation

SaleForm(SaleID, SaleDate, CustomerID, FirstName, LastName,
Address, City, State, ZIPCode,
(ItemID, Description, ListPrice, Quantity, QuantityOnHand) )

Identify potential keys.
Identify repeating groups.

[Form: Sale ID, Date; Customer: First Name, Last Name, Address, City, State ZIPCode; item lines: ItemID, Description, List Price, Quantity, QOH, Value; Total]

21
Problems with Repeating Sections

SaleForm(SaleID, SaleDate, CustomerID, FirstName, LastName, Address, City, State, ZIPCode,
(ItemID, Description, ListPrice, Quantity, QuantityOnHand) )

Callouts: Repeating section; Duplication; Not atomic

SaleID  Date  CID    FirstName  LastName  Address  City     State  ZIP    ItemID  Description  ListPrice  Quantity  QOH
11851   7/15  15023  Mary       Jones     111 Elm  Chicago  IL     60601  15      Air Tank     192.00     2         15
                                                                          27      Regulator    251.00     1         5
                                                                          32      Mask 1557    65.00      1         6
11852   7/15  63478  Miguel     Sanchez   222 Oro  Madrid                 15      Air Tank     192.00     4         15
                                                                          33      Mask 2020    91.00      1         3
11853   7/16  15023  Mary       Jones     111 Elm  Chicago  IL     60601  41      Snorkel 71   44.00      2         15
                                                                          75      Wet suit-S   215.00     1         3
11854   7/17  94552  Madeline   O’Reilly  333 Tam  Dublin                 75      Wet suit-S   215.00     2         3
                                                                          32      Mask 1557    65.00      1         6
                                                                          57      Snorkel 95   83.00      1         17

22
First Normal Form

SaleForm(SaleID, SaleDate, CustomerID, FirstName, LastName, Address, City, State, ZIPCode,
(ItemID, Description, ListPrice, Quantity, QuantityOnHand) )

SaleForm2(SaleID, SaleDate, CustomerID, FirstName, LastName, Address, City, State, ZIPCode)

SaleLine(SaleID, ItemID, Description, ListPrice, Quantity, QuantityOnHand)

23
Current Design

SaleLine table is clearly wrong because it contains both generated and non-generated key columns.

24
Multiple Repeating: Independent Groups

FormA(Key1, Simple Columns, (Group1, A, B, C), (Group2, X, Y) )

MainTable(Key1, Simple Columns)

Group1(Key1, Group1, A, B, C) Group2(Key1, Group2, X, Y)

25
Nested Repeating Sections

Table (Key1, . . . (Key2, . . . (Key3, . . .) ) )

Table1(Key1, . . .)    TableA (Key1, Key2 . . . (Key3, . . .) )

Table2 (Key1, Key2 . . .)    Table3 (Key1, Key2, Key3, . . .)

• Nested: Table (Key1, aaa. . . (Key2, bbb. . . (Key3, ccc. . .) ) )
• First Normal Form (1NF)
  • Table1(Key1, aaa . . .)
  • Table2(Key1, Key2, bbb . .)
  • Table3(Key1, Key2, Key3, ccc. . .)
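
The nested split can be sketched in a few lines of Python. The function name, the dictionary layout (group2, group3), and the sample values are all hypothetical; the point is only that each level carries the keys of every parent level down into its own flat table.

```python
def to_first_normal_form(rows):
    """Split Table(Key1, aaa (Key2, bbb (Key3, ccc))) into three flat tables."""
    table1, table2, table3 = [], [], []
    for r1 in rows:
        table1.append((r1["Key1"], r1["aaa"]))
        for r2 in r1["group2"]:
            # The parent key travels with the repeating section.
            table2.append((r1["Key1"], r2["Key2"], r2["bbb"]))
            for r3 in r2["group3"]:
                # The innermost section needs the keys of both parents.
                table3.append((r1["Key1"], r2["Key2"], r3["Key3"], r3["ccc"]))
    return table1, table2, table3


# Hypothetical nested data: one outer row, two middle rows, two inner rows.
nested = [
    {"Key1": 1, "aaa": "a1", "group2": [
        {"Key2": 10, "bbb": "b1", "group3": [
            {"Key3": 100, "ccc": "c1"},
            {"Key3": 101, "ccc": "c2"}]},
        {"Key2": 11, "bbb": "b2", "group3": []}]},
]

table1, table2, table3 = to_first_normal_form(nested)
```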

26
First Normal Form Problems (Data)

SaleLine(SaleID, ItemID, Description, ListPrice, Quantity, QuantityOnHand)


Duplication for columns that depend only on ItemID

SaleID ItemID Description ListPrice Quantity QOH


11851 15 Air Tank 192.00 2 15
11851 27 Regulator 251.00 1 5
11851 32 Mask 1557 65.00 1 6
11852 15 Air Tank 192.00 4 15
11852 33 Mask 2020 91.00 1 3
11853 41 Snorkel 71 44.00 2 15
11853 75 Wet suit-S 215.00 1 3
11854 75 Wet suit-S 215.00 2 3
11854 32 Mask 1557 65.00 1 6
11854 57 Snorkel 95 83.00 1 17

27
Second Normal Form Definition

SaleLine(SaleID, ItemID, Description, ListPrice, Quantity, QuantityOnHand)

Quantity depends on both SaleID and ItemID.
Description, ListPrice, and QuantityOnHand depend only on ItemID.

• Each non-key column must depend on the entire key.
  • Only applies to concatenated keys
  • Some columns only depend on part of the key
  • Split those into a new table.
• Dependence (definition)
  • If given a value for the key you always know the value of the property in question, then that property is said to depend on the key.
  • If you change part of a key and the questionable property does not change, then the table is not in 2NF.
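
The dependence test can be run against the SaleLine sample rows from the previous slide. This is only a sketch: the helper depends_on is a hypothetical name, and sample data can show that a dependency is plausible but never prove it, since dependencies come from the business rules.

```python
def depends_on(rows, determinant, column):
    """True if each value of `determinant` maps to exactly one value of `column`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        if seen.setdefault(key, row[column]) != row[column]:
            return False
    return True


columns = ("SaleID", "ItemID", "Description", "ListPrice", "Quantity", "QOH")
data = [
    (11851, 15, "Air Tank", 192.00, 2, 15),
    (11851, 27, "Regulator", 251.00, 1, 5),
    (11851, 32, "Mask 1557", 65.00, 1, 6),
    (11852, 15, "Air Tank", 192.00, 4, 15),
    (11852, 33, "Mask 2020", 91.00, 1, 3),
    (11853, 41, "Snorkel 71", 44.00, 2, 15),
    (11853, 75, "Wet suit-S", 215.00, 1, 3),
    (11854, 75, "Wet suit-S", 215.00, 2, 3),
    (11854, 32, "Mask 1557", 65.00, 1, 6),
    (11854, 57, "Snorkel 95", 83.00, 1, 17),
]
sale_line = [dict(zip(columns, values)) for values in data]

key = ("SaleID", "ItemID")
# Non-key columns that are already determined by just one part of the key.
partial = [c for c in columns
           if c not in key
           and any(depends_on(sale_line, (part,), c) for part in key)]
```

In this data only Quantity needs the whole key, so the other three columns are candidates to move into their own table.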

28
Second Normal Form Example

SaleLine(SaleID, ItemID, Description, ListPrice, Quantity, QuantityOnHand)

SaleItems(SaleID, ItemID, Quantity)

Item(ItemID, Description, ListPrice, QuantityOnHand)

29
Second Normal Form Example (Data)

SaleItems(SaleID, ItemID, Quantity)

SaleID  ItemID  Quantity
11851   15      2
11851   27      1
11851   32      1
11852   15      4
11852   33      1
11853   41      2
11853   75      1
11854   75      2
11854   32      1
11854   57      1

Item(ItemID, Description, ListPrice, QuantityOnHand)

ItemID  Description  ListPrice  QOH
15      Air Tank     192.00     15
27      Regulator    251.00     5
32      Mask 1557    65.00      6
33      Mask 2020    91.00      3
41      Snorkel 71   44.00      15
57      Snorkel 95   83.00      17
75      Wet suit-S   215.00     3
77      Wet suit-M   215.00     7
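
To check that nothing was lost in the split, the two tables can be joined back together. The sketch below loads the sample rows into an in-memory SQLite database (an assumption; any SQL DBMS would do) and rebuilds the original SaleLine rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (
    ItemID INTEGER PRIMARY KEY,
    Description TEXT, ListPrice REAL, QuantityOnHand INTEGER
);
CREATE TABLE SaleItems (
    SaleID INTEGER,
    ItemID INTEGER REFERENCES Item(ItemID),
    Quantity INTEGER,
    PRIMARY KEY (SaleID, ItemID)
);
""")
conn.executemany("INSERT INTO Item VALUES (?, ?, ?, ?)", [
    (15, "Air Tank", 192.00, 15), (27, "Regulator", 251.00, 5),
    (32, "Mask 1557", 65.00, 6), (33, "Mask 2020", 91.00, 3),
    (41, "Snorkel 71", 44.00, 15), (57, "Snorkel 95", 83.00, 17),
    (75, "Wet suit-S", 215.00, 3), (77, "Wet suit-M", 215.00, 7)])
conn.executemany("INSERT INTO SaleItems VALUES (?, ?, ?)", [
    (11851, 15, 2), (11851, 27, 1), (11851, 32, 1), (11852, 15, 4),
    (11852, 33, 1), (11853, 41, 2), (11853, 75, 1), (11854, 75, 2),
    (11854, 32, 1), (11854, 57, 1)])

# Joining on ItemID recreates the SaleLine rows of the 1NF design.
sale_line = conn.execute("""
    SELECT s.SaleID, s.ItemID, i.Description, i.ListPrice,
           s.Quantity, i.QuantityOnHand
    FROM SaleItems s JOIN Item i ON i.ItemID = s.ItemID
    ORDER BY s.SaleID, s.ItemID""").fetchall()

# Each description is now stored once, however often the item is sold.
air_tank_rows = conn.execute(
    "SELECT COUNT(*) FROM Item WHERE Description = 'Air Tank'").fetchone()[0]
```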

30
Second Normal Form in DB Design

31
Second Normal Form Problems (Data)

SaleForm2(SaleID, SaleDate, CustomerID, FirstName, LastName, Address, City, State, ZIPCode)

SaleID Date CustomerID FirstName LastName Address City State ZIP


11851 7/15 15023 Mary Jones 111 Elm Chicago IL 60601
11852 7/15 63478 Miguel Sanchez 222 Oro Madrid
11853 7/16 15023 Mary Jones 111 Elm Chicago IL 60601
11854 7/17 94552 Madeline O’Reilly 333 Tam Dublin

Duplication

32
Third Normal Form Definition

SaleForm2(SaleID, SaleDate, CustomerID, FirstName, LastName, Address, City, State, ZIPCode)

SaleDate and CustomerID depend on SaleID.

FirstName, LastName, Address, City, State, and ZIPCode depend on CustomerID.

33
Third Normal Form Example

SaleForm2(SaleID, SaleDate, CustomerID, FirstName, LastName, Address, City, State, ZIPCode)

Sale(SaleID, SaleDate, CustomerID)


SaleID Date CustomerID
11851 7/15 15023
11852 7/15 63478
11853 7/16 15023
11854 7/17 94552

Customer(CustomerID, FirstName, LastName, Address, City, State, ZIPCode)


CustomerID FirstName LastName Address City State ZIP
15023 Mary Jones 111 Elm Chicago IL 60601
63478 Miguel Sanchez 222 Oro Madrid
94552 Madeline O’Reilly 333 Tam Dublin
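
The same split can be expressed as a short Python sketch over the SaleForm2 sample rows. The blank State and ZIP cells in the sample are represented as None, which is an assumption about how the gaps would be stored.

```python
# SaleForm2 rows: SaleID, SaleDate, CustomerID, FirstName, LastName,
# Address, City, State, ZIPCode (None where the sample data is blank).
sale_form2 = [
    (11851, "7/15", 15023, "Mary", "Jones", "111 Elm", "Chicago", "IL", "60601"),
    (11852, "7/15", 63478, "Miguel", "Sanchez", "222 Oro", "Madrid", None, None),
    (11853, "7/16", 15023, "Mary", "Jones", "111 Elm", "Chicago", "IL", "60601"),
    (11854, "7/17", 94552, "Madeline", "O’Reilly", "333 Tam", "Dublin", None, None),
]

# Sale keeps only the columns that depend on SaleID.
sale = [row[:3] for row in sale_form2]

# Customer keeps the columns that depend on CustomerID, stored once per customer.
customer = {row[2]: row[3:] for row in sale_form2}
```

Mary Jones appears on two sales but only once in customer, which removes the duplication flagged on the earlier slide.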

34
Third Normal Form Tables

35
Third Normal Form Tables

Customer(CustomerID, FirstName, LastName, Address, City, State, ZIPCode)

Sale(SaleID, SaleDate, CustomerID)

SaleItems(SaleID, ItemID, Quantity)

Item(ItemID, Description, ListPrice, QuantityOnHand)

36
3NF Rules/Procedure

• Split out repeating sections
  • Be sure to include a key from the parent section in the new piece so the two parts can be recombined.
• Verify that the keys are correct
  • Is each row uniquely identified by the primary key?
  • Are one-to-many and many-to-many relationships correct?
  • Check “many” for keyed columns and “one” for non-key columns.
• Make sure that each non-key column depends on the whole key and nothing but the key.
  • No hidden dependencies.

37
Checking Your Work (Quality Control)

• Look for one-to-many relationships.
  • Many side should be keyed (underlined).
  • e.g., VideosRented(TransID, VideoID, . . .).
  • Check each column and ask if it should be 1 : 1 or 1 : M.
  • If you add a key, renormalize.
• Verify no repeating sections (1NF)
• Check 3NF
  • Check each column and ask:
  • Does it depend on the whole key and nothing but the key?
• Verify that the tables can be reconnected (joined) to form the original tables (draw lines).
• Each table represents one object.
• Enter sample data--look for replication.

38
Boyce-Codd Normal Form (BCNF)

Employee-Specialty(EID, Specialty, Manager)

• Business rules.
  • Each employee may have many specialties.
  • Each specialty has many managers.
  • Employee has only one manager for each specialty.
  • Each manager has only one specialty.

Employee(EID, Manager)
Manager(Manager, Specialty)

39
Fourth Normal Form (Keys)

EmployeeTasks(EID, Specialty, ToolID)

• Business rules.
  • Each employee has many specialties.
  • Each employee has many tools.
  • Tools and specialties are unrelated.

EmployeeSpecialty(EID, Specialty)
EmployeeTools(EID, ToolID)

40
Domain-Key Normal Form (DKNF)

EmployeeTask(EmployeeID, TaskID, Tool)

Each employee performs many tasks with many tools.

Each Tool can be used for many tasks, or
Each task has a standard list of tools (but some employees may use others)

Need to add:
RequiredTools(TaskID, ToolID)

But, it is really BCNF: TaskID → ToolID is hidden.

But, this dependency might not have been explicitly stated.

41
No Hidden Dependencies

• The simple normalization rules:
  • Remove repeating sections
  • Each non-key column must depend on the whole key and nothing but the key.
  • There must be no hidden dependencies.
• Solution: Split the table.
  • Make sure you can rejoin the two pieces to recreate the original data relationships.
  • For some hidden dependencies within keys, double-check the business assumption to be sure that it is realistic. Sometimes you are better off with a more flexible assumption.

42
Data Rules and Integrity

• Simple business rules
  • Limits on data ranges
    • Price > 0
    • Salary < 100,000
    • DateHired > 1/12/2005
  • Choosing from a set
    • Gender = M, F, Unknown
    • Jurisdiction = City, County, State, Federal
• Referential Integrity
  • Foreign key values in one table must exist in the master table.
  • Sale(SaleID, SaleDate, CID, …)
  • CID must exist in the customer table.

Sale

SaleID  SaleDate  CID  …
1173    Jan-04    321
1174    Jan-05    938
1185    Jan-08    337
1190    Jan-09    321
1192    Jan-09    776    (No data for this customer yet!)

Customer

CID  Name     Phone  …
321  Jones    9983-
337  Sanchez  7738-
938  Carson   8738-
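
Both kinds of rule can be handed to the DBMS. The sketch below uses Python's sqlite3 module, where foreign key enforcement has to be switched on with a PRAGMA; the Item table holding the Price > 0 rule is a hypothetical stand-in, since the slide does not say which table carries the price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Customer (CID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Sale (
    SaleID   INTEGER PRIMARY KEY,
    SaleDate TEXT,
    CID      INTEGER REFERENCES Customer(CID)   -- referential integrity
);
CREATE TABLE Item (                              -- hypothetical table for the range rule
    ItemID INTEGER PRIMARY KEY,
    Price  REAL CHECK (Price > 0)
);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?)",
                 [(321, "Jones"), (337, "Sanchez"), (938, "Carson")])


def rejected(sql):
    """Run one statement and report whether an integrity rule blocked it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


valid_sale_rejected = rejected("INSERT INTO Sale VALUES (1173, 'Jan-04', 321)")
orphan_sale_rejected = rejected("INSERT INTO Sale VALUES (1192, 'Jan-09', 776)")
bad_price_rejected = rejected("INSERT INTO Item VALUES (1, -5.0)")
```

Sale 1192 is refused because customer 776 has no row in Customer yet, which is the situation called out in the sample data.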

43
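SQL Foreign Key (Oracle, SQL Server)

-- ORDER is a reserved word, so the table name is quoted here.
-- NUMBER(9) is the Oracle type; SQL Server would use NUMERIC(9).
CREATE TABLE "Order"
( OID NUMBER(9) NOT NULL,
  Odate DATE,
  CID NUMBER(9),
  CONSTRAINT pk_Order PRIMARY KEY (OID),
  CONSTRAINT fk_OrderCustomer
    FOREIGN KEY (CID)
    REFERENCES Customer (CID)
    ON DELETE CASCADE
)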

44
Effect of Business Rules

Key business rules:


A player can play on only one team.
There is one referee per match.

45
Business Rules 1

There is one referee per match.


A player can play on only one team.

Match(MatchID, DatePlayed, Location, RefID)


Score(MatchID, TeamID, Score)
Referee(RefID, Phone, Address)
Team(TeamID, Name, Sponsor)
Player(PlayerID, Name, Phone, DoB, TeamID)
PlayerStats(MatchID, PlayerID, Points, Penalties)

RefID and TeamID are not keys in the Match and Player tables, because of the one-to-one rules.

46
Business Rules 2

There can be several referees per match.
A player can play on several teams (substitute),
but only on one team per match.

Match(MatchID, DatePlayed, Location, RefID)
Score(MatchID, TeamID, Score)
Referee(RefID, Phone, Address)
Team(TeamID, Name, Sponsor)
Player(PlayerID, Name, Phone, DoB, TeamID)
PlayerStats(MatchID, PlayerID, Points, Penalties)

To handle the many-to-many relationships, we need to make RefID and TeamID keys. But if you leave them in the same tables, the tables are not in 3NF. DatePlayed does not depend on RefID. Player Name does not depend on TeamID.

47
Business Rules 2: Normalized

There can be several referees per match.
A player can play on several teams (substitute),
but only on one team per match.

Match(MatchID, DatePlayed, Location)


RefereeMatch(MatchID, RefID)
Score(MatchID, TeamID, Score)
Referee(RefID, Phone, Address)
Team(TeamID, Name, Sponsor)
Player(PlayerID, Name, Phone, DoB)
PlayerStats(MatchID, PlayerID, TeamID, Points, Penalties)
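
A small sketch of how the keys carry these rules, using Python's sqlite3 module. It assumes the key of RefereeMatch is (MatchID, RefID) and the key of PlayerStats is (MatchID, PlayerID) with TeamID as a non-key column; the underlining is lost in this copy of the slide, so those keys are inferred from the stated rules. All IDs and scores are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RefereeMatch (
    MatchID INTEGER,
    RefID   INTEGER,
    PRIMARY KEY (MatchID, RefID)      -- several referees per match
);
CREATE TABLE PlayerStats (
    MatchID   INTEGER,
    PlayerID  INTEGER,
    TeamID    INTEGER,                -- non-key: one team per player per match
    Points    INTEGER,
    Penalties INTEGER,
    PRIMARY KEY (MatchID, PlayerID)
);
""")
# Hypothetical sample rows: two referees on match 100, and player 7
# playing for team 1 in match 100 and team 2 in match 101.
conn.executemany("INSERT INTO RefereeMatch VALUES (?, ?)", [(100, 1), (100, 2)])
conn.executemany("INSERT INTO PlayerStats VALUES (?, ?, ?, ?, ?)",
                 [(100, 7, 1, 12, 0), (101, 7, 2, 8, 1)])

# A second team for the same player in the same match violates the key.
try:
    conn.execute("INSERT INTO PlayerStats VALUES (100, 7, 2, 0, 0)")
    second_team_same_match_rejected = False
except sqlite3.IntegrityError:
    second_team_same_match_rejected = True

referees_on_match_100 = conn.execute(
    "SELECT COUNT(*) FROM RefereeMatch WHERE MatchID = 100").fetchone()[0]
teams_for_player_7 = conn.execute(
    "SELECT COUNT(DISTINCT TeamID) FROM PlayerStats WHERE PlayerID = 7").fetchone()[0]
```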

48
Converting a Class Diagram to Normalized Tables

[Class diagram: Supplier 1 to * Purchase Order; Employee 1 to * Purchase Order; Employee * to 1 Employee as Manager; Purchase Order * to * Item; Item has subtypes Raw Materials, Assembled Components, Office Supplies]

49
One-to-Many Relationships

[Class diagram: Supplier 1 to * Purchase Order; Employee 1 to * Purchase Order]

Supplier(SID, Name, Address, City, State, Zip, Phone)
Employee(EID, Name, Salary, Address, …)
PurchaseOrder(POID, Date, SID, EID)

The many side becomes a key (underlined).
Each PO has one supplier and employee. (Do not key SID or EID)
Each supplier can receive many POs. (Key PO)
Each employee can place many POs. (Key PO)

50
One-to-Many Sample Data
Supplier

Purchase Order
POID Date SID EID
22234 9-9-2004 5676 221
22235 9-10-2004 5676 554
22236 9-10-2004 7831 221
22237 9-11-2004 8872 335
Employee

51
Many-to-Many Relationships

PurchaseOrder(POID, Date, SID, EID)
POItem(POID, ItemID, Quantity, PricePaid)
Item(ItemID, Description, ListPrice)

[Class diagram: the association Purchase Order * to * Item becomes Purchase Order 1 to * POItem and Item 1 to * POItem]

Each POID can have many Items (key/underline ItemID).
Each ItemID can be on many POIDs (key POID).
Need the new intermediate table (POItem) because:
You cannot put ItemID into PurchaseOrder because Date, SID, and EID do not depend on the ItemID.
You cannot put POID into Item because Description and ListPrice do not depend on POID.

52
Many-to-Many Sample Data
Purchase Order
POID Date SID EID
22234 9/9 5676 221
22235 9/10 5676 554
22236 9/10 7831 221
22237 9/11 8872 335

POItem
POID ItemID Quantity Price
22234 444098 3 2.00
22234 444185 1 25.00
22235 444185 4 24.00
22236 555828 10 150.00
22236 555982 1 5800.00

Item
ItemID Description ListPrice
444098 Staples 2.00
444185 Paper 28.00
555828 Wire 158.00
555982 Sheet steel 5928.00
888371 Brake assembly 152.00

53
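
The intermediate table can be queried from either side. The sketch below loads the POItem and Item sample rows into SQLite through Python (an assumed setup) and follows the relationship in both directions, then totals each purchase order from Quantity and the price paid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, Description TEXT, ListPrice REAL);
CREATE TABLE POItem (
    POID      INTEGER,
    ItemID    INTEGER REFERENCES Item(ItemID),
    Quantity  INTEGER,
    PricePaid REAL,
    PRIMARY KEY (POID, ItemID)   -- both sides of the many-to-many are keyed
);
""")
conn.executemany("INSERT INTO Item VALUES (?, ?, ?)", [
    (444098, "Staples", 2.00), (444185, "Paper", 28.00),
    (555828, "Wire", 158.00), (555982, "Sheet steel", 5928.00),
    (888371, "Brake assembly", 152.00)])
conn.executemany("INSERT INTO POItem VALUES (?, ?, ?, ?)", [
    (22234, 444098, 3, 2.00), (22234, 444185, 1, 25.00),
    (22235, 444185, 4, 24.00), (22236, 555828, 10, 150.00),
    (22236, 555982, 1, 5800.00)])

# One item on many purchase orders.
orders_for_paper = [row[0] for row in conn.execute(
    "SELECT POID FROM POItem WHERE ItemID = 444185 ORDER BY POID")]

# One purchase order with many items.
items_on_22236 = [row[0] for row in conn.execute("""
    SELECT i.Description FROM POItem p JOIN Item i ON i.ItemID = p.ItemID
    WHERE p.POID = 22236 ORDER BY i.ItemID""")]

# Order value uses the price paid, not the list price.
totals = dict(conn.execute(
    "SELECT POID, SUM(Quantity * PricePaid) FROM POItem GROUP BY POID"))
```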
N-ary Associations
Employee
Name
...
1
*
1 * * 1
Component Assembly Product
CompID ProductID
Type Type
Name Name

Assembly
EmployeeID
CompID
ProductID

54
Generalization or Subtypes

[Class diagram: Item with subtypes Raw Materials, Assembled Components, Office Supplies]

Item(ItemID, Description, ListPrice)

RawMaterials(ItemID, Weight, StrengthRating)

AssembledComponents(ItemID, Width, Height, Depth)

OfficeSupplies(ItemID, BulkQuantity, Discount)

Add new tables for each subtype.
Use the same key as the generic type (ItemID)--one-to-one relationship.
Add the attributes specific to each subtype.

55
Subtypes Sample Data

Item
ItemID Description ListPrice
444098 Staples 2.00
444185 Paper 28.00
555828 Wire 158.00
555982 Sheet steel 5928.00
888371 Brake assembly 152.00

RawMaterials
ItemID Weight StrengthRating
555828 57 2000
555982 2578 8321

AssembledComponents
ItemID Width Height Depth
888371 1 3 1.5

OfficeSupplies
ItemID BulkQuantity Discount
444098 20 10%
444185 10 15%

56
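
Because each subtype table reuses ItemID as its key, the generic and specific attributes can be brought back together with a one-to-one join. A sketch with Python's sqlite3 module, loading only the RawMaterials and OfficeSupplies subtypes and storing the discount as a fraction (both simplifications are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, Description TEXT, ListPrice REAL);
CREATE TABLE RawMaterials (
    ItemID INTEGER PRIMARY KEY REFERENCES Item(ItemID),  -- same key as Item
    Weight INTEGER, StrengthRating INTEGER
);
CREATE TABLE OfficeSupplies (
    ItemID INTEGER PRIMARY KEY REFERENCES Item(ItemID),  -- same key as Item
    BulkQuantity INTEGER, Discount REAL
);
""")
conn.executemany("INSERT INTO Item VALUES (?, ?, ?)", [
    (444098, "Staples", 2.00), (444185, "Paper", 28.00),
    (555828, "Wire", 158.00), (555982, "Sheet steel", 5928.00),
    (888371, "Brake assembly", 152.00)])
conn.executemany("INSERT INTO RawMaterials VALUES (?, ?, ?)",
                 [(555828, 57, 2000), (555982, 2578, 8321)])
conn.executemany("INSERT INTO OfficeSupplies VALUES (?, ?, ?)",
                 [(444098, 20, 0.10), (444185, 10, 0.15)])

# Generic columns come from Item, subtype columns from RawMaterials.
raw_materials = conn.execute("""
    SELECT i.ItemID, i.Description, r.Weight
    FROM Item i JOIN RawMaterials r ON r.ItemID = i.ItemID
    ORDER BY i.ItemID""").fetchall()

office_supplies = [row[0] for row in conn.execute("""
    SELECT i.Description
    FROM Item i JOIN OfficeSupplies o ON o.ItemID = i.ItemID
    ORDER BY i.ItemID""")]
```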
Composition

[Class diagram: a Bicycle (Size, Model Type, …) is composed of Wheels, Crank, Stem, …]

Bicycle(SerialNumber, ModelType, WheelID, CrankID, StemID)
Components(ComponentID, Category, Description, Weight, Cost)

57
Recursive Relationships

[Class diagram: Employee * to 1 Employee, in the role of Manager]

Employee(EID, Name, Salary, Address, Manager)

Add a manager column that contains Employee IDs.
An employee can have only one manager. (Manager is not a key.)
A manager can supervise many employees. (EID is a key.)
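
A recursive relationship is queried by joining the table to itself. In the sketch below (Python with sqlite3), the employee IDs and last names come from the earlier Employee sample table, but the manager assignments are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Employee (
    EID     INTEGER PRIMARY KEY,
    Name    TEXT,
    Manager INTEGER REFERENCES Employee(EID)  -- holds another employee's EID
)""")
# Manager values are hypothetical; None marks an employee with no manager.
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", [
    (12512, "Cartom", None),
    (15293, "Venetiaan", 12512),
    (22343, "Johnson", 12512),
    (29387, "Stenheim", 15293)])

# Self-join: e is the employee, m is the same table read as the manager.
reports_to = dict(conn.execute("""
    SELECT e.Name, m.Name
    FROM Employee e LEFT JOIN Employee m ON m.EID = e.Manager"""))

direct_reports_of_cartom = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE Manager = 12512").fetchone()[0]
```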

58
Normalization Examples

• Possible topics
  • Auto repair
  • Auto sales
  • Department store
  • Hair stylist
  • HRM department
  • Law firm
  • Manufacturing
  • National Park Service
  • Personal stock portfolio
  • Pet shop
  • Restaurant
  • Social club
  • Sports team

59
Multiple Views & View Integration

• Collect multiple views
  • Documents
  • Reports
  • Input forms
• Create normalized tables from each view
• Combine the views into one complete model.
• Keep meta-data in a data dictionary
  • Type of data
  • Size
  • Volume
  • Usage
• Example
  • Federal Emergency Management Agency (FEMA). Disaster planning and relief.
  • Make business assumptions as necessary, but try to keep them simple.

60
The Pet Store: Sales Form

Sales
SaleID Date
Customer EmployeeID
Name Name
Address
City, State, ZIP

Animal Sale
ID Name Category Breed DoB Gender Reg. Color Donation Group

Animal SubTotal

Merchandise Sale
Item Description Category ListPrice SalePrice Quantity Value

Merchandise Subtotal
Tax
Total

Sales(SaleID, Date, CustomerID, Name, Address, City, State, Zip,
EmployeeID, Name, (AnimalID, Name, Category, Breed, DateOfBirth,
Gender, Registration, Color, Donation, AdoptionGroup), (ItemID,
Description, Category, ListPrice, SalePrice, Quantity))

61
The Pet Store: Purchase Merchandise

MerchandiseOrder(PONumber, OrderDate, ReceiveDate, SupplierID, Name,
Contact, Phone, Address, City, State, Zip, EmployeeID, Name, HomePhone,
(ItemID, Description, Category, Price, Quantity, QuantityOnHand), ShippingCost)

62
Pet Store Normalization

Sale(SaleID, Date, CustomerID, EmployeeID)


SaleItem(SaleID, ItemID, SalePrice, Quantity)
Customer(CustomerID, Name, Address, City, State, Zip)
Employee(EmployeeID, Name)
Animal(AnimalID, Name, Category, Breed, DateOfBirth,
Gender, Registration, Color, ListPrice, SaleID, Donation, AdoptionID)
Merchandise(ItemID, Description, Category, ListPrice)

MerchandiseOrder(PONumber, OrderDate, ReceiveDate, SID, EmpID, ShipCost)
MerchandiseOrderItem(PONumber, ItemID, Quantity, Cost)
Supplier(SupplierID, Name, Contact, Phone, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone)
Merchandise(ItemID, Description, Category, QuantityOnHand)

63
Pet Store View Integration

Sale(SaleID, Date, CustomerID, EmployeeID)


SaleItem(SaleID, ItemID, SalePrice, Quantity)
Customer(CustomerID, Name, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone, DateHired)
Animal(AnimalID, Name, Category, Breed, DateOfBirth,
Gender, Registration, Color, SaleID, AdoptionID, Donation)
Merchandise(ItemID, Description, Category, ListPrice, Cost, QuantityOnHand)

MerchandiseOrder(PONumber, OrderDate, ReceiveDate, SID, EmpID, ShipCost)


MerchandiseOrderItem(PONumber, ItemID, Quantity, Cost)
Supplier(SupplierID, Name, Contact, Phone, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone)
Merchandise(ItemID, Description, Category, QuantityOnHand)

64
Pet Store Class Diagram

65
Rolling Thunder Integration Example

Bicycle Assembly form. The main EmployeeID control is not stored directly, but the value is entered in the assembly column when the employee clicks the column.

66
Initial Tables for Bicycle Assembly

BicycleAssembly(
SerialNumber, Model, Construction, FrameSize, TopTube, ChainStay, HeadTube, SeatTube,
PaintID, PaintColor, ColorStyle, ColorList, CustomName, LetterStyle, EmpFrame, EmpPaint,
BuildDate, ShipDate,
(Tube, TubeType, TubeMaterial, TubeDescription),
(CompCategory, ComponentID, SubstID, ProdNumber, EmpInstall, DateInstall, Quantity, QOH) )

Bicycle(SerialNumber, Model, Construction, FrameSize, TopTube, ChainStay, HeadTube, SeatTube,


PaintID, ColorStyle, CustomName, LetterStyle, EmpFrame, EmpPaint, BuildDate, ShipDate)
Paint(PaintID, ColorList)
BikeTubes(SerialNumber, TubeID, Quantity)
TubeMaterial(TubeID, Type, Material, Description)
BikeParts(SerialNumber, ComponentID, SubstID, Quantity, DateInstalled, EmpInstalled)
Component(ComponentID, ProdNumber, Category, QOH)

67
Rolling Thunder: Purchase Order

68
RT Purchase Order: Initial Tables

PurchaseOrder(PurchaseID, PODate, EmployeeID, FirstName, LastName, ManufacturerID,


MfgName, Address, Phone, CityID, CurrentBalance, ShipReceiveDate,
(ComponentID, Category, ManufacturerID, ProductNumber, Description, PricePaid, Quantity,
ReceiveQuantity, ExtendedValue, QOH, ExtendedReceived), ShippingCost, Discount)

PurchaseOrder(PurchaseID, PODate, EmployeeID, ManufacturerID,


ShipReceiveDate, ShippingCost, Discount)

Employee(EmployeeID, FirstName, LastName)


Manufacturer(ManufacturerID, Name, Address, Phone, CityID, CurrentBalance)
City(CityID, Name, ZipCode)
PurchaseItem(PurchaseID, ComponentID, Quantity, PricePaid, ReceivedQuantity)
Component(ComponentID, Category, ManufacturerID, ProductNumber, Description, QOH)

69
Rolling Thunder: Transactions

70
RT Transactions: Initial Tables

ManufacturerTransactions(ManufacturerID, Name, Phone, Contact, BalanceDue,


(TransDate, Employee, Amount, Description) )

Manufacturer(ManufacturerID, Name, Phone, Contact, BalanceDue)


ManufacturerTransaction(ManufacturerID, TransactionDate, EmployeeID,
Amount, Description)

71
Rolling Thunder: Components

72
RT Components: Initial Tables

ComponentForm(ComponentID, Product, BikeType, Category, Length, Height,


Width, Weight, ListPrice,Description, QOH, ManufacturerID, Name,
Phone, Contact, Address, ZipCode, CityID, City, State, AreaCode)

Component(ComponentID, ProductNumber, BikeType, Category, Length, Height,


Width,Weight, ListPrice, Description, QOH, ManufacturerID)

Manufacturer(ManufacturerID, Name, Phone, Contact, Address, ZipCode, CityID)


City(CityID, City, State, ZipCode, AreaCode)

73
RT: Integrating Tables

Duplicate Manufacturer tables:

PO Mfr(ManufacturerID, Name, Address, Phone, CityID, CurrentBalance)
Mfg Mfr(ManufacturerID, Name, Phone, Contact, BalanceDue)
Comp Mfr(ManufacturerID, Name, Phone, Contact, Address, ZipCode, CityID)

Note that each form can lead to duplicate tables.
Look for tables with the same keys, but do not expect them to be named exactly alike.
Find all of the data and combine it into one table.

Manufacturer(ManufacturerID, Name, Contact, Address, Phone,
CityID, ZipCode, CurrentBalance)

74
RT Example: Integrated Tables

Bicycle(SerialNumber, Model, Construction, FrameSize, TopTube, ChainStay, HeadTube,


SeatTube, PaintID, ColorStyle, CustomName, LetterStyle, EmpFrame,
EmpPaint, BuildDate, ShipDate)
Paint(PaintID, ColorList)
BikeTubes(SerialNumber, TubeID, Quantity)
TubeMaterial(TubeID, Type, Material, Description)
BikeParts(SerialNumber, ComponentID, SubstID, Quantity, DateInstalled, EmpInstalled)
Component(ComponentID, ProductNumber, BikeType, Category, Length, Height, Width,
Weight, ListPrice, Description, QOH, ManufacturerID)
PurchaseOrder(PurchaseID, PODate, EmployeeID, ManufacturerID,
ShipReceiveDate, ShippingCost, Discount)
PurchaseItem(PurchaseID, ComponentID, Quantity, PricePaid, ReceivedQuantity)
Employee(EmployeeID, FirstName, LastName)
Manufacturer(ManufacturerID, Name, Contact, Address, Phone,
CityID, ZipCode, CurrentBalance)
ManufacturerTransaction(ManufacturerID, TransactionDate, EmployeeID, Amount,
Description, Reference)
City(CityID, City, State, ZipCode, AreaCode)

75
Rolling Thunder Tables

76
View Integration (FEMA Example 1)

Team Roster
Team# Date Formed Leader
Home Base Name Fax Phone
Response time (days) Address, C,S,Z Home phone

Team Members/Crew
ID Name Home phone Specialty DoB SSN Salary

Total Salary

• This first form is kept for each team that can be called on to help in emergencies.

77
View Integration (FEMA Example 2)

Disaster Name HQ Location On-Site Problem Report


Local Agency Commander
Political Contact

Date Reported Assigned Problem# Severity


Problem Description

Reported By: Specialty Specialty Rating


Verified By: Specialty Specialty Rating

SubProblem Details
Sub Prob# Category Description Action Est. Cost

Total Est. Cost

• Major problems are reported to HQ to be prioritized and scheduled for correction.

78
View Integration (FEMA Example 3)

Location Damage Analysis Date Evaluated


LocationID, Address Team Leader Title Repair Priority
Latitude, Longitude Cellular Phone Damage Description

Room Damage Descrip. Damage% Item Value $Loss

Room Damage Descrip. Damage% Item Value $Loss

Item Loss Total


Estimated Damage Total

• On-site teams examine buildings and file a report on damage at that location.

79
View Integration (FEMA Example 3a)

• Location Analysis(LocationID, MapLatitude, MapLongitude, Date,
Address, Damage, PriorityRepair, Leader, LeaderPhone, LeaderTitle,
(Room, Description, PercentDamage, (Item, Value, Loss)))

80
View Integration (FEMA Example 4)
Task Completion Report Date
Disaster Name Disaster Rating HQ Phone

Problem# Supervisor Date


SubProblem Team# Team Specialty CompletionStatus Comment Expenses

Total Expenses
Problem# Supervisor Date
SubProblem Team# Team Specialty CompletionStatus Comment Expenses

Total Expenses

• Teams file task completion reports. If a task is not completed, the percentage accomplished is reported as the completion status.

81
View Integration (FEMA Example 4a)

• TasksCompleted(Date, DisasterName, DisasterRating, HQPhone,
(Problem#, Supervisor, (SubProblem, Team#, CompletionStatus,
Comments, Expenses)))

82
DBMS Table Definition

• Enter Tables
  • Columns
  • Keys
  • Data Types
    • Text
    • Memo
    • Number
      • Byte
      • Integer, Long
      • Single, Double
    • Date/Time
    • Currency
    • AutoNumber (Long)
    • Yes/No
    • OLE Object
  • Descriptions
• Column Properties
  • Format
  • Input Mask
  • Caption
  • Default
  • Validation Rule
  • Validation Text
  • Required & Zero Length
  • Indexed
• Relationships
  • One-to-One
  • One-to-Many
  • Referential Integrity
  • Cascade Update/Delete
• Define before entering data

83
Key Table Definition in Access

Numeric Subtypes or text length

84


Graphical Table Definition in Oracle

SQL Developer
85
Graphical Table Definition in SQL Server

86
SQL Table Definition

CREATE TABLE Animal
(
  AnimalID INTEGER,
  Name NVARCHAR2(50),
  Category NVARCHAR2(50),
  Breed NVARCHAR2(50),
  DateBorn DATE,
  Gender NVARCHAR2(50) CHECK (Gender='Male' Or
    Gender='Female' Or Gender='Unknown' Or Gender Is Null),
  Registered NVARCHAR2(50),
  Color NVARCHAR2(50),
  Photo LONG RAW,
  ImageFile NVARCHAR2(250),
  ImageHeight INTEGER,
  ImageWidth INTEGER,
  AdoptionID INTEGER,
  Donation NUMBER(10,2),
  CONSTRAINT pk_Animal PRIMARY KEY (AnimalID),
  CONSTRAINT fk_BreedAnimal FOREIGN KEY (Category,Breed)
    REFERENCES Breed(Category,Breed)
    ON DELETE CASCADE,
  CONSTRAINT fk_CategoryAnimal FOREIGN KEY (Category)
    REFERENCES Category(Category)
    ON DELETE CASCADE
);
87
Oracle Databases

• For Oracle and SQL Server, it is best to create a text file that contains all of the SQL statements to create the table.
  • It is usually easier to modify the text table definition.
  • The text file can be used to recreate the tables for backup or transfer to another system.
  • To make major modifications to the tables, you usually create a new table, then copy the data from the old table, then delete the old table and rename the new one. It is much easier to create the new table using the text file definition.
• Be sure to specify Primary Key and Foreign Key constraints.
• Be sure to create tables in the correct order: any table that appears in a Foreign Key constraint must first be created. For example, create Customer before creating Order.
• In Oracle, to substantially improve performance, issue the following command once all tables have been created:
  • Analyze table Animal compute statistics;

88
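The table-ordering rule can be checked mechanically. The following is a minimal Python sketch, not part of the original slides. It assumes a hypothetical dictionary `references` that maps each table to the tables named in its Foreign Key constraints. The Animal entry follows the CREATE TABLE example, the Customer and Order pair follows the bullet above, and the Breed-to-Category reference is an assumption. The ordering itself comes from the standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter


def creation_order(references):
    """Return table names ordered so that referenced tables come first.

    `references` maps a table name to the set of tables named in its
    FOREIGN KEY constraints (a hypothetical structure for this sketch).
    """
    # TopologicalSorter treats each mapped set as predecessors, which here
    # means "must be created before". It raises CycleError if the
    # references are circular.
    return list(TopologicalSorter(references).static_order())


# Assumed schema: Animal references Breed and Category (from the CREATE
# TABLE example); Breed referencing Category is an assumption.
references = {
    "Animal": {"Breed", "Category"},
    "Breed": {"Category"},
    "Category": set(),
    "Order": {"Customer"},
    "Customer": set(),
}

order = creation_order(references)
print(order)
```

Writing the CREATE TABLE statements into the text file in this order avoids errors about missing referenced tables. The exact order among unrelated tables may vary.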
Data Volume

• Estimate the total size of the database.
  • Current.
  • Future growth.
  • Guide for hardware and software purchases.
• For each table.
  • Use data types to estimate the number of bytes used for each row.
  • Multiply by the estimated number of rows.
• Add the value for each table to get the total size.
• For concatenated keys (and similar tables).
  • OrderItems(O#, Item#, Qty)
  • Hard to “know” the total number of items ordered.
    • Start with the total number of orders.
    • Multiply by the average number of items on a typical order.
• Need to know time frame or how long to keep data.
  • Do we store all customer data forever?
  • Do we keep all orders in the active database, or do we migrate older ones?
Data Volume Example

Customer(C#, Name, Address, City, State, Zip)
Row: 4 + 15 + 25 + 20 + 2 + 10 = 76
Order(O#, C#, Odate)
Row: 4 + 4 + 8 = 16
OrderItem(O#, P#, Quantity, SalePrice)
Row: 4 + 4 + 4 + 8 = 20

Orders in 3 yrs = 1000 Customers * (10 Orders per Customer per year) * 3 yrs = 30,000
OrderLines = 30,000 Orders * (5 Lines per Order) = 150,000

• Business rules
  • Three year retention.
  • 1000 customers.
  • Average 10 orders per customer per year.
  • Average 5 items per order.

Table        Bytes per row * Rows    Bytes
Customer     76 * 1000               76,000
Order        16 * 30,000             480,000
OrderItem    20 * 150,000            3,000,000
Total                                3,556,000
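The estimate above is simple arithmetic, so it is easy to script and rerun when the business rules change. The sketch below is one possible Python version. The function name `estimate_bytes` and its parameter names are illustrative assumptions. The byte counts and business rules are the ones on the slide.

```python
# Bytes per column, copied from the row-size sums on the slide.
ROW_BYTES = {
    "Customer": [4, 15, 25, 20, 2, 10],  # C#, Name, Address, City, State, Zip
    "Order": [4, 4, 8],                  # O#, C#, Odate
    "OrderItem": [4, 4, 4, 8],           # O#, P#, Quantity, SalePrice
}


def estimate_bytes(customers, orders_per_customer_per_year,
                   items_per_order, years):
    """Estimate bytes per table, plus a total, from the business rules."""
    orders = customers * orders_per_customer_per_year * years
    rows = {
        "Customer": customers,
        "Order": orders,
        "OrderItem": orders * items_per_order,
    }
    # Row size (sum of column sizes) times estimated row count.
    sizes = {table: sum(ROW_BYTES[table]) * count
             for table, count in rows.items()}
    sizes["Total"] = sum(sizes.values())
    return sizes


sizes = estimate_bytes(customers=1000, orders_per_customer_per_year=10,
                       items_per_order=5, years=3)
for table, size in sizes.items():
    print(f"{table:10s} {size:>10,d}")
```

With the slide's inputs this reproduces the 3,556,000 byte total. The figure ignores indexes and storage overhead, so treat it as a lower bound.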
Appendix: Formal Definitions: Terms

Formal                   Definition                                                                                               Informal
Relation                 A set of attributes with data that changes over time. Often denoted R.                                   Table
Attribute                Characteristic with a real-world domain. Subsets of attributes are multiple columns, often denoted X or Y.   Column
Tuple                    The data values returned for specific attribute sets are often denoted as t[X].                          Row of data
Schema                   Collection of tables and constraints/relationships.
Functional dependency    X → Y                                                                                                    Business rule dependency
Appendix: Functional Dependency

Derives from a real-world relationship/constraint.

Denoted X → Y for sets of attributes X and Y.

Holds when any rows of data that have identical values for X attributes also have identical values for their Y attributes:
If t1[X] = t2[X], then t1[Y] = t2[Y]

X is also known as a determinant if the dependency is non-trivial (Y is not a subset of X).
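The row-based definition translates directly into a check over sample data. The sketch below is a minimal Python illustration. The function name `fd_holds` is an assumption, and the rows are copied from the Order example used for third normal form in this appendix.

```python
def fd_holds(rows, x, y):
    """Return True if X -> Y holds in `rows` (a list of dicts).

    Implements: if t1[X] = t2[X], then t1[Y] = t2[Y].
    """
    seen = {}
    for t in rows:
        x_val = tuple(t[a] for a in x)
        y_val = tuple(t[a] for a in y)
        # setdefault records the first Y value seen for this X value and
        # returns the recorded value on later visits.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


orders = [
    {"OrderID": 32, "OrderDate": "May-05", "CustomerID": 1,
     "Name": "Jones", "Phone": "222-3333"},
    {"OrderID": 33, "OrderDate": "May-05", "CustomerID": 2,
     "Name": "Hong", "Phone": "444-8888"},
    {"OrderID": 34, "OrderDate": "May-05", "CustomerID": 1,
     "Name": "Jones", "Phone": "222-3333"},
]

customer_fd = fd_holds(orders, ["CustomerID"], ["Name", "Phone"])
date_fd = fd_holds(orders, ["OrderDate"], ["CustomerID"])
print(customer_fd, date_fd)
```

Sample data can only disprove a dependency. A dependency that happens to hold in a few rows still has to be confirmed against the business rules.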
Appendix: Keys

Keys are attributes that are ultimately used to identify rows of data.

A key K (sometimes called candidate key) is a set of attributes
(1) With FD K → U where U is all other attributes in the relation
(2) If K’ is a proper subset of K, then there is no FD K’ → U

A set of key attributes functionally determines all other attributes in the relation, and it is the smallest set of attributes that will do so (there is no smaller subset of K that determines the columns).
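Both conditions can be tested from a list of functional dependencies by computing attribute closures. The sketch below is one way to do it in Python, using a brute-force search that is only practical for small relations. The function names are illustrative, and the dependencies are assumed from the OrderProduct example used for second normal form.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`.

    `fds` is a list of (lhs, rhs) pairs of frozensets, read as lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return the minimal attribute sets whose closure is every attribute."""
    keys = []
    # Try smaller sets first so supersets of a found key can be skipped.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            k = frozenset(combo)
            if any(key <= k for key in keys):
                continue  # condition (2): a smaller key is already inside
            if closure(k, fds) == set(attributes):
                keys.append(k)  # condition (1): K determines everything
    return keys


attributes = {"OrderID", "ProductID", "Quantity", "Description"}
# Assumed dependencies for OrderProduct.
fds = [
    (frozenset({"OrderID", "ProductID"}), frozenset({"Quantity"})),
    (frozenset({"ProductID"}), frozenset({"Description"})),
]

keys = candidate_keys(attributes, fds)
print(keys)
```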
Appendix: First Normal Form

A relation is in first normal form (1NF) if and only if all attributes are atomic.

Atomic attributes are single valued, and cannot be composite, multi-valued or nested relations.

Example (not in 1NF: Name is composite and Phones is multi-valued):
Customer(CID, Name: First + Last, Phones, Address)

CID    Name: First + Last    Phones                          Address
111    Joe Jones             111-2223, 111-3393, 112-4582    123 Main
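One way to bring the Customer example into 1NF is to split the composite name and move the repeating phone numbers into their own rows. The Python sketch below illustrates this. The function name, the FirstName and LastName columns, the separate phone list, and the rule of splitting the name at the first space are all assumptions made for the illustration.

```python
def to_first_normal_form(customers):
    """Split composite names and move repeating phones into their own rows."""
    customer_rows, phone_rows = [], []
    for c in customers:
        # Assumption: the name is "First Last", split at the first space.
        first, last = c["Name"].split(" ", 1)
        customer_rows.append({"CID": c["CID"], "FirstName": first,
                              "LastName": last, "Address": c["Address"]})
        # One row per phone number, keyed by (CID, Phone).
        for phone in c["Phones"]:
            phone_rows.append({"CID": c["CID"], "Phone": phone})
    return customer_rows, phone_rows


nested = [
    {"CID": 111, "Name": "Joe Jones",
     "Phones": ["111-2223", "111-3393", "112-4582"], "Address": "123 Main"},
]

customer_rows, phone_rows = to_first_normal_form(nested)
print(customer_rows)
print(phone_rows)
```

Address is left as a single value here. Whether it also needs to be split depends on how the business uses it.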
Appendix: Second Normal Form

A relation is in second normal form (2NF) if it is in 1NF and each non-key attribute is fully functionally dependent on the primary key.

K → Ai for each non-key attribute Ai

That is, there is no proper subset K’ of K such that K’ → Ai

Example (not in 2NF: Description depends only on ProductID, which is part of the key):
OrderProduct(OrderID, ProductID, Quantity, Description)

OrderID    ProductID    Quantity    Description
32         15           1           Blue Hose
32         16           2           Pliers
33         15           1           Blue Hose
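Removing the partial dependency means projecting the table onto two smaller ones. The Python sketch below shows the idea on the sample rows. The helper `project` and the target table names OrderItem and Product are hypothetical.

```python
def project(rows, columns):
    """Project rows onto `columns`, dropping duplicates, keeping first-seen order."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in columns)
        seen.setdefault(key, dict(zip(columns, key)))
    return list(seen.values())


order_product = [
    {"OrderID": 32, "ProductID": 15, "Quantity": 1, "Description": "Blue Hose"},
    {"OrderID": 32, "ProductID": 16, "Quantity": 2, "Description": "Pliers"},
    {"OrderID": 33, "ProductID": 15, "Quantity": 1, "Description": "Blue Hose"},
]

# Quantity depends on the whole key, so it stays with (OrderID, ProductID).
order_item = project(order_product, ["OrderID", "ProductID", "Quantity"])
# Description depends on ProductID alone, so it moves to its own table.
product = project(order_product, ["ProductID", "Description"])

print(order_item)
print(product)
```

After the split, "Blue Hose" is stored once instead of once per order line.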
Appendix: Transitive Dependency

Given functional dependencies X → Y and Y → Z, the transitive dependency X → Z must also hold.

Example:
There is an FD between OrderID and CustomerID. Given the OrderID key attribute, you always know the CustomerID.

There is an FD between CustomerID and the other customer data, because CustomerID is the primary key. Given the CustomerID, you always know the corresponding attributes for Name, Phone, and so on.

Consequently, given the OrderID (X), you always know the corresponding customer data by transitivity.
Appendix: Third Normal Form

A relation is in third normal form (3NF) if and only if it is in 2NF and no non-key attributes are transitively dependent on the primary key.

That is, K → Ai for each attribute (2NF), and there is no subset of attributes X such that K → X → Ai

Example (not in 3NF: Name and Phone depend on CustomerID, which in turn depends on OrderID):
Order(OrderID, OrderDate, CustomerID, Name, Phone)

OrderID    OrderDate    CustomerID    Name     Phone
32         May-05       1             Jones    222-3333
33         May-05       2             Hong     444-8888
34         May-05       1             Jones    222-3333
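Splitting the customer data into its own table removes the transitive dependency, and a join recovers the original view. The sketch below demonstrates this in Python with the standard-library `sqlite3` module, used here only as a convenient stand-in for Oracle or SQL Server. The column types and the new phone number are illustrative assumptions. The table name is quoted because ORDER is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name TEXT,
        Phone TEXT
    );
    CREATE TABLE "Order" (
        OrderID INTEGER PRIMARY KEY,
        OrderDate TEXT,
        CustomerID INTEGER REFERENCES Customer(CustomerID)
    );
    INSERT INTO Customer VALUES (1, 'Jones', '222-3333'), (2, 'Hong', '444-8888');
    INSERT INTO "Order" VALUES (32, 'May-05', 1), (33, 'May-05', 2), (34, 'May-05', 1);
""")

# Changing a phone number now touches one row, not every order row.
# The new number is a made-up value for the demonstration.
conn.execute("UPDATE Customer SET Phone = '222-9999' WHERE CustomerID = 1")

# The join rebuilds the original five-column view.
rows = conn.execute("""
    SELECT o.OrderID, o.OrderDate, c.CustomerID, c.Name, c.Phone
    FROM "Order" AS o
    JOIN Customer AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

Both orders for customer 1 show the updated phone, which the single-table design could only guarantee by updating every matching row.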
Appendix: Boyce-Codd Normal Form

A relation is in Boyce-Codd Normal Form (BCNF) if and only if it is in 3NF and every determinant is a candidate key (or a superkey).
That is, K → Ai for every attribute, and there is no subset X (key or nonkey) such that X → Ai where X is different from K.

Example: Employees can have many specialties, and many employees can be within a specialty. Employees can have many managers, but a manager can have only one specialty: ManagerID → Specialty
EmpSpecMgr(EID, Specialty, ManagerID)

EID    Specialty    ManagerID
32     Drill        1
33     Weld         2
34     Drill        1

FD ManagerID → Specialty: ManagerID is not currently a key.
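The BCNF test can be written as "every non-trivial determinant has a closure that covers the whole relation". The Python sketch below applies it to EmpSpecMgr. The function names are illustrative. The dependency ManagerID to Specialty comes from the slide, while treating (EID, Specialty) as the key that determines ManagerID is an assumption made for this example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != set(attributes)]


attributes = {"EID", "Specialty", "ManagerID"}
fds = [
    # Assumed key dependency for this sketch.
    (frozenset({"EID", "Specialty"}), frozenset({"ManagerID"})),
    # Business rule from the slide: a manager has only one specialty.
    (frozenset({"ManagerID"}), frozenset({"Specialty"})),
]

violations = bcnf_violations(attributes, fds)
print(violations)
```

The only violation reported is ManagerID determining Specialty, which suggests storing that pair in its own table keyed by ManagerID.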
Appendix: Multi-Valued Dependency

A multi-valued dependency (MVD) exists when there are at least three attributes in a relation (A, B, and C; and they could be sets), and one attribute (A) determines the other two (B and C) but the other two are independent of each other.

That is, A →→ B and A →→ C but B and C have no FDs

Example:
Employees have many specialties and many tools, but tools and specialties are not directly related.
Appendix: Fourth Normal Form

A relation is in fourth normal form (4NF) if and only if it is in BCNF and there are no multi-valued dependencies.
That is, if A →→ B, then all attributes of R are also functionally dependent on A: A → Ai for each attribute.

Example:
EmpSpecTools(EID, Specialty, ToolID) is split into:

EmpSpec(EID, Specialty)
EmpTools(EID, ToolID)
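A multi-valued dependency shows up in the data as a cross product: for each employee, every specialty is paired with every tool. The Python sketch below checks for that pattern. The function name `mvd_holds` and the sample rows are hypothetical, since the slide gives only the table structure.

```python
from itertools import product


def mvd_holds(rows, a, b, c):
    """Check A ->> B (and so A ->> C) in a relation with columns a, b, c.

    For each A value, the (B, C) pairs present must equal the cross
    product of the B values and C values seen with that A value.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[a], set()).add((row[b], row[c]))
    for pairs in groups.values():
        b_vals = {p[0] for p in pairs}
        c_vals = {p[1] for p in pairs}
        if pairs != set(product(b_vals, c_vals)):
            return False
    return True


# Hypothetical data: employee 32 has two specialties and two tools.
emp_spec_tools = [
    {"EID": 32, "Specialty": "Drill", "ToolID": 1},
    {"EID": 32, "Specialty": "Drill", "ToolID": 2},
    {"EID": 32, "Specialty": "Weld", "ToolID": 1},
    {"EID": 32, "Specialty": "Weld", "ToolID": 2},
    {"EID": 33, "Specialty": "Weld", "ToolID": 3},
]

full = mvd_holds(emp_spec_tools, "EID", "Specialty", "ToolID")
# Dropping one combination breaks the cross-product pattern.
partial = mvd_holds(emp_spec_tools[:3] + emp_spec_tools[4:],
                    "EID", "Specialty", "ToolID")
print(full, partial)
```

When the check passes, the combined table carries no information beyond EmpSpec and EmpTools, so the split loses nothing and removes the need to store every combination.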
