Introduction to RDBMS and Normalization
RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL, and
for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft
Access.
What is a table?
The data in RDBMS is stored in database objects called tables. The table is a collection of related
data entries and it consists of columns and rows.
Remember, a table is the most common and simplest form of data storage in a relational database.
Following is the example of a CUSTOMERS table:
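+----+-------+-----+---------+---------+
| ID | NAME  | AGE | ADDRESS | SALARY  |
+----+-------+-----+---------+---------+
| .. | ...   | ... | ...     | ...     |
| 6  | Komal | 22  | MP      | 4500.00 |
| .. | ...   | ... | ...     | ...     |
+----+-------+-----+---------+---------+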
Every table is broken up into smaller entities called fields. The fields in the CUSTOMERS table
consist of ID, NAME, AGE, ADDRESS and SALARY.
A field is a column in a table that is designed to maintain specific information about every
record in the table.
A record, also called a row of data, is each individual entry that exists in a table. For example,
there are 7 records in the above CUSTOMERS table. Following is a single row of data, or record,
in the CUSTOMERS table:
+----+-------+-----+---------+---------+
| 6  | Komal | 22  | MP      | 4500.00 |
+----+-------+-----+---------+---------+
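The ideas of table, field and record can be sketched with Python's built-in sqlite3 module. This is
only an illustration: the column types (INTEGER, TEXT, REAL) are assumptions, since the notes give
the field names but not their types, and only the one row quoted above is inserted.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# The table: a named collection of columns (fields). Types are assumed.
conn.execute(
    "CREATE TABLE CUSTOMERS ("
    " ID INTEGER, NAME TEXT, AGE INTEGER, ADDRESS TEXT, SALARY REAL)"
)

# One record (row): a single entry with a value for every field.
conn.execute("INSERT INTO CUSTOMERS VALUES (6, 'Komal', 22, 'MP', 4500.00)")

cursor = conn.execute("SELECT * FROM CUSTOMERS")
field_names = [column[0] for column in cursor.description]  # the fields
record = cursor.fetchone()                                   # the record

print(field_names)
print(record)
```

Each position in the record lines up with the field of the same position in field_names.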
What is a column?
A column is a vertical entity in a table that contains all information associated with a specific
field in a table.
For example, ADDRESS is a column in the CUSTOMERS table. It holds the location description of
every customer, such as MP for the customer Komal.
What is a NULL value?
A NULL value in a table is a value in a field that appears to be blank; in other words, a field
with a NULL value is a field with no value.
It is very important to understand that a NULL value is different from a zero value or a field that
contains spaces. A field with a NULL value is one that has been left blank during record creation.
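The difference between NULL, zero and an empty string can be seen in a small sketch using Python's
sqlite3 module. The three rows and the reduced set of columns are made up for the illustration and
are not part of the CUSTOMERS data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER, NAME TEXT, SALARY REAL)")

# Hypothetical rows: a NULL salary, a zero salary, and an empty name.
conn.executemany(
    "INSERT INTO CUSTOMERS (ID, NAME, SALARY) VALUES (?, ?, ?)",
    [(1, "A", None), (2, "B", 0.0), (3, "", 4500.0)],
)

def ids_where(condition):
    """Return the IDs of the rows that satisfy the given WHERE condition."""
    query = "SELECT ID FROM CUSTOMERS WHERE " + condition + " ORDER BY ID"
    return [row[0] for row in conn.execute(query)]

null_salary_ids = ids_where("SALARY IS NULL")   # only the row left blank
zero_salary_ids = ids_where("SALARY = 0")       # zero is a real value
empty_name_ids = ids_where("NAME = ''")         # empty string is a real value
equals_null_ids = ids_where("SALARY = NULL")    # comparing with = never matches

print(null_salary_ids, zero_salary_ids, empty_name_ids, equals_null_ids)
```

Note that NULL has to be tested with IS NULL; a comparison such as SALARY = NULL matches no row.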
SQL Constraints:
Constraints are the rules enforced on the data columns of a table. They are used to limit the type
of data that can go into a table, which ensures the accuracy and reliability of the data in the
database.
Constraints can be column level or table level. Column level constraints apply to only one
column, whereas table level constraints apply to the whole table.
• NOT NULL Constraint: Ensures that a column cannot have NULL value.
• DEFAULT Constraint: Provides a default value for a column when none is specified.
• CHECK Constraint: The CHECK constraint ensures that all values in a column satisfy
certain conditions.
• INDEX: Used to speed up the retrieval of data from the database.
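A minimal sketch of these constraints, again with Python's sqlite3 module. The specific rules (AGE
of at least 18, a default SALARY of 5000.00), the index name and the inserted rows are assumptions
chosen for the example, not values given in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE CUSTOMERS (
        ID      INTEGER NOT NULL,
        NAME    TEXT    NOT NULL,
        AGE     INTEGER NOT NULL CHECK (AGE >= 18),  -- assumed rule
        ADDRESS TEXT,
        SALARY  REAL    DEFAULT 5000.00              -- assumed default
    )
    """
)
# An index on NAME, so that lookups by name do not scan the whole table.
conn.execute("CREATE INDEX idx_customers_name ON CUSTOMERS (NAME)")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# DEFAULT: no SALARY is given, so the default value is stored.
conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (1, 'Example', 30)")
default_salary = conn.execute(
    "SELECT SALARY FROM CUSTOMERS WHERE ID = 1"
).fetchone()[0]

# NOT NULL: a missing NAME is refused.
not_null_rejected = rejected(
    "INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (2, NULL, 30)"
)
# CHECK: an AGE below 18 is refused.
check_rejected = rejected(
    "INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (3, 'Minor', 10)"
)

print(default_salary, not_null_rejected, check_rejected)
```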
Data Integrity:
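The following categories of data integrity exist in every RDBMS:
• Entity integrity: There are no duplicate rows in a table.
• Domain integrity: Enforces valid entries for a given column by restricting the type, the format, or
the range of values.
• Referential integrity: Rows that are used by other records cannot be deleted.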
• User-Defined Integrity: Enforces some specific business rules that do not fall into entity, domain or
referential integrity.
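Referential integrity can be tried out with a short sqlite3 sketch. The ORDERS table and its columns
are hypothetical and exist only to give the CUSTOMERS row something that refers to it. SQLite
enforces foreign keys only after PRAGMA foreign_keys = ON, which is why that line comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL)")
# Hypothetical child table: every order must point at an existing customer.
conn.execute(
    "CREATE TABLE ORDERS ("
    " OID INTEGER PRIMARY KEY,"
    " CUSTOMER_ID INTEGER NOT NULL REFERENCES CUSTOMERS (ID))"
)

conn.execute("INSERT INTO CUSTOMERS VALUES (6, 'Komal')")
conn.execute("INSERT INTO ORDERS VALUES (100, 6)")

# The customer row is used by an order, so deleting it must fail.
try:
    conn.execute("DELETE FROM CUSTOMERS WHERE ID = 6")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

remaining_customers = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]
print(delete_blocked, remaining_customers)
```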
Normalization
Q. Define normalization.
Q. What is meant by database normalization?
“Normalization can be defined as the process of decomposing database tables to avoid data
redundancy.”
If a database design is not perfect, it may contain anomalies, which are like a bad dream for any
database administrator. Managing a database with anomalies is next to impossible.
• Update anomalies − If data items are scattered and are not linked to each other properly, then it could
lead to strange situations. For example, when we try to update one data item having its copies scattered
over several places, a few instances get updated properly while a few others are left with old values. Such
instances leave the database in an inconsistent state.
• Deletion anomalies − We try to delete a record, but parts of it are left undeleted because, unknown
to us, the data is also saved somewhere else.
• Insert anomalies − We try to insert data into a record that does not exist at all.
Normalization is a method to remove all these anomalies and bring the database to a consistent
state.
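An update anomaly is easy to reproduce. The sketch below uses Python's sqlite3 module with made-up
tables (ORDERS_FLAT, and a normalized CUSTOMERS/ORDERS pair) and made-up values. It first stores a
customer's address redundantly, then stores it only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: the customer's address is repeated on every order row.
conn.execute("CREATE TABLE ORDERS_FLAT (OID INTEGER, CUSTOMER TEXT, ADDRESS TEXT)")
conn.executemany(
    "INSERT INTO ORDERS_FLAT VALUES (?, ?, ?)",
    [(100, "Komal", "MP"), (101, "Komal", "MP")],
)
# Only one of the two copies gets updated.
conn.execute("UPDATE ORDERS_FLAT SET ADDRESS = 'Delhi' WHERE OID = 100")
flat_addresses = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT ADDRESS FROM ORDERS_FLAT WHERE CUSTOMER = 'Komal'"
    )
)

# Normalized: the address is stored once and orders only refer to the customer.
conn.execute("CREATE TABLE CUSTOMERS (NAME TEXT PRIMARY KEY, ADDRESS TEXT)")
conn.execute(
    "CREATE TABLE ORDERS (OID INTEGER PRIMARY KEY,"
    " CUSTOMER TEXT REFERENCES CUSTOMERS (NAME))"
)
conn.execute("INSERT INTO CUSTOMERS VALUES ('Komal', 'MP')")
conn.executemany("INSERT INTO ORDERS VALUES (?, ?)", [(100, "Komal"), (101, "Komal")])
conn.execute("UPDATE CUSTOMERS SET ADDRESS = 'Delhi' WHERE NAME = 'Komal'")
normalized_addresses = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT c.ADDRESS FROM ORDERS o JOIN CUSTOMERS c ON c.NAME = o.CUSTOMER"
    )
)

print(flat_addresses)        # two different addresses for the same customer
print(normalized_addresses)  # a single address
```

In the flat table the same customer ends up with two addresses, which is the inconsistent state
described above. In the decomposed design there is only one copy to update.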
First Normal Form requires that each attribute contain only a single (atomic) value from its
pre-defined domain.
• Prime attribute − An attribute that is part of a candidate key (called the prime key in these notes) is
known as a prime attribute.
• Non-prime attribute − An attribute that is not part of the prime key is said to be a non-prime
attribute.
If we follow Second Normal Form, then every non-prime attribute must be fully functionally
dependent on the prime key. That is, if X → A holds, then there must not be any proper subset
Y of X for which Y → A also holds.
In the Student_Project relation, the prime key attributes are Stu_ID and Proj_ID.
According to the rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must depend on
both together and not on either prime key attribute individually. But we find that Stu_Name can be
identified by Stu_ID alone and Proj_Name can be identified by Proj_ID alone. This is
called partial dependency, which is not allowed in Second Normal Form.
We break the relation in two, as depicted in the above picture, so that no partial dependency
remains.
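The partial dependencies described above can be detected mechanically on sample data. The Python
sketch below is an illustration only: the helper names (fd_holds, partial_dependencies) and the
three sample rows are made up, and a check on sample rows can only show that a dependency holds in
that data, not that it is a rule of the schema.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # The same left-hand values must always map to the same right-hand values.
        if seen.setdefault(key, value) != value:
            return False
    return True

def partial_dependencies(rows, key, non_prime):
    """List (subset, attribute) pairs where a proper subset of the key determines attribute."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in non_prime:
                if fd_holds(rows, subset, (attr,)):
                    found.append((subset, attr))
    return found

# Hypothetical sample rows for the Student_Project relation.
student_project = [
    {"Stu_ID": 1, "Proj_ID": 10, "Stu_Name": "Asha", "Proj_Name": "Compiler"},
    {"Stu_ID": 1, "Proj_ID": 20, "Stu_Name": "Asha", "Proj_Name": "Database"},
    {"Stu_ID": 2, "Proj_ID": 10, "Stu_Name": "Ravi", "Proj_Name": "Compiler"},
]

violations = partial_dependencies(
    student_project, ("Stu_ID", "Proj_ID"), ("Stu_Name", "Proj_Name")
)
print(violations)
```

The two pairs reported are Stu_ID → Stu_Name and Proj_ID → Proj_Name, the same partial
dependencies named in the text.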
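For a relation to be in Third Normal Form, it must be in Second Normal Form and the following
must hold:
• No non-prime attribute is transitively dependent on a prime key attribute.
• For any non-trivial functional dependency X → A, either X is a super-key or A is a prime attribute.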
Address Table:
| Zip | Street | City | State |
Boyce-Codd Normal Form (BCNF) is a stricter extension of Third Normal Form. BCNF
states that, for any non-trivial functional dependency X → A, X must be a super-key.
In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-key
in the relation ZipCodes. So,
Stu_ID → Stu_Name, Zip and
Zip → City
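Whether a set of attributes is a super-key can be checked by computing its attribute closure. The
Python sketch below is one possible way to do this. The function names are made up, and the
combined relation (Stu_ID, Stu_Name, Zip, City) is an assumed "before decomposition" form used only
for contrast.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(relation, fds):
    """True if the left side of every non-trivial dependency is a super-key of `relation`."""
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if not set(relation) <= closure(lhs, fds):
            return False  # lhs does not determine the whole relation
    return True

# The two relations and dependencies named in the text.
student_detail = {"Stu_ID", "Stu_Name", "Zip"}
student_fds = [(("Stu_ID",), ("Stu_Name", "Zip"))]

zip_codes = {"Zip", "City"}
zip_fds = [(("Zip",), ("City",))]

# Assumed undecomposed relation, for contrast only.
combined = {"Stu_ID", "Stu_Name", "Zip", "City"}
combined_fds = student_fds + zip_fds

print(is_bcnf(student_detail, student_fds))
print(is_bcnf(zip_codes, zip_fds))
print(is_bcnf(combined, combined_fds))
```

In the combined relation, Zip → City holds but Zip does not determine Stu_ID or Stu_Name, so Zip is
not a super-key there and the check fails.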