DataBase Management (VBSPU 4th Sem)

The document discusses database management systems and their advantages over traditional file-oriented data processing systems. It provides definitions of key concepts like data, records, files and databases. It explains that a DBMS allows for structured storage of data with metadata that describes the data. This separation of data and description allows for more flexible querying of data. It also allows multiple users to concurrently access and update data. Additional advantages include reduced data redundancy, enforced data integrity, security controls and centralized administration.

Uploaded by

Atharv Katkar
Data Base Management System
PRAVEEN KUMAR SRIVASTAVA
Database

A database is a collection of logically related data. Data is a collection of facts and figures that can be
processed to produce information.
Mostly, data represents recordable facts.
Data aids in producing information, which is based on facts. For example, if we have data about the marks
obtained by all students, we can then draw conclusions about the toppers and the average marks.
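The step from data to information can be sketched in a few lines of Python. The student names and marks below are made up for illustration; the point is only that the raw marks (data) are processed into toppers and an average (information).

```python
from statistics import mean

# Hypothetical raw data: marks obtained by each student.
marks = {"Asha": 82, "Ravi": 91, "Meena": 91, "John": 67}

def toppers(marks_by_student):
    """Return the names of all students who share the highest mark."""
    highest = max(marks_by_student.values())
    return sorted(name for name, m in marks_by_student.items() if m == highest)

def average(marks_by_student):
    """Return the average mark across all students."""
    return mean(marks_by_student.values())

print(toppers(marks))   # ['Meena', 'Ravi']
print(average(marks))   # 82.75
```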

A database management system stores data in such a way that it becomes easier to retrieve, manipulate, and
produce information.
To find out what a database is, we have to start from data, which is the basic building block of any DBMS.
Data: Facts, figures, statistics etc. having no particular meaning (e.g. 1, ABC, 19 etc).
Record: Collection of related data items, e.g. in the above example the three data items had no
meaning.
Data Processing Vs. Data Management Systems: -

Although Data Processing and Data Management Systems both refer to functions that take raw data and
transform it into usable information, the usage of the terms is very different. Data Processing is the term
generally used to describe what was done by large mainframe computers from the late 1940s until the
early 1980s (and which continues to be done in the largest organizations to a greater or lesser extent even
today): large volumes of raw transaction data fed into programs that update a master file, with fixed-
format reports written to paper.
The term Data Management Systems refers to an expansion of this concept, where the raw data,
previously copied manually from paper to punched cards, and later into data-entry terminals, is now fed
into the system from a variety of sources, including ATMs, EFT, and direct customer entry through the
Internet. The master file concept has been largely displaced by database management systems, and static
reporting replaced or augmented by ad-hoc reporting and direct inquiry, including downloading of data by
customers. The ubiquity of the Internet and the Personal Computer has been the driving force in the
transformation of Data Processing to the more global concept of Data Management Systems.
File Oriented Approach: -
The earliest business computer systems were used to process business records and produce
information. They were generally faster and more accurate than equivalent manual systems.
These systems stored groups of records in separate files, and so they were called file processing
systems. In a typical file processing system, each department has its own files, designed
specifically for its applications. The department itself, working with the data processing staff,
sets policies or standards for the format and maintenance of its files.
Programs are dependent on the files and vice-versa; that is, when the physical format of the file is
changed, the program has also to be changed. Although the traditional file oriented approach to
information processing is still widely used, it does have some very important disadvantages.
S.No. | File System | DBMS
1. | A file system is software that manages and organizes the files in a storage medium within a computer. | A DBMS is software for managing the database.
2. | Redundant data can be present in a file system. | In a DBMS, redundancy is controlled and reduced.
3. | It doesn’t provide backup and recovery of data if it is lost. | It provides backup and recovery of data if it is lost.
4. | There is no efficient query processing in a file system. | Efficient query processing is available in a DBMS.
5. | There is less data consistency in a file system. | There is more data consistency because of the process of normalization.
6. | It is less complex than a DBMS. | It is more complex to handle than a file system.
7. | File systems provide less security than a DBMS. | A DBMS has more security mechanisms than a file system.
8. | It is less expensive than a DBMS. | It has a comparatively higher cost than a file system.
Characteristics of Database: -

Concurrent Use:

A database system allows several users to access the database concurrently. Answering different
questions from different users with the same (base) data is a central aspect of an information system.
Such concurrent use of data increases the economy of a system.

An example of concurrent use is the travel database of a large travel agency. The employees of
different branches can access the database concurrently and book journeys for their clients. Each
travel agent can see in the interface whether seats are still available for a specific journey or whether it is
already fully booked.
Structured and Described Data: -
A fundamental feature of the database approach is that the database system contains not only
the data but also the complete definition and description of the data. These descriptions are basically
details about the extent, the structure, the type and the format of all data and, additionally, the
relationships between the data. This kind of stored data is called metadata ("data about data").
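As a small illustration of metadata being stored alongside the data, the sketch below uses SQLite through Python's built-in sqlite3 module (SQLite is an assumption here; the notes do not name a product). The student table and its columns are hypothetical. The catalog table sqlite_master and the PRAGMA table_info command are how SQLite exposes its stored descriptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, dob TEXT)"
)

# The catalog table sqlite_master holds the definition of every table.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(table_names)  # ['student']
print(columns)      # [('roll_no', 'INTEGER'), ('name', 'TEXT'), ('dob', 'TEXT')]
```

Note that no row of student data has been inserted yet: the description of the table exists independently of its contents.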

Separation of Data and Applications: -


The structure of a database is described through metadata, which is also stored in the database.
Application software does not need any knowledge about the physical data storage, such as encoding, format,
or storage location. It communicates only with the database management system (DBMS) via a
standardized interface, with the help of a standardized language such as SQL.
Access to the data and the metadata is handled entirely by the DBMS. In this way all the applications can
be totally separated from the data. Therefore, internal reorganizations of the database or improvements in
efficiency do not have any influence on the application software.
Data Integrity: -
Data integrity refers to the overall accuracy, completeness, and reliability of data. Data integrity is
preserved by an array of error-checking and validation procedures, rules, and principles executed
during the integration flow designing phase. These checks and correction procedures are based on a
predefined set of business rules.

Transactions: -
A transaction is a bundle of actions which are done within a database to bring it from one consistent
state to a new consistent state.
A transaction is atomic, which means that it cannot be divided up any further. Within a transaction, either all
or none of the actions are carried out. Carrying out only some of the actions would lead to an
inconsistent database state.
One example of a transaction is the transfer of an amount of money from one bank account to
another. The debit of the money from one account and the credit of it to the other account together
make up a consistent transaction. This transaction is also atomic: the debit or the credit alone would
lead to an inconsistent state. After the transaction (debit and credit) finishes, the changes to
both accounts become persistent: the sender now has less money in the
account, while the receiver has a higher balance.
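The bank transfer can be sketched with SQLite via Python's sqlite3 module (an assumed choice of DBMS; the account table, the transfer function and the balances are hypothetical). The credit is deliberately applied before the debit, so that a failed debit shows the already-applied credit being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one atomic unit; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too.
        return False

def balances(conn):
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

print(transfer(conn, "A", "B", 50), balances(conn))   # True {'A': 50, 'B': 70}
print(transfer(conn, "A", "B", 500), balances(conn))  # False {'A': 50, 'B': 70}
```

In the second call the credit to B has already been executed when the debit from A fails, yet B's balance is unchanged afterwards: either both actions happen or neither does.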
Data Persistence: -

Data persistence means that in a DBMS all data is maintained as long as it is not deleted
explicitly. The life span of data needs to be determined directly or indirectly by the user and
must not be dependent on system features. Additionally, data once stored in a database must
not be lost. Changes of a database which are done by a transaction are persistent. When a
transaction is finished even a system crash cannot put the data in danger.
Advantages of a DBMS: -
Using a DBMS to manage data has many advantages:

Data independence: - Application programs should be as independent as possible from details
of data representation and storage. The DBMS can provide an abstract view of the data to insulate
application code from such details.

Reduction of Redundancy: - This is perhaps the most significant advantage of using a DBMS.
Redundancy is the problem of storing the same data item in more than one place. Redundancy
creates several problems, such as requiring extra storage space, entering the same data more than
once during insertion, and deleting data from more than one place during deletion.
Anomalies may occur in the database if insertion, deletion, etc. are not done properly.
Efficient data access: - A DBMS utilizes a variety of sophisticated techniques to store and
retrieve data efficiently. This feature is especially important if the data is stored on external storage
devices.

Data integrity and security: - If data is always accessed through the DBMS, the DBMS can
enforce integrity constraints on the data. For example, before inserting salary information for an
employee, the DBMS can check that the department budget is not exceeded. Also, the DBMS can
enforce access controls that govern what data is visible to different classes of users.

Sharing of Data: - In paper-based record keeping, data cannot easily be shared among many users.
But in a computerized DBMS, many users can share the same database if they are connected
via a network.
Data administration: - When several users share the data, centralizing the administration of data can offer
significant improvements. Experienced professionals who understand the nature of the data being managed, and
how different groups of users use it, can be responsible for organizing the data representation to minimize
redundancy and fine-tuning the storage of the data to make retrieval efficient.

Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the data in such a
manner that users can think of the data as being accessed by only one user at a time. Further, the DBMS protects
users from the effects of system failures.
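The integrity example from the list above (checking a department budget before a salary is inserted) can be sketched as follows. This assumes SQLite through Python's sqlite3 module and expresses the rule as a trigger; the table names, the trigger name budget_check and the hire function are hypothetical, and other DBMSs offer different mechanisms for the same kind of rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (name TEXT PRIMARY KEY, budget INTEGER NOT NULL);
CREATE TABLE employee (
    id     INTEGER PRIMARY KEY,
    dept   TEXT NOT NULL,
    salary INTEGER NOT NULL CHECK (salary > 0)
);
-- Reject an insert that would push total salaries past the department budget.
CREATE TRIGGER budget_check BEFORE INSERT ON employee
WHEN (SELECT COALESCE(SUM(salary), 0) FROM employee WHERE dept = NEW.dept) + NEW.salary
     > (SELECT budget FROM department WHERE name = NEW.dept)
BEGIN
    SELECT RAISE(ABORT, 'department budget exceeded');
END;
INSERT INTO department VALUES ('Sales', 1000);
""")

def hire(conn, dept, salary):
    """Try to insert a salary record; return False if the DBMS rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee (dept, salary) VALUES (?, ?)", (dept, salary))
        return True
    except sqlite3.DatabaseError:
        return False

def payroll(conn, dept):
    """Return the total of all salaries stored for a department."""
    return conn.execute(
        "SELECT COALESCE(SUM(salary), 0) FROM employee WHERE dept = ?", (dept,)
    ).fetchone()[0]
```

Because the rule lives in the database rather than in each application, every program that inserts into employee is subject to the same check.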
Disadvantages of a DBMS: -

Danger of an Overkill: - For small and simple applications for single users a database system is
often not advisable.

Complexity: - A database system creates additional complexity and requirements. The supply and
operation of a database management system with several users and databases is quite costly and
demanding.

Qualified Personnel: - The professional operation of a database system requires appropriately
trained staff. Without a qualified database administrator nothing will work for long.

Costs: - Through the use of a database system new costs are generated for the system itself but also for
additional hardware and the more complex handling of the system.

Lower Efficiency: - A database system is a multi-use software which is often less efficient than
specialized software which is produced and optimized exactly for one problem.
Data Independence: -

Data independence is the property of a database system that tries to ensure that a change made at
one level of schema requires minimal or no change in the schema at the level immediately above it.
This removes the extra work that would otherwise be needed to propagate a single change through
all the levels above.
Data independence can be classified into the following two types:
1. Physical Data Independence: - This means that for any change made in the physical
schema, the need to change the logical schema is minimal. This is practically easier to achieve.

2. Logical Data Independence: - This means that for any change made in the logical
schema, the need to change the external schema is minimal. As we shall see, this is a little
difficult to achieve.
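Physical data independence can be illustrated with a minimal sketch, assuming SQLite via Python's sqlite3 module and a hypothetical student table. Adding an index changes how the data is accessed at the physical level, while the query text and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Jaunpur"), (2, "Ravi", "Varanasi"), (3, "Meena", "Jaunpur")],
)

# The application's query, written against the logical schema only.
QUERY = "SELECT name FROM student WHERE city = ? ORDER BY name"

before = conn.execute(QUERY, ("Jaunpur",)).fetchall()

# Physical-level change: add an index on city.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

after = conn.execute(QUERY, ("Jaunpur",)).fetchall()

print(before == after)  # True
print(after)            # [('Asha',), ('Meena',)]
```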
Instances and Schemas: -
Databases change over time as information is inserted and deleted. The collection of information stored
in the database at a particular moment is called an instance of the database. The overall design of the
database is called the database schema.

The concept of database schemas and instances can be understood by analogy to a program written in a
programming language. A database schema corresponds to the variable declarations (along with
associated type definitions) in a program. Each variable has a particular value at a given instant. The
values of the variables in a program at a point in time correspond to an instance of a database schema.
Database systems have several schemas, partitioned according to the levels of abstraction.

The physical schema describes the database design at the physical level, while the
logical schema describes the database design at the logical level.
A database may also have several schemas at the view level, sometimes called subschemas, that describe
different views of the database.
Three Views of Data: -

Physical Level
This is the lowest level in the three level
architecture. It is also known as the internal level.
The physical level describes how data is actually
stored in the database. At the lowest level, the
data is stored on storage devices such as hard drives in the
form of bits; at a slightly higher level, it can be
viewed as stored in files and folders. The
physical level also covers compression and
encryption techniques.
Conceptual Level
The conceptual level is at a higher level than the physical level. It is also known as the
logical level. It describes how the database appears to the users conceptually and the
relationships between various data tables. The conceptual level does not care for how
the data in the database is actually stored.
External Level
This is the highest level in the three level architecture and closest to the user. It is also
known as the view level. The external level only shows the relevant database content
to the users in the form of views and hides the rest of the data. So different users can
see the database as a different view as per their individual requirements.
The Three-Schema Architecture: -

The goal of the three-schema architecture is to separate the user applications and the physical
database. In this architecture, schemas can be defined at the following three levels:

1. The internal level has an internal schema, which describes the physical storage structure of the
database. The internal schema uses a physical data model and describes the complete details of
data storage and access paths for the database.

2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships, user operations, and
constraints. A high-level data model or an implementation data model can be used at this level.

3. The external or view level includes a number of external schemas or user views. Each external
schema describes the part of the database that a particular user group is interested in and hides the
rest of the database from that user group. A high-level data model or an implementation data model
can be used at this level.
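A rough mapping of the three levels onto SQL objects is sketched below, assuming SQLite via Python's sqlite3 module. The mapping is a simplification, and the employee table, the employee_directory view and the index name are hypothetical: the table stands for the conceptual schema, the view for one external schema, and the index for a choice made at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure for the whole community of users.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 500), (2, "Ravi", "HR", 450)],
)

# External level: a view for a user group that should not see salaries.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name, dept FROM employee")

# Internal level: an access path chosen for performance.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY id")
directory_columns = [d[0] for d in cursor.description]
directory_rows = cursor.fetchall()

print(directory_columns)  # ['id', 'name', 'dept']
print(directory_rows)     # [(1, 'Asha', 'Sales'), (2, 'Ravi', 'HR')]
```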
Database Users: -

There are four different types of database-system users, differentiated by the way they expect to interact with
the system. Different types of user interfaces have been designed for the different types of users.
1. Naive users are unsophisticated users who interact with the system by invoking one of the application
programs that have been written previously. For example, a bank teller who needs to transfer $50 from account
A to account B invokes a program called transfer. This program asks the teller for the amount of money to be
transferred, the account from which the money is to be transferred, and the account to which the money is to be
transferred.
2. Application programmers are computer professionals who write application programs. Application
programmers can choose from many tools to develop user interfaces. Rapid application development (RAD)
tools are tools that enable an application programmer to construct forms and reports without writing a program.
There are also special types of programming languages that combine imperative control structures (for example,
for loops, while loops and if-then-else statements) with statements of the data manipulation language. These
languages, sometimes called fourth-generation languages, often include special features to facilitate the
generation of forms and the display of data on the screen. Most major commercial database systems include a
fourth generation language.
3. Sophisticated users interact with the system without writing programs. Instead, they form
their requests in a database query language. They submit each such query to a query processor,
whose function is to break down DML statements into instructions that the storage manager
understands. Analysts who submit queries to explore data in the database fall in this category.

4. Database Administrator: One of the main reasons for using DBMSs is to have central control of
both the data and the programs that access those data. A person who has such central control over
the system is called a database administrator (DBA).
Database Administrator (DBA):-
The functions of a DBA include:
1. Schema definition: - The DBA creates the original database schema by executing a set of data definition
statements in the DDL. The DBA also defines the storage structures and access methods.

2. Schema and physical-organization modification: -The DBA carries out changes to the schema and physical
organization to reflect the changing needs of the organization, or to alter the physical organization to improve
performance.

3. Granting of authorization for data access: - By granting different types of authorization, the database
administrator can regulate which parts of the database various users can access. The authorization
information is kept in a special system structure that the database system consults whenever someone
attempts to access the data in the system.

4. Routine maintenance: - Examples of the database administrator’s routine maintenance activities are:
•Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of data in case
of disasters such as flooding.
•Ensuring that enough free disk space is available for normal operations, and upgrading disk space as required.
•Monitoring jobs running on the database and ensuring that performance is not degraded by very expensive
tasks submitted by some users.
Data Models
A data model is the modeling of the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a database
at each level of data abstraction. The following four data models are used for
understanding the structure of the database:
1) Relational Data Model: This type of model organizes the data in the form of rows and columns within a
table. Thus, a relational model uses tables for representing data and the relationships among them. Tables
are also called relations. This model was initially described by Edgar F. Codd in 1969. The relational
data model is the most widely used model and is the one primarily used by commercial data processing
applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data as objects and
relationships among them. These objects are known as entities, and a relationship is an association among
these entities. This model was designed by Peter Chen and published in a 1976 paper. It is widely used
in database design. A set of attributes describes the entities.
For example, student_name, student_id describes the 'student' entity. A set of the same type of entities is
known as an 'Entity set', and the set of the same type of relationships is known as 'relationship set'.
3) Object-based Data Model: An extension of the ER model with notions of functions,
encapsulation, and object identity as well. This model supports a rich type system that includes
structured and collection types. In the 1980s, various database systems following the object-oriented
approach were developed. Here, the objects are nothing but the data together with its
properties.

4) Semistructured Data Model: This type of data model is different from the other three data
models. The semistructured data model allows data specifications in which
individual data items of the same type may have different sets of attributes.
The Extensible Markup Language, also known as XML, is widely used for representing
semistructured data. Although XML was initially designed for adding markup information to
text documents, it gained importance because of its application in the exchange of data.
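A small, made-up XML fragment shows what "different sets of attributes" means in practice: the two student items below do not carry the same fields. The sketch parses it with Python's standard xml.etree.ElementTree module; the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two items of the same type (student) with different sets of fields.
doc = """
<students>
  <student><name>Asha</name><phone>111</phone><phone>222</phone></student>
  <student><name>Ravi</name><email>ravi@example.com</email></student>
</students>
"""

root = ET.fromstring(doc)
records = []
for student in root.findall("student"):
    record = {}
    for child in student:
        # Collect values per tag, since a tag may repeat (e.g. two phones).
        record.setdefault(child.tag, []).append(child.text)
    records.append(record)

for record in records:
    print(sorted(record))
# ['name', 'phone']
# ['email', 'name']
```

A relational table would need a fixed set of columns for both students; the semistructured form simply omits what an item does not have.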
Database Language
•A DBMS has appropriate languages and interfaces to express database queries and updates.
•Database languages can be used to read, store and update the data in the database.

Types of Database Language

1. Data Definition Language


•DDL stands for Data Definition Language. It is used to define database structure or pattern.
•It is used to create schema, tables, indexes, constraints, etc. in the database.
•Using the DDL statements, you can create the skeleton of the database.
•Data definition language is used to store the information of metadata like the number of tables and
schemas, their names, indexes, columns in each table, constraints, etc.
Here are some tasks that come under DDL:
•Create: It is used to create objects in the database.
•Alter: It is used to alter the structure of the database.
•Drop: It is used to delete objects from the database.
•Truncate: It is used to remove all records from a table.
•Rename: It is used to rename an object.
•Comment: It is used to comment on the data dictionary.
These commands are used to update the database schema, which is why they come under Data Definition
Language.
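A few of the DDL tasks above can be tried out with SQLite through Python's sqlite3 module (an assumed choice; the table names are hypothetical). SQLite has no TRUNCATE or COMMENT statement, so only Create, Alter, Rename and Drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(conn):
    """List the tables currently defined in the schema."""
    return [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

def column_names(conn, table):
    """List a table's columns. The name is interpolated: pass trusted names only."""
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]

conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")  # Create
conn.execute("ALTER TABLE student ADD COLUMN dob TEXT")                        # Alter
conn.execute("ALTER TABLE student RENAME TO learner")                          # Rename
columns_after_alter = column_names(conn, "learner")
tables_before_drop = table_names(conn)

conn.execute("DROP TABLE learner")                                             # Drop
tables_after_drop = table_names(conn)

print(columns_after_alter)  # ['roll_no', 'name', 'dob']
print(tables_before_drop)   # ['learner']
print(tables_after_drop)    # []
```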
2. Data Manipulation Language
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.

Here are some tasks that come under DML:


•Select: It is used to retrieve data from a database.
•Insert: It is used to insert data into a table.
•Update: It is used to update existing data within a table.
•Delete: It is used to delete records from a table (all records, or only those matching a condition).
•Merge: It performs an UPSERT operation, i.e., insert or update operations.
•Call: It is used to call a PL/SQL or Java subprogram.
•Explain Plan: It describes the access path (execution plan) the database will use to run a statement.
•Lock Table: It controls concurrency.
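The four most common DML statements can be sketched as follows, again assuming SQLite via Python's sqlite3 module and a hypothetical student table. Merge, Call, Explain Plan and Lock Table are not shown because their syntax varies between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")

# Insert: add new rows.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 91), (3, "John", 67)],
)

# Update: change existing rows that match the WHERE condition.
conn.execute("UPDATE student SET marks = marks + 5 WHERE roll_no = ?", (3,))

# Delete: remove only the rows that match the WHERE condition.
conn.execute("DELETE FROM student WHERE roll_no = ?", (1,))

# Select: retrieve data.
rows = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no").fetchall()

print(rows)  # [(2, 'Ravi', 91), (3, 'John', 72)]
```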
3. Data Control Language

•DCL stands for Data Control Language. It is used to control access to the data stored in the database, by
granting and revoking privileges.
•In some systems, DCL execution is transactional and can be rolled back.
(In Oracle Database, however, the execution of data control language cannot be rolled
back.)

Here are some tasks that come under DCL:


•Grant: It is used to give user access privileges to a database.
•Revoke: It is used to take back permissions from the user.
4. Transaction Control Language

TCL is used to manage the changes made by DML statements. DML statements can be grouped into a logical
transaction, which TCL then commits or rolls back as a unit.

Here are some tasks that come under TCL:

•Commit: It is used to save the transaction's changes permanently in the database.
•Rollback: It is used to restore the database to its state as of the last Commit.
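Commit and Rollback can be observed directly with Python's sqlite3 module, whose connection object exposes them as the commit() and rollback() methods. SQLite and the log table are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (msg TEXT)")
conn.commit()

def messages(conn):
    """Return all stored messages in insertion order."""
    return [r[0] for r in conn.execute("SELECT msg FROM log ORDER BY rowid")]

conn.execute("INSERT INTO log VALUES ('kept')")
conn.commit()                      # Commit: the change is now permanent

conn.execute("INSERT INTO log VALUES ('discarded')")
before_rollback = messages(conn)   # the uncommitted row is visible to this connection
conn.rollback()                    # Rollback: undo everything since the last Commit
after_rollback = messages(conn)

print(before_rollback)  # ['kept', 'discarded']
print(after_rollback)   # ['kept']
```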
ER MODEL: –
The ER model defines the conceptual view of a database. It works around real-world entities and the
associations among them. At view level, the ER model is considered a good option for designing databases.

Entity: -
An entity can be a real-world object, either animate or inanimate, that is easily identifiable. For
example, in a school database, students, teachers, classes, and courses offered can be considered as
entities. All these entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities.
An entity set may contain entities whose attributes share similar values.
For example:- a Students set may contain all the students of a school; likewise, a Teachers set may contain
all the teachers of a school from all faculties. Entity sets need not be disjoint.
Attributes: -
Entities are represented by means of their properties called attributes. All attributes have values.
For example:- a student entity may have name, class, and age as attributes. There exists a domain or range of
values that can be assigned to attributes.
For example:- a student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
Types of Attributes: -
Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called the key attribute.
For example:- Roll_No will be unique for each student. In an ER diagram, a key attribute is
represented by an oval with the attribute name underlined.
Composite Attribute –
An attribute composed of several other attributes is called a composite attribute.
For example:- the Address attribute of the Student entity type consists of Street, City, State, and
Country. In an ER diagram, a composite attribute is represented by an oval connected to further ovals.
Multivalued Attribute –
An attribute that can hold more than one value for a given entity.
For example:- Phone_No (there can be more than one for a given student). In an ER diagram, a
multivalued attribute is represented by a double oval.
Derived Attribute –
An attribute which can be derived from other attributes of the entity type is known as a derived
attribute.
For example:- Age (can be derived from DOB). In an ER diagram, a derived attribute is represented by a dashed
oval.
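The four attribute types can be mirrored in a short Python sketch. This is only an illustration of the ideas, not an ER notation; the class and field names are hypothetical and follow the examples in the text.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Address:
    """Composite attribute: made up of several component attributes."""
    street: str
    city: str
    state: str
    country: str

@dataclass
class Student:
    roll_no: int                     # key attribute: unique for each student
    name: str
    dob: date
    address: Address                 # composite attribute
    phone_nos: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

mira = Student(
    roll_no=1,
    name="Mira",
    dob=date(2004, 6, 15),
    address=Address("MG Road", "Jaunpur", "Uttar Pradesh", "India"),
    phone_nos=["111", "222"],
)
print(mira.age(date(2024, 6, 14)))  # 19
print(mira.age(date(2024, 6, 15)))  # 20
```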
[Figure: the complete entity type Student with its attributes]
Entity-Set and Keys:-
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set.
For example, the roll_number of a student makes him/her identifiable among students.
Relationship:-
The association among entities is called a relationship.
For example, an employee works_at a department, a student enrolls in a course. Here, Works_at and
Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have
attributes. These attributes are called descriptive attributes.
Degree of a relationship set:
The number of different entity sets participating in a relationship set is called the degree of the
relationship set.
Unary Relationship –
When there is only ONE entity set participating in a relationship, the relationship is called a unary
relationship.
For example, one person is married to only one person.
Binary Relationship –
When there are TWO entity sets participating in a relationship, the relationship is called a binary
relationship. For example, Student is enrolled in Course.

n-ary Relationship –
When there are n entity sets participating in a relationship, the relationship is called an n-ary
relationship.
Mapping Cardinalities:-
Cardinality defines the number of entities in one entity set that can be associated with entities of the
other set via a relationship set.
1. One-to-one: One entity from entity set A can be associated with at most one entity of entity set B and vice
versa.

[Diagram: Teacher (1) Teaches (1) Student]
2. One-to-many: One entity from entity set A can be associated with more than one entity of
entity set B; however, an entity from entity set B can be associated with at most one entity of entity set A.

[Diagram: Teacher (1) Teaches (M) Student]
3. Many-to-one: When entities in one entity set can take part only once in the relationship
set and entities in the other entity set can take part more than once in the relationship
set, the cardinality is many to one. Let us assume that a student can take only one course but
one course can be taken by many students. So the cardinality will be n to 1. It means that for
one course there can be n students, but for one student there will be only one course.
4. Many-to-many: When entities in all entity sets can take part more than once in the
relationship, the cardinality is many to many. Let us assume that a student can take more than one
course and one course can be taken by many students. So the relationship will be many to
many.
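In a relational database, a many-to-many relationship such as the student and course example is commonly stored in a separate table that pairs the keys of the two entity sets. The sketch below assumes SQLite via Python's sqlite3 module; the table, column and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- One row per (student, course) pair; the pair itself is the key.
CREATE TABLE enrolls (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO course  VALUES (10, 'DBMS'), (20, 'Networks');
INSERT INTO enrolls VALUES (1, 10), (1, 20), (2, 10);
""")

def courses_of(conn, student_id):
    """Titles of all courses taken by one student."""
    return [r[0] for r in conn.execute(
        "SELECT c.title FROM enrolls e JOIN course c ON c.course_id = e.course_id "
        "WHERE e.student_id = ? ORDER BY c.title", (student_id,))]

def students_in(conn, course_id):
    """Names of all students taking one course."""
    return [r[0] for r in conn.execute(
        "SELECT s.name FROM enrolls e JOIN student s ON s.student_id = e.student_id "
        "WHERE e.course_id = ? ORDER BY s.name", (course_id,))]

print(courses_of(conn, 1))    # ['DBMS', 'Networks']
print(students_in(conn, 10))  # ['Asha', 'Ravi']
```

For the many-to-one case, the separate table is not needed: a course_id column in the student table would be enough, since each student then takes at most one course.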
ER DIAGRAM REPRESENTATION:-
Any object, for example, entities, attributes of an entity, relationship sets, and attributes of relationship sets,
can be represented with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.

[Figure: rectangles labelled Teacher, Student, and Manager]


Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every ellipse
represents one attribute and is directly connected to its entity (rectangle).

[Figure: the Student entity connected to ellipses for its attributes Roll No, Email id, Mob. No., Name, Add, and DOB]
If the attributes are composite, they are further divided in a tree like structure. Every node is then connected
to its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse.

[Figure: the Student entity with attributes Roll No, Email id, Mob. No., Name, Add, and DOB; the composite attribute Name is connected to First Name and Second Name]
Multivalued attributes are depicted by double ellipse.

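[Figure: the Student entity with attributes Roll No, Email id, Mob. No., Name (First Name, Second Name), Add, and DOB, with a multivalued attribute drawn as a double ellipse]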
Participation Constraints:-
Total Participation: Each entity is involved in the relationship. Total participation is represented by double
lines.
Partial participation: Not all entities are involved in the relationship. Partial participation is represented by
single lines.
Generalization and Specialization: -
The ER Model has the power of expressing database entities in a conceptual hierarchical manner. As the
hierarchy goes up, it generalizes the view of entities, and as we go deep in the hierarchy, it gives us the
detail of every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a
more generalized view. For example, a particular student named Mira can be generalized along with all
the students. The entity shall be a student, and further, the student is a person. The reverse is called
specialization where a person is a student, and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the
common properties of the entities being generalized, is called generalization. In generalization, a number of entities
are brought together into one generalized entity based on their similar characteristics. For example,
pigeon, house sparrow, crow, and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into sub-groups
based on their characteristics. Take a group ‘Person’ for example. A person has name, date of birth, gender, etc.
These properties are common to all persons. But in a company, persons can be identified as
employee, employer, customer, or vendor, based on what role they play in the company.
Aggregation
In aggregation, the relationship between two entities is treated as a single entity: the
relationship, together with its participating entities, is aggregated into a higher-level entity.

For example: the relationship in which the Center entity offers the Course entity acts as a single entity,
which is in turn in a relationship with another entity, Visitor. In the real world, a visitor to a coaching center
will never enquire about the Course alone or the Center alone; the enquiry will be
about both.
Reduction of ER diagram to Table

The database can be represented using the notations, and these notations can be reduced to a collection of tables.
In the database, every entity set or relationship set can be represented in tabular form.
The ER diagram is given below:
The following rules are used to convert the ER diagram to tables:
•Each entity type becomes a table.
In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual tables.
•Each single-valued attribute becomes a column of the table.
In the STUDENT entity, STUDENT_NAME and STUDENT_ID form columns of the STUDENT table.
Similarly, COURSE_NAME and COURSE_ID form columns of the COURSE table, and so on.

•The key attribute of an entity type becomes the primary key.
In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are the key
attributes of their entities.
•A multivalued attribute is represented by a separate table.
In the STUDENT table, hobby is a multivalued attribute, and a single column of the STUDENT table cannot
hold multiple values. Hence we create a table STUD_HOBBY with columns
STUDENT_ID and HOBBY. The two columns together form a composite key.
•A composite attribute is represented by its components.
In the given ER diagram, student address is a composite attribute. It contains CITY, PIN, DOOR#, STREET,
and STATE. In the STUDENT table, each of these components becomes an individual column.
•Derived attributes are not stored in the table.
In the STUDENT table, Age is a derived attribute. It can be calculated at any point of time as
the difference between the current date and the Date of Birth.
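The conversion rules above can be sketched in code. The following is only an illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in DBMS. The column types, the DOB column, the name DOOR_NO (used in place of DOOR#) and the sample row are assumptions made for the sketch and are not taken from the ER diagram.

```python
import sqlite3

# Assumption: SQLite stands in for any relational DBMS; types are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (
    STUDENT_ID   INTEGER NOT NULL PRIMARY KEY,  -- key attribute becomes the primary key
    STUDENT_NAME TEXT,                          -- single-valued attribute becomes a column
    DOB          TEXT,                          -- Age is derived from DOB, so it is not stored
    DOOR_NO      TEXT,                          -- composite attribute "address" is stored
    STREET       TEXT,                          -- as its component columns
    CITY         TEXT,
    STATE        TEXT,
    PIN          TEXT
);
CREATE TABLE STUD_HOBBY (                       -- multivalued attribute gets its own table
    STUDENT_ID INTEGER NOT NULL REFERENCES STUDENT(STUDENT_ID),
    HOBBY      TEXT NOT NULL,
    PRIMARY KEY (STUDENT_ID, HOBBY)             -- composite key
);
""")

# Hypothetical sample data, for illustration only.
conn.execute(
    "INSERT INTO STUDENT (STUDENT_ID, STUDENT_NAME, DOB) VALUES (1, 'Mira', '2004-05-17')"
)
conn.executemany("INSERT INTO STUD_HOBBY VALUES (?, ?)", [(1, "Chess"), (1, "Music")])

hobbies = [
    row[0]
    for row in conn.execute(
        "SELECT HOBBY FROM STUD_HOBBY WHERE STUDENT_ID = 1 ORDER BY HOBBY"
    )
]
print(hobbies)
```

One student can now have any number of hobbies, one row per hobby, while the STUDENT table keeps exactly one row per student.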
Relational Model concept

The relational model represents data as a table with columns and rows. Each row is known as a tuple. Each column of
the table has a name, called an attribute.

Domain: It contains a set of atomic values that an attribute can take.

Attribute: It contains the name of a column in a particular table. Each attribute Ai must have a domain, dom(Ai)

Relational instance: In the relational database system, the relational instance is represented by a finite set of
tuples. Relation instances do not have duplicate tuples.

Relational schema: A relational schema contains the name of the relation and name of all columns or
attributes.

Relational key: A relational key is a set of one or more attributes whose values identify each row in the relation
uniquely.
NAME     ROLL_NO  PHONE_NO    ADDRESS    AGE
Ram      14795    7305758992  Noida      24
Shyam    12839    9026288936  Delhi      35
Laxman   33289    8583287182  Gurugram   20
Mahesh   27857    7086819134  Ghaziabad  27
Ganesh   17282    9028913988  Delhi      40

•In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the attributes.
•The instance of schema STUDENT has 5 tuples.
Properties of Relations
•The name of the relation is distinct from all other relations.
•Each cell of the relation contains exactly one atomic (single) value.
•Each attribute has a distinct name.
•The order of attributes has no significance.
•The relation has no duplicate tuples.
•The order of tuples has no significance.
Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to obtain the result
of the query. It uses operators to perform queries.
Types of Relational operation
1. Select Operation:
•The select operation selects tuples that satisfy a given predicate.
•It is denoted by sigma (σ).

Notation: σ p(r)
Where:
σ denotes the selection operator
r is the relation
p is the selection predicate, a propositional logic formula which may use connectives like AND, OR and NOT,
together with relational operators like =, ≠, ≥, <, >, ≤.
For example: LOAN Relation
BRANCH_NAME LOAN_NO AMOUNT
Downtown L-17 1000
Redwood L-23 2000
Perryride L-15 1500
Downtown L-14 1500
Mianus L-13 500
Roundhill L-11 900
Perryride L-16 1300

σ BRANCH_NAME="Perryride" (LOAN)
BRANCH_NAME LOAN_NO AMOUNT
Perryride L-15 1500
Perryride L-16 1300
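As a rough illustration of the select operation, the LOAN relation can be held as a list of tuples and filtered with a predicate. The helper name select and the tuple layout are assumptions made for this sketch.

```python
# LOAN relation from the example: (BRANCH_NAME, LOAN_NO, AMOUNT)
LOAN = [
    ("Downtown", "L-17", 1000),
    ("Redwood", "L-23", 2000),
    ("Perryride", "L-15", 1500),
    ("Downtown", "L-14", 1500),
    ("Mianus", "L-13", 500),
    ("Roundhill", "L-11", 900),
    ("Perryride", "L-16", 1300),
]


def select(relation, predicate):
    """Return the tuples of `relation` that satisfy `predicate` (the sigma operator)."""
    return [t for t in relation if predicate(t)]


# sigma BRANCH_NAME = "Perryride" (LOAN)
perryride_loans = select(LOAN, lambda t: t[0] == "Perryride")

# Predicates can combine conditions with and / or / not.
large_downtown = select(LOAN, lambda t: t[0] == "Downtown" and t[2] > 1200)

print(perryride_loans)
print(large_downtown)
```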
Project Operation:
•This operation shows the list of those attributes that we wish to appear in the result. Rest of
the attributes are eliminated from the table.
•It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of relation r.
Example: CUSTOMER RELATION

NAME STREET CITY
Jones Main Harrison
Smith North Rye
Hays Main Harrison
Curry North Rye
Johnson Alma Brooklyn
Brooks Senator Brooklyn

∏ NAME, CITY (CUSTOMER)

NAME CITY
Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
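A minimal sketch of the project operation, assuming each tuple is stored as a dictionary keyed by attribute name. The helper name project is an assumption. Because a relation is a set, the sketch also drops duplicate rows that appear once columns are removed.

```python
CUSTOMER = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
    {"NAME": "Brooks", "STREET": "Senator", "CITY": "Brooklyn"},
]


def project(relation, attributes):
    """Keep only `attributes` of each tuple and eliminate duplicate results."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in seen:
            seen.add(projected)
            result.append(projected)
    return result


name_city = project(CUSTOMER, ["NAME", "CITY"])  # six distinct rows
cities = project(CUSTOMER, ["CITY"])             # duplicates collapse to three rows
print(name_city)
print(cities)
```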
Union Operation:
•Suppose there are two relations R and S. The union operation contains all the tuples that are
either in R or S or both in R & S.
•It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must satisfy the following conditions:
•R and S must have the same number of attributes, with compatible domains.
•Duplicate tuples are eliminated automatically.
DEPOSITOR RELATION

CUSTOMER_NAME ACCOUNT_NO
Johnson A-101
Smith A-121
Mayes A-321
Turner A-176
Johnson A-273
Jones A-472
Lindsay A-284

BORROW RELATION

CUSTOMER_NAME LOAN_NO
Jones L-17
Smith L-23
Hayes L-15
Jackson L-14
Curry L-93
Smith L-11
Williams L-17

∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

CUSTOMER_NAME
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes
Set Intersection:
•Suppose there are two relations R and S. The set intersection operation contains all tuples that
are in both R & S.
•It is denoted by ∩.
Notation: R ∩ S
Example: Using the above DEPOSITOR table and BORROW table

∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

CUSTOMER_NAME
Smith
Jones
Set Difference:
•Suppose there are two relations R and S. The set difference operation contains all tuples
that are in R but not in S.
•It is denoted by minus (-).

Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table

∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)

CUSTOMER_NAME
Jackson
Hayes
Williams
Curry
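The three set operations map directly onto Python's built-in set type. The sketch below uses the DEPOSITOR and BORROW data from the examples. Building the name sets with a set comprehension plays the role of the projection on CUSTOMER_NAME.

```python
DEPOSITOR = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"),
    ("Turner", "A-176"), ("Johnson", "A-273"), ("Jones", "A-472"),
    ("Lindsay", "A-284"),
]
BORROW = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"),
]

# Projection on CUSTOMER_NAME; a set removes duplicates automatically.
depositors = {name for name, _ in DEPOSITOR}
borrowers = {name for name, _ in BORROW}

union = borrowers | depositors         # in BORROW or DEPOSITOR or both
intersection = borrowers & depositors  # in both
difference = borrowers - depositors    # in BORROW but not in DEPOSITOR

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```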
Cartesian product
•The Cartesian product is used to combine each row in one table with each row in the other table. It is
also known as a cross product.
•It is denoted by X.
Notation: E X D
Example: EMPLOYEE X DEPARTMENT

EMPLOYEE
EMP_ID EMP_NAME EMP_DEPT
1 Smith A
2 Harry C
3 John B

DEPARTMENT
DEPT_NO DEPT_NAME
A Marketing
B Sales
C Legal

EMPLOYEE X DEPARTMENT
EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME
1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal
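The Cartesian product can be sketched with itertools.product from the Python standard library. The second step, which keeps only rows where EMP_DEPT equals DEPT_NO, is an extra illustration of applying a select to the product and is not part of the example above.

```python
from itertools import product

EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# Every EMPLOYEE tuple is paired with every DEPARTMENT tuple: 3 x 3 = 9 rows.
cross = [emp + dept for emp, dept in product(EMPLOYEE, DEPARTMENT)]

# Selecting the rows where EMP_DEPT = DEPT_NO keeps only the matching pairs.
matched = [row for row in cross if row[2] == row[3]]

print(len(cross))
print(matched)
```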
Rename Operation:

The rename operation is used to rename the output relation. It is denoted by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation to STUDENT1.

ρ(STUDENT1, STUDENT)
Integrity Constraints

•Integrity constraints are a set of rules. They are used to maintain the quality of information.
•Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
•Thus, integrity constraints are used to guard against accidental damage to the database.
Types of Integrity Constraint
Domain constraints
•Domain constraints can be defined as the definition of a valid set of values for an attribute.
•The data type of domain includes string, character, integer, time, date, currency, etc. The value
of the attribute must be available in the corresponding domain.

Entity integrity constraints
•The entity integrity constraint states that primary key value can't be null.
•This is because the primary key value is used to identify individual rows in relation and if
the primary key has a null value, then we can't identify those rows.
•A table can contain null values in fields other than the primary key field.
Referential Integrity Constraints
•A referential integrity constraint is specified between two tables.
•In the Referential integrity constraints, if a foreign key in Table 1 refers to the Primary Key of
Table 2, then every value of the Foreign Key in Table 1 must be null or be available in Table 2.
Key constraints
•A key is a set of attributes that is used to identify an entity within its entity set uniquely.
•An entity set can have multiple keys, out of which one key will be the primary key. A primary
key must contain a unique value for every row in the relational table.
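The entity, key and referential integrity constraints can be observed in a small sketch. It assumes SQLite through Python's sqlite3 module rather than Oracle, reuses the EMPLOYEE and DEPARTMENT names from the Cartesian product example, and introduces a hypothetical helper named violates. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DEPT_NO   TEXT NOT NULL PRIMARY KEY,
    DEPT_NAME TEXT NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_ID   INTEGER NOT NULL PRIMARY KEY,
    EMP_NAME TEXT NOT NULL,
    EMP_DEPT TEXT REFERENCES DEPARTMENT(DEPT_NO)  -- foreign key, may be NULL
);
INSERT INTO DEPARTMENT VALUES ('A', 'Marketing');
INSERT INTO EMPLOYEE VALUES (1, 'Smith', 'A');
INSERT INTO EMPLOYEE VALUES (2, 'Harry', NULL);   -- a NULL foreign key is allowed
""")


def violates(sql):
    """Return True if executing `sql` raises an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


null_key = violates("INSERT INTO DEPARTMENT VALUES (NULL, 'Legal')")    # entity integrity
duplicate_key = violates("INSERT INTO DEPARTMENT VALUES ('A', 'Sales')")  # key constraint
dangling_ref = violates("INSERT INTO EMPLOYEE VALUES (3, 'John', 'Z')")   # referential integrity

print(null_key, duplicate_key, dangling_ref)
```

All three statements are rejected, while the employee row with a NULL department is accepted, which matches the rule that a foreign key value must either be null or exist in the referenced table.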
Relational keys:-

Super Keys:-
A super key is a set of attributes whose values can be used to uniquely identify a tuple within a relation. A
relation may have more than one super key, but it always has at least one: the set of all attributes that make up
the relation.

Candidate Keys:-
A candidate key is a super key that is minimal; that is, there is no proper subset that is itself a super key. A
relation may have more than one candidate key, and the different candidate keys may have a different number
of attributes. In other words, you should not interpret 'minimal' to mean the super key with the fewest
attributes.
Properties of Candidate key:
•It must contain unique values
•Candidate key in SQL may have multiple attributes
•Must not contain null values
•It should contain minimum fields to ensure uniqueness
•Uniquely identify each record in a table
Primary Key:-
The primary key of a relation is a candidate key especially selected to be the key for the relation.
In other words, it is a choice, and there can be only one candidate key designated to be the
primary key.
Rules for defining Primary key:
•Two rows can't have the same primary key value.
•Every row must have a primary key value.
•The primary key field cannot be null.
•The value in a primary key column can never be modified or updated if any
foreign key refers to that primary key.

•Alternate Key - is a column or group of columns in a table that uniquely identifies every row in that table
but was not chosen as the primary key.
Foreign key :-
The attribute(s) within one relation that matches a candidate key of another relation. A relation may have several
foreign keys, associated with different target relations.
Foreign keys allow users to link information in one relation to information in another relation. Without FKs, a
database would be a collection of unrelated tables.
Difference Between Primary key & Foreign key
The main differences between a primary key and a foreign key are:

Primary Key: Helps you to uniquely identify a record in the table.
Foreign Key: It is a field in the table that is the primary key of another table.

Primary Key: Never accepts null values.
Foreign Key: May accept multiple null values.

Primary Key: In some DBMSs the primary key is a clustered index, and data in the table are physically organized in the sequence of the clustered index.
Foreign Key: In such systems a foreign key does not automatically create an index, clustered or non-clustered. However, you can manually create an index on the foreign key.

Primary Key: You can have only a single primary key in a table.
Foreign Key: You can have multiple foreign keys in a table.
Oracle CREATE TABLE
In Oracle, CREATE TABLE statement is used to create a new table in the database.
To create a table, you have to name that table and define its columns and datatype for each column.
Syntax:
CREATE TABLE table_name
(
  column1 datatype [ NULL | NOT NULL ],
  column2 datatype [ NULL | NOT NULL ],
  ...
  column_n datatype [ NULL | NOT NULL ]
);
Parameters used in syntax
•table_name: It specifies the name of the table which you want to create.
•column1, column2, ... column_n: It specifies the columns which you want to add to the table. Every
column must have a datatype. Each column can be declared as either "NULL" or "NOT NULL". If
neither is specified, the column allows "NULL" by default.
Example:-
CREATE TABLE customers
( customer_id number(10) NOT NULL,
  customer_name varchar2(50) NOT NULL,
  city varchar2(50)
);
CREATE TABLE Example with primary key
CREATE TABLE customers
( customer_id number(10) NOT NULL,
  customer_name varchar2(50) NOT NULL,
  city varchar2(50),
  CONSTRAINT customers_pk PRIMARY KEY (customer_id)
);

OR

CREATE TABLE Login
(
  id number constraint id_pk primary key,
  pass varchar2(10)
);
CREATE TABLE AS Statement
The CREATE TABLE AS statement is used to create a table from an existing table by copying the
columns of existing table.

CREATE TABLE new_table
AS (SELECT * FROM old_table);

Create Table Example: copying selected columns of another table

Syntax:
CREATE TABLE new_table
  AS (SELECT column_1, column_2, ... column_n
      FROM old_table);

Let's take an example:

CREATE TABLE newcustomers2
AS (SELECT customer_id, customer_name
    FROM customers
    WHERE customer_id < 5000);
Create Table Example: copying selected columns from multiple tables

Syntax:
CREATE TABLE new_table
AS (SELECT column_1, column_2, ... column_n
    FROM old_table_1, old_table_2, ... old_table_n);

Let's take an example: Consider that you have already created two tables "regularcustomers"
and "irregularcustomers".
The table "regularcustomers" has three columns rcustomer_id, rcustomer_name and rc_city.
The table name is left unquoted here, because a quoted lowercase name would be case-sensitive and the later unquoted references to it would not match.
CREATE TABLE regularcustomers
  ( "RCUSTOMER_ID" NUMBER(10,0) NOT NULL ENABLE,
    "RCUSTOMER_NAME" VARCHAR2(50) NOT NULL ENABLE,
    "RC_CITY" VARCHAR2(50)
  )
/
The second table "irregularcustomers" also has three columns ircustomer_id,
ircustomer_name and irc_city.
CREATE TABLE irregularcustomers
  ( "IRCUSTOMER_ID" NUMBER(10,0) NOT NULL ENABLE,
    "IRCUSTOMER_NAME" VARCHAR2(50) NOT NULL ENABLE,
    "IRC_CITY" VARCHAR2(50)
  )
/

In the following example, we will create a table named "newcustomers3" by copying columns from both tables.
Example:
CREATE TABLE newcustomers3
  AS (SELECT regularcustomers.rcustomer_id, regularcustomers.rc_city, irregularcustomers.ircustomer_name
      FROM regularcustomers, irregularcustomers
      WHERE regularcustomers.rcustomer_id = irregularcustomers.ircustomer_id
      AND regularcustomers.rcustomer_id < 5000);
ALTER TABLE Statement
In Oracle, the ALTER TABLE statement is used to add, modify, or drop columns in a
table. It is also used to rename a table.
How to add column in a table
Syntax:
ALTER TABLE table_name
  ADD column_name column-definition;
Example:
Consider the already existing table customers. Now, add a new column customer_age to the
table customers.
ALTER TABLE customers
  ADD customer_age varchar2(50);
Add multiple columns in the existing table
Syntax:
ALTER TABLE table_name
  ADD (column_1 column-definition,
       column_2 column-definition,
       ...
       column_n column-definition);

Example
ALTER TABLE customers
  ADD (customer_type varchar2(50),
       customer_address varchar2(50));
Modify column of a table

Syntax:
ALTER TABLE table_name
  MODIFY column_name column_type;

Example:
ALTER TABLE customers
  MODIFY customer_name varchar2(100) not null;
Modify multiple columns of a table
Syntax:
ALTER TABLE table_name
  MODIFY (column_1 column_type,
          column_2 column_type,
          ...
          column_n column_type);

Example:
ALTER TABLE customers
  MODIFY (customer_name varchar2(100) not null,
          city varchar2(100));
Drop column of a table

Syntax:
ALTER TABLE table_name
  DROP COLUMN column_name;
Example:
ALTER TABLE customers
  DROP COLUMN customer_name;

Rename column of a table

Syntax:
ALTER TABLE table_name
  RENAME COLUMN old_name to new_name;

Example:
ALTER TABLE customers
  RENAME COLUMN customer_name to cname;
Rename table
Syntax:
ALTER TABLE table_name
  RENAME TO new_table_name;

Example:
ALTER TABLE customers
  RENAME TO retailers;
DROP TABLE Statement
The Oracle DROP TABLE statement is used to remove or delete a table from the Oracle database.
Syntax
DROP TABLE [schema_name.]table_name
[ CASCADE CONSTRAINTS ]
[ PURGE ];

Parameters
schema_name: It specifies the name of the schema that owns the table.
table_name: It specifies the name of the table which you want to remove from the Oracle database.
CASCADE CONSTRAINTS: It is optional. If specified, it also drops all referential integrity constraints
that refer to primary and unique keys in the dropped table.
PURGE: It is also optional. If specified, the table and its dependent objects are purged immediately instead of being placed in the recycle
bin, so they can't be recovered.

Example
DROP TABLE customers;

If other tables have referential integrity constraints that refer to table_name and you do not specify the
CASCADE CONSTRAINTS option, the DROP TABLE statement will return an error and Oracle
will not drop the table.
DROP TABLE Example with PURGE parameter
DROP TABLE customers PURGE;
This statement will drop the table called customers and issue a PURGE so that the space
associated with the customers table is released and the customers table is not placed in the recycle
bin. So, it is not possible to recover that table if required.
Normal Forms in DBMS
Normalization is the process of minimizing redundancy in a relation or set of relations.
Redundancy in a relation may cause insertion, deletion and update anomalies, so normal forms
are used to eliminate or reduce redundancy in database
tables.
If a database design is not perfect, it may contain anomalies, which are like a bad dream for
any database administrator. Managing a database with anomalies is next to impossible.
1. Update anomalies: - If data items are scattered and are not linked to each other properly,
then it could lead to strange situations. For example, when we try to update one data item
having its copies scattered over several places, a few instances get updated properly while a
few others are left with old values. Such instances leave the database in an inconsistent state.

2. Deletion anomalies: - We tried to delete a record, but parts of it were left undeleted because,
unknown to us, the data was also saved somewhere else.

3. Insert anomalies: - We tried to insert data in a record that does not exist at all.

Normalization is a method to remove all these anomalies and bring the database to a consistent state.
Decompositions:-
Intuitively, redundancy arises when a relational schema forces an association between attributes that is not
natural. Functional dependencies can be used to identify such situations and suggest refinements to the
schema. The essential idea is that many problems arising from redundancy can be addressed by replacing a
relation with a collection of smaller relations. A decomposition of a relation schema R consists of replacing
the relation schema by two (or more) relation schemas that each contain a subset of the attributes of R and
together include all attributes in R. Intuitively, we want to store the information in any given instance of R
by storing projections of the instance. For example, a relation Hourly_Emps with the attributes ssn, name, lot,
rating, hourly_wages and hours_worked can be decomposed into two relations:
Hourly_Emps2 (ssn, name, lot, rating, hours_worked)
Wages (rating, hourly_wages)
1. First Normal Form –
If a relation contains a composite or multi-valued attribute, it violates first normal form. In other words, a relation
is in first normal form if it does not contain any composite or multi-valued attribute, that is, if
every attribute in that relation is a single-valued attribute.

Example 1 – Relation STUDENT in table 1 is not in 1NF because of multi-valued attribute STUD_PHONE. Its decomposition into 1NF has been
shown in table 2.
Example 2 –

ID Name Courses
1 A C1, C2
2 E C3
3 M C2, C3

In the above table, Courses is a multi-valued attribute, so the table is not in 1NF. Its decomposition into 1NF is:

ID Name Courses
1 A C1
1 A C2
2 E C3
3 M C2
3 M C3
Functional Dependency:-
A functional dependency (FD) is a constraint between two sets of attributes in a relation. A functional dependency says
that if two tuples have the same values for attributes A1, A2, ..., An, then those two tuples must also have the same values
for attributes B1, B2, ..., Bn.
A functional dependency is represented by an arrow sign (→), that is, X→Y, where X functionally determines Y. The left-
hand side attributes determine the values of attributes on the right-hand side.
A functional dependency is a property of the semantics of the attributes in a relation. The semantics indicate how
attributes relate to one another, and specify the functional dependencies between attributes. When a functional
dependency is present, the dependency is specified as a constraint between the attributes.

Trivial Functional Dependency:-


1. Trivial: If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs
always hold.

2. Non-trivial: If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.

3. Completely non-trivial: If an FD X → Y holds, where X ∩ Y = Φ, it is said to be a completely non-trivial FD.


Armstrong's Axioms:-
If F is a set of functional dependencies then the closure of F, denoted as F+, is the set of all functional
dependencies logically implied by F. Armstrong's Axioms are a set of rules that, when applied repeatedly,
generates a closure of functional dependencies.
1. Reflexive rule: If alpha is a set of attributes and beta is a subset of alpha, then alpha → beta holds.

2. Augmentation rule: If a → b holds and y is an attribute set, then ay → by also holds. That is, adding
attributes to a dependency does not change the basic dependency.

3. Transitivity rule: Same as the transitive rule in algebra, if a → b holds and b → c holds, then a → c also
holds. In a → b, a is said to functionally determine b.
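Enumerating F+ directly is expensive, so in practice one usually computes the closure of an attribute set X, written X+: the set of all attributes that X determines under F. An FD X → Y is in F+ exactly when Y is a subset of X+. The sketch below is one way to do this. The function names and the sample FD set are assumptions made for illustration.

```python
def closure(attributes, fds):
    """Return the closure of `attributes` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attributes)  # reflexivity: X determines every attribute of X
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once lhs is covered, transitivity adds everything lhs determines.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(lhs, rhs, fds):
    """X -> Y follows from `fds` exactly when Y is a subset of the closure of X."""
    return set(rhs) <= closure(lhs, fds)


# Hypothetical FD set for illustration: A -> B, B -> C, CD -> E
FDS = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]

a_closure = closure({"A"}, FDS)
ad_closure = closure({"A", "D"}, FDS)
print(sorted(a_closure), sorted(ad_closure))
print(implies({"A"}, {"C"}, FDS), implies({"A"}, {"E"}, FDS))
```

Here A → C follows by transitivity, while A → E does not, because E also needs D on the left-hand side.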
Second Normal Form –
To be in second normal form, a relation must be in first normal form and relation must not contain
any partial dependency. A relation is in 2NF if it has No Partial Dependency, i.e., no non-prime
attribute (attributes which are not part of any candidate key) is dependent on any proper subset of
any candidate key of the table.
Partial Dependency – If the proper subset of candidate key determines non-prime attribute, it is
called partial dependency.
report: report_no -> editor, dept_no
        dept_no -> dept_name, dept_addr
        author_id -> author_name, author_addr

Definition. A table is in second normal form (2NF) if and only if it is in 1NF and every non key attribute
is fully dependent on the primary key. An attribute is fully dependent on the primary key if it is on the
right side of an FD for which the left side is either the primary key itself or something that can be
derived from the primary key using the transitivity of FDs. An example of a transitive FD in report is
the following: report_no -> dept_no and dept_no -> dept_name, which together give report_no -> dept_name.
The projection of report into three smaller tables has preserved the FDs and the association between
report_no and author_id that was important in the original table.

The FDs for these 2NF tables are:

report1: report_no -> editor, dept_no
         dept_no -> dept_name, dept_addr
report2: author_id -> author_name, author_addr
report3: report_no, author_id is a candidate key (no FDs)
Example: Let's assume a school stores the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject, so the candidate key of the table is {TEACHER_ID, SUBJECT}.
TEACHER table
In the given table, the non-prime attribute TEACHER_AGE is dependent on TEACHER_ID, which is a
proper subset of the candidate key. That's why it violates the rule for 2NF.

TEACHER_ID SUBJECT TEACHER_AGE


25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL table

TEACHER_ID TEACHER_AGE
25 30
47 35
83 38

TEACHER_SUBJECT table

TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
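A quick way to convince yourself that this decomposition loses no information is to project the TEACHER table onto the two new schemas and join the projections back on TEACHER_ID. The sketch below does this with Python sets. The variable names are assumptions made for illustration.

```python
# TEACHER relation: (TEACHER_ID, SUBJECT, TEACHER_AGE)
TEACHER = {
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
}

# Projections onto the two 2NF schemas (sets drop duplicate rows).
teacher_detail = {(tid, age) for tid, _, age in TEACHER}
teacher_subject = {(tid, subject) for tid, subject, _ in TEACHER}

# Natural join of the two projections on TEACHER_ID.
rejoined = {
    (tid, subject, age)
    for tid, subject in teacher_subject
    for detail_tid, age in teacher_detail
    if tid == detail_tid
}

print(len(teacher_detail))   # each teacher's age is now stored once
print(rejoined == TEACHER)   # the join reproduces the original table
```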
Third Normal Form –
A relation is in third normal form if it is in second normal form and there is no transitive dependency
for non-prime attributes.
A relation is in 3NF if at least one of the following conditions holds for every non-trivial functional
dependency X –> Y:

1.X is a super key.


2.Y is a prime attribute (each element of Y is part of some candidate key).

Definition. A table is in third normal form (3NF) if and only if for every nontrivial functional dependency X->A, where X
and A are either simple or composite attributes, one of two conditions must hold. Either attribute X is a superkey, or
attribute A is a member of a candidate key. If attribute A is a member of a candidate key, A is called a prime attribute.
Note: a trivial FD is of the form YZ->Z.
report11: report_no -> editor, dept_no
report12: dept_no -> dept_name, dept_addr
report2: author_id -> author_name, author_addr
report3: report_no, author_id is a candidate key (no FDs)
Transitive dependency – If A->B and B->C are two FDs then A->C is called transitive dependency.

•Example 1 – In relation STUDENT given in Table 4, FD set: {STUD_NO -> STUD_NAME, STUD_NO ->
STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}
Candidate Key: {STUD_NO}

For this relation, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY hold,
so STUD_COUNTRY is transitively dependent on STUD_NO. This violates the third normal form. To convert it
to third normal form, we decompose the relation
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE)
as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)
A functional dependency is said to be transitive if it is indirectly formed by two functional
dependencies.
<MovieListing>

Movie_ID Listing_ID Listing_Type DVD_Price ($)

M08 L09 Crime 180

M03 L05 Drama 250

M05 L09 Crime 180

The above table is not in 3NF because it has a transitive functional dependency −
Movie_ID -> Listing_ID
Listing_ID -> Listing_Type

Therefore, Movie_ID -> Listing_Type is a transitive functional dependency.
This means the relation <MovieListing> violates the 3rd Normal Form (3NF).
To remove the violation, you need to split the table and remove the transitive functional dependency.
<Movie>

Movie_ID Listing_ID DVD_Price ($)
M08 L09 180
M03 L05 250
M05 L09 180

<Listing>

Listing_ID Listing_Type
L09 Crime
L05 Drama

Now the above relations are in Third Normal Form (3NF).


Boyce-Codd Normal Form:-
3NF, which eliminates most of the anomalies known in databases today, is the most common standard for
normalization in commercial databases and CASE tools. The few remaining anomalies can be eliminated by the
Boyce-Codd normal form (BCNF). BCNF is considered to be a strong variation of 3NF.

Definition. A table R is in Boyce-Codd normal form (BCNF) if for every nontrivial FD X->A, X is a superkey.
BCNF is a stronger form of normalization than 3NF because it eliminates the second condition for 3NF, which
allowed the right side of the FD to be a prime attribute. Thus, every left side of an FD in a table must be a
superkey. Every table that is BCNF is also 3NF, 2NF, and 1NF, by the previous definitions.
A relation R is in BCNF if R is in Third Normal Form and for every FD, LHS is super key. A relation is in BCNF
iff in every non-trivial functional dependency X –> Y, X is a super key.

•Example 1 – Find the highest normal form of a relation R(A,B,C,D,E) with FD set {BC->D, AC->BE, B->E}
Step 1. As we can see, (AC)+ = {A,C,B,E,D}, but none of its proper subsets can determine all attributes of the relation, so
AC is a candidate key. A and C can't be derived from any other attribute of the relation, so there is only one
candidate key, {AC}.
Step 2. Prime attributes are those attributes which are part of a candidate key, {A, C} in this example; the others,
{B, D, E}, are non-prime.
Step 3. The relation R is in 1st normal form, as a relational DBMS does not allow multi-valued or composite
attributes.
The relation is in 2nd normal form because BC->D satisfies 2NF (BC is not a proper subset of
candidate key AC), AC->BE satisfies 2NF (AC is the candidate key), and B->E satisfies 2NF (B
is not a proper subset of candidate key AC).
The relation is not in 3rd normal form because in BC->D neither is BC a super key nor is D a prime attribute,
and in B->E neither is B a super key nor is E a prime attribute. To satisfy 3rd normal form, either the LHS of an
FD must be a super key or the RHS must be a prime attribute.
So the highest normal form of the relation is 2nd normal form.
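The checks in Example 1 can be reproduced mechanically. The sketch below takes the candidate key AC from Step 1 as given and tests each FD against the BCNF and 3NF conditions, then looks for partial dependencies. The helper names (closure, is_superkey, fmt) are assumptions made for illustration.

```python
def closure(attributes, fds):
    """Closure of an attribute set under a list of (lhs, rhs) FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


R = set("ABCDE")
FDS = [(set("BC"), set("D")), (set("AC"), set("BE")), (set("B"), set("E"))]
PRIME = set("AC")  # attributes of the only candidate key, AC (from Step 2)


def is_superkey(attributes):
    return closure(attributes, FDS) == R


def fmt(lhs, rhs):
    return "".join(sorted(lhs)) + "->" + "".join(sorted(rhs))


# BCNF: the left side of every non-trivial FD must be a super key.
bcnf_violations = [fmt(l, r) for l, r in FDS if not r <= l and not is_superkey(l)]

# 3NF: as BCNF, but an FD is also allowed if its right side is all prime.
third_nf_violations = [
    fmt(l, r) for l, r in FDS
    if not r <= l and not is_superkey(l) and not (r - l) <= PRIME
]

# 2NF: no proper subset of the key AC may determine a non-prime attribute.
partial_dependencies = [s for s in ("A", "C") if closure(set(s), FDS) - PRIME]

print(bcnf_violations, third_nf_violations, partial_dependencies)
```

No partial dependency is found, but BC->D and B->E both fail the 3NF condition, which agrees with the conclusion that the highest normal form is 2NF.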
•Example 2 – For example, consider relation R(A, B, C) with
A -> BC,
B -> A
A and B are both super keys, so the above relation is in BCNF.

Key Points –
1. BCNF is free from redundancy caused by functional dependencies.
2. If a relation is in BCNF, then 3NF is also satisfied.
3. If all attributes of a relation are prime attributes, then the relation is always in 3NF.
4. A relation in a relational database is always at least in 1NF.
5. Every binary relation (a relation with only 2 attributes) is always in BCNF.
6. If a relation has only singleton candidate keys (i.e. every candidate key consists of only 1
attribute), then the relation is always in 2NF (because no partial functional dependency is possible).
7. Sometimes decomposing to BCNF may not preserve functional dependencies. In that case go for
BCNF only if the lost FD(s) are not required, else normalize till 3NF only.
8. There are more normal forms beyond BCNF, like 4NF and others. But in real world
database systems it's generally not required to go beyond BCNF.
Exercise 1: Find the highest normal form in R (A, B, C, D, E) under following functional dependencies.
ABC --> D
CD --> AE
Important points for solving the above type of question:
1) It is always a good idea to start checking from BCNF, then 3NF, and so on.
2) If a functional dependency satisfies a normal form, then there is no need to check it against lower normal
forms. For example, ABC –> D is in BCNF (note that ABC is a superkey), so there is no need to check this
dependency for lower normal forms.
Candidate keys in the given relation are {ABC, BCD}, so A, B, C and D are prime attributes and E is non-prime.
BCNF: ABC -> D is in BCNF. Let us check CD -> AE. CD is not a super key, so this dependency is not in
BCNF. So, R is not in BCNF.

3NF: We don't need to check ABC -> D, as it already satisfies BCNF. Let us consider
CD -> AE. Since E is not a prime attribute, the relation is not in 3NF.
2NF: In 2NF, we need to check for partial dependency. CD is a proper subset of the candidate key BCD,
and it determines E, which is a non-prime attribute. So, the given relation is also not in 2NF. So, the highest
normal form is 1NF.
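The candidate keys used in Exercise 1 can be found by brute force: try attribute sets in order of increasing size and keep those whose closure is the whole relation and that contain no smaller key. The sketch below is only suitable for small schemas like this one. The function names are assumptions made for illustration.

```python
from itertools import combinations


def closure(attributes, fds):
    """Closure of an attribute set under a list of (lhs, rhs) FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure covers the relation."""
    attributes = sorted(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


R = set("ABCDE")
FDS = [(set("ABC"), set("D")), (set("CD"), set("AE"))]

keys = ["".join(sorted(k)) for k in candidate_keys(R, FDS)]
non_prime = R - set("".join(keys))

# CD is a proper subset of the key BCD; check which non-prime attributes it determines.
partial = closure(set("CD"), FDS) & non_prime

print(keys, sorted(non_prime), sorted(partial))
```

The search returns ABC and BCD, leaves E as the only non-prime attribute, and shows that CD determines E, which is the partial dependency that keeps the relation in 1NF.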
