Dbms Complete Notes 2nd Sem-1
DATABASE MANAGEMENT SYSTEMS
CHAPTER -1
DATABASES AND DATABASE USERS
Basic Definitions :
Data :
Data is a representation of raw facts, figures, statistics, etc. that by
itself has no particular meaning.
Data can be in the form of numbers, characters, symbols, or even
pictures.
Ex:
1,ABC,19 etc.
Information :
Information is processed data: a collection of data that has a definite
meaning is called information.
Field:
A field is a single piece of information, a record is one complete set of
fields and a file is a collection of records.
Ex: A telephone book is analogous to a file. It contains a list of records,
each of which consists of three fields: name, address, and telephone
number.
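The field, record and file terms above can be sketched in a few lines of Python. This is only an illustration: the class name PhoneBookRecord and the sample entries are made up for the example and are not part of the notes.

```python
from dataclasses import dataclass, fields


@dataclass
class PhoneBookRecord:
    """One record: a complete set of fields for a single entry."""
    name: str        # field 1
    address: str     # field 2
    telephone: str   # field 3


# A "file" is modeled here simply as a list of records (hypothetical data).
phone_book = [
    PhoneBookRecord("Asha", "12 Lake Road", "555-0101"),
    PhoneBookRecord("Ravi", "7 Hill Street", "555-0102"),
]

field_names = [f.name for f in fields(PhoneBookRecord)]
print(field_names)      # the three fields that every record has
print(len(phone_book))  # the number of records in the file
```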
Database :
A database is a collection of related data (information) that
is organized so that it can easily be accessed, managed, and
updated.
Examples:
University Database
Data: Departments, Students, Exams, Rooms, etc.
Usage: Create exam plans, enter exam results, create statistics,
and build timetables
Bank Database
Data: Clients, Accounts, Credits, Funds, etc.
Applications: Accounting, transfers, risk management
Airline Database
Data: Flights, Passengers, Employees, Airplanes, etc.
Applications: Reservation, booking, creating flight schedules
Database System :
A database system is a way of organizing information on a
computer , implemented by a set of computer programs.
Data Base Management System (DBMS) :
It is a collection of programs that enables users to create and
maintain a database.
In other words, it is general-purpose software that provides
users with the processes of defining, constructing and
manipulating the database for various applications.
Ex:
Computerized library Systems
Automated teller machines
Flight reservation Systems
FUNCTIONALITIES OF DATA BASE
Defining A Database :
Defining a database involves specifying the Data types,
structures, and constraints of the data to be stored in the
database.
Constructing A Database:
constructing the database is the process of storing data on
some storage medium that is controlled by the DBMS.
Manipulating A Database:
Manipulating a database includes functions such as
querying the database to retrieve specific data, updating
the database, and generating reports from the data.
Sharing:
Allows multiple users to access the database
simultaneously.
PROPERTIES OF THE DATABASE :
There are three properties:
Student
Name    Rollno    Class    Department
Somu    1         A        CS
Course
Subject name    Subject code    Department
CA              DCCA101         CS
Grade Report
Actors on the Scene: They actually use and control the database content;
and design, develop and maintain database applications
Database Administrators
Database Designers
Software Engineers
End-users
1.Database Administrator (DBA):
DBA is a person who is responsible for authorizing access
to the database, coordinating and monitoring its use, and
acquiring software and hardware resources as needed.
2.Database Designers:
Database Designers are responsible for identifying the
data to be stored in the database and for choosing
appropriate structures to represent and store the data.
They must communicate with the end-users and
understand their needs.
3.End Users:
These are persons who access the database for querying, updating,
and generating reports.
1.Controlling Redundancy
The Database Management System helps to avoid storing
redundant data (the same data multiple times) in the database.
Storing the same data multiple times leads to
several problems:
Storage space is wasted when the same data is stored repeatedly.
Files that represent the same data may become inconsistent.
Database Damage
A Brief History of Database Applications:
Early Database Applications Using Hierarchical and Network
Systems:
Large numbers of records of similar structure.
One of the main problems with early database systems was the
intermixing of conceptual relationships with the physical storage and
placement of records on disk.
Another shortcoming of early systems was that they provided only
programming language interfaces.
Providing Data Abstraction and Application Flexibility with Relational
Databases:
Relational databases were originally proposed to separate the
physical storage of data from its conceptual representation and to
provide a mathematical foundation for data representation and
querying.
The relational data model also introduced high-level query
languages that provided an alternative to programming language
interfaces, making it much faster to write new queries.
Relational databases now exist on almost all types of computers,
from small personal computers to large servers.
Object-Oriented Applications and the Need for More Complex
Databases:
object-oriented databases (OODBs) were considered a
competitor to relational databases, since they provided more
general data structures.
Used in specialized applications: engineering design,
multimedia publishing, and manufacturing systems.
Interchanging Data on the Web for E-Commerce Using XML:
In the 1990s, electronic commerce (e-commerce) emerged as
a major application on the Web.
A variety of techniques were developed to allow the
interchange of data on the Web.
Currently, eXtensible Markup Language (XML) is considered
to be the primary standard for interchanging data among
various types of databases and Web pages.
Extending database capabilities for new applications
1. Extensions to better support specialized requirements for
applications
2. Enterprise resource planning (ERP)
3. Customer relationship management (CRM)
Databases versus information retrieval
Information retrieval (IR)
Deals with books, manuscripts, and various forms of library-based
articles.
When not to use a DBMS :
a) Main costs of using a DBMS:
i) High initial investment and possible need for additional
hardware.
ii) Overhead for providing generality, security, concurrency control,
recovery, and integrity functions.
b) When a DBMS may be unnecessary:
i) If the database and applications are simple, well defined and not
expected to change.
ii) If there are stringent real-time requirements that may not be met
because of DBMS overhead.
iii)If access to data by multiple users is not required.
c) When no DBMS may be sufficient:
i) If the database system is not able to handle the complexity of data
because of modeling limitations .
ii) If the database users need special operations not supported by
the DBMS.
1. Database Administrators (DBA):
The DBA is responsible for authorizing access to the
database, for Coordinating and monitoring its use and for
acquiring software and hardware resources as needed.
These are the people, who maintain and design the
database daily.
DBA is responsible for the following Issues:
o Design of the conceptual and physical schemas:
• The DBA is responsible for interacting with the users
of the system to understand what data is to be stored in
the DBMS and how it is likely to be used.
• The DBA creates the original schema by writing a set
of definitions, which is permanently stored in the data
dictionary.
Security and Authorization:
The DBA is responsible for ensuring that unauthorized data
access is not permitted.
The granting of different types of authorization allows the
DBA to regulate which parts of the database various users can
access.
Storage structure and Access method definition:
The DBA creates appropriate storage structures and access
methods by writing a set of definitions, which are translated
by the DDL compiler.
Data Availability and Recovery from Failures:
The DBA must take steps to ensure that if the system fails,
users can continue to access as much of the uncorrupted data
as possible. The DBA also works to restore the data to a
consistent state.
Database Tuning:
The DBA is responsible for modifying the database to ensure
adequate Performance as requirements change.
Integrity Constraint Specification:
The integrity constraints are specified by the DBA and kept in a
special system structure that is consulted by the DBMS whenever
an update takes place in the system.
DATABASE SYSTEM CONCEPTS AND
ARCHITECTURE
CHAPTER-2
DATA MODEL:
Data Model can be defined as an integrated Collection of concepts
for describing and manipulating data, relationship between data, and
constraints on the data in an organization.
It defines the data elements and the relationships between the data
elements.
Data Models are used to show how data is stored, connected,
accessed and updated in the database management system.
Although many data models are in use nowadays, the
relational model is the most widely used.
Categories of Data Models.
High-level or conceptual data models :
high level or conceptual model is the User level data model .
This provides concepts that are close to the way that many users
perceive data.
Conceptual data models use concepts such as Entities, Attributes and
Relationship
Low level-Physical data models:
Provide concepts that describe the details of how data is stored in the
computer.
Low level data models are meant for computer specialists, not for
end users.
Representation data model:
It lies between the high level and low level data models and provides
concepts that may be understood by end users but that are not too far
removed from the way data is organized within the computer.
Types of Data Models :
1. Hierarchical Data Model
Each child node has a parent node but a parent node can have more than
one child node.
Multiple parents are not allowed.
Pointers are used to link the parent node with the child node and are
used to navigate between the stored data.
Advantages:
Any change in the parent node is automatically reflected in the child
node so, the integrity of data is maintained
Simplicity, Security, Efficiency
Disadvantages:
Complex relationships are not supported.
Operation Anomalies
Network Model :
This model is an extension of the hierarchical model.
This model is the same as the hierarchical model, the only difference
is that a record can have more than one parent.
It replaces the hierarchical tree with a graph.
In the network model there can be more than one path to reach a
particular node.
Ease of data access: data access is easier than in the hierarchical model.
So, the tables are also called relations in the relational model.
A row contains all the information about any instance of the object.
Attribute or field: Attributes are the properties that define the table
or relation.
In the above example, we have different attributes of the employee like
Salary, Mobile no, etc.
Advantages of Relational Model :
Simple: This model is simpler than the network and hierarchical
models.
Structural Independence: In the relational model, changes in the structure do
not affect the data access
Disadvantages of Relational Model :
Hardware Overheads: For hiding the complexities and making things easier for
the user this model requires more powerful hardware computers and data storage
devices.
Too many rules make database non-user-friendly.
Object-oriented Data Model :
The real-world problems are more closely represented through the
object-oriented data model.
In this model, both the data and relationship are present in a single
structure known as an object.
We can store audio, video, images, etc. in the database, which is
difficult to do in the basic relational model.
In this model, two or more objects are connected through links.
The real world entities and situations are represented as objects in the
Object oriented database model.
Attributes and Method
A new class can be derived from the original class. The derived class
contains attributes and methods of the original class as well as its own.
Object-Relational Models :
An Object relational model is a combination of an Object oriented
database model and a Relational database model.
So, it supports objects, classes, inheritance etc. just like Object
Oriented models and has support for data types, tabular structures
etc. like Relational data model.
One of the major goals of Object relational data model is to close the
gap between relational databases and the object oriented practices
frequently used in many programming languages such as C++, C#,
Java etc.
The advantages of the Object Relational model are −
Inheritance
The Object Relational data model allows its users to inherit objects,
tables etc.
Complex Data Types
Complex data types can be formed using existing data types
Extensibility
The functionality of the system can be extended in Object relational
data model.
Disadvantages of Object Relational model
The object relational data model can get quite complicated and
difficult to handle at times as it is a combination of the Object
oriented data model and Relational data model and utilizes the
functionalities of both of them.
Schema and Instances:
Database Schema: A database schema represents the logical view of
the entire database.
It defines how the data is organized and how the relations among
them are associated.
Schemas provide a logical classification of objects in the database.
A schema can contain tables, views, triggers, functions, packages,
and other objects.
A database schema defines its entities and the relationship among
them.
It contains a descriptive detail of the database, which can be depicted
by means of schema diagrams.
It’s the database designers who design the schema to help
programmers understand the database and make it useful.
Schema Diagram
From the above schema diagram student and Grade report are related
and course and prerequisite and section are related.
DBMS Instance :
The data stored in database at a particular moment of time is called
instance of database or snapshot or database state
Database State: Refers to the content of a database at a moment in
time.
Empty State : At this point, the corresponding database state is the
empty state with no data.
Initial State: Refers to the database state when the database is first
populated or loaded with initial data.
Valid State: A state that satisfies the structure and constraints specified
in the Schema.
The schema is called the intension of the database, and a database
state is called an extension of the schema.
SIMPLIFIED DATABASE SYSTEM :
A database system is a computer-based system to record and maintain
information. A Database Management System consists of a collection of
inter-related data and a set of programs to access those data.
The database and the DBMS software together form a database system.
1. User/Programmers
2. Applications programs/Queries
3. Software to process Queries/programs
4. Software to Access stored data
5. DBMS Catalog Contains the Stored database definition (Metadata)
6. The Physical Stored database
A SIMPLIFIED DATABASE SYSTEM ENVIRONMENT
A DBMS Catalog stores the description of the database. The description is
called Meta-data. This allows the DBMS software to work with different
databases
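As a small illustration of a catalog holding metadata, the sketch below uses Python's built-in sqlite3 module. SQLite is chosen only because it ships with Python; the student table is a made-up example. In SQLite the catalog is exposed as the sqlite_master table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's catalog table: it holds the stored definition
# (metadata) of every table, not the student rows themselves.
catalog_rows = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

for kind, name, definition in catalog_rows:
    print(kind, name, definition)
```

Because the definition lives in the catalog rather than in the application, the same DBMS software can open a different database and discover its structure in the same way.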
The collection of data, usually referred to as the database, contains
information about one particular enterprise.
The primary goal of DBMS is to provide an environment that is both
convenient and efficient to use in retrieving and storing database information
This system involves the control of how databases are Created, interrogated
and maintained to provide information needed by end users and the
organization.
DBMS acts as the interface between the application programs and database.
There are many different types of DBMS, ranging from small systems that run
on personal computers to huge systems that run on mainframes.
Ex: Computerized library system
DBMS Architecture :
Every database system logically organizes data with respect to some
model, called the data model. A data model describes how the various
pieces of data in the database are logically related to each other.
The data model represents the relationships between entities. The
database model is also known as the database architecture.
The structure of a DBMS may be analyzed in two separate architectures
Front end
Back end:
The back end is responsible for managing the physical database and providing
necessary support and mappings for the internal, conceptual levels.
Other benefits of DBMS, such as Security, integrity and access control are also
responsibility of the back end
Front end:
The front end is really an application that runs on top of the DBMS. These
may be applications provided by the DBMS vendor, the user, or a third party.
The user interacts with the front end and may not even be aware that the back
end exists.
DATA – INDEPENDENCE
Data Independence is defined as a property of a DBMS that lets you
change the database schema at one level of a database system without
having to change the schema at the next higher level.
Data Independence is one of the main advantages of DBMS
Example:
Using new storage devices like Hard drive or Magnetic tapes
These commands are used to update the database schema that's why they
come under Data definition language.
2. Data Manipulation Language(DML)
DML stands for Data Manipulation Language.
It also has rollback parameters. (But in Oracle database, the execution of data
control language does not have the feature of rolling back.)
Ex: an employee in admin department has nothing to do with records related
to finance of the organisation. So, it would be wise to restrict the employee to
use only the required tables and data. For this purpose, database provides a set
of commands.
Here are some tasks that come under DCL:
Grant: It is used to give user access privileges to a database. Grant to allow
specified users to perform specified tasks.
Revoke: It is used to take back permissions from the user.
There are the following operations which have the authorization of Revoke:
CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT
4.Transaction Control Language (TCL)
TCL is used to run the changes made by the DML statement.
Forms-based interfaces
GUI’s
Improved performance
Cost, Security
The intermediate server accepts requests from the client, processes
each request and sends the database commands to the database server, then
passes the data from the database server to the client, where it may be
processed further and filtered.
The three tiers are: user interface, application rules, and data access
Three-Tier Client Server Architecture
Classification of DBMS
The DBMS can be classified into different categories on the basis of
several criteria such as the data model they are using, number of users
they support, number of sites over which the database is distributed and
purpose they serve.
1.Data Model Classification
Relational data model
Hierarchical data model
Network data model
Object-Oriented data model
Object-Relational data model
2.Number of Users
Single User System – In a single user system the database resides on one
computer and is accessed by only one user at a time.
Multiuser System – Multiple users can access the database simultaneously. In
a multiuser DBMS, the data is both integrated and shared.
3. Number of sites
Centralized – data is stored in single site
5. Purpose
General purpose
Special Purpose
CHAPTER-3
(UNIT 2)
High Level Conceptual Data Models for Database Design
A data model is a collection of concepts for
describing the data in a database,
while a schema is a description of a particular
collection of data, using a given data model.
The relationship between data model, schema,
and phases of design is as follows:
The process of database design is divided into different
parts. It includes the following:
Requirement Collection and Analysis
Conceptual Design
Logical Design
Physical Design
Phases of Database Design
Requirement Collection and Analysis :
In this phase a detailed analysis of the requirement is done.
Types of Attributes
There are different types of Attributes:
1.Simple Attribute 2. Composite Attribute
3.Single valued Attribute 4. Multi valued Attribute
5.Stored Attribute 6. Derived Attribute
7.Complex Attribute 8. Null value Attribute
1.Simple Attribute (Atomic attribute): A simple attribute consists
of a single atomic value. A simple attribute cannot be subdivided.
Ex: the attributes age, sex, etc. are simple attributes.
2.Composite Attribute: A composite attribute is an attribute that can
be further subdivided; its value is not atomic.
Ex: The attribute ADDRESS can be subdivided into street, city,
state, and pin code.
3.Single Valued Attributes: A single valued attribute can have only a
single value. For Ex: a person has only one ‘date of birth’ and one ‘age’.
A single valued attribute can be simple or composite:
‘date of birth’ is composite (day, month, year), while ‘age’ is a simple
attribute.
4. Multi-valued Attribute: Multi-valued attributes can have multiple
values. For Ex: a person may have multiple phone numbers,
multiple degrees etc. Multi-valued attributes are shown by a double
oval in the ER diagram.
5. Stored Attribute: The values of certain attributes cannot be derived
from other attributes. Such attributes are said to be stored attributes.
Age and date of birth are related attributes: given the date of birth
and the current date (today), the age can be determined.
Ex: Date of Birth.
6. Derived Attribute: An attribute whose value is derived from a stored
attribute is called a derived attribute. The value of the attribute ‘AGE’
can be derived by subtracting the DOB from the current date.
7. Complex Attribute: A complex attribute is one that is both composite
and multi-valued.
Ex: a person can have more than one residence, and each residence can
have multiple phones.
Here, phone number and email are examples of multi-valued
attributes and address is an example of a composite attribute,
because it can be divided into house number, street, city, and state.
8. Null Value Attribute: A particular entity may not have an applicable
value for an attribute.
For Ex: The apartment number attribute of an address applies only to
the addresses that are in apartment buildings and not to other types of
residences, such as single-family homes
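The relationship between a stored attribute (date of birth) and a derived attribute (age) described above can be sketched as follows. The function name age_on and the sample dates are chosen only for this illustration; the sketch assumes age means completed years.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive AGE (in completed years) from the stored DOB and a reference date."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age_on(date(2000, 6, 15), date(2024, 6, 14)))  # 23 (day before the birthday)
print(age_on(date(2000, 6, 15), date(2024, 6, 15)))  # 24 (on the birthday)
```

Only the date of birth needs to be stored; the age is recomputed whenever it is needed, so it never goes out of date.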
Entity Type
An entity type is a collection of entities that have
the same attributes.
First_Name = Sam
Middle_Name = Daniel
Last_Name = Mccormick
Manager  Manages  Employees
Relationship Instances
A relationship instance is the instance that
associates an entity from an entity type to
another entity of another entity type , in
order to establish a relationship among
various participating entity types.
Ex: if you have an entity set Employees and another entity
set Departments, you might define a relationship set
Works_In which associates members of those two entity
sets.
Relationship Set
A Relationship set is a set of relationships of the same type.
EMPLOYEE  Works_On  PROJECTS
Role Names and Recursive Relationship :
Role Name:
Each entity type in a relationship plays a particular role.
The role name specifies the role that a participating entity type plays in
the relationship.
Ex : In the relationship between Employee and Department, the Employee
entity type plays the employee role and the Department entity type plays
the department(or) employer role.
Recursive Relationship :
A recursive relationship is one in which an entity type is associated with
itself. Ex: an employee may manage many employees, and each employee is
managed by one employee.
Cardinality (number of rows) = 4
Degree (number of columns) = 3
The characteristic of a relation are as
follows:
Ordering of tuples in a relation r(R)
⚫ Relation is defined as a set of tuples (even
though they appear to be in the tabular form).
⚫ Elements have no order among them.
No duplicate tuples
⚫ A relation cannot contain two or more tuples
which have the same values for all the
attributes. i.e., In any relation, every row is
unique.
⚫ There is only one value for each attribute of a
tuple.
⚫ Values in a tuple: All values are considered
atomic (indivisible).
A special null value is used to represent values
that are unknown.
Relational Integrity Constraints
Integrity Constraints :
⚫ Integrity constraints are a set of rules. It is
used to maintain the quality of information.
⚫ Integrity constraints ensure that the data
insertion, updating, and other processes have
to be performed in such a way that data
integrity is not affected.
Categories of Integrity Constraints
1. Inherent Model Based Constraints
2. Schema based Constraints
3. Application based constraints
4. Data Dependencies Constraints
1. Inherent Model Based Constraints :
⚫ Constraints that are inherent in the data
model are called as Inherent Model Based
Constraints.
Ex: The constraint that a relation cannot have
duplicate tuples is an inherent constraint.
2. Schema based Constraints :
⚫ Constraints that can be directly expressed in
the schemas of the data model, using DDL
(Data Definition Language).
⚫ Schema-based constraints include five
constraints.
A. Domain constraints
B. Entity integrity constraints
C. Referential Integrity Constraints
D. Key constraints
E. Constraints on NULLs
A. Domain constraints :
⚫ Domain constraints can be defined as the
definition of a valid set of values for an
attribute.
⚫ The data type of domain includes string,
character, integer, time, date, currency, etc.
The value of the attribute must be available
in the corresponding domain.
⚫ Every domain must contain atomic
values(smallest indivisible units) which
means composite and multi-valued
attributes are not allowed.
⚫ Domain constraints specify that within each
tuple, the value of each attribute must be an
atomic value from the domain of that attribute.
This is specified through data types.
B. Entity integrity constraints :
⚫ The entity integrity constraint states that
primary key value can't be null.
⚫ This is because the primary key value is used
to identify individual rows in relation and if
the primary key has a null value, then we can't
identify those rows.
⚫ A table can contain a null value other than the
primary key field.
C. Referential Integrity Constraints :
⚫ A referential integrity constraint is specified
between two tables.
⚫ In the Referential integrity constraints, if a
foreign key in Table 1 refers to the Primary
Key of Table 2, then every value of the
Foreign Key in Table 1 must be null or be
available in Table 2.
D. Key constraints :
⚫ A key is an attribute (or set of attributes) that is
used to identify an entity uniquely within its entity set.
⚫ An entity set can have multiple keys, out of
which one key is chosen as the primary key. A
primary key value must be unique and not null
in the relational table.
E. Constraints on NULLs :
⚫ We can specify whether an attribute can have
NULL or not using this constraint.
⚫ For example, if we do not want a NULL value for
the student’s name, then we can specify
this constraint using NOT NULL.
3. Application based constraints :
⚫ Constraints that cannot be directly applied in the
data model's schemas. These are known as
application-based or semantic constraints.
⚫ Expressed and enforced by application program.
⚫ Ex: Student can not have a phone number more
than 10 digits.
4. Data Dependencies Constraints :
⚫ It includes functional dependencies and multi
valued dependencies they are used mainly for
Integrity Constraint over relations.
⚫ Integrity constraints as a whole are used to guard
against accidental damage to the database.
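The difference between schema-based and application-based constraints can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not a prescribed design: the department and student tables, their column names and the add_student helper are made up for the example. Domain, key, NOT NULL and referential constraints are declared in the DDL, while the 10-digit phone rule from the notes is enforced by the application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Schema-based constraints: declared once in the DDL, enforced by the DBMS.
conn.executescript("""
CREATE TABLE department (
    dno   INT  NOT NULL PRIMARY KEY,           -- key + entity integrity
    dname TEXT NOT NULL UNIQUE                 -- a second (candidate) key
);
CREATE TABLE student (
    rollno INT  NOT NULL PRIMARY KEY,          -- primary key, never NULL
    name   TEXT NOT NULL,                      -- constraint on NULLs
    phone  TEXT,                               -- domain: character string
    dno    INT  REFERENCES department(dno)     -- referential integrity
);
""")


def add_student(rollno, name, phone, dno):
    """Application-based constraint: reject phone numbers longer than 10 digits."""
    if not phone.isdigit() or len(phone) > 10:
        raise ValueError("phone must contain at most 10 digits")
    conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", (rollno, name, phone, dno))


conn.execute("INSERT INTO department VALUES (10, 'CS')")
add_student(1, "Somu", "9710185288", 10)            # accepted
try:
    add_student(2, "Ravi", "97101852881234", 10)    # 14 digits
except ValueError as err:
    print("rejected by the application:", err)

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(student_count)
```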
Relational Databases and Relational Database
Schemas
Relational Databases:
Relational Database is a database system that
stores and retrieves data in a tabular format
organized in the form of rows and columns.
Therefore a relational database is a collection of
relations with distinct relation names.
Relational Database Schemas:
A relational database schema is the collection of
schemas for the relations in the database
A database state that does not obey all the integrity
constraints is called an Invalid state
A state that satisfies all the integrity constraints is
called a Valid state.
Operations on Relations
The operations of the relational model can be
categorized into:
Retrieval
Update
⚫ Retrieval operations on Relations:
Retrieval operations are performed on the relation
to extract required information from relational
database.
SELECT operation is one of the example for
retrieval operation.
⚫ Update operations on Relations:
There are Three basic update operations on
relations:
Insert: Insert is used to insert new tuple or tuples in
a relation.
The Insert operation:
The SQL INSERT statement is used to insert one
or more rows into a table. Insert can violate any of
the four types of constraints, as described below.
⚫ This insertion satisfies all constraints, so it is
acceptable.
EX: insert<5010, “Ashwini”,10, 9710185288, 40,000>
into employee
⚫ Domain constraint can be violated if an attribute
value is given that does not appear in the
corresponding domain.
EX: insert<5010, “Ashwini”,10, 9710185288, 40,000,
“Java project”> into Employee
This insertion is not possible because it violates
the domain constraint: the value “Java project”
does not correspond to any attribute in the original relation.
⚫ Key constraint can be violated if a key value in the
new tuple already exists in another tuple in the
relation.
EX: insert<5010, “Vidya”,10, 9720685288, 45,000>
into Employee
This insertion violates the key constraint because
another tuple with the same Eno value already
exists in the EMPLOYEE relation, so it is rejected.
⚫ Entity integrity can be violated if the primary key
of the new tuple is null.
Ex: insert< “Anuradha”,20, 9758998743> into
Employee
This insertion violates the entity integrity constraint
(null for the primary key Eno), so it is rejected.
In this case, the DBMS could ask the user to
provide a value for Eno and could accept the
insertion once a valid value is given.
⚫ Referential integrity can be violated if the value
of any foreign key in the new tuple refers to a tuple
that does not exist in the referenced relation.
EX: insert<5040, “Vasudha”,40, 9448050836,
30,000> into Employee
This insertion operation violates the referential
integrity constraint specified on Dno because the
Dno = 40 of EMPLOYEE does not exist in the
DEPARTMENT relation at all
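The four kinds of insert violation above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under assumptions: the column names Ename, Phone and Salary, the departments 10, 20 and 30, and the "Kiran" row with a non-numeric salary are hypothetical and chosen only for the illustration. The CHECK on Salary stands in for a domain constraint because SQLite is loosely typed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are off by default in SQLite
conn.executescript("""
CREATE TABLE Department (Dno INT NOT NULL PRIMARY KEY);
CREATE TABLE Employee (
    Eno    INT NOT NULL PRIMARY KEY,
    Ename  TEXT NOT NULL,
    Dno    INT REFERENCES Department(Dno),
    Phone  INT,
    Salary INT CHECK (Salary IS NULL OR typeof(Salary) = 'integer')
);
INSERT INTO Department VALUES (10), (20), (30);
""")


def try_insert(row):
    """Attempt the insertion and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"


results = [
    try_insert((5010, "Ashwini", 10, 9710185288, 40000)),  # satisfies all constraints
    try_insert((5011, "Kiran", 10, 9710185289, "high")),   # domain: salary not a number
    try_insert((5010, "Vidya", 10, 9720685288, 45000)),    # key: Eno 5010 already exists
    try_insert((None, "Anuradha", 20, 9758998743, None)),  # entity integrity: NULL Eno
    try_insert((5040, "Vasudha", 40, 9448050836, 30000)),  # referential: no Dno 40
]
for outcome in results:
    print(outcome)
```

Only the first insertion is accepted; each of the other four is rejected by the DBMS itself, without any checking code in the application.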
The Delete operation:
The Delete operation is used to delete tuples
from the relation. The Delete operation can
violate only referential integrity, if the tuple
being deleted is referenced by the foreign keys
from other tuples in the database.
To Specify deletion, a condition on the attributes
of the relation selects the tuple(or tuples) to be
deleted.
EX: 1. Delete the EMPLOYEE tuple with Eno =
5030 and Dno = 30
This deletion is acceptable.
2. Delete the DEPARTMENT tuple with Dno
= 20
This Deletion is not acceptable, because tuples in
Several options are available if a deletion operation
causes a violation. One of the options below must
be specified during database design for each foreign
key constraint.
1. The first option is to reject the deletion.
2. The second option is to cascade (or
propagate) the deletion by deleting the tuples that
reference the tuple that is being deleted.
3. The third option is to modify the referencing
attribute values that cause the violation; each such
value is either set to null or changed to reference
another valid tuple.
(The ON DELETE CASCADE constraint is used in
MySQL to delete the rows from the child table
automatically, when the rows from the parent
table are deleted.)
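The three deletion options can be compared side by side with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite is used instead of MySQL only because it ships with Python, and the employee 5020 in department 20 is a hypothetical row added so that the deleted department is actually referenced.

```python
import sqlite3


def build(on_delete: str) -> sqlite3.Connection:
    """Create Department/Employee with the given ON DELETE action on Employee.Dno."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
    CREATE TABLE Department (Dno INT NOT NULL PRIMARY KEY);
    CREATE TABLE Employee (
        Eno INT NOT NULL PRIMARY KEY,
        Dno INT REFERENCES Department(Dno) ON DELETE {on_delete}
    );
    INSERT INTO Department VALUES (20), (30);
    INSERT INTO Employee VALUES (5020, 20), (5030, 30);
    """)
    return conn


def delete_department_20(on_delete: str):
    """Delete department 20 and return the remaining employees, or 'rejected'."""
    conn = build(on_delete)
    try:
        conn.execute("DELETE FROM Department WHERE Dno = 20")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT Eno, Dno FROM Employee ORDER BY Eno").fetchall()


print(delete_department_20("RESTRICT"))  # option 1: the deletion is rejected
print(delete_department_20("CASCADE"))   # option 2: employee 5020 is deleted too
print(delete_department_20("SET NULL"))  # option 3: employee 5020 keeps a NULL Dno
```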
The Modify Operation:
The Update (or modify) operation is used to change
the values of one or more attributes in a tuple or
tuples of some relation R.
It is necessary to specify a condition on the attributes
of the relation to select the tuple (or tuples) to be
modified.
⚫ Updating an attribute that is neither a primary key
nor a foreign key usually causes no problems. The
DBMS need only check to confirm that the new
value is of the correct data type and domain.
⚫ Modifying a primary key value is similar to deleting
one tuple and inserting another in its place, because
we use the primary key to identify tuples.
⚫ If a foreign key attribute is modified, the DBMS
must make sure that the new value refers to an
existing tuple in the referenced relation.
EX: 1. Update the SALARY of the EMPLOYEE tuple
2. Update the DNO of the EMPLOYEE tuple with
Eno = 5030 to 10
Acceptable
3. Update the DNO of the EMPLOYEE tuple with
Eno = 5030 to 40
Unacceptable, because it violates referential
integrity, i.e., Dno = 40 does not exist in
the DEPARTMENT table.
4. Update the Eno of the EMPLOYEE tuple with
Eno = 5030 to 5010
Unacceptable, because it violates primary key
and referential integrity constraints, i.e., the
Eno 5010 already exists, and we cannot have two
identical values for a primary key.
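The update examples can be replayed with a short Python/sqlite3 sketch. It assumes a cut-down EMPLOYEE(Eno, Dno) and DEPARTMENT(Dno) schema with made-up rows, and SQLite in place of the DBMS; the helper try_update is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to check foreign keys
conn.execute("CREATE TABLE DEPARTMENT (Dno INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " Eno INTEGER PRIMARY KEY,"
    " Dno INTEGER REFERENCES DEPARTMENT(Dno))"
)
# Hypothetical sample data: departments 10, 20, 30 exist; department 40 does not.
conn.executemany("INSERT INTO DEPARTMENT VALUES (?)", [(10,), (20,), (30,)])
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)", [(5010, 10), (5030, 30)])
conn.commit()

def try_update(sql):
    """Run one UPDATE and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        conn.commit()
        return "acceptable"
    except sqlite3.IntegrityError as err:
        conn.rollback()
        return "unacceptable: " + str(err)

results = [
    try_update("UPDATE EMPLOYEE SET Dno = 10 WHERE Eno = 5030"),    # existing department
    try_update("UPDATE EMPLOYEE SET Dno = 40 WHERE Eno = 5030"),    # no such department
    try_update("UPDATE EMPLOYEE SET Eno = 5010 WHERE Eno = 5030"),  # duplicate primary key
]
final_rows = conn.execute("SELECT Eno, Dno FROM EMPLOYEE ORDER BY Eno").fetchall()
```

Only the first update changes the table; the other two are refused by the foreign key check and the primary key check respectively.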
Relational Algebra
The relational algebra is a procedural query
language. It consists of a set of operations that
take one or two relations as input and produce a
new relation as their result.
The relational algebra is often considered to be an
integral part of the relational data model, and its
operations can be divided into two groups.
⚫ One group includes set operations from
mathematical set theory. Set operations include:
1. Union 2. Intersection
3. Set Difference 4.Cartesian Product
⚫ The other group consists of operations
developed specifically for relational databases;
these include Select, Project, and Join, among others.
Union (U)
The relations P and Q are said to be union compatible
if both P and Q have the same degree ‘n’ and the domains of
the corresponding ‘n’ attributes are identical.
Simply put, the union operation combines two sets
of rows into a single set composed of all the rows in
either or both of the two original sets.
Consider the relations P and Q. R is the computed
result relation.
R = P U Q
The result relation R contains tuples that are in
either P or Q or in both of them. Duplicate
tuples are eliminated.
Intersection (^)
The result of this operation, denoted by P ^ Q, is a relation
that includes all tuples that are in both P and Q.
R = P ^ Q
The intersection of two tables results in a third table
containing all the tuples that are in both relations.
Difference (Minus)(-)
The difference of two tables is a third table
containing all the rows, which are in the first
table but not in the second.
The result of this operation, denoted by P – Q,
is a relation that includes all tuples that are in P
but not in Q
R=P–Q
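As a rough illustration, the three set operations above map directly onto Python sets of tuples. The relations P and Q below hold made-up rows over the same two attributes (so they are union compatible); nothing here is tied to a particular DBMS.

```python
# Each relation is a set of tuples; a set cannot hold duplicates,
# which matches the rule that duplicate tuples are eliminated.
P = {(1, "Ann"), (2, "Bob"), (3, "Cy")}
Q = {(2, "Bob"), (4, "Dee")}

union = P | Q          # tuples in P or Q or both
intersection = P & Q   # tuples in both P and Q
difference = P - Q     # tuples in P but not in Q
```

Note that the difference is not symmetric: Q - P would give the tuples that are in Q but not in P.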
Cartesian Product (*)
The Cartesian product of two relations is the
concatenation of tuples belonging to the two
relations. A new resultant relation scheme is
created consisting of all possible combinations
of the tuples.
If R = P * Q, then each tuple r belonging to R
is obtained by concatenating
a tuple in relation P with a tuple in
relation Q, for every such pair of tuples.
⚫ The scheme of the result relation is given by R
= P || Q
⚫ The degree of the result relation is given by |R| =
|P| + |Q|
⚫ The cardinality of the result relation is the number of
tuples in P multiplied by the number of tuples in Q.
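A small Python sketch of the Cartesian product, using made-up relations P (2 attributes, 2 tuples) and Q (2 attributes, 3 tuples). It shows that the degrees add while the numbers of tuples multiply.

```python
from itertools import product

P = [(1, "Ann"), (2, "Bob")]                   # degree 2, 2 tuples
Q = [("Sales", 10), ("HR", 20), ("IT", 30)]    # degree 2, 3 tuples

# Concatenate every tuple of P with every tuple of Q.
R = [p + q for p, q in product(P, Q)]

degree = len(R[0])     # number of attributes in each result tuple
cardinality = len(R)   # number of tuples in the result
```

Here each result tuple has 2 + 2 = 4 attributes and the result holds 2 * 3 = 6 tuples.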
Unary Relational Operations
1. Projection (∏)
⚫ The projection operation is used either to
reduce the number of attributes in the resultant
relation or to reorder attributes.
⚫ The PROJECT operation selects certain columns
from the table and discards the other columns.
⚫ If the user is interested in only certain
attributes of a relation, then the PROJECT
operation is used to project the relation over
these attributes only.
⚫ The result of the PROJECT operation can hence
be visualized as a vertical partition of the
relation into two relations.
1. One has the needed columns (attributes) and
contains the result of the operation.
2. The other contains the discarded columns.
⚫ In general, the project operation is denoted by
Syntax: ∏<attribute list>(R)
Where, ∏ (pi) represents the PROJECT
operation
<attribute list> is the desired list of
attributes from the attributes of the relation R
Ex: Consider the relation S1
1. For example, to list each employee’s age, we can use
the PROJECT operation as follows:
∏age (S1)
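A minimal Python sketch of the PROJECT operation follows. The rows of S1 are not reproduced in these notes, so the rows and attribute names below are invented sample data, and project is a hypothetical helper. Because the result is a relation, duplicate tuples are dropped.

```python
def project(relation, attributes):
    """Keep only the listed attributes, in the listed order, without duplicates."""
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in result:   # a relation holds no duplicate tuples
            result.append(projected)
    return result

# Hypothetical contents of S1.
S1 = [
    {"name": "Asha", "age": 21, "dept": "CS"},
    {"name": "Ravi", "age": 22, "dept": "EC"},
    {"name": "Mona", "age": 21, "dept": "CS"},
]

ages = project(S1, ["age"])               # corresponds to ∏age (S1)
dept_name = project(S1, ["dept", "name"]) # also reorders the attributes
```

Two employees share the age 21, so the projection on age returns only two tuples.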
Advantages of Normalization
• Avoids redundancy (the same data stored many times in the same or different
tables).
• Removes update anomalies, so there is no loss of data (or)
inefficient data update process.
• Gives a well-organized database in which all the tables are
inter-related, maintaining integrity and consistency of data.
• All data are stored efficiently since there is no redundancy.
• The entire database system remains consistent over time as the
Disadvantages of Normalization
• Maintaining more tables is a bit tough.
• Nested queries over multiple tables get tricky.
CONSTRAINTS
• SQL constraints are used to specify rules for data in a table.
• Constraints can be specified when the table is created with the CREATE
TABLE statement, or after the table is created with the ALTER TABLE
statement.
Constraint               Meaning
NOT NULL Constraint      Ensures that a column cannot have a NULL value
DEFAULT Constraint       Provides a default value for a column when none is specified while inserting records
UNIQUE Constraint        Ensures that all values in a column are different
CHECK Constraint         Makes sure that all values in a column satisfy certain criteria
Primary key Constraint   Used to uniquely identify a row in the table
Even though a value for the “SCORE” column in the INSERT INTO
3. UNIQUE CONSTRAINT
The UNIQUE constraint ensures that all values in a column
are distinct. For example, in the following CREATE TABLE
statement, column “ROLLNO” has a unique constraint, and
hence cannot include duplicate values.
CREATE TABLE STUDENT
(
ROLLNO NUMBER(10) UNIQUE,
FIRST_NAME VARCHAR2(30),
LAST_NAME VARCHAR2(30),
SCORE NUMBER(3) DEFAULT 80
);
Assume that the table contains 3 records as shown below
ROLLNO   FIRST_NAME   LAST_NAME   SCORE
1001     SNIGDHA      SRIKANTH    80
1002     SURABHI      KUMARI      80
1003     DHANYA       RAMESH      80
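The behaviour of the UNIQUE and DEFAULT constraints on this table can be checked with a small Python/sqlite3 sketch. SQLite is used here only as a convenient stand-in: it accepts the Oracle-style type names NUMBER and VARCHAR2 without enforcing their sizes. The fourth student (ANITA RAO) is made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT
    (
        ROLLNO NUMBER(10) UNIQUE,
        FIRST_NAME VARCHAR2(30),
        LAST_NAME VARCHAR2(30),
        SCORE NUMBER(3) DEFAULT 80
    )
""")

# SCORE is omitted, so the DEFAULT value 80 is stored for every row.
conn.executemany(
    "INSERT INTO STUDENT (ROLLNO, FIRST_NAME, LAST_NAME) VALUES (?, ?, ?)",
    [(1001, "SNIGDHA", "SRIKANTH"),
     (1002, "SURABHI", "KUMARI"),
     (1003, "DHANYA", "RAMESH")],
)

# A second row with ROLLNO 1001 violates the UNIQUE constraint.
try:
    conn.execute(
        "INSERT INTO STUDENT (ROLLNO, FIRST_NAME, LAST_NAME) "
        "VALUES (1001, 'ANITA', 'RAO')"
    )
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

rows = conn.execute("SELECT ROLLNO, SCORE FROM STUDENT ORDER BY ROLLNO").fetchall()
```

The duplicate insert is refused, and the three original rows all carry the default score.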
Query :
SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
INNER JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;
2. LEFT JOIN
The SQL left join returns all the rows from the left table and the matching
rows from the right table. If there is no matching row in the right table,
the right-table columns are returned as NULL.
Syntax: SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
• Query:
SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
LEFT JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;
3. RIGHT JOIN
• In SQL, RIGHT JOIN returns all the rows of the right table and the
matched rows from the left table. Where there is no match in the left
table, the left-table columns are returned as NULL.
Syntax: SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;
Query :
SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
RIGHT JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;
4. FULL JOIN
In SQL, FULL JOIN combines the results of both the left and right
outer joins. The joined table has all the records from both tables, with
NULL in place of the values for which no match is found.
Syntax: SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;
Query:
SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
FULL JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;
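The four joins can be compared side by side with a Python/sqlite3 sketch. The contents of EMPLOYEE and PROJECT are not given in these notes, so the rows below (and the PROJECT_NO column) are assumptions made for the illustration. Older SQLite versions lack RIGHT JOIN and FULL JOIN, so the sketch emulates them: a right join is a left join with the tables swapped, and a full join is the UNION of the two (which would also merge genuinely duplicated rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER, EMP_NAME TEXT)")
conn.execute("CREATE TABLE PROJECT (PROJECT_NO INTEGER, EMP_ID INTEGER, DEPARTMENT TEXT)")
# Hypothetical data: Christian has no project; project 103 has no matching employee.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [(1, "Angelina"), (2, "Robert"), (3, "Christian")])
conn.executemany("INSERT INTO PROJECT VALUES (?, ?, ?)",
                 [(101, 1, "Testing"), (102, 2, "Development"), (103, 9, "Design")])

COLUMNS = "SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT "
ON = " ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID"

def run(sql):
    """Return the result rows as a set so that row order does not matter."""
    return set(conn.execute(sql).fetchall())

inner = run(COLUMNS + "FROM EMPLOYEE INNER JOIN PROJECT" + ON)
left = run(COLUMNS + "FROM EMPLOYEE LEFT JOIN PROJECT" + ON)
# Emulated RIGHT JOIN: keep every PROJECT row by putting PROJECT on the left.
right = run(COLUMNS + "FROM PROJECT LEFT JOIN EMPLOYEE" + ON)
# Emulated FULL JOIN: union of the left join and the (emulated) right join.
full = run(
    COLUMNS + "FROM EMPLOYEE LEFT JOIN PROJECT" + ON
    + " UNION "
    + COLUMNS + "FROM PROJECT LEFT JOIN EMPLOYEE" + ON
)
```

Python shows SQL NULL as None, so the unmatched employee appears as ("Christian", None) in the left join and the unmatched project as (None, "Design") in the right join.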
VIEWS
• A view is a database object that is a logical representation of one or
more tables.
• A view does not contain data. Instead, a view is a virtual table,
deriving its data from base tables
• SQL views are data objects, like SQL tables that can be queried,
updated, and dropped.
• We can think of view as a stored query
Syntax: CREATE VIEW <View_Name> as SELECT Column_Name(s)
FROM <Table_Name>
WHERE <condition>
Functions of Views
• It can hide certain columns in a table.
• It allows the use of functions and the manipulation of data.
• It represents the subset of the data contained in a base table.
Ex: Creating and Accessing view
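A possible way to create and access a view, sketched in Python with sqlite3. The view name TOPPERS, the cut-down STUDENT table and the scores are invented for the illustration; SQLite is only a stand-in, and unlike some other systems it treats views as read-only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (ROLLNO INTEGER, FIRST_NAME TEXT, SCORE INTEGER)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?)",
                 [(1001, "SNIGDHA", 85), (1002, "SURABHI", 90), (1003, "DHANYA", 70)])

# The view stores only the query; it hides the SCORE column from its users.
conn.execute("""
    CREATE VIEW TOPPERS AS
    SELECT ROLLNO, FIRST_NAME
    FROM STUDENT
    WHERE SCORE >= 80
""")

# A view is queried exactly like a table.
before = conn.execute("SELECT * FROM TOPPERS ORDER BY ROLLNO").fetchall()

# Changing the base table changes what the view shows, because the
# view holds no data of its own.
conn.execute("UPDATE STUDENT SET SCORE = 95 WHERE ROLLNO = 1003")
after = conn.execute("SELECT * FROM TOPPERS ORDER BY ROLLNO").fetchall()
```

The second query returns one more row than the first, although the view itself was never modified.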
Embedded SQL
• Embedded SQL is a method of combining the computing power of a
programming language and the database manipulation capabilities of
SQL.
• Embedded SQL statements are SQL statements written inline with
the program source code of the host language.
• A popular host language is C.
• The combination of C and embedded SQL is called Pro*C in Oracle.
• The steps involved in compiling an embedded SQL program are as follows.
This illustration shows the steps necessary to compile an embedded SQL program
Five steps are involved in compiling an embedded SQL
program.
Step 1:
• The embedded SQL program is submitted to the SQL pre-compiler.
• The pre-compiler scans the program, finds the embedded SQL statements, and processes them.
• A different pre-compiler is required for different programming
languages.
Step 2:
• The pre-compiler produces two output files. The first file is the source
file, stripped of its embedded SQL statements.
• The second file is a copy of all the embedded SQL statements used in
the program. This file is sometimes called a database request module,
or DBRM.
Step 3:
• The source file output from the pre-compiler is submitted to the
standard compiler for the host programming language (such as a C
compiler), which produces the object modules.
Step 4:
• The linker accepts the object modules generated by the compiler, links
them with various library routines, and produces an executable
program.
Step 5:
• The database request module is given as input to the binding utility.
This utility examines the SQL statements, validates and optimises
them, and produces an access plan for each statement.
• The result is a combined access plan for the entire program which is
used to access the database.
Dynamic SQL
• It is a programming technique that is used to write SQL queries
during runtime.
• Here the SQL statements are not embedded in the source program;
instead they can be entered interactively during runtime.
• Dynamic SQL could be used to create flexible SQL queries.
Dynamic SQL Concepts
• In dynamic SQL, the SQL statements are not hard coded in the
programming language. The text of the SQL statement is obtained at
run time, for example from the user.
• In dynamic SQL, the SQL statements that are to be executed are not
known until runtime, so the DBMS cannot prepare for executing the
statements in advance.
• When the program is executed, the DBMS takes the text of the SQL
statement to execute; text passed in this
manner is called a statement string.
Five Steps Execution in Dynamic SQL
Dynamic Statement Execution (Execute Immediate)
The Execute Immediate statement provides the simplest form of
dynamic SQL. This statement passes the text of SQL statements to
DBMS and asks the DBMS to execute the SQL statements
immediately.
To use this statement, the program goes through the following
steps.
• The program constructs an SQL statement as a string of text in one of
its data areas (called a buffer).
• The program passes the SQL statement to the DBMS with the
EXECUTE IMMEDIATE statement.
• The DBMS executes the statement and sets the SQLCODE/SQLSTATE
values to flag the finishing status, just as if the statement
had been hard coded using static SQL.
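The same three steps can be imitated in Python with sqlite3. This is an analogy, not Pro*C: sqlite3 has no EXECUTE IMMEDIATE statement, so the hypothetical helper execute_immediate simply hands the statement string to the DBMS, and the status values 0 and -1 are invented stand-ins for SQLCODE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (ROLLNO INTEGER, SCORE INTEGER)")
conn.execute("INSERT INTO STUDENT VALUES (1001, 80)")
conn.commit()

def execute_immediate(statement_text):
    """Pass a statement string to the DBMS; return (status, rows affected)."""
    try:
        cursor = conn.execute(statement_text)
        conn.commit()
        return 0, cursor.rowcount     # 0 plays the role of a success code
    except sqlite3.Error:
        conn.rollback()
        return -1, 0                  # -1 plays the role of an error code

# Step 1: build the statement text in a buffer at run time.
new_score, rollno = 90, 1001          # values that would come from the user
buffer = "UPDATE STUDENT SET SCORE = " + str(new_score) + " WHERE ROLLNO = " + str(rollno)

# Steps 2 and 3: hand the text to the DBMS and inspect the status it reports.
status = execute_immediate(buffer)
bad_status = execute_immediate("UPDAT STUDENT SET SCORE = 0")  # misspelt keyword
score = conn.execute("SELECT SCORE FROM STUDENT WHERE ROLLNO = 1001").fetchone()[0]
```

Building statements by joining strings is shown here only to mirror the buffer idea; with real user input, parameter placeholders are the safer choice.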
Specifying Constraints as Assertions and Triggers
Specifying Constraints as Assertions
• When a constraint involves 2 (or) more tables, the table constraint
mechanism is sometimes awkward, and results may not come out as expected.
• To cover such situations, SQL supports the creation of assertions, which are
constraints not associated with only one table. An assertion
statement should ensure that a certain condition will always hold in the
database.
• Assertions differ from check constraints in that a check
constraint relates to one single row only.
• Assertions, on the other hand, can involve any number of rows in the table
or any number of other tables.
• An assertion checks a condition and returns a Boolean value.
• An assertion is a piece of SQL which makes sure a condition is satisfied;
otherwise it stops the action being taken on the database.
• An Assertion is a constraint that might be dependent upon multiple
• Domain constraints, functional dependencies and referential integrity
are special forms of assertion. These forms of assertion
depend on (involve) a single row of a table at a time.
• Any modification to a database is allowed only if it would not cause
any assertion to be violated, i.e., assertions are checked when
data-modifying actions such as UPDATE or INSERT are performed against the table.
Syntax: create assertion <constraint name>
check (<search condition>)
(<Constraint Attributes>)
EX: Consider an Employee table. We want to assert that
no employee in our database is paid more than 50,000 or less
than 25,000.
create assertion <constraint name>
check ( Not exists
( Select ID from Employee E
Where E.Salary > 50000 or E.Salary < 25000
)
);
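Many database systems do not implement CREATE ASSERTION, so the idea can only be approximated. The Python/sqlite3 sketch below runs the same NOT EXISTS condition after each change and undoes the change when the condition fails. The function names assertion_holds and guarded_insert are hypothetical, and this is an emulation rather than a real assertion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Salary INTEGER)")

def assertion_holds():
    """Evaluate the salary condition; True means no offending row exists."""
    row = conn.execute(
        "SELECT NOT EXISTS ("
        " SELECT ID FROM Employee E"
        " WHERE E.Salary > 50000 OR E.Salary < 25000)"
    ).fetchone()
    return bool(row[0])

def guarded_insert(emp_id, salary):
    """Insert a row, but keep it only if the assertion still holds afterwards."""
    conn.execute("INSERT INTO Employee VALUES (?, ?)", (emp_id, salary))
    if assertion_holds():
        conn.commit()
        return True
    conn.rollback()    # the modification would violate the assertion
    return False

accepted = guarded_insert(1, 30000)   # inside the allowed range
refused = guarded_insert(2, 60000)    # above 50,000
employee_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```

Only the first employee remains in the table, since the second insert is rolled back.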
Specifying Constraints as Triggers
• A trigger is a database object that is associated with a table and is
activated when a particular action is executed on the table.
• Triggers are sometimes called event-condition-action rules.
• Triggers are activated only when certain events occur. The usual events
are “insert”, “update”, “delete”.
• When the trigger is awakened, the trigger tests a condition. If the
condition does not hold, then nothing else associated with the trigger
happens in response to the given event.
• On the other hand, if the condition is satisfied, then a pre-defined action is
performed by the trigger.
Syntax: Trigger Creation
CREATE [OR REPLACE] TRIGGER trigger_name
[ BEFORE | AFTER ] [ INSERT | UPDATE | DELETE ]
ON table_name
[FOR EACH ROW] [WHEN condition ]
BEGIN
-----------
-----------trigger body
-----------
END;
• [ BEFORE | AFTER ] – This specifies when the trigger should be
executed: before the constraints and table update (or) after the
constraints and table update.
• [ INSERT | UPDATE | DELETE ] – It specifies what type of DML
operation should activate the trigger.
• ON Table - This specifies what table the trigger is defined on.
• ON Database: - This is used to specify that the trigger is for a system
event . ( Startup, shutdown, and server error)
• ON Schema – This is used to specify a specific schema for which a
trigger is to be fired.
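An event-condition-action rule can be tried with a Python/sqlite3 sketch. SQLite's trigger syntax differs slightly from the Oracle-style syntax above (there is no OR REPLACE, and rows are referred to as NEW and OLD), and the table Salary_Log and trigger log_raise are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Salary INTEGER)")
conn.execute("CREATE TABLE Salary_Log (ID INTEGER, Old_Salary INTEGER, New_Salary INTEGER)")

# Event: UPDATE on Employee.  Condition: the salary went up.
# Action: record the old and the new salary in Salary_Log.
conn.execute("""
    CREATE TRIGGER log_raise
    AFTER UPDATE ON Employee
    FOR EACH ROW
    WHEN NEW.Salary > OLD.Salary
    BEGIN
        INSERT INTO Salary_Log VALUES (OLD.ID, OLD.Salary, NEW.Salary);
    END
""")

conn.execute("INSERT INTO Employee VALUES (1, 30000)")
conn.execute("UPDATE Employee SET Salary = 35000 WHERE ID = 1")  # condition holds
conn.execute("UPDATE Employee SET Salary = 32000 WHERE ID = 1")  # condition fails
log = conn.execute("SELECT ID, Old_Salary, New_Salary FROM Salary_Log").fetchall()
```

Both updates wake the trigger, but only the raise satisfies the WHEN condition, so the log holds a single row.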
Unit – 4
Chapter - 8
Transaction Processing
Concepts
INTRODUCTION
• Transaction processing systems are systems with large databases and
hundreds of concurrent users that are executing database transactions.
• EX: Systems for reservations, banking, credit card processing, stock
markets, supermarket checkout, and other similar systems.
• They require high availability and fast response time for hundreds of
concurrent users.
• In this chapter we present the concepts that are needed in transaction
processing systems.
Transaction Processing Concepts
Transaction
A transaction is the basic logical unit of execution in an information
system. A transaction is a sequence of operations that must be executed
as a whole, taking a consistent (correct) database state into another
consistent (correct) database state.
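The "executed as a whole" property can be seen in a small Python/sqlite3 sketch of a money transfer. The ACCOUNT table, the balances and the function transfer are invented for the illustration; the rule that a balance may not go negative stands in for the definition of a consistent state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACCOUNT ("
    " ACC_NO INTEGER PRIMARY KEY,"
    " BALANCE INTEGER CHECK (BALANCE >= 0))"
)
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [(1, 500), (2, 500)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts as one transaction: both updates or neither."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ACC_NO = ?",
                         (amount, dst))
            conn.execute("UPDATE ACCOUNT SET BALANCE = BALANCE - ? WHERE ACC_NO = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False   # the credit already applied has been undone

def balances():
    return conn.execute("SELECT ACC_NO, BALANCE FROM ACCOUNT ORDER BY ACC_NO").fetchall()

ok = transfer(1, 2, 300)
after_ok = balances()
failed = transfer(1, 2, 1000)   # would overdraw account 1
after_failed = balances()
```

In the failing transfer the credit to account 2 is executed first and then rolled back, so the total amount of money is the same before and after every transaction.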
Deadlock prevention
• A transaction locks all data items it refers to before it begins execution.
• This way of locking prevents deadlock since a transaction never waits
for a data item.
• Conservative two-phase locking uses this approach.
Deadlock detection and resolution
• In this approach, deadlocks are allowed to happen. The scheduler
maintains a wait-for graph for detecting cycles. If a cycle exists, then
one transaction involved in the cycle is selected (the victim) and
rolled back.
• A wait-for graph is created using the lock table. As soon as a
transaction is blocked, it is added to the graph. When a chain like Ti
waits for Tj, Tj waits for Tk, and Tk waits for Ti or Tj occurs, this creates a
cycle.
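Cycle detection on a wait-for graph can be sketched in Python with a depth-first search. The graph is represented as a dictionary from each transaction to the transactions it waits for. The transaction names are sample data, find_cycle is a hypothetical helper, and choosing the last transaction of the cycle as the victim is just one simple policy assumed for the example.

```python
def find_cycle(wait_for):
    """Return a list of transactions that form a cycle, or None if there is none."""
    visited = set()

    def visit(node, path):
        if node in path:                  # we came back to a node on the current chain
            return path[path.index(node):]
        if node in visited:               # already explored, no cycle through it
            return None
        visited.add(node)
        for blocker in wait_for.get(node, []):
            cycle = visit(blocker, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2, T2 waits for T3, T3 waits for T1; T4 waits for T1.
graph = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": ["T1"]}
cycle = find_cycle(graph)

# Assumed victim policy: roll back the last transaction found in the cycle.
victim = cycle[-1]
graph_after = {t: [w for w in waits if w != victim]
               for t, waits in graph.items() if t != victim}
cycle_after = find_cycle(graph_after)
```

Once the victim is rolled back and removed from the graph, the remaining transactions no longer wait on each other in a circle.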
Deadlock avoidance
• There are many variations of two-phase locking algorithm.
• Some avoid deadlock by not letting the cycle complete.
• That is, as soon as the algorithm discovers that blocking a transaction
is likely to create a cycle, it rolls back the transaction.
Starvation
• Starvation occurs when a particular transaction consistently waits or
is restarted and never gets a chance to proceed further.
• In deadlock resolution, it is possible that the same transaction may
consistently be selected as the victim and rolled back.
• This limitation is inherent in all priority-based scheduling
mechanisms.
• In the wound-wait scheme, a younger transaction may always be wounded
(aborted) by a long-running older transaction, which may create
starvation.
Deadlock Prevention
A deadlock can be prevented by the following two commonly used
schemes.
1. Wait – die
2. Wound-wait
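The decision rules of the two schemes named above can be written as two small Python functions. The rules are given here in their usual textbook form, which these notes do not spell out: a smaller timestamp means an older transaction, the requester is the transaction asking for a lock, and the holder is the transaction that currently has it. The function names and return strings are made up for the sketch.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies (is rolled back)."""
    if ts_requester < ts_holder:      # requester is older
        return "requester waits"
    return "requester is rolled back"

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds the holder; a younger requester waits."""
    if ts_requester < ts_holder:      # requester is older
        return "holder is rolled back"
    return "requester waits"

# Transaction with timestamp 5 is older than the one with timestamp 9.
older_asks_wait_die = wait_die(5, 9)
younger_asks_wait_die = wait_die(9, 5)
older_asks_wound_wait = wound_wait(5, 9)
younger_asks_wound_wait = wound_wait(9, 5)
```

In both schemes it is always the younger transaction that gets rolled back, so an older transaction never waits behind a chain that could close into a cycle.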
Wait-die Scheme
(Non-preemptive scheduling is scheduling in which, once the
resources (CPU cycles) have been allocated to a process,
the process holds them until it completes its burst time or
switches to the 'wait' state.
In non-preemptive scheduling, a process cannot be
interrupted until it terminates or its time is over.)
(Preemptive scheduling is a CPU scheduling technique
that works by allotting time slots of the CPU to a given
process.
The time slot given might or might not be long enough to
complete the whole process.)
2. Wound – Wait scheme
Deadlock detection
Recovery Techniques
Database recovery techniques are used in database management
systems (DBMS) to restore a database to a consistent state after a
failure or error has occurred. The main goal of recovery techniques is
to ensure data integrity and consistency and prevent data loss.
Failure Classification
Recovery Techniques in DBMS
Concurrency Control Based on Time Stamp Ordering
• A timestamp TS(T) is a unique identifier created by the DBMS to identify
a transaction.
• Timestamp values are assigned in the order in which the
transactions are submitted to the system.
• A timestamp can be generated using a counter that is incremented
each time its value is assigned to a transaction.
• The timestamp ordering (TO) algorithm orders the transactions
according to their timestamps.
• It guarantees serializability of schedules.
• The algorithm must ensure that, for each item accessed by conflicting
operations in the schedule, the order in which the item is accessed
does not violate the serializability order.
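A minimal Python sketch of the basic timestamp ordering checks follows. It uses the usual textbook bookkeeping, which is an assumption beyond what these notes state: every data item remembers the largest timestamp that has read it (read_ts) and the timestamp of its last writer (write_ts). A return value of False means the requesting transaction must be rolled back. The class and function names are hypothetical.

```python
from itertools import count

# Counter-based timestamps: each new transaction takes the next value.
timestamp_counter = count(1)

class Item:
    """A data item with the timestamps needed by basic timestamp ordering."""
    def __init__(self):
        self.value = None
        self.read_ts = 0    # largest timestamp of any transaction that read the item
        self.write_ts = 0   # timestamp of the transaction that last wrote the item

def read_item(item, ts):
    """Allow the read unless a younger transaction has already written the item."""
    if item.write_ts > ts:
        return False
    item.read_ts = max(item.read_ts, ts)
    return True

def write_item(item, ts, value):
    """Allow the write unless a younger transaction has already read or written it."""
    if item.read_ts > ts or item.write_ts > ts:
        return False
    item.value = value
    item.write_ts = ts
    return True

t1, t2, t3 = (next(timestamp_counter) for _ in range(3))   # timestamps 1, 2, 3
X = Item()
first_write = write_item(X, t2, "a")    # T2 writes X
later_read = read_item(X, t3)           # T3 (younger) reads X
late_write = write_item(X, t2, "b")     # T2 tries to write after T3 has read
late_read = read_item(X, t1)            # T1 (older) tries to read T2's value
```

The last two operations arrive too late with respect to the timestamp order, so they are refused and the value written first stays in place.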
Database Backup and Recovery from Catastrophic Failures