Mid 2 (DBMS)
UNIT-III
1. Explain relational Algebra with join conditions and examples.
Join operation combines the relation R1 and R2 with respect to a condition. It is denoted by ⋈.
The different types of join operation are as follows −
Theta join
Natural join
Outer join − It is further classified into following types −
o Left outer join.
o Right outer join.
o Full outer join.
Theta join
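A theta join combines R1 and R2 on a general condition θ built from a comparison operator (=, ≠, <, ≤, >, ≥).
When the condition uses an operator other than equality, it is also called a non-equi join.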
Example
Consider R1 table
RegNo Branch Section
1 CSE A
2 ECE B
3 CIVIL A
4 IT B
5 IT A
Table R2
Name RegNo
Bhanu 2
Priya 4
R1 ⋈ R2 with condition R1.regno > R2.regno
3 CIVIL A Bhanu 2
4 IT B Bhanu 2
5 IT A Bhanu 2
5 IT A Priya 4
In the join operation, we select those rows from the cartesian product where R1.regno>R2.regno.
Join operation = select operation + cartesian product operation
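The identity "join = select + cartesian product" can be sketched in Python. This is only an illustration: the function name theta_join and the tuple layout are assumptions made for this sketch, while the data is taken from tables R1 and R2 above.

```python
from itertools import product

# R1(RegNo, Branch, Section) and R2(Name, RegNo) from the example above.
R1 = [(1, "CSE", "A"), (2, "ECE", "B"), (3, "CIVIL", "A"), (4, "IT", "B"), (5, "IT", "A")]
R2 = [("Bhanu", 2), ("Priya", 4)]

def theta_join(left, right, condition):
    """Cartesian product of the two tables, followed by a selection on `condition`."""
    return [l + r for l, r in product(left, right) if condition(l, r)]

# Condition: R1.RegNo > R2.RegNo (index 0 of an R1 row, index 1 of an R2 row).
result = theta_join(R1, R2, lambda l, r: l[0] > r[1])
for row in result:
    print(*row)
```

Passing a different lambda, for example `l[0] == r[1]`, turns the same function into an equi join.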
Natural join
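If we join R1 and R2 on an equality condition, it is called an equi join. A natural join is an equi join on
the attributes that R1 and R2 have in common, with each common attribute shown only once in the result.
Generally, join is referred to as natural join.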
Natural join of R1 and R2 is −
{ we select those tuples from cartesian product where R1.regno=R2.regno}
R1 ⋈ R2
RegNo Branch Section Name
2 ECE B Bhanu
4 IT B Priya
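A natural join can be sketched as an equality match on the column names the two tables share, keeping the shared column once. The helper name natural_join and the column-list representation are assumptions for this sketch, not part of any DBMS.

```python
R1_COLS = ["RegNo", "Branch", "Section"]
R1 = [(1, "CSE", "A"), (2, "ECE", "B"), (3, "CIVIL", "A"), (4, "IT", "B"), (5, "IT", "A")]
R2_COLS = ["Name", "RegNo"]
R2 = [("Bhanu", 2), ("Priya", 4)]

def natural_join(left_cols, left, right_cols, right):
    """Join on all columns with the same name; shared columns appear only once."""
    common = [c for c in left_cols if c in right_cols]
    extra = [c for c in right_cols if c not in common]
    rows = []
    for l in left:
        lrow = dict(zip(left_cols, l))
        for r in right:
            rrow = dict(zip(right_cols, r))
            if all(lrow[c] == rrow[c] for c in common):
                rows.append(l + tuple(rrow[c] for c in extra))
    return left_cols + extra, rows

cols, rows = natural_join(R1_COLS, R1, R2_COLS, R2)
print(cols)
for row in rows:
    print(*row)
```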
Outer join
It is an extension of natural join to deal with missing values of relation.
Consider R1 and R2 shown below –
Table R1
RegNo Branch Section
1 CSE A
2 ECE B
3 CIVIL A
4 IT B
5 IT A
Table R2
Name Regno
Bhanu 2
Priya 4
Hari 7
Outer join is of three types. These are explained below −
Left outer join
It is denoted by R1 ⟕ R2.
2 ECE B Bhanu 2
4 IT B Priya 4
1 CSE A NULL NULL
3 CIVIL A NULL NULL
5 IT A NULL NULL
Here all the tuples of R1(left table) appear in output.
The mismatching values of R2 are filled with NULL.
Left outer join = natural join + mismatch / extra tuple of R1
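Right outer join
It is denoted by R1 ⟖ R2.
Here all the tuples of R2 (right table) appear in output. The mismatching values of R1 are filled with
NULL.
2 ECE B Bhanu 2
4 IT B Priya 4
NULL NULL NULL Hari 7
Right outer join = natural join + mismatch / extra tuple of R2
Full outer join
It is denoted by R1 ⟗ R2.
Here all the tuples of both R1 and R2 appear in output. The mismatching values on either side are filled
with NULL.
2 ECE B Bhanu 2
4 IT B Priya 4
1 CSE A NULL NULL
3 CIVIL A NULL NULL
5 IT A NULL NULL
NULL NULL NULL Hari 7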
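The three outer joins can be sketched in Python as "matching rows plus the unmatched rows of one or both sides", with None standing in for NULL. The function name outer_join and its `how` parameter are assumptions for this sketch. The data is R1 and R2 from the outer join example.

```python
R1 = [(1, "CSE", "A"), (2, "ECE", "B"), (3, "CIVIL", "A"), (4, "IT", "B"), (5, "IT", "A")]
R2 = [("Bhanu", 2), ("Priya", 4), ("Hari", 7)]

def outer_join(left, right, how):
    """Join on R1.RegNo = R2.RegNo; `how` is 'left', 'right' or 'full'."""
    # Rows that match on RegNo (the inner part of every outer join).
    result = [l + r for l in left for r in right if l[0] == r[1]]
    if how in ("left", "full"):
        right_keys = {r[1] for r in right}
        # Unmatched R1 rows, padded with NULLs for the R2 columns.
        result += [l + (None, None) for l in left if l[0] not in right_keys]
    if how in ("right", "full"):
        left_keys = {l[0] for l in left}
        # Unmatched R2 rows, padded with NULLs for the R1 columns.
        result += [(None, None, None) + r for r in right if r[1] not in left_keys]
    return result

left_rows = outer_join(R1, R2, "left")
right_rows = outer_join(R1, R2, "right")
full_rows = outer_join(R1, R2, "full")
for row in full_rows:
    print(*row)
```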
Example
Create table class(id number(10), grade number(10), teacherID number(10),
noofstudents number(10));
insert into class values(1,8,2,20);
insert into class values(2,9,3,40);
insert into class values(3,10,1,38);
select * from class;
Example 1
Select AVG(noofstudents) from class where teacherID IN(
Select id from teacher
Where subject='science' OR subject='maths');
Output
You will get the following output −
20.0
UNION ALL:
It will not remove duplicate records. It can be faster than UNION.
INTERSECT:
It is used to take the result of two queries and returns only those rows which are common in both
result sets. It removes duplicate records from the final result set.
EXCEPT:
It is used to take the distinct records of the first query and returns only those rows which do not
appear in the second result set.
Key points about UNION, UNION ALL, INTERSECT, and EXCEPT in MS SQL:
UNION: combines the result sets of two or more queries into a single set and removes duplicate rows.
UNION ALL: combines two or more result sets into a single set, including all duplicate rows.
INTERSECT: takes only the rows that are common to both result sets.
EXCEPT: takes the rows from the first result set that do not appear in the second result set.
Use below SQL Script to create and populate the two tables that we are going to use in our examples.
CREATE TABLE TableA
(
ID INT,
Name VARCHAR(50),
Gender VARCHAR(10),
Department VARCHAR(50)
)
GO
INSERT INTO TableA VALUES(1, 'Pranaya', 'Male','IT')
INSERT INTO TableA VALUES(2, 'Priyanka', 'Female','IT')
INSERT INTO TableA VALUES(3, 'Preety', 'Female','HR')
INSERT INTO TableA VALUES(3, 'Preety', 'Female','HR')
GO
Fetch the records:
SELECT * FROM TableA
INTERSECT Operator:
The INTERSECT operator retrieves the common unique rows from both the left and the right query.
Notice the duplicates are removed.
SELECT ID, Name, Gender, Department FROM TableA
INTERSECT
SELECT ID, Name, Gender, Department FROM TableB
EXCEPT Operator:
The EXCEPT operator will return unique rows from the left query that aren’t present in the right query’s
results.
SELECT ID, Name, Gender, Department FROM TableA
EXCEPT
SELECT ID, Name, Gender, Department FROM TableB
If you want the rows that are present in Table B but not in Table A, reverse the queries.
SELECT ID, Name, Gender, Department FROM TableB
EXCEPT
SELECT ID, Name, Gender, Department FROM TableA
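The four set operators can be tried out with a small sketch that runs the queries through SQLite via Python's sqlite3 module (an assumption: the notes target MS SQL, but SQLite supports the same four operators). The TableA rows are those inserted above. The TableB rows, including the name 'Anurag', are hypothetical and invented only for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (ID INT, Name VARCHAR(50), Gender VARCHAR(10), Department VARCHAR(50));
CREATE TABLE TableB (ID INT, Name VARCHAR(50), Gender VARCHAR(10), Department VARCHAR(50));
INSERT INTO TableA VALUES (1, 'Pranaya', 'Male', 'IT');
INSERT INTO TableA VALUES (2, 'Priyanka', 'Female', 'IT');
INSERT INTO TableA VALUES (3, 'Preety', 'Female', 'HR');
INSERT INTO TableA VALUES (3, 'Preety', 'Female', 'HR');
-- Hypothetical TableB contents, assumed for illustration only.
INSERT INTO TableB VALUES (2, 'Priyanka', 'Female', 'IT');
INSERT INTO TableB VALUES (3, 'Preety', 'Female', 'HR');
INSERT INTO TableB VALUES (4, 'Anurag', 'Male', 'IT');
""")

def compound(op):
    """Run TableA <op> TableB and return the rows ordered by ID."""
    sql = (f"SELECT ID, Name, Gender, Department FROM TableA {op} "
           "SELECT ID, Name, Gender, Department FROM TableB ORDER BY ID")
    return conn.execute(sql).fetchall()

union_rows = compound("UNION")          # duplicates removed
union_all_rows = compound("UNION ALL")  # duplicates kept
intersect_rows = compound("INTERSECT")  # rows common to both tables
except_rows = compound("EXCEPT")        # rows only in TableA

for name, rows in [("UNION", union_rows), ("UNION ALL", union_all_rows),
                   ("INTERSECT", intersect_rows), ("EXCEPT", except_rows)]:
    print(name, len(rows), "rows")
```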
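In SQL, the concept of a trigger is the same as a trigger in real life. For example, when we pull the trigger of a
gun, the bullet is fired. Similarly, a SQL trigger runs automatically when a specified event, such as an
INSERT, UPDATE, or DELETE, occurs on a table.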
The following query creates the Student_Trigger table in the SQL database:
DESC Student_Trigger;
Output:
The following query fires a trigger before the insertion of the student record
in the table:
To check the output of the above INSERT statement, you have to type the
following SELECT statement:
Output:
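As a minimal sketch of a trigger that fires before an insertion, the following uses SQLite through Python's sqlite3 module. Everything specific here is an assumption made for illustration: the columns of Student_Trigger, the trigger name check_marks_before_insert, the 0 to 100 marks rule, and the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed columns; not taken from the original table definition.
CREATE TABLE Student_Trigger (
    Student_RollNo INTEGER PRIMARY KEY,
    Student_Name   TEXT,
    Student_Marks  INTEGER
);
-- Fires before every INSERT and aborts the statement if the marks are out of range.
CREATE TRIGGER check_marks_before_insert
BEFORE INSERT ON Student_Trigger
BEGIN
    SELECT CASE
        WHEN NEW.Student_Marks < 0 OR NEW.Student_Marks > 100
        THEN RAISE(ABORT, 'marks must be between 0 and 100')
    END;
END;
""")

conn.execute("INSERT INTO Student_Trigger VALUES (1, 'Raj', 85)")  # accepted
try:
    conn.execute("INSERT INTO Student_Trigger VALUES (2, 'Ravi', 150)")
    rejected = False
except sqlite3.DatabaseError as exc:
    rejected = True  # the trigger aborted the insert
    print("insert rejected:", exc)

rows = conn.execute("SELECT * FROM Student_Trigger").fetchall()
print(rows)
```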
b. Integrity Constraints:
Integrity constraints are a set of rules that are used to maintain the quality of information.
Integrity constraints ensure that the data insertion, updating, and other processes have to be
performed in such a way that data integrity is not affected.
Thus, integrity constraint is used to guard against accidental damage to the database.
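How a DBMS guards data with integrity constraints can be sketched with SQLite through Python's sqlite3 module. The student table, its columns, and the age rule are assumptions chosen for this sketch. It shows a key constraint (PRIMARY KEY), a NOT NULL constraint, and a domain constraint (CHECK).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    regno INTEGER PRIMARY KEY,          -- key constraint: unique per row
    name  TEXT NOT NULL,                -- value must be present
    age   INTEGER CHECK (age >= 17)     -- domain constraint on allowed values
)""")
conn.execute("INSERT INTO student VALUES (1, 'Bhanu', 19)")

def try_insert(row):
    """Attempt an insert and report whether a constraint rejected it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert((1, "Priya", 20)),   # duplicate primary key
    try_insert((2, None, 20)),      # NULL in a NOT NULL column
    try_insert((3, "Hari", 12)),    # violates the CHECK constraint
    try_insert((4, "Priya", 20)),   # satisfies every constraint
]
for r in results:
    print(r)
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```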
UNIT-IV
1. What is Normalization? Explain.
Normalization:
A large database defined as a single relation may result in data duplication. This repetition of
data wastes storage and may result in insertion, update, and deletion anomalies.
So to handle these problems, we should analyze and decompose the relations with redundant
data into smaller, simpler, and well-structured relations that satisfy desirable properties.
Normalization is a process of decomposing the relations into relations with fewer attributes.
Explanation:
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set of relations. It is also
used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies.
o Normalization divides the larger table into smaller tables and links them using relationships.
o The normal form is used to reduce redundancy from the database table.
The main reason for normalizing the relations is to remove these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data integrity and other problems as the
database grows. Normalization consists of a series of guidelines that help you create a good
database structure.
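Decomposition can be sketched in Python with a small, hypothetical relation. The enrolment data and the dependency Branch → Building are assumptions made for this sketch. The wide table repeats the building for every student of a branch. Splitting it stores each fact once, and joining the parts gives back the original rows.

```python
# Hypothetical wide relation: (RegNo, Name, Branch, Building).
# The building depends only on the branch, so it is repeated for every CSE student.
enrolment = [
    (1, "Bhanu", "CSE", "Block A"),
    (2, "Priya", "CSE", "Block A"),
    (3, "Hari", "ECE", "Block B"),
]

# Decompose into two smaller relations linked by Branch.
students = sorted({(regno, name, branch) for regno, name, branch, _ in enrolment})
branches = sorted({(branch, building) for _, _, branch, building in enrolment})

# Joining the two parts on Branch reproduces the original relation (lossless).
rejoined = sorted(
    (regno, name, branch, building)
    for regno, name, branch in students
    for b, building in branches
    if branch == b
)

print(students)
print(branches)
print(rejoined == sorted(enrolment))
```

After the split, changing the building of CSE touches one row in branches instead of every CSE student row, which is how the update anomaly is avoided.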
a. Functional Dependencies
Functional dependency mathematically demonstrates the relation
between two attributes in a database management system (DBMS). It typically
exists with a primary key attribute and a non-key attribute within a table or data
set. Functional dependency acts as a constraint between the two sets of
attributes and is an essential factor in designing database parameters and
functions to help your business, organization or company store and manage its
data.
Often, an arrow denotes functional dependency. For example, if you mark one
attribute X and another Y and Y functionally depends on X, the
simple formula is:
X→Y
This means that any two rows with the same value of X must also have the same
value of Y.
The attribute set on the left, X, is called the determinant, while the Y on the right
is called the dependent.
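Whether X→Y holds in a given table can be checked by testing that rows agreeing on X also agree on Y. The following is a minimal sketch. The function name fd_holds is an assumption, and the rows are the R1 table from Unit III. A check on one table instance can only refute a dependency. It cannot prove that the dependency holds for all possible data.

```python
def fd_holds(rows, X, Y):
    """Return True if every pair of rows that agrees on X also agrees on Y."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

R1 = [
    {"RegNo": 1, "Branch": "CSE", "Section": "A"},
    {"RegNo": 2, "Branch": "ECE", "Section": "B"},
    {"RegNo": 3, "Branch": "CIVIL", "Section": "A"},
    {"RegNo": 4, "Branch": "IT", "Section": "B"},
    {"RegNo": 5, "Branch": "IT", "Section": "A"},
]

print(fd_holds(R1, ["RegNo"], ["Branch"]))    # RegNo determines Branch in this data
print(fd_holds(R1, ["Branch"], ["Section"]))  # IT appears with both section A and B
```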
b. Multivalued dependency
A multivalued dependency (MVD) exists when the presence of one or more rows in a table
implies the presence of one or more other rows in that same table. A table with a nontrivial
multivalued dependency whose determinant is not a super key is not in fourth normal form. A
multivalued dependency involves at least three attributes of a table.
It is represented with the symbol "->->" in DBMS.
X->Y relates one value of X to one value of Y.
X->->Y (read as X multidetermines Y) relates one value of X to many values of Y.
A nontrivial MVD occurs when X->->Y and X->->Z hold, where Y and Z are independent of
each other. A nontrivial MVD produces redundancy.
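The "presence of rows implies presence of other rows" idea can be sketched as a check: X->->Y holds if, for any two rows with the same X, the row combining the Y of the first with the remaining attributes Z of the second is also in the table. The function name mvd_holds and the Course/Teacher/Book data are hypothetical, chosen only for this sketch.

```python
def mvd_holds(rows, attrs, X, Y):
    """Check X ->-> Y on a table given as a list of dicts over `attrs`."""
    Z = [a for a in attrs if a not in X and a not in Y]

    def proj(row, cols):
        return tuple(row[a] for a in cols)

    present = {(proj(r, X), proj(r, Y), proj(r, Z)) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if proj(t1, X) == proj(t2, X):
                # The mixed row (X, Y of t1, Z of t2) must also be present.
                if (proj(t1, X), proj(t1, Y), proj(t2, Z)) not in present:
                    return False
    return True

ATTRS = ["Course", "Teacher", "Book"]
# Teachers and books of a course are independent, so every combination is stored.
table = [
    {"Course": "DBMS", "Teacher": "Rao", "Book": "Korth"},
    {"Course": "DBMS", "Teacher": "Rao", "Book": "Navathe"},
    {"Course": "DBMS", "Teacher": "Devi", "Book": "Korth"},
    {"Course": "DBMS", "Teacher": "Devi", "Book": "Navathe"},
]

print(mvd_holds(table, ATTRS, ["Course"], ["Teacher"]))       # all combinations present
print(mvd_holds(table[:3], ATTRS, ["Course"], ["Teacher"]))   # one combination missing
```

The four rows needed to record two teachers and two books show the redundancy that a nontrivial MVD produces.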
BCNF rules
To check if the table satisfies the conditions of BCNF, the following two conditions should exist:
o The table is in 3NF.
o For every non-trivial functional dependency X→Y, X is a super key.
3NF states that a transitive dependency of a non-prime attribute on a key must not exist. Equivalently, for
every non-trivial functional dependency, the LHS (left-hand side) must be a super key or the RHS
(right-hand side) must consist of prime attributes. BCNF adds more restrictions by stating that the LHS of
every non-trivial functional dependency must be a super key, and it removes the RHS condition.
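The BCNF condition on the LHS can be checked mechanically with attribute closure: X is a super key exactly when the closure of X contains every attribute of the relation. The sketch below assumes a hypothetical relation R(Student, Course, Teacher) with two dependencies. The function names closure and bcnf_violations are also assumptions.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Non-trivial dependencies whose LHS is not a super key of `relation`."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                       # skip trivial dependencies
        and not set(relation) <= closure(lhs, fds)        # LHS is not a super key
    ]

# Hypothetical relation and dependencies, assumed for illustration.
R = ("Student", "Course", "Teacher")
FDS = [
    (("Student", "Course"), ("Teacher",)),  # LHS is a key of R
    (("Teacher",), ("Course",)),            # LHS is not a super key
]

violations = bcnf_violations(R, FDS)
print(violations)
```

Here Teacher → Course has a prime attribute on the RHS, so it is allowed by 3NF, but it violates BCNF because Teacher alone is not a super key.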
UNIT-V
1. What is Transaction Management? Explain.
Transaction Management in DBMS
A transaction in DBMS is a logical unit of work that consists of one or more database operations,
such as insertion, update, and deletion of data. A transaction usually means that there is a
change in the database.
Transaction States in DBMS
Transaction states in DBMS are the transaction stages from the beginning of the transaction to
its completion or rollback. The transaction states are the multiple phases of the transaction
during its lifetime. These states outline the transaction’s current situation and explain how it is
handled.
There are a total of 6 transaction states in DBMS. The transaction states are active, partially
committed state, committed, failed, aborted, and terminated state.
Active state: A transaction is in the active state while its operations are being executed.
Partially Committed and Committed state: A transaction is partially committed after its final
operation has executed, while its changes are still in main memory. Once the changes are made
permanent, it passes on to the committed state. And if, in the process, some failure is
experienced, it goes into a failed state.
Aborted State: In case of transaction failure, it goes from failed state to aborted state. When we
apply the rollback operation, the database is restored to the state it was in before the transaction
began. Also, any locks held by the transaction are released, and now other transactions can
access the data.
Transaction States:
Active
Partially committed
Failed
Aborted
Committed
At first, when the transaction begins to operate, it is in the active state. After its last read or write
operation has executed, it enters the partially committed state. Finally, when the commit operation
succeeds, it enters the committed state, meaning the changes of the transaction are stored
permanently in the database. If a failure occurs in either the active state or the partially committed
state, the transaction enters the failed state. A failed transaction cannot continue executing, so it is
rolled back, which takes it to the aborted state. An aborted transaction then moves to the
terminated state. Likewise, after the committed state, the transaction terminates.
During a series of operations, the results of read and write operations are held in local memory or a
buffer. After the commit statement, the data is moved into permanent storage. This explains the flow
from active to partially committed and then to committed state.
On the other hand, if a power failure occurs in the active state or in the partially committed state, the
transaction enters the failed state. After this, rollback occurs, meaning the local memory is cleared.
The transaction is then aborted, meaning the database is unchanged, and finally terminated.
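The state flow described above can be sketched as a small transition table in Python. The names TRANSITIONS and is_valid_history are assumptions made for this sketch. The allowed transitions follow the six states listed in this answer.

```python
# Allowed next states for each transaction state.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": {"terminated"},
    "committed": {"terminated"},
    "terminated": set(),
}

def is_valid_history(states):
    """Check that a sequence of states starts at 'active' and uses only allowed moves."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS.get(cur, set()) for cur, nxt in zip(states, states[1:]))

commit_path = ["active", "partially committed", "committed", "terminated"]
abort_path = ["active", "failed", "aborted", "terminated"]
late_failure = ["active", "partially committed", "failed", "aborted", "terminated"]
skipped_step = ["active", "committed"]  # cannot commit without being partially committed

for path in (commit_path, abort_path, late_failure, skipped_step):
    print(path, is_valid_history(path))
```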
3. What is meant by ACID properties?
ACID properties are a set of fundamental principles in the context of database management
systems (DBMS) that ensure data integrity, consistency, and reliability in transaction processing.
The term "ACID" stands for Atomicity, Consistency, Isolation, and Durability. Let's look at each of
these properties in detail:
Atomicity:
Atomicity ensures that a transaction is treated as a single, indivisible unit of work. It means
that either all the operations within a transaction are executed successfully, or none of them are
executed at all. If any part of a transaction fails, the entire transaction is rolled back, and the
database is left unchanged.
Consistency:
Consistency guarantees that a transaction takes the database from one valid state to
another. It enforces all the rules and constraints defined in the database schema. In simpler terms,
the database remains in a consistent state before and after a transaction is executed.
Isolation:
Isolation ensures that the concurrent execution of multiple transactions does not lead to
data inconsistencies. Each transaction is isolated from other transactions until it is completed and
committed. This prevents interference or conflicts between transactions.
Durability:
Durability ensures that once a transaction is successfully committed, its changes to the
database are permanent and survive any subsequent system failures (e.g., power outage, crash).
The changes are stored safely in non-volatile storage, such as disk, so that they can be recovered
and restored in case of a system failure.
By adhering to the ACID properties, a DBMS ensures that data integrity and reliability are
maintained, and transactions are executed in a robust and consistent manner, even in a multi-user
and concurrent environment. However, it's essential to note that adhering to all ACID properties
might incur some performance overhead, especially in high-concurrency scenarios. To address this,
some systems may adopt weaker consistency models like BASE (Basically Available, Soft state,
Eventually consistent) for certain use cases.
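Atomicity and consistency can be seen in a small sketch using SQLite through Python's sqlite3 module. The account table, the balances, and the transfer function are assumptions made for illustration. A transfer credits one account and debits another. If the debit would break the CHECK constraint, the whole transaction is rolled back, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK failed; the credit above was undone by the rollback

first = transfer("A", "B", 30)    # succeeds
second = transfer("A", "B", 500)  # would make A negative, so nothing is applied
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(first, second, balances)
```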
a. DBMS storage
Storage System in DBMS
A database system provides an abstract view of the stored data. However, the data itself is stored as bits and
bytes on different storage devices.
In this section, we will take an overview of various types of storage devices that are used for accessing
and storing data.
Types of Data Storage
For storing the data, there are different types of storage options available. These storage types differ
from one another as per the speed and accessibility. There are the following types of storage devices
used for storing the data:
o Primary Storage
o Secondary Storage
o Tertiary Storage
b. Deadlock
In a database management system (DBMS), a deadlock occurs when two or more transactions
are waiting for each other to release resources, such as locks on database objects, that they need
to complete their operations. As a result, none of the transactions can proceed, leading to a
situation where they are stuck or “deadlocked.”
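One common way to detect this situation is a wait-for graph: an edge Ti → Tj means Ti is waiting for a lock held by Tj, and a cycle in the graph means deadlock. The following is a minimal sketch. The function name find_deadlock and the example graphs are assumptions.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle in the wait-for graph, or None."""
    visiting, done = set(), set()

    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for nxt in wait_for.get(node, ()):
            if nxt in visiting:                  # back edge: a cycle is found
                return path[path.index(nxt):]
            if nxt not in done:
                cycle = dfs(nxt, path)
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in wait_for:
        if start not in done:
            cycle = dfs(start, [])
            if cycle:
                return cycle
    return None

# T1 waits for T2 and T2 waits for T1: neither can proceed.
deadlocked = {"T1": ["T2"], "T2": ["T1"]}
# T1 waits for T2, T2 waits for T3, T3 waits for nobody: no deadlock.
chain = {"T1": ["T2"], "T2": ["T3"], "T3": []}

print(find_deadlock(deadlocked))
print(find_deadlock(chain))
```

When a cycle is found, a DBMS typically aborts one of the transactions in it so that the others can continue.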
c. Serializability
In computer science, serializability is a property of a system describing how different processes
operate on shared data. A schedule of concurrent transactions is serializable if its result is the same
as if the transactions were executed one after another in some sequential order, with no overlap in
execution. In a database management system (DBMS), serializability can be accomplished by
locking data so that no other process can access it while it is being read or written.
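A standard sufficient test is conflict serializability: build a precedence graph with an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj (same item, different transactions, at least one write), and check that the graph has no cycle. The sketch below assumes a schedule written as (transaction, operation, item) tuples. The function names are hypothetical.

```python
def precedence_graph(schedule):
    """Edges Ti -> Tj for each pair of conflicting operations, Ti's coming first."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by repeated source removal)."""
    nodes = {t for t, _, _ in schedule}
    edges = precedence_graph(schedule)
    while nodes:
        # Transactions with no incoming edge can go first in a serial order.
        free = {n for n in nodes if not any(dst == n for _, dst in edges)}
        if not free:
            return False  # every remaining node has an incoming edge: a cycle
        nodes -= free
        edges = {(s, d) for s, d in edges if s not in free}
    return True

# T1 finishes its work on A before T2 starts: equivalent to the serial order T1, T2.
serial_like = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
# T2 writes A between T1's read and write of A: edges in both directions.
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

print(is_conflict_serializable(serial_like))
print(is_conflict_serializable(interleaved))
```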
Shared Lock (Read Lock): Multiple transactions can acquire shared locks on the same data item
simultaneously. Shared locks allow read-only operations on the data and do not conflict with other
shared locks. Transactions holding shared locks can read the data but cannot modify it until all
shared locks are released.
Exclusive Lock (Write Lock): Only one transaction can acquire an exclusive lock on a data item at
a time. Exclusive locks prevent other transactions from acquiring shared or exclusive locks on the
same data item. Transactions holding exclusive locks have the exclusive right to modify the data.
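The compatibility rules for shared (S) and exclusive (X) locks can be sketched as a tiny lock table. The class name LockTable and its methods are assumptions for this sketch. A real lock manager would also queue waiting requests, which is omitted here.

```python
# (held mode, requested mode) -> can both be held by different transactions?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

class LockTable:
    def __init__(self):
        self.held = {}  # data item -> {transaction: mode}

    def request(self, txn, item, mode):
        """Grant the lock if it is compatible with locks held by other transactions."""
        holders = self.held.setdefault(item, {})
        others = [m for t, m in holders.items() if t != txn]
        if all(COMPATIBLE[(m, mode)] for m in others):
            if holders.get(txn) != "X":  # never downgrade an exclusive lock
                holders[txn] = mode
            return True
        return False

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)

locks = LockTable()
log = [
    locks.request("T1", "A", "S"),  # granted
    locks.request("T2", "A", "S"),  # granted: shared locks do not conflict
    locks.request("T3", "A", "X"),  # refused: A is share-locked by T1 and T2
]
locks.release("T1", "A")
locks.release("T2", "A")
log.append(locks.request("T3", "A", "X"))  # granted: no other holders remain
log.append(locks.request("T1", "A", "S"))  # refused: T3 holds an exclusive lock
print(log)
```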
There are several lock-based protocols that control the way locks are acquired and released to
ensure data consistency. Some common lock-based protocols include:
Two-Phase Locking (2PL): Transactions follow two phases: a growing (acquisition) phase, where they
acquire locks and release none, and a shrinking (release) phase, where they release locks and acquire
none. 2PL guarantees conflict serializability, but on its own it does not prevent deadlocks or
cascading rollbacks.
Strict Two-Phase Locking (S2PL): A variant of 2PL where transactions hold all their exclusive locks
until they commit or abort. This prevents dirty reads and cascading rollbacks but may lead to more lock
contention and reduced concurrency.
Rigorous Two-Phase Locking: Similar to S2PL, but all locks, shared as well as exclusive, are held
until the transaction commits or aborts. Transactions can then be serialized in the order in which
they commit, at the cost of more restrictive locking.
Conservative Two-Phase Locking: All the locks are acquired before any data access in the
transaction. This avoids deadlock but requires knowing in advance which data items the transaction
will use.
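The two-phase rule itself is easy to check for a single transaction: once any lock has been released, no new lock may be acquired. The following is a minimal sketch. The function name follows_2pl and the operation format are assumptions.

```python
def follows_2pl(ops):
    """ops: list of ('lock' | 'unlock', item) actions of one transaction, in order."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True          # the release phase has begun
        elif action == "lock" and shrinking:
            return False              # acquiring after a release breaks 2PL
    return True

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(follows_2pl(two_phase))
print(follows_2pl(not_two_phase))
```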
Lock-based protocols ensure that transactions access shared resources in a coordinated and
controlled manner, avoiding conflicts and data inconsistencies. However, they may also lead to
issues like lock contention, deadlock, and reduced concurrency, especially in high-concurrency
environments. To strike a balance between data consistency and performance, different DBMS
implementations may use various lock-based protocols or combine them with other concurrency
control mechanisms.