Chapter 7

Transaction Processing

Prepared By:
Ms. Ayushi Gondaliya
Assistant Prof.
SRCOE, RAJKOT
Transaction Process
• A transaction is a unit of program execution that
accesses and possibly updates various data items.
• A transaction must see a consistent database.
• During transaction execution the database may be
temporarily inconsistent.
• When the transaction completes successfully (is
committed), the database must be consistent.
• After a transaction commits, the changes it has made to
the database persist, even if there are system failures.
• Multiple transactions can execute in parallel.
• Two main issues to deal with:
▪ Failures of various kinds, such as hardware failures
and system crashes
▪ Concurrent execution of multiple transactions
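ACID Properties
 Properties of a transaction:
◦ Atomicity: either all operations of the transaction take effect, or none do.
◦ Consistency: a transaction run on its own takes the database from one consistent state to another.
◦ Isolation: concurrently executing transactions do not see each other's intermediate results.
◦ Durability: once a transaction commits, its changes persist even if the system fails.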
States Of Transaction
 Active state
◦ The active state is the first state of every transaction. In
this state, the transaction is being executed.
◦ For example: insertion, deletion or updating of a record is
done here, but the changes are not yet saved to the
database.
 Partially committed
◦ In the partially committed state, a transaction executes its
final operation, but the data is still not saved to the
database.
 Committed
◦ A transaction is said to be in a committed state if it
executes all its operations successfully. In this state, all the
effects are now permanently saved on the database
system.
 Failed state
◦ If any of the checks made by the database recovery
system fails, then the transaction is said to be in the
failed state.
◦ In the example of total mark calculation, if the
database is not able to fire a query to fetch the marks,
then the transaction will fail to execute.
 Aborted
◦ If a transaction fails partway through, all the changes
it has made so far are rolled back, so that the database
returns to the consistent state it was in before the
transaction started.
◦ After aborting the transaction, the database recovery
module will select one of the two operations:
 Re-start the transaction
 Kill the transaction
Schedule
 A schedule is a sequence of operations from a set of
transactions, in the order in which they are executed.
It preserves the order of the operations within
each individual transaction.
1. Serial Schedule
 The serial schedule is a type of schedule where
one transaction is executed completely before
starting another transaction.
 For example: Suppose there are two transactions
T1 and T2. If their operations are not interleaved,
then there are two possible execution orders:
◦ Execute all the operations of T1, followed
by all the operations of T2.
◦ Execute all the operations of T2, followed
by all the operations of T1.
Let T1 transfer $50 from A to B, and T2 transfer 10% of the
balance from A to B.

[Figure: a serial schedule where T1 is followed by T2]
[Figure: a serial schedule where T2 is followed by T1]
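The two serial orders can be worked through numerically. The sketch below assumes starting balances of A = 1000 and B = 2000, which are illustrative values and not taken from the slides; `t1`, `t2` and `run_serial` are hypothetical helper names.

```python
def t1(db):
    """T1: transfer 50 from A to B."""
    db["A"] -= 50
    db["B"] += 50


def t2(db):
    """T2: transfer 10% of A's balance from A to B."""
    temp = db["A"] * 10 // 100
    db["A"] -= temp
    db["B"] += temp


def run_serial(order, initial):
    """Run whole transactions one after another on a copy of the data."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db


initial = {"A": 1000, "B": 2000}  # assumed starting balances
t1_then_t2 = run_serial([t1, t2], initial)
t2_then_t1 = run_serial([t2, t1], initial)
print(t1_then_t2)  # {'A': 855, 'B': 2145}
print(t2_then_t1)  # {'A': 850, 'B': 2150}
```

Both orders leave A + B at 3000, so each serial schedule preserves consistency even though the final balances differ.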
2. Non-serial Schedule
 If interleaving of operations is allowed, then
there will be a non-serial schedule.
 It contains many possible orders in which the
system can execute the individual operations of
the transactions.
Example:
Serializability
 Basic Assumption – Each transaction preserves database
consistency.
• When multiple transactions run concurrently, their
interleaved operations may leave the database inconsistent.
• Serializability is a concept that helps to identify
which non-serial schedules are correct and will
maintain the consistency of the database.
• There are two types of serializability:
1. Conflict serializability
2. View serializability
Conflict Serializable Schedule
 Conflict serializability is one type of
serializability. It is checked by looking at the
conflicting operations of a non-serial schedule.
 What is Conflict Serializability?
◦ A schedule is called conflict serializable if we can
convert it into a serial schedule by swapping its
non-conflicting operations.
 Two operations conflict if all of the following
conditions hold:
◦ They belong to different transactions.
◦ They operate on the same data item.
◦ At least one of them is a write operation.
 Example 1: Operation W(X) of transaction T1 and
operation R(X) of transaction T2 are conflicting operations,
because they satisfy all three conditions mentioned
above: they belong to different transactions, they
work on the same data item X, and one of the operations is a write
operation.
 Example 2: Similarly, operations W(X) of T1 and W(X) of
T2 are conflicting operations.
 Example 3: Operations W(X) of T1 and W(Y) of T2 are
non-conflicting operations because the two write
operations do not work on the same data item, so they
don't satisfy the second condition.
 Example 4: Similarly, R(X) of T1 and R(X) of T2 are
non-conflicting operations because neither of them is a write
operation.
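The three conditions translate directly into a small predicate. This is a minimal sketch; the `Op` record and the `conflicts` function are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "R" for read, "W" for write
    item: str  # data item, e.g. "X"


def conflicts(a: Op, b: Op) -> bool:
    """True if the two operations satisfy all three conflict conditions."""
    return (
        a.txn != b.txn                 # different transactions
        and a.item == b.item           # same data item
        and "W" in (a.kind, b.kind)    # at least one write
    )


example_1 = conflicts(Op("T1", "W", "X"), Op("T2", "R", "X"))
example_2 = conflicts(Op("T1", "W", "X"), Op("T2", "W", "X"))
example_3 = conflicts(Op("T1", "W", "X"), Op("T2", "W", "Y"))
example_4 = conflicts(Op("T1", "R", "X"), Op("T2", "R", "X"))
print(example_1, example_2, example_3, example_4)  # True True False False
```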
Conflict Equivalent Schedules
 Two schedules are said to be conflict equivalent
if one schedule can be converted into the other
by swapping non-conflicting
operations.
 Equivalently, two schedules are conflict equivalent
if and only if:
◦ They contain the same set of transactions and operations.
◦ Every pair of conflicting operations is ordered
in the same way in both schedules.
Example:
• Schedule S2 is a serial schedule because, in it, all operations of T1 are
performed before starting any operation of T2.
• Schedule S1 can be transformed into a serial schedule by swapping its
non-conflicting operations.
 After swapping the non-conflicting
operations, the schedule S1
becomes:
T1          T2
Read(A)
Write(A)
Read(B)
Write(B)
            Read(A)
            Write(A)
            Read(B)
            Write(B)
 Hence, S1 is conflict serializable.
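Swapping operations by hand gets tedious for longer schedules. A standard alternative, not described in these slides, is the precedence-graph test: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and the schedule is conflict serializable exactly when the graph has no cycle. The sketch below assumes schedules are written as lists of `(transaction, kind, item)` tuples; the schedule `s1` is an assumed interleaving, since the original figure is not reproduced here.

```python
def precedence_edges(schedule):
    """Collect an edge (Ti, Tj) for every conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (ti, ki, xi) in enumerate(schedule):
        for tj, kj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ki, kj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {node: 0 for node in graph}  # 0 = unvisited, 1 = in progress, 2 = done

    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            if state[nxt] == 1 or (state[nxt] == 0 and visit(nxt)):
                return True
        state[node] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
not_serializable = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

print(is_conflict_serializable(s1))                # True
print(is_conflict_serializable(not_serializable))  # False
```

In `s1` every conflict points from T1 to T2, so the only edge is T1 → T2 and the equivalent serial order is T1 followed by T2.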


View Serializability
 A schedule is view serializable if it is view
equivalent to some serial schedule.
 To check whether a given schedule is view
serializable, we need to check whether it is
view equivalent to a serial schedule of the same
transactions.
 Let's take an example:
T1          T2
Read(A)
Write(A)
            Read(A)
            Write(A)
Read(B)
Write(B)
            Read(B)
            Write(B)
 The serial schedule for the above
schedule would look like this:
T1          T2
Read(A)
Write(A)
Read(B)
Write(B)
            Read(A)
            Write(A)
            Read(B)
            Write(B)
 If we can prove that the given schedule is view
equivalent to this serial schedule, then the given
schedule is called view serializable.
View Equivalent
 Two schedules S1 and S2 are said to be view
equivalent if they satisfy the following
conditions:
1. Initial Read
◦ The initial read of each data item must be the same in both schedules.
If transaction T1 reads the initial value of data item A in S1
(before any write to A), then T1 must also read the
initial value of A in S2.
2. Updated Read
◦ If Ti reads a value of A written by Tj in S1,
then in S2 Ti must also read the value of A written by
Tj.
3. Final Write
◦ The final write on each data item must
match in both schedules. For example, if
data item X is last written by transaction T1
in schedule S1, then in S2 the last write
on X must also be performed by
transaction T1.
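The three conditions can be checked mechanically. The sketch below is a simplified illustration with hypothetical names (`view_profile`, `view_equivalent`): it records, for every read, which transaction wrote the value being read (`None` stands for the initial value), and the last writer of each item. It collapses repeated reads of the same item by the same transaction from the same writer, which is enough for small examples like the one above.

```python
def view_profile(schedule):
    """Return (reads-from relation, final writer per item) for a schedule."""
    last_writer = {}    # item -> transaction that wrote it most recently
    reads_from = set()  # (reader, item, writer), writer None = initial value
    for txn, kind, item in schedule:
        if kind == "R":
            reads_from.add((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    """Same operations, same initial/updated reads, same final writes."""
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)


interleaved = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
               ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
serial_t1_t2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"),
                ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B")]
serial_t2_t1 = serial_t1_t2[4:] + serial_t1_t2[:4]

print(view_equivalent(interleaved, serial_t1_t2))  # True
print(view_equivalent(interleaved, serial_t2_t1))  # False
```

The interleaved schedule matches the serial order T1, T2 on all three conditions, so it is view serializable; it does not match the order T2, T1 because there T2, not T1, performs the initial reads.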
Two phase commit protocol
 As the name suggests, there are two phases in
2PC:
◦ 1. Prepare phase (commit request phase)
◦ 2. Commit/Abort phase
1. Prepare phase
 The co-ordinator asks each participant
whether it has successfully completed its
responsibilities for the transaction and is
ready to commit.
 Every participant writes its data records to a
log.
 If a participant is unsuccessful, it responds with
a 'no | abort' failure message; if it is successful, it sends
a 'yes | OK' message.
2. Commit/Abort phase
 After the controlling site (the co-ordinator) has received a “YES”
message from all the slaves (the participants) −
◦ The controlling site sends a “Global Commit”
message to the slaves.
◦ The slaves apply the transaction and send a
“Commit ACK” message to the controlling
site.
◦ When the controlling site receives “Commit
ACK” message from all the slaves, it considers
the transaction as committed.
 After the controlling site has received the first
“No” message from any slave −
◦ The controlling site sends a “Global Abort”
message to the slaves.
◦ The slaves abort the transaction and send an
“Abort ACK” message to the controlling site.
◦ When the controlling site receives “Abort
ACK” message from all the slaves, it considers
the transaction as aborted.
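The message flow of both phases can be sketched in a few lines. This is a simplified single-process illustration, not a fault-tolerant implementation: `Participant` and `two_phase_commit` are hypothetical names, there are no timeouts or site failures, and the co-ordinator collects every vote before deciding rather than aborting on the first "No".

```python
class Participant:
    """A hypothetical participant (slave) site."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"
        self.log = []

    def prepare(self):
        # Phase 1: record the vote in the local log, then reply.
        vote = "yes" if self.can_commit else "no"
        self.log.append("prepare:" + vote)
        return vote

    def finish(self, decision):
        # Phase 2: apply the global decision and acknowledge it.
        self.state = "committed" if decision == "commit" else "aborted"
        self.log.append(decision)
        return decision + "-ack"


def two_phase_commit(participants):
    """Act as the co-ordinator (controlling site) and return the outcome."""
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(v == "yes" for v in votes) else "abort"
    acks = [p.finish(decision) for p in participants]
    # The transaction is considered finished once every ACK has arrived.
    assert all(a == decision + "-ack" for a in acks)
    return "committed" if decision == "commit" else "aborted"


ok = two_phase_commit([Participant("site1"), Participant("site2")])
healthy_site = Participant("site1")
failed_site = Participant("site2", can_commit=False)
bad = two_phase_commit([healthy_site, failed_site])
print(ok, bad)  # committed aborted
```

A single "no" vote aborts the transaction everywhere, including at sites that voted "yes".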
Database recovery
 When a system crashes, it may have several transactions
being executed and various files opened for them to
modify the data items. Transactions are made of various
operations, which are atomic in nature.
 There are two types of techniques, which can help a
DBMS in recovering as well as maintaining the atomicity
of a transaction −
 Maintaining the logs of each transaction, and writing
them onto some stable storage before actually
modifying the database.
 Maintaining shadow paging, where the changes are done
on a volatile memory, and later, the actual database is
updated.
Log based Recovery
 Log is a sequence of records.
 Log of each transaction is maintained in some
stable storage so that if any failure occurs, then it
can be recovered from there.
 If any operation is performed on the database, then
it will be recorded in the log.
 The log record must be written to stable storage
before the corresponding change is applied to the
database.
 Log-based recovery works as follows −
◦ The log file is kept on a stable storage media.
◦ When a transaction enters the system and starts
execution, it writes a log about it.
 Let's assume there is a transaction to modify the
City of a student. The following logs are written
for this transaction.
 When the transaction is initiated, then it writes
'start' log.
◦ <Tn, Start>
 When the transaction modifies the City from
'Noida' to 'Bangalore', then another log is written
to the file.
◦ <Tn, City, 'Noida', 'Bangalore' >
 When the transaction is finished, then it writes
another log to indicate the end of the transaction.
◦ <Tn, Commit>
 There are two approaches to modify the
database:
 1. Deferred database modification:
◦ The deferred modification technique occurs if the
transaction does not modify the database until it has
committed.
◦ In this method, all the logs are created and stored in
the stable storage, and the database is updated when
a transaction commits.
 2. Immediate database modification:
◦ The Immediate modification technique occurs if
database modification occurs while the transaction is
still active.
◦ In this technique, the database is modified
immediately after every operation, so the log must keep
the old value of each item to allow the change to be
undone if the transaction fails.
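Under immediate modification, recovery may have to both undo and redo; under deferred modification only the redo pass is needed, because uncommitted changes never reach the database. The sketch below illustrates the immediate case using log records shaped like the ones above. The tuple layout, the `recover` function, and the second transaction T2 with its `Grade` item are assumptions made for the example.

```python
def recover(log, database):
    """Undo uncommitted transactions, then redo committed ones."""
    committed = {rec[0] for rec in log if len(rec) == 2 and rec[1] == "commit"}

    # Undo pass: scan backwards and restore old values of uncommitted updates.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _txn, item, old_value, _new_value = rec
            database[item] = old_value

    # Redo pass: scan forwards and reapply new values of committed updates.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old_value, new_value = rec
            database[item] = new_value
    return database


log = [
    ("T1", "start"),
    ("T1", "City", "Noida", "Bangalore"),  # <Tn, item, old value, new value>
    ("T1", "commit"),
    ("T2", "start"),
    ("T2", "Grade", "B", "A"),             # T2 never committed
]
# State found on disk after the crash: T1's write was not yet flushed,
# while T2's uncommitted write was.
on_disk = {"City": "Noida", "Grade": "A"}
recovered = recover(log, on_disk)
print(recovered)  # {'City': 'Bangalore', 'Grade': 'B'}
```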
Checkpoint
 The checkpoint is a mechanism where all
the previous logs are removed from the system
and permanently stored on the storage disk.
 The checkpoint is like a bookmark. During the
execution of transactions, such checkpoints
are marked, and as each transaction executes,
log records are created for its steps.
 The checkpoint declares a point before
which the DBMS was in a consistent state and
all transactions were committed.
Recovery using Checkpoint
 The recovery system reads the log file from the end to the
start, i.e. from T4 back to T1.
 The recovery system maintains two lists, a redo-list and
an undo-list.
 A transaction is put into the redo-list if the recovery
system sees log records with <Tn, Start> and <Tn, Commit>,
or just <Tn, Commit>.
 All the transactions in the redo-list are redone
from their log records.
 For example: in the log file, transactions T2 and T3
have both <Tn, Start> and <Tn, Commit>. Transaction T1
has only <Tn, Commit>, because it started before the
checkpoint and committed after it. Hence T1, T2 and T3
are put into the redo-list.
 A transaction is put into the undo-list if the
recovery system sees a log record with <Tn, Start> but
no commit or abort record. All the transactions in the
undo-list are undone, and their logs
are removed.
 For example: transaction T4 has only <Tn,
Start>. So T4 is put into the undo-list, since this
transaction had not completed when the failure occurred.
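The classification into the two lists can be sketched as follows. The log contents are an assumed reconstruction of the T1 to T4 example (the original figure is not reproduced), `classify` is a hypothetical name, and abort records are ignored for brevity. The slides scan the log backwards; since only set membership matters here, the sketch scans forwards.

```python
def classify(log_after_checkpoint):
    """Split transactions seen after the checkpoint into (redo_list, undo_list)."""
    started, committed = set(), set()
    for txn, action in log_after_checkpoint:
        if action == "start":
            started.add(txn)
        elif action == "commit":
            committed.add(txn)
    redo_list = sorted(committed)            # committed after the checkpoint
    undo_list = sorted(started - committed)  # started but never committed
    return redo_list, undo_list


log_after_checkpoint = [
    ("T1", "commit"),  # T1 started before the checkpoint
    ("T2", "start"),
    ("T2", "commit"),
    ("T3", "start"),
    ("T3", "commit"),
    ("T4", "start"),   # T4 was still running at the crash
]
redo_list, undo_list = classify(log_after_checkpoint)
print(redo_list, undo_list)  # ['T1', 'T2', 'T3'] ['T4']
```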
Shadow Paging
 In this method, all the
transactions are executed in primary
memory or on a shadow copy of the database.
 Once the transactions have completely
executed, their changes are applied to the database.
 Hence, if there is a failure in the middle of a
transaction, it will not be reflected in the
database.
 A database pointer always points to
the consistent copy of the database, and the shadow
copy is used by transactions for their updates.
 Once all the transactions are complete, the DB
pointer is modified to point to the new copy of the DB,
and the old copy is deleted.
 If there is a failure during a transaction, the
pointer still points to the old copy of the
database, and the shadow database is deleted.
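The pointer switch can be illustrated with an in-memory sketch. It copies the whole database rather than individual pages, so it shows only the idea of the atomic pointer swap; `ShadowPagedDB` and the two transfer functions are hypothetical names.

```python
import copy


class ShadowPagedDB:
    def __init__(self, data):
        self.current = data  # the "DB pointer": always a consistent copy

    def run(self, transaction):
        """Run a transaction on a shadow copy; switch the pointer only on success."""
        shadow = copy.deepcopy(self.current)
        try:
            transaction(shadow)
        except Exception:
            return False       # shadow copy is discarded, pointer unchanged
        self.current = shadow  # pointer now refers to the new copy
        return True


def transfer(db):
    db["A"] -= 50
    db["B"] += 50


def failing_transfer(db):
    db["A"] -= 50
    raise RuntimeError("simulated failure in the middle of the transaction")


db = ShadowPagedDB({"A": 100, "B": 0})
first = db.run(transfer)
second = db.run(failing_transfer)
print(first, second, db.current)  # True False {'A': 50, 'B': 50}
```

The half-finished debit made by `failing_transfer` never becomes visible, because it only touched the discarded shadow copy.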
Concurrency Control
 Concurrency is the ability of a database
to allow multiple users to access the data
at the same time.
 Under concurrency control, multiple
transactions can be executed
simultaneously.
 Interleaved execution may affect the results of the transactions.
 It is therefore highly important to control the
order in which the operations of those transactions execute.
Problems of concurrency control
 Several problems can occur when
concurrent transactions are executed in
an uncontrolled manner.
 The following are three problems in
concurrency control:
1. Lost updates
2. Dirty read
3. Unrepeatable read
1. Lost update problem
 When two transactions that access the same
database items have their operations interleaved in a
way that makes the value of some database item
incorrect, the lost update problem occurs.
 If two transactions T1 and T2 read a record and
then update it, the effect of the
first update will be overwritten by the second
update.
Example
• At time t5, the update of Transaction-X is lost because Transaction-Y
overwrites it without looking at its current value.
• This type of problem is known as the Lost Update Problem, as the update
made by one transaction is lost.
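The interleaving can be replayed step by step. The starting balance and the amounts below are assumed values for illustration, since the original figure is not reproduced.

```python
account = {"A": 300}          # assumed starting balance

x_local = account["A"]        # Transaction-X reads A (300)
y_local = account["A"]        # Transaction-Y reads A (300)
account["A"] = x_local - 50   # X writes 250
account["A"] = y_local + 100  # Y writes 400, overwriting X's update

interleaved_result = account["A"]
serial_result = 300 - 50 + 100  # what any serial order would produce

print(interleaved_result, serial_result)  # 400 350
```

Y computed its new value from a stale read, so X's withdrawal of 50 has vanished from the final balance.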
2. Dirty Read
 A dirty read occurs when one
transaction updates a database item and
then fails for some reason, and the
updated item is accessed by another
transaction before it is changed back to its
original value.
 A transaction T1 updates a record which is read
by T2. If T1 aborts, then T2 holds values
which have never formed part of the stable
database.
Example
• At time t4, Transaction-Y rolls back, so it changes A's value back to
what it was prior to t1.
• So, Transaction-X now holds a value which has never become part
of the stable database.
• This type of problem is known as the Dirty Read Problem, as one
transaction reads a dirty value which has not been committed.
3. Unrepeatable read
 The Inconsistent Retrievals Problem is also known
as unrepeatable read. It occurs when a transaction
calculates some summary function over a set of
data while other transactions are updating
the data.
 A transaction T1 reads a record and then does
some other processing, during which
transaction T2 updates the record. When
T1 reads the record again, the
new value is inconsistent with the previous
value.
 Transaction-X is computing the sum of all balances
while Transaction-Y is transferring an amount of
50 from Account-1 to Account-3.
 Here, Transaction-X produces the result 550,
which is incorrect. If this
result is written to the database, the database will be left in
an inconsistent state, because the actual sum is
600.
 Here, Transaction-X has seen an inconsistent
state of the database.
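One interleaving that yields 550 is shown below. The individual balances are assumed (the slides give only the totals 550 and 600), and the exact timing in the original figure may differ.

```python
accounts = {"Account-1": 200, "Account-2": 250, "Account-3": 150}  # assumed balances
actual_sum = sum(accounts.values())

accounts["Account-1"] -= 50            # Transaction-Y debits Account-1
sum_seen_by_x = sum(accounts.values())  # Transaction-X sums before Y's credit
accounts["Account-3"] += 50            # Transaction-Y credits Account-3

final_sum = sum(accounts.values())
print(actual_sum, sum_seen_by_x, final_sum)  # 600 550 600
```

The database is consistent before and after the transfer; only the sum taken in the middle is wrong.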
Concurrency Control Protocol
 Concurrency control protocols ensure
atomicity, isolation, and serializability of
concurrent transactions. The concurrency
control protocol can be divided into:
1. Lock based protocol
2. Time-stamp protocol
Lock-Based Protocol
 In this type of protocol, a transaction cannot
read or write a data item until it acquires an
appropriate lock on it. There are two types of
lock:
 1. Shared lock:
◦ It is also known as a read-only lock. Under a
shared lock, the data item can only be read by
the transaction.
◦ It can be shared between transactions
because a transaction holding a shared lock
cannot update the data item.
 2. Exclusive lock:
◦ Under an exclusive lock, the data item can be
both read and written by the
transaction.
◦ This lock is exclusive: only one transaction can hold it
on a data item at a time, so multiple
transactions cannot modify the same data
simultaneously.
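The compatibility rule (shared with shared is allowed, every other combination is refused) can be sketched as a small lock table. `LockTable` is a hypothetical class; a refused request simply returns `False` instead of making the transaction wait.

```python
class LockTable:
    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        """Try to lock item in mode "S" (shared) or "X" (exclusive)."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:
            # The sole holder may re-lock, or upgrade from S to X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False          # any combination involving X conflicts

    def release(self, txn, item):
        if item not in self.locks:
            return
        _mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


table = LockTable()
results = [
    table.acquire("T1", "a", "S"),  # granted
    table.acquire("T2", "a", "S"),  # granted: shared with T1
    table.acquire("T3", "a", "X"),  # refused: readers hold the item
]
table.release("T1", "a")
table.release("T2", "a")
results.append(table.acquire("T3", "a", "X"))  # granted now
results.append(table.acquire("T1", "a", "S"))  # refused: T3 holds X
print(results)  # [True, True, False, True, False]
```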
Two-phase Locking Protocol (2PL)
 The two-phase locking protocol divides the
execution of a transaction into three
parts.
 In the first part, when the execution of the
transaction starts, it seeks permission for the
locks it requires.
 In the second part, the transaction acquires all
the locks. The third part starts as soon as
the transaction releases its first lock.
 In the third part, the transaction cannot
demand any new locks. It only releases the
acquired locks.
 These parts correspond to the two phases of 2PL:
 Growing phase: in the growing phase, new
locks on data items may be acquired by the
transaction, but none can be released.
 Shrinking phase: in the shrinking phase,
existing locks held by the transaction may be
released, but no new locks can be acquired.
 If lock conversion is
allowed, then the following can happen:
◦ Upgrading of a lock (from S(a) to X(a)) is allowed in the
growing phase.
◦ Downgrading of a lock (from X(a) to S(a)) must be
done in the shrinking phase.
 The following shows how unlocking and
locking work with 2PL.
 Transaction T1:
◦ Growing phase: from step 1-3
◦ Shrinking phase: from step 5-7
◦ Lock point: at 3
 Transaction T2:
◦ Growing phase: from step 2-6
◦ Shrinking phase: from step 8-9
◦ Lock point: at 6
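Whether a transaction obeys 2PL can be checked from its sequence of lock and unlock actions alone. The sketch below uses a hypothetical `check_two_phase` function and made-up action lists, not the steps of the figure above; it returns the lock point as the zero-based index of the last lock acquired.

```python
def check_two_phase(actions):
    """actions: list of ("lock" | "unlock", item) for one transaction.

    Returns (obeys_2pl, lock_point_index).
    """
    lock_point = None
    shrinking = False
    for i, (action, _item) in enumerate(actions):
        if action == "unlock":
            shrinking = True             # first unlock starts the shrinking phase
        elif action == "lock":
            if shrinking:
                return False, lock_point  # lock after an unlock violates 2PL
            lock_point = i
    return True, lock_point


valid = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
invalid = [("lock", "A"), ("unlock", "A"), ("lock", "B")]
print(check_two_phase(valid))    # (True, 1)
print(check_two_phase(invalid))  # (False, 0)
```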
Time-stamp protocol
 The timestamp-based protocol is one of the commonly used
concurrency control protocols. This protocol uses
either the system time or a logical counter as a
timestamp.
 Every transaction has a timestamp associated with
it, and the ordering is determined by the age of the
transaction.
 An older transaction has higher priority, which is
why it executes first.
 Let's assume there are two transactions T1 and T2.
Suppose transaction T1 entered the system
at time 007 and transaction T2 entered the
system at time 009. T1 has the higher priority, so it
executes first, as it entered the system first.
Time-stamp Ordering protocol
 The timestamp-ordering protocol ensures
serializability among transactions in their
conflicting read and write operations.
 It is the responsibility of the protocol
that each conflicting pair of operations is
executed according to the timestamp values of
the transactions.
◦ The timestamp of transaction Ti is denoted as TS(Ti).
◦ The read timestamp of data item X is denoted by R_TS(X).
◦ The write timestamp of data item X is denoted by W_TS(X).
 The basic timestamp-ordering protocol
works as follows:
 1. Check the following conditions whenever a
transaction Ti issues a Read(X) operation:
◦ If W_TS(X) > TS(Ti), then the operation is rejected and Ti is rolled back.
◦ If W_TS(X) <= TS(Ti), then the operation is executed.
◦ After an executed read, R_TS(X) is set to the larger of R_TS(X) and TS(Ti).
 2. Check the following conditions whenever a
transaction Ti issues a Write(X) operation:
◦ If TS(Ti) < R_TS(X), then the operation is rejected and Ti is rolled back.
◦ If TS(Ti) < W_TS(X), then the operation is rejected and Ti
is rolled back; otherwise the operation is executed and W_TS(X) is set to TS(Ti).
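The read and write rules can be sketched as follows. `TimestampOrdering` and `Rollback` are hypothetical names, and the sketch assumes that an item nobody has accessed yet has read and write timestamps of 0.

```python
class Rollback(Exception):
    """Raised when an operation is rejected and the transaction must roll back."""


class TimestampOrdering:
    def __init__(self):
        self.r_ts = {}  # item -> largest timestamp that has read it
        self.w_ts = {}  # item -> timestamp of the last write

    def read(self, ts, item):
        if self.w_ts.get(item, 0) > ts:
            raise Rollback("read of %s by TS=%d is too late" % (item, ts))
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)

    def write(self, ts, item):
        if ts < self.r_ts.get(item, 0) or ts < self.w_ts.get(item, 0):
            raise Rollback("write of %s by TS=%d is too late" % (item, ts))
        self.w_ts[item] = ts


scheduler = TimestampOrdering()
scheduler.read(9, "X")       # T2 (TS = 9) reads X, so R_TS(X) becomes 9
try:
    scheduler.write(7, "X")  # T1 (TS = 7) writes X after a younger read
    t1_outcome = "executed"
except Rollback:
    t1_outcome = "rolled back"
scheduler.write(9, "X")      # T2 may write: 9 >= R_TS(X) and 9 >= W_TS(X)
print(t1_outcome, scheduler.r_ts["X"], scheduler.w_ts["X"])  # rolled back 9 9
```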
Deadlock
 A deadlock is a condition where two or more
transactions are waiting indefinitely for one another
to give up locks.
 Deadlock is one of the most feared
complications in a DBMS, as none of the waiting tasks ever
finishes; each remains in the waiting state forever.
 For example: transaction T1
holds a lock on some rows in the Student table and needs to update
some rows in the Grade table. Simultaneously,
transaction T2 holds locks on some rows in the
Grade table and needs to update the rows in the
Student table held by transaction T1.
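A common way to detect this situation, not covered in these slides, is a wait-for graph: record which transaction each transaction is waiting on and look for a cycle. The sketch below assumes each transaction waits on at most one other; `find_deadlock` is a hypothetical name.

```python
def find_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the one it waits on (or None).

    Returns the list of transactions forming a cycle, or None if there is none.
    """
    for start in waits_for:
        seen = []
        node = start
        while node is not None and node not in seen:
            seen.append(node)
            node = waits_for.get(node)
        if node is not None:  # we came back to a transaction already seen
            return seen[seen.index(node):]
    return None


# T1 waits for T2's lock on the Grade table; T2 waits for T1's lock on Student.
deadlocked = find_deadlock({"T1": "T2", "T2": "T1"})
no_deadlock = find_deadlock({"T1": "T2", "T2": None})
print(deadlocked, no_deadlock)  # ['T1', 'T2'] None
```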
THANK YOU
