Module 5 - Transaction
MODULE 5
Transaction Processing Concepts
5.1 Introduction to Transaction Processing
A DBMS is single-user if at most one user at a time can use the system, and it is multiuser if
many users can use the system—and hence access the database—concurrently.
Most DBMSs are multiuser (e.g., an airline reservation system).
Multiprogramming operating systems allow the computer to execute multiple programs (or
processes) at the same time (with a single CPU, the concurrent execution of processes is actually
interleaved).
If the computer has multiple hardware processors (CPUs), parallel processing of multiple
processes is possible.
A transaction is a logical unit of database processing that includes one or more database access
operations (e.g., insertion, deletion, modification, or retrieval operations). The database operations
that form a transaction can either be embedded within an application program or they can be
specified interactively via a high-level query language such as SQL. One way of specifying the
transaction boundaries is by specifying explicit begin transaction and end transaction
statements in an application program; in this case, all database access operations between the two
are considered as forming one transaction. A single application program may contain more than
one transaction if it contains several transaction boundaries. If the database operations in a
transaction do not update the database but only retrieve data, the transaction is called a read-only
transaction.
Read-only transaction - does not change the state of the database; it only retrieves data.
The basic database access operations that a transaction can include are as follows:
read_item(X): reads a database item named X into a program variable, also named X.
write_item(X): writes the value of program variable X into the database item named X.
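As a minimal sketch of these two operations, the Python below models the database as a dictionary and a transaction's program variables as a second dictionary. The names database, local_vars and transfer, as well as the stored values, are illustrative assumptions and not part of any DBMS interface.

```python
# Hypothetical in-memory "database": item name -> stored value.
database = {"X": 80, "Y": 50}

def read_item(db, local_vars, name):
    """Copy database item `name` into the program variable of the same name."""
    local_vars[name] = db[name]

def write_item(db, local_vars, name):
    """Copy the program variable `name` back into the database item."""
    db[name] = local_vars[name]

def transfer(db, amount):
    """One transaction: move `amount` from item X to item Y."""
    local_vars = {}
    read_item(db, local_vars, "X")
    local_vars["X"] -= amount
    write_item(db, local_vars, "X")
    read_item(db, local_vars, "Y")
    local_vars["Y"] += amount
    write_item(db, local_vars, "Y")

transfer(database, 5)
print(database)  # {'X': 75, 'Y': 55}
```

The transaction only touches the database through read_item and write_item; all arithmetic happens on the program variables.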
5.3 Why Concurrency Control Is Needed
The Temporary Update (or Dirty Read) Problem.
This problem occurs when one transaction updates a database item and then the transaction fails
for some reason. The updated item is accessed by another transaction before it is changed back to
its original value. Figure 19.03(b) shows an example where T1 updates item X and then fails
before completion, so the system must change X back to its original value. Before it can do so,
however, transaction T2 reads the "temporary" value of X, which will not be recorded
permanently in the database because of the failure of T1. The value of item X that is read by T2
is called dirty data, because it has been created by a transaction that has not completed and
committed yet; hence, this problem is also known as the dirty read problem.
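The interleaving described above can be replayed step by step. This is only a sketch: the dictionary database, the starting value 80 and the update of -5 are assumptions chosen for illustration, and the two transactions are simulated as straight-line code rather than real concurrent processes.

```python
# Hypothetical database with a single item X.
database = {"X": 80}

# T1 reads X and writes a temporary (uncommitted) update.
t1_old_x = database["X"]
database["X"] = t1_old_x - 5      # T1: write_item(X), not yet committed

# T2 reads X before T1 has committed: this is the dirty read.
t2_x = database["X"]

# T1 fails, so the system changes X back to its original value.
database["X"] = t1_old_x

# T2 is now holding a value that was never permanently recorded.
dirty_read_occurred = t2_x != database["X"]
print(t2_x, database["X"], dirty_read_occurred)  # 75 80 True
```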
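The Incorrect Summary Problem.
If one transaction is calculating an aggregate summary function on a number of database items
while other transactions are updating some of those items, the aggregate function may calculate
some values before they are updated and others after they are updated.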
Another problem that may occur is called unrepeatable read, where a transaction T reads an
item twice and the item is changed by another transaction T' between the two reads. Hence, T
receives different values for its two reads of the same item. This may occur, for example, if
during an airline reservation transaction, a customer is inquiring about seat availability on several
flights. When the customer decides on a particular flight, the transaction then reads the number
of seats on that flight a second time before completing the reservation.
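A small simulation of the airline scenario shows the two reads disagreeing. The flight name, seat counts and the reserve helper are hypothetical; the interleaving is written out sequentially to keep the sketch deterministic.

```python
# Hypothetical seat inventory: flight -> available seats.
seats = {"flight_A": 3}

def reserve(db, flight, count):
    """Another transaction T' that books `count` seats on `flight`."""
    db[flight] -= count

first_read = seats["flight_A"]    # T: first read while browsing flights
reserve(seats, "flight_A", 2)     # T': update between T's two reads
second_read = seats["flight_A"]   # T: second read before reserving

print(first_read, second_read)    # 3 1
```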
5.4 Why Recovery Is Needed
Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either (1) all the operations in the transaction are completed successfully and
their effect is recorded permanently in the database, or (2) the transaction has no effect
whatsoever on the database or on any other transactions. The DBMS must not permit some
operations of a transaction T to be applied to the database while other operations of T are not.
This may happen if a transaction fails after executing some of its operations but before executing
all of them.
Types of Failures
Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution:
1. A computer failure (system crash): A hardware, software, or network error occurs in the
computer system during transaction execution. Hardware crashes are usually media
failures—for example, main memory failure.
2. A transaction or system error: Some operation in the transaction may cause it to fail,
such as integer overflow or division by zero. Transaction failure may also occur because
of erroneous parameter values or because of a logical programming error. In addition,
the user may interrupt the transaction during its execution.
3. Local errors or exception conditions detected by the transaction: During transaction
execution, certain conditions may occur that necessitate cancellation of the transaction.
For example, data for the transaction may not be found. Notice that an exception
condition, such as insufficient account balance in a banking database, may cause a
transaction, such as a fund withdrawal, to be canceled. This exception should be
programmed in the transaction itself, and hence would not be considered a failure.
4. Concurrency control enforcement: The concurrency control method (see Chapter 20)
may decide to abort the transaction, to be restarted later, because it violates serializability
(see Section 19.5) or because several transactions are in a state of deadlock.
5. Disk failure: Some disk blocks may lose their data because of a read or write malfunction
or because of a disk read/write head crash. This may happen during a read or a write
operation of the transaction.
6. Physical problems and catastrophes: This refers to an endless list of problems that
includes power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes
by mistake, and mounting of a wrong tape by the operator.
Failures of types 1, 2, 3, and 4 are more common than those of types 5 or 6. Whenever a failure
of type 1 through 4 occurs, the system must keep sufficient information to recover from the
failure. Disk failure or other catastrophic failures of type 5 or 6 do not happen frequently; if they
do occur, recovery is a major task.
The concept of transaction is fundamental to many techniques for concurrency control and
recovery from failures.
5.5 Transaction and System Concepts
A transaction is an atomic unit of work that is either completed in its entirety or not done at all.
For recovery purposes, the system needs to keep track of when the transaction starts, terminates,
and commits or aborts (see below). Hence, the recovery manager keeps track of the following
operations: BEGIN_TRANSACTION, READ or WRITE, END_TRANSACTION, COMMIT_TRANSACTION, and
ROLLBACK (or ABORT).
Figure 19.04 shows a state transition diagram that describes how a transaction moves through its
execution states. A transaction goes into an active state immediately after it starts execution,
where it can issue READ and WRITE operations. When the transaction ends, it moves to the partially
committed state. At this point, some recovery protocols need to ensure that a system failure will
not result in an inability to record the changes of the transaction permanently (usually by
recording changes in the system log). Once this check is successful, the transaction is said to
have reached its commit point and enters the committed state. Once a transaction is committed,
it has concluded its execution successfully and all its changes must be recorded permanently in
the database.
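The state transitions can be sketched as a small table-driven state machine. The active, partially committed and committed states come from the description above; the failed and terminated states and all event names are assumptions based on the usual form of this diagram, which the text above does not spell out.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"            # assumed state, not described above
    TERMINATED = "terminated"    # assumed state, not described above

# Allowed transitions: (current state, event) -> next state.
# Event names are hypothetical labels for the arrows of the diagram.
TRANSITIONS = {
    (State.ACTIVE, "end_transaction"): State.PARTIALLY_COMMITTED,
    (State.ACTIVE, "abort"): State.FAILED,
    (State.PARTIALLY_COMMITTED, "commit"): State.COMMITTED,
    (State.PARTIALLY_COMMITTED, "abort"): State.FAILED,
    (State.COMMITTED, "terminate"): State.TERMINATED,
    (State.FAILED, "terminate"): State.TERMINATED,
}

def step(state, event):
    """Return the next state, or raise ValueError for an illegal move."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.value!r}")

def run(events):
    """A transaction starts in the active state and applies events in order."""
    state = State.ACTIVE
    for event in events:
        state = step(state, event)
    return state

print(run(["end_transaction", "commit"]).value)  # committed
```

Keeping the transitions in a dictionary makes illegal moves, such as committing directly from the active state, fail loudly instead of being silently accepted.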
5.6 The System Log
To be able to recover from failures that affect transactions, the system maintains a log to keep
track of all transactions that affect the values of database items.
Log records consist of the following information (T refers to a unique transaction_id):
1. [start_transaction, T]: Indicates that transaction T has started execution.
2. [write_item, T,X,old_value,new_value]: Indicates that transaction T has changed the value
of database item X from old_value to new_value.
3. [read_item, T,X]: Indicates that transaction T has read the value of database item X.
4. [commit,T]: Indicates that transaction T has completed successfully, and affirms that its
effect can be committed (recorded permanently) to the database.
5. [abort,T]: Indicates that transaction T has been aborted.
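One way such a log might be used after a crash is to undo the writes of every transaction that has no commit record, using the old_value stored in each write_item record. The sketch below assumes log records are Python tuples in the layout listed above; the function name undo_uncommitted and the sample values are hypothetical, and real recovery protocols are considerably more involved.

```python
def undo_uncommitted(database, log):
    """Restore old values written by transactions with no commit record."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Scan backward so that, for an item written several times,
    # the earliest old_value is the one restored last.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            _, _txn, item, old_value, _new_value = rec
            database[item] = old_value
    return database

# Hypothetical log at the moment of a crash: T2 committed, T1 did not.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 80, 75),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 60),
    ("commit", "T2"),
]
database = {"X": 75, "Y": 60}

undo_uncommitted(database, log)
print(database)  # {'X': 80, 'Y': 60}
```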
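Transactions should possess several properties. These are often called the ACID properties, and
they should be enforced by the concurrency control and recovery methods of the DBMS. The
following are the ACID properties:
1. Atomicity: A transaction is an atomic unit of processing; it is either performed in its
entirety or not performed at all.
2. Consistency preservation: A correct execution of the transaction must take the database from
one consistent state to another.
3. Isolation: A transaction should not make its updates visible to other transactions until it is
committed.
4. Durability (or permanency): Once a transaction changes the database and the changes are
committed, these changes must never be lost because of a subsequent failure.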
A schedule (or history) S of n transactions T1, T2, ..., Tn is an ordering of the operations of the
transactions subject to the constraint that, for each transaction Ti that participates in S, the
operations of Ti in S must appear in the same order in which they occur in Ti. Note, however,
that operations from other transactions Tj can be interleaved with the operations of Ti in S. For
now, consider the order of operations in S to be a total ordering, although it is possible
theoretically to deal with schedules whose operations form partial orders.
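Similarly, the schedule for Figure 19.03(b), which we call Sb, can be written as follows, if we
assume that transaction T1 aborted after its read_item(Y) operation. Here ri(X) and wi(X) denote
read_item(X) and write_item(X) issued by transaction Ti, and ci and ai denote the commit and
abort of Ti:
Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1;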
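Two operations in a schedule are said to conflict if they satisfy all three of the following
conditions:
1. They belong to different transactions.
2. They access the same item X.
3. At least one of the operations is a write_item(X).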
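The conflict test is easy to express in code. The sketch below uses the usual three conditions (different transactions, same item, at least one write) and assumes each operation is a (kind, transaction, item) tuple with kind "r" for read_item and "w" for write_item; this tuple layout is an illustrative choice, not notation from the text.

```python
def conflicts(op1, op2):
    """Return True if two read/write operations conflict."""
    kind1, txn1, item1 = op1
    kind2, txn2, item2 = op2
    if kind1 not in ("r", "w") or kind2 not in ("r", "w"):
        return False  # commit and abort operations never conflict
    different_transactions = txn1 != txn2
    same_item = item1 == item2
    at_least_one_write = "w" in (kind1, kind2)
    return different_transactions and same_item and at_least_one_write

print(conflicts(("r", 1, "X"), ("w", 2, "X")))  # True
print(conflicts(("r", 1, "X"), ("r", 2, "X")))  # False
```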
A schedule S of n transactions T1, T2, ..., Tn, is said to be a complete schedule if the following
conditions hold:
1. The operations in S are exactly those operations in T1, T2, ..., Tn, including a commit or abort
operation as the last operation for each transaction in the schedule.
2. For any pair of operations from the same transaction Ti, their order of appearance in S is the
same as their order of appearance in Ti.
3. For any two conflicting operations, one of the two must occur before the other in the schedule.
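These conditions can be checked with a short function. The sketch assumes that each transaction is given as a list of (kind, transaction, item) tuples ending in a commit ("c") or abort ("a"), and that the schedule is a single list, so it is a total order and condition 3 holds automatically. The function name and tuple layout are assumptions made for illustration.

```python
def is_complete_schedule(transactions, schedule):
    """transactions maps a transaction id to its own ordered operation list."""
    for txn, ops in transactions.items():
        # Each transaction must end with a commit or an abort.
        if not ops or ops[-1][0] not in ("c", "a"):
            return False
        # Conditions 1 and 2: the operations of this transaction in S are
        # exactly its own operations, in the same order.
        projected = [op for op in schedule if op[1] == txn]
        if projected != ops:
            return False
    # Condition 1: S contains no operations of other transactions.
    return all(op[1] in transactions for op in schedule)

t1 = [("r", 1, "X"), ("w", 1, "X"), ("c", 1, None)]
t2 = [("r", 2, "X"), ("c", 2, None)]
s = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"), ("c", 2, None), ("c", 1, None)]
print(is_complete_schedule({1: t1, 2: t2}, s))  # True
```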
Once a transaction T is committed, it should never be necessary to roll back T. Schedules that
theoretically meet this criterion are called recoverable schedules; those that do not are called
nonrecoverable and hence should not be permitted.
A schedule S is recoverable if no transaction T in S commits until all transactions T' that have
written an item that T reads have committed. A transaction T reads from transaction T' in a
schedule S if some item X is first written by T' and later read by T. In addition, T' should not
have been aborted before T reads item X, and there should be no transactions that write X after
T' writes it and before T reads it (unless those transactions, if any, have aborted before T
reads X).
Consider a schedule in which T1 writes item X, T2 then reads X and commits (operation c2), and
T1 has not yet committed or aborted.
This schedule is not recoverable, because T2 reads item X from T1, and then T2 commits before T1
commits. If T1 aborts after the c2 operation, then the value of X that T2 read is no longer valid
and T2 must be aborted after it had been committed, leading to a schedule that is not recoverable.
For the schedule to be recoverable, the c2 operation must be postponed until
after T1 commits. If T1 aborts instead of committing, then T2 should also abort as shown in Se,
because the value of X it read is no longer valid.
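The recoverability condition can be checked mechanically. The sketch below assumes a schedule is a list of (kind, transaction, item) tuples with kinds "r", "w", "c" (commit) and "a" (abort). The function names and the two sample schedules are hypothetical; the schedules only illustrate the situation discussed above, in which T2 reads X from T1.

```python
def reads_from(schedule):
    """Return the set of (reader, writer) pairs in the schedule."""
    pairs = set()
    for i, (kind, txn, item) in enumerate(schedule):
        if kind != "r":
            continue
        # Writers that aborted before this read do not count.
        aborted = {t for k, t, _ in schedule[:i] if k == "a"}
        # Find the latest surviving write of the item before this read.
        for k, t, x in reversed(schedule[:i]):
            if k == "w" and x == item and t not in aborted:
                if t != txn:
                    pairs.add((txn, t))
                break
    return pairs

def is_recoverable(schedule):
    """No reader may commit before every transaction it read from commits."""
    commit_pos = {t: i for i, (k, t, _) in enumerate(schedule) if k == "c"}
    for reader, writer in reads_from(schedule):
        if reader in commit_pos:
            if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
                return False
    return True

# T2 commits before T1, which later aborts: not recoverable.
early_commit = [("r", 1, "X"), ("w", 1, "X"), ("r", 2, "X"), ("r", 1, "Y"),
                ("w", 2, "X"), ("c", 2, None), ("a", 1, None)]
# Same reads and writes, but c2 is postponed until after c1.
postponed_commit = [("r", 1, "X"), ("w", 1, "X"), ("r", 2, "X"), ("r", 1, "Y"),
                    ("w", 2, "X"), ("w", 1, "Y"), ("c", 1, None), ("c", 2, None)]

print(is_recoverable(early_commit), is_recoverable(postponed_commit))  # False True
```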
Serializability of Schedules
If no interleaving of operations is permitted, there are only two possible arrangements for
transactions T1 and T2.
1. Execute all the operations of T1 (in sequence) followed by all the operations of T2 (in
sequence).
2. Execute all the operations of T2 (in sequence) followed by all the operations of T1 (in
sequence).
A schedule S is serial if, for every transaction T participating in the schedule, all the
operations of T are executed consecutively in the schedule.
A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same
n transactions.
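Both definitions can be tested in code. The text does not say which notion of equivalence is meant, so the sketch below assumes conflict equivalence and uses a precedence graph: an edge Ti -> Tj is added whenever an operation of Ti conflicts with a later operation of Tj, and the schedule is treated as serializable when the graph has no cycle. Schedules are lists of (kind, transaction, item) tuples; all names are illustrative.

```python
from graphlib import TopologicalSorter, CycleError

def is_serial(schedule):
    """True if every transaction's operations are consecutive."""
    finished, current = set(), None
    for _kind, txn, _item in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn resumes after another one ran
            if current is not None:
                finished.add(current)
            current = txn
    return True

def precedence_edges(schedule):
    """Edges (Ti, Tj) for each conflicting pair with Ti's operation first."""
    edges = set()
    for i, (k1, t1, x1) in enumerate(schedule):
        for k2, t2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 is not None and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic."""
    sorter = TopologicalSorter()
    for txn in {t for _, t, _ in schedule}:
        sorter.add(txn)
    for before, after in precedence_edges(schedule):
        sorter.add(after, before)
    try:
        sorter.prepare()  # raises CycleError if the graph has a cycle
    except CycleError:
        return False
    return True

interleaved = [("r", 1, "X"), ("w", 1, "X"), ("r", 2, "X"),
               ("w", 2, "X"), ("r", 1, "Y"), ("w", 1, "Y")]
lost_update = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"), ("w", 2, "X")]

print(is_serial(interleaved), is_conflict_serializable(interleaved))  # False True
print(is_conflict_serializable(lost_update))                          # False
```

The first schedule is not serial, yet it is equivalent to running T1 and then T2; the second has conflicts in both directions, so no serial order matches it.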
5.11 Transaction Support in SQL
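EXEC SQL WHENEVER SQLERROR GOTO UNDO;   /* on any SQL error, jump to the rollback */
EXEC SQL INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
    VALUES ('Jabbar', 'Ahmad', '998877665', 2, 44000);
EXEC SQL UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1 WHERE DNO = 2;
EXEC SQL COMMIT;                        /* both statements succeeded */
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;                /* undo the partial transaction */
THE_END: . . . ;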
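The same commit-or-rollback pattern can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table layout, the function name insert_and_raise and the fail flag used to simulate an error are all hypothetical, and SQLite stands in for whatever DBMS the embedded SQL above targets.

```python
import sqlite3

def insert_and_raise(conn, fail=False):
    """Insert one employee and give department 2 a 10% raise, atomically."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY) "
            "VALUES (?, ?, ?, ?, ?)",
            ("Jabbar", "Ahmad", "998877665", 2, 44000),
        )
        if fail:
            # Simulated error between the two statements.
            raise sqlite3.OperationalError("simulated failure")
        conn.execute("UPDATE EMPLOYEE SET SALARY = SALARY * 1.1 WHERE DNO = 2")
        conn.commit()      # corresponds to EXEC SQL COMMIT
        return True
    except sqlite3.Error:
        conn.rollback()    # corresponds to the UNDO label
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT PRIMARY KEY, "
    "DNO INTEGER, SALARY REAL)"
)
conn.commit()

print(insert_and_raise(conn, fail=True))                             # False
print(conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0])   # 0
```

After the simulated failure the table is still empty, which shows that the insert did not survive on its own.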
Questions
1. Write short notes on:
i. Two-phase locking (2PL)
ii. Deadlock in two-phase locking