GNR-18 DBMS Unit-5


UNIT-V

SYLLABUS: Recovery System: Failure Classification, Storage, Recovery and Atomicity,
Recovery Algorithm, Failure with Loss of Non-volatile Storage, Remote Backup Systems.
Indexing: Ordered Indices, B+ Tree Index Files.

Failure Classification
Failure Types
• Transaction failure: two types
  – Logical errors: the transaction cannot complete due to some
    internal error condition, e.g., bad input, data not found,
    overflow, or a resource limit exceeded.
  – System errors: the database system has entered an
    undesirable state (e.g., deadlock), as a result of which a
    transaction cannot continue.
• System crash: a power failure or other hardware or software
  failure (e.g., a bug in the database software or the operating
  system) causes the system to crash.
  – Fail-stop assumption: non-volatile storage contents are
    assumed not to be corrupted by a system crash.
• Disk failure: a disk block loses its content as a result of either a head
  crash or a failure during a data-transfer operation. Copies on other
  media, such as DVDs or tapes, are used to recover from the failure.

Recovery algorithms have two parts:

1. Actions taken during normal transaction processing to ensure that
   enough information exists to allow recovery from failures.
2. Actions taken after a failure to recover the database contents to a
   state that ensures database consistency, transaction atomicity, and
   durability.

Storage Structure
• Volatile storage:
  – does not survive system crashes
  – examples: main memory, cache memory
• Nonvolatile storage:
  – survives system crashes
  – examples: disk, tape, flash memory,
    non-volatile (battery-backed) RAM
  – but may still fail, losing data
• Stable storage:
  – maintained as multiple copies on distinct
    nonvolatile media
1. Stable-Storage Implementation:
• To implement stable storage, we need to replicate the needed
  information on several nonvolatile storage media with independent
  failure modes, and to update the information in a controlled manner
  to ensure that a failure during data transfer does not damage the
  needed information.
• RAID: Redundant Array of Independent Disks.
• Mirrored disks: two copies of each block on separate disks. Mirroring
  alone does not protect against disasters.
• Copies can be kept at remote sites to protect against disasters such as
  fire or flooding.
• A failure during data transfer can still result in inconsistent copies. A
  block transfer between memory and disk storage can result in:
  – Successful completion: the transferred information arrived
    safely at its destination.
  – Partial failure: a failure occurred in the midst of the transfer, and
    the destination block has incorrect information.
  – Total failure: the failure occurred early enough during the transfer
    that the destination block was never updated.
Execute an output operation as follows (assuming two copies of each block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
   information onto the second physical block.
3. The output is completed only after the second write successfully
   completes.
If the system fails while a block is being written, the two copies of the block
may be inconsistent with each other.
During recovery, the system needs to examine the two copies of each block:
• If both copies are the same and no detectable error exists, then no action is needed.
• If the system detects an error in one copy, it replaces its contents with the contents
  of the other copy.
• If both copies contain no detectable error but they differ in content, the
  system replaces the content of the first copy with the content of the second.

Disadvantage: comparing every pair of blocks during recovery is expensive.

Solution: keep track of block writes that are in progress, using a small amount of
nonvolatile RAM, so that only those blocks need to be compared during recovery.
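As a concrete illustration, here is a minimal Python sketch of the two-copy output protocol and the recovery comparison, modeling the two disks as dictionaries; the helper names are illustrative, not a real device API, and a detectable error is modeled simply as a missing entry.

```python
# Minimal sketch of the two-copy output protocol on two in-memory "disks"
# (plain dicts). write_block/read_block stand in for real device I/O.

def write_block(disk, addr, data):
    disk[addr] = data

def read_block(disk, addr):
    return disk.get(addr)           # None models a detectable error

def output_block(disk1, disk2, addr, data):
    write_block(disk1, addr, data)  # 1. write the first physical block
    # 2. only after the first write completes, write the second copy
    write_block(disk2, addr, data)
    # 3. the output is complete only after the second write completes

def recover_block(disk1, disk2, addr):
    c1, c2 = read_block(disk1, addr), read_block(disk2, addr)
    if c1 == c2 and c1 is not None:
        return                      # both copies agree, no error: no action
    if c1 is None:                  # error in the first copy: take the second
        write_block(disk1, addr, c2)
    elif c2 is None:                # error in the second copy: take the first
        write_block(disk2, addr, c1)
    else:                           # no errors, but contents differ:
        write_block(disk1, addr, c2)  # replace the first with the second
```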
2. Data Access:
The database is partitioned into fixed-length storage units called blocks. The
input and output operations are done in block units.
• Physical blocks are the blocks residing on the disk.
• Buffer blocks are the blocks residing temporarily in main memory.
• Disk buffer: the area of memory where blocks reside temporarily.
Block movements between disk and main memory are initiated through the
following two operations:
  – input(B) transfers the physical block B to main memory.
  – output(B) transfers the buffer block B to the disk, and replaces the
    appropriate physical block there.

Each transaction Ti has its own private work area in which local copies of all data
items accessed and updated by Ti are kept.
  – Ti's local copy of a data item X is called xi.
Data is transferred by two operations:
1. read(X) assigns the value of data item X to the local variable xi.
   a) If the block BX on which X resides is not in main memory, it issues
      input(BX).
   b) It assigns to xi the value of X from the buffer block.
2. write(X) assigns the value of the local variable xi to data item X in the buffer
   block.
   a) If the block BX on which X resides is not in main memory, it
      issues input(BX).
   b) It assigns the value of xi to X in buffer BX.
output(BX) may take place later, because BX may contain other data items.
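The following sketch models these operations in Python under simplifying assumptions: a single block B1, dictionaries standing in for the disk, the buffer, and the transaction's work area, and a hypothetical block_of mapping from data items to blocks.

```python
# Sketch of buffer-based read(X)/write(X). All names here are illustrative.

disk   = {"B1": {"X": 100, "Y": 7}}   # physical blocks on disk
buffer = {}                           # disk buffer in main memory
local  = {}                           # Ti's private work area (the xi values)

def block_of(x):
    return "B1"                       # assumed mapping: data item -> block id

def input_block(b):                   # input(B): disk -> main memory
    buffer[b] = dict(disk[b])

def output_block(b):                  # output(B): buffer -> disk
    disk[b] = dict(buffer[b])

def read(x):                          # read(X): assigns X's value to xi
    b = block_of(x)
    if b not in buffer:
        input_block(b)                # (a) bring B_X in if not resident
    local[x] = buffer[b][x]           # (b) copy X from the buffer block

def write(x):                         # write(X): assigns xi to X in the buffer
    b = block_of(x)
    if b not in buffer:
        input_block(b)                # (a) bring B_X in if not resident
    buffer[b][x] = local[x]           # (b) update X in buffer block B_X
    # output(B_X) may happen later, since B_X can hold other data items
```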

Recovery and Atomicity:

1. Log Records: The log is a sequence of log records, recording all the
update activities in the database.
There are several types of log records:
<Ti, Xj, V1, V2> An update log record, containing:
• Transaction identifier: the unique identifier of the transaction
  that performed the write operation.
• Data-item identifier: the unique identifier of the data item
  written.
• Old value: the value of the data item prior to the write.
• New value: the value that the data item will have after the
  write.
<Ti start> Transaction Ti has started.
<Ti commit> Transaction Ti has committed.
<Ti abort> Transaction Ti has aborted.
• Whenever a transaction performs a write, the log record for that write must
  be created and added to the log before the database is modified.
• Because log records are needed for recovery from system and disk failures, the
  log must reside in stable storage.
• Every log record is written to the end of the log on stable storage.
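A minimal way to represent these record types, reused by the later sketches in this unit, is as Python tuples appended to a list; the format is illustrative, since real systems use compact binary records on stable storage.

```python
# Tuple-format log records matching the four record types above.

log = []

def log_start(ti):
    log.append(("start", ti))                 # <Ti start>

def log_update(ti, xj, old, new):
    log.append(("update", ti, xj, old, new))  # <Ti, Xj, V1, V2>

def log_commit(ti):
    log.append(("commit", ti))                # <Ti commit>

def log_abort(ti):
    log.append(("abort", ti))                 # <Ti abort>
```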
2. Database Modification:
Steps a transaction takes in modifying a data item:
1. The transaction performs some computations in its own private part of
   main memory.
2. The transaction modifies the data block in the disk buffer in main
   memory holding the data item.
3. The database system executes the output operation that writes the data
   block to disk.

Deferred-Modification Technique
A transaction does not modify the database until it has committed.
Overhead: transactions need to make local copies of all updated data items.
• Further, if a transaction reads a data item that it has updated, it
  must read the value from its local copy.

Immediate-Modification Technique
Database modifications occur while the transaction is still active.
The recovery algorithm described here supports the immediate-modification
technique; the system performs undo and redo operations as appropriate.
Transaction commit: a transaction has committed when its commit log
record, which is the last log record of the transaction, has been output to stable storage.
Using the Log to Redo and Undo Transactions

The recovery scheme uses two recovery procedures:

1. redo(Ti) sets the value of all data items updated by transaction Ti to the new
   values.
2. undo(Ti) restores the value of all data items updated by transaction Ti to
   the old values. It also writes log records to record the updates performed as
   part of the undo process. These log records are special redo-only log
   records, since they do not need to contain the old value of the updated
   data item.

After a system crash, the system must decide which transactions need to be redone
and which need to be undone to ensure atomicity:
• Transaction Ti needs to be undone if the log contains the record
  <Ti start>, but does not contain either the record <Ti commit> or the
  record <Ti abort>.
• Transaction Ti needs to be redone if the log contains the record
  <Ti start> and either the record <Ti commit> or the record <Ti abort>.

Example (three possible states of the log at the time of the crash):
if only T0's start and update records appear, recovery performs undo(T0);
if T0 has committed and T1 has started but not finished, redo(T0) and undo(T1);
if both have committed, redo(T0) and redo(T1).
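Applying the two rules to the tuple-format log from the previous sketch might look like this:

```python
# Sketch: classify transactions for recovery using the two rules above.

def classify(log):
    started, finished = set(), set()
    for rec in log:
        if rec[0] == "start":
            started.add(rec[1])
        elif rec[0] in ("commit", "abort"):
            finished.add(rec[1])
    undo = started - finished     # <Ti start> but no <Ti commit>/<Ti abort>
    redo = started & finished     # <Ti start> plus <Ti commit> or <Ti abort>
    return undo, redo

# e.g. for log = [("start", "T0"), ("update", "T0", "B", 2000, 2050),
#                 ("commit", "T0"), ("start", "T1"),
#                 ("update", "T1", "C", 700, 600)]
# classify(log) returns ({"T1"}, {"T0"}): undo T1, redo T0.
```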


3. Checkpoints:

Without checkpoints, the system must search the entire log to determine which
transactions need to be undone or redone. There are two difficulties with this approach:
1. The search process is time-consuming.
2. It unnecessarily redoes transactions that have already output their
   updates to the database, so recovery takes longer.
A checkpoint is performed as follows:
1. Output all log records currently residing in main memory onto
   stable storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record <checkpoint L> onto stable storage, where L is
   the list of all transactions active at the time of the checkpoint.
• All updates are stopped while the checkpoint is being taken.
• After a system crash has occurred, the system examines the log to find the
  last <checkpoint L> record by scanning the log backward.
• The redo or undo operations need to be applied only to the transactions in L
  and the transactions that started execution after the checkpoint; call this set T.
• For each transaction Tk in T that has no <Tk commit> record and no <Tk abort>
  record in the log, execute undo(Tk).
• For each transaction Tk in T such that either the record <Tk commit> or the
  record <Tk abort> appears in the log, execute redo(Tk).
Example:
Consider the set of transactions {T0, T1, . . . , T100}.
Suppose the most recent checkpoint took place during the execution of
transactions T67 and T69, while T68 and all transactions with subscripts lower
than 67 completed before the checkpoint. Thus, only transactions T67, T69, . . .,
T100 need to be considered during the recovery scheme.
A fuzzy checkpoint is a checkpoint where transactions are allowed to perform
updates even while buffer blocks are being written out.
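A sketch of the three checkpoint steps over the same tuple-format log; flush_log and flush_buffers are assumed stand-ins for forcing the log and the modified buffer blocks to stable storage and disk.

```python
# Sketch of a (non-fuzzy) checkpoint: updates are assumed stopped while
# these three steps run.

def take_checkpoint(log, active_transactions, flush_log, flush_buffers):
    flush_log()                      # 1. in-memory log records -> stable storage
    flush_buffers()                  # 2. modified buffer blocks -> disk
    # 3. write <checkpoint L>, where L lists the currently active transactions
    log.append(("checkpoint", sorted(active_transactions)))
    flush_log()                      # the checkpoint record must also reach stable storage
```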
Recovery Algorithm
1. Transaction Rollback
Consider transaction rollback during normal operation (no system crash):
1. The log is scanned backward, and for each log record of Ti of the form
   <Ti, Xj, V1, V2> that is found:
   a) The value V1 is written to data item Xj, and
   b) A special redo-only log record <Ti, Xj, V1> is written to the
      log.
   These log records are sometimes called compensation log records.
2. Once the log record <Ti start> is found, the backward scan is stopped,
   and a log record <Ti abort> is written to the log.
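Continuing the tuple-format log, a sketch of this rollback procedure; db is a dict standing in for the buffered database state.

```python
# Sketch of transaction rollback during normal operation.

def rollback(ti, log, db):
    for rec in reversed(list(log)):            # 1. scan the log backward
        if rec[0] == "update" and rec[1] == ti:
            _, _, xj, v1, _ = rec
            db[xj] = v1                        # a) write old value V1 to Xj
            log.append(("redo-only", ti, xj, v1))  # b) compensation log record
        elif rec == ("start", ti):
            break                              # 2. stop at <Ti start>
    log.append(("abort", ti))                  # ... and write <Ti abort>
```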

2. Recovery After a System Crash

• Recovery actions, when the database system is restarted after a crash,
  take place in two phases.
In the redo phase, the system replays the updates of all transactions by scanning
the log forward from the last checkpoint.
The steps taken while scanning the log are as follows:
a. The list of transactions to be rolled back, undo-list, is initially set to the list L
   in the <checkpoint L> log record.
b. Whenever a record <Ti, Xj, V1, V2> or a redo-only record <Ti, Xj, V2> is
   found, redo it by writing V2 to Xj.
c. Whenever a log record <Ti start> is found, add Ti to undo-list.
d. Whenever a log record <Ti commit> or <Ti abort> is found, remove Ti from
   undo-list.
At the end of the redo phase, undo-list contains the list of all transactions that
are incomplete.
In the undo phase, the system rolls back all transactions in the undo-list by
scanning the log backward from the end.
a. Whenever it finds a log record belonging to a transaction in the
   undo-list, it performs the corresponding undo action.
b. When the system finds a <Ti start> log record for a transaction Ti in undo-
   list, it writes a <Ti abort> log record to the log, and removes Ti from undo-
   list.
c. The undo phase terminates once undo-list becomes empty.
After the undo phase of recovery terminates, normal transaction processing
can resume.
Example:
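Here is a runnable sketch of the two phases over the tuple-format log used in the earlier sketches; db is again a dict standing in for the database state.

```python
# Sketch of crash recovery: a forward redo pass from the last
# <checkpoint L>, then a backward undo pass writing compensation records.

def recover(log, db):
    # Find the last <checkpoint L> record by scanning the log backward.
    start, undo_list = 0, set()
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "checkpoint":
            start, undo_list = i, set(log[i][1])   # a. undo-list := L
            break
    # Redo phase: replay all updates forward from the checkpoint.
    for rec in log[start:]:
        if rec[0] in ("update", "redo-only"):
            db[rec[2]] = rec[-1]                   # b. write V2 (or the redo-only value)
        elif rec[0] == "start":
            undo_list.add(rec[1])                  # c. add Ti to undo-list
        elif rec[0] in ("commit", "abort"):
            undo_list.discard(rec[1])              # d. remove Ti from undo-list
    # Undo phase: roll back incomplete transactions, scanning backward.
    for rec in reversed(list(log)):
        if not undo_list:
            break                                  # c. stop once undo-list is empty
        if rec[0] == "update" and rec[1] in undo_list:
            _, ti, xj, v1, _ = rec
            db[xj] = v1                            # a. undo: restore the old value
            log.append(("redo-only", ti, xj, v1))
        elif rec[0] == "start" and rec[1] in undo_list:
            log.append(("abort", rec[1]))          # b. write <Ti abort>
            undo_list.discard(rec[1])
```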

Failure with Loss of Non-Volatile Storage

The basic technique is to dump the entire contents of the database to stable
storage periodically, say once per day.
Example: dump the database to one or more magnetic tapes.
One simple approach to database dumping requires that no transaction be active
during the dump procedure:
1. Output all log records currently residing in main memory onto stable
   storage.
2. Output all buffer blocks onto the disk.
3. Copy the contents of the database to stable storage.
4. Output a log record <dump> onto the stable storage.

• A dump of the database contents is also referred to as an archival dump.

• Most database systems also support an SQL dump, which writes out SQL
  DDL statements and SQL insert statements to a file.
• Fuzzy dump schemes have been developed that allow transactions to be
  active while the dump is in progress.
Remote Backup Systems
• High availability is achieved by performing transaction processing at one
  site, called the primary site, and having a remote backup site (secondary
  site) where all the data from the primary site are replicated.
• The remote site must be kept synchronized with the primary site as
  updates are performed at the primary. Synchronization is achieved by
  sending all log records from the primary site to the remote backup site.

Figure: Architecture of a remote backup system.

Several issues must be addressed:
• Detection of failure: failure of communication lines can fool the remote
  backup into believing that the primary has failed.
  To avoid this problem, we maintain several communication links, with
  independent modes of failure, between the primary and the remote
  backup. A modem connection over a telephone line may be used for manual
  intervention by operators, who can communicate over the telephone
  system.
• Transfer of control: when the primary fails, the backup site takes over
  processing and becomes the new primary.
  When the original primary site recovers, it can either play the role of
  remote backup, or take over the role of primary site again. In either
  case, the old primary must receive a log of the updates carried out by the
  backup site while the old primary was down.
• Time to recover: if the log at the remote backup grows large, recovery
  will take a long time. The remote backup site can periodically process
  the redo log records that it has received and perform a checkpoint.
  A hot-spare configuration can make takeover by the backup site almost
  instantaneous.
• Time to commit:
  – One-safe: a transaction commits as soon as its commit log record is
    written to stable storage at the primary.
    Problem: updates may not arrive at the backup before it takes over.
  – Two-very-safe: a transaction commits when its commit log record is written
    to stable storage at both the primary and the backup.
    This reduces availability, since transactions cannot commit if either site
    fails.
  – Two-safe: same as two-very-safe if both primary and backup are
    active. If only the primary is active, the transaction commits as soon as its
    commit log record is written to stable storage at the primary.
    Better availability than two-very-safe; avoids the problem of lost
    transactions in one-safe.

INDEXING

Ordered Indices:
• An index for a file in a database system works like the index of a
  textbook, and, like a book index, it is much smaller than the file itself.
• It helps to retrieve information or records from files.
  Ex: retrieve a student record for a given ID.
Two basic kinds of indices:
1. Ordered indices: based on a sorted ordering of the values.
2. Hash indices: based on a uniform distribution of values across a range of
   buckets. The bucket to which a value is assigned is determined by a
   function, called a hash function.
Each technique is evaluated on the basis of the following factors:
• Access types: the access types supported efficiently, e.g.,
  – finding records with a specified attribute value;
  – finding records whose attribute values fall in a specified range.
• Access time: the time it takes to find a particular data item, or set of items.
• Insertion time: the time it takes to insert a new data item.
  – Time to find the correct place to insert the new data item + time it takes
    to update the index structure.
• Deletion time: the time it takes to delete a data item.
  – Time to find the item to be deleted + time it takes to update the index
    structure.
• Space overhead: the additional space occupied by the index; this space is
  sacrificed to achieve improved performance, and is usually moderate.
• An attribute or set of attributes used to look up records in a file is called
  a search key.
• A file may have more than one index; if there are several indices on a file, there
  are several search keys.
• Example: search for a book by author, by subject, or by title.
Ordered Indices:
• Primary index or clustering index: the search key defines an order that is the
  same as the sequential order of the file.
  – The search key of a primary index is usually, but not necessarily, the
    primary key.
• Secondary index or nonclustering index: the search key specifies an order
  different from the sequential order of the file.
• Files that are ordered sequentially on some search key, with a clustering
  index on that search key, are called index-sequential files.

Dense and Sparse Indices

• An index entry, or index record, consists of a search-key value and
  pointers to one or more records with that value as their search-key
  value.
  Pointer = identifier of a disk block + offset within the disk block.
Two types of ordered indices:
• Dense index: an index entry appears for every search-key value in
  the file.
  Example: the search key is the instructor ID.
  Ex: for instructor ID "22222", the dense-index entry points directly to
  the desired record.
• Dense clustering index: the index record contains the search-key
  value and a pointer to the first data record with that search-key value.
  The rest of the records with the same search-key value are stored
  sequentially after the first record.
  Ex: the file is ordered by dept name and the search key is dept name.
• Dense nonclustering index: the index must store a list of pointers to
  all records with the same search-key value.

Figure: dense clustering index for the instructor file, with the search key being
dept name.

Sparse Index:
An index entry appears for only some of the search-key values.
• It can be used only if the relation is stored in sorted order of the search
  key.
• To locate a record, find the index entry with the largest search-key value
  less than or equal to the desired search-key value, then scan the file
  sequentially from the record it points to.
  Ex: to find record 22222, start from the entry for 10101, since 10101 ≤ 22222.
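A sketch of this lookup rule, with an illustrative three-entry sparse index:

```python
import bisect

# Sparse-index lookup: find the entry with the largest search-key value
# <= the desired key, then scan the file sequentially from the block it
# points to. The entries below are illustrative (search key, block no.).

sparse_index = [(10101, 0), (32343, 1), (76766, 2)]

def lookup(key):
    keys = [k for k, _ in sparse_index]
    i = bisect.bisect_right(keys, key) - 1   # largest entry key <= key
    if i < 0:
        return None                          # key precedes the whole file
    return sparse_index[i][1]                # start the sequential scan here

print(lookup(22222))   # -> 0: scan from block 0, the block of entry 10101
```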
Multilevel Indices
• If the primary index does not fit in memory, access becomes expensive; moreover,
  main memory is also required for a number of other tasks.
• Solution: treat the primary index kept on disk as a sequential file and
  construct a sparse index on it.
  outer index – a sparse index on the primary index
  inner index – the primary index file
• To locate a record, use binary search on the outer index to find the
  entry with the largest search-key value less than or equal to the one that
  we desire; then search the inner index block similarly.
• If even the outer index is too large to fit in main memory, yet another level
  of index can be created, and so on.
• Indices with two or more levels are called multilevel indices.
• Indices at all levels must be updated on insertion into or deletion from the
  file.
Index Update:
• Every index must be updated whenever a record is either inserted into
  or deleted from the file.
• Update = deletion of the old record, followed by insertion of the new
  value of the record.
Updating single-level indices:
Insertion:
1. If the search-key value does not appear in the index, the system inserts
   an index entry with the search-key value into the index at the appropriate
   position.
2. Otherwise, the following actions are taken:
   a) If the index entry stores pointers to all records with the same search-
      key value, the system adds a pointer to the new record in the index
      entry.
   b) Otherwise, if the index entry stores a pointer to only the first record with
      the search-key value, the new record is inserted after the other records with
      the same search-key value.

• Sparse indices: if the index stores an entry for each block of the file, no
  change needs to be made to the index unless a new block is created.
• If a new block is created, the first search-key value appearing in the new
  block is inserted into the index.
Deletion:
Sparse indices:

1. If the index does not contain an index entry with the search-key value of
   the deleted record, nothing needs to be done to the index.
2. Otherwise, the system takes the following actions:
   a. If the deleted record was the only record with its search key, the
      system replaces the corresponding index record with an index record
      for the next search-key value. If the next search-key value already has
      an index entry, the entry is deleted instead of being replaced.
   b. Otherwise, if the index entry for the search-key value points to the
      record being deleted, the system updates the index entry to point
      to the next record with the same search-key value.
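For the dense nonclustering case (case 2a of insertion above), single-level index maintenance can be sketched with a dict from search-key value to a list of record pointers; the layout is illustrative.

```python
# Sketch of maintenance for a dense nonclustering index: each entry stores
# pointers to all records with that search-key value.

index = {}

def index_insert(key, rid):
    if key not in index:
        index[key] = [rid]        # 1. new search-key value: create an index entry
    else:
        index[key].append(rid)    # 2a. add a pointer to the existing entry

def index_delete(key, rid):
    index[key].remove(rid)        # drop the pointer to the deleted record
    if not index[key]:
        del index[key]            # last record with this key: remove the entry
```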
Secondary Indices:
• A secondary index on a candidate key looks like a dense index, except
  that the records pointed to by successive index entries are not stored
  sequentially in the file.
• If the search key of a secondary index is not a candidate key, it is not
  enough to point to the first record with each search-key value, because the
  remaining records with the same search-key value could be anywhere in
  the file.
• A secondary index must therefore contain pointers to all the records.
• We can use an extra level of indirection to implement secondary indices on
  search keys that are not candidate keys.
• If we scan the file sequentially in secondary-key order, the reading of each
  record is likely to require the reading of a new block from disk, which is
  very slow.
• Secondary indices also impose a significant overhead on modification of the
  database.

Figure: a secondary index, using an extra level of indirection, on the instructor file,
on the search key salary.
B+ Trees: A DYNAMIC INDEX STRUCTURE
• It is a dynamic index structure, because the tree grows and shrinks
  dynamically.
• It is a balanced tree in which the internal nodes contain index entries
  (which direct the search) and the leaf nodes contain the data entries (the
  sequence set).
• To retrieve all leaf pages efficiently, we link them using page
  pointers. By organizing them into a doubly linked list, we can easily
  traverse the sequence of leaf pages (sometimes called the sequence set) in
  either direction.
Properties of a B+ tree:
• Operations (insert, delete) on the tree keep it balanced.
• A minimum occupancy of 50 percent is guaranteed for each node except
  the root.
• Searching for a record requires just a traversal from the root to the
  appropriate leaf.
• Every node contains m entries, where d ≤ m ≤ 2d. The value d is a
  parameter of the B+ tree, called the order of the tree, and is a measure of
  the capacity of a tree node.
• The root node is the only exception to this requirement on the number of
  entries; for the root it is simply required that 1 ≤ m ≤ 2d.
• On average, B+ tree nodes are about 67 percent full.
• Insertion and deletion operations are efficient.

FORMAT OF A NODE:

• A node contains n − 1 search-key values K1, K2, . . . , Kn−1, and n pointers P1, P2, . . . , Pn.
• Search-key values within a node are kept in sorted order: if i < j, then Ki < Kj;
  that is, K1 < K2 < K3 < . . . < Kn−1.

SEARCH Operation: finds the leaf node in which a given data entry belongs.
Consider the B+ tree in the figure below, of order d = 2: each node contains between 2
and 4 entries.
Data entries are denoted as K*.

Example:
To search for entry 5*, we follow the left-most child pointer, since 5 < 13.
To find 24*, we follow the fourth child pointer, since 24 ≤ 24 < 30.
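A sketch of the search procedure, using an illustrative dict-based node layout (not a fixed standard): internal nodes hold keys and children, leaves hold keys and data entries.

```python
import bisect

# B+ tree search: internal nodes are {"keys": [...], "children": [...]},
# leaves are {"leaf": True, "keys": [...], "data": [...]}.

def bplus_search(node, key):
    while not node.get("leaf"):
        # child i holds keys k with keys[i-1] <= k < keys[i]; for key 24 and
        # root keys [13, 17, 24, 30] this takes child 3, the fourth pointer.
        i = bisect.bisect_right(node["keys"], key)
        node = node["children"][i]
    for k, entry in zip(node["keys"], node["data"]):   # scan within the leaf
        if k == key:
            return entry
    return None
```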
INSERT Operation:
• Insertion takes an entry, finds the leaf node where it belongs, and inserts
  it there.
• If the node is full, it must be split. When the node is split, an entry
  pointing to the node created by the split must be inserted into its
  parent. This entry is pointed to by the pointer variable newchildentry.
• If the root is split, a new root node is created and the height of the tree
  increases by one.
Example:
Leaf-level splits: when a leaf node is split, the middle data entry is copied up to the
parent node.
Index-level splits: when a non-leaf node is split, the middle index entry is pushed up
to the parent node.
Redistribution: entries of a node N can be redistributed with a sibling instead of
splitting the node. Redistribution is used when there is space in a sibling node for
the insertion. A sketch of the two kinds of splits appears below.
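Here is a sketch of the two kinds of splits for a B+ tree of order d; the in-place list manipulation is illustrative. Each function is called on a node that has temporarily overflowed to 2d + 1 keys, trims it in place, and returns the key to send to the parent together with the new right sibling.

```python
# On a leaf split the middle key is COPIED up (it must also stay in the
# sequence set); on an index split it is PUSHED up (internal keys are
# only guideposts, so it is removed from the node).

def split_leaf(keys, data, d):
    new_keys, new_data = keys[d:], data[d:]    # right sibling: d + 1 entries
    del keys[d:]; del data[d:]                 # left leaf keeps d entries
    return new_keys[0], (new_keys, new_data)   # copy up the sibling's first key

def split_index(keys, children, d):
    up = keys[d]                               # middle key is pushed up
    new_keys, new_children = keys[d + 1:], children[d + 1:]
    del keys[d:]                               # left node keeps keys[0 .. d-1]
    del children[d + 1:]                       # ... and children[0 .. d]
    return up, (new_keys, new_children)
```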

Insert entry 8* into figure 9.10


DELETE Operation: finds the leaf node to which the entry belongs and deletes it there.
Either redistribute entries from an adjacent sibling, or merge the node with a
sibling, to maintain minimum occupancy. If entries are redistributed between
two nodes, their parent node must be updated.
Redistribution operation: after deleting 24*, the non-leaf node is left with the single
index entry 30. Instead of a merge, we use the redistribution operation.
