16s PDF

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

CHAPTER

16
Recovery System
Practice Exercises

16.1 Explain why log records for transactions on the undo-list must be pro-
cessed in reverse order, whereas redo is performed in a forward direction.
Answer: Within a single transaction in undo-list, suppose a data item
is updated more than once, say from 1 to 2, and then from 2 to 3. If the
undo log records are processed in forward order, the final value of the
data item would be incorrectly set to 2, whereas by processing them in
reverse order, the value is set to 1. The same logic also holds for data items
updated by more than one transaction on undo-list.
Using the same example as above, but assuming the transaction commit-
ted, it is easy to see that if redo processing processes the records in forward
order, the final value is set correctly to 3, but if done in reverse order, the
final value would be set incorrectly to 2.
16.2 Explain the purpose of the checkpoint mechanism. How often should
checkpoints be performed? How does the frequency of checkpoints affect:

• System performance when no failure occurs?


• The time it takes to recover from a system crash?
• The time it takes to recover from a media (disk) failure?
Answer: Checkpointing is done with log-based recovery schemes to re-
duce the time required for recovery after a crash. If there is no checkpoint-
ing, then the entire log must be searched after a crash, and all transactions
undone/redone from the log. If checkpointing had been performed, then
most of the log-records prior to the checkpoint can be ignored at the time
of recovery.
Another reason to perform checkpoints is to clear log-records from stable
storage as it gets full.
Since checkpoints cause some loss in performance while they are being
taken, their frequency should be reduced if fast recovery is not critical.
If we need fast recovery checkpointing frequency should be increased. If
15
16 Chapter 16 Recovery System

the amount of stable storage available is less, frequent checkpointing is


unavoidable.
Checkpoints have no effect on recovery from a disk crash; archival dumps
are the equivalent of checkpoints for recovery from disk crashes.
16.3 Some database systems allow the administrator to choose between two
forms of logging: normal logging, used to recover from system crashes, and
archival logging, used to recover from media (disk) failure. When can a log
record be deleted, in each of these cases, using the recovery algorithm of
Section 16.4?
Answer: Normal logging: The following log records cannot be deleted,
since they may be required for recovery:

a. Any log record corresponding to a transaction which was active dur-


ing the most recent checkpoint (i.e. which is part of the <checkpoint
L> entry)
b. Any log record corresponding to transactions started after the recent
checkpoint.

All other log records can be deleted. After each checkpoint, more records
become candidates for deletion as per the above rule.
Deleting a log record while retaining an earlier log record would result in
gaps in the log, and would require more complex log processing. There-
fore in practise, systems find a point in the log such that all earlier log
records can be deleted, and delete that part of the log. Often, the log is
broken up into multiple files, and a file is deleted when all log records in
the file can be deleted.

Archival logging: Archival logging retains log records that may be needed
for recovery from media failure (such as disk crashes). Archival dumps are
the equivalent of checkpoints for recovery from media failure. The rules
for deletion above can be used for archival logs, but based on the last
archival dump instead of the last checkpoint. The frequency of archival
dumps would be lesser than checkpointing, since a lot of data has to be
written. Thus more log records would need to be retained with archival
logging.
16.4 Describe how to modify the recovery algorithm of Section 16.4 to imple-
ment savepoints, and to perform rollback to a savepoint. (Savepoints are
described in Section 16.8.3.)
Answer: A savepoint can be performed as follows:

a. Output onto stable storage all log records for that transaction which
are currently in main memory.
b. Output onto stable storage a log record of the form <savepoint Ti >,
where TI is the transaction identifier.
Practice Exercises 17

To roll back a currently executing transaction partially till a particular


savepoint, execute undo processing for that transaction, till the savepoint
is reached. Redo log records are generated as usual during the undo phase
above.
It is possible to perform repeated undo to a single savepoint by writing
a fresh savepoint record after rolling back to that savepoint. The above
algorithm can be extended to support multiple savepoints for a single
transaction, by giving each savepoint a name. However, once undo has
rolled back past a savepoint, it is no longer possible to undo upto that
savepoint.
16.5 Suppose the deferred modification technique is used in a database.

a. Is the old-value part of an update log record required any more?


Why or why not?
b. If old values are not stored in update log records, transaction undo
is clearly not feasible. How would the redo-phase of recovery have
to be modified as a result?
c. Deferred modification can be implemented by keeping updated data
items in local memory of transactions, and reading data items that
have not been updated directly from the database buffer. Suggest
how to efficiently implement a data item read, ensuring that a trans-
action sees its own updates.
d. What problem would arise with the above technique, if transactions
perform a large number of updates?

Answer:

a. The old-value part of an update log record is not required. If the


transaction has committed, then the old value is no longer necessary
as there would be no need to undo the transaction. And if the trans-
action was active when the system crashed, the old values are still
safe in the stable storage as they haven’t been modified yet.
b. During the redo phase, the undo list need not be maintained any
more, since the stable storage does not reflect updates due to any
uncommitted transaction.
c. A data item read will first issue a read request on the local memory
of the transaction. If it is found there, it is returned. Otherwise, the
item is loaded from the database buffer into the local memory of the
transaction and then returned.
d. If a single transaction performs a large number of updates, there is
a possibility of the transaction running out of memory to store the
local copies of the data items.
18 Chapter 16 Recovery System

16.6 The shadow-paging scheme requires the page table to be copied. Suppose
the page table is represented as a B+ -tree.
a. Suggest how to share as many nodes as possible between the new
copy and the shadow-copy of the B+ -tree, assuming that updates
are made only to leaf entries, with no insertions and deletions.
b. Even with the above optimization, logging is much cheaper than a
shadow-copy scheme, for transactions that perform small updates.
Explain why.
Answer:
a. To begin with, we start with the copy of just the root node pointing to
the shadow-copy. As modifications are made, the leaf entry where
the modification is made and all the nodes in the path from that
leaf node till the root, are copied and updated. All other nodes are
shared.
b. For transactions that perform small updates, the shadow-paging
scheme, would copy multiple pages for a single update, even with
the above optimization. Logging, on the other hand just requires
small records to be created for every update; the log records are
physically together in one page or a few pages, and thus only a few
log page I/O operations are required to commit a transaction. Fur-
thermore, the log pages written out across subsequent transaction
commits are likely to be adjacent physically on disk, minimizng disk
arm movement.
16.7 Suppose we (incorrectly) modify the recovery algorithm of Section 16.4 to
not log actions taken during transaction rollback. When recovering from
a system crash, transactions that were rolled back earlier would then be
included in undo-list, and rolled back again. Give an example to show
how actions taken during the undo phase of recovery could result in
an incorrect database state. (Hint: Consider a data item updated by an
aborted transaction, and then updated by a transaction that commits.)
Answer: Consider the following log records generated with the (incor-
rectly) modified recovery algorithm:

1. <T1 start>
2. <T1 , A, 1000, 900>
3. <T2 start>
4. <T2 , A, 1000, 2000>
5. <T2 commit>

A rollback actually happened between steps 2 and 3, but there are no


log records reflecting the same. Now, this log data is processed by the
recovery algorithm. At the end of the redo phase, T1 would get added to
the undo-list, and the value of A would be 2000. During the undo phase,
Practice Exercises 19

since T1 is present in the undo-list, the recovery algorithm does an undo


of statement 2 and A takes the value 1000. The update made by T2 , though
commited, is lost.
The correct sequence of logs is as follows:

1. <T1 start>
2. <T1 , A, 1000, 900>
3. <T1 , A, 1000>
4. <T1 abort>
5. <T2 start>
6. <T2 , A, 1000, 2000>
7. <T2 commit>

This would make sure that T1 would not get added to the undo-list after
the redo phase.
16.8 Disk space allocated to a file as a result of a transaction should not be
released even if the transaction is rolled back. Explain why, and explain
how ARIES ensures that such actions are not rolled back.
Answer: If a transaction allocates a page to a relation, even if the transac-
tion is rolled back, the page allocation should not be undone because other
transactions may have stored records in the same page. Such operations
that should not be undone are called nested top actions in ARIES. They
can be modeled as operations whose undo action does nothing. In ARIES
such operations are implemented by creating a dummy CLR whose Un-
doNextLSN is set such that the transaction rollback skips the log records
generated by the operation.
16.9 Suppose a transaction deletes a record, and the free space generated thus
is allocated to a record inserted by another transaction, even before the
first transaction commits.
a. What problem can occur if the first transaction needs to be rolled
back?
b. Would this problem be an issue if page-level locking is used instead
of tuple-level locking?
c. Suggest how to solve this problem while supporting tuple-level
locking, by logging post-commit actions in special log records, and
executing them after commit. Make sure your scheme ensures that
such actions are performed exactly once.
Answer:
a. If the first transaction needs to be rolled back, the tuple deleted by
that transaction will have to be restored. If undo is performed in the
usual physical manner using the old values of data items, the space
allocated to the new tuple would get overwritten by the transaction
20 Chapter 16 Recovery System

undo, damaging the new tuples, and associated data structures on


the disk block. This means that a logical undo operation has to be
performed i.e., an insert has to be performed to undo the delete,
which complicates recovery.
On related note, if the second transaction inserts with the same key,
integrity constraints might be violated on rollback.
b. If page level locking is used, the free space generated by the first
transaction is not allocated to another transaction till the first one
commits. So this problem will not be an issue if page level locking
is used.
c. The problem can be solved by deferring freeing of space till after the
transaction commits. To ensure that space will be freed even if there
is a system crash immediately after commit, the commit log record
can be modified to contain information about freeing of space (and
other similar operations) which must be performed after commit.
The execution of these operations can be performed as a transaction
and log records generated, following by a post-commit log record
which indicates that post commit processing has been completed for
the transaction.
During recovery, if a commit log record is found with post-commit
actions, but no post-commit log record is found, the effects of any
partial execution of post-commit operations are rolled back during
recovery, and the post commit operations are reexecuted at the end
of recovery. If the post-commit log record is found, the post-commit
actions are not reexecuted. Thus, the actions are guaranteed to be
executed exactly once.
The problem of clashes on primary key values can be solved by
holding key-level locks so that no other transaction can use the key
till the first transaction completes.

16.10 Explain the reasons why recovery of interactive transactions is more dif-
ficult to deal with than is recovery of batch transactions. Is there a simple
way to deal with this difficulty? (Hint: Consider an automatic teller ma-
chine transaction in which cash is withdrawn.)
Answer: Interactive transactions are more difficult to recover from than
batch transactions because some actions may be irrevocable. For example,
an output (write) statement may have fired a missile, or caused a bank
machine to give money to a customer. The best way to deal with this is to
try to do all output statements at the end of the transaction. That way if
the transaction aborts in the middle, no harm will be have been done.
Output operations should ideally be done atomically; for example, ATM
machines often count out notes, and deliver all the notes together instead
of delivering notes one-at-a-time. If output operations cannot be done
atomically, a physical log of output operations, such as a disk log of
events, or even a video log of what happened in the physical world can be
Practice Exercises 21

maintained, to allow perform recovery to be performed manually later,


for example by crediting cash back to a customers account.
16.11 Sometimes a transaction has to be undone after it has committed because
it was erroneously executed, for example because of erroneous input by
a bank teller.
a. Give an example to show that using the normal transaction undo
mechanism to undo such a transaction could lead to an inconsistent
state.
b. One way to handle this situation is to bring the whole database
to a state prior to the commit of the erroneous transaction (called
point-in-time recovery). Transactions that committed later have their
effects rolled back with this scheme.
Suggest a modification to the recovery algorithm of Section 16.4
to implement point-in-time recovery using database dumps.
c. Later nonerroneous transactions can be re-executed logically, if the
updates are available in the form of SQL but cannot be re-executed
using their log records. Why?
Answer:
a. Consider the a bank account A with balance $100. Consider two
transactions T1 and T2 each depositing $10 in the account. Thus the
balance would be $120 after both these transactions are executed.
Let the transactions execute in sequence: T1 first and then T2 . The log
records corresponding to the updates of A by transactions T1 and T2
would be < T1 , A, 100, 110 > and < T2 , A, 110, 120 > resp.
Say, we wish to undo transaction T1 . The normal transaction undo
mechanism will replaces the value in question— Ain this example —
by the old-value field in the log record. Thus if we undo transaction
T1 using the normal transaction undo mechanism the resulting bal-
ance would be $100 and we would, in effect, undo both transactions,
whereas we intend to undo only transaction T1 .
b. Let the erroneous transaction be Te .
• Identify the latest archival dump, say D, before the log record
< Te , START>. Restore the database using the dump.
• Redo all log records starting from the dump D till the log record
< Te , COMMIT>. Some transaction—apart from transaction Te
—would be active at the commit time of transaction Te . Let S1
be the set of such transactions.
• Rollback Te and the transactions in the set S1 . This completes
point-in-time recovery.
In case logical redo is possible, later transactions can be rex-
ecuted logically assuming log records containing logical redo
22 Chapter 16 Recovery System

information were written for every transaction. To perform log-


ical redo of later transactions, scan the log further starting from
the log record < Te , COMMIT> till the end of the log. Note the
transactions that were started after the commit point of Te . Let
the set of such transactions be S2 . Re-execute the transactions in
set S1 and S2 logically.

c. Consider again an example from the first item. Let us assume that
both transactions are undone and the balance is reverted back to the
original value $100.
Now we wish to redo transaction T2 . If we redo the log record
< T2 , A, 110, 120 > corresponding to transaction T2 the balance
would become $120 and we would, in effect, redo both transactions,
whereas we intend to redo only transaction T2 .

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy