16s PDF
16
Recovery System
Practice Exercises
16.1 Explain why log records for transactions on the undo-list must be pro-
cessed in reverse order, whereas redo is performed in a forward direction.
Answer: Within a single transaction in undo-list, suppose a data item
is updated more than once, say from 1 to 2, and then from 2 to 3. If the
undo log records are processed in forward order, the final value of the
data item would be incorrectly set to 2, whereas by processing them in
reverse order, the value is set to 1. The same logic also holds for data items
updated by more than one transaction on undo-list.
Using the same example as above, but assuming the transaction commit-
ted, it is easy to see that if redo processing processes the records in forward
order, the final value is set correctly to 3, but if done in reverse order, the
final value would be set incorrectly to 2.
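The argument above can be checked with a minimal sketch. The record layout `(item, old_value, new_value)` and the helper names `undo` and `redo` are assumptions made for illustration; they are not the textbook's notation.

```python
# Hypothetical log for one data item X updated twice: 1 -> 2, then 2 -> 3.
# Each record is (item, old_value, new_value).
log = [("X", 1, 2), ("X", 2, 3)]

def undo(records, db):
    """Restore old values, visiting records in the order given."""
    for item, old, _new in records:
        db[item] = old
    return db

def redo(records, db):
    """Reapply new values, visiting records in the order given."""
    for item, _old, new in records:
        db[item] = new
    return db

undo_reverse = undo(reversed(log), {"X": 3})   # correct: X ends at 1
undo_forward = undo(log, {"X": 3})             # wrong:   X ends at 2
redo_forward = redo(log, {"X": 1})             # correct: X ends at 3
redo_reverse = redo(reversed(log), {"X": 1})   # wrong:   X ends at 2
print(undo_reverse, undo_forward, redo_forward, redo_reverse)
```

In both directions, the last record visited determines the final value, which is why undo must finish at the oldest record and redo at the newest.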
16.2 Explain the purpose of the checkpoint mechanism. How often should
checkpoints be performed? How does the frequency of checkpoints affect:
All other log records can be deleted. After each checkpoint, more records
become candidates for deletion as per the above rule.
Deleting a log record while retaining an earlier log record would result in
gaps in the log, and would require more complex log processing. Therefore,
in practice, systems find a point in the log such that all earlier log
records can be deleted, and delete that part of the log. Often, the log is
broken up into multiple files, and a file is deleted when all log records in
the file can be deleted.
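The file-at-a-time deletion described above might look like the following sketch. It assumes that log records carry increasing log sequence numbers (LSNs) and that the deletion rule has already been reduced to a single cutoff, here called `oldest_needed_lsn`; both the cutoff and the `LogSegment` type are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

@dataclass
class LogSegment:
    name: str
    first_lsn: int
    last_lsn: int

def deletable_segments(segments, oldest_needed_lsn):
    """Return the leading segments whose records all precede oldest_needed_lsn.

    The scan stops at the first segment that must be kept, so deleting the
    returned files never leaves a gap in the log.
    """
    result = []
    for seg in sorted(segments, key=lambda s: s.first_lsn):
        if seg.last_lsn >= oldest_needed_lsn:
            break
        result.append(seg.name)
    return result

segments = [
    LogSegment("log.0001", 1, 100),
    LogSegment("log.0002", 101, 200),
    LogSegment("log.0003", 201, 300),
]
print(deletable_segments(segments, oldest_needed_lsn=150))
```

With a cutoff of 150, only the first file qualifies: the second file still contains records at or after the cutoff, so it and everything after it are retained.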
Archival logging: Archival logging retains log records that may be needed
for recovery from media failure (such as disk crashes). Archival dumps are
the equivalent of checkpoints for recovery from media failure. The rules
for deletion above can be used for archival logs, but based on the last
archival dump instead of the last checkpoint. Archival dumps would be
performed less frequently than checkpoints, since a lot of data has to be
written. Thus more log records would need to be retained with archival
logging.
16.4 Describe how to modify the recovery algorithm of Section 16.4 to imple-
ment savepoints, and to perform rollback to a savepoint. (Savepoints are
described in Section 16.8.3.)
Answer: A savepoint can be performed as follows:
a. Output onto stable storage all log records for that transaction which
are currently in main memory.
b. Output onto stable storage a log record of the form <savepoint Ti>,
where Ti is the transaction identifier.
Answer:
16.6 The shadow-paging scheme requires the page table to be copied. Suppose
the page table is represented as a B+-tree.
a. Suggest how to share as many nodes as possible between the new
copy and the shadow-copy of the B+-tree, assuming that updates
are made only to leaf entries, with no insertions and deletions.
b. Even with the above optimization, logging is much cheaper than a
shadow-copy scheme, for transactions that perform small updates.
Explain why.
Answer:
a. Initially, only the root node is copied, and the copy points to the
same children as the root of the shadow-copy. As modifications are
made, the leaf node containing the modified entry and all the nodes
on the path from that leaf up to the root are copied and updated.
All other nodes are shared.
b. For transactions that perform small updates, the shadow-paging
scheme would copy multiple pages for a single update, even with
the above optimization. Logging, on the other hand, just requires
small records to be created for every update; the log records are
physically together in one page or a few pages, and thus only a few
log page I/O operations are required to commit a transaction.
Furthermore, the log pages written out across subsequent transaction
commits are likely to be physically adjacent on disk, minimizing disk
arm movement.
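The node sharing in part (a) can be sketched as path copying. The code below is a simplified illustration, not a full B+-tree: the `Node` class, the `update` function, and the three-level example tree are all assumptions made for this example, and only updates to existing leaf entries are handled.

```python
class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys            # separator keys (internal) or entry keys (leaf)
        self.children = children    # list of child Nodes; None for a leaf
        self.values = values        # list of page pointers; None for an internal node

def update(node, key, new_value, copied):
    """Return the root of a new tree in which key maps to new_value.

    Only nodes on the root-to-leaf path are copied; every other node is
    shared with the shadow copy. `copied` collects the newly created nodes.
    """
    if node.children is None:                        # leaf: copy and patch the entry
        values = list(node.values)
        values[node.keys.index(key)] = new_value
        new = Node(node.keys, values=values)
    else:
        # Child i holds the keys below keys[i]; the last child holds the rest.
        i = sum(1 for k in node.keys if key >= k)
        children = list(node.children)
        children[i] = update(node.children[i], key, new_value, copied)
        new = Node(node.keys, children=children)
    copied.append(new)
    return new

leaves = [Node([1, 2], values=["P1", "P2"]), Node([3, 4], values=["P3", "P4"]),
          Node([5, 6], values=["P5", "P6"]), Node([7, 8], values=["P7", "P8"])]
shadow_root = Node([5], children=[Node([3], children=leaves[0:2]),
                                  Node([7], children=leaves[2:4])])

copied = []
new_root = update(shadow_root, 4, "P4-new", copied)
print(len(copied))                                        # nodes copied for one update
print(new_root.children[1] is shadow_root.children[1])    # untouched subtree is shared
```

Even in this small tree, one leaf-entry change copies three nodes (leaf, parent, root). That is the cost part (b) refers to: with one node per page, a single small update writes one page per tree level, whereas logging writes one short record.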
16.7 Suppose we (incorrectly) modify the recovery algorithm of Section 16.4 to
not log actions taken during transaction rollback. When recovering from
a system crash, transactions that were rolled back earlier would then be
included in undo-list, and rolled back again. Give an example to show
how actions taken during the undo phase of recovery could result in
an incorrect database state. (Hint: Consider a data item updated by an
aborted transaction, and then updated by a transaction that commits.)
Answer: Consider the following log records generated with the (incor-
rectly) modified recovery algorithm:
1. <T1 start>
2. <T1, A, 1000, 900>
3. <T2 start>
4. <T2, A, 1000, 2000>
5. <T2 commit>
During recovery, T1 has neither a commit nor an abort record in this log,
so it is placed in undo-list. The undo phase then restores A to 1000,
overwriting the value 2000 written by the committed transaction T2. With
the unmodified algorithm, the rollback of T1 is logged, and the log would
instead be:
1. <T1 start>
2. <T1, A, 1000, 900>
3. <T1, A, 1000>
4. <T1 abort>
5. <T2 start>
6. <T2, A, 1000, 2000>
7. <T2 commit>
This would make sure that T1 would not get added to the undo-list after
the redo phase.
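The effect of the two logs on recovery can be traced with a small simulation. This is a simplified sketch of a redo phase followed by an undo phase: it ignores checkpoints and does not write compensation records during recovery. The tuple encoding of log records and the function name `recover` are assumptions made for this example.

```python
def recover(log, db):
    """Replay the log forward (redo), then undo incomplete transactions."""
    undo_list = set()
    # Redo phase: repeat history in forward order.
    for rec in log:
        kind, txn = rec[0], rec[1]
        if kind == "start":
            undo_list.add(txn)
        elif kind == "update":              # (kind, txn, item, old, new)
            _, _, item, _old, new = rec
            db[item] = new
        elif kind == "redo_only":           # (kind, txn, item, value)
            _, _, item, value = rec
            db[item] = value
        elif kind in ("commit", "abort"):
            undo_list.discard(txn)
    # Undo phase: scan backward, restoring old values for transactions in undo_list.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in undo_list:
            _, _, item, old, _new = rec
            db[item] = old
    return db, undo_list

# Log written without logging the rollback of T1.
bad_log = [("start", "T1"), ("update", "T1", "A", 1000, 900),
           ("start", "T2"), ("update", "T2", "A", 1000, 2000), ("commit", "T2")]
# Log written with the rollback of T1 recorded.
good_log = [("start", "T1"), ("update", "T1", "A", 1000, 900),
            ("redo_only", "T1", "A", 1000), ("abort", "T1"),
            ("start", "T2"), ("update", "T2", "A", 1000, 2000), ("commit", "T2")]

bad_db, bad_undo = recover(bad_log, {"A": 1000})
good_db, good_undo = recover(good_log, {"A": 1000})
print(bad_db, bad_undo)     # T1 is undone again and A loses T2's update
print(good_db, good_undo)   # T1 is not in undo-list and A keeps T2's update
```

With the first log, the undo phase overwrites the committed value 2000 with T1's old value 1000. With the second log, the abort record removes T1 from the undo-list, so A ends at 2000.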
16.8 Disk space allocated to a file as a result of a transaction should not be
released even if the transaction is rolled back. Explain why, and explain
how ARIES ensures that such actions are not rolled back.
Answer: If a transaction allocates a page to a relation, even if the transac-
tion is rolled back, the page allocation should not be undone because other
transactions may have stored records in the same page. Such operations
that should not be undone are called nested top actions in ARIES. They
can be modeled as operations whose undo action does nothing. In ARIES,
such operations are implemented by creating a dummy CLR whose
UndoNextLSN is set such that the transaction rollback skips the log records
generated by the operation.
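The skipping behaviour of the dummy CLR can be sketched as follows. This is a simplified model, not ARIES itself: it only follows the PrevLSN and UndoNextLSN pointers and omits the CLRs that a real rollback would write for each undone update. The `LogRecord` fields and the example actions are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    lsn: int
    prev_lsn: Optional[int]               # previous record of the same transaction
    kind: str                             # "update" or "clr"
    action: str = ""                      # description of the logged action
    undo_next_lsn: Optional[int] = None   # only meaningful for CLRs

def rollback(log, last_lsn):
    """Return the actions undone, following PrevLSN / UndoNextLSN pointers."""
    by_lsn = {r.lsn: r for r in log}
    undone = []
    lsn = last_lsn
    while lsn is not None:
        rec = by_lsn[lsn]
        if rec.kind == "clr":
            lsn = rec.undo_next_lsn       # jump over the records the CLR covers
        else:
            undone.append(rec.action)
            lsn = rec.prev_lsn
    return undone

log = [
    LogRecord(1, None, "update", "insert tuple t1"),
    LogRecord(2, 1, "update", "allocate page P"),        # nested top action
    LogRecord(3, 2, "clr", undo_next_lsn=1),             # dummy CLR
    LogRecord(4, 3, "update", "insert tuple t2 into P"),
]
print(rollback(log, last_lsn=4))
```

The rollback undoes both tuple insertions but never visits LSN 2, because the dummy CLR at LSN 3 points directly at LSN 1. The page allocation therefore survives the rollback.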
16.9 Suppose a transaction deletes a record, and the free space generated thus
is allocated to a record inserted by another transaction, even before the
first transaction commits.
a. What problem can occur if the first transaction needs to be rolled
back?
b. Would this problem be an issue if page-level locking is used instead
of tuple-level locking?
c. Suggest how to solve this problem while supporting tuple-level
locking, by logging post-commit actions in special log records, and
executing them after commit. Make sure your scheme ensures that
such actions are performed exactly once.
Answer:
a. If the first transaction needs to be rolled back, the tuple deleted by
that transaction will have to be restored. If undo is performed in the
usual physical manner using the old values of data items, the space
allocated to the new tuple would get overwritten by the transaction
16.10 Explain the reasons why recovery of interactive transactions is more dif-
ficult to deal with than is recovery of batch transactions. Is there a simple
way to deal with this difficulty? (Hint: Consider an automatic teller ma-
chine transaction in which cash is withdrawn.)
Answer: Interactive transactions are more difficult to recover from than
batch transactions because some actions may be irrevocable. For example,
an output (write) statement may have fired a missile, or caused a bank
machine to give money to a customer. The best way to deal with this is to
try to do all output statements at the end of the transaction. That way, if
the transaction aborts in the middle, no harm will have been done.
Output operations should ideally be done atomically; for example, ATMs
often count out notes, and deliver all the notes together instead
of delivering notes one at a time. If output operations cannot be done
atomically, a physical log of output operations, such as a disk log of
events, or even a video log of what happened in the physical world can be
c. Consider again an example from the first item. Let us assume that
both transactions are undone and the balance is reverted to the
original value $100.
Now we wish to redo transaction T2. If we redo the log record
<T2, A, 110, 120> corresponding to transaction T2, the balance
would become $120 and we would, in effect, redo both transactions,
whereas we intend to redo only transaction T2.