DBMS Unit-3

The document outlines failure classifications in database management systems (DBMS), categorizing them into transaction failures, system crashes, and disk failures. It also describes various types of storage systems, including primary, secondary, and tertiary storage, along with their characteristics and recovery mechanisms to ensure data atomicity and durability. Additionally, it discusses transaction properties such as atomicity, consistency, isolation, and durability, along with the importance of concurrency control in maintaining database integrity.

Failure Classification

To determine where a problem has occurred, we generalize failures into the
following categories:

1. Transaction failure

2. System crash

3. Disk failure

1. Transaction failure

A transaction failure occurs when a transaction fails to execute or reaches a point
from which it cannot continue. When only a few transactions or processes are
affected in this way, the failure is called a transaction failure.

Reasons for a transaction failure include:

1. Logical errors: A logical error occurs when a transaction cannot complete
due to a code error or an internal error condition.

2. System errors: A system error occurs when the DBMS itself terminates an
active transaction because the database system is unable to execute it.
For example, the system aborts an active transaction in case of deadlock
or resource unavailability.

2. System Crash

o A system crash can occur due to a power failure or some other hardware or
software failure, for example an operating system error.

o Fail-stop assumption: In a system crash, non-volatile storage is assumed not
to be corrupted.

3. Disk Failure

o Disk failures were a common problem in the early days of technology, when
hard-disk drives and other storage drives failed frequently.

o A disk failure occurs due to the formation of bad sectors, a disk head
crash, unreachability of the disk, or any other fault that destroys all or
part of the disk storage.

Storage structure

Storage System in DBMS

A database system provides a high-level view of the stored data. Physically,
however, the data is stored as bits and bytes on different storage devices.

In this section, we will take an overview of various types of storage devices that
are used for accessing and storing data.

Types of Data Storage

Several types of storage are available for storing data. These storage types
differ from one another in speed and accessibility.
The following types of storage devices are used for storing data:

o Primary Storage

o Secondary Storage

o Tertiary Storage
Primary Storage

Primary storage is the storage area that offers the quickest access to the stored
data. It is also known as volatile storage because this type of memory does not
store data permanently: as soon as the system suffers a power cut or a crash, the
data is lost. Main memory and cache are the types of primary storage.


o Main Memory: Main memory holds the data being operated on and handles each
instruction of the computer. It can store gigabytes of data on a system but is
generally too small to hold an entire database. Main memory loses its whole
content if the system shuts down because of a power failure or for any other
reason.

o Cache: Cache is the costliest storage medium but also the fastest one. It is a
tiny storage area that is usually maintained by the computer hardware. When
designing algorithms and query processors for data structures, designers take
cache effects into account.

Secondary Storage

Secondary storage is also called online storage. It is the storage area that
allows the user to save and store data permanently. This type of memory does not
lose data due to a power failure or system crash, which is why it is also called
non-volatile storage.

The following secondary storage media are commonly available in almost every type
of computer system:

o Flash Memory: Flash memory stores data in devices such as USB (Universal Serial
Bus) keys, which are plugged into the USB slots of a computer system. USB keys
help transfer data to a computer system and vary in capacity. Unlike main memory,
flash memory retains its stored data through a power cut or other failure. This
type of storage is commonly used in server systems for caching frequently used
data, which leads to higher performance, and it can hold larger amounts of data
than main memory.

o Magnetic Disk Storage: This type of storage medium is also known as online
storage. A magnetic disk is used for storing data for a long time and is capable
of holding an entire database. The computer system is responsible for bringing
data from the disk into main memory for access and, if any operation modifies the
data, for writing the modified data back to the disk. The great advantage of a
magnetic disk is that its data is not affected by a system crash or failure; a
disk failure, however, can destroy all or part of the stored data.

Tertiary Storage

Tertiary storage is storage that is external to the computer system. It is the
slowest type of storage but is capable of holding large amounts of data. It is
also known as offline storage and is generally used for data backup. The
following tertiary storage devices are available:

o Optical Storage: Optical storage can hold megabytes or gigabytes of data. A
Compact Disk (CD) can store 700 megabytes of data with a playtime of around 80
minutes. A Digital Video Disk (DVD) can store 4.7 or 8.5 gigabytes of data on
each side of the disk.

o Tape Storage: Tape is a cheaper storage medium than disk. Tapes are generally
used for archiving or backing up data. They provide slow access because data is
accessed sequentially from the start, so tape storage is also known as
sequential-access storage. Disk storage, by contrast, is known as direct-access
storage because data can be read directly from any location on the disk.

Storage Hierarchy

Besides the above, various other storage devices reside in a computer system.
These storage media can be organized into a hierarchy according to data access
speed, cost per unit of data, and reliability.

Arranging the storage media described above in such a hierarchy by speed and
cost, the higher levels are expensive but fast. Moving down the hierarchy, the
cost per bit decreases while the access time increases. The media from main
memory upward are volatile, and everything below main memory is non-volatile.

Recovery and Atomicity

Recovery and Atomicity in DBMS

Introduction

Data can be monitored, stored, and changed rapidly and effectively using a DBMS
(Database Management System). A database possesses the atomicity, consistency,
isolation, and durability properties. Durability is the ability of a system to
preserve data and the changes made to that data. A database could fail for any of
the following reasons:
o System breakdowns occur as a result of hardware or software issues in the
system.

o Transaction failures arise when a certain process dealing with data updates
cannot be completed.

o Disk crashes may occur when the system fails to read the disk.

o Physical damage includes issues such as power outages or natural disasters.

Even if the database system fails, the data in the database must be recoverable
to the state it was in prior to the failure. In such situations, database
recovery procedures in DBMS are employed to retrieve the data.

The recovery procedures in DBMS ensure the database's atomicity and durability.
If a system crashes in the middle of a transaction and all of its data is lost,
it is not regarded as durable. If only a portion of the data is updated during
the transaction, it is not considered atomic. Data recovery procedures in DBMS
make sure that the data is always recoverable to protect the durability property
and that its state is retained to protect the atomicity property. The procedures
listed below are used to recover data in a DBMS:

o Recovery based on logs.

o Recovery through Deferred Update

o Immediate Recovery via Immediate Update

The atomicity property of a DBMS safeguards the data state. If a data
modification is performed, the operation must be completed entirely, or the
data's state must be left as if the modification never occurred. This property
may be threatened by DBMS failures caused by transactions, but the DBMS recovery
methods protect it.

Log-Based Recovery:

Every DBMS has its own system logs, which record every system activity along
with a timestamp of when the event occurred. Databases manage several log files
for operations such as errors, queries, and other database updates. The log
records are saved in the following formats:


o [start transaction, T] represents the start of execution of transaction T.

o [write item, T, X, old value, new value] indicates that transaction T changes
the value of data item X from the old value to the new value.

o [read item, T, X] indicates that transaction T reads the value of X.

o [commit, T] signifies that the changes made by transaction T have been
committed to the database and cannot be undone by the transaction. There will be
no errors after the database has been committed.

o [abort, T] indicates that the transaction T has been cancelled.

We can use these logs to see how the state of the data changes during a
transaction and to recover it to the prior or the new state.

An undo operation uses the [write item, T, X, old value, new value] record to
restore the data item to its old value. The new state of the data, lost due to a
system failure, can be restored (redone) only for transactions whose [commit, T]
record is present in the log.

Consider a series of transactions t1, t2, t3, and t4. Suppose the system crashes
after the fourth transaction; the data can still be recovered to the state it was
in at the checkpoint established during transaction t1.

After all the records of a transaction are written to the logs, a checkpoint is
created to transfer the logs from local storage to permanent storage for future
use.
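To make the log format concrete, here is a small Python sketch (an illustration
added for clarity, not part of any specific DBMS): it shows the log records for a
transaction T that changes X from 500 to 700, and how a write record is used to
undo or redo the change.

# Sketch: log records for a transaction T that changes X from 500 to 700.
log = [("start transaction", "T"),
       ("write item", "T", "X", 500, 700),
       ("commit", "T")]

def undo(record, db):       # restore the old value from a write record
    _, _, x, old, new = record
    db[x] = old

def redo(record, db):       # re-apply the new value from a write record
    _, _, x, old, new = record
    db[x] = new

db = {"X": 500}
redo(log[1], db)    # after a crash, redo committed changes -> X becomes 700
undo(log[1], db)    # or undo uncommitted changes            -> X back to 500
print(db)           # {'X': 500}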
Recovery with concurrent transactions:

Concurrency control deals with the interleaved execution of more than one
transaction. Later we will see what serializability is and how to determine
whether a schedule is serializable or not.

What is a Transaction?

A set of logically related operations is known as a transaction. The main
operations of a transaction are:

Read(A): The read operation Read(A) or R(A) reads the value of A from the
database and stores it in a buffer in main memory.

Write(A): The write operation Write(A) or W(A) writes the value back to the
database from the buffer.

(Note: a write does not always go back to the database immediately; it may only
update the buffer, which is how dirty reads come into the picture.)

Let us take a debit transaction on an account that consists of the following
operations:

1. R(A);

2. A = A - 1000;

3. W(A);

Assume A’s value before starting the transaction is 5000.

• The first operation reads the value of A from the database and stores it in a
buffer.

• The second operation decreases its value by 1000, so the buffer contains 4000.

• The third operation writes the value from the buffer to the database, so A’s
final value is 4000.

But it is also possible that the transaction fails after executing some of its
operations. The failure can be because of hardware, software, power, etc. For
example, if the debit transaction discussed above fails after executing operation
2, the value of A remains 5000 in the database, which is not acceptable to the
bank. To avoid this, the database has two important operations:

Commit: After all the instructions of a transaction are successfully executed,
the changes made by the transaction are made permanent in the database.

Rollback: If a transaction is not able to execute all of its operations
successfully, all the changes made by it are undone.
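As a minimal sketch (illustrative only; the function and variable names are
invented for this example, not taken from the notes), the debit transaction above
can be written in Python so that the buffered value either reaches the database
on commit or is discarded on rollback:

# Sketch: the debit transaction R(A); A = A - 1000; W(A) with commit/rollback.
database = {"A": 5000}
buffer = {}

def run_debit(amount, fail_after_step=None):
    try:
        buffer["A"] = database["A"]          # R(A): read into the buffer
        if fail_after_step == 1:
            raise RuntimeError("crash after read")
        buffer["A"] -= amount                # A = A - amount (in the buffer only)
        if fail_after_step == 2:
            raise RuntimeError("crash before write")
        database["A"] = buffer["A"]          # W(A): write back, then commit
        print("commit:", database)
    except RuntimeError:
        buffer.pop("A", None)                # rollback: discard the buffered change
        print("rollback:", database)

run_debit(1000)                      # commit: {'A': 4000}
run_debit(1000, fail_after_step=2)   # rollback: {'A': 4000} (database unchanged)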

Properties of a transaction:

Atomicity: Since a transaction is a set of logically related operations, either
all of them should be executed or none. The debit transaction discussed above
should either execute all three operations or none. If the debit transaction
fails after executing operations 1 and 2, the new value 4000 is not updated in
the database, which leads to inconsistency.

Consistency: If operations of debit and credit transactions on the same account
are executed concurrently, they may leave the database in an inconsistent state.

• For example, with T1 (debit of Rs. 1000 from A) and T2 (credit of Rs. 500 to A)
executing concurrently, the database can reach an inconsistent state.

• Let us assume the account balance of A is Rs. 5000. T1 reads A (5000) and
stores the value in its local buffer space. Then T2 reads A (5000) and also
stores the value in its local buffer space.

• T1 performs A = A - 1000 (5000 - 1000 = 4000) and 4000 is stored in T1’s buffer
space. Then T2 performs A = A + 500 (5000 + 500 = 5500) and 5500 is stored in
T2’s buffer space. T1 then writes the value from its buffer back to the database.

• A’s value is updated to 4000 in the database, and then T2 writes the value from
its buffer back to the database. A’s value is updated to 5500, which shows that
the effect of the debit transaction is lost and the database has become
inconsistent.

• To maintain consistency of the database, we need concurrency control protocols,
which are discussed later. The operations of T1 and T2 with their buffers and the
database are shown in Table 1.

T1          T1's buffer space   T2          T2's buffer space   Database
                                                                A=5000
R(A);       A=5000                                              A=5000
            A=5000              R(A);       A=5000              A=5000
A=A-1000;   A=4000                          A=5000              A=5000
            A=4000              A=A+500;    A=5500              A=5000
W(A);       A=4000                          A=5500              A=4000
                                W(A);       A=5500              A=5500

Table 1
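The interleaving of Table 1 can be reproduced with a short Python sketch (purely
illustrative, not part of the original notes); it shows how the debit made by T1
is lost when T2 writes back its own buffered value:

# Sketch of the lost-update anomaly from Table 1.
database = {"A": 5000}

t1_buffer = database["A"]        # T1: R(A)
t2_buffer = database["A"]        # T2: R(A)
t1_buffer -= 1000                # T1: A = A - 1000  -> 4000 in T1's buffer
t2_buffer += 500                 # T2: A = A + 500   -> 5500 in T2's buffer
database["A"] = t1_buffer        # T1: W(A)          -> database A = 4000
database["A"] = t2_buffer        # T2: W(A)          -> database A = 5500

print(database["A"])   # 5500: the debit of 1000 made by T1 has been lost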

Isolation: The result of a transaction should not be visible to others before the
transaction is committed. For example, let us assume that A’s balance is Rs. 5000
and T1 debits Rs. 1000 from A. A’s new balance is 4000. If T2 credits Rs. 500 to
A’s new balance, A becomes 4500, and if T1 then fails, we have to roll back T2 as
well because it used a value produced by T1. So the results of a transaction are
not made visible to other transactions before it commits.

Durability: Once the database has committed a transaction, the changes made by
the transaction should be permanent. For example, if a person has credited
$500000 to his account, the bank cannot say that the update has been lost. To
avoid this problem, multiple copies of the database are stored at different
locations.

What is a Schedule?

A schedule is a series of operations from one or more transactions. A schedule
can be of two types:

• Serial Schedule: When one transaction completely executes before another
transaction starts, the schedule is called a serial schedule. A serial schedule
is always consistent. For example, if a schedule S has debit transaction T1 and
credit transaction T2, the possible serial schedules are T1 followed by T2
(T1 -> T2) or T2 followed by T1 (T2 -> T1). A serial schedule has low throughput
and poor resource utilization.

• Concurrent Schedule: When the operations of a transaction are interleaved with
the operations of other transactions in a schedule, the schedule is called a
concurrent schedule. For example, the schedule of the debit and credit
transactions shown in Table 1 is concurrent. But concurrency can lead to
inconsistency in the database, and the concurrent schedule above is indeed
inconsistent.

Question: Consider the following transaction involving two bank accounts x and
y:

1. read(x);

2. x := x – 50;

3. write(x);

4. read(y);
5. y := y + 50;

6. write(y);

The constraint that the sum of the accounts x and y should remain constant is
that of?

1. Atomicity

2. Consistency

3. Isolation

4. Durability
[GATE 2015]

Solution: As discussed under the properties of transactions, the consistency
property requires that the sum of accounts x and y remain the same before the
start and after the completion of the transaction. So the correct answer is
option 2, Consistency.

Advantages of Concurrency:

In general, concurrency means that more than one transaction can be in progress
on a system at the same time.

The advantages of a concurrent system are:

• Waiting Time: The time a process spends in the ready state before it gets the
CPU to execute. Concurrency leads to less waiting time.

• Response Time: The time taken to get the first response from the CPU.
Concurrency leads to less response time.

• Resource Utilization: The fraction of the system's resources that are actually
in use. Multiple transactions can run in parallel in a system, so concurrency
leads to higher resource utilization.

• Efficiency: The amount of output produced in comparison to the given input.
Concurrency leads to more efficiency.

Difference Between Serial Schedule and Serializable Schedule:

S.NO.  Serial Schedule                              Serializable Schedule

1      In a serial schedule, transactions are       In a serializable schedule, transactions
       executed one after the other.                are executed concurrently.

2      Serial schedules are less efficient.         Serializable schedules are more efficient.

3      In a serial schedule, only one transaction   In a serializable schedule, multiple
       executes at a time.                          transactions can execute at a time.

4      A serial schedule takes more time for        In a serializable schedule, execution
       execution.                                   is fast.

Buffer Management:

What is a Buffer Manager?

Consider the following points to understand the functioning of buffer management
in DBMS:
• A buffer manager in DBMS is in charge of allocating buffer space in main memory
so that temporary data can be stored there.

• If a user requests certain data and the data block is present in the database
buffer in main memory, the buffer manager returns the block's address.

• It is also responsible for allocating data blocks in the database buffer if the
requested blocks are not already there.

• In the absence of free space in the buffer, it evicts a few older blocks from
the database buffer to make space for the new data blocks.

• If the data blocks to be evicted have been updated recently, the changes are
copied/written to disk storage; otherwise they are simply removed from the
database buffer.

• If a user later requests one of these evicted blocks, the buffer manager
copies/reads the data block from disk storage into the database buffer and
returns the requested block's address in main memory.

• The buffer manager's internal workings are invisible to the programs that issue
requests against the disks and the database buffer; it acts much like a
virtual-memory manager in the system.

Methods

The buffer manager applies the following techniques to provide the database
system with the best possible service:

Buffer Replacement Strategy

If there is no space for a new data block in the database buffer, an existing block
must be removed from the buffer for the allocation of a new data block. Here,
the Least Recently Used (LRU) technique is used by several operating systems.
The least recently used data block is taken out of the buffer and sent back to the
disk. The term Buffer Replacement Strategy refers to this kind of replacement
technique.

Pinned Blocks

For a user to be able to restore data after a system crash or failure, it is
crucial to restrict when a block may be copied/written to disk storage. Most
recovery systems forbid writing a block to disk while an update on that block is
in progress. Data blocks that are restricted from being written back to disk are
called pinned blocks. Pinning gives a database the ability to prevent writing
data blocks while updates are being made, so that correct data is persisted after
all operations.

Forced Output of Blocks

Sometimes the changes made to a data block have to be copied/written back to disk
storage even if the space that the block occupies in the database buffer is not
needed. This is known as forced output of blocks. It is used because a system
failure can cause the data stored in the database buffer to be lost, whereas disk
storage is usually not affected by a system crash or failure.
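The three techniques above can be combined into a tiny buffer-manager sketch. The
following Python code is illustrative only (the BufferManager class and its
methods are assumptions made for this example, not a real DBMS interface): it
evicts the least recently used unpinned block, skips pinned blocks during
eviction, and supports forcing a dirty block out to disk.

from collections import OrderedDict

class BufferManager:
    """Toy buffer manager: LRU replacement, pinned blocks, forced output."""
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                       # block_id -> data on "disk"
        self.buffer = OrderedDict()            # block_id -> (data, dirty flag)
        self.pinned = set()

    def get(self, block_id):
        if block_id not in self.buffer:
            if len(self.buffer) >= self.capacity:
                self._evict_lru()
            self.buffer[block_id] = (self.disk[block_id], False)
        self.buffer.move_to_end(block_id)      # mark as most recently used
        return self.buffer[block_id][0]

    def update(self, block_id, data):
        self.get(block_id)
        self.buffer[block_id] = (data, True)   # dirty: must be written on eviction

    def pin(self, block_id):   self.pinned.add(block_id)
    def unpin(self, block_id): self.pinned.discard(block_id)

    def force_output(self, block_id):
        data, dirty = self.buffer[block_id]
        if dirty:
            self.disk[block_id] = data         # forced write-back to disk
            self.buffer[block_id] = (data, False)

    def _evict_lru(self):
        for victim in self.buffer:             # OrderedDict iterates LRU first
            if victim not in self.pinned:
                self.force_output(victim)      # write back if dirty, then drop
                del self.buffer[victim]
                return
        raise RuntimeError("all blocks pinned; nothing to evict")

# Usage: a two-block buffer over a three-block "disk".
bm = BufferManager(2, disk={"B1": "x", "B2": "y", "B3": "z"})
bm.update("B1", "x1"); bm.pin("B1")
bm.get("B2"); bm.get("B3")                 # B2 is evicted because B1 is pinned
print(bm.disk["B1"], "B2" in bm.buffer)    # 'x' False: B1 not yet forced to disk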

Failure with Loss of Nonvolatile Storage

The basic measure is to dump the entire contents of the database to stable
storage periodically.

One approach to dumping the database requires that no transaction be active
during the dumping procedure, and it uses a procedure similar to checkpointing:

a) Output all the log records currently present in main memory onto stable
storage.

b) Output all the buffer blocks onto the disk.

c) Copy all the data present in the database to stable storage.

d) Output a log record <dump> onto stable storage.


A dump of the database contents is also called the ‘archival dump’.

Most database systems support an ‘SQL dump’ as well, which writes out all the
SQL DDL statements and SQL insert statements into a file, which can then be re-
executed to recreate the database.
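A minimal sketch of the dump procedure above, written in Python for illustration
(the names log_buffer, buffer_blocks, and stable_storage are assumptions made for
this example, not part of any real system):

# Sketch of an archival dump: steps (a)-(d) from the procedure above.
log_buffer     = [("write item", "T1", "X", 10, 15)]   # log records in main memory
buffer_blocks  = {"B1": "page data"}                    # dirty buffer blocks
database       = {"X": 15, "Y": 100}
disk           = {}
stable_storage = {"log": [], "dump": None}

def archival_dump():
    stable_storage["log"].extend(log_buffer)        # (a) output the log records
    log_buffer.clear()
    disk.update(buffer_blocks)                      # (b) output the buffer blocks
    stable_storage["dump"] = dict(database)         # (c) copy the database contents
    stable_storage["log"].append(("dump",))         # (d) write a <dump> log record

archival_dump()
print(stable_storage["dump"], stable_storage["log"][-1])   # {'X': 15, 'Y': 100} ('dump',)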

Advanced Recovery Techniques:

There are other logging techniques that are more efficient than the methods
above. Some of them are listed below:

o Logical Undo Logging

o Operation Logging

o Transaction Rollback

o Crash Recovery

▪ Redo Phase

▪ Undo Phase

o Check pointing

o Fuzzy Check pointing

Logical Undo Logging

Logging and checkpoints work effectively for normal kinds of
executions. But when records are inserted into and deleted from a
B+ tree, there is a challenge: the B+ tree structure releases its
locks early, and those records may be locked by other transactions
as soon as the locks are released. Rolling back such record values
is therefore not easy with the techniques above; in other words,
physical undo is impossible. We need an alternative technique to
undo these insertions and deletions: logical undo rather than
physical undo. In the physical undo method, the log is inspected
for the previous value, and the record is restored to its old value
or the operation is re-executed.

In the logical undo method, a separate undo log is created along
with the log file. In the undo log, for every insertion operation
the corresponding deletion operation is recorded so the change can
be rolled back; similarly, for every deletion operation the
corresponding insertion operation is recorded. This method is
called logical undo logging. For example, suppose a transaction T1
executes X = X + 5. With physical logging we would have a record
like <T1, X, 10, 15>, indicating that the value of X changed from
10 to 15. In case of failure we know what the previous value of X
was, and we can easily restore X to 10. But this does not work for
B+ trees: we have to record how to undo the change, so a separate
logical undo entry records the undo of X = X + 5 as X = X - 5.

Suppose we have inserted a new entry for a student with ‘INSERT INTO
STUDENT VALUES (200, …..’. The logical undo file will contain the
undo operation for this as ‘DELETE FROM STUDENT WHERE STD_ID = 200’.

Redo for the transaction is done by following the physical log; we
do not maintain a logical log for redo. This is because the state
of the records may already have changed by the time the system is
recovered: other transactions may have executed in the meantime,
which would make a logical redo incorrect. Hence the physical log
itself is re-executed to redo the operations.
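The following short Python sketch (illustrative; the record layout and the
logical_undo function are assumptions for this example) contrasts the physical
log record with the logical undo for X = X + 5, and shows why the logical undo
still works after another transaction has changed X:

# Sketch: physical logging vs. logical undo for the operation X = X + 5.
state = {"X": 10}

physical_log = ("T1", "X", 10, 15)   # <T1, X, old value, new value>

def logical_undo(s):
    # Compensating action stored in the logical undo log: X = X - 5
    s["X"] -= 5

state["X"] += 5        # the original operation (X: 10 -> 15)
state["X"] += 100      # another transaction changes X before recovery (15 -> 115)

# Physical undo would wrongly reset X to 10 and wipe out the other change;
# the logical undo compensates only the +5.
logical_undo(state)
print(state["X"])      # 110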

Operation Logging

A transaction can involve multiple operations; for example, one
operation may update X and another may update Y.

When we maintain the logs for the transaction, we can record them
per operation, so that after a crash we have rollback information
for each operation. In this method, apart from the physical undo
and redo logs, we also keep logical undo logs. Each of them is
useful, and which one is used depends on when the crash occurred.

Let Ti be a transaction and Oj an operation in Ti, and let U be the
logical undo information. Operation logging for an operation in a
transaction is then done as follows:

• When an operation begins in the transaction, an operation log record <Ti, Oj,
operation_begin> is written. It indicates the beginning of the operation.

• While the operation executes, its logs are written just as in any other normal
logging method; they contain the physical undo and redo information.

• When the operation completes, the record <Ti, Oj, operation_end, U> is written.
It carries the logical undo information for reverting the changes.

Suppose we have to insert the values (X, Y) = (‘ABC’, 20) at index I5
(this is arbitrary; we could just as well think of it as inserting
values into a table). The operation log for this will be as follows:
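(The original figure with the log is not reproduced here; based on the old values
‘MNO’ and 100 and the logical undo mentioned below, the operation log would look
roughly like this:)

<T1, O1, operation_begin>
<T1, X, ‘MNO’, ‘ABC’>
<T1, Y, 100, 20>
<T1, O1, operation_end, (DELETE I5, ‘ABC’, 20)>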

When an abort or crash occurs in the system while the transaction is
executing:

• If it crashes before operation_end, the physical undo information is used to
revert the operation. The log does not yet contain operation_end, so the system
automatically takes the physical undo from the log; i.e., X and Y are reverted to
their old values ‘MNO’ and 100 respectively.

• If the system crashes after operation_end, the physical undo is ignored and the
logical undo is used to revert the changes; i.e., DELETE (I5, ‘ABC’, 20) is
applied, which deletes the newly added entry from index I5.

• In both cases, the physical redo information is used to re-execute the changes;
i.e., the values of X and Y are updated to (‘ABC’, 20).

Transaction Rollback

When a system crashes while performing a transaction, the log
entries are used to recover from the failure. We know that the logs
contain the information needed to roll back or re-execute
operations. Whenever there is a failure, new log entries are
appended to perform the undo and redo using the information already
recorded; i.e., if <T1, X, ‘MNO’, ‘ABC’> has to be undone, then
another record <T1, X, ‘MNO’> is written to the log after the crash.

Whenever there is a crash and the system recovers by rolling back,
it scans the logs in reverse order and writes log entries as
follows:

• If there is a log entry <Ti, variable, old_value, new_value>, then an undo
record <Ti, variable, old_value> is written. Such an undo record is known as a
redo-only log record; if recovery later finds a redo-only record, it ignores it.

• If <Ti, Oj, operation_end, U> is found while traversing the log, the operation
is rolled back using its logical undo, U. This logical undo is itself logged like
a normal operation execution, but at the end, instead of <Ti, Oj, operation_end,
U>, the record <Ti, Oj, operation_abort> is written. All records up to <Ti, Oj,
operation_begin> are then skipped; i.e., the logical undo is performed like any
other operation and its logs are written to the log file, while all the physical
undo records of that operation are ignored.
Let us consider a transaction T1 with two operations O1 and O2,
where O1 completed fully and the system crashed while O2 was being
performed. During recovery, the log is scanned in reverse from the
point of failure, and recovery log entries are written. The scan
first finds only the entry <T1, Z, ‘abc’, ‘xyz’> for O2, so the
redo-only entry <T1, Z, ‘abc’> is written. It then finds the
operation end record of O1, so it uses the logical undo to roll
back the changes made by O1. Since the logical undo is a ‘DELETE’,
it writes the redo logs for performing the DELETE, and these in
turn delete the changes made by operation O1. It then traverses
back over the physical redo records of O1 without executing them
(they are ignored) until it reaches <T1, Start>, and stops. Finally
it adds <T1, Abort> to the log file to indicate that transaction T1
has been completely rolled back. This can be seen in the resulting
log file: after the logical undo of O1 there are no further physical
undo or redo records, and the log jumps to the abort entries.
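A compact Python sketch of these rollback rules (illustrative only; the
tuple-based record layout is an assumption made for this example). It scans one
transaction's log in reverse, writes a redo-only record for each plain physical
write, applies the logical undo when it meets an operation_end record, and skips
the physical records of that operation:

# Sketch: rolling back one transaction by scanning its log in reverse.
def rollback(log, state, apply_logical_undo):
    skip_until_begin = False
    for rec in reversed(list(log)):          # reverse scan over a snapshot of the log
        kind = rec[0]
        if skip_until_begin:
            if kind == "operation_begin":
                skip_until_begin = False     # physical records of that operation skipped
            continue
        if kind == "write":                  # ("write", T, X, old, new): physical undo
            _, t, x, old, new = rec
            state[x] = old
            log.append(("redo_only", t, x, old))
        elif kind == "operation_end":        # ("operation_end", T, Oj, U): logical undo
            _, t, oj, undo = rec
            apply_logical_undo(undo, state)
            log.append(("operation_abort", t, oj))
            skip_until_begin = True
        elif kind == "start":
            log.append(("abort", rec[1]))    # rollback of the transaction is complete
            break

# Example: O1 completed (logical undo = "delete I5"); the crash interrupted O2.
state = {"Z": "xyz", "I5": ("ABC", 20)}
log = [("start", "T1"),
       ("operation_begin", "T1", "O1"),
       ("write", "T1", "X", "MNO", "ABC"),
       ("operation_end", "T1", "O1", "delete I5"),
       ("operation_begin", "T1", "O2"),
       ("write", "T1", "Z", "abc", "xyz")]   # no operation_end for O2: crash point

rollback(log, state, lambda undo, s: s.pop("I5", None))
print(state)   # {'Z': 'abc'}: Z restored physically, the I5 entry removed logically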

Crash Recovery

Whenever there is a system crash, the transactions that were in the
execution phase have to be recovered and the database has to be
brought to a consistent state. The log files are checked in order to
perform the redo and undo operations. Crash recovery has two phases.

Redo Phase

Although transactions and operations are rolled back in the reverse
order of the log entries, the recovery system builds its lists for
undoing and redoing operations by scanning the log from the last
checkpoint to the end of the file.

That is, the undo/redo logs contain the operations and how to
execute them, and they are recorded in the log file itself. A
separate list is maintained of the transactions/operations that need
to be undone during recovery. This list is created by scanning the
log from the last checkpoint to the end of the file (forward
direction). While creating the undo list, all operations that are
not part of the undo list are redone.

While performing the forward scan to create the undo list L, the system checks
each record:

• If <Ti, Start> is found, Ti is added to the undo list L.

• If <Ti, Commit> or <Ti, Abort> is found, Ti is removed from the undo list L.

Hence the undo list contains all the transactions that were only
partially performed, and all other committed transactions are redone.
(Redoing a transaction is not exactly the same as re-executing it;
this forward scan assumes that those transactions have already been
performed and committed, and it lists only the transactions that are
not yet committed and are in a partial execution state.)

Undo Phase

In this phase, the log files are scanned backward for the
transactions in the undo list. The transactions are undone as
described under transaction rollback: for each operation, the system
checks for its end log record; if found, it performs a logical undo,
otherwise a physical undo, writing the corresponding records into
the log files.

This is how a transaction is redone and undone to maintain the
consistency and atomicity of transactions.
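Below is a small Python sketch of the two phases (an illustration under assumed
record formats, not the exact algorithm of any particular DBMS). One common
variant redoes all logged writes in the forward pass while building the undo
list, and then undoes the writes of the transactions left in that list in the
backward pass:

# Sketch: two-phase crash recovery from the records after the last checkpoint.
def crash_recover(log, state):
    # Redo phase: forward scan; redo the writes and build the undo list.
    undo_list = set()
    for rec in log:
        if rec[0] == "start":
            undo_list.add(rec[1])
        elif rec[0] in ("commit", "abort"):
            undo_list.discard(rec[1])
        elif rec[0] == "write":                    # ("write", T, X, old, new)
            state[rec[2]] = rec[4]                 # re-apply the new value
    # Undo phase: backward scan; undo the writes of transactions still in the list.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] in undo_list:
            state[rec[2]] = rec[3]                 # restore the old value
    return state

log = [("start", "T1"), ("write", "T1", "A", 5000, 4000), ("commit", "T1"),
       ("start", "T2"), ("write", "T2", "B", 2000, 2500)]   # T2 never committed
print(crash_recover(log, {"A": 5000, "B": 2000}))            # {'A': 4000, 'B': 2000}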

Check pointing

Checkpointing is the mechanism of marking the entries in the log
file whose changes have been permanently written to the database, so
that after a failure the log does not have to be traversed beyond
that point. Only the entries after the checkpoint may not yet have
been written to the database and may have to be redone or undone.
Checkpointing is done at periodic intervals or as per a schedule.
The system takes the log records written since the last checkpoint
and outputs them to stable storage / disk. If the log buffer blocks
in main memory become full, the logs are output to disk. Whenever a
new checkpoint is defined, all the entries from the last checkpoint
up to the new checkpoint are written to disk. No transaction is
allowed to execute while this checkpointing process is in progress.

Fuzzy Check pointing

Fuzzy checkpointing, in contrast to normal checkpointing, allows
transactions to execute while the logs are being copied to disk.
Fuzzy checkpointing follows these steps:

• Temporarily stop the transactions in order to note the blocks that have to be
copied.

• Mark the new checkpoint L in the log.

• Create a list M of all the log records between the last checkpoint and the new
checkpoint, i.e., M is the list of log records that are yet to be written to
disk.

• Once M has been listed, allow the transactions to execute again. The
transactions now write their logs after the new checkpoint L; they must not write
into the blocks that are in M or that belong to old checkpoints.

• The buffer blocks in list M are written to disk / stable storage. No
transaction may update these blocks. In addition, all the log records for these
blocks in list M are written to disk first, and only then is the block itself
written to disk.

• A pointer to the last checkpoint, last_checkpoint, is kept at a fixed location
on disk. This helps in reading the blocks for the next update and in maintaining
the new list M.
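A rough Python sketch of these steps (purely illustrative; the data structures
are assumptions made for this example), showing the brief pause, the list M, and
the advancing of the last_checkpoint pointer only after M has been flushed:

# Sketch: fuzzy checkpointing with a brief pause and deferred flushing of list M.
log = [("write", "T1", "X", 1, 2)]        # in-memory log with one earlier record
dirty_blocks = {"B7", "B9"}               # buffer blocks modified since the last checkpoint
disk = {"last_checkpoint": 0}
transactions_paused = False

def fuzzy_checkpoint():
    global transactions_paused
    transactions_paused = True                    # 1. briefly stop the transactions
    checkpoint_pos = len(log)
    log.append(("checkpoint", checkpoint_pos))    # 2. mark the new checkpoint L in the log
    M = set(dirty_blocks)                         # 3. list M: blocks still to be flushed
    dirty_blocks.clear()
    transactions_paused = False                   # 4. transactions may execute again
    for block in M:                               # 5. flush every block in M to disk
        disk[block] = "flushed"
    disk["last_checkpoint"] = checkpoint_pos      # 6. advance the pointer only after M is on disk

fuzzy_checkpoint()
print(disk)   # e.g. {'last_checkpoint': 1, 'B7': 'flushed', 'B9': 'flushed'}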

Whenever there is recovery from a failure, the logs are read
starting from the last_checkpoint stored in the database. Logs
before last_checkpoint have already been applied to the database,
and only those after this point have to be processed. These logs are
recovered as described in the methods above.

Remote Backup

Remote backup provides a sense of security in case the primary
location where the database is located gets destroyed. A remote
backup can be offline or real-time (online). An offline backup is
maintained manually.

Online backup systems are more real-time and are lifesavers for
database administrators and investors. An online backup system is a
mechanism where every bit of real-time data is backed up
simultaneously at two distant places: one copy is directly connected
to the system, and the other is kept at a remote place as a backup.

As soon as the primary database storage fails, the backup system
senses the failure and switches the user system to the remote
storage. Sometimes this is so instantaneous that users do not even
notice a failure.

Remote backup systems provide a wide range of availability, allowing
transaction processing to continue even if the primary site is
destroyed by a fire, flood, or earthquake.

Data and log records from the primary site are continuously backed
up to the remote backup site.

One can achieve wide-range availability of data by performing
transaction processing at one site, called the primary site, and
having a remote backup site where all the data from the primary site
are duplicated. The remote site is also called the secondary site.

The remote site must be kept synchronized with the primary site as
updates are performed at the primary.

In designing a remote backup system, the following points are
important:

a) Detection of failure: It is important for the remote backup system
to detect when the primary site has failed.

b) Transfer of control: When the primary site fails, the backup site
takes over processing and becomes the new primary site.

c) Time to recover: If the log at the remote backup becomes large,
recovery will take a long time.

d) Time to commit: To ensure that the updates of a committed
transaction are durable, a transaction should not be announced as
committed until its log records have reached the backup site.
