Information Management System 2
Storage Management
UNIT 1
CH: 1 Introduction to Information Storage
What is Data?
"Collection of raw facts from which conclusions can be drawn"
Types of Data:
• structured
• unstructured
Factors that contributed to the growth of digital data:
• Increase in data processing capabilities.
• New and cheaper peripherals.
• Lower cost and increased speed of storage.
• Affordable and faster networks.
Big Data
• It refers to data whose size is beyond the capabilities of normal software tools for capturing, storing, analysing and processing within an acceptable time limit.
• These types of data require real-time updates for analysing, predicting and decision making.
Information: it is the intelligence and knowledge derived from data.
Storage: data that is created must be stored so that it is easily accessible for processing. Devices designed to store data are called storage devices, or simply storage.
Evolution of Storage Architecture:
server-centric vs information-centric
[Figure: server-centric architecture vs information-centric architecture]
CH: 2 RAID: Redundant Array of Independent Disks
Implementation Methods
Software RAID
It is a method of managing multiple disks using software instead of dedicated hardware.
Advantages:
• cost effective
• easy setup
• no hardware needed
• managed by OS
Disadvantages:
• slow performance
• doesn't support all RAID levels
• due to OS dependency, upgrading is hard
Hardware RAID
It is a method of managing multiple disks using a dedicated hardware controller.
Types of hardware RAID:
• controller card RAID: the RAID controller is built into the computer.
• external RAID controller: a separate RAID controller device manages the disks and presents them to the computer as a single storage unit.
Advantages:
• better performance
• independent of OS
• handles disk failures without losing any data
Functions:
• manage and control a collection of disks
• protect data in case of disk failures
• translate I/O requests between logical and physical disks
RAID Techniques
Striping
• it is a method that splits data into small parts and spreads them across multiple disks.
• it allows multiple disks to work simultaneously, increasing speed and performance.
Some terms
• strip: a fixed-size block of data on each disk
• stripe: a full set of strips on all disks
• strip size (stripe depth): size of each strip
• stripe size: the total size of the stripe [ strip size x no. of disks ]
• stripe width: no. of disks used for striping
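The terms above can be tied together with a small sketch. It assumes simple round-robin placement of strips (strip 0 on disk 0, strip 1 on disk 1, and so on), and the function names `stripe_size` and `locate_strip` are made up for illustration.

```python
def stripe_size(strip_size_kb: int, stripe_width: int) -> int:
    """Stripe size = strip size x number of disks used for striping."""
    return strip_size_kb * stripe_width


def locate_strip(strip_index: int, stripe_width: int) -> tuple[int, int]:
    """Map a logical strip number to (disk number, stripe number).

    Assumption: strips are placed round-robin across the disks.
    """
    disk = strip_index % stripe_width
    stripe = strip_index // stripe_width
    return disk, stripe


if __name__ == "__main__":
    # 4 disks with a 64 KB strip on each disk give a 256 KB stripe.
    print(stripe_size(64, 4))
    # Show where the first six strips land on a 4-disk set.
    for i in range(6):
        print(i, locate_strip(i, 4))
```

Consecutive strips land on different disks, which is why several disks can serve one large request at the same time.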
Advantages:
• read & write speed is fast
• better performance
Disadvantages:
• No data protection [data is lost in case of disk failures]
Mirroring
• It is the method of storing the same data on two different disks.
• if one disk fails, the data is still safe on the other disk.
How it works
• Data is written to both disks at the same time.
• If one disk fails, the other disk takes over.
• Once the failed disk is replaced, it is rewritten with the data from the other disk.
• This happens automatically, so there are no system performance issues.
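The steps above can be sketched as a toy model. This is only an illustration: each "disk" is a Python dictionary, and the class `MirroredPair` and its method names are hypothetical, not a real controller interface.

```python
class MirroredPair:
    """Toy model of two mirrored disks; each disk maps block number -> data."""

    def __init__(self):
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Every write goes to both disks (skipping a disk that has failed).
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block):
        # Serve the read from the first healthy disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both disks have failed")

    def fail(self, i):
        # Simulate a disk failure: its contents are gone.
        self.failed[i] = True
        self.disks[i] = {}

    def replace_and_rebuild(self, i):
        # Copy everything from the surviving disk onto the replacement.
        if self.failed[1 - i]:
            raise IOError("no surviving disk to rebuild from")
        self.disks[i] = dict(self.disks[1 - i])
        self.failed[i] = False
```

After `fail(0)`, reads and writes carry on using disk 1, and `replace_and_rebuild(0)` brings the two copies back in sync.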
Advantages
• Data is protected
• fast recovery
• Better read performance
Disadvantages
• Expensive
• lower write performance
• not a backup solution
Parity
• It is a method of protecting striped data from disk failure without duplicating the data.
• there is a separate parity disk that stores extra information that helps to rebuild the lost data.
How it works
• The data is split and stored across multiple disks.
• The parity information of the data is created and stored on the extra disk.
• In case of a disk failure, the parity information and the remaining data are used to rebuild the lost data.
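A minimal sketch of the rebuild idea follows. It assumes the parity is a byte-wise XOR of the data strips, which is the usual choice but is not stated in these notes. The helper names `xor_strips` and `rebuild_missing` are made up for illustration.

```python
from functools import reduce


def xor_strips(strips):
    """Byte-wise XOR of equally sized strips (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild_missing(surviving_strips, parity):
    """Recover the one lost strip from the surviving strips and the parity."""
    return xor_strips(list(surviving_strips) + [parity])


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]   # strips on three data disks
    parity = xor_strips(data)            # stored on the extra parity disk
    # Pretend the second data disk failed and rebuild its strip.
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

XOR-ing the parity with the surviving strips cancels them out and leaves the missing strip. This recovers only one lost disk per parity strip.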
Advantages
• saves storage
• data protection
• cost effective
Disadvantages
• slow performance
• lower write performance
Types of RAID
RAID 0
• It splits data into small parts and spreads them across multiple disks.
• RAID technique: striping
• min disks: 2
• storage efficiency: 100%
• cost: low
• good performance for random & sequential reads.
• good performance for writing data.
• write penalty: none
RAID 1
• it stores the same data on two disks.
• RAID technique: mirroring
• min disks: 2
• storage efficiency: 50%
• cost: high
• good performance for random & sequential reads.
• poor performance for writing data.
• write penalty: moderate
Nested RAID
• this uses both striping and mirroring to get the performance of RAID 0 and the redundancy benefits of RAID 1.
• this requires an even no. of disks
• there are 2 types of nested RAID:
RAID 0+1
• here the data is striped first and then the striped data is mirrored.
• this is also called mirrored stripe.
RAID 1+0
• here the data is first mirrored and then both copies of the data are striped across the HDDs in the RAID set.
• this is also called striped mirror
• RAID technique: mirroring and striping
• min disks: 4
• storage efficiency: 50%
• cost: high
• good performance for reading data.
• good performance for writing data.
• write penalty: moderate
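To make the RAID 1+0 layout concrete, the sketch below shows which two disks hold a given strip. It assumes that disks (0, 1), (2, 3), ... form the mirrored pairs and that strips are placed round-robin across the pairs. Real arrays may number the disks differently, and the function name `raid10_disks` is made up for illustration.

```python
def raid10_disks(strip_index: int, num_disks: int) -> tuple[int, int]:
    """Return the two disks that hold a strip in a RAID 1+0 set.

    Assumed layout: disks (0,1), (2,3), ... are mirrored pairs and
    strips are spread round-robin across the pairs.
    """
    if num_disks < 4 or num_disks % 2 != 0:
        raise ValueError("RAID 1+0 needs an even number of disks, at least 4")
    pairs = num_disks // 2
    pair = strip_index % pairs
    return 2 * pair, 2 * pair + 1


if __name__ == "__main__":
    # With 4 disks, strips alternate between pair (0,1) and pair (2,3).
    for i in range(4):
        print(i, raid10_disks(i, 4))
```

Every strip is held on two disks, so only half of the raw capacity holds unique data, which matches the 50% storage efficiency listed above.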
RAID 3
• data is split into small parts and spread across multiple disks, with one extra disk storing the parity information of the striped data
• RAID technique: parity protection for single disk failure.
• min disks: 3
• storage efficiency: [ ( n-1 ) / n ] x 100%
• cost: moderate
• read performance: fair for random & good for sequential reads.
• write performance: fair for small random & fair for large sequential writes.
• write penalty: high
RAID 4
• it is the same as RAID 3, except that it stripes data block-wise instead of byte-wise, so a read or write of one block can be served by one disk independently of the others.
• RAID technique: parity protection for single disk failure.
• min disks: 3
• storage efficiency: [ ( n-1 ) / n ] x 100%
• cost: moderate
• read performance: good for random & sequential reads.
• write performance: fair for random & sequential writes.
• write penalty: high
RAID 5
• It is similar to RAID 4; the only difference is how the parity is stored.
• Here the parity information of the data is also split and spread across the disks.
• RAID technique: parity protection for single disk failure.
• min disks: 3
• storage efficiency: [ ( n-1 ) / n ] x 100%
• cost: moderate
• read performance: good for random & sequential reads.
• write performance: fair for random & sequential writes.
• write penalty: high
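The difference from RAID 4 is easier to see as a layout. The sketch below prints which disk holds the parity strip in each stripe. It assumes the parity starts on the last disk and moves one disk to the left per stripe, and real controllers may rotate differently. The function name `raid5_layout` is made up for illustration.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe: 'P' is the parity strip, 'D<n>' a data strip.

    Assumed rotation: parity starts on the last disk and moves one
    disk to the left with every stripe.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    rows = []
    next_data = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{next_data}")
                next_data += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(4, 4):
        print(" ".join(f"{cell:>3}" for cell in row))
```

The parity strips are spread over all disks, so no single disk has to handle every parity update as the dedicated parity disk does in RAID 4.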
RAID 6
• It is similar to RAID 5; the only difference is that it uses a second parity element.
• RAID technique: parity protection for two disk failures.
• min disks: 4
• storage efficiency: [ ( n-2 ) / n ] x 100%
• cost: moderate but more than RAID 5
• read performance: good for random & sequential reads.
• write performance: fair for random & sequential writes.
• write penalty: very high
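The storage efficiency figures listed for each RAID level can be collected into one small calculator. This is a sketch using only the formulas and minimum disk counts given in these notes. The function name `storage_efficiency` and the level labels are made up for illustration.

```python
def storage_efficiency(level: str, n: int) -> float:
    """Fraction of raw capacity available for data in an n-disk RAID set."""
    min_disks = {"0": 2, "1": 2, "1+0": 4, "3": 3, "4": 3, "5": 3, "6": 4}
    if level not in min_disks:
        raise ValueError(f"unknown RAID level: {level}")
    if n < min_disks[level]:
        raise ValueError(f"RAID {level} needs at least {min_disks[level]} disks")
    if level == "0":
        return 1.0                 # striping only, no redundancy
    if level in ("1", "1+0"):
        return 0.5                 # every block is stored twice
    if level in ("3", "4", "5"):
        return (n - 1) / n         # one disk's worth of parity
    return (n - 2) / n             # RAID 6: two disks' worth of parity


if __name__ == "__main__":
    for level, n in [("0", 4), ("1+0", 4), ("5", 5), ("6", 6)]:
        print(f"RAID {level} with {n} disks: {storage_efficiency(level, n):.0%}")
```

For example, five 1 TB disks in RAID 5 give (5-1)/5 = 80% of 5 TB, so 4 TB of usable space.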
Hot spares
• It is a spare HDD in a RAID array that temporarily replaces a failed HDD of a RAID set.
• There are 2 ways the data of the failed HDD is recovered:
  • if parity was used, the hot spare is rebuilt from the parity information and the data on the surviving disks.
  • if mirroring was used, the hot spare gets the data from the surviving mirror of the failed disk.
• One approach: when an HDD fails, it is permanently replaced by the hot spare, so another hot spare has to be configured for the array.
• Another approach: after the failure of an HDD, the hot spare replaces it and recovers the data; when a new HDD is added to the array, all the data on the hot spare is copied to the new HDD and the hot spare goes back to its idle state.
• The hot spare must be large enough to accommodate the data of the failed HDD.