TXE0780
HDS Architect – Business Continuity 1 / Certification Preparation Course
Products Used in this Course
In-System Replication:
• Hitachi ShadowImage® Heterogeneous Replication software
• Hitachi Copy-on-Write Snapshot software
INTRODUCTION
Welcome and Introductions
Course Description
Course Prerequisites
Exercise: What Would You Like To Get Out of This Course?
Course Objectives
Course Topics
Learning Paths
GLOSSARY
EVALUATING THIS COURSE
• Introductions
– Name
– Position
– Professional skills
– Expectations from the course
Course Description
• This six-hour virtual instructor-led course prepares the Hitachi Data
Systems Certified Professional to take the Architect – Business Continuity
Exam (HH0-400) by refreshing existing knowledge of business continuity
architecture concepts and Hitachi replication solutions. The workload
analysis exercise also reinforces workload sizing for replication
environments.
• The following skills and knowledge are validated upon successful
completion of the certification exam:
– The ability to examine data and information requirements from a business
perspective and respond with solutions, defined as a hardware and software
architecture meeting the requirements
– In a sales engineering role, preparing technical architecture implementation
strategies and plans and applying new technologies
– In an architect or storage engineering role, preparing detailed implementation
plans in association with and for execution by implementation specialists
Course Prerequisites
Course Objectives
Course Topics
• Session 1 – Day 1
– Introduction
– Understanding Business and Recovery Requirements
– Replication Strategies
– Technology Recommendations
• Session 2 – Day 2
– Workload Analysis
– Connectivity Requirements
– Configuration Sizing
Learning Paths
• RPO − Worst-case time between the last backup and the interruption
– It is based on a risk tolerance discussion with the customer
• Business impact of missing data
• Tolerance varies according to cost
[Figure: disaster timeline showing the RPO window before the disaster]
• RTO − How long is the customer willing to live with down systems?
– RTO is outage duration; RPO is how much data must be recovered
– Time includes recovery of multiple components at the secondary site
[Figure: disaster timeline showing RPO before the disaster and RTO after it]
• Scale
– Site-wide Disaster – Large scale outage that impacts operations at an entire
facility, such as fire, hurricane, earthquake, terrorism
– Point Disaster – Single event outage that occurs at a single, readily
identifiable point in time, such as administrator error, viruses, isolated
hardware failure
• Time
– Immediate Disaster – Distinct event that affects all components at the same
time, such as a meteor
– Rolling Disaster − Several components failing at different points in time,
such as an AC or power failure where a server fails, then storage, then the
network, and eventually the whole site
Point-in-Time copies and real-time copies are discussed in the next module.
Regulatory Requirements
Recovery Techniques
Activity
Who are some of the key stakeholders you typically talk to?
Is there only one set of correct stakeholders?
The order is important: top to bottom, or bottom to top.
Data Gathering
[Figure: one server writing to two consistency groups]
Discussion of crash/recovery
• Balance cost versus benefit: delve into what customers really want and
what they are willing to pay for
• Identify the customer's solution flexibility: is there only one answer, or
are there a variety of solutions that might work, with relative costs?
• Document and verify all collected information with the entire customer team
At this point, we have covered the Assess and Discovery phases and are moving
into the Design phase of the methodology.
PIT Copies
• Definition
– Volume image that contains data from a specific point in time as opposed to a
volume replica that is continuously updated
– It is used to mitigate risks of logical corruption
• Products include:
– ShadowImage Replication software
– Copy-on-Write Snapshot software
• Point-in-Time Copy Management
– Schedule application freeze to ensure consistent image on disk
– Snap PIT copy (PAIR to PSUS)
– Resume application
• Automation tools include (see the sketch below):
– Hitachi Command Control Interface (CCI) for copy management
– Host scripting for application integration with a freeze/thaw mechanism
– Hitachi Protection Manager software
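To make the freeze/split/resume sequence concrete, here is a minimal Python sketch of how a host script might drive CCI. It assumes a configured HORCM instance with the CCI commands (pairsplit, pairevtwait) on the path; the group name and the freeze_app/thaw_app hooks are hypothetical stand-ins for whatever freeze/thaw mechanism the application provides.

```python
# Hypothetical host-scripting sketch of the freeze -> split -> thaw sequence.
import subprocess

def freeze_app():
    pass  # placeholder: quiesce the application so the on-disk image is consistent

def thaw_app():
    pass  # placeholder: resume normal application writes

def run(cmd):
    subprocess.run(cmd, shell=True, check=True)  # fail loudly if CCI reports an error

def take_pit_copy(group="SIGRP01"):  # hypothetical copy group name
    freeze_app()
    try:
        run(f"pairsplit -g {group}")                   # snap the PIT copy: PAIR -> PSUS
        run(f"pairevtwait -g {group} -s psus -t 300")  # wait for the split to settle
    finally:
        thaw_app()                                     # resume even if the split fails
```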
• Split-second architecture
• Benefits
– Reduce impact to production hosts
– Centralized copy management
– Quick disk-based recovery
– Replicated data available to other hosts for backup/reporting
• Products under Load
– ShadowImage Replication Software
• Imposes load while pairs are in PAIR status
– Copy-on-Write Snapshot Software
• Imposes load while pairs are suspended
• Resync Times
– ShadowImage Software
• Quick resync
– Near instantaneous, but carries a system performance penalty
• Normal resync
– Time depends on the amount of differential data
– Copy-on-Write Software
• Instantaneous resync
• Exercise – When would you use each product, and can you predict its behavior
given certain workload characteristics?
– ShadowImage Software
• Used when recovery considerations exceed cost considerations
• Provides fastest recovery with minimal overhead to production
applications
– Copy-on-Write Software
• Used when storage cost considerations exceed recovery considerations
• Performance overhead may impact production applications
• Recovery time is a function of pool consumption plus initial overhead
(recovery time may be lengthy)
Reference Architectures
Site-wide instant disaster: Low RPO (0 for sync, minutes for Universal
Replicator)
Logical corruption (L3): Point-in-Time recovery images are schedulable
Testing implication: No L3 recovery during Disaster Recovery (DR) testing
Manageability ranking: 1
A continuous line arrow indicates real-time; a dotted arrow indicates
Point-in-Time (PIT).
• 3 Data Center (3DC)
– Delta resync
– Pass-thru
Failback
Scenarios
Exercise
Break the class into two groups; each group will work together to link
replication architecture strategies to one of the scenarios.
Product Comparisons
"Best" applications: Customer has two locations very close together with dark
fiber or DWDM (within 100 miles [160 km]). Actual distance between systems
depends upon application response time sensitivity.
Host write response time (latency): Expect double the current host response
time plus delay due to distance. Distance delay is additive and can be
calculated as a minimum of 1 ms per 62 miles (100 km) of circuit length, each
direction (times two).
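As a worked illustration of the latency rule above, the sketch below applies the 1 ms per 100 km (each direction) minimum to a hypothetical workload; the function name and input values are illustrative only.

```python
# Rough synchronous-write estimate from the rule above: host response time
# roughly doubles, plus a minimum of 1 ms per 100 km of circuit, each direction.
def sync_write_estimate_ms(local_response_ms, circuit_km):
    distance_delay_ms = 2 * (circuit_km / 100.0)  # round trip at 1 ms per 100 km
    return 2 * local_response_ms + distance_delay_ms

# Hypothetical: 5 ms local writes over a 160 km (100 mile) circuit
print(sync_write_estimate_ms(5.0, 160))  # -> 13.2 ms
```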
Note: TrueCopy Extended Distance software will not be covered in the exam.
• There are only a few cases where this configuration is useful. Universal Replicator
software and TrueCopy software are our remote replication products and should
be positioned as such.
• ShadowImage software in PAIR or COPY state does not provide data
consistency on the S-VOLs (even when using ShadowImage consistency
groups!). The resting state must be PSUS.
• If the externalized volume has been used in a LUSE or carved into a custom
volume on the Universal Storage Platform/Universal Storage Platform V, its data
cannot be accessed directly from the externalized array because the structural
information defining those custom volumes is contained on the Universal Storage
Platform/Universal Storage Platform V itself.
• Mode 459 should be set, which holds PSUS until all differential data has been
transferred out of cache and onto the S-VOL.
• If you intend to mount the S-VOL at the remote site, you must perform a manual
GUI-based Disconnect operation on the external volumes, and you must include
Check Path and Restore operations in the procedure.
• After mounting the S-VOL from the recovery site, a full initial copy will be required
to resynchronize the data.
Channel Extension
[Figure: channel extension options – channel extenders between SANs, channel
extenders attached directly to storage, and a DWDM multiplexer over fiber]
Buffer Credits
The distance is a total of 40 km round trip (20 km in each direction). At 1 Gb/s
a full frame occupies about 4 km of fiber, so you need 40 km / 4 km = 10 buffer
credits. If you double the speed, you halve the length of the frame on the wire:
at 2 Gb/s the frame is about 2 km long, so 40 km / 2 km per frame = 20 buffer
credits.
• How many buffer credits do you need to fill a 2Gb/s link between two sites
that are 20 km (12.4 miles) apart? Show your math.
• Answer
– 20 km x 2 (round trip) = 40 km / 2 km (frame length at 2Gb/s) =
20 buffer credits
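The same arithmetic can be scripted. This sketch assumes the ~2 km frame length at 2 Gb/s used above (halve the link speed, double the frame length on the wire):

```python
# Buffer credits must cover every frame in flight over the round trip.
def buffer_credits(one_way_km, frame_km=2.0):  # frame_km ~2 km at 2 Gb/s
    return (2 * one_way_km) / frame_km

print(buffer_credits(20))                 # 2 Gb/s, 20 km one way -> 20.0 credits
print(buffer_credits(20, frame_km=4.0))   # 1 Gb/s frames are ~4 km -> 10.0 credits
```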
Exercise
• Given only the following information, which replication product would you
recommend?
– Case 1:
• Two sites: Dallas and Fort Worth, 49 km
• Uses DWDM
– Case 2:
• Two sites
• Chooses telecom savings over recovery requirements
• For Mainframe
– Excalibur — Internal Hitachi Data Systems sizing tool (requires training from
ATC Americas and a collector script); analyzes a single interval
• No data collection tools need to be installed to collect Resource
Measurement Facility (RMF) data
– SAS — Can be used for data analysis, no standard kit available
– RMF Magic — Third party analysis tool, no Hitachi Data Systems license
agreement
• Open Systems
– RCEA — from Hitachi Data Systems Tools Competency Center (TCC)
(requires common data collection scripts)
– Microsoft Excel (manual)
Workload Study
Block Size
• ROT (rule of thumb): If you don't have enough information to calculate block
size, use 8k for open systems
Formulas
• When applying these formulas, include additional considerations for growth over
time, compensation for limited data sets, redundancy requirements, and more.
* For Universal Replicator software, you may want to consider bandwidth outage duration
** maxRollingAverage for TCE is based on cycle time (half the RPO)
*** maxRollingAverage for all other formulas is based on RPO
The “rated speed” values are obtained from the RSD sizing guidelines.
“Write MB/sec” can be obtained from a variety of sources. The most common is
from a host’s performance utilities like iostat or Performance Monitor. If a Hitachi
storage system is already in place, TMEA is a good option. Hitachi Tuning Manager
software can also be used. In the case of native host-based data collectors, the
resulting data may not be formatted as “MB/sec” but simple calculations can be
used to massage the data into the proper format.
Peak workload = max(Write MB/sec)
Peak workload is useful for sizing replication when you do not have a substantial
buffering capability. To maintain a pair status, TrueCopy Synchronous will be sized
to peak. Universal Replicator may be sized to peak if the customer’s RPO is very low.
There are cases when you need to know how much write traffic has occurred. For
example, if you are going to suspend a TrueCopy pair, the resynchronization
process will transfer that information to the remote storage system. To estimate
resync duration, you will need to know how much work needs to be performed.
Now that we know how to calculate write volume for a single interval, here’s how to
calculate for a range.
Rolling Average
The rolling average is used to quickly characterize intervals that are longer than the
data collection interval. The actual data collection metrics are averages of activity
during that interval. A rolling average is similar, but allows the analyst to see how
the “window” of average workload moves over time. The “wider” the rolling
average, that is, the more intervals that are included in the average, the closer the
curve will move towards the absolute average workload.
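A minimal sketch of that computation, assuming the collector has already produced a list of per-interval write MB/sec averages (the sample values are hypothetical):

```python
# Rolling average over a window of collection intervals: the wider the window,
# the closer the peaks sink toward the absolute average.
def rolling_average(samples, window):
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]

writes = [12.0, 45.0, 38.0, 9.0, 11.0, 52.0]  # hypothetical write MB/sec samples
print(max(rolling_average(writes, 3)))        # peak 3-interval rolling average: ~31.7
```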
• For each interval, sum the workload values for all hosts
• Align the interval boundaries by time and date
• Avoid “averages of averages”
• When working with clusters it is possible some hosts will report zero
activity and suddenly begin showing active workload due to cluster failover
events.
For multiple hosts, or volumes within a host, add together the workload values
over a given interval.
Align the interval boundaries as much as possible to avoid skewing the results.
Make simultaneous peaks stack together, not line up next to one another.
Avoid making "averages of averages" whenever possible.
You may choose to sum the peaks of each host irrespective of time just to get a
glimpse of "the perfect storm". Rarely do we plan for such an event, but it may
be worth reviewing.
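A sketch of the stacking rule, with hypothetical hosts keyed by a shared interval timestamp so simultaneous peaks add together rather than being averaged:

```python
from collections import defaultdict

# Per-interval write MB/sec keyed by timestamp; host names and values are made up.
hosta = {"10:00": 20.0, "10:05": 35.0, "10:10": 5.0}
hostb = {"10:00": 10.0, "10:05": 30.0, "10:10": 0.0}  # idle until a failover event

combined = defaultdict(float)
for host in (hosta, hostb):
    for ts, mbs in host.items():
        combined[ts] += mbs        # aligned intervals stack; no averages of averages

print(max(combined.values()))      # combined peak: 65.0 (at 10:05)
```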
Workload Analysis
• What bandwidth should be recommended for TrueCopy software with an RPO
of "very low"?
• How about Universal Replicator with an RPO of 30 minutes?
• What about TrueCopy Extended Distance with an RPO of one hour?
Do not replicate more than is necessary to support the replication objectives.
Verify that the volumes being analyzed are really needed. For example, re-index
operations or DB dumps could temporarily use a disk that experiences heavy
workloads but contributes nothing to helping the customer achieve their goals.
[Charts: response time plotted by date+day+hr for data1 (05/27 to 06/01) and
data2 (06/06 to 06/15)]
Workload Analysis
Example
Telecom Recommendations
Telecom Capacity
Source: http://en.wikipedia.org/wiki/List_of_device_bandwidths
TrueCopy sync suspend cycle: Size to the total write volume during the suspend
duration divided by the length of the resync window. Some savings will occur if
there is strong locality of reference in the write activity, but for calculation
purposes this cannot be predicted.
TrueCopy Extended Distance: Can be sized to average workload, but pay attention
to the RPO, and note that the update cycle will elongate during workloads that
exceed bandwidth.
As volumes experience workload during initial copy, any additional updates must
also be transferred until the volumes are fully paired.
More background on the math of geometric series can be found at:
http://en.wikipedia.org/wiki/Evaluating_sums#Geometric_series
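For a back-of-the-envelope feel, the geometric series collapses to a closed form when link bandwidth B exceeds the write rate W: total copy time is roughly C / (B − W). A sketch with hypothetical numbers:

```python
# Each copy pass must also move the writes that landed during the previous pass;
# summing the geometric series gives capacity / (bandwidth - write_rate).
def initial_copy_hours(capacity_gb, link_mbs, write_mbs):
    if write_mbs >= link_mbs:
        raise ValueError("writes outrun the link; the copy never converges")
    return (capacity_gb * 1024) / (link_mbs - write_mbs) / 3600

# Hypothetical: 2 TB over a 100 MB/s link with 20 MB/s of ongoing writes
print(initial_copy_hours(2048, 100, 20))  # ~7.3 hours, vs ~5.8 with no updates
```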
• Order devices in HORCM file from lowest change rate to highest change
rate
• Resync time depends on workload and available bandwidth
• Journals are a buffering mechanism
Universal Replicator uses a queue. When you exceed your bandwidth, you extend
your queue. How far behind you are is a function of the current workload and the
additional bandwidth available to drain the queue.
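A small sketch of that queue behavior, stepping hypothetical per-minute write rates against a fixed link:

```python
# Journal depth grows while writes exceed the link and drains with the leftover
# bandwidth; all rates below are hypothetical.
def journal_depth_mb(write_mbs_per_interval, link_mbs, interval_sec=60):
    depth = 0.0
    for write_mbs in write_mbs_per_interval:
        depth = max(0.0, depth + (write_mbs - link_mbs) * interval_sec)
        yield depth

print(list(journal_depth_mb([80, 120, 150, 40, 30], link_mbs=100)))
# -> [0.0, 1200.0, 4200.0, 600.0, 0.0]  (MB queued at the end of each minute)
```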
Latency
Latency is the amount of time it takes for a packet of data to get from one
designated point to another.
• Protocol overhead
• Compression
• Growth over time
• Distance Latency
– Initial copy
– Update copy
Assumptions: Add 20% per assumption you make. You didn't have enough workload
data to analyze? Add 20%. You didn't get workload data from all servers? Add
another 20%.
Protocol Overhead
The conversion from pure Fibre Channel to iFCP, FCIP, or whatever protocol the
traffic ends up on results in a net increase in the total data transferred per
frame. We use 10% as a factor to allow for this. This is conservative; the true
protocol overhead is between 2% and 5% in most cases.
Compression
As with all data compression, it depends on the data being compressed. When
replicating an image archive or a VTL, it is probable that the channel extender
will not provide additional data reduction. As a rule of thumb, we use 1.8:1,
which is slightly lower than the 2:1 that the channel extender vendors use.
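Put together, the two factors adjust a measured write rate into a wire-level bandwidth requirement; a minimal sketch using the 1.8:1 and 10% figures above:

```python
# Convert host write MB/sec into required link MB/sec: compression reduces the
# payload, protocol overhead inflates it again.
def required_link_mbs(write_mbs, compression=1.8, overhead=0.10):
    return (write_mbs / compression) * (1 + overhead)

print(required_link_mbs(90.0))  # 90 MB/s of host writes -> 55.0 MB/s on the wire
```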
Distance Latency
Initial copy: During initial (or resync) copy, the number of LDEVs being copied
determines the number of copy jobs. Combined with the amount of distance
latency, this has a significant impact on both the data throughput and the
impact to the host resulting from the transfer process.
Update copy: When the volumes are in PAIR state, the amount of distance latency
is generally not a factor. The channel extenders will mask the typical FC flow
control mechanism and allow it to operate at full speed even over a high-latency
link. This generally works well up to ~150 ms of round-trip latency.
Inflow Control
Exercise
• Review questions
– How does latency affect system performance?
– How does insufficient bandwidth affect system performance?
Cache recommendation: Per manuals, increase cache by 50% over the Simplex
recommendation.
FED requirements: FED MPs must be allocated for use as TrueCopy Initiator and
RCU Target ports. These microprocessors service two ports each and are rated at
2500 IOPS. For bi-directional replication, both arrays will have MPs allocated
to Initiator and RCU Target ports.
Strategies to increase throughput: Increase bandwidth and ensure balanced system
design principles were followed.
Inputs to a sizing decision: Write IOPS; Write MB/s; application latency
tolerance; distance.
Outputs of a sizing decision: Quantity of replication paths; recommendations for
application/database restructuring; bandwidth recommendations.
Bandwidth and deployment strategy: Do not deploy in a bandwidth-constrained
environment; do not deploy over significant distances.
Host throughput: Throughput may be reduced, but the reduction is independent of
distance and typically less than 5%.
Strategies to reduce impact to host response time: Install maximum cache and
balance workload across controllers and multiple parity groups.
FED requirements: TrueCopy Extended Distance requires two ports on each array
for replication (no more, no less).
Strategies to increase throughput: Install maximum cache and balance workload
across controllers and multiple parity groups. Pools should be on dedicated
parity groups behind each controller.
Inputs to a sizing decision: Write IOPS; Write MB/s; cycle time (dependent on
the number of consistency groups and the RPO).
Outputs of a sizing decision: Quantity of CT groups; required bandwidth; pool
size; recommendations for application/database restructuring.
Bandwidth and deployment strategy: Procure bandwidth equivalent to the cycle
time peak rolling average, given expected compression and overhead (a sizing
sketch follows below).
Calculations:
Bandwidth = cycle time peak rolling average / compression + safety factor
Pool = (maxRollingAverage(write MB/sec) x RPO/2) x 1.2
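A direct transcription of the pool formula above, with hypothetical inputs:

```python
# Pool = (maxRollingAverage(write MB/sec) x RPO/2) x 1.2, expressed in GB.
def pool_size_gb(max_rolling_avg_mbs, rpo_sec):
    return (max_rolling_avg_mbs * rpo_sec / 2) * 1.2 / 1024

print(pool_size_gb(40.0, 3600))  # 40 MB/s peak rolling avg, 1-hour RPO -> ~84.4 GB
```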
Volume Placement
Cache
Cache recommendation (one column per product in the original matrix):
– Matrix recommends 1GB cache per TB of storage; however, the cache
recommendation should be based on workload, for example 1GB per 100 IOPS.
– Per manuals, increase cache by 50% over the Simplex recommendation.
– Maximum cache.
– Per manuals, increase cache by 25% over the Simplex recommendation, and add
1GB of cache per journal group on the R-DKC.
Host Delay
• ShadowImage
– No quick restore available
• Quick restore is the default behavior, so take care with Protection Manager
– Normal to Hitachi Dynamic Provisioning (HDP) replication will result in a
thick HDP S-VOL
– HDP to HDP replication will result in a thin HDP S-VOL
• Universal Replicator
– Normal to HDP replication will result in a thick HDP S-VOL
– HDP to HDP replication will result in a thin HDP S-VOL
– HDP journal volumes are not supported
• Business Requirements
– Two applications running on two hosts (hosta and hostb)
– Applications need to be consistent with each other for recovery
– Flexible RPO from near zero to four hours
• Technical Environment
– Two Universal Storage Platform V systems 500 miles apart – 20TB usable each
– All parity groups are RAID-5 (3D+1P), 146GB drives
– Two OC-3 telecom links
– Channel extenders with expected 1.8:1 compression and 10% overhead
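Before diving into the design, it may be worth checking what the stated links can actually absorb. This sketch applies the scenario's own 1.8:1 compression and 10% overhead figures to the two OC-3 circuits (155.52 Mbit/s each):

```python
# Effective host-write capacity of the telecom links in the scenario above.
OC3_MBIT_PER_SEC = 155.52

def effective_write_mbs(links=2, compression=1.8, overhead=0.10):
    raw_mbs = links * OC3_MBIT_PER_SEC / 8           # Mbit/s -> MB/s
    return raw_mbs * compression / (1 + overhead)    # credit compression, debit overhead

print(effective_write_mbs())  # ~63.6 MB/s of host writes the two links can carry
```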
ACC (Action Code) — A SIM (System Information Message) will produce an ACC,
which takes an engineer to the correct fix procedures in the ACC directory in
the MM (Maintenance Manual).
ACE (Access Control Entry) — Stores access rights for a single user or group
within the Windows security model.
ACL (Access Control List) — Stores a set of ACEs, so describes the complete set
of access rights for a file system object within the Microsoft Windows security
model.
ACP (Array Control Processor) — Microprocessor mounted on the disk adapter
circuit board (DKA) that controls the drives in a specific disk array.
Considered part of the back-end, it controls data transfer between cache and
the hard drives.
ACP PAIR — Physical disk access control logic. Each ACP consists of two DKA
PCBs, to provide 8 loop paths to the real HDDs.
Actuator (arm) — Read/write heads are attached to a single head actuator, or
actuator arm, that moves the heads around the platters.
AD — Active Directory
ADC — Accelerated Data Copy
Address — A location of data, usually in main memory or on a disk. A name or
token that identifies a network component. In local area networks (LANs), for
example, every node has a unique address.
ADP — Adapter
ADS — Active Directory Service
AIX — IBM UNIX
AL (Arbitrated Loop) — A network in which nodes contend to send data and only
one node at a time is able to send data.
AMS — Adaptable Modular Storage
APF (Authorized Program Facility) — In z/OS and OS/390 environments, a facility
that permits the identification of programs that are authorized to use
restricted functions.
APID — An ID to identify a command device.
Application Management — The processes that manage the capacity and performance
of applications.
ARB — Arbitration or "request"
Array Domain — All functions, paths, and disk drives controlled by a single ACP
pair. An array domain can contain a variety of LVI and/or LU configurations.
ARRAY UNIT — A group of Hard Disk Drives in one RAID structure. Same as Parity
Group.
ASIC — Application-specific integrated circuit
ASSY — Assembly
Asymmetric virtualization — See Out-of-band virtualization.
Asynchronous — An I/O operation whose initiator does not await its completion
before proceeding with other work. Asynchronous I/O operations enable an
initiator to have multiple concurrent I/O operations in progress.
ATA (Advanced Technology Attachment) — A disk drive implementation that
integrates the controller on the disk drive itself, also known as IDE
(Integrated Drive Electronics). Advanced Technology Attachment is a standard
designed to connect hard and removable disk drives.
Authentication — The process of identifying an individual, usually based on a
username and password.
IPSEC — IP security
iSCSI (Internet SCSI) — Pronounced "eye skuzzy." Short for Internet SCSI, an
IP-based standard for linking data storage devices over a network and
transferring data by carrying SCSI commands over IP networks. iSCSI supports a
Gigabit Ethernet interface at the physical layer, which allows systems
supporting iSCSI interfaces to connect directly to standard Gigabit Ethernet
switches and/or IP routers. When an operating system receives a request, it
generates the SCSI command and then sends an IP packet over an Ethernet
connection. At the receiving end, the SCSI commands are separated from the
request, and the SCSI commands and data are sent to the SCSI controller and
then to the SCSI storage device. iSCSI will also return a response to the
request using the same protocol. iSCSI is important to SAN technology because
it enables a SAN to be deployed in a LAN, WAN or MAN.
iSER — iSCSI Extensions for RDMA
ISL — Inter-Switch Link
iSNS — Internet Storage Name Service
ISPF — Interactive System Productivity Facility
ISC — Initial shipping condition
ISOE — iSCSI Offload Engine
ISP — Internet service provider

—K—
kVA — Kilovolt-Ampere
kW — Kilowatt

—L—
LACP — Link Aggregation Control Protocol
LAG — Link Aggregation Groups
LAN — Local Area Network
LBA (Logical Block Address) — A 28-bit value that maps to a specific
cylinder-head-sector address on the disk.
LC (Lucent connector) — Fibre Channel connector that is smaller than a simplex
connector (SC).
LCDG — Link Processor Control Diagnostics
LCM — Link Control Module
LCP (Link Control Processor) — Controls the optical links. The LCP is located
in the LCM.
LCU — Logical Control Unit
LD — Logical Device
LDAP — Lightweight Directory Access Protocol
LDEV (Logical Device) — A set of physical disk partitions (all or portions of
one or more disks) that are combined so that the subsystem sees and treats them
as a single area of data storage; also called a volume. An LDEV has a specific
and unique address.
In recent years, automated storage provisioning (also called auto-provisioning)
programs have become available. These programs can reduce the time required for
the storage provisioning process, and can free the administrator from the often
distasteful task of performing this chore manually.
Protocol — A convention or standard that enables the communication between two
computing endpoints. In its simplest form, a protocol can be defined as the
rules governing the syntax, semantics, and synchronization of communication.
RAID-1 — Mirrored array and duplexing
RAID-3 — Striped array with typically non-rotating parity, optimized for long,
single-threaded transfers
RAID-4 — Striped array with typically non-rotating parity, optimized for short,
multi-threaded transfers
RAID-5 — Striped array with typically rotating parity, optimized for short,
multi-threaded transfers
6. Under Attachments, click the Class Eval link. The Class Evaluation form opens.
Complete the form and submit.