High Availability Guide For SGI InfiniteStorage
007–5617–007
COPYRIGHT
© 2010–2013 Silicon Graphics International Corp. All rights reserved; provided portions may be copyright in third parties, as indicated
elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic
documentation in any manner, in whole or in part, without the prior written permission of SGI.
Intel is a trademark of Intel Corporation in the U.S. and other countries. Linux is a registered trademark of Linus Torvalds in the U.S.
and other countries. Novell is a registered trademark and SUSE is a trademark of Novell, Inc. in the United States and other countries.
Supermicro is a registered trademark of Super Micro Computer Inc. All other trademarks mentioned herein are the property of their
respective owners.
New Features in this Guide
Record of Revision
Version Description
Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . 1
High Availability Overview . . . . . . . . . . . . . . . . . . . . 1
SGI Resource Agents and RPMs . . . . . . . . . . . . . . . . . . . 2
Failover Example Scenarios . . . . . . . . . . . . . . . . . . . . 4
CXFS™ NFS Edge-Serving Failover . . . . . . . . . . . . . . . . . 4
DMF Failover . . . . . . . . . . . . . . . . . . . . . . . . 7
COPAN MAID OpenVault Client HA Service for Mover Nodes . . . . . . . . 8
Configuration Tools . . . . . . . . . . . . . . . . . . . . . . . 10
2. Best Practices . . . . . . . . . . . . . . . . . . . . . . 11
Preliminary Best Practices Before Introducing HA . . . . . . . . . . . . . 11
Ensure the System is Ready for HA . . . . . . . . . . . . . . . . . 11
Prepare to Use the GUI . . . . . . . . . . . . . . . . . . . . . 12
Use a Separate Filesystem for CXFS NFS State Information . . . . . . . . . . 12
Use Consistent Virtual Hostnames . . . . . . . . . . . . . . . . . 12
Ensure that the Debug RPM Matches the Kernel . . . . . . . . . . . . . 12
HA Configuration and Testing Best Practices . . . . . . . . . . . . . . . 13
Use the Appropriate HA Tools . . . . . . . . . . . . . . . . . . 13
3. Requirements . . . . . . . . . . . . . . . . . . . . . . 25
HA Support Requirements . . . . . . . . . . . . . . . . . . . . 26
Licensing Requirements . . . . . . . . . . . . . . . . . . . . . 26
Software Version Requirements . . . . . . . . . . . . . . . . . . . 26
Hardware Requirements . . . . . . . . . . . . . . . . . . . . . 26
System Reset Requirements . . . . . . . . . . . . . . . . . . . . 27
Time Synchronization Requirements . . . . . . . . . . . . . . . . . 27
CXFS NFS Edge-Serving Requirements . . . . . . . . . . . . . . . . . 27
CXFS Requirements . . . . . . . . . . . . . . . . . . . . . . . 29
CXFS Server-Capable Administration Nodes . . . . . . . . . . . . . . 29
CXFS Relocation Support . . . . . . . . . . . . . . . . . . . . 29
Applications that Depend Upon CXFS Filesystems . . . . . . . . . . . . 30
CXFS and System Reset . . . . . . . . . . . . . . . . . . . . 30
CXFS Start/Stop Issues . . . . . . . . . . . . . . . . . . . . . 30
CXFS Volumes and DMF-Managed User Filesystems . . . . . . . . . . . . 31
Local XVM Requirements . . . . . . . . . . . . . . . . . . . . . 31
Filesystem Requirements . . . . . . . . . . . . . . . . . . . . . 31
Virtual IP Address Requirements . . . . . . . . . . . . . . . . . . 31
TMF Requirements . . . . . . . . . . . . . . . . . . . . . . . 32
OpenVaultTM Requirements . . . . . . . . . . . . . . . . . . . . 32
COPAN MAID Requirements . . . . . . . . . . . . . . . . . . . . 33
COPAN MAID in Any HA Cluster . . . . . . . . . . . . . . . . . 34
COPAN MAID in a DMF HA Cluster . . . . . . . . . . . . . . . . 34
COPAN MAID in a Mover-Node HA Cluster . . . . . . . . . . . . . . 34
DMF Requirements . . . . . . . . . . . . . . . . . . . . . . . 35
NFS Requirements . . . . . . . . . . . . . . . . . . . . . . . 37
Samba Requirements . . . . . . . . . . . . . . . . . . . . . . 37
5. Standard Services . . . . . . . . . . . . . . . . . . . . 45
CXFS NFS Edge-Serving Standard Service . . . . . . . . . . . . . . . . 46
CXFS Standard Service . . . . . . . . . . . . . . . . . . . . . . 47
Local XVM Standard Service . . . . . . . . . . . . . . . . . . . . 47
TMF Standard Service . . . . . . . . . . . . . . . . . . . . . . 48
OpenVault Standard Service . . . . . . . . . . . . . . . . . . . . 48
COPAN MAID Standard Service . . . . . . . . . . . . . . . . . . . 49
DMF Standard Service . . . . . . . . . . . . . . . . . . . . . . 50
NFS Standard Service . . . . . . . . . . . . . . . . . . . . . . 50
Samba Standard Service . . . . . . . . . . . . . . . . . . . . . 51
DMF Manager Standard Service . . . . . . . . . . . . . . . . . . . 51
DMF Client SOAP Standard Service . . . . . . . . . . . . . . . . . . 52
7. DMF HA Service . . . . . . . . . . . . . . . . . . . . 75
DMF HA Example Procedure . . . . . . . . . . . . . . . . . . . . 76
CXFS Resource . . . . . . . . . . . . . . . . . . . . . . . . 79
Creating the CXFS Primitive . . . . . . . . . . . . . . . . . . . 79
Required Fields for CXFS . . . . . . . . . . . . . . . . . . . 79
Instance Attributes for CXFS . . . . . . . . . . . . . . . . . . 79
Meta Attributes for CXFS . . . . . . . . . . . . . . . . . . . 80
Probe Operation for CXFS . . . . . . . . . . . . . . . . . . . 80
Monitor Operation for CXFS . . . . . . . . . . . . . . . . . . 80
Start Operation for CXFS . . . . . . . . . . . . . . . . . . . 80
Stop Operation for CXFS . . . . . . . . . . . . . . . . . . . 81
Testing the CXFS Resource . . . . . . . . . . . . . . . . . . . . 81
Local XVM Resource . . . . . . . . . . . . . . . . . . . . . . 82
Creating the Local XVM Primitive . . . . . . . . . . . . . . . . . 83
Required Fields for Local XVM . . . . . . . . . . . . . . . . . 83
Glossary . . . . . . . . . . . . . . . . . . . . . . . . 211
Index . . . . . . . . . . . . . . . . . . . . . . . . . . 217
About This Guide
Prerequisites
To use this guide, you must have access to the SUSE HAE High Availability Guide,
available at the following website:
http://www.suse.com/documentation/sle_ha/
Conventions
In this guide, High Availability Extension and HAE refer to the SUSE Linux Enterprise
High Availability Extension product.
Note: This guide shows administrative commands that act on a group by using the
variable resourceGROUP or the example group name, such as dmfGroup. Other
commands that act on a resource primitive use the variable resourcePRIMITIVE or an
example primitive name, such as dmf.
Convention Meaning
command This fixed-space font denotes literal items such as
commands, files, routines, path names, signals,
messages, and programming language structures.
Reader Comments
If you have comments about the technical accuracy, content, or organization of this
publication, contact SGI. Be sure to include the title and document number of the
publication with your comments. (Online, the document number is located in the
front matter of the publication. In printed publications, the document number is
located at the bottom of each page.)
You can contact SGI in any of the following ways:
• Send e-mail to the following address:
techpubs@sgi.com
• Contact your customer service representative and ask that an incident be filed in
the SGI incident tracking system.
• Send mail to the following address:
SGI
Technical Publications
46600 Landing Parkway
Fremont, CA 94538
SGI values your comments and will respond to them promptly.
Chapter 1
Introduction
Although the SGI resource agents can be used independently, this guide provides
example procedures to configure the set of resources required to provide highly
available versions of the following:
• CXFS NFS edge-serving from CXFS client-only nodes in a two-node active/active
HA cluster. See "CXFS™ NFS Edge-Serving Failover" on page 4.
• DMF in a two-node active/passive HA cluster (which can optionally include
COPAN MAID shelves). See "DMF Failover" on page 7.
• COPAN MAID shelves in an active/active HA cluster that consists of two parallel
data mover nodes. See "COPAN MAID OpenVault Client HA Service for Mover
Nodes" on page 8.
Although other configurations may be possible, SGI has tested and recommends the
above HA environments.
Note: The attributes and the various value recommendations listed in this guide are
in support of the examples used in this guide. If you are using the resources in a
different manner, you must evaluate whether these recommendations and use of meta
attributes apply to your intended site-specific purpose.
[Figure: CXFS NFS edge-serving HA service in a two-node active/active HA cluster.
In normal mode, the cxfs-nfs-clone (each instance a cxfs-nfs-group containing
cxfs-client-nfsserver and cxfs-client) runs on both nodes, while ipalias-group-1
(smnotify-for-rack1, ipalias-for-rack1) and ipalias-group-2 (smnotify-for-rack2,
ipalias-for-rack2) each serve an HA virtual IP address to the network from a
different node. After a failure, the surviving node runs its clone instance plus
both IP alias groups, serving HA virtual IP address 1 and HA virtual IP address 2.]
DMF Failover
Figure 1-3 and Figure 1-4 describe an example process of failing over a DMF HA
service in a two-node HA cluster using active/passive mode.
[Figure 1-3 (normal mode) and Figure 1-4 (after failure): the dmfGroup resource
group (lxvm, tmf, dmf, etc.) and its HA virtual IP address run on one node of the
two-node HA cluster; after a failure, the entire group and the virtual IP address
move to the other node.]
[Figure 1-5 shows a COPAN cabinet with shelves 0-3 in the normal state:
copan_ov_client-0 and copan_ov_client-1 default to owner node pdmn1,
copan_ov_client-2 and copan_ov_client-3 default to owner node pdmn2, and the
cxfs-client-clone runs a cxfs-client instance on each node of the HA cluster.]
Figure 1-5 COPAN OpenVault Client HA Service for Mover Nodes — Normal State
When pdmn1 fails, its COPAN OpenVault client resources move to pdmn2 and pdmn2
becomes the current owner node of all of the shelves, as shown in Figure 1-6.
[Figure 1-6 shows the COPAN cabinet with shelves 0-3 after failover: the
cxfs-client instances continue to run on both nodes, and pdmn2 owns all four
copan_ov_client resources.]
Figure 1-6 COPAN OpenVault Client HA Service for Mover Nodes — After Failover
After pdmn1 recovers and rejoins the HA cluster, you can choose a convenient time to
manually move copan_ov_client_0 and copan_ov_client_1 back to pdmn1 to
balance the load and return the HA cluster to its normal state. For this procedure,
see "Manually Moving a copan_ov_client Resource" on page 178.
Configuration Tools
The procedures in this guide use the following tools as documented in the SUSE High
Availability Guide and the crm(8) online help:
• YaST installation and configuration tool
• Pacemaker graphical user interface (GUI) for HA resource management, accessed
with the crm_gui command
• Command-line administration tools, such as crm(8), cibadmin(8), and crm_verify(8)
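For a quick check that these tools are available and that the cluster is healthy, you
can run commands such as the following (a sketch; output is omitted):
ha# crm status
ha# crm configure show
ha# crm_verify -LV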
Chapter 2
Best Practices
The following are best practices when using SGI resource agents:
• "Preliminary Best Practices Before Introducing HA" on page 11
• "HA Configuration and Testing Best Practices" on page 13
• "Administrative Best Practices" on page 18
• "Maintenance Best Practices" on page 22
Configure and test the base HA cluster before adding the SGI resource group and
resource primitives. To do these things, review Chapter 4, "Outline of the
Configuration Procedure", and then follow the detailed example steps in the
remainder of the guide, as appropriate for your site.
Caution: If your HA cluster includes CXFS nodes, you should use a CXFS fail policy
that does not include reset. If the fail policy does not include reset, you must
perform a manual reset of a failed node if the node does not reboot automatically.
Note: If you make configuration changes with the crm command, use a shadow
environment (crm cib new), so that you can verify those changes before
applying them to the running cluster information base (CIB). See the crm(8) online
help for more information.
• Use the crm(8) command or the HA GUI to test resource primitives. This guide
typically provides the crm command-line method.
• Use the following command to verify changes you make to the CIB, with each
resource primitive that you define:
ha# crm_verify -LV
Note: Fields that are unnecessary or for which the GUI provides appropriate
defaults are not addressed in this guide.
• monitor is the name value of the operation that determines if the resource is
operating correctly. There are two types of monitor operations:
– Standard monitor operations check the operation of the resource at the specified
interval (the interval begins at the end of the last completed monitor operation).
Each monitor operation will time out after the specified number of seconds. If
the monitor operation fails, HA software will attempt to restart the resource.
– Probe monitor operations check to see if the resources are already running.
Note: Always use a probe operation, even if you do not use a standard
monitor operation.
• start is the name value of the operation that initiates the resource. It will time out
after the specified time. It requires that fencing is configured and active in order to
start the resource. (Using system reset as a fencing method is required in order to
preserve data integrity.) If the start operation fails, HA software will attempt to
restart the resource.
• stop is the name value of the operation that terminates or gives up control of the
resource. It will time out after the specified time. If the stop operation fails, HA
software will attempt to fence the node on which the failure occurred. The stop fail
policy must be set to fence and a STONITH facility must be configured according
to the requirements for your site (see Chapter 9, "STONITH Examples" on page 165).
Note: Longer stop operation timeouts may result in longer failover times, and
shorter stop operation timeouts may result in more frequent system reset events.
• resource-stickiness specifies a score for the preference to keep this resource on the
node on which it is currently running. A positive value specifies a preference for
the resource to remain on the node on which it is currently running. This
preference may only be overridden if the node becomes ineligible to run the
resource (if the node fails over) or if there is a start, monitor, or stop failure for
this resource or another resource in the same resource group.
• Some Operations fields are accessed under the Optional tab. Those that are
required for the SGI implementation of HA are listed in this guide using the
format Optional > Field Name.
Note: Although the GUI organizes these items under the Optional heading, they
are not optional for the SGI implementation of HA; you must provide them in
your configuration. Similarly, the Required subsections in this chapter refer to
those items located under that heading in the GUI; the label does not imply that
only those values are required.
• In many cases, there are pull-down lists that contain possible values (shown in
boldface in this guide). In other cases, you must enter text (shown in
fixed-space font).
• In general, you must use the values shown in this guide for meta attributes and
for those values available from a pull-down list.
• ID is the unique identification of the clone, resource group, or resource primitive,
such as cxfs. This can be any name you like as long as it does not include spaces.
For the monitor, start, and stop operations, a unique ID will be generated for you
based on the primitive name and interval, such as cxfs_op_monitor_30s.
• Class, Provider, and Type must use the exact values shown in this guide.
Note: The crm configure verify command is not equivalent to the crm_verify
command.
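For example, a minimal shadow-CIB session with the crm shell might look like the
following (a sketch; the shadow name test1 is an example only, and the prompt shows
which CIB is currently in use):
ha# crm
crm(live)# cib new test1
crm(test1)# configure edit
crm(test1)# configure verify
crm(test1)# cib commit test1
crm(test1)# quit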
Before making changes to an existing HA configuration, ensure that you have a good
backup copy of the current CIB so that you can return to it if necessary. (If you
encounter a corrupted CIB, you must erase it by force and then restore the information
about resources, constraints, and configuration from a backup copy of a good CIB.)
After you establish that your changed configuration is good, make a new backup of
the CIB.
For more information, see the SUSE High Availability Guide and the crm(8) man page.
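For example, the following command saves a copy of the running CIB (the filename is
an example only):
ha# cibadmin -Q > /root/cib-backup.xml
If a corrupted CIB must later be erased by force and restored from that copy:
ha# cibadmin -E --force
ha# cibadmin --replace --xml-file /root/cib-backup.xml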
Review the /var/log/messages system log periodically (to ensure that you are
aware of operations automatically initiated by HA software) and whenever you notice
errors. See "Reviewing the Log File" on page 175.
Note: If conflicting constraints already exist, this preference might not be honored.
You must remember to remove implicit constraints when they are no longer needed,
such as after the resource or resource group has successfully moved to the new node.
Do the following:
ha# crm resource unmove resource_or_resourceGROUP
In this case, the CXFS heartbeat timeout (mtcp_hb_period) is 500 (5 seconds), so
you would set the HA totem token value to at least 15s. If mtcp_hb_period were
set to 6000 (60 seconds), you would use an HA totem token value of at least 90s.
For more information, see the corosync.conf(5) man page.
Note: There is an error on the corosync.conf(5) man page; the correct file location
is /etc/corosync/corosync.conf.
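The token value is set in the totem section of that file and is specified in
milliseconds, so a 90s token would look like the following excerpt (a sketch; other
totem settings are omitted):
totem {
        # token is in milliseconds (90000 = 90s)
        token: 90000
}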
Upgrade Appropriately
When upgrading the software, follow the procedure in "Performing a Rolling
Upgrade" on page 180.
Changes made to /etc/sysctl.conf will take effect on the next boot and will be
persistent. To make the settings effective immediately for the current session as well,
enter the following:
ha# echo "core.%p.%t" > /proc/sys/kernel/core_pattern
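The corresponding persistent /etc/sysctl.conf entry would be similar to the
following (a sketch that assumes the same pattern as the command above):
kernel.core_pattern = core.%p.%t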
Note: Core files are generally placed in the current working directory (cwd) of the
process that dumped the core file. For example, to locate the core file for a PID of
25478:
node1# ps -fp 25478
UID PID PPID C STIME TTY TIME CMD
root 25478 25469 0 Feb02 ? 00:02:40 /usr/lib/heartbeat/stonithd
node1# ls -l /proc/25478/cwd
lrwxrwxrwx 1 root root 0 Mar 2 17:29 /proc/25478/cwd -> /var/lib/heartbeat/cores/root
Hardware Maintenance
Hardware changes are generally disruptive to the HA environment and always
require careful planning. You should consider whether or not the hardware change
will also require a software change. In many cases, you must entirely shut down the
HA cluster. See "Maintenance with a Full Cluster Outage" on page 189.
Note: In general, you should not simply unmanage a given resource because that can
adversely impact failcounts and cause inappropriate failovers.
Chapter 3
Requirements
This chapter discusses the following requirements for a high-availability (HA) cluster
using SGI resource agents:
Note: All of the stop/start requirements for services and resources that are noted in
this chapter will be fulfilled if you follow the steps in Chapter 4, "Outline of the
Configuration Procedure" on page 39.
HA Support Requirements
HA may in some cases require the purchase of additional support from SUSE.
Licensing Requirements
All nodes in an HA cluster must have the appropriate software licenses installed. The
following software requires licenses if used:
• CXFS
• DMF
• DMF Parallel Data Mover Option
For information about obtaining licenses, see the individual product administration
guides.
Hardware Requirements
All nodes in an SGI HA cluster must be x86_64 architecture with a BMC supporting
the IPMI protocol and administrative privileges.
Note: If you form an HA cluster using only members of a partitioned system with a
single power supply, a failure of that power supply may result in failure of the HA
cluster. CXFS does not support these members as nodes in the CXFS cluster.
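One way to confirm that a node's BMC is reachable and that the supplied credentials
have sufficient privilege is an out-of-band IPMI query such as the following (a
sketch; the BMC hostname, user, and password are examples only):
ha# ipmitool -I lanplus -H bmc-node2 -U admin -P password chassis power status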
DMF supports only one instance running on a given node in an HA cluster at any
given time, thus active/active mode is not a possible configuration. If the cluster also
runs CXFS, the DMF server nodes in the cluster must also be CXFS server-capable
administration nodes. For additional requirements when using the DMF Parallel Data
Mover Option, see DMF 6 Administrator Guide for SGI InfiniteStorage.
Note: There can be multiple two-node HA clusters within one CXFS cluster.
• Due to the way that NLM grace notification is implemented, all of the
server-capable administration nodes in the CXFS cluster must run the same
version of CXFS in order to use CXFS relocation. This means that if you want to
do a CXFS rolling upgrade of the metadata servers while running HA CXFS NFS
edge-serving, you must use CXFS recovery and not CXFS relocation.
CXFS Requirements
The CXFS resource agent allows you to associate the location of the CXFS metadata
server with other products, such as DMF. This section discusses the following:
• "CXFS Server-Capable Administration Nodes" on page 29
• "CXFS Relocation Support" on page 29
• "Applications that Depend Upon CXFS Filesystems" on page 30
• "CXFS and System Reset" on page 30
• "CXFS Start/Stop Issues" on page 30
• "CXFS Volumes and DMF-Managed User Filesystems" on page 31
fence and you must configure a STONITH facility according to the requirements for
your site. See Chapter 9, "STONITH Examples" on page 165.
In this case, the offending CXFS metadata server will be reset, causing recovery to an
alternate node.
Filesystem Requirements
For DMF HA purposes, filesystems used by the Filesystem resource should use a
filesystem type of xfs.
IPaddr2 resource within the same resource group as the openvault resource. You
must also add an associated virtual hostname to your local DNS or to the
/etc/hosts file on all hosts in the cluster that could be used as a DMF server or as
an OpenVault client node.
Each HA node must have a physical Ethernet interface on the same subnet as the
virtual IP address defined for the IPaddr2 resource.
You may use the IPaddr2 virtual address for other services, such as for accessing
DMF Manager or serving NFS. However, if DMF and OpenVault are configured to
use a dedicated subnet, you should instead define a second IPaddr2 address on an
appropriate subnet for accessing these services. You should define this IPaddr2
resource in the same resource group as the dmfman resource.
See also "Virtual IP Address Resource" on page 96.
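For reference, a minimal IPaddr2 primitive defined with the crm shell might look like
the following sketch (the address, netmask, NIC, and timeouts are examples only; the
example procedures in this guide define the resource with the GUI instead):
ha# crm configure primitive ipalias ocf:heartbeat:IPaddr2 \
        params ip=128.162.244.240 cidr_netmask=24 nic=eth0 \
        op monitor interval=30s timeout=60s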
TMF Requirements
All tape devices should be configured as DOWN in the tmf.config file on all nodes.
The loaders should be configured as UP, but the tmf service must be disabled at boot
time (chkconfig tmf off) for all nodes. The resource agent will start tmf and
configure the loader up as needed.
Note: If tape drives are defined and used outside of DMF, you must manually start
TMF on the inactive server.
OpenVaultTM Requirements
If OpenVault is to be used as the DMF mounting service, you must do the following:
• If upgrading to an entirely new root filesystem, as would be required if upgrading
from a SLES 10 system, you should create a copy of the OpenVault configuration
directory (/var/opt/openvault) from the old root before upgrading the OS.
You can then reinstall it on the new root so that you do not need to entirely
reconfigure OpenVault. See the section about taking appropriate steps when
upgrading DMF in the DMF 6 Administrator Guide for SGI InfiniteStorage.
• Provide a directory for OpenVault’s use within an HA filesystem in the DMF
resource group. This is known as the serverdir directory (as specified in "OpenVault
Resource" on page 107). The directory will hold OpenVault’s database and logs.
The directory can be either of the following:
– Within the root of an HA-managed filesystem dedicated for OpenVault use
– Within another HA-managed filesystem, such as the filesystem specified by the
HOME_DIR parameter in the DMF configuration file
In non-HA configurations, the OpenVault server’s files reside in
/var/opt/openvault/server. During the conversion to HA, OpenVault will
move its databases and logs into the specified directory within an HA-managed
filesystem and change /var/opt/openvault/server to be a symbolic link to
that directory.
• Ensure that you do not have the OV_SERVER parameter set in the base object of
the DMF configuration file, because in an HA environment the OpenVault server
must be the same machine as the DMF server.
• Configure the DMF application instances in OpenVault to use a wildcard ("*") for
the hostname and instance name. For more information, see the chapter about
mounting service configuration tasks in the DMF 6 Administrator Guide for SGI
InfiniteStorage.
• On all HA nodes during HA operation, disable the openvault service from being
started automatically at boot time:
ha# chkconfig openvault off
• Activity to all shelves controlled by a given node must be stopped before moving
the control of any one of those shelves to another node
• The CXFS client resource on each parallel data mover node must be started (via a
clone) before the COPAN OpenVault client resource is started on those nodes
• The COPAN OpenVault client resource on each parallel data mover node must be
stopped before the CXFS client resource is stopped on those nodes
• A parallel data mover node must be configured as the owner node for each shelf.
For load-balancing purposes, one mover node will be the default owner of half of
the shelves and the other mover node will be the default owner of the remaining
shelves.
• On both parallel data mover nodes during HA operation, disable the
cxfs_client and openvault services from being started automatically at boot
time:
ha# chkconfig cxfs_client off
ha# chkconfig openvault off
DMF Requirements
Using DMF with HA software requires the following:
• The HA cluster must contain all nodes that could be DMF servers.
• Each DMF server must run the required product and HA software.
• All DMF server nodes in the HA cluster must have connectivity to the same set of
libraries and drives. If one node has access to only a subset of the drives, and the
DMF server is failed over to that node, DMF would then not be able to access data
on volumes left mounted in inaccessible drives.
• All DMF server nodes must have connectivity to all of the CXFS and XFS®
filesystems that DMF either depends upon or manages:
– Each of the local XVM volumes that make up those filesystems must be
managed by an lxvm resource within the same resource group as the dmf
resource. Each of the XFS filesystems must be managed by a community
Filesystem resource in that resource group.
– Each of the CXFS filesystems (other than DMF-managed user filesystems) must
be managed by the cxfs resource in that resource group.
The DMF filesystems to be managed are:
– The DMF-managed user filesystems (do not include these in the volnames
attribute list for the cxfs resource; see "Instance Attributes for CXFS" on page
79)
– DMF administrative filesystems specified by the following parameters in the
DMF configuration file:
• HOME_DIR
• JOURNAL_DIR
• SPOOL_DIR
• TMP_DIR
• MOVE_FS
• CACHE_DIR for any library servers
• STORE_DIRECTORY for any disk cache manager (DCM) and disk MSPs
using local disk storage
DMF requires independent paths to drives so that they are not fenced by CXFS.
The ports for the drive paths on the switch should be masked from I/O fencing in
a CXFS configuration.
The SAN must be zoned so that XVM does not fail over CXFS filesystem I/O to
the paths visible through the HBA ports when Fibre Channel port fencing occurs.
Therefore, you should use either independent switches or independent switch
zones for CXFS/XVM volume paths and DMF drive paths.
For more information about DMF filesystems, see the DMF 6 Administrator Guide
for SGI InfiniteStorage.
• The ordering of resources within a resource group containing a dmf resource must
be such that the dmf resource starts after any filesystems it uses are mounted and
volume resources it uses are available (and the dmf resource must be stopped
before those resources are stopped).
• There must be a virtual hostname for use by DMF. See "Virtual IP Address
Requirements" on page 31.
• Set the INTERFACE parameter in the node object for each potential DMF server
node to the same virtual hostname used for SERVER_NAME in the base object
(see the sketch following this list).
• On all HA nodes during HA operation, disable the dmf service from being started
automatically at boot time:
ha# chkconfig dmf off
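The following sketch shows how the virtual hostname might appear in the DMF
configuration file (the object names, hostname, and path are examples only, and this
is not a complete configuration):
define base
        TYPE            base
        # SERVER_NAME is the DMF virtual hostname (other base parameters omitted)
        SERVER_NAME     dmfha
        HOME_DIR        /dmf/home
enddef
define node1
        TYPE            node
        # INTERFACE matches the virtual hostname used for SERVER_NAME
        INTERFACE       dmfha
enddef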
NFS Requirements
On all HA nodes during HA operation, disable the nfsserver service from being
started automatically at boot time:
ha# chkconfig nfsserver off
Samba Requirements
The /etc/samba and /var/lib/samba directories must be on shared storage. SGI
recommends using symbolic links.
On all HA nodes during HA operation, disable the smb and nmb services from being
started automatically at boot time:
ha# chkconfig smb off
ha# chkconfig nmb off
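For example, the /etc/samba and /var/lib/samba directories could be relocated to
an HA-managed filesystem and replaced with symbolic links, as in the following sketch
(the destination paths are examples only):
ha# mv /etc/samba /dmf/home/etc_samba
ha# ln -s /dmf/home/etc_samba /etc/samba
ha# mv /var/lib/samba /dmf/home/lib_samba
ha# ln -s /dmf/home/lib_samba /var/lib/samba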
Chapter 4
Outline of the Configuration Procedure
Prepare for HA
To prepare for an HA cluster, do the following:
1. Understand the requirements for the SGI products that you want to include in
your HA cluster. See Chapter 3, "Requirements" on page 25.
2. Ensure that you have installed the required SGI products on node1 and node2
(from the SGI ISSP High Availability YaST pattern) according to the installation
procedure in the SGI InfiniteStorage Software Platform Release Note.
3. Configure and test each of the standard SGI product services on node1 before
making them highly available. All of the filesystems must be mounted and
all drives and libraries must be accessible on node1. See Chapter 5, "Standard
Services" on page 45.
4. On both nodes, use the procedures in the following chapters to disable the noted
standard services (other than cxfs and cxfs_cluster) from being started
automatically at boot time and then stop the currently running services:
• Chapter 6, "CXFS NFS Edge-Serving HA Service" on page 53:
cxfs_client
nfsserver
nmb
smb
Note: Ensure that multiple HA clusters on the same local network use different
multicast port numbers.
4. Explicitly set the HA node ID, such as 1 for node1. Each node must have a
unique HA node ID number. (The HA node ID may be different from the CXFS
node ID.)
Caution: With explicit node IDs, you must not use csync2 on node2 (despite
the directions in the SUSE High Availability Guide) because it will result in
duplicate node IDs.
• On node2:
node2# chkconfig logd on
node2# chkconfig openais on
Note: You will reenable system reset later (in "Put the HA Cluster Into
Production Mode" on page 43) after testing all of the SGI resource primitives.
4. Configure and test the resources required for your HA configuration. Proceed to
the next resource primitive only if the current resource is behaving as expected, as
defined by the documentation. Using the instructions in this guide, you must
configure resources in the specific order shown in the following:
3. Ensure that any constraints remaining in the cluster are appropriate for a
production environment. To remove any remaining implicit constraints imposed
by an administrative move, enter the following:
node1# crm resource unmove resourceGROUP
Chapter 5
Standard Services
You should configure and test all standard services before applying high availability.
In general, you should do this on one host (known in this guide as node1 or pdmn1).
This host will later become a node in the high-availability (HA) cluster, on which all
of the filesystems will be mounted and on which all drives and libraries are accessible.
If you already have a stable configuration, you can skip the steps in this chapter.
This chapter discusses the following:
• "CXFS NFS Edge-Serving Standard Service" on page 46
• "CXFS Standard Service" on page 47
• "Local XVM Standard Service" on page 47
• "TMF Standard Service" on page 48
• "OpenVault Standard Service" on page 48
• "COPAN MAID Standard Service" on page 49
• "DMF Standard Service" on page 50
• "NFS Standard Service" on page 50
• "Samba Standard Service" on page 51
• "DMF Manager Standard Service" on page 51
• "DMF Client SOAP Standard Service" on page 52
2. Mount the filesystems on a node that will not be a member of the HA cluster
(otherhost):
otherhost# mount initial:/nfsexportedfilesystem /mnt/test
Note: If you have multiple clusters on the same network, add the -i clustername
option to identify the cluster name. For more information, see the
cxfs_admin(8) man page.
Note: In the tmf.config file, drives in drive groups managed by HA should have
access configured as EXCLUSIVE and should have status configured as DOWN
when TMF starts. Loaders in the tmf.config file should have status configured as
UP when TMF starts.
2. Use tmmls to verify that all of the loaders have a status of UP:
node1# tmmls
Note: Configuration of OpenVault on the alternate DMF server (node2) will be done
when the conversion to HA is performed.
To test the OpenVault standard service, verify that you can perform operational tasks
documented in the OpenVault guide, such as mounting and unmounting of cartridges
using the ov_mount and ov_unmount commands.
For example, in an OpenVault configuration with two drives (drive0 and drive1)
where you have configured a volume named DMF105 for use by DMF, the following
sequence of commands will verify that drive drive0 and the library are working
correctly:
node1# ov_mount -A dmf -V DMF105 -d drive0
Mounted DMF105 on /var/opt/openvault/clients/handles/An96H0uA3xr0
node1# tsmt status
Controller: SCSI
Device: SONY: SDZ-130 0202
Status: 0x20262
Drive type: Sony SAIT
Media : READY, writable, at BOT
node1# ov_stat -d | grep DMF105
drive0 drives true false false inuse loaded ready true DMF105S1
node1# ov_unmount -A dmf -V DMF105 -d drive0
Unmounted DMF105
node1# exit
Note: You will not run ov_shelf on node2 at this point. You will do that later in
"Creating the OpenVault Components on the Failover Node" on page 119.
If you are using the Parallel Data Mover Option, also see the instructions in DMF 6
Administrator Guide for SGI InfiniteStorage.
Note: You will create the OpenVault components on the alternate node later, using
the instructions in this guide.
To test the standard service, follow the instructions to test that OpenVault can mount
a migration volume, as described in the COPAN MAID for DMF Quick Start Guide.
Wait a bit to allow time for the volume to be written and unmounted.
3. Verify that the volumes are mounted and written successfully.
4. Verify that the volumes can be read and the data can be retrieved:
node1# dmget files_to_test
2. Mount the filesystems on a node that will not be a member of the HA cluster
(otherhost):
otherhost# mount initial:/nfsexportedfilesystem /mnt/test
Then verify that you can log in and use DMF Manager, such as by viewing the
Overview panel.
Then verify that you can access the GUI and view the WSDL for one of the DMF
client functions.
Chapter 6
CXFS NFS Edge-Serving HA Service
Note: The attributes listed in this chapter and the various value recommendations are
in support of this example. If you are using the resources in a different manner, you
must evaluate whether these recommendations and the use of meta attributes apply
to your intended site-specific purpose.
Figure 6-1 on page 54 shows a map of an example configuration process for CXFS
NFS edge-serving in an active/active HA cluster, referring to resource agent type
names such as cxfs-client and IPaddr2.
[Figure 6-1 (configuration map): first create the clone containing Group_Service
(cxfs-client, cxfs-client-nfsserver); then create Group_IPalias1 (IPaddr2,
cxfs-client-smnotify) and Group_IPalias2 (IPaddr2, cxfs-client-smnotify); last,
create the resource colocation and resource order constraints so that the clone
starts before Group_IPalias1 and Group_IPalias2.]
Note: Click the Operations tab to edit the monitor operations and to add the
probe, start, and stop operations as needed for a resource.
3. Set node2 to standby state to ensure that the resources remain on node1:
node1# crm node standby node2
4. Confirm that node2 is offline and that the resources are off:
a. View the status of the cluster on node1, which should show that node2 is in
standby state:
c. View the status of the NFS daemons on node2, which should show that
statd is dead and nfsd is unused:
node2# rcnfsserver status
Checking for kernel based NFS server: idmapd running
mountd unused
statd dead
nfsd unused
6. Confirm that the clone has returned to normal status, as described in step 2.
Note: You will just create the primitives in this step. You will not test them until
later, in "Test the IP Alias Groups" on page 59.
b. For Resource, select the primitive created in step 5a of "Create Two IP Alias
Groups" on page 58 above, such as ipalias-rack1.
c. For With Resource, select the clone created in step 3 of "Create the Clone" on
page 55 above, such as cxfs-nfs-clone.
d. For Score, select INFINITY.
e. Click OK.
f. Repeat steps 3a through 3e to create the colocation constraint for the second
rack.
4. Click the Add button, select Resource Order, and click OK.
5. Create two resource order constraints so that the clone will be started before the
IP addresses and notifications, one constraint for each rack:
a. Enter the ID of the constraint for the IP address, such as
nfs-before-ipalias-rack1.
b. For First, select the name of the clone created in step 3 of "Create the Clone"
on page 55 above, such as cxfs-nfs-clone.
c. For Then, select the group name defined in step 3 of "Create Two IP Alias
Groups" on page 58 above, such as ipalias-group-1.
d. Under Optional, set score to INFINITY.
e. Click OK.
f. Repeat steps 5a through 5e to create the resource order constraint for the
second rack.
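For reference, roughly equivalent constraints could be created with the crm shell as
follows (a sketch using the example names above; the GUI procedure above is the
method documented in this guide):
node1# crm configure colocation colocate-ipalias-rack1 inf: ipalias-group-1 cxfs-nfs-clone
node1# crm configure order nfs-before-ipalias-rack1 inf: cxfs-nfs-clone ipalias-group-1
node1# crm configure colocation colocate-ipalias-rack2 inf: ipalias-group-2 cxfs-nfs-clone
node1# crm configure order nfs-before-ipalias-rack2 inf: cxfs-nfs-clone ipalias-group-2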
b. Verify that node2 does not accept the IP address packets. For example, run
the following command on node2 (the output should be 0):
node2# ip -o addr show | grep -c 128.162.244.240
0
c. Connect to the virtual address using ssh or telnet and verify that the IP
address is being served by the correct system. For example, for the IP address
128.162.244.240 and the machine named node1:
nfsclient# ssh root@128.162.244.240
Last login: Mon Jul 14 10:34:58 2008 from mynode.mycompany.com
node1# uname -n
node1
d. Move the resource group containing the IPaddr2 resource from node1 to
node2:
node1# crm resource move ipalias-group-1 node2
f. Verify that node1 does not accept the IP address packets by running the
following command on node1 (the output should be 0):
node1# ip -o addr show | grep -c 128.162.244.240
0
g. Connect to the virtual address using ssh or telnet and verify that the IP
address is being served by the correct system. For example, for the IP address
128.162.244.240 and the machine named node2:
nfsclient# ssh root@128.162.244.240
Last login: Mon Jul 14 10:34:58 2008 from mynode.mycompany.com
node2# uname -n
node2
h. Move the resource group containing the IPaddr2 resource back to node1:
node1# crm resource move ipalias-group-1 node1
c. Acquire locks:
nfsclient:/hostalias1 # touch file
nfsclient:/hostalias1 # flock -x file -c "sleep 1000000" &
nfsclient:/hostalias2 # touch file2
nfsclient:/hostalias2 # flock -x file2 -c "sleep 1000000" &
d. Check in the shared sm-notify statedir directory on the NFS server for
resources hostalias1 and hostalias2 to ensure that a file has been
created by statd. The name should be the hostname of the node on which
you have taken the locks.
If the file is not present, it indicates a misconfiguration of name resolution.
Ensure that fully qualified domain name entries for each NFS client are
present in /etc/hosts on each NFS server. (If the /etc/hosts file is not
present, NSM reboot notification will not be sent to the client and locks will
not be reclaimed.)
e. On the NFS clients, check in the /var/lib/nfs/sm directory for a filename that is the
fully qualified domain name of each server from which you have requested
locks. If this file is not present, NSM reboot notification will be rejected by the
client. (The client must mount the ipalias node, such as hostalias1, by
hostname and not by the IP address in order for this to work.)
• On node2:
node2# chkconfig cxfs_client off
2. Add the CXFS client resource primitive. See "Creating the CXFS Client
Primitive" on page 63.
Note: There are no meta attributes for this primitive in this example procedure
because it is part of a clone resource that should always restart locally.
Note: Click the Operations tab to edit the monitor operations and to add the probe,
start, and stop operations as needed for a resource.
• Checks the /proc/mounts file until all volumes in volnames are mounted
• Fails if the CXFS client fails to start
Note: Using the example procedure in this guide, you should go on to add the
primitive for "CXFS Client NFS Server Resource" on page 65 before testing the clone.
You will test the resources later, after completing the clone.
2. On both nodes, disable the nfsserver service from being started automatically
at boot time:
• On node1:
node1# chkconfig nfsserver off
• On node2:
node2# chkconfig nfsserver off
3. Add the CXFS client NFS server resource primitive. See "Creating the CXFS Client NFS
Server Primitive" on page 66.
/etc/init.d/nfsserver
• Fails if the NFS server does not start or if the NLM grace notification cannot be
enabled
Note: Using the example procedure in this guide, you should return to step 10 of
"Create the Clone" on page 55.
Note: If you do not specify values for nic, cidr_netmask, and broadcast, appropriate
values will be determined automatically.
Note: Using the example procedure in this guide, you should go on to "Creating the
CXFS Client NSM Notification Primitive" on page 71. You will test the resources later.
/mnt/cxfsvol2/statd/smnotify-rack1.pid
gracedir Directory on the NFS state filesystem that contains the grace-period
state, such as:
/mnt/cxfsvol2/grace
name monitor
Interval 0
Timeout Timeout, such as 60s
The probe operation checks to see if the resource is already running.
Note: Using the example procedure in this guide, you should go back to step 6 of
"Create Two IP Alias Groups" on page 58. You will test the resources later.
Chapter 7
DMF HA Service
Note: The attributes listed in this chapter and the various value recommendations are
in support of this example. If you are using the resources in a different manner, you
must evaluate whether these recommendations and the use of meta attributes apply
to your intended site-specific purpose.
Figure 7-1 on page 77 shows a map of an example configuration process for DMF HA
in a two-node active/passive HA cluster (node1 and node2), referring to resource
agent type names such as lxvm and IPaddr2. This map also describes the start/stop
order for resources.
You must configure a resource group and then add and test resource primitives in the
order shown in this chapter, skipping products that do not apply to your site.
[Figure 7-1 (configuration map): the Group_DMF resource group contains, in start
order, one or more Filesystem primitives, IPaddr2, tmf or openvault,
copan_ov_client (optional), additional Filesystem primitives, dmf,
nfsserver (optional), and dmfman (optional).]
To create the resource group (referred to in the examples in this guide as dmfGroup),
do the following:
1. Invoke the HA GUI:
node1# crm_gui
See the information about setting the password and using the HA GUI in the SUSE High Availability Guide.
Note: Click the Operations tab to edit the monitor operations and to add the
probe, start, and stop operations as needed for a resource.
8. Add additional primitives for the other resources that should be part of
dmfGroup, in the order shown in this guide:
a. "Virtual IP Address Resource" on page 96
b. A mounting service, either:
• "TMF Resource" on page 100
• "OpenVault Resource" on page 107 and (optionally) "COPAN MAID
OpenVault Client Resource" on page 119
CXFS Resource
This section discusses examples of the following:
• "Creating the CXFS Primitive" on page 79
• "Testing the CXFS Resource" on page 81
Note: Click the Operations tab to edit the monitor operations and to add the probe,
start, and stop operations as needed for a resource.
Note: If you have multiple clusters on the same network, add the -i
clustername option to identify the cluster name. For more information, see the
cxfs_admin(8) man page.
Note: After a cxfs primitive has been added to a resource group’s configuration,
moving that resource group will unmount the filesystem defined in the primitive.
This will result in killing any process that has that filesystem in the path of its
current working directory.
4. Move the resource group containing the cxfs resource back to node1:
node1# crm resource move dmfGroup node1
Note: If the timeout is too short for a start operation, the crm status and
crm_verify -LV output and the /var/log/messages file will have an entry
that refers to the action being “Timed Out”. For example (line breaks shown
here for readability):
node1# crm status | grep Timed
lxvm_start_0 (node=node1, call=222, rc=-2): Timed Out
3. Verify that the local XVM volumes are visible and online on node2:
node2# xvm -d local show vol
6. Verify that the local XVM volumes are visible and online on node1:
node1# xvm -d local show vol
Filesystem Resources
This section discusses examples of the following:
• "Filesystems Supported" on page 87
• "Configuring a DMF-Managed User Filesystem or DMF Administrative Filesystem
for HA" on page 88
• "Creating a DMF-Managed User Filesystem Primitive" on page 88
• "Creating a DMF Administrative Filesystem Primitive" on page 90
• "Creating a Dedicated OpenVault Server Filesystem Primitive (Optional)" on page
93
• "Testing Filesystem Resources" on page 95
Filesystems Supported
In this release, SGI supports the following types of filesystems for DMF HA:
• DMF-managed user filesystems
Note: You must specify the dmi and mtpt mount options when configuring a
DMF-managed user filesystem.
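For reference, a Filesystem primitive for a DMF-managed user filesystem defined with
the crm shell might look like the following sketch (the device, mount point, and
timeout values are examples only; the example procedures in this guide define the
primitive with the GUI instead):
node1# crm configure primitive dmfusr1fs ocf:heartbeat:Filesystem \
        params device=/dev/lxvm/dmfusr1 directory=/dmfusr1 fstype=xfs \
        options="dmi,mtpt=/dmfusr1" \
        op monitor interval=20s timeout=60s \
        meta migration-threshold=1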
fstype xfs
fstype xfs
migration_threshold 1
3. Move the resource group containing all of the Filesystem resources from
node1 to node2:
node1# crm resource move dmfGroup node2
• On node1, check the mount table and verify that none of the filesystems are
mounted.
5. Move the resource group containing all of the Filesystem resources back to
node1:
node1# crm resource move dmfGroup node1
Type IPaddr2
2. Verify that node2 does not accept the IP address packets by running the
following command on node2 (there should be no output):
node2# ip -o addr show | grep ’128.162.244.240/’
node2#
3. Connect to the virtual address using ssh or telnet and verify that the IP
address is being served by the correct system. For example, for the IP address
128.162.244.240 and the machine named node1:
ha# ssh root@128.162.244.240
Last login: Mon Jul 14 10:34:58 2008 from mynode.mycompany.com
ha# uname -n
node1
4. Move the resource group containing the IPaddr2 resource from node1 to node2:
node1# crm resource move dmfGroup node2
6. Verify that node1 does not accept the IP address packets by running the
following command on node1 (there should be no output):
node1# ip -o addr show | grep ’128.162.244.240/’
node1#
7. Connect to the virtual address using ssh or telnet and verify that the IP
address is being served by the correct system. For example, for the IP address
128.162.244.240 and the machine named node2:
ha# ssh root@128.162.244.240
Last login: Mon Jul 14 10:34:58 2008 from mynode.mycompany.com
ha# uname -n
node2
8. Move the resource group containing the IPaddr2 resource back to node1:
node1# crm resource move dmfGroup node1
TMF Resource
This section discusses examples of the following:
• "Configuring TMF for HA" on page 100
• "Creating the TMF Primitive" on page 101
• "Testing the TMF Resource" on page 105
/etc/tmf/tmf.config
On node2, if the tape drive pathname (the FILE parameter in the DEVICE
definition) for a given drive is not the same as the pathname for the same drive
4. Create the TMF resource primitive with the fields shown in "Creating the TMF
Primitive" on page 101.
Note: You can use the same email address for more
than one device group (such as admin1,admin1). The
email address will be used to send a message whenever
tape drives that were previously available become
unavailable, so that the administrator can take action to
repair the drives in a timely fashion.
6. If you need to change any values, modify the primitive using the HA GUI
(crm_gui).
7. Move the resource group containing the tmf resource back to node1:
node1# crm resource move dmfGroup node1
• Use tmmls to verify that all of the loaders on node2 still have a status of UP
9. Remove the implicit location constraints imposed by the administrative move
command above:
node1# crm resource unmove dmfGroup
OpenVault Resource
This section discusses examples of the following:
• "Configuring OpenVault for HA" on page 107
• "Creating the OpenVault Primitive" on page 114
• "Testing the OpenVault Resource" on page 117
When asked for the server hostname, specify the virtual hostname (the
virtualhost value). ov_admin will automatically convert the OpenVault
configuration to an HA configuration by doing the following:
b. On node2:
i. Use ov_admin to enable the node to issue administrative commands by
entering the virtual hostname:
node2# ov_admin
...
Name where the OpenVault server is listening? [virtualhostname]
6. If your site contains a physical or virtual tape library, define DCPs and LCPs on
the passive node (node2). Whenever ov_admin asks for the server hostname,
use the virtual hostname.
Note: If your site contains COPAN MAID shelves, you will create their
OpenVault components later in "Creating the OpenVault Components on the
Failover Node" on page 119. Therefore, you can skip this step if your site contains
only COPAN MAID shelves (and no physical tape library or COPAN VTL).
You must specify the drive for which you would like to add a DCP and the
DCP name.
On node2, you must configure at least one DCP for each drive that is already
configured on node1.
b. Configure libraries by selecting:
On node2, you must configure at least one LCP for each library that is
already configured on node1:
• When asked for the name of the device, use the same library name that
was used on node1. The LCP instance name will automatically reflect the
node2 name (for example, for the l700a library, the LCP instance name
on node1 is l700a@node1 and the LCP instance name on node2 will be
l700a@node2).
You must enable each remote LCP using the same library and LCP names
that you used on node2:
4 - Activate another LCP for an existing Library
Now that server configuration is complete, the LCPs on node2 will shortly
discover that they are able to connect to the server.
d. On node2:
i. Verify that the DCPs are running successfully. For example, the following
output shows under DCPHost and DCPStateSoft columns that the DCP
is running and connected to the OpenVault server (ready) on the active
ii. Verify that the LCPs are running. For example, the following output
shows under LCPHost and LCPStateSoft columns that the LCP is
running and connected to the OpenVault server (ready) on the active HA
node (node1) and running in disconnected mode (disconnected) on
the failover node (node2):
node2# ov_dumptable -c LibraryName,LCPName,LCPHost,LCPStateSoft LCP
LibraryName LCPName LCPHost LCPStateSoft
SL500-2 SL500-2@node1 node1 ready
L700A L700A@node1 node1 ready
SL500-2 SL500-2@node2 node2 disconnected
L700A L700A@node2 node2 disconnected
Note: It may take a minute or two for the LCPs to notice that they are
able to connect to the server and activate themselves. All of the alternate
LCPs should transition to disconnected state, meaning that they have
successfully contacted the server. Do not proceed until they all transition
to disconnected. A state of inactive means that the LCP has not
contacted the server, so if the state remains inactive for more than a
couple of minutes, the LCP may be having problems connecting to the
server.
b. Disable the openvault service from being started automatically at boot time:
node2# chkconfig openvault off
b. Update the server name for each DCP using item 6 in the OpenVault DCP
Configuration menu:
2 - Manage DCPs for locally attached Drives
6 - Change Server Used by DCPs
a - Change server for all DCPs.
c. Restart the DCPs to connect to the OpenVault server using the virtual server
name:
pdmn# service openvault stop
pdmn# service openvault start
d. Update the server name for each LCP using item 8 in the OpenVault LCP
Configuration menu:
1 - Manage LCPs for locally attached Libraries
8 - Change Server Used by LCPs
a - Change server for all LCPs.
e. Restart the LCPs to connect to the OpenVault server using the virtual server
name:
pdmn# service openvault stop
pdmn# service openvault start
This step may generate errors for COPAN MAID shelf DCPs and LCPs whose
default host is not this host. You can ignore errors such as the following:
shelf C02 is owned by owner_nodename
9. On node1, stop the OpenVault server and any DCPs and LCPs:
node1# ov_stop
10. On node1, disable the openvault service from being started automatically at
boot time:
node1# chkconfig openvault off
12. (Optional) If you want to have additional OpenVault clients that are not DMF
servers, such as for running administrative commands, install the OpenVault
software on those clients and run ov_admin as shown below. When asked for the
server hostname, specify the virtual hostname. This connects the clients to the
virtual cluster, rather than a fixed host, so that upon migration they follow the
server.
Note: You may wish to set the environment variable OVSERVER to the virtual
hostname so that you can use the OpenVault administrative commands without
having to specify the -S parameter on each command.
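For example, in a POSIX shell (virtualhostname is the placeholder used throughout this chapter), you might set the variable and then run a status command; with OVSERVER set, the administrative commands contact the HA OpenVault server without a -S option:
client# export OVSERVER=virtualhostname
client# ov_stat -ld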
a. On node1:
To allow node2 to act as an administrative client, run ov_admin and select
the following menus, answering the questions when prompted:
node1# ov_admin
...
22 - Manage OpenVault Client Machines
1 - Activate an OpenVault Client Machine
b. On the OpenVault client node, use ov_admin to enable the node to issue
administrative commands by entering the virtual hostname, the port number,
and security key as needed:
node2# ov_admin
...
Name where the OpenVault server is listening? [virtualhostname]
What port number is the OpenVault server on virtualhostname using? [44444]
What security key is used for admin commands on the HA OpenVault servers? [none]
• Fails if OpenVault could not be stopped or if the semaphore could not be cleared
Drive Name Group Access Broken Disabled SoftState HardState DCP State Occupied Cartridge PCL
9940B_25a1 9940B_drives true false false ready unloaded ready false
9940B_93c8 9940B_drives true false false ready unloaded ready false
9940B_b7ba 9940B_drives true false false ready unloaded ready false
LTO2_682f LTO2_drives true false false ready unloaded ready false
LTO2_6832 LTO2_drives true false false ready unloaded ready false
LTO2_6835 LTO2_drives true false false ready unloaded ready false
LTO2_6838 LTO2_drives true false false ready unloaded ready false
2. Move the resource group containing the openvault resource from node1 to
node2:
node1# crm resource move dmfGroup node2
3. Verify that all of the drives become available after a few moments. For example:
node2# ov_stat -ld
Library Name Broken Disabled State LCP State
L700A false false ready ready
SL500-2 false false ready ready
Drive Name Group Access Broken Disabled SoftState HardState DCP State Occupied Cartridge PCL
9940B_25a1 9940B_drives true false false ready unloaded ready false
9940B_93c8 9940B_drives true false false ready unloaded ready false
9940B_b7ba 9940B_drives true false false ready unloaded ready false
LTO2_682f LTO2_drives true false false ready unloaded ready false
LTO2_6832 LTO2_drives true false false ready unloaded ready false
LTO2_6835 LTO2_drives true false false ready unloaded ready false
LTO2_6838 LTO2_drives true false false ready unloaded ready false
4. Move the resource group containing the openvault resource back to node1:
node1# crm resource move dmfGroup node1
5. Verify that all of the drives become available after a few moments. For example:
node1# ov_stat -ld
Library Name Broken Disabled State LCP State
L700A false false ready ready
SL500-2 false false ready ready
Drive Name Group Access Broken Disabled SoftState HardState DCP State Occupied Cartridge PCL
9940B_25a1 9940B_drives true false false ready unloaded ready false
9940B_93c8 9940B_drives true false false ready unloaded ready false
9940B_b7ba 9940B_drives true false false ready unloaded ready false
LTO2_682f LTO2_drives true false false ready unloaded ready false
LTO2_6832 LTO2_drives true false false ready unloaded ready false
LTO2_6835 LTO2_drives true false false ready unloaded ready false
LTO2_6838 LTO2_drives true false false ready unloaded ready false
Note: For more information about the shelf identifier, see COPAN MAID for DMF
Quick Start Guide.
Do the following:
1. On node1:
a. Stop all of the shelf’s OpenVault clients:
node1# ov_stop C00*
b. Export the shelf, hostname, and OCF root environment variables for use by
the copan_ov_client script:
node1# export OCF_RESKEY_shelf_name=C00
node1# export OCF_RESKEY_give_host=node2
node1# export OCF_ROOT=/usr/lib/ocf
2. On node2:
a. Verify that node2 now owns the shelf’s XVM volumes (C00A through C00Z,
although not necessarily listed in alphabetical order):
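For example, mirroring the node1 check later in this procedure (the volume names shown are taken from that example):
node2# xvm -d local probe | grep C00
phys/copan_C00M
phys/copan_C00B
phys/copan_C00G
...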
d. Export the shelf, hostname, and OCF root environment variables for use by
the copan_ov_client script:
node2# export OCF_RESKEY_shelf_name=C00
node2# export OCF_RESKEY_give_host=node1
node2# export OCF_ROOT=/usr/lib/ocf
3. On node1:
a. Verify that node1 once again owns the shelf’s XVM volumes (C00A through
C00Z, although not necessarily listed in alphabetical order):
node1# xvm -d local probe | grep C00
phys/copan_C00M
phys/copan_C00B
phys/copan_C00G
...
Drive Name Group Access Broken Disabled SoftState HardState DCP State Occupied Cartridge PCL
C02d00 dg_c02 true false false ready unloaded ready false
C02d01 dg_c02 true false false ready unloaded ready false
C02d02 dg_c02 true false false ready unloaded ready false
C02d03 dg_c02 true false false ready unloaded ready false
C02d04 dg_c02 true false false ready unloaded ready false
C02d05 dg_c02 true false false ready unloaded ready false
C02d06 dg_c02 true false false ready unloaded ready false
3. Verify that shelf C02 becomes available after a few minutes on node2:
node2# ov_stat -L C02 -D ’C02.*’
Library Name Broken Disabled State LCP State
C02 false false ready ready
Drive Name Group Access Broken Disabled SoftState HardState DCP State Occupied Cartridge PCL
C02d00 dg_c02 true false false ready unloaded ready false
C02d01 dg_c02 true false false ready unloaded ready false
5. Verify that shelf C02 becomes available after a few minutes on node1:
node1# ov_stat -L C02 -D ’C02.*’
Library Name Broken Disabled State LCP State
C02 false false ready ready
Drive Name Group Access Broken Disabled SoftState HardState DCP State Occupied Cartridge PCL
C02d00 dg_c02 true false false ready unloaded ready false
C02d01 dg_c02 true false false ready unloaded ready false
C02d02 dg_c02 true false false ready unloaded ready false
C02d03 dg_c02 true false false ready unloaded ready false
C02d04 dg_c02 true false false ready unloaded ready false
C02d05 dg_c02 true false false ready unloaded ready false
C02d06 dg_c02 true false false ready unloaded ready false
DMF Resource
This section discusses examples of the following:
• "Configuring DMF for HA" on page 125
• "Creating the DMF Primitive" on page 127
• "Testing the DMF Resource " on page 130
Note: The following procedure requires that the DMF application instances in
OpenVault are configured to use a wildcard ("*") for the hostname and instance
name. For more information, see the chapter about mounting service configuration
tasks in the DMF 6 Administrator Guide for SGI InfiniteStorage.
Note: If you change this parameter, you must copy the DMF configuration file
(/etc/dmf/dmf.conf) manually to each parallel data mover node and then
restart the services related to DMF. Do not change this parameter while DMF
is running.
– Set the INTERFACE parameter in the node object for each potential DMF
server node to the same virtual hostname used for SERVER_NAME in the
base object.
• If using the DMF Parallel Data Mover Option, create node objects for each
parallel data mover node in the HA cluster.
For more information, see the dmf.conf(5) man page and the DMF 6
Administrator Guide for SGI InfiniteStorage.
Note: You cannot use a symbolic link for parallel data mover nodes because DMF
itself keeps the dmf.conf file synchronized with the server node.
4. If you are using OpenVault and you explicitly set hostnames when you defined
the ov_keys file during initial OpenVault setup, edit the ov_keys file and
replace the hostname in field 1 of the DMF lines with the OpenVault virtual
hostname. For example, if the virtual hostname is virtualhost:
virtualhost dmf * CAPI none
virtualhost dmf * AAPI none
Note: If you used a wildcard hostname (*) when you defined the ov_keys file
during initial OpenVault setup, there is no need to edit this file.
5. On each potential DMF server node in the HA cluster, disable the dmf, dmfman,
and dmfsoap services from being started automatically at boot time:
dmfserver# chkconfig dmf off
dmfserver# chkconfig dmfman off
dmfserver# chkconfig dmfsoap off
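As an optional sanity check (chkconfig --list is a standard SLES command, not a step required by this guide), you can confirm that the services are now off in all runlevels:
dmfserver# chkconfig --list | grep dmf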
6. Create the DMF resource with the fields shown in "Creating the DMF Primitive"
on page 127.
Type dmf
• Waits for a successful DMF startup by calling dmdstat in a loop until dmfdaemon
responds successfully
• Fails if dmfdaemon does not respond to a dmdstat query before the resource
times out
2. Move the resource group containing the dmf resource to node2 (because the
mounting service is in the same resource group, it must be colocated and thus
should failover with DMF to the new node):
node1# crm resource move dmfGroup node2
3. Verify that DMF has started on the new node by using the dmdstat -v
command and manual dmput and dmget commands on node2:
node2# dmdstat -v
node2# xfs_mkfile size another_test_file
node2# dmput -r another_test_file
node2# dmdidle
(wait a bit to allow time for the volume to be written and unmounted)
node2# dmget another_test_file
node2# rm another_test_file
4. Move the resource group containing the dmf resource back to node1:
node1# crm resource move dmfGroup node1
5. Verify that DMF has started by using the dmdstat -v command and manual
dmput and dmget commands on node1:
node1# dmdstat -v
node1# xfs_mkfile size test_file
node1# dmput -r test_file
node1# dmdidle
(wait a bit to allow time for the volume to be written and unmounted)
node1# dmget test_file
node1# rm test_file
NFS Resource
This section discusses examples of the following:
• "Configuring NFS for HA" on page 131
• "Creating the NFS Primitive" on page 132
• "Testing the NFS Resource" on page 134
1. Copy the /etc/exports entries that you would like to make highly available
from node1 to the /etc/exports file on node2.
2. On both nodes, disable the nfsserver service from being started automatically
at boot time:
• On node1:
node1# chkconfig nfsserver off
• On node2:
node2# chkconfig nfsserver off
3. Add the NFS resource primitive. See "Creating the NFS Primitive" on page 132.
2. Mount the filesystems on a node that will not be a member of the HA cluster
(otherhost):
otherhost# mount node1:/nfsexportedfilesystem /mnt/test
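As a simple sanity check before the failover tests (a hypothetical test file; this assumes the export is read-write), you might verify that the mount is usable:
otherhost# touch /mnt/test/testfile1
otherhost# ls -l /mnt/test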
4. Move the resource group containing the nfsserver resource from node1 to
node2:
node1# crm resource move dmfGroup node2
5. Run the following command on node2 to verify that the NFS filesystems are
exported:
node2# exportfs -v
/work.mynode1 <world>(rw,wdelay,root_squash,no_subtree_check,fsid=8001)
/work.mynode2 <world>(rw,wdelay,root_squash,no_subtree_check,fsid=8002)
/work.mynode3 <world>(rw,wdelay,root_squash,no_subtree_check,fsid=8003)
/work.mynode4 <world>(rw,wdelay,root_squash,no_subtree_check,fsid=8004)
/mirrors <world>(ro,wdelay,root_squash,no_subtree_check,fsid=8005)
/ <world>(ro,wdelay,root_squash,no_subtree_check,fsid=8006)
7. Move the resource group containing the nfsserver resource back to node1:
node1# crm resource move dmfGroup node1
Samba Resources
This section discusses examples of the following:
• "Configuring Samba for HA" on page 135
• "Creating the smb Primitive" on page 137
• "Creating the nmb Primitive" on page 138
• "Testing the Samba Resources" on page 140
• On node2:
node2# rm -r /etc/samba
c. Make a symbolic link from the shared location to the original name on both
nodes:
• On node1:
node1# ln -s /mnt/data/.ha/etc-samba /etc/samba
• On node2:
node2# ln -s /mnt/data/.ha/etc-samba /etc/samba
2. Make the /var/lib/samba directory and the files within it available on both
nodes. For example:
a. Copy the /var/lib/samba directory and its contents to
/mnt/data/.ha/var-lib-samba on one node. For example, if using
node1:
node1# cp -r /var/lib/samba /mnt/data/.ha/var-lib-samba
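The remaining sub-steps presumably mirror the /etc/samba steps above; as a sketch (assuming the same shared location), you would remove the original directory and link the shared copy into place on both nodes:
node1# rm -r /var/lib/samba
node1# ln -s /mnt/data/.ha/var-lib-samba /var/lib/samba
node2# rm -r /var/lib/samba
node2# ln -s /mnt/data/.ha/var-lib-samba /var/lib/samba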
3. Disable the smb and nmb services from being started automatically at boot time
on both nodes:
• On node1:
node1# chkconfig smb off
node1# chkconfig nmb off
• On node2:
node2# chkconfig smb off
node2# chkconfig nmb off
Note: The Samba resources do not have as many required fields or attributes as other
resources.
/etc/init.d/smb start
• Fails if the smbd service does not start
Type nmb
Note: Depending upon the setting of the security parameter in the smb.conf
file, this may involve using a Samba account that already exists.
3. Move the resource group containing the smb and nmb resources from node1 to
node2:
node1# crm resource move dmfGroup node2
5. Move the resource group containing the smb and nmb resources back to node1:
node1# crm resource move dmfGroup node1
• On node2:
node2# chkconfig dmfman off
2. Add a primitive for DMF Manager using the values shown in "Creating the DMF
Manager Primitive" on page 142.
3. Run the dmfman_setup_ha script to create the required links and directories in a
commonly accessible filesystem (such as the DMF HOME_DIR) that will allow
DMF statistics archives to be accessible across the HA cluster (include the -u
option only if you are upgrading from a previous release):
• On node1:
node1# /usr/lib/dmf/dmfman_setup_ha -d HOME_DIR [-u node2name]
• On node2:
node2# /usr/lib/dmf/dmfman_setup_ha -d HOME_DIR [-u node1name]
For example, if the DMF home directory (HOME_DIR) is /dmf/home:
• On node1:
node1# /usr/lib/dmf/dmfman_setup_ha -d /dmf/home -u node2
• On node2:
node2# /usr/lib/dmf/dmfman_setup_ha -d /dmf/home -u node1
Note: You should only define a monitor operation for the dmfman resource if you
want a failure of the DMF Manager resource to cause a failover for the entire resource
group.
• Waits for DMF Manager to start successfully by calling the following in a loop:
/etc/init.d/dmfman status
• Fails if DMF Manager does not start successfully before the resource times out
• On node1:
node1# chkconfig dmfsoap off
• On node2:
node2# chkconfig dmfsoap off
2. Add a primitive for the DMF client SOAP service using the values shown in
"Creating the DMF Client SOAP Service Primitive" on page 146.
Note: You should only define a monitor operation for the dmfsoap resource if you
want a failure of DMF client SOAP service to cause a failover for the entire resource
group.
• Waits for DMF client SOAP service to start successfully by calling the following in
a loop:
/etc/init.d/dmfsoap status
• Fails if DMF client SOAP service does not start successfully before the resource
times out
• Verifies the DMF client SOAP service status by calling the following:
/etc/init.d/dmfsoap status
• Fails if the DMF client SOAP service does not stop successfully
3. Repeat step 1 to verify that DMF client SOAP service is still available.
4. Move the resource group containing the dmfsoap resource back to node1:
node1# crm resource move dmfGroup node1
Chapter 8
COPAN MAID HA Service for Mover Nodes
Note: The attributes listed in this chapter and the various value recommendations are
in support of this example. If you are using the resources in a different manner, you
must evaluate whether these recommendations and the use of meta attributes apply
to your intended site-specific purpose.
Figure 8-1 on page 150 shows a map of an example configuration process for the
COPAN OpenVault client service for COPAN MAID shelves in an active/active HA
cluster that consists of two parallel data mover nodes named pdmn1 and pdmn2.
(pdmn1 is the same node as node1 referred to in Chapter 4, "Outline of the
Configuration Procedure" on page 39.) The map refers to resource agent type names
such as cxfs-client and copan_ov_client.
Figure 8-1 Configuration map: first configure the cxfs-client clone, then the copan_ov_client primitives, and last the resource location, colocation, and order constraints; the cxfs-client clone starts before copan_ov_client.
2. Verify that there are no dmatwc or dmatrc data mover processes running on
either parallel data mover node. For example, the output of the following
command should be empty on both nodes:
• On pdmn1:
pdmn1# ps -ef | egrep ’dmatrc|dmatwc’ | grep -v grep
pdmn1#
• On pdmn2:
pdmn2# ps -ef | egrep ’dmatrc|dmatwc’ | grep -v grep
pdmn2#
If the output is not empty, you must wait for the dmnode_admin -d action from
step 1 to complete (the entire process can take 6 minutes or longer). Rerun the ps
command until there is no output.
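If you prefer not to rerun the command by hand, a small shell loop (a convenience sketch, not part of the documented procedure) polls until the mover processes are gone:
pdmn# while ps -ef | egrep 'dmatrc|dmatwc' | grep -v grep; do sleep 30; done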
3. Determine which CXFS filesystems are mounted:
# ls /dev/cxvm
Save the output from this command for use later when you define the volnames
instance attribute in "Instance Attributes for a CXFS Client" on page 159.
4. On both parallel data mover nodes, disable the openvault and cxfs_client
services from being started automatically at boot time and stop the currently
running services:
• On pdmn1:
pdmn1# chkconfig openvault off
pdmn1# chkconfig cxfs_client off
• On pdmn2:
pdmn2# chkconfig openvault off
pdmn2# chkconfig cxfs_client off
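If either service is currently running, the corresponding stop commands (the same service invocations used elsewhere in this guide) would be, on each node:
pdmn1# service openvault stop
pdmn1# service cxfs_client stop
pdmn2# service openvault stop
pdmn2# service cxfs_client stop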
Note: For more information about the shelf identifier, see COPAN MAID for DMF
Quick Start Guide.
Do the following:
1. On pdmn1:
a. Export the shelf, hostname, and OCF root environment variables for use by
the copan_ov_client script:
pdmn1# export OCF_RESKEY_shelf_name=C00
pdmn1# export OCF_RESKEY_give_host=pdmn2
pdmn1# export OCF_ROOT=/usr/lib/ocf
2. On pdmn2:
a. Verify that pdmn2 now owns the shelf’s XVM volumes (C00A through C00Z,
although not necessarily listed in alphabetical order):
pdmn2# xvm -d local probe | grep C00
phys/copan_C00M
phys/copan_C00B
phys/copan_C00G
...
For more information, see COPAN MAID for DMF Quick Start Guide.
c. Stop the newly created LCP and DCPs for the shelf:
pdmn2# ov_stop C00*
d. Export the shelf, hostname, and OCF root environment variables for use by
the copan_ov_client script:
pdmn2# export OCF_RESKEY_shelf_name=C00
pdmn2# export OCF_RESKEY_give_host=pdmn1
pdmn2# export OCF_ROOT=/usr/lib/ocf
3. On pdmn1, verify that pdmn1 once again owns the shelf’s XVM volumes:
pdmn1# xvm -d local probe | grep C00
phys/copan_C00M
phys/copan_C00B
phys/copan_C00G
...
Note: For load-balancing purposes, pdmn1 should be the default node for half of
the shelves and pdmn2 should be the default node for the remaining shelves.
b. Verify that the cxfs_client process is running on pdmn1 and pdmn2. For
example:
• On pdmn1:
pdmn1# ps -ef | grep cxfs_client | grep -v grep
root 11575 1 0 10:32 ? 00:00:00 /usr/cluster/bin/cxfs_client -p /var/run/cxfs_client.pid -i TEST
• On pdmn2:
pdmn2# ps -ef | grep cxfs_client | grep -v grep
root 11576 1 0 10:32 ? 00:00:00 /usr/cluster/bin/cxfs_client -p /var/run/cxfs_client.pid -i TEST
3. Set pdmn2 to standby state to ensure that the resources remain on pdmn1:
pdmn1# crm node standby pdmn2
4. Confirm that pdmn2 is offline and that the resources are off:
a. View the status of the cluster on pdmn1, which should show that pdmn2 is in
standby state:
pdmn1# crm status
============
Last updated: Tue Aug 23 12:16:55 2011
Stack: openais
Current DC: pdmn1 - partition with quorum
Version: 1.1.5-5bd2b9154d7d9f86d7f56fe0a74072a5a6590c60
2 Nodes configured, 2 expected votes
2 Resources configured.
============
b. Verify that the cxfs_client process is not running on pdmn2. For example,
executing the following command on pdmn2 should provide no output:
pdmn2# ps -ef | grep cxfs_client | grep -v grep
pdmn2#
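For reference, a node in standby state is returned to service with the standard crm command shown below; treat this as a sketch and confirm it against your own procedure before relying on it:
pdmn1# crm node online pdmn2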
6. Confirm that the clone has returned to started status, as described in step 2.
Note: It may take several minutes for all filesystems to mount successfully.
a. Click the Add button, select Resource Location, and click OK.
b. Enter the constraint ID, such as C00_on_pdmn1 for the constraint for shelf
C00 managed by pdmn1.
c. Select the name of the COPAN OpenVault client primitive as the Resource.
d. Enter a score based on whether the node is the default node (200) or failover
node (100).
e. Enter the node name.
f. Click OK.
g. Repeat steps 2a through 2f to create the remaining location constraints.
3. Create a colocation constraint for each shelf:
a. Click the Add button, select Resource Colocation, and click OK.
b. Enter the ID of the constraint for the shelf, such as C00_with_cxfs for shelf
C00.
c. Select the name of the COPAN OpenVault client primitive as the Resource.
d. Select the name of the CXFS client clone as the With Resource.
e. Select INFINITY for Score.
f. Click OK.
g. Repeat steps 3a through 3f to create the colocation constraint for the
remaining shelves.
4. Create an order constraint for each shelf:
a. Click the Add button, select Resource Order, and click OK.
b. Enter the ID of the constraint for the shelf, such as C00_cxfs_first for
shelf C00.
c. Select the name of the CXFS client clone as First.
d. Select the name of the COPAN OpenVault client primitive as Then.
e. Open Optional and select true for Symmetrical.
f. Click OK.
g. Repeat steps 4a through 4f to create the order constraint for the remaining
shelves.
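The steps above use the HA GUI. For readers who prefer the crm shell, roughly equivalent definitions for one shelf would look like the following sketch; the resource and clone names (copan_maid_C00 and cxfs-client-clone) are hypothetical and must match your configuration:
ha# crm configure location C00_on_pdmn1 copan_maid_C00 200: pdmn1
ha# crm configure location C00_on_pdmn2 copan_maid_C00 100: pdmn2
ha# crm configure colocation C00_with_cxfs inf: copan_maid_C00 cxfs-client-clone
ha# crm configure order C00_cxfs_first inf: cxfs-client-clone copan_maid_C00 symmetrical=true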
Note: There are no meta attributes for this primitive in this example procedure
because it is part of a clone resource that should always restart locally.
Note: Click the Operations tab to edit the monitor operations and to add the probe,
start, and stop operations as needed for a resource.
• Checks the /proc/mounts file until all volumes in volnames are mounted
• Fails if the CXFS client fails to start
Chapter 9
STONITH Examples
4. Use the target-role of Started and the default options (a maximum of 2 copies,
with 1 copy on a single node) and click Forward.
5. Select OK to add a Primitive. Add the STONITH primitive according to the steps
described in "Creating the SGI IPMI STONITH Primitive " on page 166.
2. Verify that the specified node was reset and was able to successfully complete a
reboot.
2. Verify that the specified node was reset and was able to successfully complete a
reboot.
Chapter 10
Administrative Tasks and Considerations
You can then manually stop and restart individual resources as needed.
To return the cluster to managed status, enter the following:
ha# crm configure property maintenance-mode=false
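For reference, the corresponding command that places the cluster into unmanaged (maintenance) mode is the same property set to true; this is standard Pacemaker usage rather than a step reproduced from this guide:
ha# crm configure property maintenance-mode=true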
You can view the resulting text file with any text tool, such as cat(1) or vi(1).
Caution: Do not clear the resource state on the node where a resource is currently
running.
After you resolve the cause of action error messages in the crm status output,
you should enter the following to clear the resource state from a given node:
ha# crm resource cleanup resourcePRIMITIVE nodename
Note: Sometimes, the resource state can be cleared automatically if the same action
for the same resource on the same node subsequently completes successfully.
3. Make the required changes to the DMF configuration file according to the
instructions in the DMF administrator’s guide, such as by using DMF Manager.
4. Verify the parameter changes by using DMF Manager or the following command:
ha# dmcheck
6. Verify DMF functionality, such as by running the following command and other
DMF commands (based on the changes made):
ha# dmdstat -v
3. Verify that there are no dmatwc or dmatrc data mover processes running on
pdmn1. For example, the output of the following command should be empty:
pdmn1# ps -ef | egrep ’dmatrc|dmatwc’ | grep -v grep
pdmn1#
If the output is not empty, you must wait for the dmnode_admin -d action from
step 2 to complete (the entire process can take 6 minutes or longer). Rerun the ps
command until there is no output.
4. Clear any failcounts and move the resource to pdmn2:
pdmn2# crm resource failcount copan_maid_C01 delete pdmn2
pdmn2# crm resource move copan_maid_C01 pdmn2
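After verifying the resource on pdmn2, you would normally clear the implicit location constraint created by the move, following the pattern used for other resources in this guide (a sketch):
pdmn2# crm resource unmove copan_maid_C01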
Assuming that you have a two-node production HA environment in place and want
to perform a rolling upgrade of appropriate software with minimal testing, use the
procedures in the following sections:
• "CXFS NFS Edge-Serving HA Rolling Upgrade" on page 181
• "DMF HA Rolling Upgrade" on page 183
• "COPAN MAID HA Service for Mover Nodes Rolling Upgrade" on page 185
4. Set the node you intend to upgrade to standby state. (Putting a node in
standby state will move, if possible, or stop any resources that are running on
that node.) For example, if you intend to upgrade node2:
node1# crm node standby node2
11. (Optional) Allow the resource groups to run on node2 for a period of time as a
test.
12. Repeat steps 4 through 11 above but switching the roles for node1 and node2.
Note: In most cases, you will want to leave the resource groups running on node2 in
order to avoid any unnecessary interruptions to the services that would have to be
restarted if they were moved to node1. However, if you prefer to have the resource
groups run on node1 despite any potential interruptions, do the following:
1. Move the appropriate resource group from node2 back to node1. For example:
node1# crm resource move ipalias-group-1 node1
Note: Moving the dmfGroup resource group will involve CXFS relocation of the
DMF administrative filesystems and DMF managed user filesystems. However,
you cannot use CXFS relocation if your CXFS cluster also includes a CXFS NFS
edge-server HA pair and the CXFS server-capable administration nodes are
running different software levels. If that is the case, you must move the
dmfGroup resource group via CXFS recovery by resetting the node that is
running the dmfGroup resource.
Note: Stopping openais will cause a failover if there are resources running
on the node. Depending on how things are defined and whether the resource
stop actions succeed, it might even cause the node to be reset.
7. Verify that node2 is fully back in the CXFS cluster with filesystems mounted.
8. Enable the openais service to be started automatically at boot time and then
start it immediately on node2:
node2# chkconfig openais on
node2# service openais start
Note: In most cases, you will want to leave the resource group running on node2 in
order to avoid any unnecessary interruptions to the services that would have to be
restarted if they were moved to node1. However, if you prefer to have the resource
group run on node1 despite any potential interruptions, do the following:
1. Move the appropriate resource groups from node2 back to node1:
node1# crm resource move dmfGroup node1
To stop the openais service on the local node, enter the following:
ha# service openais stop
Note: This command requires that the stonith resource is defined in the CIB.
Note: In most cases, you will want to leave the resources running on upnode in
order to avoid any unnecessary interruptions to the services that would have to be
restarted if they were moved to downnode. However, if you prefer to have the
resources run on downnode despite any potential interruptions, do the following:
1. Restart resources on downnode:
2. Stop all resources in the proper order (bottom up). For example, using the
example procedures in this guide, you would stop the IP alias resource groups
and the clone:
ha# crm resource stop ipalias-group-2
ha# crm resource stop ipalias-group-1
ha# crm resource stop cxfs-nfs-clone
3. Disable the services related to HA and CXFS from being started automatically at
boot time:
• On all HA servers:
ha# chkconfig openais off
4. Shut down all of the HA cluster systems and the CXFS cluster systems.
5. Perform the required maintenance.
6. Perform component-level testing associated with the maintenance.
7. Reboot all of the HA cluster systems and the CXFS cluster systems.
8. Enable the services related to CXFS to be started automatically at boot time and
start them immediately as follows:
• On all CXFS servers:
cxfsserver# chkconfig cxfs_cluster on
cxfsserver# chkconfig cxfs on
cxfsserver# service cxfs_cluster start
cxfsserver# service cxfs start
9. On the NFS edge servers, disable the cxfs_client service from being started
automatically at boot time and stop the currently running service immediately:
edge# chkconfig cxfs_client off
edge# service cxfs_client stop
10. On the HA servers, enable the openais service to be started automatically at
boot time and start it immediately:
ha# chkconfig openais on
ha# service openais start
14. Remove the implicit location constraints imposed by the administrative move
command above. For example:
ha# crm resource unmove ipalias-group-1
ha# crm resource unmove ipalias-group-2
3. Disable the services related to HA and CXFS (if applicable) from being started
automatically at boot time:
• On all HA servers:
ha# chkconfig openais off
4. Shut down all of the HA cluster systems and any CXFS cluster systems.
13. Remove the implicit location constraints imposed by the administrative move
command above:
ha# crm resource unmove dmfGroup
2. Verify that there are no dmatwc or dmatrc data mover processes running on
either node. For example, the output of the following command should be empty
on each parallel data mover node:
ha# ps -ef | egrep ’dmatrc|dmatwc’ | grep -v grep
ha#
If the output is not empty, you must wait for the dmnode_admin -d action from
step 1 to complete (the entire process can take 6 minutes or longer). Rerun the ps
command until there is no output.
3. On both parallel data mover nodes, disable services related to HA:
ha# chkconfig openais off
12. On the DMF server, reenable both nodes for COPAN MAID activity:
dmfserver# dmnode_admin -e pdmn1
dmfserver# dmnode_admin -e pdmn2
Chapter 11
Troubleshooting
Diagnosing Problems
If you notice problems, do the following:
• "Monitor the Status Output" on page 195
• "Verify the Configuration in Greater Detail" on page 196
• "Increase the Verbosity of Error Messages" on page 196
• "Match Status Events To Error Messages" on page 196
• "Verify chkconfig Settings" on page 197
• "Diagnose the Problem Resource" on page 197
• "Examine Application-Specific Problems that Impact HA" on page 197
• "Test the STONITH Capability" on page 198
• "Gather Troubleshooting Data" on page 198
• "Use SGI Knowledgebase" on page 201
Note: If you run crm_verify before STONITH is enabled, you will see errors. Errors
similar to the following may be ignored if STONITH is intentionally disabled and will
go away after STONITH is reenabled (line breaks shown here for readability):
crm_verify[182641]: 2008/07/11_16:26:54 ERROR: unpack_operation:
Specifying on_fail=fence and
stonith-enabled=false makes no sense
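A typical invocation checks the live cluster configuration; -L reads the running CIB and -V increases verbosity (standard crm_verify options, shown here as a sketch):
ha# crm_verify -L -V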
Caution: Ensure that you do not start a resource on multiple nodes. Verify that a
resource is not already up on another node before you start it.
2. Verify that the specified node was reset and was able to successfully complete a
reboot.
Run the following commands as root on every node in the cluster in order to gather
system configuration information:
ha# /usr/sbin/system_info_gather -A -o node.out
ha# /sbin/supportconfig
Collect any other system log files that may contain information about the openais
service or the services included in the HA configuration (if not otherwise gathered by
the above tools).
Use the hb_report command to collect HA cluster log and configuration data:
ha# hb_report -f priortime destination_directory
where:
• priortime specifies a time prior to when the problem began (specify priortime in
Date::Parse Perl module format)
• destination_directory is the absolute pathname of a nonexistent directory that will
be created as a bzip2-compressed tarball in the format:
destination_directory.tar.bz2
For example, if run at 3:06 PM on June 2, 2010, the following command line will
create a report starting from 1:00 AM (0100) that day and place the output in
/tmp/hb_report.20100602-1506.tar.bz2:
ha# hb_report -f 1am /tmp/hb_report.$(date +%Y%m%d-%H%M)
d. Clear any implicit location constraints that may have been created by a
previous administrative move command:
ha# crm resource unmove resourceGROUP
2. Clearly delineate the start of each test in the logs by using the logger(1)
command. For example:
ha# logger "TEST START - testdescription"
Note: The HBA tests presume that the system has redundant Fibre Channel HBA
paths to storage.
Table 11-1 Failover Tests

Test: System reboot
Action: Reboot the active node:
active# reboot
Expected result: All resources running on the rebooted server should move to the failover node.

Test: Single simulated HBA failure
Action: Disable the port for the Fibre Channel HBA. For example:
brocade> portdisable portnumber
Note: Remember to reenable the port after the test. For example:
brocade> portenable portnumber
Expected result: A device failover will not actually occur until I/O is attempted via the failed HBA path. An XVM failover to an alternate path should occur after I/O is performed on the system.

Test: Multiple simulated HBA failures
Action: Disable the port for the Fibre Channel HBA. For example:
brocade> portdisable portnumber
Repeat for every HBA port on the system.
Note: Remember to reenable the ports after the test. For example:
brocade> portenable portnumber
Expected result: The server should be reset after I/O is performed on the system. There will likely be multiple monitor operation failures for various resources followed by a stop operation failure, which will result in a system reset and a forced XVM failover.
Corrective Actions
The following are corrective actions:
• "Recovering from an Incomplete Failover" on page 205
• "Recovering from a CIB Corruption" on page 206
• "Clearing the Failcounts After a Severe Error" on page 206
Note: This procedure assumes that you have a good backup copy of the CIB that
contains only static configuration information, as directed in "Backing Up the CIB" on
page 174.
For more information, see the SUSE High Availability Guide and the cibadmin(8) man
page.
Appendix A
Differences Among FailSafe®, Heartbeat, and Pacemaker/Corosync
Table A-1 summarizes the differences among the following, for those readers who
may be familiar with the older products:
• FailSafe®
• Linux-HA Heartbeat
• Pacemaker and Corosync
Note: These products do not work together and cannot form an HA cluster.
Table A-1 Differences among FailSafe, Heartbeat, and Pacemaker/Corosync

Size of cluster
FailSafe: 8 nodes.
Heartbeat: 8+ nodes (specific resource agents may have cluster size limitations; DMF can run on only 2 nodes in active/passive mode).
Pacemaker/Corosync: 16 nodes in active/passive mode for DMF, but 2 nodes recommended; 2 nodes for CXFS NFS edge-serving in active/active mode; 2 parallel data mover nodes for COPAN OpenVault client in active/active mode.

Node/member name
FailSafe: Hostname or private network address.
Heartbeat: Hostname and private network address.
Pacemaker/Corosync: Hostname and private network address.

NFS lock failover
FailSafe: Supported.
Heartbeat: Not supported by the operating system.
Pacemaker/Corosync: Supported in active/passive configurations.

Network tiebreaker
FailSafe: A node that is participating in the cluster membership; FailSafe tries to include the tiebreaker node in the membership in case of a split cluster.
Heartbeat: You can configure Heartbeat to use a variety of methods to provide tiebreaker functionality.
Pacemaker/Corosync: You can configure HA to use a variety of methods to provide tiebreaker functionality.

Making changes while the service is enabled
FailSafe: Depends upon the plug-in and the configuration device parameter.
Heartbeat: Service parameters can be changed while a service is running. Depending on the service and parameter, a change may cause a stop/start or trigger a restart action. SGI recommends that you do not make any changes that could stop or restart DMF and CXFS.
Pacemaker/Corosync: Service parameters can be changed while a service is running. Depending on the service and parameter, a change may cause a stop/start or trigger a restart action. SGI recommends that you do not make any changes that could stop or restart DMF and CXFS.

Heartbeat interval and timeout
FailSafe: You can specify cluster membership heartbeat interval and timeout (in milliseconds).
Heartbeat: Provides a number of parameters to tune node status monitoring and failure actions.
Pacemaker/Corosync: Provides a number of parameters to tune node status monitoring and failure actions.

Heartbeat networks
FailSafe: Allows multiple networks to be designated as heartbeat networks; you can choose a list of networks.
Heartbeat: You can configure Heartbeat to communicate over one or more private or public networks.
Pacemaker/Corosync: You can configure HA to communicate over one or more private or public networks.

Action scripts
FailSafe: Separate scripts named start, stop, monitor, restart, exclusive.
Heartbeat: Open Cluster Framework (OCF) resource agent specification, which may support start, monitor, stop, and restart actions as well as other more-specialized actions.
Pacemaker/Corosync: OCF resource agent specification, which may support start, monitor, stop, and restart actions as well as other more-specialized actions.

Resource timeouts
FailSafe: Timeouts can be specified for each action (start, stop, monitor, restart, exclusive) and for each resource type independently.
Heartbeat: Timeouts and failover actions are highly configurable.
Pacemaker/Corosync: Timeouts and failover actions are highly configurable.

Resource dependencies
FailSafe: Resource and resource type dependencies are supported and can be modified by the user.
Heartbeat: Provides great flexibility to configure resource dependencies.
Pacemaker/Corosync: Provides great flexibility to configure resource dependencies.

Failover policies
FailSafe: The ordered and round-robin failover policies are predefined; user-defined failover policies are supported.
Heartbeat: Heartbeat provides great flexibility to configure resource failover policies.
Pacemaker/Corosync: HAE provides great flexibility to configure resource failover policies.
Glossary
This glossary lists terms and abbreviations used within this guide. For more
information, see the SUSE High Availability Guide:
http://www.suse.com/documentation/sle_ha/
active/active mode
An HA cluster in which multiple nodes are able to run disjoint sets of resources, with
each node serving as a backup for another node’s resources in case of node failure.
active/passive mode
An HA cluster in which all of the resources run on one node and one or more other
nodes are the failover nodes in case the first node fails.
active node
The node on which resources are running.
BMC
Baseboard management controller, a system controller used in resetting x86_64 systems.
CIB
Cluster information base, used to define the HA cluster.
clone
A resource that is active on more than one node.
COPAN MAID
Power-efficient long-term data storage based on an enterprise massive array of idle
disks (MAID) platform.
Corosync
The underlying HA architecture that provides core messaging and membership
functionality.
CXFS
Clustered XFS.
DCP
Drive control program.
DMF
Data Migration Facility, a hierarchical storage management system for SGI
environments.
DMF Manager
A web-based tool you can use to deal with day-to-day DMF operational issues and
focus on work flow.
edge-serving
See CXFS NFS edge-serving.
failover node
The node on which resources will run if the active node fails or if they are manually
moved by the administrator. Also known as the passive node or standby node.
fencing
The method that guarantees a known cluster state when communication to a node
fails or actions on a node fail. (This is node-level fencing, which differs from the
concept of I/O fencing in CXFS.)
HA
High availability, in which resources fail over from one node to another without
disrupting services for clients.
HA fail policy
A parameter defined in the CIB that determines what happens when a resource fails.
HA-managed filesystem
A filesystem that will be made highly available according to the instructions in this
guide.
HA service
The set of resources and resource groups that can fail over from one node to another
in an HA cluster. The HA service is usually associated with an IP address.
IPMI
Intelligent Platform Management Interface, a system reset method for x86_64 systems.
ISSP
InfiniteStorage Software Platform, an SGI software distribution.
LCP
Library control program.
LSB
Linux Standard Base.
node1
In the examples in this guide, the initial host (which will later become a node in the
HA cluster) on which all of the filesystems will be mounted and on which all tape
drives and libraries are accessible. See also node2.
NSM
Network Status Monitor.
node2
In the examples in this guide, the alternate host in the HA cluster other than the first
node (node1). See also node1.
OCF
Open Cluster Framework.
OpenAIS
The underlying HA product that provides an HA API for certain applications, using
the openais service.
OpenVault
A tape mounting service used by DMF.
owner node
The node that DMF will use to perform migrations and recalls on a given shelf. The
node on which you run ov_copan becomes the owner node of that shelf. In an HA
environment, ownership is transferred as part of HA failover.
Pacemaker
The underlying HA architecture that provides cluster resource management.
physvol
XVM physical volume.
primitive
Used to define a resource in the CIB.
resource
An application that is managed by HA.
resource agent
The software that allows an application to be highly available without modifying the
application itself.
resource group
A set of resources that are colocated on the same node and ordered to start and stop
serially. The resources in a resource group will fail over together as a set.
resource stickiness
An HA concept that determines whether a resource should migrate to another node
or stay on the node on which it is currently running.
serverdir directory
A directory dedicated to holding OpenVault’s database and logs within a highly
available filesystem in the DMF resource group.
SOAP
Simple Object Access Protocol.
split cluster
A situation in which cluster membership divides into multiple clusters, each claiming
ownership of the same filesystems, which can result in filesystem data corruption.
Also known as split-brain syndrome.
standard service
An application before HA has been applied to it.
STONITH
Shoot the other node in the head, the facility that guarantees cluster state by fencing
non-responsive or failing nodes.
TMF
Tape Management Facility, a tape mounting service used by DMF.
XFS
A filesystem implementation type for the Linux operating system. It defines the
format that is used to store data on disks managed by the filesystem.
WSDL
Web Services Description Language.
XVM
Volume manager for XFS filesystems (local XVM).