PowerHA 7.2 Configuration Guide - V1.0 PDF
Date: 5/3/18
CONFIDENTIAL - FOR INTERNAL USE ONLY
Table of Contents
1) Pre-requisites verification and installation
2) PowerHA Removal
3) cfgscsi_id Configuration
3.1) Create Event
3.2) Validate Event
3.3) Update node_up Event
3.4) Validate node_up
3.5) Update node_down Event
3.6) Validate node_down
4) /etc/hosts Update [Multi-node Configuration Section]
4.1) Sample Configuration
5) /etc/cluster/rhosts Update [Multi-node Configuration Section]
5.1) Sample Configuration
6) Host Connectivity Test [Multi-node Configuration Section]
7) Multicast Connectivity Test [Multi-node Configuration Section]
8) Create Cluster
8.1) Configuration options
8.2) Sample Configuration Selections
8.3) Validation
8.4) Set the non-primary networks to “private”
9) Disable ROOTVG failure detection
10) Perform initial cluster synchronization
11) Configure netmon.cf [Multi-node Configuration Section]
11.1) Example
11.2) Validation
12) Define Service IPs
12.1) Configuration Selections
12.2) Sample Configuration Selections
12.3) Validation
13) Define Each Resource Group
13.1) Configuration Selections
13.2) Sample Configuration Selections
13.3) Additional HADR group dependency configuration
13.4) Sample HADR group dependency configuration Selections
13.5) Validation
14.1) Prerequisite [Multi-node Configuration Section]
14.2) PowerHA Menu
14.3) Configuration Selections
14.4) Sample Configuration Selections
14.5) Validation
15) Create Shared Logical Volumes
15.1) PowerHA Menu
15.2) Configuration Selections
15.3) Sample Configuration Selections
15.4) Validation
16) Create Shared Filesystems
16.1) PowerHA Menu
16.2) Configuration Selections
16.3) Sample Configuration Selections
16.4) Validation
17) Synchronize the cluster
18) Start Cluster Services
18.1) Validation
19) Application Install
20) Create an Application Server
20.1) Sample Configuration command (it’s a single line)
20.2) Validation
21) Configure Application Monitoring
21.1) Sample Configuration Commands (single line commands)
21.2) Validation
22) Finalize Resource Group
22.1) Sample Configuration command (single line command)
22.2) Validation
23) Synchronize HACMP Resources
24) Failover Resource Groups
24.1) Failover command
24.2) Sample command
24.3) Validation
25) Create a Cluster Snapshot
25.1) Create snapshot command
25.2) Configuration Selections
25.3) Sample command
25.4) Validation
1) Convert an Existing Volume Group
1.1) Prerequisite
1.2) Smit Menu
1.3) Configuration Selections
1.4) Sample Configuration Selections
1.5) Prepare Volume Group on Primary Node
1.6) Import Volume Group on Secondary Node
1.7) PowerHA Menu
1.8) Configuration command
1.9) Sample Configuration command
1.10) Validation
Document Log
Summary of Changes
Revision Date   Revision Number   Editor           Nature of Change
02/10/17        1.0               Hector Aguirre   Initial Release – based on PowerHA 7 document v1.3
Document Distribution
This document is automatically distributed to all document approvers and for future reference kept in
IBM QMX Database. Printed copies are for reference only and are not controlled.
Overview
This document outlines the procedure for configuring IBM’s PowerHA clustering software for use within
the American Express environment. PowerHA provides high availability for applications through
utilizing various levels of hardware and software redundancy. This document should be used for AIX
servers only.
Prerequisites
1) Server must be racked, powered and cabled.
2) Operating system must be AIX 7.1 TL4 or AIX 7.2 TL1, with the latest service pack possible
3) All required IPs and shared disk storage must be configured
Document Convention
For most of the configuration options a CLI command will be provided. For those that require the use of
smitty, a table similar to the one below will be provided.
Fastpath    smitty cm_config_nodes.add_dmn
Menu        PowerHA Initialization and Standard Configuration > Configure an HACMP Cluster and Nodes
The first row lists the smitty fastpath that takes you directly to this configuration option. The second
row shows how to navigate to this menu from the main “smitty hacmp” menu.
Most sections will then contain a “Configuration Selections” section that outlines the parameters that
need to be set for any given option. Fields that will need user input will be bolded blue text. These
sections will be followed by an example which shows the values that were used when setting up a lab
cluster. Finally, there will be a validation section that contains commands to verify that the intended
outcome has been achieved.
Topology – The communication network used between the nodes within the cluster. This is used for the
nodes to communicate their health and resource group status. Typically this will be made up of TCP/IP
based networks such as standard IP based interfaces and non-TCP/IP based networks such as heartbeating
through shared disks.
PowerHA 7.2 topology differs from PowerHA 5 and from PowerHA 7.1. The disk heartbeat is replaced
with a repository disk, and network communication also changes from PowerHA 7.1: multicast is no
longer mandatory for cluster communication.
Resource Group – This is a combination of all of the elements required to run a given application.
Generally speaking this will consist of the shared storage and filesystems required for the application, any
required IPs (service IPs), the corresponding application server and monitors, and the application failover
behavior preferences.
Service IPs – Virtual IPs that are kept highly available between the nodes in a cluster.
Application Controller Script – Any start up and shut down processes that are required to bring an
application online and offline.
Application Monitor – The process required to evaluate the health of the application.
PowerHA Software Installation
Pre-requisites verification and installation
To install PowerHA 7.2 we need AIX 7.1 TL4 or AIX 7.2 TL1 with some additional filesets. So we need
to check the oslevel and validate that the required filesets are in place.
“oslevel -s” should return 7100-04-04-1717 or higher for AIX 7.1 or 7200-01-02-1717 or higher for AIX
7.2. If it’s lower, apply the latest patch bundle.
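A quick check on each node (the fileset pattern in the grep is only an example; compare the reported
levels against the required fileset list below):
# oslevel -s
# lslpp -L | grep -iE "bos.cluster|rsct"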
The following filesets need to be installed and at the level corresponding to the “oslevel”:
PowerHA Removal
If you need to remove PowerHA, execute the following commands.
# mount sysadmin:/wasmast/env_scripts /wasmast
# /wasmast/installPowerHA72.pl -backout
User is logged in as root
++++Checking for PowerHA Software++++
++++Mounting remote NFS filesystem++++
++++Removing PowerHA Software++++
Successful uninstall
++++Backout completed successfully!++++
++++Unmounting remote NFS filesystem++++
# umount /wasmast
After running the script in backout mode on both nodes, do this on one of the nodes:
# lscluster -c
# rmcluster -n <cluster name>
PowerHA Cluster Configuration
The sections below begin the process of defining the PowerHA cluster configuration. The configuration
process takes place on only one node within the cluster. There are several points in the process defined
below where cluster configuration is synchronized between the nodes. Actions that require updates on
the secondary nodes (such as removing reserve locks) are annotated with [Multi-node Configuration
Section] in the header.
notify = ""
pre = "cfgscsi_id"
post = ""
recv = ""
count = 0
event_duration = 0
PowerHA Cluster Configuration
/etc/hosts Update [Multi-node Configuration Section]
The /etc/hosts file is the source for PowerHA host related resolution. You must first update the /etc/hosts
file to ensure that all aliases and IPs are listed and are identical on all nodes. Specify both the short name
and the fully qualified name for each IP entered. PowerHA does not use DNS server resolution.
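As an illustration, using the lab cluster addresses shown in the validation sections later in this document
(the domain name is assumed and shown only to illustrate the short/fully qualified format):
10.22.84.53     avlmd510.ipc.us.aexp.com       avlmd510
10.22.84.61     avlmd511.ipc.us.aexp.com       avlmd511
10.29.12.197    bu-avlmd510.ipc.us.aexp.com    bu-avlmd510
10.29.12.57     bu-avlmd511.ipc.us.aexp.com    bu-avlmd511
192.168.34.10   gpfs_avlmd510.ipc.us.aexp.com  gpfs_avlmd510
192.168.34.11   gpfs_avlmd511.ipc.us.aexp.com  gpfs_avlmd511
10.22.84.66     pddd013.ipc.us.aexp.com        pddd013
10.22.84.93     pddd013sb.ipc.us.aexp.com      pddd013sb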
After copying this file to all nodes, run “refresh -s clcomd” on all of them.
Multicast Connectivity Test [Multi-node Configuration Section]
PowerHA 7.2 does not require a multicast IP or multicast to be enabled on the network.
PowerHA 7.2 can be set up to use unicast.
If you decide to use unicast instead of multicast, continue with the next section.
To validate that multicast is working we need to run the "mping" command as a receiver on one node and
as a sender on another.
On one node run:
mping -v -r -c 5 -a <xxx.xxx.xxx.xxx>
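On the other node, start the matching sender (the -s flag selects sender mode; use the same multicast
address and count):
mping -v -s -c 5 -a <xxx.xxx.xxx.xxx>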
Create Cluster
We will now create the cluster using the new command line interface.
IMPORTANT: The repository disk along with any other shared disk MUST have the reserve_policy
changed to “no_reserve” before they are used in any cluster configuration.
Multicast:
# clmgr add cluster cl_avlmd510_avlmd511 repository=hdiskpower13 nodes=avlmd510,avlmd511
CLUSTER_IP=239.192.0.105
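Unicast (a sketch assuming the same nodes and repository disk; HEARTBEAT_TYPE is the attribute that
selects unicast in place of a multicast cluster IP):
# clmgr add cluster cl_avlmd510_avlmd511 repository=hdiskpower13 nodes=avlmd510,avlmd511
HEARTBEAT_TYPE=unicast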
8.3) Validation
After the command has completed, issue the command /usr/es/sbin/cluster/utilities/cltopinfo and validate
that the cluster configuration process has properly identified all of the common networks you defined
within the /etc/hosts file.
# /usr/es/sbin/cluster/utilities/cltopinfo
Cluster Name: cl_avlmd510_avlmd511
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
Repository Disk: hdiskpower13
Cluster IP Address: 239.192.0.105
There are 2 node(s) and 3 network(s) defined
NODE avlmd510:
Network net_ether_01
avlmd510 10.22.84.53
Network net_ether_02
bu-avlmd510 10.29.12.197
Network net_ether_03
gpfs_avlmd510 192.168.34.10
NODE avlmd511:
Network net_ether_01
avlmd511 10.22.84.61
Network net_ether_02
bu-avlmd511 10.29.12.57
Network net_ether_03
gpfs_avlmd511 192.168.34.11
For example, in the cluster shown above we would need to define net_ether_02 and 03 as private:
WARNING: After this change the system logs may start to get a lot of false alerts about rootvg.
This is a known issue and at the time of this writing it’s being investigated under PMR: 55508,227,000
After performing the next step (cluster sync) you must immediately reboot all nodes to prevent the log
file from growing and filling the filesystem.
Keep an eye on the log file /var/adm/syslogs/system.debug and, if you see errors like “kern:debug unix:
ROOTVG” being repeated constantly, report the issue immediately.
Configure netmon.cf [Multi-node Configuration Section]
There is currently a limitation within PowerHA when using virtual ethernet cards or logical host ethernet
adapters. In short, the hypervisor sends traffic to these interfaces, which can introduce scenarios where
PowerHA cannot determine whether an interface is properly functioning. Full details can be found at
http://www-01.ibm.com/support/docview.wss?uid=isg1IZ01331. To resolve this issue we will configure
a series of ping addresses within the netmon.cf file.
Edit the file /usr/es/sbin/cluster/netmon.cf on all nodes with pingable addresses in the following format:
!REQD <interface as enX> <ping address>
The ping addresses to use for the public interface are as follows for both Phoenix and DR:
<Your default router>
148.173.250.27 # DNS server
148.173.250.201 # DNS server
148.173.251.90 # NIM Server
For IPC2 use the default GW and the IPC2’s DNS and NIM servers.
For DR:
10.10.40.1 # BEN admin/utility router
10.10.40.87 # NIM Server
11.1) Example
# cat /usr/es/sbin/cluster/netmon.cf
!REQD en0 148.171.94.1
!REQD en0 148.173.250.27
!REQD en0 148.173.250.201
!REQD en0 148.173.251.90
!REQD en1 10.74.248.55
!REQD en1 10.74.248.1
!REQD en1 10.74.250.1
11.2) Validation
Ensure you can ping all of the addresses you have selected for your netmon.cf file. The following for
loop can be used as an easy test:
for i in `awk '{print $3}' /usr/es/sbin/cluster/netmon.cf`
do
  ping -c 2 -w 2 $i >/dev/null 2>&1
  if [ $? -eq 0 ]; then
    echo Ping to $i succeeded
  else
    echo Ping to $i failed
  fi
done
12.3) Validation
After the command has completed, issue the command /usr/es/sbin/cluster/utilities/cltopinfo and validate
that the service IPs have been associated with the appropriate networks.
# /usr/es/sbin/cluster/utilities/cltopinfo
Cluster Name: cl_avlmd510_avlmd511
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
Repository Disk: hdiskpower13
Cluster IP Address: 239.192.0.105
There are 2 node(s) and 3 network(s) defined
NODE avlmd510:
Network net_ether_01
pddd013sb 10.22.84.93
pddd013 10.22.84.66
avlmd510 10.22.84.53
Network net_ether_02
bu-avlmd510 10.29.12.197
Network net_ether_03
gpfs_avlmd510 192.168.34.10
NODE avlmd511:
Network net_ether_01
pddd013sb 10.22.84.93
pddd013 10.22.84.66
avlmd511 10.22.84.61
Network net_ether_02
bu-avlmd511 10.29.12.57
Network net_ether_03
gpfs_avlmd511 192.168.34.11
No resource groups defined
Define Each Resource Group
You are now defining a resource group for each application that needs to fail over independently.
For DB2 HADR configuration you must create 2 resource groups for the same instance.
Only one HADR instance per cluster is supported with this procedure.
Command line
# clmgr add resource_group rg_name startup=OHN fallover=FNPN fallback=NFB nodes=node1,node2
HADR requires a dependency between the two resource groups to make sure they are not started on the
same node.
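A sketch of the dependency definition, assuming the DIFFERENT_NODES dependency type reported by the
validation command in 13.5; the HIGH/LOW attribute names mirror the SMIT priority fields and should be
verified on your clmgr release:
# clmgr add dependency TYPE=DIFFERENT_NODES HIGH=rg_<instance> LOW=rg_s_<instance>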
13.5) Validation
After the command has completed, issue the command /usr/es/sbin/cluster/utilities/cltopinfo and validate
that the new resource group is listed with the required settings.
For HADR also validate the dependency with this command: clmgr -v query dependency
type="DIFFERENT_NODES"
# /usr/es/sbin/cluster/utilities/cltopinfo
Cluster Name: cl_avlmd510_avlmd511
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
Repository Disk: hdiskpower13
Cluster IP Address: 239.192.0.105
There are 2 node(s) and 3 network(s) defined
NODE avlmd510:
Network net_ether_01
pddd013sb 10.22.84.93
pddd013 10.22.84.66
avlmd510 10.22.84.53
Network net_ether_02
bu-avlmd510 10.29.12.197
Network net_ether_03
gpfs_avlmd510 192.168.34.10
NODE avlmd511:
Network net_ether_01
pddd013sb 10.22.84.93
pddd013 10.22.84.66
avlmd511 10.22.84.61
Network net_ether_02
bu-avlmd511 10.29.12.57
Network net_ether_03
gpfs_avlmd511 192.168.34.11
Skip this step for DB2 HADR since there are no shared disks in an HADR configuration.
You will now create a shared volume group for each application that needs to fail over. If your volume
group already exists refer to Appendix A instead for the process to convert an existing volume group.
systems. When using EMC devices, the command “powermt display dev=all” can be used to match the
“Logical device ID”. For other disk types, the following commands can be used to ensure you have
paired the right devices:
lsattr -El <disk> -a lun_id
lscfg -vl <disk> | grep “Serial Number”
odmget -q "attribute=unique_id and name=<disk>" CuAt
Once you have identified the devices that will be used in the shared volume group, run the following
commands to prepare your disks for PowerHA on all nodes in the cluster.
On node 1:
chdev -l <disk> -a pv=yes
chdev -l <disk> -a reserve_policy=no_reserve
On node 2 … N:
cfgmgr
chdev -l <disk> -a reserve_policy=no_reserve
Next we must identify a common major number to use for each volume group. As a standard numbering
convention we will use numbers starting at 100. The following command provides a list of available
major number ranges.
/usr/es/sbin/cluster/sbin/cl_nodecmd /usr/sbin/lvlstmajor
Example:
# /usr/es/sbin/cluster/sbin/cl_nodecmd /usr/sbin/lvlstmajor
aplmd501: 35..99,101...
aplmd502: 35..99,101...
In the above example, the numbers 35-99 are available and everything else starting at 101. Since 100 is
already taken we would select 101.
14.4) Sample Configuration Selections
Node Names aplmd501,aplmd502
Resource Group Name [rg_udb3] +
PVID 00c4d532f5479504 00c4d532f548d36c
VOLUME GROUP name [udb3_vg]
Physical partition SIZE in megabytes 128 +
Volume group MAJOR NUMBER [101] #
Enable Fast Disk Takeover or Concurrent Access Fast Disk Takeover +
Volume Group Type Scalable
14.5) Validation
Running the command /usr/es/sbin/cluster/cspoc/cl_ls_shared_vgs will list out all shared volume groups.
# /usr/es/sbin/cluster/cspoc/cl_ls_shared_vgs
#Volume Group Resource Group Node List
caavg_private <None> aplmd501,aplmd502
udb3_vg rg_udb3 aplmd501,aplmd502
Create Shared Logical Volumes
Skip this step for DB2 HADR since there are no shared disks in an HADR configuration.
Configuration Alternative
If you do not want to calculate the number of logical partitions needed, you can use the command line
interface as follows:
/usr/sbin/cluster/cspoc/smitlvm -17 <resource group> -y <lv name> -t <filesystem type> <volume group>
<size: M for meg or G for gig>
Example:
/usr/sbin/cluster/cspoc/smitlvm -17 rg_udb3 -y pddd714_bp_lv -t jfs2 udb3_vg 8G
15.4) Validation
Running the command /usr/sbin/cluster/sbin/cl_lsfreelvs will list out the logical volumes that are not yet
associated with a filesystem.
# /usr/sbin/cluster/sbin/cl_lsfreelvs
pddd714_bp_lv aplmd501,aplmd502
Create Shared Filesystems
Skip this step for DB2 HADR since there are no shared disks in an HADR configuration.
16.4) Validation
Running the command /usr/sbin/cluster/sbin/cl_lsfs will list out the known filesystems.
# /usr/sbin/cluster/sbin/cl_lsfs
Node: Name Nodename Mount Pt VFS Size Options Auto Accounting
aplmd501: /dev/pddd714_bkp_lv -- /backup/pddd714 jfs2 -- rw no no
Start Cluster Services
We need to start the services “now” (not at boot), with the clinfo daemon, automatically managing
resource groups and without broadcasting a message to all users:
For HADR it’s almost the same except that we don’t want to automatically manage resource groups:
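A sketch of both start commands, assuming the clmgr cluster start attributes (WHEN, MANAGE,
BROADCAST, CLINFO):
Standard:
# clmgr online cluster WHEN=now MANAGE=auto BROADCAST=false CLINFO=true
HADR:
# clmgr online cluster WHEN=now MANAGE=manual BROADCAST=false CLINFO=true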
18.1) Validation
Your command will complete but that does not mean the PowerHA daemons are yet operational.
Continue to monitor the status using the command: /usr/es/sbin/cluster/sbin/cl_nodecmd "lssrc -ls
clstrmgrES | grep state" while looking for the state of “ST_STABLE”.
Once the cluster daemons are up, run the command /usr/es/sbin/cluster/utilities/clRGinfo and validate that
each resource group has come online on the expected node.
Note: For HADR clusters the groups should be OFFLINE at this point.
# /usr/es/sbin/cluster/utilities/clRGinfo
-----------------------------------------------------------------------------
Group Name Group State Node
-----------------------------------------------------------------------------
rg_udb3 ONLINE aplmd501
OFFLINE aplmd502
Application Install
At this point we have validated that our shared storage has been properly configured and has come online
on our primary node. You can now install any application data to the shared storage according to the
application documentation. You can also change the mount point permissions as needed. Even though
the cluster is active you can manually unmount your filesystems without impacting PowerHA to change
mount points. When changing mount point permissions ensure that all nodes have the same permissions.
For DB2 HADR clusters refer to Appendix C for the appropriate scripts and settings.
HADR requires 2 Application Servers. Create them using the following naming convention:
as_<instance>
as_s_<instance>
20.2) Validation
Run the command “clmgr -v query application” to validate the application settings.
STOPSCRIPT="/opt/PowerHAscripts/db2.hadr.stop-pddd013-PASDF01D-primary.ksh"
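For reference, a sketch of the complete 20.1 definition for the HADR primary, assuming the clmgr
application_controller object and the start/stop scripts named in the monitor examples below:
# clmgr add application_controller as_pddd013 \
STARTSCRIPT="/opt/PowerHAscripts/db2.hadr.start-pddd013-PASDF01D-primary.ksh" \
STOPSCRIPT="/opt/PowerHAscripts/db2.hadr.stop-pddd013-PASDF01D-primary.ksh"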
For DB2 HADR clusters refer to Appendix C for the appropriate scripts and settings.
HADR requires 2 Application Monitors. Create them using the following naming convention:
asm_<instance>
asm_s_<instance>
HADR Primary
# clmgr add application_monitor asm_pddd013 TYPE="custom" \
APPLICATIONS="as_pddd013" MODE="both" \
MONITORMETHOD="/opt/PowerHAscripts/db2.hadr.monitor-pddd013-PASDF01D-primary.ksh" \
MONITORINTERVAL="45" \
HUNGSIGNAL="9" \
RESTARTINTERVAL="0" \
STABILIZATION="180" \
RESTARTCOUNT="0" \
FAILUREACTION="fallover" \
CLEANUPMETHOD="/opt/PowerHAscripts/db2.hadr.stop-pddd013-PASDF01D-primary.ksh" \
RESTARTMETHOD="/opt/PowerHAscripts/db2.hadr.start-pddd013-PASDF01D-primary.ksh"
HADR Standby
# clmgr add application_monitor asm_s_pddd013 TYPE="custom" \
APPLICATIONS="as_s_pddd013" MODE="both" \
MONITORMETHOD="/opt/PowerHAscripts/db2.hadr.monitor-pddd013-PASDF01D-standby.ksh" \
MONITORINTERVAL="120" \
HUNGSIGNAL="9" \
RESTARTINTERVAL="594" \
STABILIZATION="60" \
RESTARTCOUNT="3" \
FAILUREACTION="fallover" \
CLEANUPMETHOD="/opt/PowerHAscripts/db2.hadr.stop-pddd013-PASDF01D-standby.ksh" \
RESTARTMETHOD="/opt/PowerHAscripts/db2.hadr.start-pddd013-PASDF01D-standby.ksh"
21.2) Validation
Run the command “clmgr -v query monitor <monitor>” to validate the application settings.
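Section 22 attaches resources such as the application controller and service IP to the resource group. A
possible form of the 22.1 command, assuming the clmgr resource_group attributes that appear in the
validation output below:
# clmgr modify resource_group rg_pddd013 APPLICATIONS="as_pddd013" SERVICE_LABEL="pddd013"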
22.2) Validation
Run the command “clmgr -v query rg <resource group>”
# clmgr -v q rg rg_pddd013
NAME="rg_pddd013"
CURRENT_NODE="avlmd510"
NODES="avlmd510 avlmd511"
STATE="ONLINE"
TYPE="non-concurrent"
APPLICATIONS="as_pddd013"
STARTUP="OHN"
FALLOVER="FNPN"
FALLBACK="NFB"
…
SERVICE_LABEL="pddd013"
…
We will now validate that each resource group fails over properly to all secondary nodes.
Resource Group: your resource group
Destination Node: the fail over server
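From the command line, the move takes the same form used in Appendix D:
# clmgr move rg <resource group> node=<destination node>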
24.3) Validation
After waiting for a period of time greater than the stabilization interval that you defined above, run the
command /usr/es/sbin/cluster/utilities/clRGinfo <resource group>
# /usr/es/sbin/cluster/utilities/clRGinfo rg_pddd013
-----------------------------------------------------------------------------
Group Name Group State Node
-----------------------------------------------------------------------------
rg_pddd013 OFFLINE avlmd510
ONLINE avlmd511
Create a Cluster Snapshot
At this point we have a functional cluster that is ready for deployment. We will take a cluster snapshot in
case we ever need to restore to this point.
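A sketch of the snapshot creation command (25.1 through 25.3), assuming the clmgr snapshot object; the
snapshot name and description are illustrative:
# clmgr add snapshot cl_avlmd510_avlmd511_baseline DESCRIPTION="Post-build baseline"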
25.4) Validation
Run the command “clmgr query snapshot” and make sure the snapshot is listed.
Appendix A – Converting an Existing Volume Group
Convert an Existing Volume Group
You will now convert an existing standard volume group to a shared volume group for each application
that needs to fail over.
1.1) Prerequisite
Before you convert your volume group you must first configure your disks appropriately to ensure they
are ready to be shared between your systems. First you must identify the same disk/lun across your
systems. When using EMC devices, the command “powermt display dev=all” can be used to match the
“Logical device ID”. For other disk types, the following commands can be used to ensure you have
paired the right devices:
lsattr -El <disk> -a lun_id
lscfg -vl <disk> | grep “Serial Number”
odmget -q "attribute=unique_id and name=<disk>" CuAt
Once you have identified the devices that will be used in the shared volume group, run the following
commands to prepare your disks for PowerHA. If these settings are not already in place on node 1 you
will need to bring down all active filesystems in the volume group.
On node 1:
chdev -l <disk> -a reserve_policy=no_reserve
On node 2:
cfgmgr
chdev -l <disk> -a reserve_policy=no_reserve
Next we must ensure the major number for each volume group will not conflict with the secondary
server’s settings. As a standard numbering convention we will use numbers starting at 100. The
following command provides a list of available major number ranges.
/usr/es/sbin/cluster/sbin/cl_nodecmd /usr/sbin/lvlstmajor
Example:
# /usr/es/sbin/cluster/sbin/cl_nodecmd /usr/sbin/lvlstmajor
aplmd501: 35..99,101...
aplmd502: 35..99,101...
In the above example, the numbers 35-99 are available and everything else starting at 101. Since 100 is
already taken we would select 101.
1.4) Sample Configuration Selections
* VOLUME GROUP name                                         pddd777_vg
* Activate volume group AUTOMATICALLY at system restart?    no                   +
* A QUORUM of disks required to keep the volume group       yes                  +
  on-line?
Convert this VG to Concurrent Capable?                      enhanced concurrent  +
Change to big VG format?                                    no                   +
Change to scalable VG format?                               no                   +
LTG Size in kbytes                                          1024                 +
Set hotspare characteristics                                n                    +
Set synchronization characteristics of stale partitions     n                    +
Max PPs per VG in units of 1024                             32                   +
Max Logical Volumes                                         256                  +
Mirror Pool Strictness                                                           +
1. Stop any application that is using any of the filesystems on that volume group. The command fuser
<filesystem> can be used to determine the PID of any processes that are using that filesystem.
2. Unmount all of the filesystems associated with that volume group.
unmount <filesystem>
3. Turn any automount options off for each filesystem:
chfs -A no <filesystem>
4. Vary off the volume group:
varyoffvg <volume group>
5. Export the volume group:
exportvg <volume group>
6. Import the volume group using the new major number:
importvg -y <volume group> -V <major number> <disk for vg>
1.10) Validation
Running the command /usr/es/sbin/cluster/cspoc/cl_ls_shared_vgs will list out all shared volume groups.
# /usr/es/sbin/cluster/cspoc/cl_ls_shared_vgs
#Volume Group Resource Group Node List
heartbeat_vg <None> aplmd501,aplmd502
pddd777_vg rg_udb3 aplmd501,aplmd502
Appendix B – DB2 Monitoring Configuration Requirements
The settings required for a PowerHA DB2 configuration are defined in the AMEX owned ABB
document. The settings below are not expected to change, but this reference should not be considered the
source document for this information. The values listed below were pulled from
ABB_IBM_PowerHA_7.x_v1.doc. The location for the scripts/binaries is at the bottom of this appendix.
NOTE: The original ABB has a slightly different naming than the one used in this doc. This is due to a
change that was made during testing. We expect the ABB to be updated soon.
DB2 requires customized PowerHA scripts developed by the IBM and TIE teams to start, stop,
clean, and monitor the DB2 instances. There are also two binary programs: one runs as a daemon
and connects to the DB2 instance, and the other is a client of that connection daemon and can
detect whether the database is online. These scripts/binaries will be provided by IBM.
Since PowerHA 7 doesn’t support scripts with parameters, we created a wrapper that can be
symlinked and uses the symlink name to figure out the parameters for the actual script.
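As an illustration only (the actual wrapper is provided by IBM), a symlink-driven wrapper could look
roughly like this; the real script name and argument order are assumptions:
#!/usr/bin/ksh
# Hypothetical sketch: derive the action, instance, database and role from the
# name this script was invoked as, e.g. db2.hadr.start-pddd013-PASDF01D-primary.ksh
ME=$(basename "$0")
ACTION=$(print "$ME" | cut -d. -f3 | cut -d- -f1)    # start | stop | monitor
INSTANCE=$(print "$ME" | cut -d- -f2)                # e.g. pddd013
DB=$(print "$ME" | cut -d- -f3)                      # e.g. PASDF01D
ROLE=$(print "$ME" | cut -d- -f4 | cut -d. -f1)      # primary | standby
# Call the real parameterized script (name assumed) with the derived arguments
exec /opt/PowerHAscripts/db2.hadr.ksh "$ACTION" "$INSTANCE" "$DB" "$ROLE"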
Common DB2 Application Monitor configuration:
o Monitor Mode = Both
o Monitor Interval = 120
o Hung Monitor Signal = 9
o Stabilization Interval = 300
o Restart Count = 2
o Restart Interval = 924
o Action on Application Failure = fallover
o Cleanup Method = /opt/PowerHAscripts/db2.ha.stop-{Instance}.ksh
o Restart Method = /opt/PowerHAscripts/db2.ha.start-{Instance}.ksh
Other settings
o For clusters with multiple DB2 instances, balance the node priorities so that in
general an equal number of DB2 instances are running on each side of the cluster.
Script locations:
All of the required scripts and binaries listed above in the “PowerHA custom DB2 monitoring scripts and
functions” section should be copied to the /opt/PowerHAscripts directory. Mount the respective NFS
repositories, copy the files and then chmod 755 the files. The files must be created on all nodes in the
cluster.
The filesystem monitor scripts (fs_monitor.pl and fs_remount.pl) are owned and maintained by the OS
Engineering team and can be found at the following NFS repository:
appii501.ipc.us.aexp.com:/export/software/powerha/scripts
The remaining scripts are all owned and maintained by the DB team. The location for those scripts is
maintained by that team, but at the time this document was published they were available at the following
NFS repository:
sppiu527.ipc.us.aexp.com:/software/PowerHa/Aix
Appendix C – DB2 HADR Configuration Requirements
The settings required for a PowerHA DB2 HADR configuration are defined in the AMEX owned ABB
document. The settings below are not expected to change, but this reference should not be considered the
source document for this information. The values listed below were pulled from
ABB_IBM_PowerHA_7.2_v1.doc. The location for the scripts/binaries is at the bottom of this appendix.
NOTE: The original ABB has a slightly different naming than the one used in this doc. This is due to a
change that was made during testing. We expect the ABB to be updated soon.
DB2 HADR requires customized PowerHA scripts to start, stop, clean, and monitor the DB2
instances. These scripts/binaries will be provided by IBM.
Script locations:
All of the required scripts and binaries listed above should be copied to the /opt/PowerHAscripts
directory. Mount the respective NFS repositories, copy the files and then chmod 755 the files. The files
must be created on all nodes in the cluster.
The filesystem monitor scripts (fs_monitor.pl and fs_remount.pl) are owned and maintained by the OS
Engineering team and can be found at the following NFS repository:
appii501.ipc.us.aexp.com:/export/software/powerha/scripts
The remaining scripts are all owned and maintained by the DB team. The location for those scripts is
maintained by that team, but at the time this document was published they were available at the following
NFS repository:
sppiu527.ipc.us.aexp.com:/software/PowerHA/Aix
o Start Script = /opt/PowerHAscripts/db2.hadr.start-{Instance}-{db}-primary.ksh
o Stop Script = /opt/PowerHAscripts/db2.hadr.stop-{Instance}-{db}-primary.ksh
Other settings
o Configure the node priorities for both Resource Groups so that they have the
same nodes in the reverse order.
o Only the Primary resource group will have an IP label assigned (instance VIP)
o No VG will be assigned to any of the resource groups.
o Filesystems should be auto-mounted at boot time.
Appendix D – DB2 HADR startup and failover procedures
o Contact the DBA team to find out on which node each group should be started.
o Start the Standby (rg_s_<instance>) group first.
o Once the Standby group is “ONLINE”, proceed to start the Primary group
(rg_<instance>) on the other node.
Manual Failover
o Stop Standby group first (rg_s_<instance>).
# clmgr offline rg rg_s_<instance> node=<node2>
o Failover the Primary group to the other node (rg_<instance>).
# clmgr move rg rg_<instance> node=<node2>
o Start Standby group on the other node.
# clmgr online rg rg_s_<instance> node=<node1>