2.1 Informix High-Availability and Scalability
Informix Bootcamp
Information Management Technology Ecosystems
[Diagram: a Primary instance with a read-only HDR Secondary, both serving client apps]
© 2010 IBM Corporation
Agenda
• High Availability BEFORE Informix 11.10
  - High Availability Data Replication (HDR)
  - Enterprise Replication (ER)
• New High Availability Features in Informix 11.10
  - MACH 11 and required subcomponents
    - Remote Standalone Secondary (RSS)
    - Shared Disk Secondary (SDS)
  - Continuous Log Restore (CLR)
• New Features in Informix 11.50
  - Updatable Secondaries
  - Connection Manager
• New Features in Informix 11.70
  - Flexible Grid
  - Connection Manager Grid Support
• Other Supporting Features
• Appendix
Enterprise Replication (ER)
• Uses
  - Workload partitioning
  - Capacity relief
• The entire group of servers is the replication domain
• Any node within the domain can replicate data with any other node in the domain
• Servers in the domain can be configured as root, non-root, or leaf
• Multiple topologies supported: fully connected, hierarchical routing (hierarchical tree, forest of trees)
• Heterogeneous OS, Informix versions, and hardware for maximum implementation flexibility
• Secure data communication
• Update anywhere (bi-directional replication)
• Conflicting updates resolved by timestamp, stored procedure, or always-apply
• Based on log snooping rather than transaction based
• Configuration:
  • Primary:
    • LOG_INDEX_BUILDS: enable index page logging
    • Dynamically: onmode -wf LOG_INDEX_BUILDS=1
New RSS Configuration Parameters (11.50.xC5)
Dynamic: onmode -wf/-wm
• DELAY_APPLY
  • Used to configure RS secondary servers to wait for a specified period of time before applying logs
• LOG_STAGING_DIR
  • Specifies the location of log files received from the primary server when configuring delayed application of log files on RS secondary servers
• STOP_APPLY
  • Used to stop an RS secondary server from applying log files received from the primary server
Useful when a problem on the Primary should not be replicated to the Secondary server(s)
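As a sketch, these parameters might be set dynamically like this (the staging path and delay value are examples, not from the deck):

```shell
# Stage incoming logs on the RS secondary, then wait 30 minutes
# before applying them (D/H/M/S unit suffixes).
onmode -wf LOG_STAGING_DIR=/ifx/staging
onmode -wf DELAY_APPLY=30M

# If a problem is detected on the primary, halt the apply so it is
# not replicated to the secondary:
onmode -wf STOP_APPLY=1
```

These commands require a running RS secondary instance; they are shown here only as a configuration sketch.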
Shared Disk Secondary (SDS)
• Uses
  - Adjust capacity online as demand changes
  - Lower data storage costs
• How does it work?
  - Change the following ONCONFIG parameters to be unique for this SDS instance: DBSERVERALIASES, DBSERVERNAME, MSGPATH, SERVERNUM
  - Leave all other parameters the same
[Diagram: SDS instances attached to a shared disk with a hardware mirror]
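A sketch of the per-instance overrides in an SDS node's ONCONFIG file (all names and values here are illustrative; every other parameter stays identical to the primary's configuration):

```shell
# Hypothetical ONCONFIG fragment for an SDS instance
DBSERVERNAME    sds_1
DBSERVERALIASES sds_1_shm
MSGPATH         /ifx/sds_1/online.log
SERVERNUM       2
```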
Continuous Log Restore (CLR)
• Also known as "Log Shipping"
• Server is in roll-forward mode
• Logical log backups made from an IDS instance are continuously restored on a second machine
• Allows logical recovery to span multiple 'ontape/onbar' commands/logs
BENEFITS
• Provides a secondary instance with 'log file granularity'
• Does not impact the primary server
• Can co-exist with "the cluster" (HDR/RSS/SDS) as well as ER
• Useful when the backup site is totally isolated (i.e., no network)
• Ideal for disaster recovery
• Replay server logs when convenient
[Diagram: Primary shipping log backups to standby servers CLR1, CLR2, CLR3]
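A CLR cycle with ontape might look like the following sketch (the transfer step and the timing are assumptions; -C is ontape's continuous-log-restore option):

```shell
# On the primary: back up logical logs as they fill.
ontape -a

# Ship the log backup to the standby site by any available means
# (network copy, tape courier, ...) -- CLR works even with no network.

# On the standby: restore the shipped logs, staying in roll-forward mode.
ontape -l -C

# When finally bringing the standby online, complete the logical
# restore and put the server in multi-user mode:
ontape -l
onmode -m
```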
Updatable Secondary Servers
If the "before" image on the secondary is different from the current image on the primary, then the write operation is not allowed and an EVERCONFLICT (-7350) error is returned.
Updatable Secondary Servers: Row Versioning
• ifx_insert_checksum
  • insert checksum value
  • remains constant for the life of the row
• ifx_row_version
  • update version
  • incremented with each update of the row
• Use of row versions can reduce network traffic and improve performance
• If no vercols exist, the entire secondary "before" image is sent to the primary and compared to its image: slow and a network hog!
• Row versioning is optional but STRONGLY RECOMMENDED
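As a sketch, the version columns are added with the VERCOLS clause (the table and column names here are examples):

```sql
-- Create a table with the ifx_insert_checksum and ifx_row_version
-- shadow columns from the start:
CREATE TABLE orders (order_id INT, qty INT) WITH VERCOLS;

-- Or add row versioning to an existing table:
ALTER TABLE orders ADD VERCOLS;

-- Row versioning can be removed again if needed:
ALTER TABLE orders DROP VERCOLS;
```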
Connection Model starting with Informix 11.5
• The Connection Manager accepts a client connection request ("Which catalog instance?") and then re-routes that connection to one of the "best fit" OLTP nodes in the Informix cluster
• Also manages instance failovers
• Connection Manager Utility: oncmsm (Online Connection Manager and Server Monitor)
[Diagram: clients in Austin, Frisco, Dallas, Las Vegas, Paris, Tokyo, and Sao Paulo routed through OLTP, MART, and CATALOG service-level agreements to the Informix cluster]
• Example sqlhosts entries (production, production_shm, sds_1, hdr1, and rss_1 are the cluster instances):
    production      onsoctcp  mac_1       prod_tcp
    production_shm  onipcshm  mac_1       place_holder
    sds_1           onsoctcp  mac_2       sds1_tcp
    hdr1            onsoctcp  mac_3       hdr1_tcp
    rss_1           onsoctcp  mac_4       rss1_tcp
    dev_1           onsoctcp  georgetown  dev_1_tcp
• Example sqlhosts entries (all four are cluster instances):
    production      onsoctcp  mac_1       prod_tcp
    sds_1           onsoctcp  mac_2       sds1_tcp
    hdr1            onsoctcp  mac_3       hdr1_tcp
    rss_1           onsoctcp  mac_4       rss1_tcp
• Other Examples
    oncmsm -c /path_to_config_file
• Sample Output:
    CM name  host  sla     define     foc            flag  connections
    cm1      bia   oltp    primary    SDS+HDR+RSS,0  3     5
    cm1      bia   report  (SDS+RSS)  SDS+HDR+RSS,0  3     16
• FAILOVER_CALLBACK
  • Valid for secondary instances
  • Pathname to a program/script to execute if the server is promoted from secondary to primary
  • Can be used to issue an alert, take specific actions, etc.
    LOGFILE /opt/informix/3.50.EVP5/tmp/cm1.log
    DEBUG 1
• LOG 2
  • Prints how many bytes are received and sent for each session
• LOG 3
  • Dumps each communication buffer for each session (use with care; obviously a lot of data is to be expected)
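Putting these parameters together, a minimal Connection Manager configuration file might look like this sketch (all names, paths, and the SLA/FOC values are illustrative, in the 11.50-era format):

```shell
NAME    cm1
LOGFILE /opt/informix/tmp/cm1.log
DEBUG   1

SLA oltp=primary
SLA report=(SDS+RSS)

# Failover configuration: order of promotable nodes, then timeout
FOC SDS+HDR+RSS,0

# Script to run if a secondary is promoted to primary
FAILOVER_CALLBACK /opt/informix/etc/promote_alert.sh
```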
• Example password file entries:
    lx-rama   lx-rama     ravi  foobar
    toru      toru_2      usr2  fivebar
    seth_tcp  seth_alias  fred  9ocheetah
    cheetah   panther     anup  cmpl1cate
• Encrypt the file with a key:
    onpassword -k 34RogerSippl1 -e /user_data/my_stuff/my_passwd_file
• Useful if you have multiple replication servers and you often need to perform the same tasks on every replication server
• Requirements
  • Enterprise Replication must be running
  • Servers must be on Panther (11.70.xC1)
  • Pre-Panther servers within the ER domain cannot be part of the grid
What are the features of the new Informix Flexible Grid?
• The grid must exist, and the grid routines must be executed as an authorized user from an authorized server
• Enable
  • execute procedure ifx_set_erstate('on')
• Disable
  • execute procedure ifx_set_erstate('off')
• Get current state
  • execute function ifx_get_erstate();
  • A return of 1 means that ER is going to snoop the logs for this transaction
• Servers in the grid on which users are authorized to run grid commands are marked with an asterisk (*)
• When you add a server to the grid, any commands that were previously run through the grid have a status of PENDING for that server
• Example: cdr list grid grid1
• Options include:
    --source=<source_node>
    --summary
    --verbose
    --nacks
    --acks
    --pending
Connection Manager and Flexible Grids
• New parameters
  • TYPE REPLSET
    • Indicates that this is an ER / Grid agent
  • NODES name=instname+instname+instname
    • A named list of the Grid / ER instances participating in this named list
    • There can be more than one list of node names
• New option for SLA definition
  • policy=[LATENCY | FAILURE]
    • The SLA will select the server with the lowest replication latency, the fewest replication failures, or both if the "+" keyword is used
    • Must be enabled before use: cdr define qod -start
• New parameter setting
  • FOC DISABLED
    • With a Grid, there is no list of "promotable primary" nodes to fail over to
• Example configuration:
    NAME doe_test_1
    TYPE REPLSET
    NODES list_1=g_pan1+g_pan2
    NODES list_2=g_pan3+g_pan4
    # Failover Configuration
    FOC DISABLED
• Example (ERKEY shadow columns):
    CREATE TABLE customer (id INT) WITH ERKEY;
    ALTER TABLE customer ADD ERKEY;
• ifxclone utility
  • Clones an instance from a single command
  • Starts the backup and restore processes simultaneously (SMX transfer)
  • No need to read or write data to disk or tape
  • Creates a standalone server, an ER node, or a remote standalone secondary (RSS) server
  • If creating a new ER node, ER registration is cloned as well
  • No Sync/Check is necessary
• Example:
    ifxclone -T -S machine2 -I 111.222.333.555 -P 456 -t machine1 -i 111.222.333.444 -p 123
Easily Convert Cluster Servers to ER Nodes
• RSS → ER
  • Use the rss2er() stored procedure, located in the syscdr database
  • Converts the RSS secondary server into an ER server
  • The secondary will inherit the replication rules that the primary had
  • Does not require a 'cdr check' or 'cdr sync'
• Basic Steps
  1. Execute 'cdr start sec2er'
  2. Restrict applications to only one of the nodes
  3. Migrate the server on which the apps are not running
  4. Move the apps to the migrated server
  5. Use ifxclone to switch back to RSS/HDR
[Animated diagram sequence: an HDR primary in New Orleans (Building B, Blade Server B) with a local HDR/RSS secondary and shared-disk mirrors, plus a remote node in Denver (Blade Server D). When disaster strikes New Orleans, replication stops, local nodes fail over, the Denver node is promoted, connectivity resumes, and clients continue with no application changes. New capacity can be added online, reducing hardware costs.]
• You are currently using HDR and are uncomfortable with losing both primary and HDR secondary instances.
  If the primary fails, it is possible to convert the existing HDR secondary into the primary instance. If it appears that the original primary is going to be down for an extended period of time, the RSS instance can be converted into an HDR secondary instance.
• You want to provide copies of the instance in remote locations, but testing shows that the ping rate is around 333 ms. You realize that this will cause problems on the primary if HDR is used.
  Since RSS uses the SMX protocol (full duplex) and does not require checkpoints to be processed in SYNC mode, it should not have a significant impact on the performance of the primary instance.
• You want to provide copies of the database in remote locations, but know there is high latency between the sites.
  RSS uses a fully duplexed communication protocol. This allows RSS to be used in places where network communication is slow or not always reliable.
• You are currently using HDR for high availability but would like to have an additional backup of your system in the event of a disaster in which both primary and secondary servers are lost.
  Using HDR to provide high availability is a proven choice. Additional disaster availability is provided by using RSS to replicate to a secure 'bunker'.
Decision flow:
• Do you need to protect yourself from node failure? Yes → use SDS
• Do you need to protect yourself from site failure? Yes → use HDR
• Do you need multilevel site failure protection? Yes → use RSS
• Do you need geographically dispersed processing? Yes → use ER
• No to all → END
Comparing client redirection mechanisms (none of which provide automatic failover by itself):
• Redirection with DBPATH: the client is redirected when it next tries to connect with a specified database; clients do not need to be restarted; individual clients are redirected.
• Changing the connectivity information (sqlhosts): the client is redirected after the administrator changes the connectivity information, when the client next tries to establish a connection with a database server; all clients that use a given database server are redirected.
• Changing the INFORMIXSERVER environment variable: the client is redirected when it restarts and reads the new value; clients must be restarted; individual clients are redirected.
• Connection Manager: the client is redirected when the configured service level agreement is attained; clients do not need to be restarted; individual clients are redirected.
Making the Connection Manager Redundant
• Clients requesting catalog connectivity get a response from a virtualized Connection Manager
• If the first CM agent doesn't respond, an attempt will be made to the next in the group definition
[Diagram: clients in concord, walnut_creek, and pleasant_hill asking "Which [catalog|oltp|test] instance?" of redundant Connection Managers fronting the IDS instances]
• Primary Server
  • ProxyTh
    • A pool of threads performing the low-level portion of a redirected write using optimistic concurrency
  • ProxyDispatch
    • Manages the ProxyTh threads and passes redirected operations to the ProxyTh threads
• Secondary Servers
  • ProxySync
    • Receives status messages from the ProxyTh threads on the primary and communicates those to the sqlexec threads running on the secondary
• Optimistic Concurrency:
  • Introduced in IDS 11
  • An update technique used by many web applications to allow disconnected transactions
  • Briefly: an update is allowed to proceed if the row about to be updated has not been updated by some other activity
  • Relies on either a comparison of the "before" row image or some form of row versioning
• Supports encryption
  • Automatically activated
  • Requires no configuration other than encryption
• HDR currently transfers the index pages to the secondary when creating the
index
• Requirement: HDR Secondary instance must be available
• Causes index usage on the primary to be delayed
[Diagram: a global topology of HDR, RSS, and SDS clusters on blade servers in SFO, DEN, ATL, BOM, PNQ, CCU, LHR, and MUC, each with shared disk and optional mirrors, linked by HDR and RSS traffic]
• ER can be used to replicate complete or partial (schema-based) cluster data
• Note: currently, the repl names are numeric and don't include the table name. That will change in a future release.
Using the oncmsm agent
• With the H/A cluster, SLAs were defined at the instance level. For example:
    SLA oltp=primary
    SLA report=rss_1+rss_2+rss_3
    SLA accounting=(SDS+HDR)
    SLA catalog=rss_4+rss_5+rss_6
    SLA test=RS
• Syntax:
    execute procedure ifx_grid_connect('gridname', 'tagname', er_enabled)
• tagname: a character string to identify Grid operations
  • Enables
    • tracing of execution
    • reapplying a failed operation
• er_enabled: a numeric value identifying whether master replicates should be created, enabling replication of DML across nodes
  • Values
    • 0: off (default)
    • 1: on
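For example, a schema change could be propagated to every server in the grid with a sketch like this (the grid, tag, table, and column names are all hypothetical):

```sql
-- Connect to the grid; er_enabled = 1 asks for master replicates so
-- that the DML which follows is also replicated by ER.
EXECUTE PROCEDURE ifx_grid_connect('grid1', 'add_loyalty_col', 1);

-- This statement is propagated to all servers in the grid.
ALTER TABLE customer ADD (loyalty_pts INT);

-- Leave grid context; later statements affect only the local server.
EXECUTE PROCEDURE ifx_grid_disconnect();
```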
• By default, the results of transactions run in the context of the grid are not replicated by ER
• Replication can now be enabled within a transaction that is run in the context of the grid
• Some situations require both propagation of a transaction to the servers in the grid and replication of the results of the transaction
• To enable replication within a transaction:
  1. Connect to the grid with the ifx_grid_connect() procedure
  2. Create a procedure that performs the following tasks:
     • Defines a data variable for the ER state information
     • Runs the ifx_get_erstate() function and saves its result in the data variable
     • Enables replication by running the ifx_set_erstate() procedure with argument 1
     • Runs the statements that need to be replicated
     • Resets the replication state to the previous value by running the ifx_set_erstate() procedure with the name of the data variable
  3. Disconnect from the grid with the ifx_grid_disconnect() procedure
  4. Run the newly defined procedure by using the ifx_grid_procedure() procedure
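The four steps above might be sketched in SPL as follows (the grid, tag, procedure, and table names are all hypothetical, and the exact ifx_grid_procedure() argument list is hedged per the description above):

```sql
-- Step 1: connect to the grid so the CREATE PROCEDURE is propagated.
EXECUTE PROCEDURE ifx_grid_connect('grid1', 'erstate_demo');

-- Step 2: a procedure that enables ER replication around its DML.
CREATE PROCEDURE bump_balance()
    DEFINE curstate INTEGER;
    -- save the current ER state
    EXECUTE FUNCTION ifx_get_erstate() INTO curstate;
    -- enable replication for the statements that follow
    EXECUTE PROCEDURE ifx_set_erstate(1);
    UPDATE account SET balance = balance + 10 WHERE id = 1;
    -- restore the previous replication state
    EXECUTE PROCEDURE ifx_set_erstate(curstate);
END PROCEDURE;

-- Step 3: disconnect from the grid.
EXECUTE PROCEDURE ifx_grid_disconnect();

-- Step 4: run the procedure on the grid servers.
EXECUTE PROCEDURE ifx_grid_procedure('grid1', 'bump_balance()');
```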
11.70.xC2 functionality changes