Oracle Real Application Clusters New Features
Oracle 9i RAC
OPS (Oracle Parallel Server) was renamed to RAC
CFS (Cluster File System) was supported
Oracle 10g RAC
ocrcheck introduced
ocrdump introduced
CLUVFY introduced
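As an illustration, CLUVFY (the Cluster Verification Utility) checks the cluster configuration at various stages. A sketch of typical invocations, with node names as placeholders; these require the Oracle software to be staged or installed:

```shell
# Pre-installation checks for Clusterware on the listed nodes
cluvfy stage -pre crsinst -n node1,node2 -verbose

# Post-installation checks after the Clusterware stack is up
cluvfy stage -post crsinst -n node1,node2
```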
3. Oracle RAC load balancing advisor - The RAC load balancing advisor was introduced in 10g R2. In 11g, the load balancing advisor is available only to clients that use .NET, ODBC, or the Oracle Call Interface (OCI).
4. ADDM for RAC - Oracle has incorporated RAC into the Automatic Database Diagnostic Monitor, for cross-node advisories. The addmrpt.sql script reports on a single instance only, not on all instances in the RAC; this is known as instance ADDM. Using the new DBMS_ADDM package, however, we can generate a report covering all instances of the RAC; this is known as database ADDM.
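A database-wide ADDM run can be sketched with DBMS_ADDM as below; the task name and the snapshot IDs 100 and 110 are placeholders, to be replaced with real AWR snapshot IDs from DBA_HIST_SNAPSHOT:

```sql
-- Sketch: database (all-instance) ADDM analysis over an AWR snapshot range
VARIABLE tname VARCHAR2(60)
BEGIN
  :tname := 'rac_database_addm';                -- placeholder task name
  DBMS_ADDM.ANALYZE_DB(:tname, 100, 110);       -- database ADDM, all instances
END;
/
-- Display the resulting report
SET LONG 1000000
SELECT DBMS_ADDM.GET_REPORT(:tname) FROM dual;
```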
5. Optimized RAC cache fusion protocols - moves on from the general cache fusion
protocols in 10g to deal with specific scenarios where the protocols could be further
optimized.
6. Oracle 11g RAC Grid provisioning - The Oracle grid control provisioning pack allows us to
"blow-out" a RAC node without the time-consuming install, using a pre-installed
"footprint".
3. Single Client Access Name (SCAN) - eliminates the need to change the TNS entry when nodes are added to or removed from the cluster. RAC instances register with the SCAN listeners as remote listeners. The SCAN is a fully qualified name. Oracle recommends assigning three addresses to the SCAN, which creates three SCAN listeners.
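A client TNS entry pointing at the SCAN might look like the sketch below; the SCAN hostname and service name are placeholders. Because the entry names only the SCAN, it needs no change when nodes join or leave the cluster:

```
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl.example.com)
    )
  )
```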
4. AWR is consolidated for the database.
5. 11g Release 2 Real Application Clusters (RAC) has server pooling technologies, so it is easier to provision and manage database grids. This update is geared toward dynamically adjusting servers as corporations manage the ebb and flow between data requirements for data warehousing and applications.
8. GPnP (Grid Plug and Play) profile.
9. Oracle RAC One Node is a new option that makes it easier to consolidate databases that aren't mission critical but need redundancy.
13. Oracle Restart - a feature of Oracle Grid Infrastructure's High Availability Services (HAS) that manages associated listeners, ASM instances, and Oracle database instances.
14. Oracle Omotion - Oracle 11g Release 2 RAC introduces a new feature called Oracle Omotion, an online migration utility. The Omotion utility relocates an instance from one node to another whenever an instance failure happens.
15. The Omotion utility uses the Database Area Network (DAN) to move Oracle instances. Database Area Network (DAN) technology enables seamless database relocation without losing transactions.
17. Grid Naming Service (GNS) is a new service introduced in Oracle RAC 11g R2. With GNS, Oracle Clusterware (CRS) can manage Dynamic Host Configuration Protocol (DHCP) and DNS services for dynamic node registration and configuration.
18. Oracle Local Registry (OLR) - Oracle 11g R2 introduces the Oracle Local Registry (OLR) as a new part of Oracle Clusterware. The OLR is a node's local repository, similar to the OCR (but local), and is managed by OHASD. It holds data for the local node only and is not shared among the other nodes.
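The OLR can be inspected with the same tools as the OCR, using the -local flag. A sketch, to be run as root on the node; the dump file name is a placeholder:

```shell
# Check integrity and location of the node-local OLR
ocrcheck -local

# Dump the OLR contents to a text file for inspection
ocrdump -local /tmp/olr_dump.txt
```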
20. I/O fencing prevents updates by failed instances by detecting failure and preventing split brain in the cluster. When a cluster node fails, the failed node needs to be fenced off from all the shared disk devices or diskgroups. This methodology is called I/O fencing, sometimes called disk fencing or failure fencing.
21. Reboot-less node fencing (restart) - instead of fast-rebooting the node, a graceful shutdown of the stack is attempted.
22. Virtual Oracle 11g RAC cluster - Oracle 11g RAC supports virtualization.
In this method, once a potential split-brain condition is detected, Oracle Clusterware automatically picks a cluster node as a victim to reboot, to avoid data corruption. This process is called node eviction. DBAs and system administrators need to understand how this I/O fencing mechanism works and learn how to troubleshoot Clusterware problems. When they experience a cluster node reboot event, DBAs and system administrators need to be able to analyze the events and identify the root cause of the Clusterware failure.
The network heartbeat crosses the private interconnect to establish and confirm valid node membership in the cluster. The disk heartbeat runs between each cluster node and the voting disk on the shared storage. Each heartbeat has its own maximum timeout value in seconds, called the CSS misscount, within which the heartbeat must complete; otherwise a node eviction will be triggered.
The CSS misscount for the network heartbeat has the following default values, depending on the version of Oracle Clusterware and the operating system:
OS         10g (R1 & R2)    11g
Linux           60           30
Unix            30           30
VMS             30           30
Windows         30           30
The CSS misscount for the disk heartbeat also varies with the version of Oracle Clusterware. For Oracle 10.2.1 and up, the default value is 200 seconds.
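On a live cluster, the effective timeout values can be queried with crsctl; a sketch, which requires the Clusterware stack to be running:

```shell
# Network heartbeat timeout (CSS misscount), in seconds
crsctl get css misscount

# Disk heartbeat timeout, in seconds
crsctl get css disktimeout
```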
When CRS reboots a node, the syslog typically records messages such as the following:

Jul 23 11:15:23 racdb7 logger: Oracle clsomon failed with fatal status 12.
Jul 23 11:15:23 racdb7 logger: Oracle CSSD failure 134.
Jul 23 11:15:23 racdb7 logger: Oracle CRS failure. Rebooting for cluster integrity.
Three Clusterware processes - OCSSD, OPROCD, and OCLSOMON - can initiate a CRS reboot when they run into certain errors:
1. OCSSD (the CSS daemon) monitors inter-node health, such as the interconnect and the membership of the cluster nodes. Its log file is located in $CRS_HOME/log/<host>/cssd/ocssd.log
2. OPROCD (Oracle Process Monitor Daemon), introduced in 10.2.0.4, detects hardware and driver freezes that would result in node eviction, then kills the node to prevent any I/O from accessing the shared disk. Its log file is /etc/oracle/oprocd/<hostname>.oprocd.log
3. The OCLSOMON process monitors the CSS daemon for hangs or scheduling issues. It may reboot the node if it sees a potential hang. Its log file is $CRS_HOME/log/<host>/cssd/oclsomon/oclsmon.log
One of the most important log files is the syslog file. On Linux, the syslog file is /var/log/messages.
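The first pass over the syslog can be sketched with grep; the patterns below are taken from the example log excerpt earlier, and the sample file path is a placeholder (on a real system, point grep at /var/log/messages):

```shell
# Build a small sample syslog excerpt (lines from the example above,
# plus one unrelated line)
cat > /tmp/messages.sample <<'EOF'
Jul 23 11:15:23 racdb7 logger: Oracle clsomon failed with fatal status 12.
Jul 23 11:15:23 racdb7 logger: Oracle CSSD failure 134.
Jul 23 11:15:23 racdb7 logger: Oracle CRS failure. Rebooting for cluster integrity.
Jul 23 11:16:02 racdb7 sshd[1201]: session opened
EOF

# Scan for the known CRS reboot messages; prints the three matching lines
grep -E "clsomon failed|CSSD failure|Rebooting for cluster integrity" /tmp/messages.sample
```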
The CRS reboot troubleshooting procedure starts with reviewing the various log files to identify which of the three processes above contributed to the node reboot, and then isolates the root cause within that process. Figure 6, a troubleshooting tree or diagram, illustrates the CRS reboot troubleshooting flowchart.