Oracle RAC Interview Q&A
There is not much difference between 10g and 11gR1 RAC.
But there is a significant difference in 11gR2.
Prior to 11gR2 (that is, in 10g and 11gR1 RAC), the following were managed by Oracle CRS:
o Databases
o Instances
o Applications
o Node Monitoring
o Event Services
o High Availability
From 11gR2 onwards, it is a complete HA stack that manages and provides the following resources, like other cluster software such as VCS:
o Databases
o Instances
o Applications
o Cluster Management
o Node Management
o Event Services
o High Availability
o Network Management (provides DNS/GNS/mDNS services on behalf of other traditional services), plus SCAN (Single Client Access Name) and HAIP
o Storage Management (with the help of ASM and the new ACFS file system)
o Time synchronization (rather than depending on traditional NTP)
o The OS-dependent hang checker and similar components were removed; this is handled by its own additional monitor processes
6. What background processes exist in 11gR2, and what is their functionality?
Process Name
Functionality
crsd
The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster
Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates
events when the status of a resource changes.
cssd
Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the
cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware,
then CSS processes interface with your clusterware to manage node membership information. CSS has three separate
processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process
monitors the cluster and provides input/output fencing. This service formerly was provided by Oracle Process Monitor
daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting
the node.
diskmon
Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As
Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when
ocssd is started.
evmd
Event Manager (EVM): Is a background process that publishes Oracle Clusterware events
mdnsd
Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and
UNIX, and a service on Windows.
gnsd
Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process
performs name resolution within the cluster.
ons
Oracle Notification Service (ONS): Is a publish-and-subscribe service for communicating Fast Application Notification (FAN)
events
oraagent
oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout
scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent
Oracle root agent (orarootagent): Is a specialized oraagent process that helps CRSD manage resources owned by root,
such as the network, and the Grid virtual IP address
oclskd
Cluster kill daemon (oclskd): Handles instance/node evictions requests that have been escalated to CSS
gipcd
Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure
ctssd
Cluster Time Synchronization daemon (ctssd): Manages time synchronization between nodes, rather than depending on NTP
Component                 Owner
ohasd                     init, root
(missing in source)       root
(missing in source)       grid owner
evmd, evmlogger           grid owner
octssd                    root
ons, eons                 grid owner
oraagent (Oracle Agent)   grid owner
orarootagent              root
gnsd                      root
gpnpd                     grid owner
mdnsd                     grid owner
8. What is the startup sequence in Oracle 11g RAC?
From the sequence of log messages and timestamps, we get some understanding of the startup sequence of the clusterware and the ASM instance:
1)
2) Voting disks used by CSSD are discovered by reading the headers of the disks, not through ASM.
3) Startup of the CRS service has to wait until the ASM instance is up and the disk group for the OCR and voting disks is mounted.
9. You said the voting and OCR disks reside in ASM disk groups, but as per the startup sequence OCSSD starts before ASM. How is that possible?
How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
You might wonder how CSSD, which is required to start the clustered ASM instance, can be started if voting disks are stored in ASM? This
sounds like a chicken-and-egg problem: without access to the voting disks there is no CSS, hence the node cannot join the cluster. But without
being part of the cluster, CSSD cannot start the ASM instance. To solve this problem the ASM disk headers have new metadata in 11.2: you can
use kfed to read the header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the
voting file. This does not require the ASM instance to be up. Once the voting disks are located, CSS can access them and joins the cluster.
Source: Pro Oracle Database 11g RAC on Linux, Martin Bach
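The header check described above can be sketched in shell. This is a minimal sketch, not a definitive tool: it assumes `kfed read` prints the two fields as `kfdhdb.vfstart: <value> ; ...` and `kfdhdb.vfend: <value> ; ...`, and that a value of 0 in both means the disk holds no voting file. The function name, the sample header lines, and the device path in the comment are made up for illustration.

```shell
#!/bin/sh
# Sketch: decide from `kfed read` output (on stdin) whether an ASM disk
# header points at a voting file, i.e. kfdhdb.vfstart and kfdhdb.vfend
# are both non-zero.
# Assumption: fields are printed as "kfdhdb.vfstart:   <n> ; 0x...".
has_voting_file() {
  awk '
    $1 == "kfdhdb.vfstart:" { vfstart = $2 }
    $1 == "kfdhdb.vfend:"   { vfend = $2 }
    END {
      if (vfstart + 0 > 0 && vfend + 0 > 0) {
        printf "voting file region: vfstart=%s vfend=%s\n", vfstart, vfend
        exit 0
      }
      print "no voting file"
      exit 1
    }'
}

# Real use (hypothetical device path):
#   kfed read /dev/mapper/asmdisk1 | has_voting_file

# Demo on made-up sample header lines:
printf '%s\n' \
  'kfdhdb.vfstart:                     352 ; 0x0ec: 0x00000160' \
  'kfdhdb.vfend:                       384 ; 0x0f0: 0x00000180' |
  has_voting_file
```

The function returns a non-zero exit status when no voting file is found, so it can be used directly in an `if` while looping over candidate disks.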
10. How does SCAN work?
1. The client connects through the SCAN name of the cluster (remember that the SCAN name resolves round-robin to all three IP addresses); in this case our SCAN name is cluster01-scan.cluster01.example.com
2. The request reaches the DNS server in your corporate network, which resolves the name to one of the three SCAN IP addresses.
a. If GNS (Grid Naming Service) is configured, that is, a subdomain is configured in DNS to resolve cluster addresses, the request is handed over to GNS (gnsd).
3. In our case assume there is no GNS; the SCAN listener that receives the request passes it on to a database listener, since the SCAN listeners' endpoints are configured to the database listeners.
4. The database listener receives the request and processes it further.
5. When a node is added (listener 4), clients do not need to know about it or change anything in their TNS entries (the address of the 4th node/instance), as they only use the SCAN.
6. The same applies when a node is deleted.
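To make the last two points concrete, here is a small sketch that prints a tnsnames.ora-style entry referencing only the SCAN name. The alias ORCL, port 1521, and service name orcl.example.com are illustrative assumptions; only the SCAN name comes from the example above. Because no node address appears in the entry, adding or removing a node does not change it.

```shell
#!/bin/sh
# Sketch: print a tnsnames.ora-style entry that references only the SCAN name.
# All argument values used in the call below are illustrative.
scan_tns_entry() {
  alias_name=$1
  scan_name=$2
  port=$3
  service=$4
  cat <<EOF
$alias_name =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = $scan_name)(PORT = $port))
    (CONNECT_DATA = (SERVICE_NAME = $service))
  )
EOF
}

scan_tns_entry ORCL cluster01-scan.cluster01.example.com 1521 orcl.example.com
```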
o To add a node, simply connect the server to the cluster and allow the cluster to configure the node.
o To make this happen, Oracle uses the profile located at $GI_HOME/gpnp/profiles/peer/profile.xml, which contains the cluster resource information, for example the ASM disk locations.
o When the node is plugged into the cluster, this profile is read locally or from a remote machine, and the node is added to the cluster dynamically.
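As a rough illustration of what the profile carries, the sketch below pulls the ASM disk discovery string out of a profile file. It assumes the profile stores that value in a `DiscoveryString` attribute on an `orcl:ASM-Profile` element; the sample fragment is made up and heavily shortened, so check the attribute names against a real profile.xml before relying on this.

```shell
#!/bin/sh
# Sketch: extract the ASM disk discovery string from a GPnP profile file.
# Assumption: the value lives in <orcl:ASM-Profile ... DiscoveryString="..."/>.
asm_discovery_string() {
  sed -n 's/.*<orcl:ASM-Profile[^>]*DiscoveryString="\([^"]*\)".*/\1/p' "$1"
}

# Real use:
#   asm_discovery_string "$GI_HOME/gpnp/profiles/peer/profile.xml"

# Demo on a made-up, shortened profile fragment:
sample=$(mktemp)
cat > "$sample" <<'EOF'
<gpnp:GPnP-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/mapper/asm*" SPFile="+DATA/spfileasm.ora"/></gpnp:GPnP-Profile>
EOF
asm_discovery_string "$sample"
rm -f "$sample"
```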
13. What file types does ASM support and keep in disk groups?
Control files
Flashback logs
Data files
DB SPFILE
OCR files
Archive logs
ASM SPFILE
Process   Description
RBAL      Opens all device files as part of discovery and coordinates the rebalance activity
ARBn      One or more slave processes that perform the rebalance activity
GMON      Responsible for managing the disk-level activities such as drop or offline and advancing the ASM disk group compatibility
MARK      Marks ASM allocation units as stale when needed
Onnn      One or more ASM slave processes forming a pool of connections to the ASM instance for exchanging messages
PZ9n      One or more parallel slave processes used in fetching data on clustered ASM installation from GV$ views
The node listener is a process that helps establish network connections from ASM clients to the ASM instance.
Runs by default from the Grid $ORACLE_HOME/bin directory
Listens on port 1521 by default
Is the same as a database instance listener
Is capable of listening for all database instances on the same machine in addition to the ASM instance
Can run concurrently with separate database listeners or be replaced by a separate database listener
Is named tnslsnr on the Linux platform
cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
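The same lookup can be scripted. This is a minimal sketch that assumes the file uses the `ocrconfig_loc=` key shown above and treats a value starting with `+` as an ASM disk group; the function name and the temporary sample file are illustrative.

```shell
#!/bin/sh
# Sketch: report where the OCR lives, based on an ocr.loc-style file.
ocr_location() {
  # Take the value after "ocrconfig_loc=" (first match only).
  loc=$(sed -n 's/^ocrconfig_loc=//p' "$1" | head -n 1)
  case $loc in
    +*) echo "OCR is in ASM disk group ${loc#+}" ;;
    "") echo "no ocrconfig_loc entry found"; return 1 ;;
    *)  echo "OCR is at $loc" ;;
  esac
}

# Real use:
#   ocr_location /etc/oracle/ocr.loc

# Demo on a sample file with the same contents as shown above:
sample=$(mktemp)
printf 'ocrconfig_loc=+DATA\nlocal_only=FALSE\n' > "$sample"
ocr_location "$sample"
rm -f "$sample"
```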
Disk group type       Supported mirroring levels               Default mirroring level
External redundancy   Unprotected (None)                       Unprotected (None)
Normal redundancy     Two-way, Three-way, Unprotected (None)   Two-way
High redundancy       Three-way                                Three-way
ASM stripes files using extents with a coarse method for load balancing or a fine method to reduce latency.
26. How many ASM Diskgroups can be created under one ASM Instance?
ASM imposes the following limits:
1.
2.
3.
Sets permissions on the Oracle Inventory (central inventory) directory. Reconfigures primary and secondary group memberships for the
installation owner, if necessary, for the Oracle Inventory directory and the operating system privileges groups.
Yes. As per the documentation, if you have multiple voting disks you can add one online. If you have only one voting disk and it is lost, the cluster will be down; you need to start CRS in exclusive mode and add the voting disk using
crsctl add css votedisk <path>
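The recovery described above can be laid out as a command plan. The sketch below only prints the commands and does not run them; the ordering and the choice of `crsctl replace votedisk` for an ASM disk group are assumptions about a typical 11gR2 recovery, so check them against the documentation for your exact release before executing anything as root.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for re-creating a lost voting disk.
# The target is either an ASM disk group (+NAME) or a path outside ASM.
votedisk_recovery_plan() {
  target=$1
  echo "crsctl stop crs -f"
  echo "crsctl start crs -excl"
  case $target in
    +*) echo "crsctl replace votedisk $target" ;;   # voting files kept in ASM
    *)  echo "crsctl add css votedisk $target" ;;   # voting file on shared storage outside ASM
  esac
  echo "crsctl query css votedisk"
  echo "crsctl stop crs"
  echo "crsctl start crs"
}

votedisk_recovery_plan +DATA
```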
43. You have lost the OCR disk. What is your next step?
In 10g, the cluster stack goes down because cssd is unable to maintain integrity. From 11gR2 onwards, only the crsd stack goes down; ohasd stays up and running. You can bring the OCR back by restoring an automatic backup or importing a manual backup.
44. What happens when ocssd fails? What is node eviction, and how does node eviction happen? (The answer is the same for all three.)
45. What is a virtual IP and how does it work?
46. Describe some RAC wait events you have experienced.
47. Can you modify VIP address after your cluster installation?
a. OCRDUMP
(or)
b. crs_stat -p
(or)
c. By using strings.
Voting disk contents are not persistent, and there is normally no need to view them, because the voting disk contents will be overwritten. If you still need to view them, use strings.
oifcfg getif
ii.
iii.
iv.
The SCAN IP can be disabled if it is not required. However, the SCAN IP is mandatory during RAC installation.
Enabling/disabling the SCAN IP is mostly used in Oracle Apps environments by the concurrent manager (a kind of job scheduler in Oracle Apps).
To disable the SCAN IP:
i.
ii.
iii. Stop SCAN:
srvctl stop scan (this stops the SCAN VIPs)
iv.
a. Case 1: Migrating a disk group from one storage to another with the same name
1. Consider the disk group DATA.
2. Create new disks for DATA on the new storage (EMC).
a) Partitioning and provisioning are done by the storage team, who give you the device name or mapper path, such as
/dev/mapper/asakljdlas
3. Add the new disk to disk group DATA:
a) alter diskgroup data add disk '/dev/mapper/asakljdlas';
4. Drop the old disks from DATA; rebalancing is then done automatically.
If you want, you can speed up the rebalance with alter system set asm_power_limit = 11 (the maximum before 11.2.0.2; later releases accept higher values).
alter diskgroup data drop disk <ASM name of the disk on Hitachi storage>;
Note: DROP DISK takes the ASM disk name, not the device path; v$asm_disk shows both (NAME and PATH columns).
5. Request the SAN team to detach the old storage (HITACHI).
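The add and drop steps above can also be combined into a single statement so that ASM rebalances only once. The sketch below only generates the SQL text; the ASM disk name DATA_0000 and the power value 8 are hypothetical, and the statement should be reviewed before it is run in the ASM instance.

```shell
#!/bin/sh
# Sketch: emit one ALTER DISKGROUP statement that adds the new disk and
# drops the old one together, followed by a query to watch the rebalance.
migration_sql() {
  dg=$1
  new_path=$2
  old_disk_name=$3
  power=$4
  cat <<EOF
ALTER DISKGROUP $dg
  ADD DISK '$new_path'
  DROP DISK $old_disk_name
  REBALANCE POWER $power;
SELECT operation, state, est_minutes FROM v\$asm_operation;
EOF
}

# DATA_0000 is a hypothetical ASM disk name (see v$asm_disk.name).
migration_sql DATA /dev/mapper/asakljdlas DATA_0000 8
```

Wait until v$asm_operation returns no rows for the disk group before asking the SAN team to detach the old storage.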
b. Case 2: Migrating a disk group from one storage to another with a different disk group name
1) Create the disk group with the new name on the new storage.
2) Create the spfile in the new disk group and change the parameters for control files etc. with scope=spfile.
3) Take a control file backup in format +newdiskgroup
4) Shut down the database, then start it up in nomount.
5) Restore the control file from the backup (the control file is now restored to the new disk group).
6) Take an RMAN backup as copy of the whole database with the new format.
RMAN> backup as copy database format '+newdiskgroup name';
7) RMAN> switch database to copy;
8) Verify in dba_data_files, dba_temp_files, and v$log that all files are pointing to the new disk group name.
c. Case 3: Migrating a disk group to new storage when no additional disk group is given
1) Take an RMAN backup as copy of the database with the new format and place it on disk.
2) Prepare rename commands from v$log, v$datafile, etc. (dynamic queries).
3) Take a backup of the pfile and modify the following parameters to refer to the new disk group name:
*.control_files
*.db_create_file_dest
*.db_create_online_log_dest_1
*.db_create_online_log_dest_2
*.db_recovery_file_dest
4) Stop the database.
5) Dismount the disk group:
asmcmd umount ORA_DATA
6) Use the renamedg command (11gR2 only) to rename the disk group:
renamedg phase=both dgname=ORA_DATA newdgname=NEW_DATA verbose=true
7)
8) Start the database in mount mode with the new pfile backed up in step 3.
9) Run the rename file scripts generated in step 2.
10) Add the disk group to the cluster (if using RAC):
srvctl modify database -d orcl -p +NEW_FRA/orcl/spfileorcl.ora
srvctl modify database -d orcl -a "NEW_DATA"
srvctl config database -d orcl
srvctl start database -d orcl
11) Delete the old disk group from the cluster:
crsctl delete resource ora.ORA_DATA.dg
12) Open the database.
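The rename commands from step 2 can be generated from a plain list of file names. This is a minimal sketch: it assumes the list (for example spooled from v$datafile and v$logfile) arrives one name per line on standard input and that files keep the same relative path under the renamed disk group. The file names in the demo are made up.

```shell
#!/bin/sh
# Sketch: turn a list of ASM file names on stdin into RENAME FILE statements
# that point at the renamed disk group. Names outside the old group are skipped.
rename_statements() {
  old_dg=$1
  new_dg=$2
  while IFS= read -r f; do
    case $f in
      "+$old_dg/"*)
        rest=${f#"+$old_dg/"}   # path below the disk group name
        printf "ALTER DATABASE RENAME FILE '%s' TO '+%s/%s';\n" "$f" "$new_dg" "$rest"
        ;;
    esac
  done
}

# Demo with made-up file names:
printf '%s\n' \
  '+ORA_DATA/orcl/datafile/system.256.1' \
  '+ORA_DATA/orcl/onlinelog/group_1.261.1' |
  rename_statements ORA_DATA NEW_DATA
```

Spool the output to a script and run it while the database is mounted with the new pfile.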
a. Take the outputs of all the services that are running on the databases.
b. set cluster_database=FALSE
c.
d.
e. Startup mount
f.
Generic question: if using ASM, the usual location for a datafile would be
'+DATA/datafile/OLDDBNAME/system01.dbf'.
Does NID change this path too, to reflect the new DB name?
Yes, it will; by using the proper directory structure it creates links to the original directory
structure: '+DATA/datafile/NEWDBNAME/system01.dbf'.
This has to be tested.
We don't have a test bed, but thanks to Anji, who confirmed that it will.
8. How do you find the database to which a particular service is attached when a large number of databases run on the server and you
cannot check them one by one manually?
Write a shell script that reads the database names from oratab and loops over them, passing each DB name to srvctl to get the result.
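#!/bin/ksh
# Usage: <script> <service_name>
# ORACLE_HOME is left blank as in the original; set it for your environment.
ORACLE_HOME=
PATH=$ORACLE_HOME/bin:$PATH
LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${ORACLE_HOME}/lib
export TNS_ADMIN ORACLE_HOME PATH LD_LIBRARY_PATH
SERVICE=${1:?usage: $0 service_name}
# Take the first field of every non-comment oratab line as the database name
for DB in $(grep -v "^#" /etc/oratab | cut -f1 -d: -s)
do
  export ORACLE_SID=$DB
  # Only the database that owns the service prints a line containing "is running"
  srvctl status service -d "$DB" -s "$SERVICE" 2>/dev/null | grep -i "is running"
done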
9. What is the difference between OHAS and CRS?
OHAS is the complete cluster stack, which includes some kernel-level tasks such as managing the network, time synchronization, disks, etc., whereas CRS
manages resources such as databases, listeners, applications, etc. With both of these, Oracle provides high-availability
clustering services rather than being tied only to databases.