Monday, August 25, 2008

Oracle RAC installation on Solaris SPARC 64 bit

Keywords:

Oracle RAC Installation, Installing Oracle RAC, Oracle RAC on Solaris



A few weeks back I did a two-node Oracle RAC installation.
The machines were Solaris 10 SPARC 64-bit (Sun-Fire-T2000).
The shared storage was NAS.
Even though Solaris 10 uses resource controls, the kernel parameters were added in /etc/system (Metalink Note 367442.1).
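For illustration, the /etc/system entries are typically of the kind shown below; the exact names and values come from the Metalink note and the available RAM, so treat these as placeholder assumptions rather than the values actually used:

set noexec_user_stack=1
set semsys:seminfo_semmni=100
set semsys:seminfo_semmsl=256
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmni=100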

The group OINSTALL and the user ORACLE were created on both nodes.

A few network parameters were tuned in /etc/rc2.d/S99nettune:

bash-3.00# more /etc/rc2.d/S99nettune
#!/bin/sh
ndd -set /dev/ip ip_forward_src_routed 0
ndd -set /dev/ip ip_forwarding 0
ndd -set /dev/tcp tcp_conn_req_max_q 16384
ndd -set /dev/tcp tcp_conn_req_max_q0 16384
ndd -set /dev/tcp tcp_xmit_hiwat 400000
ndd -set /dev/tcp tcp_recv_hiwat 400000
ndd -set /dev/tcp tcp_cwnd_max 2097152
ndd -set /dev/tcp tcp_ip_abort_interval 60000
ndd -set /dev/tcp tcp_rexmit_interval_initial 4000
ndd -set /dev/tcp tcp_rexmit_interval_max 10000
ndd -set /dev/tcp tcp_rexmit_interval_min 3000
ndd -set /dev/tcp tcp_max_buf 4194304
ndd -set /dev/tcp tcp_maxpsz_multiplier 10
#Oracle Required
ndd -set /dev/udp udp_recv_hiwat 65535
ndd -set /dev/udp udp_xmit_hiwat 65535

Check that /etc/system is readable by the ORACLE user (otherwise the RDBMS installation will fail):

-rw-r--r-- 1 root root 2561 Apr 17 16:03 /etc/system
Checked the system config on both nodes
For RAM
/usr/sbin/prtconf | grep "Memory size"
Memory size: 8064 Megabytes

For SWAP

/usr/sbin/swap -s
total: 4875568k bytes allocated + 135976k reserved = 5011544k used, 9800072k available
For /tmp
df -h /tmp

Filesystem size used avail capacity Mounted on

swap 9.4G 31M 9.4G 1% /tmp

For OS

/bin/isainfo -kv
64-bit sparcv9 kernel modules

For user

id -a   # both the UID and GID of user ORACLE should be the same on both nodes
uid=300(ORACLE) gid=300(oinstall) groups=300(oinstall),301(dba),503(tms),504(mscat),102(dwh)

User nobody should exist

id -a nobody

uid=60001(nobody) gid=60001(nobody) groups=60001(nobody)
I had the below entries in /etc/hosts on both nodes:
cat /etc/hosts

#Public:
3.208.169.203 ownserver01 ownserver01ipmp0 loghost
3.208.169.207 ownserver02 ownserver02ipmp0 loghost
#Private:
10.47.2.82 ownserver01ipmp1 # e1000g1 -Used this while installing cluster
10.47.2.85 ownserver02ipmp1 # e1000g1 -Used this while installing cluster
10.47.2.76 ownserver01ipmp2 # e1000g0
10.47.2.79 ownserver02ipmp2 # e1000g0
#Vip:
3.208.169.202 ownserverv01
3.208.169.206 ownserverv02

All the interfaces had their IPMP groups.

Confirmed that the interface names of both the Private and Public interfaces are the same across the nodes.
e1000g3 was the Public interface on both nodes.
e1000g0 and e1000g1 were the Private interfaces on both nodes.
- I had two interfaces for the Private Interconnect, of which I used e1000g1 during the cluster installation.
- The hostnames for e1000g1 on the two nodes were ownserver01ipmp1 and ownserver02ipmp1.

Below is the 'ifconfig -a' output from Ownserver01:

ownserver01 [SHCL1DR1]$ ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843 mtu 1500 index 2
inet 10.47.2.76 netmask ffffffe0 broadcast 10.47.2.95
groupname ipmp2
e1000g0:1: flags=9040843 mtu 1500 index 2
inet 10.47.2.77 netmask ffffffe0 broadcast 10.47.2.95
e1000g0:2: flags=9040842 mtu 1500 index 2
inet 10.47.2.77 netmask ff000000 broadcast 10.255.255.255
e1000g1: flags=1000843 mtu 1500 index 3
inet 10.47.2.82 netmask ffffffe0 broadcast 10.47.2.95
groupname ipmp1
e1000g1:1: flags=9040843 mtu 1500 index 3
inet 10.47.2.83 netmask ffffffe0 broadcast 10.47.2.95
e1000g1:2: flags=9040842 mtu 1500 index 3
inet 10.47.2.83 netmask ff000000 broadcast 10.255.255.255
e1000g2: flags=1000843 mtu 1500 index 4
inet 10.47.2.11 netmask ffffffc0 broadcast 10.47.2.63
groupname ipmp3
e1000g2:1: flags=9040843 mtu 1500 index 4
inet 10.47.2.12 netmask ffffffc0 broadcast 10.47.2.63
e1000g2:2: flags=9040842 mtu 1500 index 4
inet 10.47.2.12 netmask ff000000 broadcast 10.255.255.255
e1000g3: flags=1000843 mtu 1500 index 5
inet 3.208.169.203 netmask ffffffc0 broadcast 3.208.169.255
groupname ipmp0
e1000g3:1: flags=9040843 mtu 1500 index 5
inet 3.208.169.204 netmask ffffffc0 broadcast 3.208.169.255
e1000g3:2: flags=9040842 mtu 1500 index 5
inet 3.208.169.204 netmask ff000000 broadcast 3.255.255.255
nxge0: flags=69040843 mtu 1500 index 6
inet 10.47.2.13 netmask ffffffc0 broadcast 10.47.2.63
groupname ipmp3
nxge1: flags=69040843 mtu 1500 index 7
inet 3.208.169.205 netmask ffffffc0 broadcast 3.208.169.255
groupname ipmp0
nxge2: flags=69040843 mtu 1500 index 8
inet 10.47.2.78 netmask ffffffe0 broadcast 10.47.2.95
groupname ipmp2
nxge3: flags=69040843 mtu 1500 index 9
inet 10.47.2.84 netmask ffffffe0 broadcast 10.47.2.95
groupname ipmp1



From the above output it is clear that on Ownserver01:

e1000g0 is 10.47.2.76 and e1000g1 is 10.47.2.82 - For Private Interconnect
e1000g2 is 10.47.2.11 - For Shared Storage (NAS)
e1000g3 is 3.208.169.203 - For Public

On Ownserver02

e1000g0 is 10.47.2.79 and e1000g1 is 10.47.2.85 - For Private Interconnect
e1000g2 is 10.47.2.14 - For Shared Storage (NAS)
e1000g3 is 3.208.169.207 - For Public

Checked for SSH and SCP in /usr/local/bin/

The cluster verification utility (runcluvfy.sh) checks for scp and ssh in /usr/local/bin/.
Create soft links to ssh and scp in /usr/local/bin/ if they are not there, as shown after the listing below.
cd /usr/local/bin/
ls -l
lrwxrwxrwx 1 root root 12 Apr 25 16:57 /usr/local/bin/scp -> /usr/bin/scp
lrwxrwxrwx 1 root root 12 Apr 25 16:57 /usr/local/bin/ssh -> /usr/bin/ssh
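If the links are missing, they can be created as root; a minimal sketch using the paths from the listing above:

cd /usr/local/bin
ln -s /usr/bin/ssh ssh
ln -s /usr/bin/scp scp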

Checked SSH equivalency between the nodes

ownserver01 [SHCL1DR1]$ ssh ownserver01 date
ssh_exchange_identification: Connection closed by remote host
ownserver01 [SHCL1DR1]$
mkdir ~/.ssh
chmod 700 ~/.ssh
/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa
cd ~/.ssh
touch ~/.ssh/authorized_keys
ssh ownserver01 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
Password: *****
ssh ownserver01 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
Password: *****
ssh ownserver02 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
Password: *****
ssh ownserver02 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
Password: *****
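The combined authorized_keys file then has to exist on both nodes. One way to push it to the second node (the oracle home directory path is assumed to be the same on both nodes):

scp ~/.ssh/authorized_keys ownserver02:/home/oracle/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys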
From Ownserver01
ownserver01 [SHCL1DR1]$ ssh OWNSERVER01 date
ownserver01 [SHCL1DR1]$ ssh OWNSERVER02 date
From Ownserver02
ownserver02 [SHCL1DR2]$ ssh OWNSERVER01 date
ownserver02 [SHCL1DR2]$ ssh OWNSERVER02 date
The time on both nodes was the same at any given moment.
I made sure that from Ownserver01 I could SSH to ownserver02 and also to ownserver01 itself, and the same from Ownserver02 too.

Note: also mv /etc/issue to /etc/issue.bak or user equivalence will fail - this clears the login banner (this may apply to Solaris only).

Checked for the file /usr/lib/libdce.so [Metalink Note 333348.1]

The 10gR2 installer on Solaris 64-bit fails if the file /usr/lib/libdce.so is present.
Check Metalink Note 333348.1 for the workaround.

Configure the .profile of user ORACLE

stty cs8 -istrip -parenb
PATH=/usr/bin:/usr/local/bin
EDITOR=/usr/bin/vi
#umask 077
umask 022
ulimit -c 0
export PATH EDITOR
set -o vi
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export OH=$ORACLE_HOME
export ORA_CRS_HOME=$ORACLE_BASE/product/crs
export CH=$ORA_CRS_HOME
export ORACLE_SID=SHCL1DR1
#export NLS_LANG=Japanese_Japan.UTF8
export NLS_LANG=AMERICAN_AMERICA.UTF8
export PATH=$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:/sbin:/usr/bin:/usr/ccs/bin:/usr/ucb:/etc:/usr/X/bin:/usr/openwin/bin:/usr/local/bin:/usr/sbin
export PS1=`hostname`" [$ORACLE_SID]\$ "
####
alias bdump='cd /u01/app/oracle/admin/SHCL1/bdump/'
alias talert='tail -f $ORACLE_BASE/admin/SHCL1/bdump/alert_$ORACLE_SID.log'
alias tns='cd $ORACLE_HOME/network/admin'
alias udump='cd /u01/app/oracle/admin/SHCL1/udump/'
alias valert='view $ORACLE_BASE/admin/SHCL1/bdump/alert_$ORACLE_SID.log'
alias home='cd $ORACLE_HOME'

Created the directories on both nodes

For ORACLE_BASE

mkdir -p /u01/app/oracle
chown -R ORACLE:oinstall /u01/app/oracle
chmod -R 770 /u01/app/oracle

For ORA_CRS_HOME

mkdir -p /u01/app/oracle/product/crs
chown -R root:oinstall /u01/app/oracle/product/crs

For ORACLE_HOME [RDBMS]

mkdir -p /u01/app/oracle/product/10.2.0/db_1
chown -R ORACLE:oinstall /u01/app/oracle/product/10.2.0/db_1

For OCR and Voting disks

mkdir -p /u02/oracle/crs/
mkdir -p /u03/oracle/crs/
mkdir -p /u04/oracle/crs/
Check the ownership of the directories [should be ORACLE:oinstall].

Created OCR and Voting Disk files

On Linux, the Oracle-provided cluster file system OCFS2 is typically used on the shared disk, so simply 'touch'-ing ocr_disk1 and vote_disk1 would do.
But since I used NAS as the shared storage (which is mounted on each node), I had to pre-create the files for the OCR and voting disks, as sketched below.
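A minimal sketch of how such files can be pre-created with dd before setting their ownership and permissions; the sizes shown are illustrative assumptions, not the exact values used:

dd if=/dev/zero of=/u02/oracle/crs/ocr_disk1 bs=1024k count=280
dd if=/dev/zero of=/u03/oracle/crs/ocr_disk2 bs=1024k count=280
dd if=/dev/zero of=/u02/oracle/crs/vote_disk1 bs=1024k count=20
dd if=/dev/zero of=/u03/oracle/crs/vote_disk2 bs=1024k count=20
dd if=/dev/zero of=/u04/oracle/crs/vote_disk3 bs=1024k count=20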
OCR

chown root:oinstall /u02/oracle/crs/ocr_disk1
chown root:oinstall /u03/oracle/crs/ocr_disk2
chmod 660 /u02/oracle/crs/ocr_disk1
chmod 660 /u03/oracle/crs/ocr_disk2

VOTING DISK

chown ORACLE:oinstall /u02/oracle/crs/vote_disk1
chown ORACLE:oinstall /u03/oracle/crs/vote_disk2
chown ORACLE:oinstall /u04/oracle/crs/vote_disk3
chmod 660 /u02/oracle/crs/vote_disk1
chmod 660 /u03/oracle/crs/vote_disk2
chmod 660 /u04/oracle/crs/vote_disk3

Downloaded and extracted the Oracle 10.2.0.1 installation files

10gr2_cluster_sol.cpio
10gr2_companion_sol.cpio
10gr2_db_sol.cpio

Run the Cluster Verification Utility available in

10gr2_cluster_sol.cpio
ownserver01 [SHCL1DR1]$ ./runcluvfy.sh stage -pre crsinst -n OWNSERVER01,OWNSERVER02 -verbose
Performing pre-checks for cluster services setup
Checking node reachability…
Check: Node reachability from node “ownserver01″
Destination Node Reachable?
———————————— ————————
OWNSERVER01 yes
OWNSERVER02 yes
Result: Node reachability check passed from node “ownserver01″.
Checking user equivalence…
Check: User equivalence for user “ORACLE”
Node Name Comment
———————————— ————————
OWNSERVER02 passed
OWNSERVER01 passed
Result: User equivalence check passed for user “ORACLE”.
Checking administrative privileges…
Check: Existence of user “ORACLE”
Node Name User Exists Comment
———— ———————— ————————
OWNSERVER02 yes passed
OWNSERVER01 yes passed
Result: User existence check passed for “ORACLE”.
Check: Existence of group “oinstall”
Node Name Status Group ID
———— ———————— ————————
OWNSERVER02 exists 300
OWNSERVER01 exists 300
Result: Group existence check passed for “oinstall”.
Check: Membership of user “ORACLE” in group “oinstall” [as Primary]
Node Name User Exists Group Exists User in Group Primary Comment
—————- ———— ———— ———— ———— ————
OWNSERVER02 yes yes yes yes passed
OWNSERVER01 yes yes yes yes passed
Result: Membership check for user “ORACLE” in group “oinstall” [as Primary] passed.
Administrative privileges check passed.
Checking node connectivity…
Interface information for node “OWNSERVER02″
Interface Name IP Address Subnet
—————————— —————————— —————-
e1000g0 10.47.2.79 10.47.2.64
e1000g0 10.47.2.80 10.47.2.64
e1000g0 10.47.2.80 10.0.0.0
e1000g1 10.47.2.85 10.47.2.64
e1000g1 10.47.2.86 10.47.2.64
e1000g1 10.47.2.86 10.0.0.0
e1000g2 10.47.2.14 10.47.2.0
e1000g2 10.47.2.15 10.47.2.0
e1000g2 10.47.2.15 10.0.0.0
e1000g3 3.208.169.207 3.208.169.192
e1000g3 3.208.169.208 3.208.169.192
e1000g3 3.208.169.208 3.0.0.0
nxge0 10.47.2.16 10.47.2.0
nxge1 3.208.169.209 3.208.169.192
nxge2 10.47.2.81 10.47.2.64
nxge3 10.47.2.87 10.47.2.64
Interface information for node “OWNSERVER01″
Interface Name IP Address Subnet
—————————— —————————— —————-
e1000g0 10.47.2.76 10.47.2.64
e1000g0 10.47.2.77 10.47.2.64
e1000g0 10.47.2.77 10.0.0.0
e1000g1 10.47.2.82 10.47.2.64
e1000g1 10.47.2.83 10.47.2.64
e1000g1 10.47.2.83 10.0.0.0
e1000g2 10.47.2.11 10.47.2.0
e1000g2 10.47.2.12 10.47.2.0
e1000g2 10.47.2.12 10.0.0.0
e1000g3 3.208.169.203 3.208.169.192
e1000g3 3.208.169.204 3.208.169.192
e1000g3 3.208.169.204 3.0.0.0
nxge0 10.47.2.13 10.47.2.0
nxge1 3.208.169.205 3.208.169.192
nxge2 10.47.2.78 10.47.2.64
nxge3 10.47.2.84 10.47.2.64
Check: Node connectivity of subnet “10.47.2.64″
Source Destination Connected?
—————————— —————————— —————-
OWNSERVER02:e1000g0 OWNSERVER02:e1000g0 yes
OWNSERVER02:e1000g0 OWNSERVER02:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER02:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER02:nxge2 yes
OWNSERVER02:e1000g0 OWNSERVER02:nxge3 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER01:nxge2 yes
OWNSERVER02:e1000g0 OWNSERVER01:nxge3 yes
OWNSERVER02:e1000g0 OWNSERVER02:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER02:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER02:nxge2 yes
OWNSERVER02:e1000g0 OWNSERVER02:nxge3 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER01:nxge2 yes
OWNSERVER02:e1000g0 OWNSERVER01:nxge3 yes
OWNSERVER02:e1000g1 OWNSERVER02:e1000g1 yes
OWNSERVER02:e1000g1 OWNSERVER02:nxge2 yes
OWNSERVER02:e1000g1 OWNSERVER02:nxge3 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g1 OWNSERVER01:nxge2 yes
OWNSERVER02:e1000g1 OWNSERVER01:nxge3 yes
OWNSERVER02:e1000g1 OWNSERVER02:nxge2 yes
OWNSERVER02:e1000g1 OWNSERVER02:nxge3 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g1 OWNSERVER01:nxge2 yes
OWNSERVER02:e1000g1 OWNSERVER01:nxge3 yes
OWNSERVER02:nxge2 OWNSERVER02:nxge3 yes
OWNSERVER02:nxge2 OWNSERVER01:e1000g0 yes
OWNSERVER02:nxge2 OWNSERVER01:e1000g0 yes
OWNSERVER02:nxge2 OWNSERVER01:e1000g1 yes
OWNSERVER02:nxge2 OWNSERVER01:e1000g1 yes
OWNSERVER02:nxge2 OWNSERVER01:nxge2 yes
OWNSERVER02:nxge2 OWNSERVER01:nxge3 yes
OWNSERVER02:nxge3 OWNSERVER01:e1000g0 yes
OWNSERVER02:nxge3 OWNSERVER01:e1000g0 yes
OWNSERVER02:nxge3 OWNSERVER01:e1000g1 yes
OWNSERVER02:nxge3 OWNSERVER01:e1000g1 yes
OWNSERVER02:nxge3 OWNSERVER01:nxge2 yes
OWNSERVER02:nxge3 OWNSERVER01:nxge3 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g0 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER01:e1000g0 OWNSERVER01:nxge2 yes
OWNSERVER01:e1000g0 OWNSERVER01:nxge3 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER01:e1000g0 OWNSERVER01:nxge2 yes
OWNSERVER01:e1000g0 OWNSERVER01:nxge3 yes
OWNSERVER01:e1000g1 OWNSERVER01:e1000g1 yes
OWNSERVER01:e1000g1 OWNSERVER01:nxge2 yes
OWNSERVER01:e1000g1 OWNSERVER01:nxge3 yes
OWNSERVER01:e1000g1 OWNSERVER01:nxge2 yes
OWNSERVER01:e1000g1 OWNSERVER01:nxge3 yes
OWNSERVER01:nxge2 OWNSERVER01:nxge3 yes
Result: Node connectivity check passed for subnet “10.47.2.64″ with node(s) OWNSERVER02,OWNSERVER01.
Check: Node connectivity of subnet “10.0.0.0″
Source Destination Connected?
—————————— —————————— —————-
OWNSERVER02:e1000g0 OWNSERVER02:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER02:e1000g2 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g0 OWNSERVER01:e1000g2 yes
OWNSERVER02:e1000g1 OWNSERVER02:e1000g2 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g1 OWNSERVER01:e1000g2 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g0 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g1 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g2 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g1 yes
OWNSERVER01:e1000g0 OWNSERVER01:e1000g2 yes
OWNSERVER01:e1000g1 OWNSERVER01:e1000g2 yes
Result: Node connectivity check passed for subnet “10.0.0.0″ with node(s) OWNSERVER02,OWNSERVER01.
Check: Node connectivity of subnet “10.47.2.0″
Source Destination Connected?
—————————— —————————— —————-
OWNSERVER02:e1000g2 OWNSERVER02:e1000g2 yes
OWNSERVER02:e1000g2 OWNSERVER02:nxge0 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g2 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g2 yes
OWNSERVER02:e1000g2 OWNSERVER01:nxge0 yes
OWNSERVER02:e1000g2 OWNSERVER02:nxge0 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g2 yes
OWNSERVER02:e1000g2 OWNSERVER01:e1000g2 yes
OWNSERVER02:e1000g2 OWNSERVER01:nxge0 yes
OWNSERVER02:nxge0 OWNSERVER01:e1000g2 yes
OWNSERVER02:nxge0 OWNSERVER01:e1000g2 yes
OWNSERVER02:nxge0 OWNSERVER01:nxge0 yes
OWNSERVER01:e1000g2 OWNSERVER01:e1000g2 yes
OWNSERVER01:e1000g2 OWNSERVER01:nxge0 yes
OWNSERVER01:e1000g2 OWNSERVER01:nxge0 yes
Result: Node connectivity check passed for subnet “10.47.2.0″ with node(s) OWNSERVER02,OWNSERVER01.
Check: Node connectivity of subnet “3.208.169.192″
Source Destination Connected?
—————————— —————————— —————-
OWNSERVER02:e1000g3 OWNSERVER02:e1000g3 yes
OWNSERVER02:e1000g3 OWNSERVER02:nxge1 yes
OWNSERVER02:e1000g3 OWNSERVER01:e1000g3 yes
OWNSERVER02:e1000g3 OWNSERVER01:e1000g3 yes
OWNSERVER02:e1000g3 OWNSERVER01:nxge1 yes
OWNSERVER02:e1000g3 OWNSERVER02:nxge1 yes
OWNSERVER02:e1000g3 OWNSERVER01:e1000g3 yes
OWNSERVER02:e1000g3 OWNSERVER01:e1000g3 yes
OWNSERVER02:e1000g3 OWNSERVER01:nxge1 yes
OWNSERVER02:nxge1 OWNSERVER01:e1000g3 yes
OWNSERVER02:nxge1 OWNSERVER01:e1000g3 yes
OWNSERVER02:nxge1 OWNSERVER01:nxge1 yes
OWNSERVER01:e1000g3 OWNSERVER01:e1000g3 yes
OWNSERVER01:e1000g3 OWNSERVER01:nxge1 yes
OWNSERVER01:e1000g3 OWNSERVER01:nxge1 yes
Result: Node connectivity check passed for subnet “3.208.169.192″ with node(s) OWNSERVER02,OWNSERVER01.
Check: Node connectivity of subnet “3.0.0.0″
Source Destination Connected?
—————————— —————————— —————-
OWNSERVER02:e1000g3 OWNSERVER01:e1000g3 yes
Result: Node connectivity check passed for subnet “3.0.0.0″ with node(s) OWNSERVER02,OWNSERVER01.
Suitable interfaces for VIP on subnet “3.208.169.192″:
OWNSERVER02 e1000g3:3.208.169.207 e1000g3:3.208.169.208
OWNSERVER01 e1000g3:3.208.169.203 e1000g3:3.208.169.204
Suitable interfaces for VIP on subnet “3.208.169.192″:
OWNSERVER02 nxge1:3.208.169.209
OWNSERVER01 nxge1:3.208.169.205
I ignored the last messages about the VIP, as I knew there wasn't any problem.






After the checks passed, the Oracle Clusterware installer was launched. On the cluster configuration screen, click "Add" to add the second node:
Public Node Name: ownserver02
Private Node Name: ownserver02ipmp1
Virtual Host Name: ownserverv02







On Ownserver01
bash-3.00# /u01/app/oracle/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oracle/oraInventory to 770.
Changing groupname of /u01/app/oracle/oraInventory to oinstall.
The execution of the script is complete
bash-3.00# ls -l /u01/app/oracle/product/crs/root.sh
-rwxr-xr-x 1 ORACLE oinstall 105 Apr 30 11:14 /u01/app/oracle/product/crs/root.sh
bash-3.00# /u01/app/oracle/product/crs/root.sh
WARNING: directory ‘/u01/app/oracle/product’ is not owned by root
WARNING: directory ‘/u01/app/oracle’ is not owned by root
WARNING: directory ‘/u01/app’ is not owned by root
WARNING: directory ‘/u01′ is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory ‘/u01/app/oracle/product’ is not owned by root
WARNING: directory ‘/u01/app/oracle’ is not owned by root
WARNING: directory ‘/u01/app’ is not owned by root
WARNING: directory ‘/u01′ is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: ownserver01 ownserver01ipmp1 ownserver01
node 2: ownserver02 ownserver02ipmp1 ownserver02
Creating OCR keys for user ‘root’, privgrp ‘root’..
Operation successful.
Now formatting voting device: /u02/oracle/crs/vote_disk1
Now formatting voting device: /u03/oracle/crs/vote_disk2
Now formatting voting device: /u04/oracle/crs/vote_disk3
Format of 3 voting devices complete.
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
ownserver01
CSS is inactive on these nodes.
ownserver02
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
On Ownserver02
bash-3.00# /u01/app/oracle/product/crs/root.sh
WARNING: directory ‘/u01/app/oracle/product’ is not owned by root
WARNING: directory ‘/u01/app/oracle’ is not owned by root
WARNING: directory ‘/u01/app’ is not owned by root
WARNING: directory ‘/u01′ is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory ‘/u01/app/oracle/product’ is not owned by root
WARNING: directory ‘/u01/app/oracle’ is not owned by root
WARNING: directory ‘/u01/app’ is not owned by root
WARNING: directory ‘/u01′ is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: ownserver01 ownserver01ipmp1 ownserver01
node 2: ownserver02 ownserver02ipmp1 ownserver02
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
ownserver01
ownserver02
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Creating VIP application resource on (2) nodes…
Creating GSD application resource on (2) nodes…
Creating ONS application resource on (2) nodes…
Starting VIP application resource on (2) nodes…
Starting GSD application resource on (2) nodes…
Starting ONS application resource on (2) nodes…
Done.
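After root.sh completed on both nodes, a quick sanity check can be run as the ORACLE user with the standard 10gR2 CRS utilities (output omitted; $ORA_CRS_HOME is the CRS home set in the profile above):

ownserver01 [SHCL1DR1]$ $ORA_CRS_HOME/bin/olsnodes -n
ownserver01 [SHCL1DR1]$ $ORA_CRS_HOME/bin/crsctl check crs
ownserver01 [SHCL1DR1]$ $ORA_CRS_HOME/bin/crs_stat -t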

Friday, April 25, 2008

Knowing when to Rebuild Indexes

Abstract
Indexes provide a fast and efficient method of retrieving selected data from a table. By pointing to the blocks that contain the selected data, the entire table does not have to be read in order to extract the required information. Most indexes in Oracle databases are built using the B-tree data structure. Contrary to some widely accepted beliefs, indexes in Oracle are not self-balancing. After a table experiences a large number of inserts, updates, and deletes, the index can become unbalanced and fragmented and can hinder query performance. Knowing when to rebuild indexes is a topic of some confusion. This paper hopes to shine some light on the subject.

Where is the index now?

In order to understand what we must do with the index, we must first get an idea of the current state of the index. This can be accomplished by using the ANALYZE INDEX VALIDATE STRUCTURE command. Normally, the ANALYZE INDEX command creates either computed or estimated statistics for the index that can be seen in the DBA_INDEXES view. Analyzing the index changes the optimizer’s execution plans for queries that potentially use that index. This action may produce unintentional side effects, especially if the index has not previously been analyzed. The VALIDATE STRUCTURE command can be safely executed without affecting the optimizer. The VALIDATE STRUCTURE command populates the SYS.INDEX_STATS table only. The SYS.INDEX_STATS table can be accessed with the public synonym INDEX_STATS. The INDEX_STATS table will only hold validation information for one index at a time. You will need to query this table before validating the structure of the next index.

Below is an example of ANALYZE INDEX VALIDATE STRUCTURE and sample output from INDEX_STATS:

SQLWKS> ANALYZE INDEX shopping_basket_pk VALIDATE STRUCTURE;
Statement processed.
SQLWKS> SELECT name,height,lf_rows,lf_blks,del_lf_rows,distinct_keys,used_space
2> FROM INDEX_STATS;

NAME                           HEIGHT    LF_ROWS    LF_BLKS DEL_LF_ROW DISTINCT_K USED_SPACE
------------------------------ ------ ---------- ---------- ---------- ---------- ----------
SHOPPING_BASKET_PK                  2          1          3          1          1         65

1 row selected.
I have the information, now what?

There are two rules of thumb to help determine if the index needs to be rebuilt. If it is determined that the index needs to be rebuilt, this can easily be accomplished with the ALTER INDEX REBUILD command. Although not necessarily recommended, this command can be executed during normal operating hours. Rebuilding the index uses the existing index as its basis. The alternative is to drop and re-create the index. Creating an index uses the base table as its data source, which requires a lock on the table, and the index is unavailable while it is being created.

The first rule of thumb is that if the index has a height greater than four, rebuild the index. For most indexes, the height of the index will be quite low, i.e. one or two. I have seen an index on a 3 million-row table with a height of three. An index with a height greater than four may need to be rebuilt, as this might indicate a skewed tree structure, which can lead to unnecessary database block reads of the index. It is helpful to know the data structure of the table and index. Most of the time the index height should be two or less, but there are exceptions.

The second rule of thumb is that the deleted leaf rows should be less than 20% of the total number of leaf rows. An excessive number of deleted leaf rows indicates that a high number of deletes or updates have occurred to the index column(s). The index should be rebuilt to better balance the tree. The INDEX_STATS table can be queried to determine if there are excessive deleted leaf rows in relation to the total number of leaf rows. Let’s look at an example:

SQLWKS> ANALYZE INDEX item_basket_pk VALIDATE STRUCTURE;
Statement processed.
SQLWKS> SELECT name,height,lf_rows,del_lf_rows,(del_lf_rows/lf_rows)*100 as ratio
     2> FROM INDEX_STATS;

NAME                           HEIGHT    LF_ROWS DEL_LF_ROW      RATIO
------------------------------ ------ ---------- ---------- ----------
ITEM_BASKET_PK                      1        235         74 31.4893617

1 row selected.

In this example, the ratio of deleted leaf rows to total leaf rows is clearly above 20%. This is a good candidate for rebuilding. Let’s rebuild the index and examine the results.

SQLWKS> ALTER INDEX item_basket_pk REBUILD;
Statement processed.
SQLWKS> ANALYZE INDEX item_basket_pk VALIDATE STRUCTURE;
Statement processed.
SQLWKS> SELECT name,height,lf_rows,del_lf_rows,(del_lf_rows/lf_rows)*100 as ratio
2> FROM INDEX_STATS;
NAME                           HEIGHT    LF_ROWS DEL_LF_ROW      RATIO
------------------------------ ------ ---------- ---------- ----------
ITEM_BASKET_PK                      1        161          0          0

1 row selected.

The index is rebuilt and validated once again. Examining the INDEX_STATS table shows that the 74 deleted leaf rows were dropped from the index. Notice that the total number of leaf rows went from 235 to 161, which is a difference of 74 leaf rows. This index should provide better performance for the application.

A Case Study

To illustrate the ideas in this paper, a test case was created and various queries were executed against a large table before and after rebuilding the primary key index. It should be noted that this test case was created specifically for this paper and results in the field may vary.

The test table only contains two columns, ID and NAME. The table is described below:
REQ SQL> desc analyze;
Name                            Null?    Type
------------------------------- -------- ----
ID                                       NUMBER
NAME                                     VARCHAR2(20)

The test table has a primary key, ANALYZE_PK on the ID field. The test table was populated with 1 million rows of data. After populating the table, over 270,000 rows were selectively deleted from the table. This method ensured that the primary key index would have a sufficient number of deleted leaf rows. After populating and deleting rows from the table, the index was validated as below:

REQ SQL> analyze index analyze_pk validate structure;
Index analyzed.
REQ SQL> select name,height,lf_rows,del_lf_rows,del_lf_rows/lf_rows from index_stats;
NAME                           HEIGHT    LF_ROWS DEL_LF_ROWS DEL_LF_ROWS/LF_ROWS
------------------------------ ------ ---------- ----------- -------------------
ANALYZE_PK                          3     990206      277353          .280096263

It is now clear that the deleted leaf rows comprise approximately 28% of the total leaf rows.

Four different queries were executed against the table. All of these queries will utilize the primary key index as verified by the explain plan. These queries are executed against the table with the current index. The index is then rebuilt. These queries are re-executed and the results are compared. The four queries are:

1. select count(*) from analyze where id between 250000 and 600000;
2. select count(*) from analyze where id in (50000,4321,698754);
3. select count(*) from analyze where id > 500000;
4. select count(*) from analyze where id = 4321;
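For reference, a minimal sketch of how the tracing and TKPROF formatting can be done; the trace file name and the explain login are placeholders, not values from the original test:

SQL> alter session set sql_trace = true;
    -- run the four queries above --
SQL> alter session set sql_trace = false;
$ tkprof <trace_file>.trc tkprof_analyze.out sys=no explain=system/<password>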

With the current index in place, these four queries were executed against the test table. SQL Trace was enabled and the queries are shown in the following excerpt from TKPROF:

********************************************************************************
select count(*) from analyze where id between 250000 and 600000

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.47 0.47 0 293 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.47 0.47 0 293 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
138614 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************
select count(*) from analyze where id in (50000,4321,698754)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 9 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.01 0 9 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
0 CONCATENATION
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************
select count(*) from analyze where id > 500000

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 1.11 1.11 0 1611 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 1.11 1.11 0 1611 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
445545 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************
select count(*) from analyze where id = 4321

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 3 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************

After these results have been obtained, the index is rebuilt. The index is then validated and the results are show below:

REQ SQL> alter index analyze_pk rebuild;
Index altered.
REQ SQL> analyze index analyze_pk validate structure;
Index analyzed.
REQ SQL> select name,height,lf_rows,del_lf_rows,del_lf_rows/lf_rows from index_stats;
NAME HEIGHT LF_ROWS DEL_LF_ROWS DEL_LF_ROWS/LF_ROWS------------------------------ ---------- ---------- ----------- -------------------ANALYZE_PK 3 712853 0 0

It is clear that the deleted leaf rows have been removed from the index. The four queries are run against the table once again. Again, SQL Trace is enabled and the queries are shown in the following excerpt from TKPROF:

********************************************************************************
select count(*) from analyze where id between 250000 and 600000

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.23 0.23 0 679 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.23 0.23 0 679 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
138614 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************
select count(*) from analyze where id in (50000,4321,698754)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 9 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 9 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
0 CONCATENATION
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************
select count(*) from analyze where id > 500000

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.96 0.96 0 933 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.96 0.96 0 933 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
445545 INDEX (RANGE SCAN) OF 'ANALYZE_PK' (UNIQUE)
********************************************************************************
select count(*) from analyze where id = 4321

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 3 0 1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 5 (SYSTEM)

Rows     Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (AGGREGATE)
1 INDEX (UNIQUE SCAN) OF 'ANALYZE_PK' (UNIQUE)

The same queries have now been executed against the test table before and after rebuilding the index. The results are summarized in the following table:


            Before index rebuild     After index rebuild     % Change
            CPU       Elapsed        CPU       Elapsed       CPU       Elapsed
Query 1     0.47      0.47           0.23      0.23          51%       51%
Query 2     0.00      0.01           0.00      0.00          0%        0%
Query 3     1.11      1.11           0.96      0.96          13.5%     13.5%
Query 4     0.00      0.00           0.00      0.00          0%        0%

The results in the above table show that queries 1 and 3 experienced up to 51% improvement in execution time. It should be noted that the queries were executed at least once prior to tracing the queries so that the data would reside in the buffer cache and first time executions of the queries would not skew the results. This fact explains why the CPU and elapsed times for a query are nearly identical. The data was read from the buffer cache. Queries 2 and 4 took minimal time to execute. These queries searched for specific values. Since the data was cached, the CPU and Elapsed times are minimal. Queries 1 and 3 took significantly longer. These queries returned a range of values. This paper does not attempt to generate any findings on the improvement of different types of queries when rebuilding an index. Rather it gives a broader statement that query execution times will improve after rebuilding substandard indexes.

Script to find indexes to rebuild

Below is a sample script that can be run to determine which indexes need to be rebuilt. For those indexes that need to be rebuilt, the ALTER INDEX REBUILD command is dynamically generated as output. The user can tailor the height and percentage of deleted leaf rows by altering the vMaxHeight and vMaxDel variables. The output of this script can be spooled to a file. This file can then be run to rebuild the indexes.

-- validate_idx.sql
-- by Arun Mahendran
-- Database Administrator
-- 26 April 2008
-- This script will check indexes to find candidates for rebuilding.
-- Run this script in SQL*Plus as a user with SELECT ANY TABLE
-- privileges.
-- This script can be used and modified without permission. Run this
-- script at your own risk! The script author is not responsible for
-- any problems that may arise from running this script.
set serveroutput on size 100000
DECLARE
  vOwner     dba_indexes.owner%TYPE;       /* Index Owner            */
  vIdxName   dba_indexes.index_name%TYPE;  /* Index Name             */
  vAnalyze   VARCHAR2(100);                /* String of Analyze Stmt */
  vCursor    NUMBER;                       /* DBMS_SQL cursor        */
  vNumRows   INTEGER;                      /* DBMS_SQL return rows   */
  vHeight    index_stats.height%TYPE;      /* Height of index tree   */
  vLfRows    index_stats.lf_rows%TYPE;     /* Index Leaf Rows        */
  vDLfRows   index_stats.del_lf_rows%TYPE; /* Deleted Leaf Rows      */
  vDLfPerc   NUMBER;                       /* Del lf Percentage      */
  vMaxHeight NUMBER;                       /* Max tree height        */
  vMaxDel    NUMBER;                       /* Max del lf percentage  */
CURSOR cGetIdx IS SELECT owner,index_name
FROM dba_indexes WHERE OWNER NOT LIKE 'SYS%';
BEGIN
/* Define maximums. This section can be customized. */
vMaxHeight := 3;
vMaxDel := 20;

/* For every index, validate structure */
OPEN cGetIdx;
LOOP
FETCH cGetIdx INTO vOwner,vIdxName;
EXIT WHEN cGetIdx%NOTFOUND;
/* Open DBMS_SQL cursor */
vCursor := DBMS_SQL.OPEN_CURSOR;
/* Set up dynamic string to validate structure */
vAnalyze := 'ANALYZE INDEX ' || vOwner || '.' || vIdxName ||
            ' VALIDATE STRUCTURE';
DBMS_SQL.PARSE(vCursor,vAnalyze,DBMS_SQL.V7);
vNumRows := DBMS_SQL.EXECUTE(vCursor);
/* Close DBMS_SQL cursor */
DBMS_SQL.CLOSE_CURSOR(vCursor);
/* Does index need rebuilding? */
/* If so, then generate command */
SELECT height,lf_rows,del_lf_rows INTO vHeight,vLfRows,vDLfRows
FROM INDEX_STATS;
IF vDLfRows = 0 THEN
/* handle case where div by zero */
vDLfPerc := 0;
ELSE
vDLfPerc := (vDLfRows / vLfRows) * 100;
END IF;
IF (vHeight > vMaxHeight) OR (vDLfPerc > vMaxDel) THEN
DBMS_OUTPUT.PUT_LINE('ALTER INDEX ' || vOwner || '.' || vIdxName || ' REBUILD;');
END IF;
END LOOP;
CLOSE cGetIdx;
END;
/
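A minimal usage sketch in SQL*Plus, spooling the generated ALTER INDEX statements to a file and then running that file (file names are assumptions):

SQL> set heading off feedback off
SQL> spool rebuild_indexes.sql
SQL> @validate_idx.sql
SQL> spool off
SQL> @rebuild_indexes.sql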
Summary

One of the areas lacking in Oracle documentation pertains to rebuilding an index. Oracle's ANALYZE INDEX VALIDATE STRUCTURE command provides a nice way to check an index to see if it is a candidate for rebuilding. This command does not affect the Oracle optimizer's execution plan for queries that may use the index. The results in INDEX_STATS are checked after issuing the VALIDATE STRUCTURE command. If an index has excessive height (greater than four) or a high number of deleted leaf rows (over 20% of the total), we rebuild the index.
A test case was studied to show the potential for improvements after rebuilding an index. In one example, the query executed 51% faster after rebuilding the index. While this example is specific to the test case, it does show the potential for improvement after rebuilding an index.
Keywords:

Index, Rebuild Index, Rebuild Index Online, ALTER INDEX REBUILD, ANALYZE INDEX VALIDATE STRUCTURE, LF_ROWS, DEL_LF_ROWS

Thursday, April 17, 2008

Increase in Controlfile Size ???

In Oracle 8i and higher, if the MAXDATAFILES limit is reached, the control file will expand automatically. When you issue the CREATE DATABASE or CREATE CONTROLFILE statement, the MAXDATAFILES parameter specifies the initial size of the datafile portion of the control file.

Later, if you add a file whose number exceeds MAXDATAFILES but is less than or equal to the value specified by the DB_FILES initialization parameter, the control file automatically expands to allow the datafile portion to accommodate more files.

We can have DB_FILES set to more than the value of MAXDATAFILES. However, note that the DB_FILES parameter value should be within the OS kernel limits.

Refer to Note 144638.1 - Relationship Between Common Init.ora Parameters and Unix Kernel Parameters.

If you are adding datafiles to the database and the count is within the DB_FILES limit, you will get an error only if the control file is unable to allocate more space. Recreating the control file is not required to increase the MAXDATAFILES parameter. It is not good to have a DB_FILES value much higher than required: increasing the value of DB_FILES increases the size of the PGA (Program Global Area) that is allocated for every user process connected to Oracle.
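One way to observe this growth is to query the datafile record section of the control file before and after adding files; this is a sketch, not a query from the referenced note:

SQL> SELECT type, record_size, records_total, records_used
  2  FROM   v$controlfile_record_section
  3  WHERE  type = 'DATAFILE';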
Keywords:

CREATE CONTROLFILE, CREATE DATABASE, controlfile, 144638.1, Unix Kernel Parameters, OS kernel limits, controlfile size

Tuesday, March 18, 2008

RMAN Recovery Methods


When performing a restore operation, it is best to open two telnet sessions, one for SQL commands, and one for RMAN commands. For the rest of this document, RMAN commands will be prefaced with the RMAN> prompt, and SQL commands will be prefaced with the SQL> prompt. A $ indicates that the command is executed from the Unix prompt.


Restoring and Recovering All Datafiles


In this scenario, it is assumed that your control files are still accessible. You have a backup, done for example with backup database plus archivelog;

Your first step is to make sure that the target database is shut down:


$ sqlplus "/ as SYSDBA"

SQL> shutdown abort;

ORACLE instance shut down.


Next, you need to start up your target database in mount mode. RMAN cannot restore datafiles unless the database is at least in mount mode, because RMAN needs to be able to access the control file to determine which backup sets are necessary to recover the database. If the control file isn't available, you have to recover it first. Issue the STARTUP MOUNT command shown in the following example to mount the database:


SQL> startup mount;


Oracle instance started.
Database mounted


Since backup set files are created in an RMAN-specific format, you must use RMAN to restore the datafiles. To use RMAN, connect to the target database:


$ rman target / rcvcat rcvcat/rcvcat@oemprod


The remainder of this example shows how to restore all of the datafiles of the target database. When the restore command is executed, RMAN will automatically go to its last good backup set and restore the datafiles to the state they were in when that backup set was created.

When restoring database files, RMAN reads the datafile header and makes the determination as to whether the file needs to be restored. The recovery is done by allocating a channel for I/O and then issuing the RMAN restore database command.


With Oracle9i and above, you don't need to allocate a channel explicitly. Instead, you can use the default channel mode:


RMAN> restore database;

RMAN> recover database;

SQL> alter database open;


For Oracle8i, the ALLOCATE, RESTORE, and RECOVER commands need to be enclosed by the run{} command:


RMAN> run {

allocate channel d1 type disk;

restore database;

recover database;

}

SQL> alter database open;


Once the recovery has been completed, execute a complete RMAN backup to establish a new baseline.
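A minimal sketch of such a baseline backup, matching the backup command assumed at the start of this scenario:

RMAN> backup database plus archivelog;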

Restoring Specific Tablespaces


In this scenario, it is assumed that your control files are still accessible. You have a backup, done for example with backup database plus archivelog;


Take the tablespace that needs recovery offline, restore the tablespace, recover the tablespace, and bring the tablespace online. If you cannot take the tablespace offline, then shutdown abort the database and restore in mount mode.


First try to take the tablespace offline;


$ sqlplus "/ as sysdba"

SQL> alter tablespace tab offline;


If this works, continue with the RMAN recovery:


$ rman target / rcvcat rcvcat/rcvcat@oemprod

RMAN> restore tablespace tab;

RMAN> recover tablespace tab;

SQL> alter tablespace tab online;


If taking the tablespace offline fails, follow these steps:


$ sqlplus "/ as SYSDBA"

SQL> shutdown abort;

SQL> startup mount;

$ rman target / rcvcat rcvcat/rcvcat@oemprod

RMAN> restore tablespace tab;

RMAN> recover tablespace tab;

SQL> alter database open;


Once the recovery has been completed, execute a complete RMAN backup to establish a new baseline.

Restoring Specific Datafiles

In this scenario, it is assumed that your control files are still accessible. You have a backup, done for example with backup database plus archivelog;


Take the datafile that needs recovery offline, restore the datafile, recover the datafile, and bring the datafile online. If you cannot take the datafile offline, then shutdown abort the database and restore in mount mode.


First try to take the datafile offline:


SQL> alter database datafile '/u01/oracle/db/AKI1/tab/AKI1_tab.dbf' offline;

If this works, continue with the RMAN recovery:


$ rman target / rcvcat rcvcat/rcvcat@oemprod


RMAN> restore datafile '/u01/oracle/db/AKI1/tab/AKI1_tab.dbf';

RMAN> recover datafile '/u01/oracle/db/AKI1/tab/AKI1_tab.dbf';
SQL> alter database datafile '/u01/oracle/db/AKI1/tab/AKI1_tab.dbf' online;

If taking the datafile offline fails, follow these steps:


$ sqlplus "/ as SYSDBA"

SQL> shutdown abort;

SQL> startup mount;

$ rman target / rcvcat rcvcat/rcvcat@oemprod

RMAN> restore datafile '/u01/oracle/db/AKI1/tab/AKI1_tab.dbf';

RMAN> recover datafile '/u01/oracle/db/AKI1/tab/AKI1_tab.dbf';

SQL> alter database open;


Once the recovery has been completed, execute a complete RMAN backup to establish a new baseline.


Restoring Control Files


In this scenario, it is assumed that your control files are backed up. You have a backup, done for example with backup database plus archivelog;

In an ideal world you'll never use RMAN to restore a control file. But if something catastrophic happens, and you lose all control files, here are the steps for getting them back:


The following examples assume that you are using a catalog. First, here's the simplest Oracle9i syntax for restoring a control file:


$ sqlplus "/ as SYSDBA"

SQL> shutdown abort;

SQL> startup nomount;

$ rman target / rcvcat rcvcat/rcvcat@oemprod


Set the database id (DBID) with the following command. This is a 10-digit number that RMAN uses to uniquely identify this database in the recovery log. The number can be obtained from any previous RMAN backup log file.
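If no old backup log is at hand, the DBID can also be looked up in the recovery catalog itself; a sketch, assuming the rcvcat catalog schema used above:

SQL> connect rcvcat/rcvcat@oemprod
SQL> SELECT dbid, name FROM rc_database;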


RMAN> set dbid = xxxxxxxxxx;
RMAN> restore controlfile;

SQL> alter database mount;

SQL> alter database open;

If this fails with ...

ORA-01589: must use RESETLOGS or NORESETLOGS option for database open


... then you must perform a recover database:
SQL> shutdown abort;
SQL> startup mount;

$ rman target / rcvcat rcvcat/rcvcat@oemprod

RMAN> recover database;

SQL> alter database open resetlogs;
RMAN> reset database;

Note that all offline archivelogs are now useless; perform a full backup as soon as possible.


Restoring Online Redologs


In this scenario, it is assumed that your control files are backed up. You have a backup, done for example with backup database plus archivelog;

$ sqlplus "/ as sysdba"
SQL> shutdown abort;

SQL> startup nomount;

$ rman target / rcvcat rcvcat/rcvcat@oemprod


Set the database id (DBID) with the following command. This is a 10-digit number that RMAN uses to uniquely identify this database in the recovery log. The number can be obtained from any previous RMAN backup log file.


RMAN> set dbid = xxxxxxxxxx;

RMAN> restore controlfile;

SQL> alter database mount;

RMAN> restore database;

RMAN> recover database;

RMAN-00571:=========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS=============

RMAN-00571:=========================================================

RMAN-03002: failure of recover command at 09/28/2004 11:03:23

RMAN-06054: media recovery requesting unknown log: thread 1 seq 1 lowscn 8448414


Since the online logs were lost, complete recovery is not possible. Open the database with resetlogs to continue.

RMAN> alter database open resetlogs;


-------------------------------------IMPORTANT-------------------------------------------
During this type of recovery, if you receive error messages like this:


RMAN> restore database;

Starting restore at 11-JUL-05

using channel ORA_SBT_TAPE_1
using channel ORA_DISK_1
RMAN-00571:=========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============
RMAN-00571:========================================================
RMAN-03002: failure of restore command at 07/11/2005 14:25:22
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 10 found to restore
RMAN-06023: no backup or copy of datafile 9 found to restore
RMAN-06023: no backup or copy of datafile 8 found to restore
RMAN-06023: no backup or copy of datafile 7 found to restore
RMAN-06023: no backup or copy of datafile 6 found to restore
RMAN-06023: no backup or copy of datafile 5 found to restore
RMAN-06023: no backup or copy of datafile 4 found to restore
RMAN-06023: no backup or copy of datafile 3 found to restore
RMAN-06023: no backup or copy of datafile 2 found to restore
RMAN-06023: no backup or copy of datafile 1 found to restore

…use the following directions to recover (recreate the controlfile):


1. With the database mounted, execute 'alter database backup controlfile to trace resetlogs;'


2. Perform a shutdown abort on the database, but remain at the SQL> prompt.


3. In another telnet session, go to the udump directory to retrieve the resulting trace file and copy it to another location to edit it.


4. Edit the file and add the phrase "until cancel" to the recover database command at the end. The phrase should read "recover database until cancel using backup controlfile". Remove the "alter database open" command after the recover command. Save the file with a .sql extension.


5. Back at the SQL> prompt, execute the modified trace file. When prompted for an archived log, type in "cancel" and the reply should be "media recovery cancelled".


6. Issue "alter database open resetlogs". The database should open after a few moments.


7. Connect to the RMAN recovery catalog and issue the "reset database" command.


8. Perform a full RMAN backup as soon as possible.

----------------------------------------------------------------------------------------
Time-Based, Change-Based, or SCN-based Incomplete Recovery


Incomplete recovery uses a backup to produce a non-current version of the database. In other words, you do not apply all of the redo records generated after the most recent backup.


You usually perform incomplete recovery of the whole database in the following situations:


Media failure destroys some or all of the online redo logs.
A user error causes data loss, for example, a user inadvertently drops a table.
You cannot perform complete recovery because an archived redo log is missing.
You lose your current control file and must use a backup control file to open the database.


To perform incomplete media recovery, you must restore all datafiles from backups created prior to the time to which you want to recover and then open the database with the RESETLOGS option when recovery completes. The RESETLOGS operation creates a new incarnation of the database; in other words, a database with a new stream of log sequence numbers starting with log sequence 1.
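After an incomplete recovery and RESETLOGS, the new incarnation can be confirmed from RMAN; a minimal sketch:

RMAN> list incarnation of database;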


NOTE – Start every RMAN incomplete recovery with the following commands:

$ sqlplus "/ as sysdba"

SQL> shutdown abort;

SQL> startup mount;

$ rman target / rcvcat rcvcat/rcvcat@oemprod


--For time-based recovery, use these commands:


RMAN> restore database until time "to_date('07/12/05 12:00:00','MM/DD/YY HH24:MI:SS')";

RMAN> recover database until time "to_date('07/12/05 12:00:00','MM/DD/YY HH24:MI:SS')";

media recovery complete.

SQL> alter database open resetlogs;


--For SCN-based recovery, use these commands:
RMAN> restore database until scn 1000;

RMAN> recover database until scn 1000;

media recovery complete.

SQL> alter database open resetlogs;


--For change-based (log sequence) recovery, use these commands:


RMAN> restore database until sequence 9923; --Archived log sequence number
RMAN> recover database until sequence 9923;    --Archived log sequence number

media recovery complete.

SQL> alter database open resetlogs;


Once the recovery has been completed, execute the following steps:

Delete prior backups with this command (from the RMAN prompt):

RMAN> delete force backup;

This command removes all prior backups from the RMAN catalog, as they can no longer be used once the database has been restarted with the resetlogs option. After completing this command, create a new RMAN backup to establish a new baseline.

Recovering Archived Logs only

In the event that you want to restore the database's archived redo logs up to a desired time, you can use the following commands:

$ rman target / rcvcat rcvcat/rcvcat@oemprod

RMAN> restore ARCHIVELOG FROM TIME 'SYSDATE-1' UNTIL TIME 'SYSDATE';
or
RMAN> restore ARCHIVELOG FROM TIME "to_date('07/11/05 00:00:01','MM/DD/YY HH24:MI:SS')UNTIL TIME 'SYSDATE';



By

-6N


Keywords:

RMAN Recovery Methods, Restoring and Recovering All Datafiles, Recovering Datafiles, ALLOCATE/RESTORE/RECOVER commands, RMAN backup, Restoring Specific Tablespaces, Restoring Specific Datafiles, Restoring Control Files, offline archivelogs, Restoring Online Redologs, full RMAN backup, Incomplete Recovery, Time-Based/Change-Based/SCN-Based Incomplete Recovery