Refer Metalink doc ID :- 11.2 Data Guard Physical Standby
Switchover Best Practices using SQL*Plus [ID 1304939.1]
PRE SWITCHOVER
Checks:-
1) Ensure LOG_ARCHIVE_CONFIG & DG_CONFIG parameters are
established in primary
database.
database.
2) Verify the physical standby database performing properly.
On the standby database, query the
V$ARCHIVED_LOG view to identify existing files in the
archived redo log
SQL> SELECT
SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY
SEQUENCE#;
SEQUENCE#;
SEQUENCE#
FIRST_TIM NEXT_TIME
---------- --------- ---------
118 27-MAY-13 27-MAY-13
119 27-MAY-13 28-MAY-13
120
28-MAY-13 28-MAY-13
Then Force a log switch on Primary
…………..
SQL> alter system
switch logfile;
System altered.
Then Verify redo received on
Standby................
SQL> SELECT
SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY
SEQUENCE#;
SEQUENCE#;
SEQUENCE# FIRST_TIM NEXT_TIME
--------------------------------------------------
119 27-MAY-13 28-MAY-13
120 28-MAY-13 28-MAY-13
121 28-MAY-13 28-MAY-13
Then verify redo applied on Standby...................
SQL> SELECT
SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;
SEQUENCE#
APPLIED
----------
---------
119 YES
120 NO
120 YES
121 NO
121 YES
122 NO
122 YES
123 NO
123 NO
The following
query at the standby verifies that managed recovery is running:
SQL> SELECT
PROCESS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
The
following query at the Primary verifies that recovery is running with “REAL
TIME APPLY”
option. In the example
below, LOG_ARCHIVE_DEST_2 is established to ship redo to the target
standby
(dest_id=2):
SQL>
SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;
RECOVERY_MODE
-----------------------
MANAGED
REAL TIME APPLY
If managed
standby recovery is not running or not started with real-time apply, restart
managed recovery with Real-time apply
enabled:
SQL>
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER
MANAGED STANDBY DATABASE USING CURRENT LOGFILE
DISCONNECT;
Then Verify Sufficient
Number of Archiver Processes Make sure
LOG_ARCHIVE_MAX_PROCESSES is set to 4 or higher on every primary and standby
database in the Data Guard configuration. Take care not to set it too high as the additional
archivers can increase the time to shutdown the database. This parameter can be set
dynamically via ALTER SYSTEM.
LOG_ARCHIVE_MAX_PROCESSES is set to 4 or higher on every primary and standby
database in the Data Guard configuration. Take care not to set it too high as the additional
archivers can increase the time to shutdown the database. This parameter can be set
dynamically via ALTER SYSTEM.
Verify there are
no large Gaps
Identify the
current sequence number for each thread on the primary database
SQL>
SELECT THREAD#, SEQUENCE# FROM V$THREAD;
Verify the target physical standby database has applied up to, but not including the logs from
the primary query. On the standby the following query should be within 1 or 2 of the
primary query result.
Verify the target physical standby database has applied up to, but not including the logs from
the primary query. On the standby the following query should be within 1 or 2 of the
primary query result.
SQL>
SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG
WHERE APPLIED = 'YES'
AND RESETLOGS_CHANGE# = (SELECT
RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION WHERE STATUS
= ‘CURRENT’)
GROUP BY THREAD#;
If large gaps exist (more than 3 logs) then see Section 6.4.3 Redo Gap Detection and
Resolution.--http://docs.oracle.com/cd/E11882_01/server.112/e17022/log_transport.htm
If large gaps exist (more than 3 logs) then see Section 6.4.3 Redo Gap Detection and
Resolution.--http://docs.oracle.com/cd/E11882_01/server.112/e17022/log_transport.htm
Then Verify
Primary and Standby tempfiles match and all datafiles are ONLINE
For each temporary tablespace on
the standby, verify that temporary files associated with that tablespace on the
primary database also exist on the standby database. Tempfiles added after
initial standby creation are not propagated to the standby. Run this query on
both the primary and target physical standby databases and verify that they
match.
SQL> SELECT TMP.NAME FILENAME, BYTES, TS.NAME TABLESPACE
FROM V$TEMPFILE TMP, V$TABLESPACE TS WHERE TMP.TS#=TS.TS#;
If the queries do not match then you can correct the mismatch now or immediately after the
open of the new primary database.
Prior to switchover, on the target standby, verify that all datafiles necessary for updates after
role transition to primary are ONLINE.
On the target standby:
SQL>
SELECT NAME FROM V$DATAFILE WHERE STATUS=’OFFLINE’;
If there are any
OFFLINE datafiles, and these are needed after switchover, bring them ONLINE:
SQL> ALTER DATABASE DATAFILE ‘datafile-name’ ONLINE;
Then Check if jobs are running....
SQL> SELECT * FROM DBA_JOBS_RUNNING;
Depending on what
the running job is, be ready to terminate the job if necessary.
SQL>
SELECT OWNER, JOB_NAME, START_DATE, END_DATE, ENABLED FROM
DBA_SCHEDULER_JOBS WHERE ENABLED=’TRUE’ AND
OWNER <> ‘SYS”;
SQL> SHOW PARAMETER
job_queue_processes
Note: Job candidates to be disabled among others: oracle text sync and optimizer, RMAN
backups, application garbage collectors, application background agents.
Note: Job candidates to be disabled among others: oracle text sync and optimizer, RMAN
backups, application garbage collectors, application background agents.
Block further job submission
SQL>
ALTER SYSTEM SET job_queue_processes=0 SCOPE=BOTH SID=’*’;
Disable any jobs
that may interfere.
SQL> EXECUTE DBMS_SCHEDULER.DISABLE(
SWITCHOVER
STEPS:-
Verify that the primary database can be switched to the
standby role
Query the SWITCHOVER_STATUS column of the V$DATABASE view on
the primary database:
SQL>
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO STANDBY
A value of TO STANDBY or SESSIONS ACTIVE (which requires the
WITH SESSION SHUTDOWN clause on the switchover command) indicates that the
primary database can be switched to the standby role. If neither of these
values is returned, a switchover is not possible because redo transport is
either mis-configured or is not functioning properly
http://docs.oracle.com/cd/E11882_01/server.112/e17022/troubleshooting.htm
If The Primary is a RAC, then shutdown all
secondary primary instances.
A normal or immediate
shutdown can be done, but to expedite the shutdown issue a SHUTDOWN ABORT on
secondary RAC instances on the primary cluster only leaving one Primary
instance up. Wait until the remaining Primary instance has completed cluster
reconfiguration (and performed recovery if you chose to abort the secondary
instances) before continuing.
Switchover the primary to a standby database
SQL>
ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
If an ORA-16139 error is encountered, as long as V$DATABASE.DATABASE_ROLE=’PHYSICAL STANDBY’, then you can proceed. A common case where this can occur is when there are a large number of data files. Once managed recovery is started on the new standby, the database will recover.
If the role was not changed then you need to cancel the switchover and review the alert logs and trace files further.
Verify the standby has received the end-of-redo (EOR) log(s)
If an ORA-16139 error is encountered, as long as V$DATABASE.DATABASE_ROLE=’PHYSICAL STANDBY’, then you can proceed. A common case where this can occur is when there are a large number of data files. Once managed recovery is started on the new standby, the database will recover.
If the role was not changed then you need to cancel the switchover and review the alert logs and trace files further.
Verify the standby has received the end-of-redo (EOR) log(s)
In the primary alert log you will
see messages like these:
Switchover: Primary
controlfile converted to standby controlfile succesfully.
Tue Mar 15 16:12:15 2011
MRP0 started with pid=17, OS id=2717
MRP0: Background
Managed Standby Recovery process started (SFO)
Serial Media Recovery started
Managed Standby Recovery not using Real Time
Apply
Online logfile pre-clearing operation disabled
by switchover
Media Recovery Log
/u01/app/flash_recovery_area/SFO/archivelog/2011_03_15/o1_mf_1_133_6qzl0yvd_.arc
Identified End-Of-Redo for thread 1 sequence
133
Resetting standby activation ID 0 (0x0)
Media Recovery End-Of-Redo indicator
encountered
Media Recovery Applied until change
4314801
MRP0: Media Recovery Complete: End-Of-REDO
(SFO)
MRP0: Background Media Recovery process
shutdown (SFO)
Tue Mar 15 16:12:21 2011
Switchover: Complete - Database shutdown
required (SFO)
Completed: ALTER DATABASE COMMIT TO SWITCHOVER
TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
And correspondingly in the standby
alert log file you should see messages like these:
Tue
Mar 15 16:12:15 2011
RFS[8]:
Assigned to RFS process 2715
RFS[8]:
Identified database type as 'physical standby': Client is Foreground pid 2568
Media
Recovery Log
/u01/app/flash_recovery_area/NYC/archivelog/2011_03_15/o1_mf_1_133_6qzl0yjp_.arc
Identified
End-Of-Redo for thread 1 sequence 133
Resetting
standby activation ID 2680651518 (0x9fc77efe)
Media Recovery End-Of-Redo indicator
encountered
Media
Recovery Continuing
Resetting
standby activation ID 2680651518 (0x9fc77efe)
Media
Recovery Waiting for thread 1 sequence 134
In versions prior to Oracle Database 11g Release 2, the MRP (Redo Apply coordinator) would stop automatically after processing the End-of-Redo marker. With Oracle Database 11g Release 2, it no longer stops leaving all bystander standby databases still ready to apply redo from the new primary database without having to be restarted. The MRP process will be shut down automatically by the switchover command when executed at the target standby database.
Verify that the standby database can be switched to the primary role
In versions prior to Oracle Database 11g Release 2, the MRP (Redo Apply coordinator) would stop automatically after processing the End-of-Redo marker. With Oracle Database 11g Release 2, it no longer stops leaving all bystander standby databases still ready to apply redo from the new primary database without having to be restarted. The MRP process will be shut down automatically by the switchover command when executed at the target standby database.
Verify that the standby database can be switched to the primary role
Query the SWITCHOVER_STATUS
column of the V$DATABASE view on the standby database:
SQL> SELECT
SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO
PRIMARY
A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is returned, verify that redo apply is active and that redo transport is configured and working properly. Continue to query this column until the value returned is either TO PRIMARY or SESSIONS ACTIVE.
A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is returned, verify that redo apply is active and that redo transport is configured and working properly. Continue to query this column until the value returned is either TO PRIMARY or SESSIONS ACTIVE.
Switchover the standby database to
a primary
SQL> ALTER
DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
In the standby alert log file you
should see messages like these:
Tue Mar
15 16:16:44 2011
ALTER
DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN
ALTER
DATABASE SWITCHOVER TO PRIMARY (NYC)
Maximum wait for role transition is 15
minutes.
Switchover: Media recovery is still active
Role
Change: Canceling MRP - no more redo to apply
Tue
Mar 15 16:16:45 2011
MRP0:
Background Media Recovery cancelled with status 16037
Errors
in file /u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:
ORA-16037:
user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time
Apply
Recovery
interrupted!
Waiting for MRP0 pid 2460 to terminate
Errors in file
/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:
ORA-16037: user requested cancel of managed
recovery operation
Tue
Mar 15 16:16:45 2011
MRP0: Background Media Recovery process
shutdown (NYC)
Role Change: Canceled MRP
Open the new primary database ---- For RAC setup
also open the database on second node.
SQL>
ALTER DATABASE OPEN;
Note: There will be
an increase in I/O activity while the new primary’s standby redo logs are
cleared.
Correct any tempfile
mismatch
If there was a tempfile that was not corrected during the
pre-switchover check, then correct it now on the new primary.
Restart the new standby
If the new standby
database (former primary database) was not shutdown since switching it to
standby, bring it to the mount state and start managed recovery. This can be
done in parallel to the new primary open.
SQL> SHUTDOWN ABORT;
Note: If you use IMMEDIATE,
an ABORT will be performed anyway as of 11.2.0.2 and you would see the
following in the alert log:
Performing implicit shutdown abort due to switchover to
physical standby
Shutting down instance (abort)
License high water mark = 15
USER (ospid: 14665): terminating the instance
Instance terminated by USER, pid = 14665
SQL> STARTUP
MOUNT;
SQL> ALTER
DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
Note: If you were
using a delay for your standby then you would restart the apply without real
time apply:
SQL> ALTER
DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
Finally, if the
database is a RAC, then start all secondary instances on the new standby.
ONCE THE SWITCHOVER IS SUCCESSFULL RESTORE ANY JOBS TAKEN
DOWN AND DISABLE TRACE IF ENABLED.
Fallback and Handling Failures in
switchovers:-
Refer to below URL for Failures during switchover
http://docs.oracle.com/cd/E11882_01/server.112/e17022/troubleshooting.htm
Thank you very much...
ReplyDeleteVery nice steps you prepared..Helpful.