[20140418]使用dgmgrl管理dataguard(6).txt
参考链接:
http://blog.itpub.net/267265/viewspace-1142649/
http://blog.itpub.net/267265/viewspace-1143027/
http://blog.itpub.net/267265/viewspace-1143058/
http://blog.itpub.net/267265/viewspace-1143126/
http://blog.itpub.net/267265/viewspace-1143480/
我的测试环境:primary的tnsnames别名test,standby主机的tnsnames别名testdg。数据库版本:
开始学习使用dataguard完成切换工作。首先理解Switchover和Failover两个基本概念,摘自:
PacktPub.Oracle.Data.Guard.11gR2.Administration.Beginners.Guide.Jun.2013.pdf
Switchover P43
In a basic Data Guard confguraton with one primary and one standby database, a switchover operaton changes the roles of
these databases, and so the directon of the redo shipping. In a correctly designed confguraton, archived log shipping in
the opposite directon starts immediately afer switchover and clients do not need to change their connecton descriptons
in order to connect the new primary database.
If there is more than one standby database in a Data Guard confguraton, it’s possible to perform switchover between the
primary and any of the standby databases. Afer the switchover, the new primary database can contnue to send redo to all
of the standby databases in the configuraton.
Regardless of the confguraton of Data Guard, a switchover operaton always guarantees zero data loss. This brings high
reliability to switchover and thus it’s widely used for planned maintenance operatons, such as hardware or operatng
system upgrades, database sofware rolling upgrade, and other infrastructure maintenances. Switchover reduces the
downtime for these maintenance operatons by a signifcant amount of time.
Failover P43
Failover is the operaton of convertng a standby database to a primary database, because of a failure in the original
primary database. If the fashback database is disabled on the primary database, failover is an operaton with no return.
In other words, we have to fashback the failed primary database to a state before failover in order to re-establish the
confguraton. Without fashback, Data Guard confguraton needs to be built from scratch.
A manual database failover may be performed in the case of failure with the initatve of the database owner. However,
this will require extra outage for the decision making. If fast-start failover is used, which is a 10g release 2
feature, the failover operaton will perform automatcally.
–不翻译了,很好理解。
1.首先做一个准备工作:
–在主数据库建立standby logfile redo文件,一般要求比redo文件多一组(注:备用库已经建立):
alter database add standby logfile thread 1 (‘/u01/app/oracle11g/oradata/test/redostb01.log’) size 50m ;
alter database add standby logfile thread 1 (‘/u01/app/oracle11g/oradata/test/redostb02.log’) size 50m ;
alter database add standby logfile thread 1 (‘/u01/app/oracle11g/oradata/test/redostb03.log’) size 50m ;
alter database add standby logfile thread 1 (‘/u01/app/oracle11g/oradata/test/redostb04.log’) size 50m ;
— 检查一些参数。
DGMGRL> show database test RedoCompression
RedoCompression = ‘enable’
DGMGRL> show database testdg RedoCompression
RedoCompression = ‘enable’
SYS@test> show parameter log_archive_dest_2
NAME TYPE VALUE
——————- ——- —————————————————————————————————-
log_archive_dest_2 string service=”testdg”, LGWR SYNC AFFIRM delay=0 optional compression=enable max_failure=0 max_connections
=1 reopen=300 db_unique_name=”testdg” net_timeout=30, valid_for=(all_logfiles,primary_role)
SYS@testdg> show parameter log_archive_dest_2
NAME TYPE VALUE
——————- ——- ———————————————————————————————-
log_archive_dest_2 string SERVICE=test LGWR SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=test
— 关闭数据库做一个冷备份。(忽略)
2.启动数据库做一些DML操作看看。
–检查数据库状态:
DGMGRL> show configuration verbose
Configuration – study
Protection Mode: MaxPerformance
Databases:
test – Primary database
testdg – Physical standby database
Properties:
FastStartFailoverThreshold = ’30’
OperationTimeout = ’30’
FastStartFailoverLagLimit = ’30’
CommunicationTimeout = ‘180’
FastStartFailoverAutoReinstate = ‘TRUE’
FastStartFailoverPmyShutdown = ‘TRUE’
BystandersFollowRoleChange = ‘ALL’
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> show database test
Database – test
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
test
Database Status:
SUCCESS
DGMGRL> show database testdg
Database – testdg
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: ON
Instance(s):
testdg
Database Status:
SUCCESS
— 一切正常!11G自带的dgmgrl没有validate命令。在12c client看看。
DGMGRL> help validate
Performs an exhaustive set of validations for a database
Syntax:
VALIDATE DATABASE [VERBOSE] ;
DGMGRL> validate database test
Error: ORA-16606: unable to find property “RoleChangeReadiness”
DGMGRL> validate database testdg
Error: ORA-16606: unable to find property “RoleChangeReadiness”
–缺少属性RoleChangeReadiness。
–开始切换:
DGMGRL> switchover to testdg
Performing switchover NOW, please wait…
New primary database “testdg” is opening…
Operation requires shutdown of instance “test” on database “test”
Shutting down instance “test”…
ORA-24327: need explicit attach before authenticating a user
Warning: You are no longer connected to ORACLE.
Please complete the following steps to finish switchover:
shut down instance “test” of database “test”
start up instance “test” of database “test”
— 提示难道要手工关闭数据库吗?再启动数据库吗?
–查看主数据库的alert_test.log:
CLOSE: all sessions shutdown successfully.
Waiting for all non-current ORLs to be archived…
All non-current ORLs have been archived.
Waiting for all FAL entries to be archived…
All FAL entries have been archived.
Waiting for dest_id 2 to become synchronized…
Active, synchronized Physical Standby switchover target has been identified
Switchover End-Of-Redo Log thread 1 sequence 1020 has been fixed
Switchover: Primary highest seen SCN set to 0x0.0xc2dc7170
ARCH: Noswitch archival of thread 1, sequence 1020
ARCH: End-Of-Redo Branch archival of thread 1 sequence 1020
ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
ARCH: Standby redo logfile selected for thread 1 sequence 1020 for destination LOG_ARCHIVE_DEST_2
Archived Log entry 1029 added for thread 1 sequence 1020 ID 0x7cf843e2 dest 1:
ARCH: Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received alls redo
Final check for a synchronized target standby. Check will be made once.
LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target
Active, synchronized target has been identified
Target has also received all redo
Backup controlfile written to trace file /u01/app/oracle11g/diag/rdbms/test/test/trace/test_rsm0_5722.trc
Clearing standby activation ID 2096645090 (0x7cf843e2)
The primary database controlfile was created using the
‘MAXLOGFILES 16’ clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE ‘srl1.f’ SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE ‘srl2.f’ SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE ‘srl3.f’ SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE ‘srl4.f’ SIZE 52428800;
Archivelog for thread 1 sequence 1020 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
Switchover: Complete – Database shutdown required
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
Fri Apr 18 10:14:16 2014
Process (ospid 5728) is suspended due to switchover to physical standby operation.
Fri Apr 18 10:14:53 2014
idle dispatcher ‘D000’ terminated, pid = (17, 1)
–提示要自动建立4个standby logfile文件。实际上根本没有建立。
–查看主数据库的alert_testdg.log:
Thread 1 cannot allocate new log, sequence 1023
Private strand flush not complete
Current log# 2 seq# 1022 mem# 0: /u01/app/oracle11g/oradata/test/redo02.log
******************************************************************
LGWR: Setting ‘active’ archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
ARC4: Archival started
ARC2: STARTING ARCH PROCESSES COMPLETE
PING[ARC2]: Heartbeat failed to connect to standby ‘test’. Error is 16456.
LGWR: Error 16456 disconnecting from destination LOG_ARCHIVE_DEST_2 standby host ‘test’
Error 16456 for archive log file 3 to ‘test’
Destination LOG_ARCHIVE_DEST_2 no longer supports SYNCHRONIZATION
LGWR: Failed to archive log 3 thread 1 sequence 1023 (16456)
Thread 1 advanced to log sequence 1023 (LGWR switch)
Current log# 3 seq# 1023 mem# 0: /u01/app/oracle11g/oradata/test/redo03.log
Archived Log entry 69 added for thread 1 sequence 1022 ID 0x7fc05e3b dest 1:
Starting background process CJQ0
Fri Apr 18 10:36:59 2014
CJQ0 started with pid=34, OS id=10232
Shutting down archive processes
ARCH shutting down
ARC4: Archival stopped
Fri Apr 18 10:42:24 2014
PING[ARC2]: Heartbeat failed to connect to standby ‘test’. Error is 16456.
3.继续测试:
$ rlwrap -s 9999 -c -r -i dgmgrl sys/xxxx@testdg
DGMGRL for Linux: Version 11.2.0.3.0 – 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type “help” for information.
Connected.
DGMGRL> show configuration
Configuration – study
Protection Mode: MaxPerformance
Databases:
testdg – Primary database
Error: ORA-16778: redo transport error for one or more databases
test – Physical standby database
Error: ORA-01034: ORACLE not available
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
–已经切换,只不过test不能传送。看来11g必须重启数据库。
然后关闭主数据库在打开看看。
SYS@test> shutdown immediate
ORA-01092: ORACLE instance terminated. Disconnection forced
SYS@test> startup mount
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2228784 bytes
Variable Size 1006636496 bytes
Database Buffers 587202560 bytes
Redo Buffers 7344128 bytes
Database mounted.
DGMGRL> show configuration verbose
Configuration – study
Protection Mode: MaxPerformance
Databases:
testdg – Primary database
test – Physical standby database
Properties:
FastStartFailoverThreshold = ’30’
OperationTimeout = ’30’
FastStartFailoverLagLimit = ’30’
CommunicationTimeout = ‘180’
FastStartFailoverAutoReinstate = ‘TRUE’
FastStartFailoverPmyShutdown = ‘TRUE’
BystandersFollowRoleChange = ‘ALL’
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> show database test
Database – test
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
test
Database Status:
SUCCESS
DGMGRL> show database testdg
Database – testdg
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
testdg
Database Status:
SUCCESS
–再备用库上检查log_archive_dest_2参数。
SYS@testdg> show parameter log_archive_dest_2
NAME TYPE VALUE
——————- ——- —————————————————————————————————-
log_archive_dest_2 string service=”test”, LGWR SYNC AFFIRM delay=0 optional compression=enable max_failure=0 max_connections=1
reopen=300 db_unique_name=”test” net_timeout=30, valid_for=(all_logfiles,primary_role)
SYS@test> select database_role,open_mode,protection_mode from v$database;
DATABASE_ROLE OPEN_MODE PROTECTION_MODE
—————- ——————– ——————–
PHYSICAL STANDBY READ ONLY WITH APPLY MAXIMUM PERFORMANCE
SYS@testdg> select database_role,open_mode,protection_mode from v$database;
DATABASE_ROLE OPEN_MODE PROTECTION_MODE
—————- ——————– ——————–
PRIMARY READ WRITE MAXIMUM PERFORMANCE
–从这里也可以确定切换完成。
4.切换回来看看。
DGMGRL> connect sys/xxxxx
Connected.
DGMGRL> switchover to test
Performing switchover NOW, please wait…
New primary database “test” is opening…
Operation requires shutdown of instance “testdg” on database “testdg”
Shutting down instance “testdg”…
ORACLE instance shut down.
Operation requires startup of instance “testdg” on database “testdg”
Starting instance “testdg”…
Unable to connect to database
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
Failed.
Warning: You are no longer connected to ORACLE.
Please complete the following steps to finish switchover:
start up instance “testdg” of database “testdg”
–ora-12514 现在明白了,需要把服务test_DGMGRL.com 注册为静态服务。这些相关操作通过_DGMGRL.进行的。
DGMGRL> show configuration
Configuration – study
Protection Mode: MaxPerformance
Databases:
test – Primary database
Error: ORA-16778: redo transport error for one or more databases
testdg – Physical standby database
Error: ORA-01034: ORACLE not available
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
DGMGRL> show database verbose test StaticConnectIdentifier
StaticConnectIdentifier = ‘(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=hisdg)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=test_DGMGRL.com)(INSTANCE_NAME=test)(SERVER=DEDICATED)))’
DGMGRL> show database verbose testdg StaticConnectIdentifier
StaticConnectIdentifier = ‘(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=testdb)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=testdg_DGMGRL.com)(INSTANCE_NAME=testdg)(SERVER=DEDICATED)))’
5.修改 listener.ora配置加入如下:
–test
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SDU=32767)
(GLOBAL_DBNAME = test.com)
(ORACLE_HOME = /u01/app/oracle11g/product/11.2.0/db_2)
(SID_NAME = test)
)
(SID_DESC =
(SDU=32767)
(GLOBAL_DBNAME = test_DGMGRL.com)
(ORACLE_HOME = /u01/app/oracle11g/product/11.2.0/db_2)
(SID_NAME = test)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = hisdg)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)
ADR_BASE_LISTENER = /u01/app/oracle11g
–testdg
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SDU=32767)
(GLOBAL_DBNAME = test.com)
(ORACLE_HOME = /u01/app/oracle11g/product/11.2.0/db_2)
(SID_NAME = testdg)
)
(SID_DESC =
(SDU=32767)
(GLOBAL_DBNAME = testdg_DGMGRL.com)
(ORACLE_HOME = /u01/app/oracle11g/product/11.2.0/db_2)
(SID_NAME = testdg)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = testdb)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)
ADR_BASE_LISTENER = /u01/app/oracle11g
–重启监听看看。两边都要做。
lsnrctl reload
$ lsnrctl service
LSNRCTL for Linux: Version 11.2.0.3.0 – Production on 18-APR-2014 11:17:30
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=hisdg)(PORT=1521)))
Services Summary…
Service “test.com” has 2 instance(s).
Instance “test”, status UNKNOWN, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0
LOCAL SERVER
Instance “test”, status READY, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
LOCAL SERVER
Service “testXDB.com” has 1 instance(s).
Instance “test”, status READY, has 1 handler(s) for this service…
Handler(s):
“D000” established:0 refused:0 current:0 max:1022 state:ready
DISPATCHER < hisdg pid:>
(ADDRESS=(PROTOCOL=tcp)(HOST=hisdg)(PORT=57796))
Service “test_DGB.com” has 1 instance(s).
Instance “test”, status READY, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0 state:ready
LOCAL SERVER
Service “test_DGMGRL.com” has 1 instance(s).
Instance “test”, status UNKNOWN, has 1 handler(s) for this service…
Handler(s):
“DEDICATED” established:0 refused:0
LOCAL SERVER
The command completed successfully>
–可以发现动态注册了Service “test_DGMGRL.com”。
6.再做切换看看。
–注意小细节,一定要连接要切换的备用库,再执行切换。不然依旧出现如下错误提示。
–实际上也很好理解。先要shut 主库,这个时候连接就断开了,无法在实现后面的步骤。
DGMGRL> switchover to testdg;
Performing switchover NOW, please wait…
New primary database “testdg” is opening…
Operation requires shutdown of instance “test” on database “test”
Shutting down instance “test”…
ORA-24327: need explicit attach before authenticating a user
Warning: You are no longer connected to ORACLE.
Please complete the following steps to finish switchover:
shut down instance “test” of database “test”
start up and mount instance “test” of database “test”
DGMGRL> connect sys/xxxxxx@testdg
Connected.
DGMGRL> switchover to testdg;
Performing switchover NOW, please wait…
New primary database “testdg” is opening…
Operation requires shutdown of instance “test” on database “test”
Shutting down instance “test”…
ORACLE instance shut down.
Operation requires startup of instance “test” on database “test”
Starting instance “test”…
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is “testdg”
DGMGRL> show configuration
Configuration – study
Protection Mode: MaxPerformance
Databases:
testdg – Primary database
test – Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> connect sys/xxxxxx@test
Connected.
DGMGRL> switchover to test;
Performing switchover NOW, please wait…
New primary database “test” is opening…
Operation requires shutdown of instance “testdg” on database “testdg”
Shutting down instance “testdg”…
ORACLE instance shut down.
Operation requires startup of instance “testdg” on database “testdg”
Starting instance “testdg”…
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is “test”
DGMGRL> show configuration
Configuration – study
Protection Mode: MaxPerformance
Databases:
test – Primary database
testdg – Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
总结:
1.一定要连接要切换的备用库,再执行切换。不然要手工关闭在启动原来的主库,现在是备用库。
2.配置_DGMGRL.为静态服务,我的感觉如果设置
edit database test set property staticConnectidentifier=’test’;
edit database test set property staticConnectidentifier=’testdg’;
应该也可以完成切换,留待下次测试。