标签归档:RAC

Oracle 10g RAC修改数据库实例与主机对应关系

RAC中使用dbca在创建集群数据库时,由于未在主节点[主机号小的节点]上启动图形界面创建数据库,导致最终创建好的集群数据库实例号与主机名号之间关系不对应,例如这里的实验环境如下:
instance_name etltest1对应主机名k1rac4
instance_name etltest2对应主机名k1rac2
下面用实验演示,如何将数据库实例号与主机号的对应关系改为如下:
instance_name etltest1对应主机名k1rac2
instance_name etltest2对应主机名k1rac4

---查看当前集群数据库实例与主机对应关系
[oracle@k1rac2 ~]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      ONLINE     ONLINE     k1rac4    
ora.etltest.etltest1.inst           ONLINE     ONLINE     k1rac4    
ora.etltest.etltest2.inst           ONLINE     ONLINE     k1rac2    
ora.etltest.etltests.cs             ONLINE     ONLINE     k1rac2    
ora.etltest.etltests.etltest2.srv   ONLINE     ONLINE     k1rac2    
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

---查看集群数据库实例spfile文件位置
SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------------------------
spfile                               string      /home/db/oracle/oradata/etl_spfile

---创建集群数据库实例pfile文件
SQL> create pfile='/tmp/pfileetl.ora' from spfile;

File created.

---将以下两行添加到新创建的集群数据库实例pfile文件中
etltest1.instance_name=etltest1
etltest2.instance_name=etltest2

---将主机k1rac2上数据库库实例etltest2密码文件改为实例etltest1所有
[oracle@k1rac2 dbs]$ mv orapwetltest2 orapwetltest1

---将主机k1rac2上数据库库实例etltest2 pfile文件改为实例etltest1所有
[oracle@k1rac2 dbs]$ more initetltest2.ora 
SPFILE='/home/db/oracle/oradata/etl_spfile'
[oracle@k1rac2 dbs]$ mv initetltest2.ora initetltest2.ora 

---将主机k1rac4上数据库库实例etltest1密码文件改为实例etltest2所有
[oracle@k1rac4 dbs]$ mv orapwetltest1 orapwetltest2

---将主机k1rac4上数据库库实例etltest1 pfile文件改为实例etltest2所有
[oracle@k1rac4 dbs]$ more initetltest1.ora
SPFILE='/home/db/oracle/oradata/etl_spfile'
[oracle@k1rac4 dbs]$ mv initetltest1.ora initetltest2.ora

---停止集群数据库实例
[oracle@k1rac2 dbs]$ srvctl stop database -d etltest
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      OFFLINE    OFFLINE              
ora.etltest.etltest1.inst           OFFLINE    OFFLINE              
ora.etltest.etltest2.inst           OFFLINE    OFFLINE              
ora.etltest.etltests.cs             OFFLINE    OFFLINE              
ora.etltest.etltests.etltest2.srv   OFFLINE    OFFLINE              
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

---更新集群数据库spfile文件
SQL> create SPFILE='/home/db/oracle/oradata/etl_spfile' from pfile='/tmp/pfileetl.ora';

File created.

---将数据库及实例从集群中移除
[oracle@k1rac2 dbs]$ srvctl remove database -d etltest
Remove the database etltest? (y/[n]) y
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

---重新将数据库及实例添加到集群
[oracle@k1rac2 dbs]$ srvctl add database -d etltest -o $ORACLE_HOME
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      OFFLINE    OFFLINE              
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    
[oracle@k1rac2 dbs]$ srvctl add instance -d etltest -i etltest1 -n k1rac2
[oracle@k1rac2 dbs]$ srvctl add instance -d etltest -i etltest2 -n k1rac4

---查看集群中数据库及实例状态
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      OFFLINE    OFFLINE              
ora.etltest.etltest1.inst           OFFLINE    OFFLINE              
ora.etltest.etltest2.inst           OFFLINE    OFFLINE              
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

---启动集群数据库
[oracle@k1rac2 dbs]$ srvctl start database -d etltest
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      ONLINE     ONLINE     k1rac2    
ora.etltest.etltest1.inst           ONLINE     ONLINE     k1rac2    
ora.etltest.etltest2.inst           ONLINE     ONLINE     k1rac4    
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

---添加相应的Service服务
[oracle@k1rac2 dbs]$ srvctl add service -d etltest -s etltests -r etltest1 -a etltest2 -P BASIC
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      ONLINE     ONLINE     k1rac2    
ora.etltest.etltest1.inst           ONLINE     ONLINE     k1rac2    
ora.etltest.etltest2.inst           ONLINE     ONLINE     k1rac4    
ora.etltest.etltests.cs             OFFLINE    OFFLINE              
ora.etltest.etltests.etltest1.srv   OFFLINE    OFFLINE              
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

---启动新添加的服务并查看集群状态
[oracle@k1rac2 dbs]$ srvctl start service -d etltest
[oracle@k1rac2 dbs]$ crs_stat.sh
Name                                Target     State      Host      
----------------------------------- ---------- ---------  -------   
ora.etltest.db                      ONLINE     ONLINE     k1rac2    
ora.etltest.etltest1.inst           ONLINE     ONLINE     k1rac2    
ora.etltest.etltest2.inst           ONLINE     ONLINE     k1rac4    
ora.etltest.etltests.cs             ONLINE     ONLINE     k1rac2         
ora.etltest.etltests.etltest1.srv   ONLINE     ONLINE     k1rac2         
ora.k1rac2.LISTENER_K1RAC2.lsnr     ONLINE     ONLINE     k1rac2    
ora.k1rac2.gsd                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.ons                      ONLINE     ONLINE     k1rac2    
ora.k1rac2.vip                      ONLINE     ONLINE     k1rac2    
ora.k1rac4.LISTENER_K1RAC4.lsnr     ONLINE     ONLINE     k1rac4    
ora.k1rac4.gsd                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.ons                      ONLINE     ONLINE     k1rac4    
ora.k1rac4.vip                      ONLINE     ONLINE     k1rac4    

--- 更新集群节点listener和tns文件

RAC中诡异的ORA-12545错误

这个错误是本月18号项目上线时遇到的,项目上线后应用开始进行测试,发现总是时断时续出现ORA-12545: Connect failed because target host or object does not exist. 这个错误提示很简单,一开始觉得这个错误按道理不应该出现,因为应用和中间件配的连接串配置是我这边提供的,其中设计到主机名和vip的地方我都改成了IP地址,而且还用了负载均衡,其连接串如下示例:

ORA10GTAF =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.23)(PORT = 1521))
     (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.24)(PORT = 1521))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = ora10gtaf)
       (FAILOVER_MODE =
         (BACKUP = ora10gtaf_preconnect)
         (TYPE = SELECT)
         (METHOD = PRECONNECT)
         (RETRIES = 180)
         (DELAY = 5)
       )
     )
   ) 

于是在Server端写了一个使用连接串方式每5秒连接一次数据的脚本去测试,结果等了30分钟,一次12545错误也没出现。由此开始怀疑应用和中间件与数据库之前的连接配置串有问题,或是他们擅自把我提供的连接配置串修改,然后去和应用的人核查,查完之后并未发现异常,无奈只能将前面在Server端每5秒连接一次数据的检查脚本放在一个Linux应用服务段测试,结果ORA-12545错误重现了,但并不是每次连接都出现ORA-12545错误,而是间歇性出现。后来仔细观察才发现,每次连接成功的都是连接到实例1,永远连接不上实例2上,然后为了能让其连上实例2上,我在连接串里面把1节点的连接地址及协议注销掉,只留节点2的连接信息,示例如下:

ORA10GTAF =
   (DESCRIPTION =
   # (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.23)(PORT = 1521))
     (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.24)(PORT = 1521))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = ora10gtaf)
       (FAILOVER_MODE =
         (BACKUP = ora10gtaf_preconnect)
         (TYPE = SELECT)
         (METHOD = PRECONNECT)
         (RETRIES = 180)
         (DELAY = 5)
       )
     )
   )

然后再用那个每5秒连接数据库检测脚本去测试,结果这次连接测试全部失败,没有一次连接成功,然后就开始思考为何出现这种情况,考虑到第一次测试时能连接成功的总是实例1,从没有一次能连接上实例2,第二次单独去测试连接实例2就出现全部连接失败的问题,突然想到RAC的负载均衡问题,由此判断实例2负载很可能大于1实例(负载指每个实例的连接会话数和服务器压力大小)。在实例2的负载大于实例1前提下,当第二次去单独测试连接实例2时,由于RAC的负载均衡机制,每次都会将连接实例2的请求重定向到实例1上,那么连接请求重定向如果失败是不是就出现了上面的ORA-12545错误呢?想到这一点我立即查了一下连个节点的负载情况:

SQL> select inst_id,count(*) from gv$session group by inst_id;
   INST_ID   COUNT(*)
---------- ----------
         1         30
         2         74

果然实例2的连接数远远大于实例1,刚好在处理的过程中在MOS上看到一篇于此类似的文章Client Connection to RAC Intermittently Fails-ORA-12545 TNS: Host or Object Does not Exist (Doc ID 364855.1),看完解开了心中疑惑。

Symptoms
When attempting to connect to a RAC service name, the connection intermittently fails with an ORA-12545 error:  TNS: Host or object doesn't exist.   The expected behavior is that the connection would be redirected to any of the listener in the cluster where the listener is running on the VIP.  This is normal under RAC Cluster node load balancing. 
However, when the server side listener endpoints are not correctly configured OR the client cannot resolve all forms of the VIP hostname, this error can get thrown.
 
Cause
Whe LOCAL_LISTENER is not set at all or is not set correctly, PMON can register both the VIP hostname and the physical hostname with the REMOTE_LISTENER.  This is incorrect in RAC but if it does occur, the client can intermittently be routed to a listener endpoint on either of these addresses.  If the client is unable to resolve the physical hostname or even the vip hostname, the connection will fail with ORA-12545.
Additionall, if the VIP hostname cannot be resolved, but the VIP numerical address is reachable; then this VIP numerical address should be used as the HOST value for LOCAL_LISTENER instead.
And important first set is to check the output for :  lsnrctl services <listener_name>
You should not see the physical hostname in the handler information for your RAC instances.   Here's an example of an incorrect lsnrctl services <listener_name> output for a 2-node RAC cluster:
------------------------------------
lsnrctl services oracle_rac_listener 
Service "oracle.oracle.com" has 2 instance(s). 
Instance "orcl1", status READY, has 2 handler(s) for this service... 
Handler(s): 
"DEDICATED" established:0 refused:0 state:ready 
REMOTE SERVER 
(ADDRESS=(PROTOCOL=TCP)(HOST=myphysicalhost1)(PORT=1521)) 
"DEDICATED" established:326 refused:0 state:ready 
LOCAL SERVER 
 
Instance "orcl2", status READY, has 1 handler(s) for this service... 
Handler(s): 
"DEDICATED" established:394371 refused:0 state:ready 
REMOTE SERVER 
(ADDRESS=(PROTOCOL=TCP)(HOST=myphysicalhost2)(PORT=1521))
 
Here's an example of what it ought to look like when LOCAL_LISTENER is configured correctly:
------------------------------------
lsnrctl services oracle_rac_listener 
Service "oracle.oracle.com" has 2 instance(s). 
Instance "orcl1", status READY, has 2 handler(s) for this service... 
Handler(s): 
"DEDICATED" established:0 refused:0 state:ready 
REMOTE SERVER 
(ADDRESS=(PROTOCOL=TCP)(HOST=myVIP_host1)(PORT=1521)) 
"DEDICATED" established:326 refused:0 state:ready 
LOCAL SERVER 
 
Instance "orcl2", status READY, has 1 handler(s) for this service... 
Handler(s): 
"DEDICATED" established:394371 refused:0 state:ready 
REMOTE SERVER 
(ADDRESS=(PROTOCOL=TCP)(HOST=myVIP_host2)(PORT=1521))
 
Solution
Log in with privileges to the instance and issue the following commands so that LOCAL_LISTENER is set correctly:
-----------------------------------
alter system set LOCAL_LISTENER="(address=(protocol=tcp)(port=1521)(host=<your_vip_node1>))" scope=both sid='INSTANCE_NAME1';
Do the same for the 2nd instance where host is set to the <vip_host_node2> and the sid is set to the 2nd instance name.
alter system set LOCAL_LISTENER="(address=(protocol=tcp)(port=1521)(host=<your_vip_node2>))" scope=both sid='INSTANCE_NAME2'; 
 
Now the correct address using the VIP hostname will be registered against the listeners in the cluster.
 
The client should be able to resolve all forms of the VIP host via their /etc/hosts file.  i.e. short name and fully qualified.
In the case that a VIP hostname cannot be resolved by the client, the following syntax can be used with LOCAL_LISTENER to tell the client to connect using an IP address instead of the VIP hostname
-----------------------------------
alter system set LOCAL_LISTENER="(address=(protocol=tcp)(port=1521)(host=1.2.3.4))"
... where "1.2.3.4" is the IP address of your VIP interface

上面从MOS上贴出来的这些内容足以说明,RAC的负载均衡机制在将连接请求重定向时,如果client端未解析vip名时,而local_listener又未直接使用具体值而是标识符时,极有可能出现ORA-12545错误,解决办法上面已经说了,当然如果条件允许还可以考虑统一使用域名服务器解析。