作者归档:Lynn

关于Lynn

Oracle DBA is my Occupation dream, Focus on Oracle, Love the life! I hope I can make a difference in this industry.

ORAchk Health Checks For The Oracle Stack

ORAchk是一个数据库健康检查评估工具,可以帮助我们发现系统及数据库所存在的各种问题,常用于数据库故障及性能问题调查时有力工具。

目前在生产中使用越来越多,之前该工具命名为RACcheck,后改名为ORAchk [RACcheck Configuration Audit Tool Statement of Direction – name change to ORAchk (Doc ID 1591208.1)],关于该工具详细信息请参考MOS:
ORAchk – Health Checks for the Oracle Stack (Doc ID 1268927.2)

Orachk/UserGuide/Example Reports下载:http://pan.baidu.com/s/1bn4LhSF 下载密码:j7eh

ORA-600 [4511]

客户10.2.0.5 RAC数据库日常检查时发现告警日志出现ORA-600 [4511]错误:
******部分信息已做处理******
Tue Dec 23 11:15:01 CST 2014
Errors in file /home/db/oracle/admin/******/bdump/******_mmon_3957.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Tue Dec 23 11:15:44 CST 2014
Trace dumping is performing id=[cdmp_20141223111544]
Tue Dec 23 11:18:29 CST 2014
Trace dumping is performing id=[cdmp_20141223111829]
Tue Dec 23 11:20:00 CST 2014
Restarting dead background process MMON
MMON started with pid=31, OS id=9047

查看对应的trace文件******_mmon_3957.trc提取出如下信息:
##############################
*** SERVICE NAME:(SYS$BACKGROUND) 2014-12-23 11:15:01.036
*** SESSION ID:(1062.6626) 2014-12-23 11:15:01.036
*** 2014-12-23 11:15:01.036
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Current SQL statement for this session:
delete from WRI$_ALERT_OUTSTANDING where reason_id = :1 and object_id = :2 and subobject_id = :3 and internal_instance_number = :4 returning owner, object_name, subobject_name, sequence_id, error_instance_id, creation_time into :5, :6, :7, :8, :9, :10
##############################
Block header dump:  0x00c007e2
 Object id on Block? Y
 seg/obj: 0x2286  csc: 0x04.7eefed77  itc: 4  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0xc007df ver: 0x01 opc: 0
     inc: 0  exflg: 0
##############################
Block header dump:  0x00c051c1
 Object id on Block? Y
 seg/obj: 0x2285  csc: 0x04.7ef4ab8c  itc: 2  flg: E  typ: 1 – DATA
     brn: 1  bdba: 0xc007d9 ver: 0x01 opc: 0
     inc: 0  exflg: 0

根据MOS上ORA-600 [4511] “Row locked by inactive ITL” (Doc ID 39553.1)对ORA-600 [4511]的描述:
DESCRIPTION:
We receive this error when a row lock is being tested and the lock is not active at the time.
This may also be caused by having more than 4095 rows in a single data block. Only blocks of 32Kb can contain this many rows. The workaround is to limit the number of rows in the block, by setting pctfree to a higher value.
FUNCTIONALITY:
Transaction block operations.
IMPACT:
Process failure, possible block corruption.
SUGGESTIONS:
If you know the table(s) involved, use ANALYZE TABLE <table_name> VALIDATE STRUCTURE CASCADE, to check for possible problems.
Run DBVERIFY on any datafiles associated with the table(s).
This can be related to a Consistent Read (CR) problem, so shutdown and restart the instance and see if the problem reproduces.

结合trace文件里提到的block seg/obj: 0x2286和seg/obj: 0x2285找到了对应的objects

SQL> select object_name,object_type,owner from dba_objects where data_object_id in (8837,8838)

OBJECT_NAME                    OBJECT_TYPE         OWNER
------------------------------ ------------------- ------------------------------
WRI$_ALERTS_OUTSTANDING_PK     INDEX               SYS
WRI$_ALERT_OUTSTANDING         TABLE               SYS

以上查询结果已经很清晰,就是我们trace文件里提到的导致session失败的SQL语句中的表WRI$_ALERT_HISTORY及其索引,按照MOS [39553.1]描述,重建表索引及对该表进行分析:
SQL> alter index WRI$_ALERTS_OUTSTANDING_PK rebuild;

Index altered.

SQL> analyze table WRI$_ALERT_OUTSTANDING validate structure cascade; 

Table analyzed.

对表所在的system表空间数据文件进行dbv校验未发现坏块:
$ dbv file=’/home/db/oracle/oradata/******/system01.dbf’ block_size=32768

最后重启数据库,数据库alert日志再未出现该ORA-600 [4511]核心错误。