客户10.2.0.5 RAC数据库日常检查时发现告警日志出现ORA-600 [4511]错误:
******部分信息已做处理******
Tue Dec 23 11:15:01 CST 2014
Errors in file /home/db/oracle/admin/******/bdump/******_mmon_3957.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Tue Dec 23 11:15:44 CST 2014
Trace dumping is performing id=[cdmp_20141223111544]
Tue Dec 23 11:18:29 CST 2014
Trace dumping is performing id=[cdmp_20141223111829]
Tue Dec 23 11:20:00 CST 2014
Restarting dead background process MMON
MMON started with pid=31, OS id=9047
查看对应的trace文件******_mmon_3957.trc提取出如下信息:
##############################
*** SERVICE NAME:(SYS$BACKGROUND) 2014-12-23 11:15:01.036
*** SESSION ID:(1062.6626) 2014-12-23 11:15:01.036
*** 2014-12-23 11:15:01.036
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Current SQL statement for this session:
delete from WRI$_ALERT_OUTSTANDING where reason_id = :1 and object_id = :2 and subobject_id = :3 and internal_instance_number = :4 returning owner, object_name, subobject_name, sequence_id, error_instance_id, creation_time into :5, :6, :7, :8, :9, :10
##############################
Block header dump: 0x00c007e2
Object id on Block? Y
seg/obj: 0x2286 csc: 0x04.7eefed77 itc: 4 flg: E typ: 2 – INDEX
brn: 0 bdba: 0xc007df ver: 0x01 opc: 0
inc: 0 exflg: 0
##############################
Block header dump: 0x00c051c1
Object id on Block? Y
seg/obj: 0x2285 csc: 0x04.7ef4ab8c itc: 2 flg: E typ: 1 – DATA
brn: 1 bdba: 0xc007d9 ver: 0x01 opc: 0
inc: 0 exflg: 0
根据MOS上ORA-600 [4511] “Row locked by inactive ITL” (Doc ID 39553.1)对ORA-600 [4511]的描述:
DESCRIPTION:
We receive this error when a row lock is being tested and the lock is not active at the time.
This may also be caused by having more than 4095 rows in a single data block. Only blocks of 32Kb can contain this many rows. The workaround is to limit the number of rows in the block, by setting pctfree to a higher value.
FUNCTIONALITY:
Transaction block operations.
IMPACT:
Process failure, possible block corruption.
SUGGESTIONS:
If you know the table(s) involved, use ANALYZE TABLE <table_name> VALIDATE STRUCTURE CASCADE, to check for possible problems.
Run DBVERIFY on any datafiles associated with the table(s).
This can be related to a Consistent Read (CR) problem, so shutdown and restart the instance and see if the problem reproduces.
结合trace文件里提到的block seg/obj: 0x2286和seg/obj: 0x2285找到了对应的objects
SQL> select object_name,object_type,owner from dba_objects where data_object_id in (8837,8838)
OBJECT_NAME OBJECT_TYPE OWNER
------------------------------ ------------------- ------------------------------
WRI$_ALERTS_OUTSTANDING_PK INDEX SYS
WRI$_ALERT_OUTSTANDING TABLE SYS
以上查询结果已经很清晰,就是我们trace文件里提到的导致session失败的SQL语句中的表WRI$_ALERT_HISTORY及其索引,按照MOS [39553.1]描述,重建表索引及对该表进行分析:
SQL> alter index WRI$_ALERTS_OUTSTANDING_PK rebuild;
Index altered.
SQL> analyze table WRI$_ALERT_OUTSTANDING validate structure cascade;
Table analyzed.
对表所在的system表空间数据文件进行dbv校验未发现坏块:
$ dbv file=’/home/db/oracle/oradata/******/system01.dbf’ block_size=32768
最后重启数据库,数据库alert日志再未出现该ORA-600 [4511]核心错误。