首页 » ORACLE 9i-23c » troubleshooting ORA-600[6302] & ORA-600 [6200] corrupted index block

troubleshooting ORA-600[6302] & ORA-600 [6200] corrupted index block

今天同事的一套数据库遇到了alert 日志 一堆的ora-600,这是一套10.2.0.5 db 2nodes RAC on AIX , 找我帮着看看, 最终确认为一个索引出现了block corrupted 勘误块导致, 重新创建的了索引后解决, 下面简单记录一下我的诊断过程 。

# db alert log file #

Sun Feb 01 07:14:46 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:15:07 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:15:07 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [ksu_register_tac-1], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:15:09 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [ksu_register_tac-1], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:15:38 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:15:46 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:16:09 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:16:40 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 07:16:46 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
ORA-07445: exception encountered: core dump [kghrst+020c] [SIGSEGV] [Address not mapped to object] [0xF08C704140910] [] []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []

===============
Sun Feb 01 12:26:54 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_56230004.trc:
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Sun Feb 01 12:27:53 BEIST 2015
SMON: Parallel transaction recovery tried
Sun Feb 01 12:28:04 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_53215708.trc:
ORA-00600: internal error code, arguments: [6200], [18446744073709551615], [], [], [], [], [], []
Sun Feb 01 12:28:04 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_53215708.trc:
ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0x105C4172A2B1034] [] []
ORA-00600: internal error code, arguments: [6200], [18446744073709551615], [], [], [], [], [], []
Sun Feb 01 12:28:04 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_53215708.trc:
ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0x105C4172A2B1034] [] []
ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0x105C4172A2B1034] [] []
ORA-00600: internal error code, arguments: [6200], [18446744073709551615], [], [], [], [], [], []
Sun Feb 01 12:28:04 BEIST 2015

==============
Sun Feb 01 14:28:37 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_51249546.trc:
ORA-00600: internal error code, arguments: [6300], [], [], [], [], [], [], []
Sun Feb 01 14:28:37 BEIST 2015
Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_51249546.trc:
ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0xD2007308C704241D] [] []
ORA-00600: internal error code, arguments: [6300], [], [], [], [], [], [], []

TIP:
错误从早上7点开始,一直到下午都在不断有ORA-600 ORA-7445 的错误信息写入alert,错误信息眼花缭乱,当多个错误出现是通常是从最早产生的错误查起,也许后面是附加的错误, 我们先从第一个ora-600 开始,ora-600 6302 是验证index key 相关的错误。

# ora-600 6302 trace file #

key1 (15104): 
 ab 08 c7 04 24 15 01 42 59 46 05 c4 11 04 5f 34 00 00 18 4d f0 ef 00 0f 08
 c7 04 24 15 01 42 59 46 05 c4 11 3d 45 2e 00 00 5b 47 7d 8b 00 0e 08 c7 04
 24 15 01 42 59 46 05 c4 12 32 33 0a 00 00 5c 4b 0b 1c 00 82 08 c7 04 24 15
 01 42 59 46 05 c4 12 3f 37 13 00 00 5e 4d 44 e3 00 e4 08 c7 04 24 15 01 42
 59 46 05 c4 12 41 4c 2d 00 00 5d 46 cb 5f 00 59 08 c7 04 24 15 01 42 59 46
...
 04 24 15 01 42 58 61 05 c4 12 45 25 3c 00 00 19 0d 8b 5c 00 4b 08 c7 04 24
 15 01 42 58
key2 (4):  24 15 01 42
*** 2015-02-01 07:14:46.313
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
Current SQL statement for this session:
MERGE INTO TEST t1
USING (SELECT :mincycle          as MIN_CYCLE_ID,
              :maxcycle          as MAX_CYCLE_ID,
              :freevol          as FREE_VOLUME,
              :usagevol          as USAGE_VOLUME,
              :overvol          as OVER_VOLUME,
              :tariffplan          as TARIFF_PLAN_ID,
              to_date(:endtime           , 'YYYYMMDDhh24miss') as CDR_END_TIME,
              :key_id          as KEY_ID,
              :counterid          as COUNTER_ID,
              :attrid          as ATTR_ID,
              :subsid          as SUBS_ID
         from dual) t2
ON (t1.KEY_ID = t2.KEY_ID and t1.COUNTER_ID = t2.COUNTER_ID and t1.ATTR_ID = t2.ATTR_ID and t1.SUBS_ID = t2.SUBS_ID)
WHEN MATCHED THEN
  update
     set t1.MIN_CYCLE_ID   = t2.MIN_CYCLE_ID,
         t1.MAX_CYCLE_ID   = t2.MAX_CYCLE_ID,
         t1.FREE_VOLUME    = t2.FREE_VOLUME,
         t1.USAGE_VOLUME   = t2.USAGE_VOLUME,
         t1.OVER_VOLUME    = t2.OVER_VOLUME,
         t1.TARIFF_PLAN_ID = t2.TARIFF_PLAN_ID,
         t1.CDR_END_TIME   = t2.CDR_END_TIME
WHEN NOT MATCHED THEN
  insert
    (t1.MIN_CYCLE_ID,
     t1.MAX_CYCLE_ID,
     t1.FREE_VOLUME,
     t1.USAGE_VOLUME,
     t1.OVER_VOLUME,
     t1.TARIFF_PLAN_ID,
     t1.CDR_END_TIME,
     t1.KEY_ID,
     t1.COUNTER_ID,
     t1.ATTR_ID,
     t1.SUBS_ID)
  values
    (t2.MIN_CYCLE_ID,
     t2.MAX_CYCLE_ID,
     t2.FREE_VOLUME,
     t2.USAGE_VOLUME,
     t2.OVER_VOLUME,
     t2.TARIFF_PLAN_ID,
     t2.CDR_END_TIME,
     t2.KEY_ID,
     t2.COUNTER_ID,
     t2.ATTR_ID,
     t2.SUBS_ID)
----- Call Stack Trace -----
kgeriv kgeasi kdksep kdxtms kdisle kdiins0 kdiinsp kauxsin insidx insflush qerusfetch updaul upsexe

...
 Object id on Block? Y
 seg/obj: 0x1adb65  csc: 0xce8.dbee5287  itc: 3  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x5c4e4d05 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0f47.03a.000b6058  0x5e027858.67b7.06  --U-    3  fsc 0x0000.dbee5ae5
0x02   0x1074.040.000858a2  0x010b3d2f.4bab.06  ----    5  fsc 0x0000.00000000
0x03   0x0ff1.000.000cda32  0x5e04bd1d.6e0b.8d  C---    0  scn 0x0ce8.dbe90591
...

Tip:
可以看到是有一条merge insert 引起,在MOS 中search一圈没有找到相关信息,我们知道是与index相关的,那么trace 中的key 是不是就是index key呢?先查一下trace中的错误对象

SQL> select to_number('1adb65','xxxxxxxxxxx') from dual;

TO_NUMBER('1ADB65','XXXXXXXXXXX')
---------------------------------
                          1760101
						  
select owner,object_name,object_type from dba_objects where object_id=1760101;
OWNER                OBJECT_NAME                    OBJECT_TYPE
-------------------- ------------------------------ -------------------------
USERINFO             TEST                   TABLE PARTITION

TIP:
确定是我们SQL语句中的表,那下一步查询表上的所有索引,重点是两列的组合索引

SQL> select 
  2       c.table_owner ind_table_owner1,
  3       c.table_name ind_table_name1, 
  4       c.index_name ind_index_name1, 
  5       c.column_position ind_column_position1, 
  6       c.column_name ind_column_name1, 
  7       decode(c.descend,'DESC','DESC',null) ind_desc1,i.index_type,t.data_type,i.status 
  8      from 
  9       dba_ind_columns c
 10       join dba_indexes i on c.index_name=i.index_name
 11      join dba_tab_columns t on t.owner=i.table_owner and t.table_name=i.table_name and t.column_name=c.column_name
 12      where
 13      i.table_owner='USERINFO' and i.table_name='TEST' order by 2,3;

IND_TABLE_OWNER IND_TABLE_NAME1 IND_INDEX_NAME1      IND_COLUMN_POSITION1 IND_COLUMN_NAME IND_ INDEX_TY DATA_TYPE  STATUS
--------------- --------------- -------------------- -------------------- --------------- ---- -------- ---------- ----------
USERINFO        TEST    IDX1_TEST                       1 MIN_CYCLE_ID         NORMAL   NUMBER     N/A
USERINFO        TEST    IDX1_TEST                       2 MAX_CYCLE_ID         NORMAL   NUMBER     N/A
USERINFO        TEST    INX_TEST                        1 KEY_ID               NORMAL   NUMBER     N/A
USERINFO        TEST    INX_TEST                        2 COUNTER_ID           NORMAL   NUMBER     N/A

Tip:
表上有两个两列的组合索引,不过列上都是number 类型, 因为第二个key 值比较短,那我们比较容易转换成真实的十进制数值,然后确认是哪个索引,转换方法参考我以前的笔记还原data block dumps实际值

key2 (4):  24 15 01 42

SQL> select to_number('24','xxxxxxxxxxxxx') from dual;
TO_NUMBER('24','XXXXXXXXXXXXX')
-------------------------------
                             36
SQL> c/24/15
  1* select to_number('15','xxxxxxxxxxxxx') from dual
SQL> /

TO_NUMBER('15','XXXXXXXXXXXXX')
-------------------------------
                             21
SQL> c/15/01
  1* select to_number('01','xxxxxxxxxxxxx') from dual
SQL> /
TO_NUMBER('01','XXXXXXXXXXXXX')
-------------------------------
                              1
SQL> c/01/42
  1* select to_number('42','xxxxxxxxxxxxx') from dual
SQL> /
TO_NUMBER('42','XXXXXXXXXXXXX')
-------------------------------
                             66

TIP:
那拼进来就是3621166, 然后这也就是索引第二列的值,因为目前我们还没确认是哪个索引,所以尝试把两个索引的第二列都查一下,看哪个值存在?

select min_cycle_id from  userinfo.TEST where max_cycle_id=3621166 ;

SQL> select key_id from  userinfo.TEST where counter_id=3621166;

    KEY_ID
----------
3.1508E+12
3.1908E+12
3.3520E+12

Tip:
那到这里我们就可以初步判断是INX_TEST这个索引出了问题。我们再找另一个ora-600 确认下,ora-600 6300的trace文件没有可用信息,看ora-600 6200这是一个index block corrupt相关的错误

# ora-600 6200 trace file #

Block Checking: DBA = 526172808, Block Type = KTB-managed data block
**** row 19: key out of order
**** row 40: key out of order
**** row 101: key out of order
**** row 123: row length 65543 past end of block
**** row 123: row skipped so other stats may be wrong
**** row 196: row length 65543 past end of block
**** row 196: row skipped so other stats may be wrong
**** row 197: key out of order
**** row 243: row length 65543 past end of block
**** row 243: row skipped so other stats may be wrong
**** row 250: row length 65543 past end of block
**** row 250: row skipped so other stats may be wrong
**** row 251: row length 65543 past end of block
**** row 251: row skipped so other stats may be wrong
**** row 252: row length 65543 past end of block
**** row 252: row skipped so other stats may be wrong
**** row 253: row length 65543 past end of block
**** row 253: row skipped so other stats may be wrong
**** row 254: row length 65543 past end of block
**** row 254: row skipped so other stats may be wrong
**** row 255: row length 65543 past end of block
**** row 255: row skipped so other stats may be wrong
**** row 256: row length 65543 past end of block
**** row 256: row skipped so other stats may be wrong
**** row 257: key out of order
**** row 518: row length 65543 past end of block
**** row 518: row skipped so other stats may be wrong
**** row 519: row length 65543 past end of block
**** row 519: row skipped so other stats may be wrong
**** row 520: row length 65543 past end of block
**** row 520: row skipped so other stats may be wrong
**** actual rows marked deleted = 1 != kdxlende = 0
**** key (begin=0x23fd, len=71) overlaps with another
        begin = 0x2414 len = 23
---- end index block validation
Block header dump:  0x1f5cc288
 Object id on Block? Y
 seg/obj: 0x1b0086  csc: 0xce8.dbee5f7f  itc: 3  flg: E  typ: 2 - INDEX
     brn: 3  bdba: 0x1d8d9e05 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0032.056.01c81b4c  0x01070ee9.78ea.01  CBU-    0  scn 0x0ca6.df9c8ce1
0x02   0x1090.00a.0004c828  0x5e015267.174a.32  C---    0  scn 0x0ce8.db5c9770
0x03   0x0ff4.062.00107720  0x0100e967.7562.4e  C---    0  scn 0x0ce8.d449ede4

TIP:
这是一个索引块,下面我们来确认错误对象

SQL> select to_number('1b0086','xxxxxxxxxx') from dual;
TO_NUMBER('1B0086','XXXXXXXXXX')
--------------------------------
                         1769606
						 
SQL> select owner,object_name,object_type from dba_objects where object_id=1769606;
OWNER      OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ -------------------
USERINFO   INX_TEST               INDEX PARTITION

Note:
和上面我们用ora-600 6302 trace 推断的对象是一样的,下面我们根据dba 确认一下数据块的位置

SQL> select to_char(526172808,'xxxxxxxxxxxx') from dual;
TO_CHAR(52617
-------------
     1f5cc288

SQL> @dba 1f5cc288
    RFILE#     BLOCK# BIGFILE_BLOCK# DUMP_CMD
---------- ---------- -------------- ---------------
       125    1884808      526172808 -- alter system dump datafile 125 block 1884808

SQL> select name from v$datafile where RFILE#=125;
NAME
--------------------------------------------
/dev/rzwc_lv30_088

> dbv file=/dev/rzwc_lv30_088 blocksize=16384

DBVERIFY: Release 10.2.0.5.0 - Production on Sun Feb 1 15:58:23 2015

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = /dev/rzwc_lv30_088
Block Checking: DBA = 526172808, Block Type = KTB-managed data block
**** row 19: key out of order
**** row 40: key out of order
**** row 101: key out of order
**** row 123: row length 65543 past end of block
**** row 123: row skipped so other stats may be wrong
**** row 196: row length 65543 past end of block
**** row 196: row skipped so other stats may be wrong
**** row 197: key out of order
**** row 243: row length 65543 past end of block
**** row 243: row skipped so other stats may be wrong
**** row 250: row length 65543 past end of block
**** row 250: row skipped so other stats may be wrong
**** row 251: row length 65543 past end of block
**** row 251: row skipped so other stats may be wrong
**** row 252: row length 65543 past end of block
**** row 252: row skipped so other stats may be wrong
**** row 253: row length 65543 past end of block
**** row 253: row skipped so other stats may be wrong
**** row 254: row length 65543 past end of block
**** row 254: row skipped so other stats may be wrong
**** row 255: row length 65543 past end of block
**** row 255: row skipped so other stats may be wrong
**** row 256: row length 65543 past end of block
**** row 256: row skipped so other stats may be wrong
**** row 257: key out of order
**** row 518: row length 65543 past end of block
**** row 518: row skipped so other stats may be wrong
**** row 519: row length 65543 past end of block
**** row 519: row skipped so other stats may be wrong
**** row 520: row length 65543 past end of block
**** row 520: row skipped so other stats may be wrong
**** actual rows marked deleted = 1 != kdxlende = 0
**** key (begin=0x23fd, len=71) overlaps with another
        begin = 0x2414 len = 23
---- end index block validation
Page 1884808 failed with check code 6401


DBVERIFY - Verification complete

Total Pages Examined         : 1966016
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 1923445
Total Pages Failing   (Index): 1
Total Pages Processed (Other): 24369
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 18202
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 4043020581 (3304.4043020581)
kdzwc2_bds:/home/oracle> 

Note:
最终我们确认了是数据文件/dev/rzwc_lv30_088 block# 1884808上存的index block corrupt导致的这一系列ora-600 ora-7445, 解决方法可以考虑drop index, re-create index。 后来也得到确认索引重建后错误没有再出现。这类corrupte block可能是因为突然断电,内存勘误,写错误,oracle bug等原因,bug问题可以提SR 确认。

— enjoy—

打赏

,

对不起,这篇文章暂时关闭评论。