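This is a very typical bug in Data Guard environments, present in Oracle versions 11.1 through 12.1. After a switchover or failover in a Physical Standby Data Guard configuration, validation of an invalid SCN on an index block raises ORA-600 [2663], sometimes accompanied by ORA-600 [ktbdchk1: bad dscn]. The bug does not cause data loss in the affected block, and the index block itself is not corrupted. A brief record follows.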
# db alert log
2019-09-17 03:05:37.019000 +08:00
Reconfiguration complete
2019-09-17 03:05:48.233000 +08:00
Thread 1 advanced to log sequence 230870 (LGWR switch)
  Current log# 4 seq# 230870 mem# 0: /dev/yyc_oravg02/ryyc_redo04
Archived Log entry 416230 added for thread 1 sequence 230869 ID 0x1fcb56a7 dest 1:
LNS: Standby redo logfile selected for thread 1 sequence 230870 for destination LOG_ARCHIVE_DEST_2
2019-09-17 04:04:06.890000 +08:00
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_21423.trc (incident=3797670):
ORA-00600: internal error code, arguments: [2663], [3850], [1033791888], [3850], [1616068854], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_3797670/anbob1_ora_21423_i3797670.trc
2019-09-17 04:04:16.898000 +08:00
Dumping diagnostic data in directory=[cdmp_20190917040416], requested by (instance=1, osid=21423), summary=[incident=3797670].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_19843.trc (incident=3796846):
ORA-00600: internal error code, arguments: [2663], [3850], [1033876871], [3850], [1616068854], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_3796846/anbob1_ora_19843_i3796846.trc
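The four numeric arguments after [2663] are commonly read as the current SCN (wrap, base) followed by the dependent SCN (wrap, base) stored in the block. The sketch below assumes that interpretation and the classic layout of full SCN = wrap * 2^32 + base; the helper name `decode_ora600_2663` is hypothetical and not an Oracle tool. It decodes the first incident from the alert log to show how far the block's SCN is ahead of the current SCN.

```python
import re

# Assumption: arguments are [2663], [cur_wrap], [cur_base], [dep_wrap], [dep_base].
PATTERN = re.compile(
    r"ORA-00600: internal error code, arguments: "
    r"\[2663\], \[(\d+)\], \[(\d+)\], \[(\d+)\], \[(\d+)\]"
)


def decode_ora600_2663(line):
    """Return (current_scn, dependent_scn) from an ORA-600 [2663] line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    cur_wrap, cur_base, dep_wrap, dep_base = (int(g) for g in match.groups())
    # Assumed layout: full SCN = wrap * 2**32 + base.
    current_scn = (cur_wrap << 32) + cur_base
    dependent_scn = (dep_wrap << 32) + dep_base
    return current_scn, dependent_scn


# First incident line copied from the alert log above.
line = (
    "ORA-00600: internal error code, arguments: [2663], [3850], "
    "[1033791888], [3850], [1616068854], [], [], [], [], [], [], []"
)
current_scn, dependent_scn = decode_ora600_2663(line)
print("current SCN   :", current_scn)
print("dependent SCN :", dependent_scn)
print("block SCN ahead by:", dependent_scn - current_scn)
```

With both wraps equal to 3850, the difference comes entirely from the base values, so under this reading the index block carries an SCN higher than the database's current SCN, which is the condition the error reports.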
# trace file
ORA-00600: internal error code, arguments: [2663], [3850], [1033791888], [3850], [1616068854], [], [], [], [], [], [], []
========= Dump for incident 3797670 (ORA 600 [2663]) ========
*** 2019-09-17 04:04:06.912
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=aa2tagwuur40v) -----
INSERT INTO BUSINESS_xxx values(xxxx);
----- Call Stack Trace -----
calling call entry
location type point
-------------------- -------- --------------------
skdstdst()+64 call kgdsdst()
ksedst()+432 call skdstdst()
dbkedDefDump()+1440 call ksedst()
ksedmp()+64 call dbkedDefDump()
ksfdmp()+96 call ksedmp()
$cold_dbgexPhaseII( call ksfdmp()
)+576
dbgexProcessError() call $cold_dbgexPhaseII(
+2096 )
dbgeExecuteForError call dbgexProcessError()
()+288
dbgePostErrorKGE()+ call dbgeExecuteForError
2368 ()
dbkePostKGE_kgsf()+ call dbgePostErrorKGE()
128
kgeade()+496 call dbkePostKGE_kgsf()
kgeriv_int()+176 call kgeade()
kgeriv()+48 call kgeriv_int()
kgesiv()+192 call kgeriv()
ksesic4()+176 call kgesiv()
kcrfw_redo_gen()+29 call ksesic4()
12
kcbchg1_main()+9056 call kcrfw_redo_gen()
kcbchg1()+352 call kcbchg1_main()
ktbgfc()+1104 call kcbchg1()
$cold_ktbgcl1()+451 call ktbgfc()
2
ktbgfi()+3424 call $cold_ktbgcl1()
kdiins0()+275376 call ktbgfi()
kdiinsp()+368 call kdiins0()
kauxsin()+3120 call kdiinsp()
qesltcLoadIndexList call kauxsin()
()+1936
qesltcLoadIndexes() call qesltcLoadIndexList
+96 ()
qerltcNoKdtBuffered call qesltcLoadIndexes()
InsRowCBK()+752
qerltcSingleRowLoad call qerltcNoKdtBuffered
()+608 InsRowCBK()
qerltcFetch()+688 call qerltcSingleRowLoad
()
insexe()+1584 call qerltcFetch()
opiexe()+16192 call insexe()
kpoal8()+4624 call opiexe()
opiodr()+2416 call kpoal8()
ttcpip()+1792 call opiodr()
opitsk()+3024 call ttcpip()
opiino()+1696 call opitsk()
opiodr()+2416 call opiino()
opidrv()+1616 call opiodr()
sou2o()+256 call opidrv()
opimai_real()+656 call sou2o()
ssthrdmain()+576 call opimai_real()
main()+336 call ssthrdmain()
main_opd_entry()+80 call main()
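Solution
1. Install Patch 22241601. The original problem is Bug 8895202; installing 22241601 enables the fix for 8895202.
2. If your database version is >= 11.2.0.2, you only need to set the parameter _ktb_debug_flags=8. Once the patch from step 1 is installed, this behavior is enabled by default.
3. After either fix is in place, the invalid SCN on the index block is repaired automatically the next time the database reads or writes that block. Blocks that have not been touched since may still report errors in dbv.
4. If the methods above do not fix the problem, rebuild the index on the primary db.
5. Upgrade to 12cR2 or later.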
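The version thresholds in the list above can be sketched as a small decision helper. This is only an illustration of the list's logic: the function names `parse_version` and `suggest_remedy` are hypothetical, whether Patch 22241601 exists for a given release is an assumption to verify on My Oracle Support, and the index rebuild fallback from step 4 is not modeled.

```python
def parse_version(text):
    """Turn '11.2.0.4' into (11, 2, 0, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def suggest_remedy(version):
    """Map a database version to the remedy named in the list above."""
    v = parse_version(version)
    if v >= (12, 2):
        # Step 5: 12cR2 and later are the stated upgrade target.
        return "no action: 12cR2 or later"
    if v >= (11, 2, 0, 2):
        # Step 2: the parameter is enough from 11.2.0.2 onward.
        return "set _ktb_debug_flags=8, or apply Patch 22241601"
    # Assumption: patch availability for older releases must be checked.
    return "apply Patch 22241601 if available for this release, or upgrade"


for release in ("11.1.0.7", "11.2.0.4", "12.1.0.2", "12.2.0.1"):
    print(release, "->", suggest_remedy(release))
```

Comparing versions as integer tuples avoids the string comparison trap where "11.2.0.10" would sort before "11.2.0.2".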
If you cannot resolve the problem, use the contact details on the www.anbob.com home page.