刚接手一套数据库,数据库执行SQL时提示 ORA-00600: internal error code, arguments: [6704], [2], [1405099], [953234], [], [], [], [], [], [], [], []],下面简单记录一下。
#CHECK DB Alert log
adrci> show incident
ADR Home = /oracle/app/oracle/diag/rdbms/anbob/anbob1:
*************************************************************************
INCIDENT_ID PROBLEM_KEY CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
604431 ORA 603 2014-01-01 03:47:35.535000 +08:00
607239 ORA 603 2014-01-01 03:48:06.268000 +08:00
654545 ORA 600 [kgantc_1] 2014-01-27 15:29:22.834000 +08:00
675729 ORA 603 2014-01-27 15:32:50.200000 +08:00
677322 ORA 600 [FILE:psdmsc.c LINE:795 ID:OCIKCallPush] 2014-04-01 17:19:01.494000 +08:00
681378 ORA 603 2014-06-01 05:05:39.366000 +08:00
680218 ORA 603 2014-06-01 05:05:39.417000 +08:00
676178 ORA 445 2014-07-01 08:21:36.993000 +08:00
676010 ORA 445 2014-08-01 00:29:27.336000 +08:00
680154 ORA 603 2014-09-01 08:14:35.356000 +08:00
679754 ORA 445 2014-09-01 09:21:00.207000 +08:00
701471 ORA 603 2014-09-02 06:43:20.432000 +08:00
700543 ORA 603 2014-09-02 06:44:01.439000 +08:00
702671 ORA 603 2014-09-02 06:49:17.915000 +08:00
700927 ORA 600 [kdddgb2] 2014-10-01 00:01:18.243000 +08:00
701935 ORA 600 [6704] <<<< 2014-10-01 00:02:45.452000 +08:00
701927 ORA 600 [kdddgb2] 2014-10-01 00:02:45.464000 +08:00
723749 ORA 603 2014-10-10 14:00:45.568000 +08:00
726558 ORA 600 [6704] <<<< 2014-11-01 00:01:32.396000 +08:00
724174 ORA 600 [6704] 2014-11-01 00:01:32.396000 +08:00
...
747790 ORA 603 2014-11-25 17:15:22.582000 +08:00
747800 ORA 603 2014-11-25 17:16:17.296000 +08:00
725310 ORA 600 [6704] <<<<< 2014-12-01 00:01:42.253000 +08:00 724030 ORA 445 2014-12-01 05:43:53.934000 +08:00 751458 ORA 603 2014-12-01 09:20:27.569000 +08:00 751402 ORA 603 2014-12-02 11:12:37.346000 +08:00 753066 ORA 603 2014-12-03 11:45:00.517000 +08:00 747890 ORA 32701 2014-12-22 01:19:10.370000 +08:00 747891 ORA 32701 2014-12-22 01:20:46.883000 +08:00 42 rows fetched adrci> show incident -mode detail -p "incident_id=725310"
ADR Home = /oracle/app/oracle/diag/rdbms/anbob/anbob1:
*************************************************************************
**********************************************************
INCIDENT INFO RECORD 1
**********************************************************
INCIDENT_ID 725310
STATUS ready
CREATE_TIME 2014-12-01 00:01:42.253000 +08:00
PROBLEM_ID 8
CLOSE_TIME
FLOOD_CONTROLLED none
ERROR_FACILITY ORA
ERROR_NUMBER 600
ERROR_ARG1 6704
ERROR_ARG2 2
ERROR_ARG3 1405099
ERROR_ARG4 953234
ERROR_ARG5
ERROR_ARG6
ERROR_ARG7
ERROR_ARG8
ERROR_ARG9
ERROR_ARG10
ERROR_ARG11
ERROR_ARG12
SIGNALLING_COMPONENT
SIGNALLING_SUBCOMPONENT
SUSPECT_COMPONENT
SUSPECT_SUBCOMPONENT
ECID
IMPACTS 0
PROBLEM_KEY ORA 600 [6704]
FIRST_INCIDENT 701935
FIRSTINC_TIME 2014-10-01 00:02:45.452000 +08:00
LAST_INCIDENT 725310
LASTINC_TIME 2014-12-01 00:01:42.253000 +08:00
IMPACT1 0
IMPACT2 0
IMPACT3 0
IMPACT4 0
KEY_NAME ProcId
KEY_VALUE 194.83
KEY_NAME SID
KEY_VALUE 147.52491
KEY_NAME PQ
KEY_VALUE (25999740, 1417363285)
KEY_NAME Client ProcId
KEY_VALUE oracle@anbob (TNS V1-V3).852726_1
OWNER_ID 1
INCIDENT_FILE /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc
OWNER_ID 1
INCIDENT_FILE /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_852726.trc
1 rows fetched
adrci> show trace /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc
Output the results to file: /tmp/utsout_4719434_1_2.ado
"/tmp/utsout_4719434_1_2.ado" 109538 lines, 6682863 characters
/oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc
----------------------------------------------------------
LEVEL PAYLOAD
----- ------------------------------------------------------------------------------------------------------------------------------------
Dump file /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing option
ORACLE_HOME = /oracle/app/oracle/product/11.2.0/db_1
System name: AIX
Node name: anbob
Release: 1
Version: 6
Machine: 00F87EE44C00
Instance name: anbob1
Redo thread mounted by this instance: 1
Oracle process number: 194
Unix process pid: 852726, image: oracle@anbob
*** 2014-12-01 00:01:42.302
*** SESSION ID:(147.52491) 2014-12-01 00:01:42.302
*** CLIENT ID:() 2014-12-01 00:01:42.302
*** SERVICE NAME:(anbob_1_2) 2014-12-01 00:01:42.302
*** MODULE NAME:(accountindb@weejar281 (TNS V1-V3)) 2014-12-01 00:01:42.302
*** ACTION NAME:() 2014-12-01 00:01:42.302
Dump continued from file: /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_852726.trc
1> ***** Error Stack *****
ORA-00600: internal error code, arguments: [6704], [2], [1405099], [953234], [], [], [], [], [], [], [], []
1< ***** Error Stack ***** 1> ***** Dump for incident 725310 (ORA 600 [6704]) *****
*** 2014-12-01 00:01:42.314
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
2> ***** Current SQL Statement for this session (sql_id=2f7xhhh2svw9d) *****
update tabxxx set bundle_id = :bundle_id, customer_id = :cust_id, region_code=:region_code, pay_unit = :pay_unit, real_u
nit = :real_unit, toll_unit = :toll_unit, call_times = :call_times, discharge = :discharge, fee1 = :fee1, fee2 = :fee2, fee3 = :fee3, fee4 =
:fee4, fee11= :fee11, fee12 = :fee12, fee13 = :fee13, fee14 = :fee14 ,item_source_id=:item_source_id,strategy_id=:strategy_id where user_id =
:user_id and account_id = :account_id and fee_kind = :fee_kind and city_code = :city_code and service_kind = :service_kind and special_weejar
= 0 and fee_date = '201411' returning 1 into :update_rowid
2< ***** current_sql_statement ***** 2> ***** Call Stack Trace *****
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+40 bl 107c60464 000000000 ? 000000001 ?
ksedst1()+112 call skdstdst() 13AB7C69F823AC58 ?
ksedst()+40 call ksedst1() 30300000000 ? 002050033 ?
dbkedDefDump()+1516 call ksedst() 000000000 ? 000000000 ?
FFFFFFFFFFEFF20 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 300000003 ?
ksedmp()+72 call dbkedDefDump() 300000000 ? 110000AC8 ?
FFFFFFFFFFF0510 ?
4842424409A0F964 ?
10011CA24 ? 109A0F958 ?
FFFFFFFFFFF0560 ? 11065B3A0 ?
ksfdmp()+100 call ksedmp() 000000000 ? 000000000 ?
009A0F973 ? 10AB28B98 ?
200000000000000 ? 000000000 ?
11089C290 ? 1106D93F8 ?
dbgexPhaseII()+1940 call ksfdmp() 000002004 ? 1106D93F8 ?
000000000 ? FFFFFFFFFFF06D8 ?
FFFFFFFFFFF0600 ? 000000000 ?
10024B778 ? 110000B00 ?
dbgexProcessError() call dbgexPhaseII() 1106D93F8 ? 1108A1978 ?
+932 0000B113E ? 200000000 ?
FFFFFFFFFFF12E8 ? 00000006C ?
700000C13FE97C8 ? 1106E3298 ?
dbgeExecuteForError call dbgexProcessError() 1106D93F8 ? 11089C290 ?
()+72 1FFFF46D0 ? 000000000 ?
000000000 ? 00000000F ?
022C18D82 ? 11089DFD8 ?
dbgePostErrorKGE()+ call dbgeExecuteForError 000000000 ? 000000004 ?
2048 () 0FFFFFFFF ? 110000AC8 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 110000CB0 ? 1099D5F20 ?
68 25800000000 ? 000000000 ?
kgeade()+364 call dbkePostKGE_kgsf() 000000000 ? 000000000 ?
kgeriv_int()+116 call kgeade() 000000038 ? 700000B314914D0 ?
kgeriv()+36 call kgeriv_int() 700000B31491450 ? 0000000BC ?
700000B3C009028 ? 110BDB560 ?
FFFFFFFFFFF55C0 ?
4422488139346014 ?
10580E62C ? 000000002 ?
kgeasi()+232 call kgeriv() 110BF4490 ? 110A63E38 ?
110B76700 ? FFFFFFFFFFF6208 ?
700000B3C971BF0 ? 110A63E38 ?
000000000 ? 0000000A9 ?
kdugetpart()+272 call kgeasi() 109046C04 ? E8B92000E8B92 ?
updrow()+15796 call kdugetpart() 10089DEFC ? 000000007 ?
qerupFetch()+856 call updrow() 000000000 ? 7FFF00000000 ?
000000000 ? 000000000 ?
FFFFFFFFFFF5F90 ? 110000CB0 ?
FFFFFFFFFFF64F0 ? 000000001 ?
updaul()+1100 call qerupFetch() 700000B3F7AC068 ? 110BF45A8 ?
000000000 ? 000000000 ?
7FFF00000000 ? 000000000 ?
000000000 ? 000000000 ?
updThreePhaseExe()+ call updaul() 700000B3C971BF0 ?
336 FFFFFFFFFFF7078 ? 000000000 ?
updexe()+344 call updThreePhaseExe() 1FFFF78A8 ? 100000001 ?
opiexe()+14476 call updexe() 700000B3C971BF0 ? 300000000 ?
000000000 ? FFFFFFFFFFF78A8 ?
000000000 ? FFFFFFFFFFF7AA8 ?
FFFFFFFFFFF79E0 ? 000000000 ?
kpoal8()+4616 call opiexe() FFFFFFFFFFF8C90 ? 11065B3A0 ?
opiodr()+720 call kpoal8() 1FFFF9D30 ? 200000000 ?
ttcpip()+1028 call opiodr() 5E1010C870 ? 1C000003C8 ?
FFFFFFFFFFFA890 ? 000000000 ?
opitsk()+1508 call ttcpip() 11010C870 ? 11065B3A0 ?
opiino()+940 call opitsk() 110043CC0 ? 000000000 ?
opiodr()+720 call opiino() 3C009A7C7C ?
opidrv()+1132 call opiodr() 3C0AA3BE28 ? 41010B878 ?
11065B3A0 ?
sou2o()+136 call opidrv() 3C07E7DA50 ? 400000000 ?
11065B3A0 ?
opimai_real()+608 call sou2o() FFFFFFFFFFFF440 ? 000000001 ?
ssthrdmain()+268 call opimai_real() 9001000A070DE30 ?
9001000A077BD18 ?
sou2o()+136 call opidrv() 3C07E7DA50 ? 400000000 ?
11065B3A0 ?
opimai_real()+608 call sou2o() FFFFFFFFFFFF440 ? 000000001 ?
A0000000A000000 ? 10B146F30 ?
ssthrdmain()+268 call opimai_real() 9001000A070DE30 ?
9001000A077BD18 ?
main()+204 call ssthrdmain() 2F0003640 ? FFFFFFFFFFFF888 ?
__start()+112 call main() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
2< ***** call_stack_dump *****
TIP:
从今年的错误来看,该错误具有周期性,只在1号出现,应该是帐期特殊的事务触发的问题,从错误的Trace文件分析,应该是从10月份开始该SQL触发, 引起的ora-600 【6704】
ORA-600 [6704] "bad internal rowid when updating partition table" ORA-00600: [6704] ARGUMENTS: Arg [a] 1 for a partitioned table, 2 for a non-partitioned table Arg [b] Data object number from the table Arg [c] Data object number from the input ROWID
SQL> @ddl WEEJAR.tabxxx
PL/SQL procedure successfully completed.
DBMS_METADATA.GET_DDL(OBJECT_TYPE,OBJECT_NAME,OWNER)
-----------------------------------------------------------------------------------------------------------------
CREATE TABLE "WEEJAR"."tabxxx"
( "SERVICE_KIND" NUMBER(4,0) NOT NULL ENABLE,
"USER_ID" NUMBER(12,0) NOT NULL ENABLE,
"ACCOUNT_ID" NUMBER(12,0) NOT NULL ENABLE,
"CUSTOMER_ID" NUMBER(12,0) NOT NULL ENABLE,
"BUNDLE_ID" NUMBER(10,0) DEFAULT 0 NOT NULL ENABLE,
...
"FEE14" NUMBER(12,2) DEFAULT 0 NOT NULL ENABLE,
"ITEM_SOURCE_ID" NUMBER(9,0) DEFAULT 0 NOT NULL ENABLE,
"STRATEGY_ID" NUMBER(9,0) DEFAULT 0 NOT NULL ENABLE
) PCTFREE 10 PCTUSED 0 INITRANS 1 MAXTRANS 255
NOCOMPRESS NOLOGGING
STORAGE(
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "TS_TAB_GATHER"
PARTITION BY RANGE ("CITY_CODE","FEE_DATE")
(PARTITION "PART_000_MAX" VALUES LESS THAN ('000', MAXVALUE) SEGMENT CREATION DEFERRED
PCTFREE 10 PCTUSED 0 INITRANS 1 MAXTRANS 255
NOCOMPRESS NOLOGGING
STORAGE( INITIAL 16384
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "TS_TAB_GATHER" ,
PARTITION "PART_180_201308" VALUES LESS THAN ('180', '201309') SEGMENT CREATION IMMEDIATE
PCTFREE 10 PCTUSED 0 INITRANS 1 MAXTRANS 255
NOCOMPRESS NOLOGGING
STORAGE(INITIAL 8388608 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "TS_TAB_GATHER" ,
...
奇怪的是2是 non-partitioned ,但该表分区。
SQL> select owner,data_object_id,subobject_name,object_name,object_type,created from dba_objects where object_name='tabxxx' and owner='WEEJAR' and data_object_id in(1405099,953234) OWNER DATA_OBJECT_ID SUBOBJECT_NAME OBJECT_NAME OBJECT_TYPE CREATED ------------------------------ -------------- -------------------- -------------------- ------------------- ----------------- WEEJAR 953234 PART_189_201411 tabxxx TABLE PARTITION 20130923 12:05:19 WEEJAR 1405099 PART_720_201312 tabxxx TABLE PARTITION 20130923 12:05:19
从上面的TRACE 文件中确认是下面的SQL 引起:
update tabxxx set bundle_id = :bundle_id, customer_id = :cust_id, region_code=:region_code, pay_unit = :pay_unit, real_u nit = :real_unit, toll_unit = :toll_unit, call_times = :call_times, discharge = :discharge, fee1 = :fee1, fee2 = :fee2, fee3 = :fee3, fee4 = :fee4, fee11= :fee11, fee12 = :fee12, fee13 = :fee13, fee14 = :fee14 ,item_source_id=:item_source_id,strategy_id=:strategy_id where user_id = :user_id and account_id = :account_id and fee_kind = :fee_kind and city_code = :city_code and service_kind = :service_kind and special_weejar = 0 and fee_date = '201411' returning 1 into :update_rowid;
根据MOS 上查询,目前只有这一个未公开BUG和目前发现的3个现象,现象3# 很相似。
Bug 8768508 - ORA-600: [6704] on merge against Global Temporary Table (文档 ID 8768508.8)
Versions confirmed as being affected
11.2.0.3
10.2.0.4
The fix for 8768508 is first included in
12.1.0.1 Release
Rediscovery Notes
1) You are executing a MERGE into a GLOBAL TEMP table that contains no rows.
2) The MERGE is executed within a FORALL loop.
3) An ORA-00600: [6704] is raised when updating a row.
分析:
To do the following:
analyze table partition () validate structure cascade into invalid_rows;
most likely you will need to execute $ORACLE_HOME/rdbms/admin/utlvalid.sql if the table is partitioned
analyze table WEEJAR.tabxxx partition (PART_189_201411) validate structure cascade into invalid_rows ;
Note:
analyze table will requested TM lock. you can use “online” option like the following:
analyze table WEEJAR.tabxxx partition (PART_189_201411) validate structure cascade online into invalid_rows ;
”
ONLINE option for ANALYZE VALIDATE STRUCTURE statement
You can specify that you want to perform structure validation online while DML is occurring against the object being validated.
There can be a slight performance impact when validating with ongoing DML affecting the object, but this is offset by the flexibility
of being able to perform ANALYZE online.
”
SQL> analyze table WEEJAR.tabxxx partition (PART_189_201411) validate structure cascade online into invalid_rows ; Table analyzed. SQL> select * from invalid_rows; no rows selected
Note:
另一个分区表对象被删除了,所以无法判断是不是在另一个分区表上的索引有逻辑corrupt block. 但是又观察了一个月发现没有再出现。
另外ora-600 kgantc_1 一般是因为ctrl+c 或kill -9后引起的。