今天开发的同事问有个表上的数据为什么删不掉?我看了一下,也就不到100000条数据,表上有外键,等了5分钟hang在那里,时间原因,我对表上的外键禁用后,瞬间删除。
现在来还原这个问题。
sys@ANBOB>select count(*) from bjhr.doctor_exam_member;
COUNT(*)
——————–
92102
sys@ANBOB>delete bjhr.doctor_exam_member;
–hang
–等待10分钟都未执行完,检查表的外键信息
bjhr@ANBOB> SELECT /*+RULE*/ D.CONSTRAINT_NAME pk_name,-- d.table_name, D.TABLE_NAME || '.' || D.COLUMN_NAME pk_column, A.CONSTRAINT_TYPE, B.CONSTRAINT_NAME fk_name, B.TABLE_NAME || '.' || B.COLUMN_NAME fk_column FROM user_constraints a JOIN user_cons_columns b ON a.constraint_name = b.constraint_name AND a.owner = b.owner JOIN user_constraints c ON A.R_CONSTRAINT_NAME = C.CONSTRAINT_NAME AND A.R_OWNER = c.owner JOIN user_cons_columns d ON c.constraint_name = d.constraint_name AND c.owner = d.owner WHERE D.table_name = 'DOCTOR_EXAM_MEMBER' bjhr@ANBOB>/ PK_NAME PK_COLUMN C FK_NAME FK_COLUMN -------------------- ---------------------------------------- - ------------------------------ ------------------------------------------------------- PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R FK_RESULT_N_REFERENCE_DOCTOR RESULT_NOTIFICATION_RECORD.DOCTOR_EXAM_MEMBER_ID PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R RESULT_RE_DOCTOR_MEMBER DOCTOR_EXAM_RESULT.DOCTOR_EXAM_MEMBER_ID
–有外键,之前已对子表进行过删除,否则会报错ORA-02266
delete RESULT_NOTIFICATION_RECORD;
delete DOCTOR_EXAM_RESULT;
commit;
–下面开始分析,创建新的session
sys@ANBOB>select xidsqn,xidusn,object_id,session_id,locked_mode from v$locked_object;
XIDSQN XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE
-------------------- -------------------- -------------------- -------------------- --------------------
2102 203 1639631 2290 3
2102 203 1639572 2290 3
sys@ANBOB>select object_name,object_type from dba_objects where object_id in(1639631,1639572);
OBJECT_NAME OBJECT_TYPE
------------------------------ -------------------
DOCTOR_EXAM_MEMBER TABLE
DOCTOR_EXAM_RESULT TABLE
sys@ANBOB>select event,p1,p2,p1text,p2text,seconds_in_wait,state from v$session_wait where sid=2290;
EVENT P1 P2 P1TEXT P2TEXT SECONDS_IN_WAIT STATE
------------------------------ ----------- ----- -------------------- -------------------- -------------------- -------------------
latch: shared pool 1611704464 307 address number 213 WAITED SHORT TIME
--trace hanganalyze and systemstate
alter session set events 'immediate trace name systemstate level 266';
alter session set events 'immediate trace name hanganalyze level 3';
--hanganalyze trace
===============================================================================
Chains most likely to have caused the hang:
[a] Chain 1 Signature:
Chain 1 Signature Hash: 0x673a0128
[b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 2 Signature Hash: 0xa00e2e87
===============================================================================
Sessions in an involuntary wait or not in a wait:
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (ANBOB.ANBOB)
os id: 27158
process id: 94, oracle@dev-db (TNS V1-V3)
session id: 2290
session serial #: 7981
}
is not in a wait:
{
last wait: 11 min 0 sec ago
blocking: 0 sessions
wait history:
1. event: 'latch: shared pool'
time waited: 0.000114 sec
wait id: 183 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
* time between wait #1 and #2: 1.586255 sec
2. event: 'latch: shared pool'
time waited: 0.000032 sec
wait id: 182 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
* time between wait #2 and #3: 0.133830 sec
3. event: 'latch: shared pool'
time waited: 0.000114 sec
wait id: 181 p1: 'address'=0x6010a890
p2: 'number'=0x133
p3: 'tries'=0x0
}
Chain 1 Signature:
Chain 1 Signature Hash: 0x673a0128
–对systemstate 没发现可疑信息
[oracle@dev-db ~]$ awk -f ass109.awk /oracle/diag/rdbms/ANBOB/ANBOB/trace/ANBOB_ora_23020.trc
— 奇怪为什么会发生在latch:shared pool上? 应该是sql解析和shared pool相关的事件,随后结束delete,做10046 观察究竟
sys@ANBOB>oradebug setmypid;
Statement processed.
sys@ANBOB>oradebug event 10046 trace name context forever,level 12
Statement processed.
sys@ANBOB>delete bjhr.doctor_exam_member;
92102 rows deleted.
sys@ANBOB>oradebug tracefile_name
/u01/app/oracle/diag/rdbms/anbob/anbob/trace/anbob_ora_7784.trc
sys@ANBOB>oradebug event 10046 trace name context off;
Statement processed.
— 格式化trace,终于发现了答案.
delete bjhr.doctor_exam_member call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 47.30 48.39 201 222 657611 92102 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 47.31 48.39 201 222 657611 92102 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ Disk file operations I/O 2 0.00 0.00 db file scattered read 26 0.00 0.00 db file sequential read 24 0.00 0.00 SQL*Net message to client 1 0.00 0.00 SQL*Net message from client 1 0.00 0.00 ******************************************************************************** -- check deferred objects select pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg, maxext_stg, maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg, maxtra_stg, optimal_stg, maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg, cmpflag_stg, cmplvl_stg from deferred_stg$ where obj# =:1 ******************************************************************************** select /*+ all_rows */ count(1) from "BJHR"."RESULT_NOTIFICATION_RECORD" where "DOCTOR_EXAM_MEMBER_ID" = :1 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 92102 11.31 11.34 0 0 0 0 Fetch 92102 0.63 0.64 0 0 0 92102 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 184205 11.95 11.99 0 0 0 92102 Misses in library cache during parse: 1 Misses in library cache during execute: 1 Optimizer mode: ALL_ROWS Parsing user id: SYS (recursive depth: 1) Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 SORT AGGREGATE (cr=0 pr=0 pw=0 time=43 us) 0 0 0 TABLE ACCESS FULL RESULT_NOTIFICATION_RECORD (cr=0 pr=0 pw=0 time=12 us cost=3 size=5 card=1) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ latch: shared pool 2 0.00 0.00 ******************************************************************************** select /*+ all_rows */ count(1) from "BJHR"."DOCTOR_EXAM_RESULT" where "DOCTOR_EXAM_MEMBER_ID" = :1 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 92102 6.97 7.11 0 0 0 0 Fetch 92102 1012.96 1016.14 0 566243096 92102 92102 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 184205 1019.93 1023.25 0 566243096 92102 92102 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 1 1 1 SORT AGGREGATE (cr=6148 pr=0 pw=0 time=30196 us) 0 0 0 TABLE ACCESS FULL DOCTOR_EXAM_RESULT (cr=6148 pr=0 pw=0 time=30184 us cost=1647 size=5 card=1) ******************************************************************************** OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 39 0.00 0.01 0 0 0 0 Execute 184248 18.29 18.46 0 0 0 0 Fetch 184276 1013.60 1016.79 0 566243218 92102 184238 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 368563 1031.90 1035.27 0 566243218 92102 184238 1129 elapsed seconds in trace file.
TIP:
在删除doctor_exam_member表时,检查了他的所有参照表(子表),然后对doctor_exam_member表的每次记录都要去参照表查询是否存在,此时刚好参考表的外键列上并无索引,导致每一行记录都会导致FTS(full table scan),这也是查询v$session_event时偶尔出现latch: CBC (hot block)的原因。
你可能疑问子表数据都delete了为什么还查询这么久?我做个小测试
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT;
COUNT(*)
——————–
0
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name=’DOCTOR_EXAM_RESULT’ and owner=’BJHR_DEV’;
BYTES BLOCKS
——————– ——————–
50331648 6144
sys@ORA10GR2>set autot trace stat
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
———————————————————-
0 recursive calls
0 db block gets
6040 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result enable row movement;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.DOCTOR_EXAM_RESULT shrink space;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result disable row movement;
Table altered.
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name=’DOCTOR_EXAM_RESULT’ and owner=’BJHR_DEV’;
BYTES BLOCKS
——————– ——————–
196608 24
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
———————————————————-
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
TIP:
FTS查询会遍历表segment 已格式化过所有data block.
Summary:
在建有外键约束的子表列上需要创建索引,对子表全表删除时可以采用truncate 或delete(有外键不能truncate时)后对表进行shrink space操作,或删除父表前对子表的外键约束做Disable.
As far as unindexed foreign keys are concerned, the big change came in 9i and I haven’t spotted any further changes in 11g. (Although the switch from mode 2 to mode 3 at the opposite end of the RI constraint will probably have some consequences.)
When you delete a parent row, or update the key value, your session will try to acquire a mode 4 – share – or mode 5 – share row exclusive – lock on the child table. (Mode 5 if the transaction has previously modified the child – otherwise mode 4, and most of the literature only mentions the mode 4).
In 8i the lock is released when your transaction commits. In 9i the lock is released (or downgraded from 5 to 3) as soon as the operation on the parent completes. This means that any sessions that had been queued behind the child lock will be able to resume sooner. I don’t think I’ve seen anything in 11g that reduces the interference any further.
In passing: it’s not commonly realised that the “foreign key index” doesn’t have to be an exact match for the foreign key; it merely has to start with the foreign key columns (which can be in any order) to avoid the locking issue.