首页 » ORACLE » sql performance truning ,hint,parallel,10053 trace

sql performance truning ,hint,parallel,10053 trace

问题的起初是有个网友问我一个付费查询的sql如何再优,我在本机其它系统的环境找找了个相似结构的两个表,就是一对多。
icme_student(小表) ic_code唯一,对应icme_noproject_score(大表)的多条记录每个ic_code.
下面看我执行的执行计划,为了节约篇幅省略了一部分,并且都是第二次运行的结果

SQL>  select count(*) total,count(distinct stu.ic_code) hav_sc_stucnt,sum(ns.class_score),stu.ic_code  from icme_noproject_score ns
  2  join icme_student stu on ns.ic_code=stu.ic_code
  3  group by stu.ic_code
  4  ;

854409 rows selected.

Elapsed: 00:01:38.28

Execution Plan
----------------------------------------------------------
Plan hash value: 1102099122

-----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                    |   608K|    13M|       |   121K (13)| 00:24:13 |
|   1 |  SORT GROUP BY         |                    |   608K|    13M|   648M|   121K (13)| 00:24:13 |
|   2 |   NESTED LOOPS         |                    |    20M|   470M|       | 66742  (22)| 00:13:21 |
|   3 |    INDEX FAST FULL SCAN| UK_NOPROJECT_SCORE |    20M|   274M|       | 47986   (1)| 00:09:36 |
|*  4 |    INDEX RANGE SCAN    | PK_ICME_STUDENT    |     1 |    10 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
   41430942  consistent gets

默认CBO的执行计划运行了98秒,产生了41430942  consistent gets,比较怀疑的是为什么用要用大表做为驱动表?

说点题外话,在两个表的JOIN 中,CBO总是会计算三种JOIN的连接方式的代价,分别为NL JOIN,SM JOIN,HA JOIN,三种JOIN对前后两表的我称呼还不一样,分别是NL JOIN :outer table,inner table,SM :firest table,second table,HA:build table ,probe table,Nested jOIN中的outer table又叫driver table,是执行计划中上面的表,是循环中的外层表,如其它语言中的

FOR I IN 1..10 LOOP FOR J IN 1...100000 LOOP if j.xx=i.xx do something end loop; end loop;

i就是驱动表的rows,j就是inner table的记录,一般情况下是把小表做为驱动表。 我加上ordered hint改变驱动表 select /*+ ordered */ count(*) total,count(distinct stu.ic_code) hav_sc_stucnt,sum(ns.class_score),stu.ic_code from icme_student stu join icme_noproject_score ns on stu.ic_code=ns.ic_code group by stu.ic_code 854409 rows selected. Elapsed: 00:00:34.78 Execution Plan ---------------------------------------------------------- Plan hash value: 2339117530 ----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 608K| 13M| | 129K (2)| 00:26:00 | | 1 | SORT GROUP BY | | 608K| 13M| 648M| 129K (2)| 00:26:00 | |* 2 | HASH JOIN | | 20M| 470M| 24M| 75642 (1)| 00:15:08 | | 3 | INDEX FAST FULL SCAN| PK_ICME_STUDENT | 1179K| 11M| | 956 (1)| 00:00:12 | | 4 | INDEX FAST FULL SCAN| UK_NOPROJECT_SCORE | 20M| 274M| | 47986 (1)| 00:09:36 | ----------------------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 222813 consistent gets 运行时间变成了34秒,consistent gets也相对少了很多,而且变成了HA JOIN,小表主键也变成了IFFS,如果加上并行呢? 再说点题外话,如果是DBA用或就是那么一小摄儿人查询几次,在服务器有多核CPU,且IO资源有空闲的情况下可以开并行减少查询时间,但是对于app,在oltp系统中并发用户又很多,进行的都是非常短的事务,这时并发可能会给服务器增加压力和一些稀缺资源的竞争,并发受parallel_threads_per_cpu 等多个系统参数的影响,因为一般用户如果查询服务器cpu的核数呢?

SQL> DECLARE 2 parnam VARCHAR2(256); 3 intval BINARY_INTEGER; 4 strval VARCHAR2(256); 5 partyp BINARY_INTEGER; 6 BEGIN 7 partyp := dbms_utility.get_parameter_value('cpu_count', 8 intval, strval); 9 dbms_output.put('parameter value is: '); 10 IF partyp = 1 THEN 11 dbms_output.put_line(strval); 12 ELSE 13 dbms_output.put_line(intval); 14 END IF; 15 16 END; 17 / parameter value is: 8 alter session force parallel query; SQL> select count(*) total,count(distinct stu.ic_code) hav_sc_stucnt,sum(ns.class_score),stu.ic_code from icme_noproject_score ns 2 join icme_student stu on ns.ic_code=stu.ic_code 3 group by stu.ic_code 4 ; 854409 rows selected. Elapsed: 00:00:20.62 Execution Plan ---------------------------------------------------------- Plan hash value: 4250446769 ------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | --------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 619K| 14M| | 3227 (3)| 00:00:39 | | | | | 1 | PX COORDINATOR | | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10002 | 619K| 14M| | 3227 (3)| 00:00:39 | Q1,02 | P->S | QC (RAND) | | 3 | SORT GROUP BY | | 619K| 14M| 649M| 3227 (3)| 00:00:39 | Q1,02 | PCWP | | |* 4 | HASH JOIN | | 20M| 471M| | 3167 (1)| 00:00:39 | Q1,02 | PCWP | | | 5 | PX RECEIVE | | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,02 | PCWP | | | 6 | PX SEND HASH | :TQ10000 | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,00 | P->P | HASH | | 7 | PX BLOCK ITERATOR | | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,00 | PCWC | | | 8 | INDEX FAST FULL SCAN| PK_ICME_STUDENT | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,00 | PCWP | | | 9 | PX RECEIVE | | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,02 | PCWP | | | 10 | PX SEND HASH | :TQ10001 | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,01 | P->P | HASH | | 11 | PX BLOCK ITERATOR | | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,01 | PCWC | | | 12 | INDEX FAST FULL SCAN| UK_NOPROJECT_SCORE | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,01 | PCWP | | ------------------------------------------------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 181 recursive calls 3 db block gets 226981 consistent gets

再加上ORDERED提示 SQL> select /*+ ordered */ count(*) total,count(distinct stu.ic_code) hav_sc_stucnt,sum(ns.class_score),stu.ic_code from icme_student stu join icme_noproject_score ns on stu.ic_code=ns.ic_code 3 group by stu.ic_code; 854409 rows selected. Elapsed: 00:00:16.88 Execution Plan ---------------------------------------------------------- Plan hash value: 4250446769 --------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | --------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 619K| 14M| | 3227 (3)| 00:00:39 | | | | | 1 | PX COORDINATOR | | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10002 | 619K| 14M| | 3227 (3)| 00:00:39 | Q1,02 | P->S | QC (RAND) | | 3 | SORT GROUP BY | | 619K| 14M| 649M| 3227 (3)| 00:00:39 | Q1,02 | PCWP | | |* 4 | HASH JOIN | | 20M| 471M| | 3167 (1)| 00:00:39 | Q1,02 | PCWP | | | 5 | PX RECEIVE | | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,02 | PCWP | | | 6 | PX SEND HASH | :TQ10000 | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,00 | P->P | HASH | | 7 | PX BLOCK ITERATOR | | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,00 | PCWC | | | 8 | INDEX FAST FULL SCAN| PK_ICME_STUDENT | 1181K| 11M| | 53 (0)| 00:00:01 | Q1,00 | PCWP | | | 9 | PX RECEIVE | | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,02 | PCWP | | | 10 | PX SEND HASH | :TQ10001 | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,01 | P->P | HASH | | 11 | PX BLOCK ITERATOR | | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,01 | PCWC | | | 12 | INDEX FAST FULL SCAN| UK_NOPROJECT_SCORE | 20M| 274M| | 3107 (1)| 00:00:38 | Q1,01 | PCWP | | --------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("NS"."IC_CODE"="STU"."IC_CODE") Statistics ---------------------------------------------------------- 117 recursive calls 3 db block gets 226981 consistent gets 对开始CBO默认解析做10053 trace 出于篇幅的原因也只是理出一些cost结果,如果要学习分析可以参考书的14章 语法 current session: alter session events '10053 trace name context forever,level <1|2>'; execute sql alter session events '10053 trace name context off'; other session: exec dbms_system.set_ev(,,10053,<1|2>,''); execute sql exec dbms_system.set_ev(,,10053,0,''); trace的内容结构是: OPTIMIZER PARAMETERS:优化器参数 SYSTEM STATITICS:OS系统的统计信息如IO速度和CPU速度; TABLE STATITISCS:表对象统计信息如行数,块数,平均行长度,有无直方图 INDEX STATISTICS:表上索引的统计信息如层数,叶子块数,distinct 键值数,聚集因子 COLUMNS STATISTICS:表信息 1,sigle table access cost: 单个表的连接方式的各种cost,如全表扫描(TABLE SCAN),最适合的索引读取方式IFFS\IFS\RANGE SCAN\INDEX AND-Equal,根据最小cost确定的连接方式 如 ******** End index join costing ******** Best:: AccessPath: IndexFFS Index: PK_ICME_STUDENT Cost: 955.76 Degree: 1 Resp: 955.76 Card: 1179881.00 Bytes: 0 *************************************** ******** End index join costing ******** Best:: AccessPath: IndexFFS Index: UK_NOPROJECT_SCORE Cost: 47986.14 Degree: 1 Resp: 47986.14 Card: 20565707.00 Bytes: 0 Grouping column cardinality [ IC_CODE] 1179881 *************************************** 2,join order and method cost: 基于cardinality的join连接方式,CBO会考虑用不同的table join order(顺序) 选择outer table(外部表) 和inner table(内部表) NL\SM\HA JOIN 三种连接方式分别计算代价(外部表每返回一行都会去内部表去执行一次,所以在带统计信息的执行计划中的Starts列 PK_ICME_STUDENT 执行了 UK_NOPROJECT_SCORE 的行数次,且每次ICME_STUDENT 的e-rows中为什么会是1). NL = NESTED JOIN SM = SORT MERGE JOIN HA = HASH JOIN *********************** Join order[1]: ICME_STUDENT[STU]#0 ICME_NOPROJECT_SCORE[NS]#1 *************** Best NL cost: 1410028.31 SM cost: 155683.05 HA cost: 75642.10 Best:: JoinMethod: Hash Cost: 129928.23 Degree: 1 Resp: 129928.23 Card: 20565707.00 Bytes: 24 *********************** Join order[2]: ICME_NOPROJECT_SCORE[NS]#1 ICME_STUDENT[STU]#0 *************** Best NL cost: 66742.49 SM cost: 155683.05 HA cost: 75642.10 Best:: JoinMethod: NestedLoop Cost: 121028.62 Degree: 1 Resp: 121028.62 Card: 20565707.00 Bytes: 24 note:这里的Cost121028.62 是整个执行计划的总和,也就是执行计划图表中ID为0的cost 121K *********************** Best so far: Table#: 1 cost: 47986.1401 card: 20565707.0000 bytes: 287919898 Table#: 0 cost: 121028.6238 card: 20565707.0000 bytes: 493576968 Number of join permutations tried: 2 ********************************* (newjo-save) [0 1 ] GROUP BY cardinality: 608125.00, TABLE cardinality: 20565707.00 SORT resource Sort statistics Sort width: 1198 Area size: 1048576 Max Area size: 209715200 Degree: 1 Blocks to Sort: 93161 Row size: 37 Total Rows: 20565707 Initial runs: 4 Merge passes: 1 IO Cost / pass: 50464 Total IO sort cost: 53219 Total CPU sort cost: 23713302516 Total Temp space used: 680231000 Final - All Rows Plan: Best join order: 2 --这里给出了终级JOIN 表顺序为2 Cost: 121028.6238 Degree: 1 Card: 20565707.0000 Bytes: 493576968 Resc: 121028.6238 Resc_io: 105627.0000 Resc_cpu: 342247398504 Resp: 121028.6238 Resp_io: 105627.0000 Resc_cpu: 342247398504 对于NL JOIN居然 ICME_STUDENT做outer table的cost是1410028.31,远远高出ICME_NOPROJECT_SCORE 做outer table 的66742.49
打赏

目前这篇文章有1条评论(Rss)评论关闭。

  1. admin | #1
    2012-05-14 at 16:20

    要学习10053 trace file 可以参考《Cost Based Oracle Fundamentals》书的14章,