首页 » ORACLE » How to estimate the cost if partition table missing partition level statistics ?(缺失分区表级统计信息如何计算多分区扫描的COST)

How to estimate the cost if partition table missing partition level statistics ?(缺失分区表级统计信息如何计算多分区扫描的COST)

月初3号时一套数据库SQL突然变慢,查看SQL文本就是一个普通的select语句,查询的是以月份命名的再以天划分新建的分区表,SQL语句中有一个选择率很好的手机号的字段,语句是查询该手机最近两天的记录,但是查看当时的执行计划使用的是PARTITION RANGE ITERATOR FTS 分区全表扫, 该手机号字段上有索引, 只有2天的数据但是记录已经几千万条记录, 为什么不使用索引呢?

如select xx,xx from tab201508 where time between date’20150801′ and date’20150803′ and tel=’17000000000′;

首先查看表的统计信息,分区表的统计信息分global stats(DBA_TABLE)和partition stats(DBA_TAB_PARTITION),发现该分区表是global stats 有,partition stats全为空, 查看该SQL发现有两个子游标有不同的执行计划,#1使用的是了正确的索引,#2使用就是当前的多分区全表扫描, 查看#2第一次解析到shared_pool时间是9:18 分, 为什么执行计划会改变呢?

再次查看表的global stats 的last_analyzed 刚好是9:10分, 收集统计信息有可能导致执行计划改变, 查看收集统计信息的scheduler GATHER_STATS_JOB 果然是在running, 工作日收集统计信息的时间window是22:00 开始持续4个小时 , 周末6:00开始持续20个小时,刚好当天是周末,查看了GATHER_STATS_JOB的运行日志,前两天都因为时间窗口关闭而终止, 让我们大胆的猜测一下会不会是这样:该SQL 在9:10之前第一次解析时因为表的统计信息都不存在,使用的动态采样生成了使用索引的正确的执行计划, 后来因为收集统计信息的job 在收集到该表时,先是产生了表的global stats, 但是partition stats没有收集前,利用当前分区表上的global stats解析生成了#2 sql游标,使用了错误的执行计划,收集统计的job GRANULARITY默认是AUTO, 理论上是会收集partition 级的统计信息的,也许是还未来得及收集 。

下面是GRANULARITY不同值时统计信息收集的级别

Granularity	      Global Statistics Level
Global	              Table
Partition	      Partition
SubPartition          SubPartition
Default	              Table + Partition
All	              Table + Partition + Subpartition

如果global and partition 都无,那就动态采样(default如果没禁用Dynamic Sample),如果global and partition 都有没有疑问,如果有partition 无global 那多分区扫时可以聚合,但是如果有global 无partition 时,单分区的CBO代价是如何算?多分区扫又如何记录呢?带个这个问题我来做个测试

create table anbob.t(id int,ctime date,name varchar2(100))
partition by range(ctime)
(
partition p1 values less than (TO_DATE('2015-1-1','yyyy-mm-dd')),
partition p2 values less than (TO_DATE('2015-2-1','yyyy-mm-dd')),
partition p3 values less than (TO_DATE('2015-3-1','yyyy-mm-dd')),
partition p4 values less than (TO_DATE('2015-4-1','yyyy-mm-dd')),
partition p5 values less than (TO_DATE('2015-5-1','yyyy-mm-dd')),
partition p6 values less than (TO_DATE('2015-6-1','yyyy-mm-dd')),
partition p7 values less than (TO_DATE('2015-7-1','yyyy-mm-dd')),
partition p8 values less than (TO_DATE('2015-8-1','yyyy-mm-dd')),
partition p9 values less than (TO_DATE('2015-9-1','yyyy-mm-dd')),
partition p10 values less than (TO_DATE('2015-10-1','yyyy-mm-dd')),
partition p11 values less than (TO_DATE('2015-11-1','yyyy-mm-dd')),
partition p12 values less than (TO_DATE('2015-12-1','yyyy-mm-dd')),
partition p13 values less than (TO_DATE('2016-1-1','yyyy-mm-dd')),
partition p14 values less than (TO_DATE('2016-2-1','yyyy-mm-dd')),
partition p15 values less than (TO_DATE('2016-3-1','yyyy-mm-dd')),
partition p16 values less than (TO_DATE('2016-4-1','yyyy-mm-dd')),
partition p17 values less than (TO_DATE('2016-5-1','yyyy-mm-dd')),
partition p18 values less than (TO_DATE('2016-6-1','yyyy-mm-dd')),
partition p19 values less than (TO_DATE('2016-7-1','yyyy-mm-dd')),
partition p20 values less than (TO_DATE('2016-8-1','yyyy-mm-dd'))
);

SQL> insert into anbob.t select dbms_random.random, sysdate, 'anbob'||rownum from dual connect by rownum<=1E6; 1000000 rows created. SQL> commit;
Commit complete.

SQL> select * from anbob.t where rownum<10;         ID CTIME             NAME ---------- ----------------- -------------- -396539064 20150818 11:24:31 anbob989 -108530760 20150818 11:24:31 anbob990 -944850610 20150818 11:24:31 anbob991  650366221 20150818 11:24:31 anbob992  487828661 20150818 11:24:31 anbob993 1673114746 20150818 11:24:31 anbob994 1278669788 20150818 11:24:31 anbob995  528234050 20150818 11:24:31 anbob996 2013900942 20150818 11:24:31 anbob997 9 rows selected. SQL> @seg anbob.t
    SEG_MB OWNER       SEGMENT_NAME      SEG_PART_NAME   SEGMENT_TYPE         SEG_TABLESPACE_NAME       BLOCKS     HDRFIL     HDRBLK
---------- ----------- ----------------- --------------- -------------------- --------------------- ---------- ---------- ----------
        36 ANBOB       T                 P9              TABLE PARTITION      USERS                       4608         36    3437899

SQL>  exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'ANBOB' , tabname => 'T',cascade => true,  estimate_percent => dbms_stats.auto_sample_size,method_opt=>'for all columns size 1', granularity => 'GLOBAL', degree => 8,no_invalidate=>false);
PL/SQL procedure successfully completed.

SQL> @tab anbob.t
Show tables matching condition "%anbob.t%" (if schema is not specified then current users tables only are shown)...

OWNER                TABLE_NAME                     TYPE     NUM_ROWS        BLOCKS     EMPTY AVGSPC ROWLEN TAB_LAST_ANALYZED DEGREE     COMPRESS
-------------------- ------------------------------ ---- ------------ ------------- --------- ------ ------ ----------------- ---------- --------
ANBOB                T                              PTAB      1000000          4528         0      0     27 20150818 11:29:13          1

SQL> @tabpart anbob.t

TABLE_OWNER      TABLE_NAME               POS COM PARTITION_NAME       NUM_ROWS SUBPARTITION_COUNT HIGH_VALUE_RAW                         HIGH_VALUE_LENGTH COMPRESS COMPRESS_FOR
---------------- ----------------- ---------- --- ------------------ ---------- ------------------ -------------------------------------- ----------------- -------- ------------
ANBOB            T                          1 NO  P1                                             0 TO_DATE(' 2015-01-01 00:00:00', )                     83 DISABLED
ANBOB            T                          2 NO  P2                                             0 TO_DATE(' 2015-02-01 00:00:00', )                     83 DISABLED
ANBOB            T                          3 NO  P3                                             0 TO_DATE(' 2015-03-01 00:00:00', )                     83 DISABLED
ANBOB            T                          4 NO  P4                                             0 TO_DATE(' 2015-04-01 00:00:00', )                     83 DISABLED
ANBOB            T                          5 NO  P5                                             0 TO_DATE(' 2015-05-01 00:00:00', )                     83 DISABLED
ANBOB            T                          6 NO  P6                                             0 TO_DATE(' 2015-06-01 00:00:00', )                     83 DISABLED
ANBOB            T                          7 NO  P7                                             0 TO_DATE(' 2015-07-01 00:00:00', )                     83 DISABLED
ANBOB            T                          8 NO  P8                                             0 TO_DATE(' 2015-08-01 00:00:00', )                     83 DISABLED
ANBOB            T                          9 NO  P9                                             0 TO_DATE(' 2015-09-01 00:00:00', )                     83 DISABLED
ANBOB            T                         10 NO  P10                                            0 TO_DATE(' 2015-10-01 00:00:00', )                     83 DISABLED
ANBOB            T                         11 NO  P11                                            0 TO_DATE(' 2015-11-01 00:00:00', )                     83 DISABLED
ANBOB            T                         12 NO  P12                                            0 TO_DATE(' 2015-12-01 00:00:00', )                     83 DISABLED
ANBOB            T                         13 NO  P13                                            0 TO_DATE(' 2016-01-01 00:00:00', )                     83 DISABLED
ANBOB            T                         14 NO  P14                                            0 TO_DATE(' 2016-02-01 00:00:00', )                     83 DISABLED
ANBOB            T                         15 NO  P15                                            0 TO_DATE(' 2016-03-01 00:00:00', )                     83 DISABLED
ANBOB            T                         16 NO  P16                                            0 TO_DATE(' 2016-04-01 00:00:00', )                     83 DISABLED
ANBOB            T                         17 NO  P17                                            0 TO_DATE(' 2016-05-01 00:00:00', )                     83 DISABLED
ANBOB            T                         18 NO  P18                                            0 TO_DATE(' 2016-06-01 00:00:00', )                     83 DISABLED
ANBOB            T                         19 NO  P19                                            0 TO_DATE(' 2016-07-01 00:00:00', )                     83 DISABLED
ANBOB            T                         20 NO  P20                                            0 TO_DATE(' 2016-08-01 00:00:00', )                     83 DISABLED

20 rows selected.

note:
现在我们创建了个分区表,有20个分区,insert 了1000000记录, 收集统计信息只收集了global级的。真正的数据都在p9分区,其它分区空。

SQL>  select * from anbob.t ;
1000000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1951471977

--------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  1000K|    25M|  1245   (2)| 00:00:15 |       |       |
|   1 |  PARTITION RANGE ALL|      |  1000K|    25M|  1245   (2)| 00:00:15 |     1 |    20 |
|   2 |   TABLE ACCESS FULL | T    |  1000K|    25M|  1245   (2)| 00:00:15 |     1 |    20 |
--------------------------------------------------------------------------------------------


SQL> select * from anbob.t partition(p18);
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4030076868

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      | 50000 |  1318K|    64   (2)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      | 50000 |  1318K|    64   (2)| 00:00:01 |    18 |    18 |
|   2 |   TABLE ACCESS FULL    | T    | 50000 |  1318K|    64   (2)| 00:00:01 |    18 |    18 |
-----------------------------------------------------------------------------------------------

         
exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'ANBOB' , tabname => 'T',cascade => true,  estimate_percent => dbms_stats.auto_sample_size,method_opt=>'for all columns size 1', granularity => 'ALL', degree => 8,no_invalidate=>false);          

SQL>  select * from anbob.t partition(p18);

Execution Plan
----------------------------------------------------------
Plan hash value: 4030076868

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    74 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |    74 |     2   (0)| 00:00:01 |    18 |    18 |
|   2 |   TABLE ACCESS FULL    | T    |     1 |    74 |     2   (0)| 00:00:01 |    18 |    18 |
-----------------------------------------------------------------------------------------------

note:
可以看那单分区扫描p18时,CBO 估算行数时使用的是global统计信息的num_rows/number of partitions ,那刚好就1000000/20=50000。cost 也几乎是全表招的cost / number of partitions. 如果收集了所有的统计信息时才是正确的cost和估算的有很大差距。

exec dbms_stats.delete_table_stats(ownname => 'ANBOB' , tabname => 'T');

SQL> select * from anbob.t ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1951471977

--------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   900K|    63M|  1243   (2)| 00:00:15 |       |       |
|   1 |  PARTITION RANGE ALL|      |   900K|    63M|  1243   (2)| 00:00:15 |     1 |    20 |
|   2 |   TABLE ACCESS FULL | T    |   900K|    63M|  1243   (2)| 00:00:15 |     1 |    20 |
--------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)
   
SQL> select * from anbob.t partition(p18);

Execution Plan
----------------------------------------------------------
Plan hash value: 4030076868

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    74 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |    74 |     2   (0)| 00:00:01 |    18 |    18 |
|   2 |   TABLE ACCESS FULL    | T    |     1 |    74 |     2   (0)| 00:00:01 |    18 |    18 |
-----------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

Note:
如果统计信息global and partition level都没有,使用的动态采样也能正确的估算数据。

如果是只有global statistics,多分区范围扫描呢?

SQL>  exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'ANBOB' , tabname => 'T',cascade => true,  estimate_percent => dbms_stats.auto_sample_size,method_opt=>'for all columns size 1', granularity => 'GLOBAL', degree => 8,no_invalidate=>false);
PL/SQL procedure successfully completed.

SQL>  select * from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    27 |    84   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    27 |    84   (2)| 00:00:02 |    18 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    27 |    84   (2)| 00:00:02 |    18 |    20 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) SQL> select * from anbob.t where ctime between TO_DATe('2016-4-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    27 |   112   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    27 |   112   (2)| 00:00:02 |    17 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    27 |   112   (2)| 00:00:02 |    17 |    20 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) SQL> select * from anbob.t where ctime between TO_DATe('2016-3-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;
Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    27 |   139   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    27 |   139   (2)| 00:00:02 |    16 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    27 |   139   (2)| 00:00:02 |    16 |    20 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note:
虽然是估算的rows没太大错误,但是cost确发现了个规律每加一个分区约增加cost 28左右.

如果删除统计信息后

SQL>  select /*53*/* from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    74 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    74 |     2   (0)| 00:00:01 |    18 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    74 |     2   (0)| 00:00:01 |    18 |    20 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note
-----
   - dynamic sampling used for this statement (level=2

Note:
动态采样也可以正确的估算COST。

如多分区范围扫描的COST是怎么算的呢?会不会也是从global statistics 计算来的呢? 很好证明,我们把global statistics的rows 和blocks 增加10倍再查询同样的SQL看COST会不会也相应增加?

SQL>  exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'ANBOB' , tabname => 'T',cascade => true,  estimate_percent => dbms_stats.auto_sample_size,method_opt=>'for all columns size 1', granularity => 'GLOBAL', degree => 8,no_invalidate=>false);
PL/SQL procedure successfully completed.

exec dbms_stats.set_table_stats(ownname => 'ANBOB' , tabname => 'T',numrows=>10000000,numblks=>45280);

SQL> select * from anbob.t ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1951471977

--------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    10M|   257M| 12430   (2)| 00:02:30 |       |       |
|   1 |  PARTITION RANGE ALL|      |    10M|   257M| 12430   (2)| 00:02:30 |     1 |    20 |
|   2 |   TABLE ACCESS FULL | T    |    10M|   257M| 12430   (2)| 00:02:30 |     1 |    20 |
--------------------------------------------------------------------------------------------

SQL> select * from anbob.t partition(p18);

Execution Plan
----------------------------------------------------------
Plan hash value: 4030076868

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |   500K|    12M|   623   (2)| 00:00:08 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |   500K|    12M|   623   (2)| 00:00:08 |    18 |    18 |
|   2 |   TABLE ACCESS FULL    | T    |   500K|    12M|   623   (2)| 00:00:08 |    18 |    18 |
-----------------------------------------------------------------------------------------------

SQL>  select * from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-5-31','yyyy-mm-dd') ;
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4030076868

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    27 |   624   (2)| 00:00:08 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |    27 |   624   (2)| 00:00:08 |    18 |    18 |
|*  2 |   TABLE ACCESS FULL    | T    |     1 |    27 |   624   (2)| 00:00:08 |    18 |    18 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2016-05-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

NOTE:
可以发现全分区扫和单分区扫的ROWS和COST都增加了10倍,下面看多分区范围扫描

SQL> select * from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    27 |    84   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    27 |    84   (2)| 00:00:02 |    18 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    27 |    84   (2)| 00:00:02 |    18 |    20 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))    

 SQL> select * from anbob.t where ctime between TO_DATe('2016-4-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    27 |   112   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    27 |   112   (2)| 00:00:02 |    17 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    27 |   112   (2)| 00:00:02 |    17 |    20 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note:
可以看到修改global stats, 不会影响多分区扫的COST,还和上面得到的一致。得出结论这是计算的COST不是利用的GLOBAL statis.

alter table anbob.t add partition p21 values less than (TO_DATE('2016-9-1','yyyy-mm-dd'));
alter table anbob.t add partition p22 values less than (TO_DATE('2016-10-1','yyyy-mm-dd'));
alter table anbob.t add partition p23 values less than (TO_DATE('2016-11-1','yyyy-mm-dd'));
alter table anbob.t add partition p24 values less than (TO_DATE('2016-12-1','yyyy-mm-dd'));
alter table anbob.t add partition p25 values less than (TO_DATE('2017-1-1','yyyy-mm-dd'));
alter table anbob.t add partition p26 values less than (TO_DATE('2017-2-1','yyyy-mm-dd'));
alter table anbob.t add partition p27 values less than (TO_DATE('2017-3-1','yyyy-mm-dd'));
alter table anbob.t add partition p28 values less than (TO_DATE('2017-4-1','yyyy-mm-dd'));
alter table anbob.t add partition p29 values less than (TO_DATE('2017-5-1','yyyy-mm-dd'));
alter table anbob.t add partition p30 values less than (TO_DATE('2017-6-1','yyyy-mm-dd'));

SQL> select /*53*/* from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |     1 |    27 |    84   (2)| 00:00:02 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |     1 |    27 |    84   (2)| 00:00:02 |    18 |    20 |
|*  2 |   TABLE ACCESS FULL      | T    |     1 |    27 |    84   (2)| 00:00:02 |    18 |    20 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2016-07-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note:
和分区数也没有直接关系,如果研究这个问题我们可以做10053 trace计划代价是怎么来的。

SQL> select owner,table_name,PARTITION_NAME,PARTITION_POSITION,NUM_ROWS,BLOCKS,AVG_ROW_LEN from  dba_tab_statistics where table_name='T';

OWNER          TABLE_NAME         PARTITION_NAME       PARTITION_POSITION   NUM_ROWS     BLOCKS AVG_ROW_LEN
-------------- ------------------ -------------------- ------------------ ---------- ---------- -----------
ANBOB          T                                                            10000000      45280          27
ANBOB          T                  P1                                    1
ANBOB          T                  P2                                    2
ANBOB          T                  P3                                    3
ANBOB          T                  P4                                    4
ANBOB          T                  P5                                    5
ANBOB          T                  P6                                    6
ANBOB          T                  P7                                    7
ANBOB          T                  P8                                    8
ANBOB          T                  P9                                    9
ANBOB          T                  P10                                  10
ANBOB          T                  P11                                  11
ANBOB          T                  P12                                  12
ANBOB          T                  P13                                  13
ANBOB          T                  P14                                  14
ANBOB          T                  P15                                  15
ANBOB          T                  P16                                  16
ANBOB          T                  P17                                  17
ANBOB          T                  P18                                  18
ANBOB          T                  P19                                  19
ANBOB          T                  P20                                  20
 
SQL> alter session set events '10053 trace name context forever, level 2';
Session altered.

SQL>  explain plan for
  2   select /*53*/* from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-7-1','yyyy-mm-dd') ;
Explained.

SQL> alter session set events '10053 trace name context off';
Session altered.

SQL> oradebug setmypid
Statement processed.
SQL> oradebug tracefile_name
/oracle/app/oracle/diag/rdbms/im/im1/trace/im1_ora_22398.trc

# 10053 trace file

CPUSPEED - CPU Speed 
IOTFRSPEED - I/O transfer speed
IOSEEKTIM - I/O seek time
SREADTIM - average single block read time
MREADTIM - average multiblock read time
MBRC - average multiblock read count

db_file_multiblock_read_count       = 128
_optimizer_block_size               = 8192
-----------------------------
SYSTEM STATISTICS INFORMATION
-----------------------------
  Using NOWORKLOAD Stats
  CPUSPEEDNW: 1122 millions instructions/sec (default is 100)
  IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
  IOSEEKTIM:  10 milliseconds (default is 10)
  MBRC:       NO VALUE blocks (default is 8)

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T  (Using composite stats)
  (making adjustments for partition skews)
  ORIGINAL VALUES::    #Rows: 10000000  #Blks:  45280  AvgRowLen:  27.00  ChainCnt:  0.00
  PARTITIONS::
  PRUNED: 3
  ANALYZED: 0  UNANALYZED: 3
    #Rows: 10000000  #Blks:  300  AvgRowLen:  27.00  ChainCnt:  0.00

Access path analysis for T
***************************************
SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for T[T] 

  Column (#2): CTIME(
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 1.000000 Min: 2457253 Max: 2457253
  Table: T  Alias: T
    Card: Original: 10000000.000000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
  Access Path: TableScan
    Cost:  84.24  Resp: 84.24  Degree: 0
      Cost_io: 83.00  Cost_cpu: 16712523
      Resp_io: 83.00  Resp_cpu: 16712523
  Best:: AccessPath: TableScan
         Cost: 84.24  Degree: 1  Resp: 84.24  Card: 1.00  Bytes: 0

***************************************

OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]:  T[T]#0
***********************
Best so far:  Table#: 0  cost: 84.2411  card: 1.0000  bytes: 27
***********************
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:2000


SQL> select pname,pval1 from aux_stats$ where sname='SYSSTATS_MAIN';

PNAME                               PVAL1
------------------------------ ----------
CPUSPEEDNW                     1122.16321
IOSEEKTIM                              10
IOTFRSPEED                           4096
SREADTIM
MREADTIM
CPUSPEED                             1177
MBRC
MAXTHR
SLAVETHR

SREADTIM=IOSEEKTIM+db_block_size/IOTFRSPEED
=10+8192/4096
=12

MREADTIM=IOSEEKTIM+MBRC*db_block_size/IOTFRSPEED
=10+8*8192/4096
=26

#MRDS=#blks/MBRC
=300/8

Cost=(I/O COST)+(CPU COST)
=(1+ceil(#MRDS*MREADTIM/SREADTIM))+(cost_cpu/CPUSPEED/1000/SREADTIM)
=(1+ceil(300/8*26/12))+(16712523/1177/1000/12)
=83+1

note:
通过计算我可以确认估算UNANALYZED分区的blocks 使用的是300,而不是45280, 300 是怎么来的呢? 从ORACLE的官方文档找到了答案,是一个默认值。


Handling Missing Statistics

When Oracle encounters a table with missing statistics, Oracle dynamically gathers the necessary statistics needed by the optimizer. However, for certain types of tables, Oracle does not perform dynamic sampling. These include remote tables and external tables. In those cases and also when dynamic sampling has been disabled, the optimizer uses default values for its statistics, shown in Table 15-3 and Table 15-4.

Table 15-3 Default Table Values When Statistics Are Missing
Table Statistic Default Value Used by Optimizer
Cardinality num_of_blocks * (block_size – cache_layer) / avg_row_len
Average row length 100 bytes
Number of blocks 100 or actual value based on the extent map <================
Remote cardinality 2000 rows
Remote average row length 100 bytes

http://docs.oracle.com/cd/B13789_01/server.101/b10752/stats.htm

我再加多查询一个分区试试

SQL> explain plan for
  2       select /*53*/* from anbob.t where ctime between TO_DATe('2016-5-1','yyyy-mm-dd') and TO_DATe('2016-8-1','yyyy-mm-dd')  ;
Explained.

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T  (Using composite stats)
  (making adjustments for partition skews)
  ORIGINAL VALUES::    #Rows: 10000000  #Blks:  45280  AvgRowLen:  27.00  ChainCnt:  0.00
  PARTITIONS::
  PRUNED: 4
  ANALYZED: 0  UNANALYZED: 4
    #Rows: 10000000  #Blks:  400  AvgRowLen:  27.00  ChainCnt:  0.00

NOTE:
#BLKS 的确是默认值100* PRUNED partitions数, 当然也不并是所有的都取100,就像文档里提的100 or actual value based on the extent map, 比如下面如果带上有数据分区,应该参考的就是extent map得来的值。

SQL> select /*53*/* from anbob.t where ctime between TO_DATe('2015-8-1','yyyy-mm-dd') and TO_DATe('2015-9-1','yyyy-mm-dd') ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    10M|   257M|  1293   (3)| 00:00:16 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |    10M|   257M|  1293   (3)| 00:00:16 |     9 |    10 |
|*  2 |   TABLE ACCESS FULL      | T    |    10M|   257M|  1293   (3)| 00:00:16 |     9 |    10 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2015-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))    

 SQL> select /*53*/* from anbob.t where ctime between TO_DATe('2015-8-1','yyyy-mm-dd') and TO_DATe('2015-10-1','yyyy-mm-dd') ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3249057999

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    10M|   257M|  1320   (3)| 00:00:16 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |    10M|   257M|  1320   (3)| 00:00:16 |     9 |    11 |
|*  2 |   TABLE ACCESS FULL      | T    |    10M|   257M|  1320   (3)| 00:00:16 |     9 |    11 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CTIME"<=TO_DATE(' 2015-10-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

summary:
分区表的统计信息分为table level and parition level(先不说subpartition), 当CBO 在计算访问路径的COST时,如果table level and partition 都不存在统计信息,使用动态采样反而采集的数据没有问题,或都的确存在正确的统计信息时也都可以生成正确的执行计划,只有partition 级也不存在问题,但是如果只有table 级没有partition级时,多分区扫描时有可能会错误的利用 每个分区100 blocks生成较低的cost, 就像我们遇到的案例短时间内生成大量的数据到新分区,统计信息还没来的及收集, 而没有使用上正确的索引,后期我们是手动提前set 统计信息然后lock, 问题没有在发生。 具体案例具体分析,最好用数据说话。

 

— 致敬Thomas Kyte

打赏

对不起,这篇文章暂时关闭评论。