YashanDB数据库中 Scalar-subquery (标量子查询)缓存

之前写过一篇《Oracle数据库中 Scalar-subquery 缓存和 DETERMINISTIC Function》记录了在oracle中标量子查询时,如果子表的相同的关连值时,可以利用cache,减少运行时的执行调用,PostgreSQL 在V14(14.4) 引入了Memoize 新特性也是类似的优化,在Opengauss系目前还不支持,这种测试在Gauss会查询转换为left out join,最近测试了两天的YashanDB是否支持SS cache,因为没有monitor hint和starts列,比较费时。简单记录。

创建测试数据表

drop table test_ec1;
drop table test_ec2;

create table test_ec1 as select * from dba_objects;
create table test_ec2 as select * from test_ec1;

BEGIN
for i in 1..10000 LOOP
 BEGIN
   insert into test_ec2 select * from test_ec1; 
 end;
end loop;
end;
/

create index idx_ec1_id on test_ec1(object_id);

SQL> select count(*) from test_ec1;

             COUNT(*)
---------------------
                 2349

SQL> select count(*) from test_ec2;

             COUNT(*)
---------------------
             25864839

SQL> select owner,count(*) from test_ec2 group by owner;

OWNER                                                                         COUNT(*)
---------------------------------------------------------------- ---------------------
SYS                                                                           16703687
MDSYS                                                                            88088
XA_SYS                                                                           33033
ANBOB                                                                            11011
PUBLIC                                                                         9029020

测试标量子查询

SQL> select * from v$version;

BANNER                                                           VERSION_NUMBER
---------------------------------------------------------------- -----------------
Enterprise Edition Release 23.5.1.100 x86_64                     23.5.1.100

 SQL> show parameter cache

NAME                                                             VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
SCOL_CACHEABLE_SCAN_ROWS                                         18446744073709551615
SCOL_DISK_CACHEABLE_SCAN_ROWS                                    18446744073709551615
LSC_DICTIONARY_CACHE_AUTOEXTEND                                  FALSE
SCOL_WRITE_CACHE_POLICY                                          WRITE_BACK
ENABLE_DISKCACHE                                                 FALSE
DISKCACHE_ROOT                                                   ?/cache/diskcache
DISKCACHE_CAPACITY                                               100M

SQL> select  object_id,owner from test_ec1 where rownum<=10;

            OBJECT_ID OWNER
--------------------- ----------------------------------------------------------------
                    0 SYS
                    1 SYS
                    2 SYS
                    3 SYS
                    4 SYS
                    5 SYS
                    6 SYS
                    7 SYS
                    8 SYS
                    9 SYS

SQL> set autotrace trace exp
SQL> alter session set STATISTICS_LEVEL=all;  

SQL> select  object_id,(select count(*) from test_ec2 e2 where e1.owner=e2.owner ) cnt from test_ec1 e1 where object_id in(0,1);

Execution Plan
----------------------------------------------------------------
SQL hash value: 1123266897
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         2|             |        43|         3|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         2|             |  67198562|         4|         0|         0|                                |
|* 2 |   TABLE ACCESS FULL (AGGR PUSHED)| TEST_EC2             | SYS        |         1|         2|      147( 0)|  67198551|         4|         0|         0|                                |
|  3 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |       201|          |        1( 0)|          |          |          |          |                                |
|* 4 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |       201|         2|        1( 0)|        41|         3|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   2 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   4 - Predicate : access("E1"."OBJECT_ID" IN (0[OPTMZ-0], 1[OPTMZ-0]))

Note: yashandb 没有starts ,loop 列的值不明白是什么意思,下面测试loops 区别

SQL> select  object_id,owner from test_ec1 e1 where object_id in(0);

Execution Plan
----------------------------------------------------------------
SQL hash value: 592253244
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         1|             |        27|         2|         0|         0|                                |
|  1 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |       101|          |        1( 0)|          |          |          |          |                                |
|* 2 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |       101|         1|        1( 0)|        26|         2|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

SQL>
select  object_id,owner from test_ec1 e1 where object_id in(0,1); 

Execution Plan
----------------------------------------------------------------
SQL hash value: 175903804
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         2|             |        25|         3|         0|         0|                                |
|  1 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |       201|          |        1( 0)|          |          |          |          |                                |
|* 2 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |       201|         2|        1( 0)|        24|         3|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Note: 似乎增加一个值loops +1,既然没有执行次数的值,下面我们使用执行时间判断是否有使用cache。

SQL> set timing on

SQL> select  object_id,(select count(*) from test_ec2 e2 where e1.owner=e2.owner ) cnt from test_ec1 e1 
where object_id in(0);

Execution Plan
----------------------------------------------------------------
SQL hash value: 4252418592
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         1|             |        30|         2|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         1|             |  33863621|         2|         0|         0|                                |
|* 2 |   TABLE ACCESS FULL (AGGR PUSHED)| TEST_EC2             | SYS        |         1|         1|      147( 0)|  33863616|         2|         0|         0|                                |
|  3 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |       101|          |        1( 0)|          |          |          |          |                                |
|* 4 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |       101|         1|        1( 0)|        28|         2|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   2 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   4 - Predicate : access("E1"."OBJECT_ID" = 0[OPTMZ-0])


Elapsed: 00:00:33.873

SQL> select  object_id,(select count(*) from test_ec2 e2 where e1.owner=e2.owner ) cnt from test_ec1 e1 
where object_id in(0,1);

Execution Plan
----------------------------------------------------------------
SQL hash value: 1123266897
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         2|             |        43|         3|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         2|             |  67198562|         4|         0|         0|                                |
|* 2 |   TABLE ACCESS FULL (AGGR PUSHED)| TEST_EC2             | SYS        |         1|         2|      147( 0)|  67198551|         4|         0|         0|                                |
|  3 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |       201|          |        1( 0)|          |          |          |          |                                |
|* 4 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |       201|         2|        1( 0)|        41|         3|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   2 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   4 - Predicate : access("E1"."OBJECT_ID" IN (0[OPTMZ-0], 1[OPTMZ-0]))

Elapsed: 00:01:07.208

SQL> select  object_id,(select count(*) from test_ec2 e2 where e1.owner=e2.owner ) cnt from test_ec1 e1 
where object_id in(0,1,2);

Execution Plan
----------------------------------------------------------------
SQL hash value: 1807856721
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         3|             |        62|         4|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         3|             | 100824329|         6|         0|         0|                                |
|* 2 |   TABLE ACCESS FULL (AGGR PUSHED)| TEST_EC2             | SYS        |         1|         3|      147( 0)| 100824313|         6|         0|         0|                                |
|  3 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |       300|          |        1( 0)|          |          |          |          |                                |
|* 4 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |       300|         3|        1( 0)|        61|         4|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   2 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   4 - Predicate : access("E1"."OBJECT_ID" IN (0[OPTMZ-0], 1[OPTMZ-0], 2[OPTMZ-0]))


Elapsed: 00:01:40.835

Note: 执行计划是使用了标量子查询,和带了”AGGR PUSHED”, 这是YASHANDB 对于子查询有聚合参数的一种优化方式。

但是从传入的值 0 1 2对应的都是SYS, 如果有cache 那三次执行应该接近,但是目前看恰恰成倍的增长,一值33.8s 二值67.2s 三值100.8s。 这难道是崖山不支持 标量子查询cache? 不是的,原因是”AGGR PUSHED”和cache 优化 不兼容。

目前没有参数和直接hint 可以禁用这个执行计划优化,这是基于cost代码模型评估,如果想干预的话可以使用selectivity hint. 下面继续。

SQL> select  object_id,(select  count(*) from test_ec2 e2 where e1.owner=e2.owner SELECTIVITY 0.0001  ) cnt from test_ec1 e1 where object_id in(0);

Execution Plan
----------------------------------------------------------------
SQL hash value: 2611185001
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         1|             |        34|         2|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         1|             |  35837090|         2|         0|         0|                                |
|  2 |   AGGREGATE                    |                      |            |         1|         1|  1076561( 0)|  35837087|         2|         0|         0|                                |
|* 3 |    TABLE ACCESS FULL           | TEST_EC2             | SYS        |      2587|  16703687|  1076561( 0)|  34854974|  16703688|         0|         0|                                |
|  4 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |         1|          |        1( 0)|          |          |          |          |                                |
|* 5 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |         1|         1|        1( 0)|        33|         2|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   3 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   5 - Predicate : access("E1"."OBJECT_ID" = 0[OPTMZ-0])

Elapsed: 00:00:35.849
SQL> select  object_id,(select  count(*) from test_ec2 e2 where e1.owner=e2.owner SELECTIVITY 0.0001  ) cnt from test_ec1 e1 
where object_id in(0,1);

Execution Plan
----------------------------------------------------------------
SQL hash value: 2609558636
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         2|             |        41|         3|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         1|             |  40668578|         2|         0|         0|                                |
|  2 |   AGGREGATE                    |                      |            |         1|         1|  1076561( 0)|  40668572|         2|         0|         0|                                |
|* 3 |    TABLE ACCESS FULL           | TEST_EC2             | SYS        |      2587|  16703687|  1076561( 0)|  39687209|  16703688|         0|         0|                                |
|  4 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |         2|          |        1( 0)|          |          |          |          |                                |
|* 5 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |         2|         2|        1( 0)|        41|         3|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   1 - Subquery NDV info - NDV percentage: 0.500000, NDV Expression: ("E1"."OWNER"[OPTMZ-3])
   3 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   5 - Predicate : access("E1"."OBJECT_ID" IN (0[OPTMZ-0], 1[OPTMZ-0]))

Elapsed: 00:00:40.679
SQL>  select  object_id,(select  count(*) from test_ec2 e2 where e1.owner=e2.owner SELECTIVITY 0.0001  ) cnt from test_ec1 e1
 where object_id in(0,1,2);

Execution Plan
----------------------------------------------------------------
SQL hash value: 1496909357
Optimizer: ADOPT_C

+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
| Id | Operation type                 | Name                 | Owner      | E - Rows | A - Rows | Cost(%CPU)  | A - Time | Loops    | Memory   | Disk     | Partition info                 |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
|  0 | SELECT STATEMENT               |                      |            |          |         3|             |        50|         4|         0|         0|                                |
|  1 |  SUBQUERY                      | QUERY[1]             |            |          |         1|             |  40908022|         2|         0|         0|                                |
|  2 |   AGGREGATE                    |                      |            |         1|         1|  1076561( 0)|  40908018|         2|         0|         0|                                |
|* 3 |    TABLE ACCESS FULL           | TEST_EC2             | SYS        |      2587|  16703687|  1076561( 0)|  39923911|  16703688|         0|         0|                                |
|  4 |  TABLE ACCESS BY INDEX ROWID   | TEST_EC1             | SYS        |         3|          |        1( 0)|          |          |          |          |                                |
|* 5 |   INDEX RANGE SCAN             | IDX_EC1_ID           | SYS        |         3|         3|        1( 0)|        47|         4|         0|         0|                                |
+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

Operation Information (identified by operation id):
---------------------------------------------------

   1 - Subquery NDV info - NDV percentage: 0.333333, NDV Expression: ("E1"."OWNER"[OPTMZ-3])
   3 - Predicate : filter("E1"."OWNER"[OPTMZ-3][OPTMZ-2] = "E2"."OWNER"[OPTMZ-1])
   5 - Predicate : access("E1"."OBJECT_ID" IN (0[OPTMZ-0], 1[OPTMZ-0], 2[OPTMZ-0]))

Elapsed: 00:00:40.922

Note: 现在的执行计划不再是aggr pushed,而是又增加了一层算子AGGREGATE , 但是从响应时间上看应该是用上了cache后的,一值 35s,二值40s,三值 40s, 减少了相同值的重复调用。

结论:

YashanDB是支持标量子查询的cache功能的,但是在使用“aggr pushed”优化时不会。可以使用selectivity hint干预cost达到目的。

— over–