Oracle迁移PostgreSQL系性能问题:not in

最近有客户在oracle迁移到Highgo数据库后,有些SQL运行时间timeout而中止,然后应用就开始调中间件的timeout时间如增加socketTimeout或removeAbandonedTimeout, 这是治标不治本的方法,主要是因为SQL时间变长了,查看了SQL 也是个简单的not in( …) 子查询。 这种SQL其实在oracle是有自动优化的,但PG没有, 所以应该是影响PG系的所有DB. 下面我构建个测试环境。

先看Oracle

SQL> create table tobj as select object_id,object_name,object_type,owner from dba_objects;
Table created.


SQL> create table tobj1 as select object_id,object_name,object_type,owner from dba_objects where rownum<=10;
Table created.

SQL> insert into tobj values(null,'anbob','a','b');
1 row created.

SQL>  insert into tobj1 values(null,'anbob','a','b');
1 row created.

SQL> commit;
Commit complete.

SQL> @gts tobj
Gather Table Statistics for table tobj...

PL/SQL procedure successfully completed.

SQL> @gts tobj1
Gather Table Statistics for table tobj1...

PL/SQL procedure successfully completed.

SQL> select * from tobj where object_id not in(select object_id from tobj1);

Note: 这是个not in的子查询,注意这种有个陷阱, not in (子查询中不能有‘null’) ,否则返回空结果,所以上面我在2个表中都insert ‘null‘ 值。

not in (…) 子查询NULL 值陷阱

SQL> select count(*) from tobj;

  COUNT(*)
----------
     73955

SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 ) ;

  COUNT(*)
----------
         0

SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 where object_id is not null) ;

  COUNT(*)
----------
     73942

SQL> select count(*) from tobj where not exists(select 1 from tobj1 where tobj.object_id=tobj1.object_id);

  COUNT(*)
----------
     73945

SQL> select count(*) from tobj where object_id is null;

  COUNT(*)
----------
         3

Elapsed: 00:00:00.01
SQL> select count(*) from tobj1  where object_id is null;

  COUNT(*)
----------
         1

所以严谨期间,除非子查询不存在null 或表上有not null 约束,否则不建议使用”not in (…)”.

not in的执行计划

SQL> alter session  set "_optimizer_null_aware_antijoin" =true;

Session altered.

Elapsed: 00:00:00.00
SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 where object_id is not null) ;

  COUNT(*)
----------
     73942

Elapsed: 00:00:00.01
SQL> @x2

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3412522038

-----------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       |     1 |     8 |   168   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE           |       |     1 |     8 |            |          |
|*  2 |   HASH JOIN RIGHT ANTI SNA|       | 73942 |   577K|   168   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL      | TOBJ1 |    10 |    30 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL      | TOBJ  | 73955 |   361K|   165   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"="OBJECT_ID")
   3 - filter("OBJECT_ID" IS NOT NULL)

17 rows selected.

Elapsed: 00:00:00.02
SQL> alter session  set "_optimizer_null_aware_antijoin" =false;

Session altered.

Elapsed: 00:00:00.01
SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 where object_id is not null) ;

  COUNT(*)
----------
     73942

Elapsed: 00:00:00.28
SQL> @x2

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1368431935

-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |     5 |   126K  (1)| 00:00:05 |
|   1 |  SORT AGGREGATE     |       |     1 |     5 |            |          |
|*  2 |   FILTER            |       |       |       |            |          |
|   3 |    TABLE ACCESS FULL| TOBJ  | 73955 |   361K|   166   (1)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| TOBJ1 |     2 |     6 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter( NOT EXISTS (SELECT 0 FROM "TOBJ1" "TOBJ1" WHERE
              "OBJECT_ID" IS NOT NULL AND LNNVL("OBJECT_ID"<>:B1)))
   4 - filter("OBJECT_ID" IS NOT NULL AND LNNVL("OBJECT_ID"<>:B1))

18 rows selected.

SQL> select banner from v$version;

BANNER
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Oracle AI Database 26ai Enterprise Edition Release 23.26.1.0.0 - Production

SQL> @pd "null%anti"
Show all parameters and session values from x$ksppi/x$ksppcv...

       NUM N_HEX      NAME                                                     VALUE                          DESCRIPTION
---------- ---------- -------------------------------------------------------- ------------------------------ ---------------------------------------------------------------------------------------------------
      5054  13BE      _optimizer_null_aware_antijoin                           TRUE                           null-aware antijoin parameter

Note: oracle允许程序员写低效的SQL, 而oracle优化器会自动查询转换为更高效的等价SQL, 这也是现在其他数据库一直追逐或短时间模拟不来的。 对于这个场景是有参数“_optimizer_null_aware_antijoin”控制,默认是开启,早期11G存在一些bug建议禁用,但到26ai 还默认还是启用的所以还是利大于弊。 而如果你源库刚好是默认参数,那not in这类SQL在oracle 可以转换为和not exists一样使用antijoin反连接。

为什么antijoin反连接要比filter快?

Antijoin(反连接)是一种关系代数操作,用于返回左表中在右表中没有匹配记录的所有行。Anti Join 通常比 Filter 效率高,核心区别在于:Anti Join 是一种“基于集合”的操作,而 Filter 是一种“基于行”的操作。

如select * from a where id not in (select id from b)

  • Filter方式

这种方式在概念上类似于一个嵌套循环。数据库优化器如果选择Filter(或称为 Subquery Scan),它会这样执行,遍历从A表读取每一行记录,去子查询b表执行,判断过滤。它是一个逐行过滤的过程。

  • AntiJoin方式

ANti Join执行计划主要有2种,hash anti join和merge anti join, 区别是merge anti join是当数据已经有序或有索引时。

Hash anti join它是这样执行,从库里扫描B表一次,读取所有id值 在内存构建hash table, 再扫描A表一次,对A表中的每一行使用id值去hash table匹配。所以效率更快。就是会消耗一些内存。

在HighgoDB中

下面我们在highgo数据库构建测试数据,测试上面的SQL.

highgo=# create table tobj as select * from pg_class;
SELECT 683

highgo=# insert into tobj  select * from tobj;
INSERT 0 683
highgo=# insert into tobj  select * from tobj;
INSERT 0 1366
highgo=# \watch 1
Wed 03 Jun 2026 06:19:48 PM CST (every 1s)

INSERT 0 2732

Wed 03 Jun 2026 06:19:49 PM CST (every 1s)

INSERT 0 5464

Wed 03 Jun 2026 06:19:50 PM CST (every 1s)

INSERT 0 10928

Wed 03 Jun 2026 06:19:51 PM CST (every 1s)

INSERT 0 21856

Wed 03 Jun 2026 06:19:52 PM CST (every 1s)

INSERT 0 43712

Wed 03 Jun 2026 06:19:54 PM CST (every 1s)

INSERT 0 87424

^C
highgo=# select count(*) from tobj;
 count
--------
 174848
(1 row)

highgo=# create table tobj1 as select * from pg_class limit 10;
SELECT 10

highgo=# insert into tobj(oid,relname) values(null,'anbob');
INSERT 0 1
highgo=# insert into tobj1(oid,relname) values(null,'anbob');
INSERT 0 1
highgo=# explain select * from tobj where oid not in(select oid from tobj1);
                           QUERY PLAN
----------------------------------------------------------------
 Seq Scan on tobj  (cost=13.75..6679.35 rows=87424 width=515)
   Filter: (NOT (hashed SubPlan 1))
   SubPlan 1
     ->  Seq Scan on tobj1  (cost=0.00..13.00 rows=300 width=4)
(4 rows)

highgo=# explain select * from tobj t where not exists (select oid from tobj1 t1 where t.oid=t1.oid);
                              QUERY PLAN
-----------------------------------------------------------------------
 Hash Anti Join  (cost=16.75..10555.09 rows=98048 width=515)
   Hash Cond: (t.oid = t1.oid)
   ->  Seq Scan on tobj t  (cost=0.00..6228.48 rows=174848 width=515)
   ->  Hash  (cost=13.00..13.00 rows=300 width=4)
         ->  Seq Scan on tobj1 t1  (cost=0.00..13.00 rows=300 width=4)
(5 rows)

Note: 我们先看执行计划not in 走的是filter, 而not exists走的是hash anti join.

highgo=# alter table tobj rename oid to id;
ALTER TABLE
Time: 32.464 ms
highgo=# alter table tobj1 rename oid to id;
ALTER TABLE
Time: 10.815 ms

highgo=# explain select * from tobj where id not in(select id from tobj1 where id is not null);
                           QUERY PLAN
----------------------------------------------------------------
 Seq Scan on tobj  (cost=13.74..6679.34 rows=87424 width=515)
   Filter: (NOT (hashed SubPlan 1))
   SubPlan 1
     ->  Seq Scan on tobj1  (cost=0.00..13.00 rows=298 width=4)
           Filter: (id IS NOT NULL)
(5 rows)

Time: 0.538 ms
highgo=# explain select * from tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id and t1.id is not null) ;
                              QUERY PLAN
-----------------------------------------------------------------------
 Hash Anti Join  (cost=16.73..10541.05 rows=98560 width=515)
   Hash Cond: (t.id = t1.id)
   ->  Seq Scan on tobj t  (cost=0.00..6228.48 rows=174848 width=515)
   ->  Hash  (cost=13.00..13.00 rows=298 width=4)
         ->  Seq Scan on tobj1 t1  (cost=0.00..13.00 rows=298 width=4)
               Filter: (id IS NOT NULL)
(6 rows)

highgo=# select count(*) from tobj where id not in(select id from tobj1 );
 count
-------
     0
(1 row)

highgo=# select count(*) from tobj where id not in(select id from tobj1 where id is not null);
 count
--------
 172288
(1 row)


highgo=#  select count(*) from  tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id and t1.id is not null) ;
 count
--------
 172289
(1 row)

highgo=# delete tobj1 where id is null;
DELETE 1
Time: 11.368 ms
highgo=# select count(*) from tobj where id not in(select id from tobj1 );
 count
--------
 172288
(1 row)

Note: 在postgresql中同样有not in (子查询有null) 无返回的限制。

highgo=# create index on tobj(id);
CREATE INDEX
Time: 149.668 ms
highgo=# create index on tobj1(id);
CREATE INDEX
Time: 23.598 ms


highgo=#  select count(*) from tobj where id not in(select id from tobj1 );
 count
--------
 172288
(1 row)

Time: 33.471 ms
highgo=# explain select count(*) from tobj where id not in(select id from tobj1 );
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=3899.84..3899.85 rows=1 width=8)
   ->  Index Only Scan using tobj_id_idx on tobj  (cost=1.42..3681.28 rows=87424 width=0)
         Filter: (NOT (hashed SubPlan 1))
         SubPlan 1
           ->  Seq Scan on tobj1  (cost=0.00..1.10 rows=10 width=4)
(5 rows)

Time: 0.552 ms
highgo=# select count(*) from  tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id);
 count
--------
 172289
(1 row)

Time: 20.980 ms
highgo=# explain select count(*) from  tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id);
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=4050.05..4050.06 rows=1 width=8)
   ->  Gather  (cost=4049.93..4050.04 rows=1 width=8)
         Workers Planned: 1
         ->  Partial Aggregate  (cost=3049.93..3049.94 rows=1 width=8)
               ->  Merge Anti Join  (cost=1.56..2796.57 rows=101346 width=0)
                     Merge Cond: (t.id = t1.id)
                     ->  Parallel Index Only Scan using tobj_id_idx on tobj t  (cost=0.29..2523.06 rows=102852 width=4)
                     ->  Sort  (cost=1.27..1.29 rows=10 width=4)
                           Sort Key: t1.id
                           ->  Seq Scan on tobj1 t1  (cost=0.00..1.10 rows=10 width=4)
(10 rows)

Time: 0.616 ms

Note: 创建索引后使用merge anti join, 当然响应时间 都要比filter快。

总结

使用not in (…) 子查询中不能出现null值,称为null值陷阱,NOT IN 子查询结果中只要包含一个NULL值,整个查询返回空结果,且不报错。 如果确认没有null值后,在性能方面在oracle 对于Not in可以使用anti join自动查询转换,但是在postgresql系做不到,导致使用filter,性能变慢。优化方案建议改写为not exists, 这也是好多开发规范要求用not exists替not in 原因.

Leave a Comment