最近有客户在oracle迁移到Highgo数据库后,有些SQL运行时间timeout而中止,然后应用就开始调中间件的timeout时间如增加socketTimeout或removeAbandonedTimeout, 这是治标不治本的方法,主要是因为SQL时间变长了,查看了SQL 也是个简单的not in( …) 子查询。 这种SQL其实在oracle是有自动优化的,但PG没有, 所以应该是影响PG系的所有DB. 下面我构建个测试环境。
先看Oracle
SQL> create table tobj as select object_id,object_name,object_type,owner from dba_objects;
Table created.
SQL> create table tobj1 as select object_id,object_name,object_type,owner from dba_objects where rownum<=10;
Table created.
SQL> insert into tobj values(null,'anbob','a','b');
1 row created.
SQL> insert into tobj1 values(null,'anbob','a','b');
1 row created.
SQL> commit;
Commit complete.
SQL> @gts tobj
Gather Table Statistics for table tobj...
PL/SQL procedure successfully completed.
SQL> @gts tobj1
Gather Table Statistics for table tobj1...
PL/SQL procedure successfully completed.
SQL> select * from tobj where object_id not in(select object_id from tobj1);
Note: 这是个not in的子查询,注意这种有个陷阱, not in (子查询中不能有‘null’) ,否则返回空结果,所以上面我在2个表中都insert ‘null‘ 值。
not in (…) 子查询NULL 值陷阱
SQL> select count(*) from tobj;
COUNT(*)
----------
73955
SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 ) ;
COUNT(*)
----------
0
SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 where object_id is not null) ;
COUNT(*)
----------
73942
SQL> select count(*) from tobj where not exists(select 1 from tobj1 where tobj.object_id=tobj1.object_id);
COUNT(*)
----------
73945
SQL> select count(*) from tobj where object_id is null;
COUNT(*)
----------
3
Elapsed: 00:00:00.01
SQL> select count(*) from tobj1 where object_id is null;
COUNT(*)
----------
1
所以严谨期间,除非子查询不存在null 或表上有not null 约束,否则不建议使用”not in (…)”.
not in的执行计划
SQL> alter session set "_optimizer_null_aware_antijoin" =true;
Session altered.
Elapsed: 00:00:00.00
SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 where object_id is not null) ;
COUNT(*)
----------
73942
Elapsed: 00:00:00.01
SQL> @x2
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3412522038
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 168 (1)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | HASH JOIN RIGHT ANTI SNA| | 73942 | 577K| 168 (1)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | TOBJ1 | 10 | 30 | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | TOBJ | 73955 | 361K| 165 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"="OBJECT_ID")
3 - filter("OBJECT_ID" IS NOT NULL)
17 rows selected.
Elapsed: 00:00:00.02
SQL> alter session set "_optimizer_null_aware_antijoin" =false;
Session altered.
Elapsed: 00:00:00.01
SQL> select count(*) from tobj where object_id not in(select object_id from tobj1 where object_id is not null) ;
COUNT(*)
----------
73942
Elapsed: 00:00:00.28
SQL> @x2
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1368431935
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 126K (1)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | FILTER | | | | | |
| 3 | TABLE ACCESS FULL| TOBJ | 73955 | 361K| 166 (1)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| TOBJ1 | 2 | 6 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( NOT EXISTS (SELECT 0 FROM "TOBJ1" "TOBJ1" WHERE
"OBJECT_ID" IS NOT NULL AND LNNVL("OBJECT_ID"<>:B1)))
4 - filter("OBJECT_ID" IS NOT NULL AND LNNVL("OBJECT_ID"<>:B1))
18 rows selected.
SQL> select banner from v$version;
BANNER
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Oracle AI Database 26ai Enterprise Edition Release 23.26.1.0.0 - Production
SQL> @pd "null%anti"
Show all parameters and session values from x$ksppi/x$ksppcv...
NUM N_HEX NAME VALUE DESCRIPTION
---------- ---------- -------------------------------------------------------- ------------------------------ ---------------------------------------------------------------------------------------------------
5054 13BE _optimizer_null_aware_antijoin TRUE null-aware antijoin parameter
Note: oracle允许程序员写低效的SQL, 而oracle优化器会自动查询转换为更高效的等价SQL, 这也是现在其他数据库一直追逐或短时间模拟不来的。 对于这个场景是有参数“_optimizer_null_aware_antijoin”控制,默认是开启,早期11G存在一些bug建议禁用,但到26ai 还默认还是启用的所以还是利大于弊。 而如果你源库刚好是默认参数,那not in这类SQL在oracle 可以转换为和not exists一样使用antijoin反连接。
为什么antijoin反连接要比filter快?
Antijoin(反连接)是一种关系代数操作,用于返回左表中在右表中没有匹配记录的所有行。Anti Join 通常比 Filter 效率高,核心区别在于:Anti Join 是一种“基于集合”的操作,而 Filter 是一种“基于行”的操作。
如select * from a where id not in (select id from b)
- Filter方式
这种方式在概念上类似于一个嵌套循环。数据库优化器如果选择Filter(或称为 Subquery Scan),它会这样执行,遍历从A表读取每一行记录,去子查询b表执行,判断过滤。它是一个逐行过滤的过程。
- AntiJoin方式
ANti Join执行计划主要有2种,hash anti join和merge anti join, 区别是merge anti join是当数据已经有序或有索引时。
Hash anti join它是这样执行,从库里扫描B表一次,读取所有id值 在内存构建hash table, 再扫描A表一次,对A表中的每一行使用id值去hash table匹配。所以效率更快。就是会消耗一些内存。
在HighgoDB中
下面我们在highgo数据库构建测试数据,测试上面的SQL.
highgo=# create table tobj as select * from pg_class;
SELECT 683
highgo=# insert into tobj select * from tobj;
INSERT 0 683
highgo=# insert into tobj select * from tobj;
INSERT 0 1366
highgo=# \watch 1
Wed 03 Jun 2026 06:19:48 PM CST (every 1s)
INSERT 0 2732
Wed 03 Jun 2026 06:19:49 PM CST (every 1s)
INSERT 0 5464
Wed 03 Jun 2026 06:19:50 PM CST (every 1s)
INSERT 0 10928
Wed 03 Jun 2026 06:19:51 PM CST (every 1s)
INSERT 0 21856
Wed 03 Jun 2026 06:19:52 PM CST (every 1s)
INSERT 0 43712
Wed 03 Jun 2026 06:19:54 PM CST (every 1s)
INSERT 0 87424
^C
highgo=# select count(*) from tobj;
count
--------
174848
(1 row)
highgo=# create table tobj1 as select * from pg_class limit 10;
SELECT 10
highgo=# insert into tobj(oid,relname) values(null,'anbob');
INSERT 0 1
highgo=# insert into tobj1(oid,relname) values(null,'anbob');
INSERT 0 1
highgo=# explain select * from tobj where oid not in(select oid from tobj1);
QUERY PLAN
----------------------------------------------------------------
Seq Scan on tobj (cost=13.75..6679.35 rows=87424 width=515)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on tobj1 (cost=0.00..13.00 rows=300 width=4)
(4 rows)
highgo=# explain select * from tobj t where not exists (select oid from tobj1 t1 where t.oid=t1.oid);
QUERY PLAN
-----------------------------------------------------------------------
Hash Anti Join (cost=16.75..10555.09 rows=98048 width=515)
Hash Cond: (t.oid = t1.oid)
-> Seq Scan on tobj t (cost=0.00..6228.48 rows=174848 width=515)
-> Hash (cost=13.00..13.00 rows=300 width=4)
-> Seq Scan on tobj1 t1 (cost=0.00..13.00 rows=300 width=4)
(5 rows)
Note: 我们先看执行计划not in 走的是filter, 而not exists走的是hash anti join.
highgo=# alter table tobj rename oid to id;
ALTER TABLE
Time: 32.464 ms
highgo=# alter table tobj1 rename oid to id;
ALTER TABLE
Time: 10.815 ms
highgo=# explain select * from tobj where id not in(select id from tobj1 where id is not null);
QUERY PLAN
----------------------------------------------------------------
Seq Scan on tobj (cost=13.74..6679.34 rows=87424 width=515)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on tobj1 (cost=0.00..13.00 rows=298 width=4)
Filter: (id IS NOT NULL)
(5 rows)
Time: 0.538 ms
highgo=# explain select * from tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id and t1.id is not null) ;
QUERY PLAN
-----------------------------------------------------------------------
Hash Anti Join (cost=16.73..10541.05 rows=98560 width=515)
Hash Cond: (t.id = t1.id)
-> Seq Scan on tobj t (cost=0.00..6228.48 rows=174848 width=515)
-> Hash (cost=13.00..13.00 rows=298 width=4)
-> Seq Scan on tobj1 t1 (cost=0.00..13.00 rows=298 width=4)
Filter: (id IS NOT NULL)
(6 rows)
highgo=# select count(*) from tobj where id not in(select id from tobj1 );
count
-------
0
(1 row)
highgo=# select count(*) from tobj where id not in(select id from tobj1 where id is not null);
count
--------
172288
(1 row)
highgo=# select count(*) from tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id and t1.id is not null) ;
count
--------
172289
(1 row)
highgo=# delete tobj1 where id is null;
DELETE 1
Time: 11.368 ms
highgo=# select count(*) from tobj where id not in(select id from tobj1 );
count
--------
172288
(1 row)
Note: 在postgresql中同样有not in (子查询有null) 无返回的限制。
highgo=# create index on tobj(id);
CREATE INDEX
Time: 149.668 ms
highgo=# create index on tobj1(id);
CREATE INDEX
Time: 23.598 ms
highgo=# select count(*) from tobj where id not in(select id from tobj1 );
count
--------
172288
(1 row)
Time: 33.471 ms
highgo=# explain select count(*) from tobj where id not in(select id from tobj1 );
QUERY PLAN
------------------------------------------------------------------------------------------
Aggregate (cost=3899.84..3899.85 rows=1 width=8)
-> Index Only Scan using tobj_id_idx on tobj (cost=1.42..3681.28 rows=87424 width=0)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on tobj1 (cost=0.00..1.10 rows=10 width=4)
(5 rows)
Time: 0.552 ms
highgo=# select count(*) from tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id);
count
--------
172289
(1 row)
Time: 20.980 ms
highgo=# explain select count(*) from tobj t where not exists (select 1 from tobj1 t1 where t.id=t1.id);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=4050.05..4050.06 rows=1 width=8)
-> Gather (cost=4049.93..4050.04 rows=1 width=8)
Workers Planned: 1
-> Partial Aggregate (cost=3049.93..3049.94 rows=1 width=8)
-> Merge Anti Join (cost=1.56..2796.57 rows=101346 width=0)
Merge Cond: (t.id = t1.id)
-> Parallel Index Only Scan using tobj_id_idx on tobj t (cost=0.29..2523.06 rows=102852 width=4)
-> Sort (cost=1.27..1.29 rows=10 width=4)
Sort Key: t1.id
-> Seq Scan on tobj1 t1 (cost=0.00..1.10 rows=10 width=4)
(10 rows)
Time: 0.616 ms
Note: 创建索引后使用merge anti join, 当然响应时间 都要比filter快。
总结
使用not in (…) 子查询中不能出现null值,称为null值陷阱,NOT IN 子查询结果中只要包含一个NULL值,整个查询返回空结果,且不报错。 如果确认没有null值后,在性能方面在oracle 对于Not in可以使用anti join自动查询转换,但是在postgresql系做不到,导致使用filter,性能变慢。优化方案建议改写为not exists, 这也是好多开发规范要求用not exists替not in 原因.