首页 » ORACLE 9i-23ai, PostgreSQL/GaussDB » Alert: PostgreSQL JDBC 记得配置Fetch Size

Alert: PostgreSQL JDBC 记得配置Fetch Size

在使用PostgreSQL JDBC处理大型结果集时,正确配置fetch size对于优化性能和内存使用至关重要。最近我们在国产化改造过程中总有一些差异导致应用性能问题,有时不只在数据库还可能在驱动中,如分页查询或其他OLTP场景,我们总希望尽快的返回结果,这里我分享一下Oracle和Postgresql JDBC 默认fetchsize 从服务向客户端发送数据的差异。

Oracle 和PostgreSQL JDBC fetch差异

首先Oracle Jdbc中的resultSet是服务器端的游标,  默认是一个SQL 流式执行,而不是缓冲区,可以从大量结果中读取几行数据返回,而不用等待整个结果完成就开始处理。但是这种可能在Fetch了几批后表不存在了执行报错,或直接MVCC的快照过久,直接ORA-1555, 当然还可能因为没有合理的增加Fetchsize, 结果在服务和客户端频繁的往返传递数据和上下文切换时间。但是在PostgreSQL JDBC中它不是服务器端游标,而是客户端缓存,当执行第一次java中调用 rs.next()时,等整合结果返回到客户端,导致初始响应过长,尤其是应用端使用set对象的分页查询,还可能会占用更大的内存,当然PostgreSQL中可以配置fetchsize 来改善这个场景的性能。

 

PostgreSQL Fetch Size使用

当Jdbc启用自动提交(默认)时,fetch size设置会被忽略,所有行都会被立即获取, 需要禁用自动提交才能使fetch size生效:

conn.setAutoCommit(false);

默认fetchsize 0禁用。defaultRowFetchSize (int) Default 0如果配置了Fetch Size,PostgreSQL JDBC驱动通过服务端游标实现fetch size功能,每个fetch size窗口会在服务器上创建一个临时游标,长时间运行的事务配合大fetch size可能会占用服务器资源,某些连接池可能会影响fetch size行为,推荐的Fetch Size值,中小型结果集:100-1000行,大型结果集:1000-5000行,取决于网络包大小和数据行长,并不是越大越好,需要测试,根据具体使用场景设置合适的fetch size,对于超大数据集,考虑分批处理或分页查询使用SQL limit。

新版JDBC中似乎增加了另一个参数adaptiveFetch (boolean) Default false  ,当使用自适应大小时,” First fetch will have number of rows declared in defaultRowFetchSize. Number of rows can be limited by adaptiveFetchMinimum and adaptiveFetchMaximum.”

实践测试

下面我们使用一段简单的Java 代码对比没有配置Fetchsize时的性能。这里我使用的highgoDB(base on Postgresql V14)

PostgreSQL JDBC 不配置fetch size

-- db 
create table demo(n int);
 insert into demo
  select generate_series from generate_series(1000,1020);
  
create function mytime(n int) returns timestamptz as
$$
begin
 perform pg_sleep(1);
 return clock_timestamp();
end;
$$
language plpgsql;


-- java
import java.io.*;
import java.sql.*;
import java.util.*;
import java.time.*;
//import oracle.jdbc.*;
public class fetchtest {
 public static void println(String text){
  System.out.println(Instant.now().toString() +":" + text);
 }
 
public static void main(String[] args)
 throws SQLException,InterruptedException {
  try (Connection c = (Connection) DriverManager.getConnection(
    args[2],args[0],args[1]) // url, user, password
   ) {
   c.setAutoCommit(false);
   try (PreparedStatement s = c.prepareStatement(
    "select n,mytime(n) t from demo order by 1 desc"
    )) {
   // s.setFetchSize(5);
    Thread.sleep(1000);
    println(" PRS "+s);
    try ( ResultSet rs = s.executeQuery() ) {
     Thread.sleep(1000);
     println(" EXE "+rs);
     while ( rs.next()) {
      Thread.sleep(1000);
      println(" FCH "+rs.getLong("N")+" "+rs.getTimestamp("T"));
     }
     System.out.println("fetch size: "+rs.getFetchSize());
    }
   }
  }
 }
}

Note:
Function是非确认性的在每次执行时会调用,Java 先打印时间戳,初始resultSET显示时间,然后在每fetch数据调用function 显示时间。 每个批次执行中间sleep 1秒。

$ javac fetchtest.java
$ java fetchtest  "anbob" "anbob" "jdbc:postgresql://localhost:1521/anbob"
2025-05-24T09:24:42.948450Z: PRS select n,mytime(n) t from demo
2025-05-24T09:25:26.028755Z: EXE org.postgresql.jdbc.PgResultSet@5ef60048
2025-05-24T09:25:27.046707Z: FCH 1000 2025-05-24 17:24:43.984567
2025-05-24T09:25:28.047332Z: FCH 1001 2025-05-24 17:24:44.985286
2025-05-24T09:25:29.047909Z: FCH 1002 2025-05-24 17:24:45.986356
2025-05-24T09:25:30.048511Z: FCH 1003 2025-05-24 17:24:46.987412
2025-05-24T09:25:31.049044Z: FCH 1004 2025-05-24 17:24:47.988449
2025-05-24T09:25:32.049619Z: FCH 1005 2025-05-24 17:24:48.989487
2025-05-24T09:25:33.050203Z: FCH 1006 2025-05-24 17:24:49.990537
2025-05-24T09:25:34.050832Z: FCH 1007 2025-05-24 17:24:50.991578
2025-05-24T09:25:35.051442Z: FCH 1008 2025-05-24 17:24:51.992778
2025-05-24T09:25:36.051995Z: FCH 1009 2025-05-24 17:24:52.993974
2025-05-24T09:25:37.052536Z: FCH 1010 2025-05-24 17:24:53.995179
2025-05-24T09:25:38.053124Z: FCH 1011 2025-05-24 17:24:54.996372
2025-05-24T09:25:39.053718Z: FCH 1012 2025-05-24 17:24:55.99757
2025-05-24T09:25:40.054267Z: FCH 1013 2025-05-24 17:24:56.998783
2025-05-24T09:25:41.054842Z: FCH 1014 2025-05-24 17:24:57.99999
2025-05-24T09:25:42.055418Z: FCH 1015 2025-05-24 17:24:59.001174
2025-05-24T09:25:43.056041Z: FCH 1016 2025-05-24 17:25:00.002366
2025-05-24T09:25:44.056764Z: FCH 1017 2025-05-24 17:25:01.003552
2025-05-24T09:25:45.057781Z: FCH 1018 2025-05-24 17:25:02.004744
2025-05-24T09:25:46.058731Z: FCH 1019 2025-05-24 17:25:03.005939
2025-05-24T09:25:47.059574Z: FCH 1020 2025-05-24 17:25:04.007058
2025-05-24T09:25:48.060392Z: FCH 1000 2025-05-24 17:25:05.007682
2025-05-24T09:25:49.061197Z: FCH 1001 2025-05-24 17:25:06.008098
2025-05-24T09:25:50.061997Z: FCH 1002 2025-05-24 17:25:07.008665
2025-05-24T09:25:51.062884Z: FCH 1003 2025-05-24 17:25:08.00925
2025-05-24T09:25:52.063716Z: FCH 1004 2025-05-24 17:25:09.009835
2025-05-24T09:25:53.064609Z: FCH 1005 2025-05-24 17:25:10.010371
2025-05-24T09:25:54.065412Z: FCH 1006 2025-05-24 17:25:11.010726
2025-05-24T09:25:55.066281Z: FCH 1007 2025-05-24 17:25:12.011234
2025-05-24T09:25:56.067165Z: FCH 1008 2025-05-24 17:25:13.011771
2025-05-24T09:25:57.068138Z: FCH 1009 2025-05-24 17:25:14.012344
2025-05-24T09:25:58.069092Z: FCH 1010 2025-05-24 17:25:15.012879
2025-05-24T09:25:59.069985Z: FCH 1011 2025-05-24 17:25:16.013391
2025-05-24T09:26:00.070847Z: FCH 1012 2025-05-24 17:25:17.013901
2025-05-24T09:26:01.071667Z: FCH 1013 2025-05-24 17:25:18.014483
2025-05-24T09:26:02.072549Z: FCH 1014 2025-05-24 17:25:19.015061
2025-05-24T09:26:03.073448Z: FCH 1015 2025-05-24 17:25:20.015401
2025-05-24T09:26:04.074326Z: FCH 1016 2025-05-24 17:25:21.015943
2025-05-24T09:26:05.075137Z: FCH 1017 2025-05-24 17:25:22.017047
2025-05-24T09:26:06.076051Z: FCH 1018 2025-05-24 17:25:23.018178
2025-05-24T09:26:07.076869Z: FCH 1019 2025-05-24 17:25:24.019292
2025-05-24T09:26:08.077684Z: FCH 1020 2025-05-24 17:25:25.020406
fetch size: 0

Note:
默认没有fetchsize时(默认0),查询数据2025-05-24T09:24:42 到2025-05-24T09:25:26.028755Z ,一共20条数据用时44秒,所有数据都已缓存,后面每次fetch时间都很快。

PostgreSQL JDBC 配置fetch size

去掉上面java代码中的注释,重新编译, 因为一共20条,我们配置fetchsize 5(注:这不是最佳)

$ vi fetchtest.java
...
s.setFetchSize(5);
...

$ javac fetchtest.java

相同的SQL语句

$ java fetchtest  "anbob" "anbob" "jdbc:postgresql://localhost:1521/anbob"
2025-05-24T09:31:31.762581Z: PRS select n,mytime(n) t from demo
2025-05-24T09:31:37.809352Z: EXE org.postgresql.jdbc.PgResultSet@1d548a08
2025-05-24T09:31:38.832481Z: FCH 1000 2025-05-24 17:31:32.794696
2025-05-24T09:31:39.833394Z: FCH 1001 2025-05-24 17:31:33.795984
2025-05-24T09:31:40.834181Z: FCH 1002 2025-05-24 17:31:34.79716
2025-05-24T09:31:41.834984Z: FCH 1003 2025-05-24 17:31:35.798336
2025-05-24T09:31:42.835784Z: FCH 1004 2025-05-24 17:31:36.799517
2025-05-24T09:31:48.844759Z: FCH 1005 2025-05-24 17:31:43.838961
2025-05-24T09:31:49.845554Z: FCH 1006 2025-05-24 17:31:44.840151
2025-05-24T09:31:50.846379Z: FCH 1007 2025-05-24 17:31:45.841329
2025-05-24T09:31:51.847314Z: FCH 1008 2025-05-24 17:31:46.842517
2025-05-24T09:31:52.848112Z: FCH 1009 2025-05-24 17:31:47.843702
2025-05-24T09:31:58.855430Z: FCH 1010 2025-05-24 17:31:53.849791
2025-05-24T09:31:59.856295Z: FCH 1011 2025-05-24 17:31:54.850988
2025-05-24T09:32:00.857167Z: FCH 1012 2025-05-24 17:31:55.852168
2025-05-24T09:32:01.858011Z: FCH 1013 2025-05-24 17:31:56.853352
2025-05-24T09:32:02.858876Z: FCH 1014 2025-05-24 17:31:57.854533
2025-05-24T09:32:08.866230Z: FCH 1015 2025-05-24 17:32:03.860501
2025-05-24T09:32:09.867171Z: FCH 1016 2025-05-24 17:32:04.86168
2025-05-24T09:32:10.868017Z: FCH 1017 2025-05-24 17:32:05.862904
2025-05-24T09:32:11.868874Z: FCH 1018 2025-05-24 17:32:06.864083
2025-05-24T09:32:12.869728Z: FCH 1019 2025-05-24 17:32:07.865285
2025-05-24T09:32:18.876161Z: FCH 1020 2025-05-24 17:32:13.871456
2025-05-24T09:32:19.876787Z: FCH 1000 2025-05-24 17:32:14.871951
2025-05-24T09:32:20.877376Z: FCH 1001 2025-05-24 17:32:15.873103
2025-05-24T09:32:21.877991Z: FCH 1002 2025-05-24 17:32:16.874269
2025-05-24T09:32:22.878553Z: FCH 1003 2025-05-24 17:32:17.87543
2025-05-24T09:32:28.885330Z: FCH 1004 2025-05-24 17:32:23.880003
2025-05-24T09:32:29.885913Z: FCH 1005 2025-05-24 17:32:24.881158
2025-05-24T09:32:30.886467Z: FCH 1006 2025-05-24 17:32:25.882293
2025-05-24T09:32:31.887046Z: FCH 1007 2025-05-24 17:32:26.883446
2025-05-24T09:32:32.887658Z: FCH 1008 2025-05-24 17:32:27.884618
2025-05-24T09:32:38.895189Z: FCH 1009 2025-05-24 17:32:33.889234
2025-05-24T09:32:39.896115Z: FCH 1010 2025-05-24 17:32:34.890437
2025-05-24T09:32:40.896993Z: FCH 1011 2025-05-24 17:32:35.891635
2025-05-24T09:32:41.897862Z: FCH 1012 2025-05-24 17:32:36.892864
2025-05-24T09:32:42.898686Z: FCH 1013 2025-05-24 17:32:37.894064
2025-05-24T09:32:48.906187Z: FCH 1014 2025-05-24 17:32:43.900453
2025-05-24T09:32:49.907052Z: FCH 1015 2025-05-24 17:32:44.901658
2025-05-24T09:32:50.907865Z: FCH 1016 2025-05-24 17:32:45.902853
2025-05-24T09:32:51.908745Z: FCH 1017 2025-05-24 17:32:46.904053
2025-05-24T09:32:52.909661Z: FCH 1018 2025-05-24 17:32:47.905254
2025-05-24T09:32:55.913102Z: FCH 1019 2025-05-24 17:32:53.91125
2025-05-24T09:32:56.913695Z: FCH 1020 2025-05-24 17:32:54.912391
fetch size: 5

Note:
注意这里2025-05-24T09:31:31到2025-05-24T09:31:37初始用了6秒, 后面的每5行fetch 也基本都是6秒。这种对于快速返回结果相比不指定fetchsize,快了30多秒。

Oracle Ojdbc Fetch Size使用

Oracle JDBC 驱动程序同样提供了fetch size, JDBC 驱动程序发出单行 FETCH 请求时,我们会批量抓取N行数据(fetch size),然后一次返回,和Scalary-subquery cache差不多的目的,有兴趣可以看我另一个blog<Oracle数据库中 Scalar-subquery 缓存和 DETERMINISTIC Function>,fetch size减少少了访问数据库的次数,提高了效率。默认情况下,Fetch Size为每次 10 行。

Oracle8i JDBC Developer’s Guide and Reference

By default, when Oracle JDBC executes a query, it receives the result set 10 rows at a time from the database cursor.

Oracle® Fusion Middleware Java Persistence API (JPA) Extensions Reference for Oracle TopLink

By default, most JDBC drivers use a fetch size of 10. , so if you are reading 1000 objects, increasing the fetch size to 256 can significantly reduce the time required to fetch the query’s results. The optimal fetch size is not always obvious. Usually, a fetch size of one half or one quarter of the total expected result size is optimal.

如何证明是10? oracle还好可诊断的工具完善,做trace跟踪可以看到。

FETCH #1998675021656:c = 0,e = 5,p = 0,cr = 1,cu = 0,mis = 0,r = 10,dep = 0,og = 1,plh = 1575980263,tim = 179310469275
FETCH #1998675021656:c = 0,e = 5,p = 0,cr = 1,cu = 0,mis = 0,r = 10,dep = 0,og = 1,plh = 1575980263,tim = 179310469549
FETCH #1998675021656:c = 0,e = 9,p = 0,cr = 1,cu = 0,mis = 0,r = 10,dep = 0,og = 1,plh = 1575980263,tim = 179310469830

如果增加fetch size到500可以达到增加多达10倍或更明显的性能提升,trace跟踪中可以看到,:

FETCH #1997623590632:c=0,e=38,p=0,cr=8,cu=0,mis=0,r=500,dep=0,og=1,plh=1575980263,tim=179351278248
FETCH #1997623590632:c=0,e=53,p=15,cr=9,cu=0,mis=0,r=500,dep=0,og=1,plh=1575980263,tim=179351278572

我们都知道sql trace是数据库服务端运行,不包含网络传输成本。

但是随着JDBC版本的更新,JDBC的fetch size可能会有变化,前段时间Connor McDonald反馈在测试Oracle 23Ai 的ojdbc驱动时,发现默认已经是自适应大小,初始化10,同一查询后后面增加到了250.

FETCH #1997623585400:c=0,e=2721,p=28,cr=3,cu=0,mis=0,r=10,dep=0,og=1,plh=1575980263,tim=179456214593
FETCH #1997623585400:c=0,e=17,p=0,cr=1,cu=0,mis=0,r=10,dep=0,og=1,plh=1575980263,tim=179456260896
FETCH #1997623585400:c=0,e=7,p=0,cr=1,cu=0,mis=0,r=10,dep=0,og=1,plh=1575980263,tim=179456261293
FETCH #1997623585400:c=0,e=28,p=0,cr=5,cu=0,mis=0,r=250,dep=0,og=1,plh=1575980263,tim=179456261853
FETCH #1997623585400:c=0,e=24,p=0,cr=4,cu=0,mis=0,r=250,dep=0,og=1,plh=1575980263,tim=179456263399

不过这确实符合Oracle 的做法,如同数据库中一样,大量的自适应,如统计信息、执行计划。 建议使用较新的驱动以测试结果为准。

总结:
Postgresql JDBC不同于ORACLE 使用的是客户端缓存,当有大的查询时,建议应用中配置fetchsize,同时记的禁用autocommit, 适用于基于pg系的所有数据库,如highgoDB, Kingbase, 对于opengauss系后面再测。
— over —

打赏

,

目前这篇文章还没有评论(Rss)

我要评论