PostgreSQL 和 Oracle 在处理事务内 SQL 错误时的行为确实存在根本性的差异, 最近有个客户的应用系统从oracle迁移到Highgo(base on postgresql)后, 以过去的oracle开发习惯会有些不适应的地方,像在事务中遇到“ERROR: current transaction is aborted, commands ignored until end of transaction block”错误后,后续所有SQL会执行失败。这里简单记录。
情况1 当前SESSION 的事务中的SQL失败
psql (14.20)
hgdb-client-V9.0.5
Type "help" for help.
highgo=# create table test(id int primary key);
ERROR: relation "test" already exists
highgo=# drop table test;
DROP TABLE
highgo=# create table test(id int primary key);
CREATE TABLE
highgo=# begin;
BEGIN
highgo=*# insert into test values(1);
INSERT 0 1
highgo=*# insert into test values(1);
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
highgo=!# select * from test;
ERROR: current transaction is aborted, commands ignored until end of transaction block
highgo=!# create table t2(id int);
ERROR: current transaction is aborted, commands ignored until end of transaction block
highgo=!#
PostgreSQL 为了保证事务的原子性(即“全有或全无”),当一个事务中的某条语句出错后(例如违反唯一约束、数据类型不匹配等),整个事务会进入一个“中止”(aborted)状态。
在这种状态下,除了 ROLLBACK(回滚)命令,数据库会拒绝执行任何其他 SQL 命令,并抛出你看到的这个错误。
highgo=!# rollback;
ROLLBACK
highgo=# create table t2(id int);
CREATE TABLE
情况2 锁等待超时
# session 1
highgo=# select * from test;
id
----
(0 rows)
highgo=# insert into test values (1);
INSERT 0 1
highgo=# begin;
BEGIN
highgo=*# update test set id=2;
UPDATE 1
# session 2
highgo=# set statement_timeout=1000; --unit ms
SET
highgo=# begin;
BEGIN
highgo=*# update test set id=3;
ERROR: canceling statement due to statement timeout
CONTEXT: while updating tuple (0,3) in relation "test"
highgo=!# set statement_timeout=5000;
ERROR: current transaction is aborted, commands ignored until end of transaction block
highgo=!# select * from test;
ERROR: current transaction is aborted, commands ignored until end of transaction block
解决方法
在 psql 命令行中,遇到错误导致事务中止(current transaction is aborted)通常是因为默认行为是“一旦出错,立即停止并等待回滚”。
要实现“忽略错误,继续执行”,主要取决于你是在交互式命令行操作,还是在执行SQL 脚本文件。
| 场景 | 变量设置 | 行为描述 |
|---|---|---|
| 交互式调试 | \set ON_ERROR_ROLLBACK on | 报错后自动回滚到保存点,事务不中止,可继续输入下一条 SQL。 |
| 脚本执行 | \set ON_ERROR_STOP off | 遇到错误不停止脚本,继续执行文件中的下一行命令。 |
| 数据导入 | ON_ERROR 'ignore' | 在 COPY 命令中跳过错误行,继续导入数据。 |
highgo=# \set ON_ERROR_ROLLBACK on
highgo=# BEGIN;
BEGIN
psql -v ON_ERROR_STOP=off -f your_script.sql
-- or --
\set ON_ERROR_STOP off
COPY table_name FROM '/path/to/file.csv' WITH (FORMAT csv, ON_ERROR 'ignore');
而对于导致锁超时的问题,可以自动结束blocker session,使用参数
$ sh show timeout
archive_timeout | 0 | Forces a switch to the next WAL file if a new file has not been started within N seconds.
authentication_timeout | 1min | Sets the maximum allowed time to complete client authentication.
checkpoint_timeout | 5min | Sets the maximum time between automatic WAL checkpoints.
deadlock_timeout | 1s | Sets the time to wait on a lock before checking for deadlock.
foreign_transaction_resolver_timeout | 1min | Sets the maximum time to wait for foreign transaction resolution.
hg_pool_timeout | 0 | Delay time when there have more connection than hg_pool_size.
hg_session_locking_timeout | 0 | Sets the maximum idle time before locking the session.
idle_in_transaction_session_timeout | 0 | Sets the maximum allowed idle time between queries, when in a transaction.
idle_session_timeout | 0 | Sets the maximum allowed idle time between queries, when not in a transaction.
interactive_timeout | 28800 | After this number of seconds of inactivity, the server will close the interactive connection.
lock_timeout | 0 | Sets the maximum allowed duration of any wait for a lock.
net_write_timeout | 60 | After this number of seconds of waiting, the server will abort the write.
pg_resource_group_queuing_timeout | 0 | A transaction gives up on queuing on a resource group after this timeout (in ms).
statement_timeout | 0 | Sets the maximum allowed duration of any statement.
tcp_user_timeout | 0 | TCP user timeout.
wait_timeout | 28800 | After this number of seconds of inactivity, the server will close the noninteractive connection.
wal_receiver_timeout | 1min | Sets the maximum wait time to receive data from the sending server.
wal_sender_timeout | 1min | Sets the maximum time to wait for WAL replication.
配置参数
# session 1
highgo=# set idle_in_transaction_session_timeout=50000;
SET
highgo=# begin;
BEGIN
highgo=*# update test set id=20;
UPDATE 1
#session 2
highgo=# update test set id=30;
-- waiting 50s
UPDATE 1
#session 1
highgo=*# select * from test;
FATAL: terminating connection due to idle-in-transaction timeout
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
hgdb-client-V9.0.5
highgo=# select * from test;
id
----
30
(1 row)
开启数据库日志:在 PostgreSQL 配置中,可以设置 log_min_error_statement = error 来记录所有导致错误的 SQL 语句,帮助你快速定位问题源头。
Oracle 与PostgreSQL的差异点
Oracle 的行为:语句级回滚 (Statement-Level Rollback)
在 Oracle 中,一个事务内的某条 SQL 语句执行失败,只会回滚这一条语句的效果,而不会影响到整个事务的状态。
- 行为:事务会继续处于活跃状态。
- 后续操作:你可以继续执行事务中的下一条 SQL 语句,之前的成功操作依然有效。
- 最终结果:只要最后执行
COMMIT,之前成功的操作就会被提交。
场景模拟 (Oracle):
-- 1. 开启事务 (Oracle 中第一条 DML 语句即开启事务)
INSERT INTO users (id, name) VALUES (1, 'Alice'); -- 成功
-- 2. 执行一条会失败的语句 (例如,插入重复主键)
INSERT INTO users (id, name) VALUES (1, 'Bob'); -- 失败,报错 ORA-00001
-- 3. 继续执行下一条语句
INSERT INTO users (id, name) VALUES (2, 'Charlie'); -- 成功!Oracle 允许继续
-- 4. 提交事务
COMMIT; -- 成功!最终结果是 id=1 的 Alice 和 id=2 的 Charlie 被插入。
PostgreSQL 的行为:事务级中止 (Transaction-Level Abort)
在 PostgreSQL 中,一个事务内的任何一条 SQL 语句执行失败,都会导致整个事务立即进入“中止”(aborted)状态。
- 行为:整个事务被标记为失败。
- 后续操作:除了
ROLLBACK命令,任何后续的 SQL 语句都会被拒绝执行,并抛出你看到的current transaction is aborted错误。 - 最终结果:必须执行
ROLLBACK来结束这个失败的事务,所有操作(包括成功和失败的)都会被撤销。
场景模拟 (PostgreSQL):
-- 1. 开启事务
BEGIN;
INSERT INTO users (id, name) VALUES (1, 'Alice'); -- 成功
-- 2. 执行一条会失败的语句
INSERT INTO users (id, name) VALUES (1, 'Bob'); -- 失败,报错 duplicate key
-- 3. 尝试继续执行下一条语句
INSERT INTO users (id, name) VALUES (2, 'Charlie'); -- 失败!报错 "current transaction is aborted..."
-- 4. 只能回滚
ROLLBACK; -- 整个事务被撤销,id=1 的 Alice 也没有被插入。
如何在 PostgreSQL 中实现类似 Oracle 的行为?推荐使用 \set ON_ERROR_ROLLBACK on,因为它会在 psql 交互式环境中自动为每条语句创建一个 SAVEPOINT,从而实现出错后继续执行的效果。
— over —