Highgo数据库实际是Postgresql内核,这篇同样也适用于Kingbase, GaussDB一样存在的PG系,最近一客户上了Highgo数据库后晚上的批作业任务总是失败,查询JOB日志,显示因为deadlock失败,其实很好理解,提示的信息有会话、表、行的信息,这里模拟一下2个会话交叉更新相同记录产生的deadlock.
-- 创建测试表
CREATE TABLE accounts (
id SERIAL PRIMARY KEY,
name VARCHAR(50),
balance DECIMAL(10,2)
);
-- 插入测试数据
INSERT INTO accounts (name, balance) VALUES
('Alice', 1000.00),
('Bob', 500.00),
('Charlie', 750.00);
$ /data/highgo/app/bin/psql -U hg -d postgres
psql (14.10)
IvorySQL-pro-3.2.0
Type "help" for help.
postgres=# CREATE TABLE accounts (
postgres(# id SERIAL PRIMARY KEY,
postgres(# name VARCHAR(50),
postgres(# balance DECIMAL(10,2)
postgres(# );
CREATE TABLE
postgres=# INSERT INTO accounts (name, balance) VALUES
postgres-# ('Alice', 1000.00),
postgres-# ('Bob', 500.00),
postgres-# ('Charlie', 750.00);
INSERT 0 3
postgres=# select * from accounts;
id | name | balance
----+---------+---------
1 | Alice | 1000.00
2 | Bob | 500.00
3 | Charlie | 750.00
(3 rows)
下面2个会话更新相同数据
# SESSION 1
postgres=# BEGIN;
BEGIN
postgres=*# UPDATE accounts SET balance = balance + 100 WHERE name = 'Alice';
UPDATE 1
# SESSION 2
postgres=# begin
postgres-# ;
BEGIN
postgres=*# UPDATE accounts SET balance = balance + 50 WHERE name = 'Bob';
UPDATE 1
# SESSION 1
postgres=*# UPDATE accounts SET balance = balance - 100 WHERE name = 'Bob';
UPDATE 1
# SESSION 2
postgres=*# UPDATE accounts SET balance = balance - 50 WHERE name = 'Alice';
ERROR: deadlock detected
DETAIL: Process 3834105 waits for ShareLock on transaction 2523916; blocked by process 3833807.
Process 3833807 waits for ShareLock on transaction 2523917; blocked by process 3834105.
HINT: See server log for query details.
CONTEXT: while updating tuple (0,1) in relation "accounts"
显示会话3834105 和3833807互堵塞,对象accounts, 记录tuple (0,1),展示了两个会话因以不同顺序更新相同资源而导致的死锁情况。
tuple (0,1)
这样的信息时,这表示的是数据页中的行位置(CTID),tuple (0,1)
表示:
0
:数据页编号(page number)1
:行在页内的索引(item index,从1开始)
postgres=# select * from accounts where ctid='(0,1)';
id | name | balance
----+-------+---------
1 | Alice | 1000.00
(1 row)
Note: CTID 不是永久的:VACUUM、UPDATE 等操作会改变 CTID
查看库的统计信息
postgres=*# SELECT DATNAME,deadlocks FROM pg_stat_database ;
datname | deadlocks
-------------------------+-----------
| 0
highgo | 0
postgres | 1
template1 | 0
template0 | 0
解决方法
应用程序调整业务逻辑,一致的访问顺序:总是按照相同的顺序访问表记录,缩短事务的时间,尽快提交。