Highgo数据库模拟deadlock

Highgo数据库实际是Postgresql内核,这篇同样也适用于Kingbase, GaussDB一样存在的PG系,最近一客户上了Highgo数据库后晚上的批作业任务总是失败,查询JOB日志,显示因为deadlock失败,其实很好理解,提示的信息有会话、表、行的信息,这里模拟一下2个会话交叉更新相同记录产生的deadlock.

-- 创建测试表
CREATE TABLE accounts (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    balance DECIMAL(10,2)
);

-- 插入测试数据
INSERT INTO accounts (name, balance) VALUES 
('Alice', 1000.00),
('Bob', 500.00),
('Charlie', 750.00);

$ /data/highgo/app/bin/psql -U hg -d postgres
psql (14.10)
IvorySQL-pro-3.2.0
Type "help" for help.

postgres=# CREATE TABLE accounts (
postgres(#     id SERIAL PRIMARY KEY,
postgres(#     name VARCHAR(50),
postgres(#     balance DECIMAL(10,2)
postgres(# );
CREATE TABLE
postgres=# INSERT INTO accounts (name, balance) VALUES
postgres-# ('Alice', 1000.00),
postgres-# ('Bob', 500.00),
postgres-# ('Charlie', 750.00);
INSERT 0 3
postgres=# select * from accounts;
 id |  name   | balance
----+---------+---------
  1 | Alice   | 1000.00
  2 | Bob     |  500.00
  3 | Charlie |  750.00
(3 rows)

下面2个会话更新相同数据

# SESSION 1
postgres=# BEGIN;
BEGIN
postgres=*# UPDATE accounts SET balance = balance + 100 WHERE name = 'Alice';
UPDATE 1

# SESSION 2
postgres=# begin
postgres-# ;
BEGIN
postgres=*# UPDATE accounts SET balance = balance + 50 WHERE name = 'Bob';
UPDATE 1

# SESSION 1
postgres=*# UPDATE accounts SET balance = balance - 100 WHERE name = 'Bob';
UPDATE 1

# SESSION 2
postgres=*# UPDATE accounts SET balance = balance - 50 WHERE name = 'Alice';
ERROR:  deadlock detected
DETAIL:  Process 3834105 waits for ShareLock on transaction 2523916; blocked by process 3833807.
Process 3833807 waits for ShareLock on transaction 2523917; blocked by process 3834105.
HINT:  See server log for query details.
CONTEXT:  while updating tuple (0,1) in relation "accounts"

显示会话3834105 和3833807互堵塞,对象accounts, 记录tuple (0,1),展示了两个会话因以不同顺序更新相同资源而导致的死锁情况。

tuple (0,1) 这样的信息时,这表示的是数据页中的行位置(CTID),tuple (0,1) 表示:

  • 0:数据页编号(page number)
  • 1:行在页内的索引(item index,从1开始)
postgres=# select * from accounts where ctid='(0,1)';
 id | name  | balance
----+-------+---------
  1 | Alice | 1000.00
(1 row)

Note: CTID 不是永久的:VACUUM、UPDATE 等操作会改变 CTID

查看库的统计信息

postgres=*# SELECT DATNAME,deadlocks FROM pg_stat_database ;
         datname         | deadlocks
-------------------------+-----------
                         |         0
 highgo                  |         0
 postgres                |         1
 template1               |         0
 template0               |         0

解决方法

应用程序调整业务逻辑,一致的访问顺序:总是按照相同的顺序访问表记录,缩短事务的时间,尽快提交。

Leave a Comment