首页 » MySQL » MySQL 5.7同步延迟案例1: FLUSH PRIVILEGES死锁

MySQL 5.7同步延迟案例1: FLUSH PRIVILEGES死锁

一套MySQL V5.7主从同步(MTS)检查时延时较高,这是mysql的常见现象,如果重启备库时提示err 1236,又是mysql的常见错误,  个人比较排斥MySQL(当然这是个人主观意愿,但并不否认它的流行和某些场景的能力),遇到了这个问题就跟着看看,分析slave上的进程发现 GRANT 在 FLUSH PRIVILEGES 之前获得锁时出现死锁,简单记录这个5.7 bug。

How to check Replica_SQL health

-- from source:
show master status;

-- From Replica
show slave status\G

Note:
MySQL 8后建议使用show replica ,show slave 已过时。

show slave 关注指标

Replica_IO_Running: Yes
Replica_SQL_Running: Yes
...
Replica_IO_State: Waiting for source to send event
...
Source_Log_File: mysql-bin.000006
Read_Source_Log_Pos: 1437728
Relay_Log_File: mysql-relay.000013
Relay_Log_Pos: 453735
Relay_Source_Log_File: mysql-bin.000006

...

Exec_Source_Log_Pos: 453522
Relay_Log_Space: 1438184

...
Seconds_Behind_Source: 217

...
Retrieved_Gtid_Set: 00020192-1111-1111-1111-111111111111:1-105931
Executed_Gtid_Set: 00020192-1111-1111-1111-111111111111:1-103277

本案例现象

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.230.xxx.xxx
                  Master_User: xxx
                  Master_Port: 10000
                Connect_Retry: 1
              Master_Log_File: mysql-bin.002375
          Read_Master_Log_Pos: 786639555
               Relay_Log_File: mysql-relay.004188
                Relay_Log_Pos: 311803492
        Relay_Master_Log_File: mysql-bin.001396
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 311803279
              Relay_Log_Space: 1140905480027
     ...
        Seconds_Behind_Master: 15937230
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 200922033
                  Master_UUID: 8690d6b3-fc9c-11ea-97c2-8446fe2f3274
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for dependent transaction to commit
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 8690d6b3-fc9c-11ea-97c2-8446fe2f3274:121233235-1775703749
            Executed_Gtid_Set: 8690d6b3-fc9c-11ea-97c2-8446fe2f3274:1-1075955764
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.01 sec)

Note:
这是一主两从的架构,1个slave库已重启提示1236, 上面这输出是另一套未重启的slave 。

查看Slave进程

mysql> show processlist;
+----------+-----------------+---------------------+------+---------+----------+---------------------------------------------+-----------------------------------------------------------+
| Id       | User            | Host                | db   | Command | Time     | State                                       | Info                                                      |
+----------+-----------------+---------------------+------+---------+----------+---------------------------------------------+-----------------------------------------------------------+
|        2 | system user     |                     | NULL | Connect | 15936302 | Waiting for dependent transaction to commit | NULL                                                      |
|        3 | system user     |                     |      | Connect | 15936302 | Waiting for table level lock                | flush privileges                                          |
|        4 | system user     |                     |      | Killed  | 15936302 | Waiting for preceding transaction to commit | GRANT SHOW VIEW, EVENT, TRIGGER ON *.* TO 'myxxxx_bk'@'%' |
|        5 | system user     |                     | NULL | Connect | 15936329 | Waiting for an event from Coordinator       | NULL                                                      |
|        6 | system user     |                     | NULL | Connect | 15936329 | Waiting for an event from Coordinator       | NULL                                                      |
|        7 | system user     |                     | NULL | Connect | 15936329 | Waiting for an event from Coordinator       | NULL                                                      |
|        8 | system user     |                     | NULL | Connect | 15936329 | Waiting for an event from Coordinator       | NULL                                                      |
|        9 | system user     |                     | NULL | Connect | 15936329 | Waiting for an event from Coordinator       | NULL                                                      |
|       10 | system user     |                     | NULL | Connect | 15936359 | Waiting for an event from Coordinator       | NULL                                                      |
|       11 | event_scheduler | localhost           | NULL | Daemon  | 39435600 | Waiting on empty queue                      | NULL                                                      |
| 14424427 | system user     |                     | NULL | Connect |   126323 | Waiting for master to send event            | NULL                                                      |
| 14427736 | root            | 10.xxx.xxx.61:42354 | NULL | Query   |   117100 | Killing slave                               | stop slave                                                |
| 14437090 | root            | 10.xxx.xxx.45:33436 | NULL | Query   |    91887 | checking permissions                        | stop slave                                                |
| 14470519 | root            | localhost           | NULL | Query   |        0 | starting                                    | show processlist                                          |
| 14470849 | mydata_ha       | 10.xxx.xxx.45:32978 | NULL | Sleep   |        0 |                                             | NULL                                                      |
| 14470850 | mydata_ha       | 10.xxx.xxx.45:32982 | NULL | Sleep   |        0 |                                             | NULL                                                      |
| 14470851 | mydata_ha       | 10.xxx.xxx.45:32986 | NULL | Sleep   |        0 |                                             | NULL                                                      |
+----------+-----------------+---------------------+------+---------+----------+---------------------------------------------+-----------------------------------------------------------+
17 rows in set (0.00 sec)

mysql> select * from performance_schema.threads where PROCESSLIST_ID in (3,4)\G
*************************** 1. row ***************************
          THREAD_ID: 52
               NAME: thread/sql/slave_worker
               TYPE: FOREGROUND
     PROCESSLIST_ID: 3
   PROCESSLIST_USER: NULL
   PROCESSLIST_HOST: NULL
     PROCESSLIST_DB: NULL
PROCESSLIST_COMMAND: Connect
   PROCESSLIST_TIME: 15936422
  PROCESSLIST_STATE: Waiting for table level lock
   PROCESSLIST_INFO: flush privileges
   PARENT_THREAD_ID: 51
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: NULL
       THREAD_OS_ID: 67865
*************************** 2. row ***************************
          THREAD_ID: 53
               NAME: thread/sql/slave_worker
               TYPE: FOREGROUND
     PROCESSLIST_ID: 4
   PROCESSLIST_USER: NULL
   PROCESSLIST_HOST: NULL
     PROCESSLIST_DB: NULL
PROCESSLIST_COMMAND: Connect
   PROCESSLIST_TIME: 15936422
  PROCESSLIST_STATE: Waiting for preceding transaction to commit
   PROCESSLIST_INFO: GRANT SHOW VIEW, EVENT, TRIGGER ON *.* TO 'myxxxx'@'%'
   PARENT_THREAD_ID: 51
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: NULL
       THREAD_OS_ID: 67866
2 rows in set (0.00 sec)

Note:
Waiting for table level lock 常见于MyISAM 存储引擎, MySQL 5.7中的数据字典默认存储引擎, 8.0后为InnoDB. 当前正在执行的是flush privileges和GRANT SHOW VIEW, EVENT, TRIGGER xxx

使用mysqlbinlog 分析relay log

# at 311808069
#210825  0:00:00 server id 200922033  end_log_pos 311807921 CRC32 0x5710eff2    GTID    last_committed=255033   sequence_number=255034  rbr_only=no
SET @@SESSION.GTID_NEXT= '8690d6b3-fc9c-11ea-97c2-8446fe2f3274:1075955765'/*!*/;
# at 311808134
#210825  0:00:00 server id 200922033  end_log_pos 311808008 CRC32 0xb7cb4c60    Query   thread_id=9776243       exec_time=0     error_code=0
SET TIMESTAMP=1629820800/*!*/;
flush privileges
/*!*/;
# at 311808221
#210825  0:00:00 server id 200922033  end_log_pos 311808073 CRC32 0x823bb746    GTID    last_committed=255033   sequence_number=255035  rbr_only=no
SET @@SESSION.GTID_NEXT= '8690d6b3-fc9c-11ea-97c2-8446fe2f3274:1075955766'/*!*/;
# at 311808286
#210825  0:00:00 server id 200922033  end_log_pos 311808209 CRC32 0xbd6a96f9    Query   thread_id=9776244       exec_time=0     error_code=0
SET TIMESTAMP=1629820800/*!*/;
GRANT SHOW VIEW, EVENT, TRIGGER ON *.* TO 'myxxx_bk'@'%'
/*!*/;

Note:
FLUSH PRIVILEGES 和 GRANT 有相同的last_committed number。

搜索匹配MySQL Bug #89229 .

Description:
FLUSH PRIVILEGES commit its transaction and release the table locks before binlog. That means a following GRANT statement could be binlogged in the same commit window with the FLUSH PRIVILEGES statement. Thus the FLUSH PRIVILEGES and the GRANT can be applied parallel on slave. That may cause a deadlock if slave-preserve-commit-order is ON and GRANT gets the locks before FLUSH PRIVILEGES. 

The deadlock looks like:
- GRANT (applied before FLUSH PRIVILEGES and the locks will be hold until commit)
  waiting for FLUSH PRIVILEGES to commit
- FLUSH PRIVILEGES(blocked when acquiring the locks.)
  waiting for GRANT to release the locks.
打赏

,

对不起,这篇文章暂时关闭评论。