InnoDB 死锁
文章目录
- 死锁案例
- 等待超时时间
- InnoDB 状态信息
- 死锁日志
- 死锁检测
- 死锁日志分析
死锁是指多个事务无法继续进行的情况,因为每个事务都持有另一个事务所需的锁。因为所有涉及的事务都在等待同一资源可用,所以它们都不会释放它所持有的锁。
当事务锁定多个表中的行时(通过UPDATE或SELECT ... FOR UPDATE
等语句),可能会发生死锁,但顺序相反。当此类语句锁定索引记录和间隙的范围时,也可能发生死锁,由于时间问题,每个事务都会获得一些锁,但不会获得其他锁。关于死锁示例,请参阅第17.7.5.1节“InnoDB死锁示例
为了减少死锁的可能性,可以使用以下策略:
- 使用事务而不是LOCK TABLES语句;
- 保持插入或更新数据的事务足够小,不会长时间保持打开状态;
- 当不同的事务更新多个表或大范围的行时,在每个事务中使用相同的操作顺序(如SELECT … FOR update);
- 在
SELECT ... FOR UPDATE
和UPDATE ... WHERE
语句中使用的列上创建索引
死锁的可能性不受隔离级别的影响,因为隔离级别只会改变读取操作的行为,而死锁是由于写入操作而发生的。
死锁案例
https://dev.mysql.com/doc/refman/8.4/en/innodb-deadlock-example.html
假设有A和B两个客户端
# 开启打印死锁信息
SET GLOBAL innodb_print_all_deadlocks = ON;DROP TABLE IF EXISTS Animals;
CREATE TABLE Animals (name VARCHAR(10) PRIMARY KEY, value INT) ENGINE = InnoDB;
DROP TABLE IF EXISTS Birds;
CREATE TABLE Birds (name VARCHAR(10) PRIMARY KEY, value INT) ENGINE = InnoDB;INSERT INTO Animals (name, value) VALUES ("Aardvark", 10);
INSERT INTO Birds (name, value) VALUES ("Buzzard", 20);
A和B均执行SET autocommit=0;
关闭自动提交
- 客户端A开始事务,并在共享模式的Animals中选择一行:
START TRANSACTION;
SELECT value FROM Animals WHERE name='Aardvark' FOR SHARE;
- 然后B开始事务
START TRANSACTION;
SELECT value FROM Birds WHERE name='Buzzard' FOR SHARE;
通过performance_schema.data_locks可以看到加锁情况:
mysql> SELECT ENGINE_TRANSACTION_ID as Trx_Id,-> OBJECT_NAME as `Table`,-> INDEX_NAME as `Index`,-> LOCK_DATA as Data,-> LOCK_MODE as Mode,-> LOCK_STATUS as Status,-> LOCK_TYPE as Type-> FROM performance_schema.data_locks;
+-----------------+---------+---------+------------+---------------+---------+--------+
| Trx_Id | Table | Index | Data | Mode | Status | Type |
+-----------------+---------+---------+------------+---------------+---------+--------+
| 422009360179200 | Animals | NULL | NULL | IS | GRANTED | TABLE |
| 422009360179200 | Animals | PRIMARY | 'Aardvark' | S,REC_NOT_GAP | GRANTED | RECORD |
| 422009360180008 | Birds | NULL | NULL | IS | GRANTED | TABLE |
| 422009360180008 | Birds | PRIMARY | 'Buzzard' | S,REC_NOT_GAP | GRANTED | RECORD |
+-----------------+---------+---------+------------+---------------+---------+--------+
4 rows in set (0.00 sec)
- 客户端B随后更新Animals中的一行:
mysql> UPDATE Animals SET value=30 WHERE name='Aardvark';
B必须等待,通过performance_schema.data_lock_waits
表可以查到正在等待的事务,结果如下
mysql> SELECT REQUESTING_ENGINE_LOCK_ID as Req_Lock_Id,-> REQUESTING_ENGINE_TRANSACTION_ID as Req_Trx_Id,-> BLOCKING_ENGINE_LOCK_ID as Blk_Lock_Id,-> BLOCKING_ENGINE_TRANSACTION_ID as Blk_Trx_Id-> FROM performance_schema.data_lock_waits;
+----------------------------------------+------------+----------------------------------------+-----------------+
| Req_Lock_Id | Req_Trx_Id | Blk_Lock_Id | Blk_Trx_Id |
+----------------------------------------+------------+----------------------------------------+-----------------+
| 140534383469352:92:4:2:140534282801400 | 4243765 | 140534383468544:92:4:2:140534282795040 | 422009360179200 |
+----------------------------------------+------------+----------------------------------------+-----------------+
1 row in set (0.00 sec)mysql> SELECT ENGINE_TRANSACTION_ID as Trx_Id,
CT_NAME as `Tabl -> OBJECT_NAME as `Table`,-> INDEX_NAME as `Index`,-> LOCK_DATA as Data,-> LOCK_MODE as Mode,-> LOCK_STATUS as Status,-> LOCK_TYPE as Type-> FROM performance_schema.data_locks;
+-----------------+---------+---------+------------+---------------+---------+--------+
| Trx_Id | Table | Index | Data | Mode | Status | Type |
+-----------------+---------+---------+------------+---------------+---------+--------+
| 4243765 | Animals | NULL | NULL | IX | GRANTED | TABLE |
| 4243765 | Birds | NULL | NULL | IS | GRANTED | TABLE |
| 4243765 | Birds | PRIMARY | 'Buzzard' | S,REC_NOT_GAP | GRANTED | RECORD |
| 4243765 | Animals | PRIMARY | 'Aardvark' | X,REC_NOT_GAP | WAITING | RECORD |
| 422009360179200 | Animals | NULL | NULL | IS | GRANTED | TABLE |
| 422009360179200 | Animals | PRIMARY | 'Aardvark' | S,REC_NOT_GAP | GRANTED | RECORD |
+-----------------+---------+---------+------------+---------------+---------+--------+
6 rows in set (0.00 sec)mysql> SELECT REQUESTING_ENGINE_LOCK_ID as Req_Lock_Id,-> REQUESTING_ENGINE_TRANSACTION_ID as Req_Trx_Id,-> BLOCKING_ENGINE_LOCK_ID as Blk_Lock_Id,-> BLOCKING_ENGINE_TRANSACTION_ID as Blk_Trx_Id-> FROM performance_schema.data_lock_waits;
Empty set (0.01 sec)
InnoDB仅在事务试图修改数据库时使用顺序事务ID。因此,之前的只读事务id从422009360180008
更改为4243765。
- 如果客户端A试图同时更新Birds中的一行,这将导致死锁:
mysql> UPDATE Birds SET value=40 WHERE name='Buzzard';
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
InnoDB回滚导致死锁的事务。现在客户端B可以继续进行第一次更新
mysql> UPDATE Animals SET value=30 WHERE name='Aardvark';
Query OK, 1 row affected (1 min 26.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
整个过程如下图所示
mysql> SELECT `count` FROM INFORMATION_SCHEMA.INNODB_METRICSWHERE NAME="lock_deadlocks";
+-------+
| count |
+-------+
| 1 |
+-------+
1 row in set (0.00 sec)
等待超时时间
innodb_lock_wait_timeout
指的是事务等待获取资源等待的最长时间,超过这个时间还未分配到资源则会返回应用失败;
参数的时间单位是秒,最小可设置为1s(一般不会设置得这么小),最大可设置1073741824秒。默认安装时这个值是50s
参数支持范围为Session和Global,且支持动态修改,所以可以通过两种方法修改;
通过语句修改
set innodb_lock_wait_timeout=100;
set global innodb_lock_wait_timeout=100;
注意global的修改对当前线程是不生效的,只有建立新的连接才生效
修改配置文件/etc/my.cnf
innodb_lock_wait_timeout = 50
当锁等待时间超过设置时间的时候,就会报如下的错误;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
InnoDB 状态信息
InnoDB状态包含以下有关死锁和事务的信息。在LATEST DETECTED DEADLOCK
一栏,可以看到,只读事务id由422009360180008更改为顺序事务id值4243765。
mysql> SHOW ENGINE INNODB STATUS\G;
*************************** 1. row ***************************Type: InnoDBName:
Status:
=====================================
2024-09-29 11:19:20 140533613905664 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 4 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 3 srv_active, 0 srv_shutdown, 3228 srv_idle
srv_master_thread log flush and writes: 0
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 2
OS WAIT ARRAY INFO: signal count 5
RW-shared spins 0, rounds 0, OS waits 0
RW-excl spins 0, rounds 0, OS waits 0
RW-sx spins 0, rounds 0, OS waits 0
Spin rounds per wait: 0.00 RW-shared, 0.00 RW-excl, 0.00 RW-sx
------------------------
LATEST DETECTED DEADLOCK
------------------------
2024-09-29 10:30:35 140533919143680
*** (1) TRANSACTION:
TRANSACTION 4243765, ACTIVE 155 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 15, OS thread handle 140533616297728, query id 118 localhost root updating
UPDATE Animals SET value=30 WHERE name='Aardvark'*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 93 page no 4 n bits 72 index PRIMARY of table `mysql_learn`.`Birds` trx id 4243765 lock mode S locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 00: len 7; hex 42757a7a617264; asc Buzzard;;1: len 6; hex 00000040c12c; asc @ ,;;2: len 7; hex 81000001160110; asc ;;3: len 4; hex 80000014; asc ;;*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `mysql_learn`.`Animals` trx id 4243765 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 00: len 8; hex 416172647661726b; asc Aardvark;;1: len 6; hex 00000040c12a; asc @ *;;2: len 7; hex 81000001150110; asc ;;3: len 4; hex 8000000a; asc ;;*** (2) TRANSACTION:
TRANSACTION 4243766, ACTIVE 164 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 14, OS thread handle 140533880583936, query id 137 localhost root updating
UPDATE Birds SET value=40 WHERE name='Buzzard'*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 92 page no 4 n bits 72 index PRIMARY of table `mysql_learn`.`Animals` trx id 4243766 lock mode S locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 00: len 8; hex 416172647661726b; asc Aardvark;;1: len 6; hex 00000040c12a; asc @ *;;2: len 7; hex 81000001150110; asc ;;3: len 4; hex 8000000a; asc ;;*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 93 page no 4 n bits 72 index PRIMARY of table `mysql_learn`.`Birds` trx id 4243766 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 00: len 7; hex 42757a7a617264; asc Buzzard;;1: len 6; hex 00000040c12c; asc @ ,;;2: len 7; hex 81000001160110; asc ;;3: len 4; hex 80000014; asc ;;*** WE ROLL BACK TRANSACTION (2)
------------
TRANSACTIONS
------------
Trx id counter 4243767
Purge done for trx's n:o < 4243753 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 422009360183240, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360182432, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360181624, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360180816, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360179200, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360178392, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360177584, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 422009360176776, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 4243765, ACTIVE 3080 sec
4 lock struct(s), heap size 1128, 2 row lock(s), undo log entries 1
MySQL thread id 15, OS thread handle 140533616297728, query id 118 localhost root
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
1061 OS file reads, 450 OS file writes, 194 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 60, seg size 62, 0 merges
merged operations:insert 0, delete mark 0, delete 0
discarded operations:insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 4 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 2325725057
Log buffer assigned up to 2325725057
Log buffer completed up to 2325725057
Log written up to 2325725057
Log flushed up to 2325725057
Added dirty pages up to 2325725057
Pages flushed up to 2325725057
Last checkpoint at 2325725057
88 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137039872
Dictionary memory allocated 465748
Buffer pool size 8192
Free buffers 7001
Database pages 1186
Old database pages 457
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 1034, created 152, written 287
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 1186, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=30037, Main thread ID=140533678733056 , state=sleeping
Number of rows inserted 2, updated 1, deleted 0, read 3
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
Number of system rows inserted 28, updated 335, deleted 24, read 6040
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================1 row in set (0.00 sec)
死锁日志
错误日志包含有关事务和锁的信息
mysql> SELECT @@log_error;
+---------------------------+
| @@log_error |
+---------------------------+
| /var/log/mysql/mysqld.log |
+---------------------------+
1 row in set (0.00 sec)
日志中要由死锁信息的话需要开启innodb_print_all_deadlocks
设置:SET GLOBAL innodb_print_all_deadlocks = ON;
死锁检测
参考:https://dev.mysql.com/doc/refman/8.4/en/innodb-deadlock-detection.html
当启用死锁检测(默认设置)并且确实发生死锁时,InnoDB会自动检测到这种情况并回滚其中一个事务(受害者)。如果使用innodb_lock_detect
变量禁用死锁检测,innodb将依赖innodb_lock _wait_timeout
设置在死锁的情况下回滚事务。
因此,即使您的应用程序逻辑是正确的,您仍然必须处理必须重试事务的情况。
如果频繁的死锁突出了事务结构或应用程序错误处理的问题,请启用innodb_print_all_deadlocks
,将有关所有死锁的信息打印到mysqld错误日志中。
InnoDB试图选择要回滚的小事务,其中事务的大小由插入、更新或删除的行数决定。
如果innodb_table_locks=1
且autocommit=0,则InnoDB知道表锁,其上的MySQL层知道行级锁。否则,当涉及MySQL lock TABLES语句设置的表锁或InnoDB以外的存储引擎设置的锁时,InnoDB无法检测死锁。通过设置innodb_lock_wait_timeout
系统变量的值来解决这些情况。
如果InnoDB Monitor输出的LATEST DETECTED DEADLOCK
部分包含一条消息以下面内容开头
TOO DEEP OR LONG SEARCH IN THE LOCK TABLE WAITS-FOR GRAPH, WE WILL ROLL BACK FOLLOWING TRANSACTION
这表示等待队列上的事务数量已达到200个的限制。超过200个事务的等待列表被视为死锁,试图检查等待队列的事务被回滚。如果正在加锁的线程必须查看等待列表上事务所拥有的1000000多个锁,也可能发生相同的错误。
禁用死锁检测
在高并发系统中,当许多线程等待同一个锁时,死锁检测可能会导致速度减慢。有时,当死锁发生时,禁用死锁检测并依赖innodb_lock_wait_timeout
设置进行事务回滚可能更有效。死锁检测可以使用innodb_tacklock_detect
变量禁用
死锁日志分析
参考:https://stackoverflow.com/questions/72431047/understanding-innodb-deadlock-log