MySQL8.0从库slave_rows_search_algorithms=HASH_SCAN潜在主从数据不一致问题
(2022-06-04 01:06:59)
标签:
mysql |
分类: mysql源码相关 |
问题描述
针对无主键和合适索引的表,mysql5.7开始引入了slave_rows_search_algorithms=HASH_SCAN算法,通过将要回放的event中多个row的变更前镜像存储进一个hash
map,而后通过一次全表扫描,将表的每一行在hash
map中进行命中检查,来提高无主键和合适索引表update和delete的复制效率。
但在MySQL8.0中,如果满足如下2个条件,那么主从数据将不一致:
1. 表中包含重复的行
2. 一个binlog event包含了重复行的变更,并且变更的内容还不一样
大致的复现步骤如下:
Setup a master-slave replication, m1->s1 with RBR. In m1 do the following init operations: 1. create table table without pk create table t(c1 int,c2 int); 2. insert the following test data: insert into t values (1,1),(2,2),(3,3),(4,4),(5,5),(1,1),(2,2); 3. check the data on m1 and s1, will see the result sequence is the same: Then start the test: 1. do the following update on m1 update t set c2=rand()*100 where 1=1; 2. check the data on m1 and s1, will see the result sequence is different: On m1 mysql> select * from t; +------+------+ | c1 | c2 | +------+------+ | 1 | 43 | | 2 | 23 | | 3 | 87 | | 4 | 65 | | 5 | 63 | | 1 | 21 | | 2 | 17 | +------+------+ 7 rows in set (0.01 sec) On s1 mysql> select * from t; +------+------+ | c1 | c2 | +------+------+ | 1 | 21 | <====== the result sequence is different with master | 2 | 17 | | 3 | 87 | | 4 | 65 | | 5 | 63 | | 1 | 43 | | 2 | 23 | +------+------+ 7 rows in set (0.01 sec) 3. add auto_increment as pk on m1 alter table t add column id int auto_increment primary key; 4. re-check the data in m1 and s1, we will see the data in-consistence On m1 mysql> select * from t; +------+------+----+ | c1 | c2 | id | +------+------+----+ | 1 | 43 | 1 | | 2 | 23 | 2 | | 3 | 87 | 3 | | 4 | 65 | 4 | | 5 | 63 | 5 | | 1 | 21 | 6 | | 2 | 17 | 7 | +------+------+----+ 7 rows in set (0.01 sec) On s1 mysql> select * from t; +------+------+----+ | c1 | c2 | id | +------+------+----+ | 1 | 21 | 1 | <===== the row with id=1 is different with the master | 2 | 17 | 2 | | 3 | 87 | 3 | | 4 | 65 | 4 | | 5 | 63 | 5 | | 1 | 43 | 6 | | 2 | 23 | 7 | +------+------+----+ 7 rows in set (0.01 sec) 4. Do the following update on m1 update t set c2=c2*10 where c1=1 and c2=43; 5. re-check the data in m1 and s1, we will see the data serious in-consistence On m1 mysql> select * from t; +------+------+----+ | c1 | c2 | id | +------+------+----+ | 1 | 430 | 1 | <==========c2 is update from 43 to 430 as expect | 2 | 23 | 2 | | 3 | 87 | 3 | | 4 | 65 | 4 | | 5 | 63 | 5 | | 1 | 21 | 6 | | 2 | 17 | 7 | +------+------+----+ 7 rows in set (0.01 sec) On s1 mysql> select * from t; +------+------+----+ | c1 | c2 | id | +------+------+----+ | 1 | 430 | 1 | <======the update in slave replay on the wrong row | 2 | 17 | 2 | | 3 | 87 | 3 | | 4 | 65 | 4 | | 5 | 63 | 5 | | 1 | 43 | 6 | <======it should update this row | 2 | 23 | 7 | +------+------+----+ 7 rows in set (0.01 sec)
问题原因分析
问题主要的原因是用于缓存一个event所有row变更前镜像的hash
map在8.0中改为基于性能更好的std::unordered_multimap实现,但对于hash
key一样落入同一个hash bucket的多个row变更,unordered_multimap采用的是逆序存储,即写入顺序与遍历顺序是正好相反的。
例如有如下3行row 变更按row1、row2、row3的顺序放入hash map:
row1: before_image:(1,1), after_image:(1,5)
row2: before_image:(1,1), after_image:(1,6)
row3: before_image:(1,1), after_image:(1,7)
那么实际遍历读取的顺序是row3、row2、row1。这就导致在主上最后执行的“before_image:(1,1),
after_image:(1,7)”在从上先执行的。所以在从上第一行(1,1)被改为(1,7)了,而不是主上的(1,5)。
可以看出从上重复行的变更顺序将与主上正好相反。这也解释了上文中的bug现象。
问题解决建议
将std::unordered_multimap替换为std:multimap即可,性能会弱于std::unordered_multimap,但正确性是没问题的,并且无主键无合适索引场景复制本身也快不到哪里去。
具体修复的patch参见我们给官方报的bug:https://bugs.mysql.com/bug.php?id=106918
转载请注明转自高孝鑫的博客!

加载中…