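MySQL is replicated to TiDB in real time through DM. During upstream maintenance, MySQL was restarted, and the downstream TiDB side then reported the error below.
Running query-status -w 1.1.1.1:8231 shows the following exception: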
"Type": "UnknownError",
"msg": "invalid event header, event size is 19, too small\ngithub.com/siddontang/go-mysql/replication.(*BinlogParser).parseSingleEvent\n\t/go/pkg/mod/github.com/siddontang/go-mysql@v0.0.0-20190312052122-c6ab05a85eb8/replication/parser.go:124\ngithub.com/siddontang/go-mysql/replication.(*BinlogParser).ParseReader\n\t/go/pkg/mod/github.com/siddontang/go-mysql@v0.0.0-20190312052122-c6ab05a85eb8/replication/parser.go:163\ngithub.com/siddontang/go-mysql/replication.(*BinlogParser).ParseFile\n\t/go/pkg/mod/github.com/siddontang/go-mysql@v0.0.0-20190312052122-c6ab05a85eb8/replication/parser.go:93\ngithub.com/pingcap/dm/pkg/streamer.(*BinlogReader).parseFile\n\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/streamer/reader.go:329\ngithub.com/pingcap/dm/pkg/streamer.(*BinlogReader).parseFileAsPossible\n\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/streamer/reader.go:244\ngithub.com/pingcap/dm/pkg/streamer.(*BinlogReader).parseDirAsPossible\n\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/streamer/reader.go:212\ngithub.com/pingcap/dm/pkg/streamer.(*BinlogReader).parseRelay\n\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/streamer/reader.go:141\ngithub.com/pingcap/dm/pkg/streamer.(*BinlogReader).StartSync.func1\n\t/home/jenkins/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/streamer/reader.go:114\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337\nrelay log file /data/dm_worker_1/relay_log/4b50a2f7-4cfd-11e7-8e39-00163e068296.000001/mysql-bin.041452\nparse relay log file mysql-bin.041452 from offset 0 in dir /data/dm_worker_1/relay_log/4b50a2f7-4cfd-11e7-8e39-00163e068296.000001\nparse relay log file mysql-bin.041452 from offset 50795566 in dir /data/dm_worker_1/relay_log/4b50a2f7-4cfd-11e7-8e39-00163e068296.000001"
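The cause is that after the upstream MySQL restarted, the binlog fetched on the TiDB side (the DM relay log) was incomplete.
The end of the last relay log looks like this: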
# at 50795566
#190520 21:35:51 server id 438111 end_log_pos 50795585 Stop
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
It ends with a Stop event and does not explicitly point to the next relay log position.
A normal relay log should end like this:
# at 52428911
#190520 21:40:16 server id 438111 end_log_pos 52428954 Rotate to mysql-bin.041454 pos: 4
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
Here you can see that "Rotate to mysql-bin.041454 pos: 4" points to the next log file.
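The comparison above can be automated with a small check. The following is a minimal sketch, assuming the relay log has already been dumped to text with mysqlbinlog and that event header lines look like the ones shown in this post. The function names `last_event` and `ends_with_rotate` are hypothetical helpers, not part of DM or MySQL.

```python
import re

# Matches mysqlbinlog event header lines such as:
#   #190520 21:35:51 server id 438111 end_log_pos 50795585 Stop
# The optional CRC32 field is an assumption for servers with binlog checksums.
EVENT_HEADER = re.compile(
    r"^#\d{6}\s+\d{1,2}:\d{2}:\d{2}\s+server id\s+\d+\s+"
    r"end_log_pos\s+(\d+)\s+(?:CRC32\s+0x[0-9a-fA-F]+\s+)?(.*)$"
)


def last_event(dump_text):
    """Return (end_log_pos, description) of the last event header, or None."""
    found = None
    for line in dump_text.splitlines():
        match = EVENT_HEADER.match(line.strip())
        if match:
            found = (int(match.group(1)), match.group(2).strip())
    return found


def ends_with_rotate(dump_text):
    """True if the final event in the dump is a Rotate event."""
    event = last_event(dump_text)
    return event is not None and event[1].startswith("Rotate to ")
```

A dump whose last event is Stop, as in the broken file above, makes `ends_with_rotate` return False, which suggests the file has no pointer to its successor.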
So how do we resume applying logs from the file that caused the error?
Method 1: change the relay log position stored in the checkpoint
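1. Stop dm-worker.
1. First use query-status -w xxx to get the position the syncer has reached. Here it is: "syncerBinlog": "(mysql-bin|000001.041452, 50795566)"
1. Run: update dm_meta.task_boss_syncer_checkpoint set binlog_name='mysql-bin|000001.041453',binlog_pos=120 where is_global=1; This moves the global checkpoint into the next binlog file. The table name is the task name followed by _syncer_checkpoint.
1. Start dm-worker again, and replication returns to normal.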
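The steps of Method 1 can be sketched as a helper that derives the UPDATE statement from the syncerBinlog value. This is only an illustration under assumptions: the function names are hypothetical, the default position 120 is simply the value used in this post (verify it against your own binlog), and the script only builds the statement rather than executing it.

```python
import re


def parse_syncer_binlog(value):
    """Parse '(mysql-bin|000001.041452, 50795566)' into (name, pos)."""
    match = re.match(r"^\(\s*(.+?)\s*,\s*(\d+)\s*\)$", value.strip())
    if match is None:
        raise ValueError("unexpected syncerBinlog value: %r" % value)
    return match.group(1), int(match.group(2))


def next_binlog_name(syncer_name):
    """Increment the numeric suffix, keeping its zero padding."""
    prefix, seq = syncer_name.rsplit(".", 1)
    return "{}.{}".format(prefix, str(int(seq) + 1).zfill(len(seq)))


def checkpoint_update_sql(task_name, syncer_binlog, start_pos=120):
    """Build the UPDATE that moves the global checkpoint to the next file.

    start_pos=120 is copied from this post and is an assumption elsewhere.
    """
    name, _pos = parse_syncer_binlog(syncer_binlog)
    return (
        "UPDATE dm_meta.{task}_syncer_checkpoint "
        "SET binlog_name='{name}', binlog_pos={pos} WHERE is_global=1;"
    ).format(task=task_name, name=next_binlog_name(name), pos=start_pos)


if __name__ == "__main__":
    print(checkpoint_update_sql("task_boss", "(mysql-bin|000001.041452, 50795566)"))
```

Printing the statement instead of running it leaves room to review it before touching dm_meta.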
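Method 2: edit relay.meta and pull the binlog again
1. Stop dm-worker.
1. First use query-status -w xxx to get the position the syncer has reached. Here it is: "syncerBinlog": "(mysql-bin|000001.041452, 50795566)"
1. Go to the dm-worker directory and edit relay_log/4b50a2f7-4cfd-11e7-8e39-00163e068296.000001/relay.meta so that it contains: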
binlog-name = "mysql-bin.041452"
binlog-pos = 50795566
binlog-gtid = ""
1. Start dm-worker, and replication returns to normal.
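The relay.meta content used in Method 2 can be derived from the syncerBinlog value. The sketch below assumes the three-key layout shown in this post and that the "|000001" part of the syncer name is the relay sub-directory suffix, which relay.meta does not include. Both function names are hypothetical.

```python
def relay_name_from_syncer(syncer_name):
    """Turn 'mysql-bin|000001.041452' into 'mysql-bin.041452'."""
    base, seq = syncer_name.rsplit(".", 1)
    return "{}.{}".format(base.split("|", 1)[0], seq)


def render_relay_meta(binlog_name, binlog_pos, binlog_gtid=""):
    """Render relay.meta text with the three keys shown in this post."""
    return (
        'binlog-name = "{}"\n'
        'binlog-pos = {}\n'
        'binlog-gtid = "{}"\n'
    ).format(binlog_name, binlog_pos, binlog_gtid)


if __name__ == "__main__":
    name = relay_name_from_syncer("mysql-bin|000001.041452")
    print(render_relay_meta(name, 50795566), end="")
```

Back up the existing relay.meta before overwriting it with the rendered text.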
Last updated:
2019-05-21 00:06:14