【mysql】pt工具之pt-table-checksum

【mysql】pt工具之pt-table-checksum

时间:2015-09-09 02:09:57      作者:beebol      标签: mysql pt-table-checksum      分类: mysql

pt-table-checksum是percona-toolkit工具组中的其中之一,主要功能是检查mysql复制的一致性。大概的原理是在主上执行一致性检查,检查方式就是执行replace into的语句完成,然后通过复制传递到从库,再通过update更新master_src的值,然后通过决断this_src和master_src的值来判断复制是否一致。一般情况都是在线进行检查,如果在业务特别忙时,可以通过--max-load参数进行设置负载峰值,到达这个值后就将暂停运行,直到负载下来。如果发现一致性有问题,可以通过tp-table-sync来进行数据修复。

安装:

直接下载官网最新版本"https://www.percona.com/downloads/percona-toolkit/",这里使用的为2.2.12-1版本,需要安装如下东东:

perl(DBI) >= 1.13
perl(DBD::mysql) >= 1.0
perl(Time::HiRes)  
perl(IO::Socket::SSL)  
perl(Digest::MD5)  
rpmlib(PayloadFilesHavePrefix) <= 4.0-1rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(CompressedFileNames) <= 3.0.4-1

使用:

先要有一个主从环境,部署参考“mysql主从复制”,开始进行检查:

[root@mysql.dmc.com ~]$pt-table-checksum --max-load="Threads_connected=1025" --no-check-replication-filters  --no-check-binlog-format -h 10.1.16.74 -u checksum -p checksum -P 3306 
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-07T18:24:29      0      0        3       1       0   2.265 beebol.history
09-07T18:24:30 Cannot checksum table beebol.history_2: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6454.

09-07T18:24:36      0      0        5       6       0   5.568 beebol.navicat
09-07T18:24:50      0      0       20      11       0  14.076 beebol.people
09-07T18:24:50 Cannot checksum table beebol.t1: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6454.

09-07T18:24:50 Cannot checksum table beebol.t2: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6454.

09-07T18:24:51      0      0        0       1       0   0.805 beebol.t5
09-07T18:24:52      0      0        0       1       0   0.904 cor.accounts_auth_user_user_permissions
09-07T18:24:53      0      0        0       1       0   0.835 cor.accounts_members
09-07T18:24:54      0      0        0       1       0   1.182 cor.accounts_permissions
09-07T18:24:57      0      0        1       1       0   1.233 cor.auth_group
09-07T18:24:59      0      0        1       1       0   1.499 cor.auth_group_permissions
09-07T18:25:04 Checksum queries for table cor.auth_permission are executing very slowly.  --chunk-size has been automatically reduced to 1.  Check that the server is not being overloaded, or increase --chunk-time.  The last chunk, number 1 of table cor.auth_permission, selected 2 rows and took 4.695 seconds to execute.
Checksumming cor.auth_permission:  75% 00:10 remain09-07T18:25:54      0      0       36      37       0  54.930 cor.auth_permission
09-07T18:25:54      0      0       36      37       0  54.930 cor.auth_permission

如上已经正常执行,默认没有指定replicate库和表,默认为percona.checksums。如上已经正常执行了。其中有些输出这里再详细说明下:

TS --> The timestamp (without the year) when the tool finished checksumming the table.
ERRORS --> The number of errors and warnings that occurred while checksumming the table. Errors and warnings are
printed to standard error while the table is in progress.
DIFFS  --> The number of chunks that differ from the master on one or more replicas. If --no-replicate-check is
specified, this column will always have zeros. If --replicate-check-only is specified, then only tables
with differences are printed.
ROWS --> The number of rows selected and checksummed from the table. It might be different from the number of rows
in the table if you use the –where option.
CHUNKS -->  The number of chunks into which the table was divided.
SKIPPED -- > The number of chunks that were skipped due one or more of these problems:
* MySQL not using the --chunk-index
* MySQL not using the full chunk index (--[no]check-plan)
* Chunk size is greater than --chunk-size * --chunk-size-limit
* Lock wait timeout exceeded (--retries)
* Checksum query killed (--retries)
As of pt-table-checksum 2.2.5, skipped chunks cause a non-zero “EXIT STATUS”.
TIME -->  The time elapsed while checksumming the table.TABLE -->  The database and table that was checksummed.
TABLE -->  The database and table that was checksummed.
09-07T18:24:30 Cannot checksum table beebol.history_2: There is no good index and the table is oversized. at /usr/bin/pt-table-checksum line 6454.

这个由于是beebol.history_2中没有主键和唯一索引,因为该工具是根据主键索引或者唯一索引来分块进行校验的,默认是1000行为一块。

09-07T18:25:04 Checksum queries for table cor.auth_permission are executing very slowly.  --chunk-size has been automatically reduced to 1.  Check that the server is not being overloaded, or increase --chunk-time.  The last chunk, number 1 of table cor.auth_permission, selected 2 rows and took 4.695 seconds to execute.

这个是由于在这个表上选择2行数据的执行时间已经超过了chunk-time默认时间0.5s。这个提示自动将chunk-size降低为1,默认是1000.

如下一直在刷新如下信息,可以将--max-load进行设置一下:

Pausing because Threads_running=0.
Pausing because Threads_running=0.Pausing because Threads_running=0.
Pausing because Threads_running=0.

我这里设置--max-load="Threads_running=1025"还是会刷,后来将这个connected设置下就ok了,--max-load="Threads_connected=1025"。

Diffs cannot be detected because no slaves were found.  Please read the --recursion-method documentation for information.

这个表示找不到slave,检查可能会无效。需要指定 --recursion-method

获取slave信息的方式有:--recursion-method

method

参数,可以参考下官方文档和tp-table-checksum  --help

不一致情况:

[root@mysql.dmc.com ~]$pt-table-checksum --max-load="Threads_connected=1025" --no-check-replication-filters  --no-check-binlog-format -h 10.1.16.74 -u checksum -p checksum -P 3306 --databases=test
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-07T19:49:20      0      0        3       1       0   0.896 test.history
09-07T19:49:22      0      0        0       1       0   1.816 test.history_2
09-07T19:49:24      0      0        6       1       0   2.517 test.navicat
09-07T19:49:26      0      0       20       1       0   1.236 test.people
09-07T19:49:27      0      0        2       1       0   1.158 test.t1
09-07T19:49:29      0      0        7       1       0   1.484 test.t2
09-07T19:49:31      0      0        4       1       0   1.935 test.t3
09-07T19:49:32      0      0        3       1       0   1.247 test.t4
09-07T19:49:33      0      0        0       1       0   1.114 test.t509-07T19:49:35      0      1        6       1       0   1.843 test.t_order
09-07T19:49:35      0      1        6       1       0   1.843 test.t_order

检查后,发现test.t_order表有不一致情况,原因是由于我手动到从上进行了一次update操作。可以通过pt-table-sync进行修复。

时间:2015-09-09 02:09:57      作者:beebol      标签: mysql pt-table-checksum      分类: mysql
  • 分享到:
  • 微博
  • QQ空间
  • 腾讯微博
  • 微信

Copyright © 2015 Gitblog | Proudly powered by Gitblog.