之前mysql慢日志,错误日志都是由操作系统logroate进行按天切割,压缩。切割方式配置的是copytruncate,也即copy一份后压缩后,清空当前日志。这种方式有几个问题: 1、日志会存在丢失 2、如果日志比较大,进行清空时影响性能 3、日志切割了,但文件并没有更换,还是原来的文件(即文件inode未变),对收集日志的程序不友好(切割后很难判断何时需要重新打开日志文件) 当然可以不使用copytruncate方式,可以使用postrotate执行flush命令切割,但由于存在多实例部署,很难确定当前切的日志是哪个实例。所以如果继续使用logroate切换还是有问题。 所以本次对日志切割进行了优化,通过创建硬链接的方式切割日志来解决上述问题,具体过程: 1、找到机器部署的mysql实例 2、获取实例日志文件 3、日志创硬链接 4、删除当前日志 5、flush 日志,自动产生新的日志文件 6、压缩历史日志 具体脚本如下: #!/bin/bash # # Created by beebel # logroate mysql error/slow/general logs # DT=`date +%Y%m%d` PORTLIST=$(netstat -nltp|grep -w mysqld|awk '{print $4}'|awk -F ":" '{print $NF}') LOCAL_IP=$(ip addr |sed -r 's/(.*inet )(1.*)(\/.*)/\2/' |egrep "^10\.|^192\.|^172.1[6-9] |^172.2[0-9]|^172.3[0-1]"|head -n 1) CMD="/usr/local/mysql/bin/mysql -uroot -p`cat /home/mysql/.root_localhost` -h ${LOCAL_IP}" for port in ${PORTLIST} do # slow_log slow_log_file=$($CMD -P${port} -BNe "show variables like 'slow_query_log_file'"|awk '{print $NF}') if [ -f ${slow_log_file}-${DT}.gz ];then mv ${slow_log_file}-${DT}.gz ${slow_log_file}-${DT}`date +%H%M -d "1 minutes ago"`.gz fi if [ -s ${slow_log_file} ];then ln ${slow_log_file} ${slow_log_file}-${DT} && rm -f ${slow_log_file} && gzip ${slow_log_file}-${DT} $CMD -P${port} -e "flush slow logs;" fi # error_log error_log_file=$($CMD -P${port} -BNe "show variables like 'log_error'"|awk '{print $NF}') if [ -f ${error_log_file}-${DT}.gz ];then mv ${error_log_file}-${DT}.gz ${error_log_file}-${DT}`date +%H%M -d "1 minutes ago"`.gz fi if [ -s ${error_log_file} ];then ln ${error_log_file} ${error_log_file}-${DT} && rm -f ${error_log_file} && gzip ${error_log_file}-${DT} $CMD -P${port} -e "flush error logs;" fi # general_log general_log=$($CMD -P${port} -BNe "show variables like 'general_log'"|awk '{print $NF}') if [ "${general_log}" == "ON" ];then general_log_file=$($CMD -P${port} -BNe "show variables like 'general_log_file'"|awk '{print $NF}') if [ -f ${general_log_file}-${DT}.gz ];then mv ${general_log_file}-${DT}.gz ${general_log_file}-${DT}`date +%H%M -d "1 minutes ago"`.gz fi if [ -s ${general_log_file} ];then ln ${general_log_file} ${general_log_file}-${DT} && rm -f ${general_log_file} && gzip ${general_log_file}-${DT} $CMD -P${port} -e "flush general logs;" fi fi done find /var/log/mysql/ -mtime +7 -type f -name "*.gz"|xargs -i rm -f {} 看似脚本没有什么问题,在线上一跑,发现有些实例同步断了,同步中断原因为执行flush error logs失败,导致同步断了。此时有两个问题: 1、为什么从库无法执行flush error logs 2、为什么flush error logs会通过主从同步下来 第一个问题复现下就知道了,手动执行也是提示unkown error, 检查error日志发现日志文件不存在,且目录没有mysql是不可写的,创建一个空文件,并授权给mysql用户后,flush error logs;就能正常执行了。这个主要是部署时环境问题。 部署时/etc/my.cnf配置文件没有清,导致安装后error_log的文件指定了默认的/var/log/mysqld.log。 第二个问题就去检查主库binlog,确实binlog记录了flush error logs,flush general logs,flush slow logs。这点是很奇怪的,之前手动切换日志直接执行的是flush logs;为避免影响binlog 就执行指定的flush,但并没有记录binlog。所以理解上就认为 flush error/general/slow logs也不会记录binlog。 于是找官方文档确认,确实是之前的自我认为是错误的。 By default, the server writes FLUSH statements to the binary log so that they replicate to replicas. To suppress logging, specify the optional NO_WRITE_TO_BINLOG keyword or its alias LOCAL. 默认flush 是会记录到binlog,复制到从库的,如果不想记录到binlog可以用`NO_WRITE_TO_BINLOG` 或者local选项。 后面还有断提示: Note FLUSH LOGS, FLUSH BINARY LOGS, FLUSH TABLES WITH READ LOCK (with or without a table list), and FLUSH TABLES tbl_name ... FOR EXPORT are not written to the binary log in any case because they would cause problems if replicated to a replica. 之前执行的flush logs,FLUSH BINARY LOGS, FLUSH TABLES WITH READ LOCK and FLUSH TABLES tbl_name ... FOR EXPORT是不记录binlog的。所以第二个问题也就了解清楚原因了。 所以最后调整切割脚本如下,将flush加上NO_WRITE_TO_BINLOG: #!/bin/bash # # Created by beebel # logroate mysql error/slow/general logs # DT=`date +%Y%m%d` PORTLIST=$(netstat -nltp|grep -w mysqld|awk '{print $4}'|awk -F ":" '{print $NF}') LOCAL_IP=$(ip addr |sed -r 's/(.*inet )(1.*)(\/.*)/\2/' |egrep "^10\.|^192\.|^172.1[6-9] |^172.2[0-9]|^172.3[0-1]"|head -n 1) CMD="/usr/local/mysql/bin/mysql -uroot -p`cat /root/.dba/db.pwd/root_localhost` -h ${LOCAL_IP}" for port in ${PORTLIST} do # slow_log slow_log_file=$($CMD -P${port} -BNe "show variables like 'slow_query_log_file'"|awk '{print $NF}') if [ -f ${slow_log_file}-${DT}.gz ];then mv ${slow_log_file}-${DT}.gz ${slow_log_file}-${DT}`date +%H%M -d "1 minutes ago"`.gz fi if [ -s ${slow_log_file} ];then ln ${slow_log_file} ${slow_log_file}-${DT} && rm -f ${slow_log_file} && gzip ${slow_log_file}-${DT} $CMD -P${port} -e "flush NO_WRITE_TO_BINLOG slow logs;" fi # error_log error_log_file=$($CMD -P${port} -BNe "show variables like 'log_error'"|awk '{print $NF}') if [ -f ${error_log_file}-${DT}.gz ];then mv ${error_log_file}-${DT}.gz ${error_log_file}-${DT}`date +%H%M -d "1 minutes ago"`.gz fi if [ -s ${error_log_file} ];then ln ${error_log_file} ${error_log_file}-${DT} && rm -f ${error_log_file} && gzip ${error_log_file}-${DT} $CMD -P${port} -e "flush NO_WRITE_TO_BINLOG error logs;" fi # general_log general_log=$($CMD -P${port} -BNe "show variables like 'general_log'"|awk '{print $NF}') if [ "${general_log}" == "ON" ];then general_log_file=$($CMD -P${port} -BNe "show variables like 'general_log_file'"|awk '{print $NF}') if [ -f ${general_log_file}-${DT}.gz ];then mv ${general_log_file}-${DT}.gz ${general_log_file}-${DT}`date +%H%M -d "1 minutes ago"`.gz fi if [ -s ${general_log_file} ];then ln ${general_log_file} ${general_log_file}-${DT} && rm -f ${general_log_file} && gzip ${general_log_file}-${DT} $CMD -P${port} -e "flush NO_WRITE_TO_BINLOG general logs;" fi fi done find /var/log/mysql/ -mtime +7 -type f -name "*.gz"|xargs -i rm -f {} flush logs 等价执行如下: FLUSH BINARY LOGS FLUSH ENGINE LOGS FLUSH ERROR LOGS FLUSH GENERAL LOGS FLUSH RELAY LOGS FLUSH SLOW LOGS
文章最后更新时间: 2020年10月16日 15:44:17
分类文章统计
Django(5)
Flask(1)
Python常见错误(3)
Python基础(10)
linux排障(4)
虚拟化(1)
Consul(3)
Linux基础(5)
shell(11)
oracle(10)
MySQL(64)
ProxySQL(7)
SequoiaDB(2)
TiDB(4)
Redis(2)
常用软件(2)
硬件排障(2)
HTML(1)
JavaScript(1)
我们的作品(18)
windows(1)
总结(1)
按年文章统计
2013(43)
2014(19)
2015(25)
2016(6)
2017(30)
2018(7)
2019(17)
2020(4)
2021(4)
2023(1)
2024(2)
老版入口
亲,扫我吧!
友情链接
飞哥的:imbusy.me/
冰川的:www.mindg.cn
海洋的:hiaero.net
宏斌的:techindeep.com
若水的:nosa.me
段郎的:sixther.me
肥客联邦:fk68.net