MySQL高可用MHA
环境:
角色 | IP | 主机名 | Server ID | 类型 |
---|---|---|---|---|
master1 | 192.168.1.12 | master1 | 1 | 主从节点写 |
master2 | 192.168.1.13 | master2 | 2 | 从节点读 |
slave | 192.168.1.14 | slave | 3 | 从节点读 |
manager | 192.168.1.7 | manager | 无 | 管理节点 |
一、环境部署
①更改主机名,添加hosts文件
[root@192 ~]# hostnamectl set-hostname master1
[root@192 ~]# hostnamectl set-hostname master2
[root@192 ~]# hostnamectl set-hostname slave
[root@192 ~]# hostnamectl set-hostname manager
[root@master1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.12 master1
192.168.1.13 master2
192.168.1.14 slave
192.168.1.7 manager
[root@master1 ~]# for i in master2 slave manager ; do scp /etc/hosts $i:/etc/hosts;done
分发给其他主机
②配置密钥ssh(所有主机各配置相同操作)
例如master1:
[root@master1 ~]# ssh-keygen -t rsa
在所有主机都必须拷贝密钥于其他主机
[root@master1 ~]# for i in master1 master2 slave manager ; do ssh-copy-id $i ; done
拷贝密钥
[root@master1 ~]# for i in master1 master2 slave manager ; do ssh $i hostname ; done
测试登录
③关闭防火墙以及selinx
④集群主机时间同步
⑤安装环境插件
[root@192 ~]# yum -y install epel-release
[root@192 ~]# yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles ncftp perl-Params-Validate perl-CPAN perl-Test-Mock-LWP.noarch perl-LWP-Authen-Negotiate.noarch perl-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
二、配置DB主从
步骤:
①检查mysql状态
mysql半同步是由谷歌提供,master是semisync_master.so,slave是semisync_slave.so,如不清楚路径可以先进行查询
mysql> show variables like ‘%plugin_dir%’;
Variable_name | Value |
---|---|
plugin_dir | /usr/local/mysql/lib/plugin/ |
再进行检查是否支持动态载入
mysql> show variables like ‘%have_dynamic_loading%’;
Variable_name | Value |
---|---|
have_dynamic_loading | YES |
②安装mysql半同步插件(在所有的mysql主机)
mysql> install plugin rpl_semi_sync_master soname ‘semisync_master.so’;
mysql> install plugin rpl_semi_sync_slave soname ‘semisync_slave.so’;
检查是否正确安装
mysql> show plugins;
rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL |
---|---|---|---|---|
rpl_semi_sync_slave | ACTIVE | REPLICATION | semisync_slave.so | GPL |
查看半同步相关信息
mysql> show variables like ‘rpl_semi_sync%’;
Variable_name | Value |
---|---|
rpl_semi_sync_master_enabled | OFF |
rpl_semi_sync_master_timeout | 10000 |
rpl_semi_sync_master_trace_level | 32 |
rpl_semi_sync_master_wait_for_slave_count | 1 |
rpl_semi_sync_master_wait_no_slave | ON |
rpl_semi_sync_master_wait_point | AFTER_SYNC |
rpl_semi_sync_slave_enabled | OFF |
rpl_semi_sync_slave_trace_level | 32 |
③配置mysql半同步复制
master1:
[root@master1 ~]# cat /etc/my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
server_id = 1
socket = /usr/local/mysql/mysql.sock
log-error=/usr/local/mysql/data/mysqld.err
log-bin = mysql-bin
log-bin-index = mysql-bin.index
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 1000
rpl_semi_sync_slave_enabled = 1
relay_log_purge = 0
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
[client]
host = 127.0.0.1
user = root
password = 123.com
master2:
[root@master2 ~]# cat /etc/my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
server_id = 2
socket = /usr/local/mysql/mysql.sock
log-error=/usr/local/mysql/data/mysqld.err
log-bin = mysql-bin
log-bin-index = mysql-bin.index
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 1000
rpl_semi_sync_slave_enabled = 1
relay_log_purge = 0
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
[client]
host = 127.0.0.1
user = root
password = 123.com
slave:
[root@slave ~]# cat /etc/my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
server_id = 3
socket = /usr/local/mysql/mysql.sock
log-error=/usr/local/mysql/data/mysqld.err
log-bin = mysql-bin
log-bin-index = mysql-bin.index
rpl_semi_sync_slave_enabled = 1
relay-log = relay-bin
relay-log-index = slave-relay-bin.index
read_only = 1
[client]
host = 127.0.0.1
user = root
password = 123.com
全部重启服务
[root@master1 ~]# systemctl restart mysqld
mysql> show variables like ‘%rpl_semi_sync%’;
Variable_name | Value |
---|---|
rpl_semi_sync_master_enabled | ON |
rpl_semi_sync_master_timeout | 1000 |
rpl_semi_sync_master_trace_level | 32 |
rpl_semi_sync_master_wait_for_slave_count | 1 |
rpl_semi_sync_master_wait_no_slave | ON |
rpl_semi_sync_master_wait_point | AFTER_SYNC |
rpl_semi_sync_slave_enabled | ON |
rpl_semi_sync_slave_trace_level | 32 |
这是主从master的半同步状态
mysql> show variables like ‘%rpl_semi_sync%’;
Variable_name | Value |
---|---|
rpl_semi_sync_master_enabled | OFF |
rpl_semi_sync_master_timeout | 10000 |
rpl_semi_sync_master_trace_level | 32 |
rpl_semi_sync_master_wait_for_slave_count | 1 |
rpl_semi_sync_master_wait_no_slave | ON |
rpl_semi_sync_master_wait_point | AFTER_SYNC |
rpl_semi_sync_slave_enabled | ON |
rpl_semi_sync_slave_trace_level | 32 |
这是slave的半同步状态,发现主插件是关闭的
④权限与change
master1:
mysql> grant replication slave on *.* to rep@'%' identified by '123.com';
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 | 436 | | | |
+------------------+----------+--------------+------------------+-------------------+
master2:
mysql> grant replication slave on *.* to rep@'%' identified by '123.com';
mysql> change master to master_host='192.168.1.12',master_user='rep',master_password='123.com',master_log_file='mysql-bin.000001',master_log_pos=436;
mysql> start slave;
slave:
mysql> change master to master_host='192.168.1.12',master_user='rep',master_password='123.com',master_log_file='mysql-bin.000001',master_log_pos=436;
mysql> start slave;
回到master1:
mysql> grant all privileges on *.* to manager@'%' identified by '123.com';
mysql> show status like '%rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 2 |
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
+--------------------------------------------+-------+
因为已经配置主从复制,在master1上创建manager用户同步到其他主机授权manager监控节点工作
三、配置MHA
步骤:
①安装MHA(manager节点两个包都需要安装,三台DB只需要安装node包)
例如master1:
[root@master1 ~]# tar zxf mha4mysql-node-0.56.tar.gz
[root@master1 ~]# cd mha4mysql-node-0.56/
[root@master1 mha4mysql-node-0.56]# perl Makefile.PL
[root@master1 mha4mysql-node-0.56]# make && make install
manager:
[root@manager ~]# tar zxf mha4mysql-node-0.56.tar.gz
[root@manager ~]# cd mha4mysql-node-0.56/
[root@manager mha4mysql-node-0.56]# perl Makefile.PL
[root@manager mha4mysql-node-0.56]# make && make install
[root@manager mha4mysql-node-0.56]# cd ..
[root@manager ~]# tar zxf mha4mysql-manager-0.56.tar.gz
[root@manager ~]# cd mha4mysql-manager-0.56/
[root@manager mha4mysql-manager-0.56]# perl Makefile.PL
[root@manager mha4mysql-manager-0.56]# make && make install
创建一些可用目录
[root@manager ~]# cd /
[root@manager /]# mkdir etc/masterha
[root@manager /]# mkdir -p masterha/app1
[root@manager /]# mkdir scripts
[root@manager /]# cd ~/mha4mysql-manager-0.56/
[root@manager mha4mysql-manager-0.56]# cp samples/conf/* /etc/masterha/
[root@manager mha4mysql-manager-0.56]# cp samples/scripts/* /scripts/
[root@manager mha4mysql-manager-0.56]#
②配置MHA
[root@manager ~]# vim /etc/masterha/app1.cnf
[server default]
manager_workdir=//masterha/app1
manager_log=/masterha/app1/manager.log
user=manager
password=123.com
ssh_user=root
repl_user=rep
repl_password=123.com
ping_interval=1
[server1]
hostname=192.168.1.12
port=3306
master_binlog_dir=/usr/local/mysql/data
candidate_master=1
[server2]
hostname=192.168.1.13
port=3306
master_binlog_dir=/usr/local/mysql/data
candidate_master=1
[server3]
hostname=192.168.1.14
port=3306
master_binlog_dir=/usr/local/mysql/data
no_master=1
③ssh有效性验证
[root@manager ~]# masterha_check_ssh –global_conf=/etc/masterha/masterha_default.cnf –conf=/etc/masterha/app1.cnf
.......
Tue Apr 24 16:46:58 2018 - [debug] ok.
Tue Apr 24 16:46:58 2018 - [debug] Connecting via SSH from root@192.168.1.14(192.168.1.14:22) to root@192.168.1.13(192.168.1.13:22)..
Tue Apr 24 16:46:58 2018 - [debug] ok.
Tue Apr 24 16:46:58 2018 - All SSH connection tests passed successfully.
如有报错,务必检查无密ssh连接情况,确保其他主机已经分享公钥
④集群复制有效性验证
[root@manager ~]# masterha_check_ssh –global_conf=/etc/masterha/masterha_default.cnf –conf=/etc/masterha/app1.cnf
..........
Tue Apr 24 16:50:24 2018 - Connecting to root@192.168.1.13(192.168.1.13:22)..
Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 106.
mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options
at /usr/local/bin/apply_diff_relay_logs line 492.
Tue Apr 24 16:50:25 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln201] Slaves settings check failed!
Tue Apr 24 16:50:25 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln390] Slave configuration failed.
Tue Apr 24 16:50:25 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln401] Error happend on checking configurations. at /usr/local/bin/masterha_check_repl line 48.
Tue Apr 24 16:50:25 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln500] Error happened on monitoring servers.
Tue Apr 24 16:50:25 2018 - Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
发现报错Can’t exec “mysqlbinlog”: No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 106.
解决方法在所有DB服务器进行软连接执行文件操作
[root@master1 ~]# ln -s /usr/local/mysql/bin/* /usr/local/bin
⑤启动manager
[root@manager ~]# nohup masterha_manager –conf=/etc/masterha/app1.cnf &> /tmp/mha_manager.log &
[1] 8126
[root@manager ~]# masterha_check_status –conf=/etc/masterha/app1.cnf
app1 (pid:8126) is running(0:PING_OK), master:192.168.1.12
四、MHA高可用验证
步骤:
①将master1中mysql服务宕掉
[root@master1 ~]# systemctl stop mysqld
②在manager查看MHA日志
[root@manager ~]# cat /masterha/app1/manager.log
.......
Started automated(non-interactive) failover.
The latest slave 192.168.1.13(192.168.1.13:3306) has all relay logs for recovery.
Selected 192.168.1.13 as a new master.
192.168.1.13: OK: Applying all logs succeeded.
192.168.1.14: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.1.14: OK: Applying all logs succeeded. Slave started, replicating from 192.168.1.13.
192.168.1.13: Resetting slave info succeeded.
Master failover to 192.168.1.13(192.168.1.13:3306) completed successfully
发现master failover已经成功
③查看slave复制状态
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.13
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 436
......
发现master IP已经转为192.168.1.13,io线程与sql线程正常运行,MHA搭建成功
④MHA manager端日常操作
1)发生主从切换后,manager服务会自动停掉,却在manager_workdir(/masterha/app1)目录生成文件app1.failover.complate,若要启动MHA,必须确保此文件已经删除
[root@manager ~]# cd /masterha/
[root@manager masterha]# ls
app1
[root@manager masterha]# cd app1/
[root@manager app1]# ls
app1.failover.complete manager.log
[root@manager app1]# mv app1.failover.complete /tmp
2)重新定义master1位置,需要把master1设置为master2的slave服务器
mysql> change master to master_host=’192.168.1.13′,master_user=’rep’,master_password=’123.com’,master_log_file=’mysql-bin.000001′,master_log_pos=436;
mysql> start slave;
3)停止MHA
[root@manager app1]# masterha_stop –conf=/etc/masterha/app1.cnf
4)启动MHA
[root@manager app1]# nohup masterha_manager –conf=/etc/masterha/app1.cnf &> /tmp/mha_manager.log &
[1] 8613
[root@manager app1]# masterha_check_status –conf=/etc/masterha/app1.cnf
app1 (pid:8613) is running(0:PING_OK), master:192.168.1.13
5)主从切换后续工作
重构计划,切换后的master2成为主库,想要master1重夺主库,重新执行以上五步,包括停止master2库。原主库数据文件完整情况可以通过以下方式得到change master命令
[root@manager app1]# grep -i ‘change master to master’ /masterha/app1/manager.log | tail -1
Tue Apr 24 17:16:59 2018 - All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.1.12', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=154, MASTER_USER='rep', MASTER_PASSWORD='xxx';
mysql> CHANGE MASTER TO MASTER_HOST=’192.168.1.12′, MASTER_PORT=3306, MASTER_LOG_FILE=’mysql-bin.000002′, MASTER_LOG_POS=154, MASTER_USER=’rep’, MASTER_PASSWORD=’123.com’;
mysql> start slave;
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.12
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 154
[root@manager ~]# cd /masterha/app1/
[root@manager app1]# ls app1.failover.complete
manager.log
[root@manager app1]# mv app1.failover.complete /tmp
mv: overwrite ‘/tmp/app1.failover.complete’? y
[root@manager app1]# nohup masterha_manager –conf=/etc/masterha/app1.cnf &> /tmp/mha_manager.log &
[1] 9015
[root@manager app1]# masterha_check_status –conf=/etc/masterha/app1.cnf
app1 (pid:9015) is running(0:PING_OK), master:192.168.1.12
6)定期删除中继日志
在配置主从复制中,slave设置了参数relay_log_purge=0,所以slave节点需要定期删除中继日志,建议每个slave节点删除中继日志时间错开
[root@slave ~]# crontab -e
0 5 * * * /usr/local/bin/purge_relay_logs --user=root --password=123.com --port=3306 --disable_relay_log_purge >> /var/log/purge_
relay.log 2>&
五、配置VIP
vip配置可以采用两种方式,一种通过keepalived方式管理虚拟ip浮动;另一种通过脚本启动虚拟ip方式(就是不需要任何第三方应用程序)
1.keepalived方式管理虚拟ip
步骤:
①在两台master安装keepalived
[root@master1 ~]# wget http://www.keepalived.org/sof…
[root@master1 ~]# yum -y install openssl openssl-devel
[root@master1 ~]# cd keepalived-1.4.3/
[root@master1 keepalived-1.4.3]# ./configure –prefix=/ && make && make install
[root@master1 keepalived-1.4.3]# whereis keepalived
keepalived: /usr/sbin/keepalived /etc/keepalived
检查cent7启动脚本中执行程序位置
[root@master1 ~]# vim /usr/lib/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After= network-online.target syslog.target
Wants=network-online.target
[Service]
Type=forking
PIDFile=/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-//etc/sysconfig/keepalived
ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
②配置keepalived
master1:
[root@master1 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id mysql-ha1
}
vrrp_instance VI_1 {
state BACKUP
interface eno16777736
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.100
}
}
master2:
[root@master2 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id mysql-ha2
}
vrrp_instance VI_1 {
state BACKUP
interface eno16777736
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.100
}
}
③启动keepalived
[root@master1 ~]# systemctl start keepalived
[root@master2 ~]# systemctl start keepalived
查看master1虚拟ip状态
[root@master1 ~]# ip a
......
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:db:f7:b8 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.12/24 brd 192.168.1.255 scope global dynamic eno16777736
valid_lft 77805sec preferred_lft 77805sec
inet 192.168.1.100/32 scope global eno16777736
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fedb:f7b8/64 scope link
valid_lft forever preferred_lft forever
④MHA引入keepalived
mysql服务进程挂掉后通过MHA停止keepalived,想要引入MHA,我们只需要修改触发脚本文件,添加master发生宕机后对keepalived的处理
[root@manager /]# vim /scripts/master_ip_failover
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
#use MHA::DBHelper;
my (
$command, $ssh_user, $orig_master_host,
$orig_master_ip, $orig_master_port, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password
);
my $vip = '192.168.1.100';
my $ssh_start_vip = "/usr/bin/systemctl start keepalived";
my $ssh_stop_vip = "/usr/bin/systemctl stop keepalived";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
print"\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP on old master:$orig_master_host \n";
&stop_vip();
# updating global catalog, etc
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
# my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error_or_not
#$new_master_handler->connect( $new_master_ip, $new_master_port,
# $new_master_user, $new_master_password, 1 );
## Set read_only=0 on the new master
#$new_master_handler->disable_log_bin_local();
#print "Set read_only=0 on the new master.\n";
#$new_master_handler->disable_read_only();
## Creating an app user on the new master
#print "Creating app user on the new master..\n";
#FIXME_xxx_create_user( $new_master_handler->{dbh} );
#$new_master_handler->enable_log_bin_local();
#$new_master_handler->disconnect();
## Update master ip on the catalog database, etc
#FIXME_xxx;
print"Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
# If you want to continue failover, exit 10.
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the status of the script..OK \n";
#`ssh $ssh_user\@cluster1\"$ssh_start_vip\"`;
# do nothing
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip(){
`ssh $ssh_user\@$new_master_host\"$ssh_start_vip\"`;
}
sub stop_vip(){
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host\"$ssh_stop_vip\"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
⑤停止MHA
[root@manager ~]# masterha_stop –conf=/etc/masterha/app1.cnf
⑥添加参数
[root@manager ~]# vim /etc/masterha/app1.cnf
[server default]
master_ip_failover_script=/scripts/master_ip_failover
manager_workdir=//masterha/app1
manager_log=/masterha/app1/manager.log
........
⑦启动MHA,并且查看状态
[root@manager ~]# nohup masterha_manager –conf=/etc/masterha/app1.cnf &> /tmp/mha_manager.log &
[1] 10520
[root@manager ~]# masterha_check_status –conf=/etc/masterha/app1.cnf
app1 (pid:10520) is running(0:PING_OK), master:192.168.1.12
再次查看集群状态是否会报错
[root@manager ~]# masterha_check_repl –conf=/etc/masterha/app1.cnf
.......
Checking the status of the script..OK
Tue Apr 24 18:39:12 2018 - OK.
Tue Apr 24 18:39:12 2018 - shutdown_script is not defined.
Tue Apr 24 18:39:12 2018 - Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
可以看到没有报错
master_ip_failover内容意思当主库放生故障,会触发MHA切换,manager会停掉主库上的keepalived服务,触发vip飘逸到从库
⑧测试
宕掉master1中mysql服务
[root@master1 ~]# systemctl stop mysqld
查看master1中网卡绑定情况
[root@master1 init.d]# ip a
.......
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:db:f7:b8 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.12/24 brd 192.168.1.255 scope global dynamic eno16777736
valid_lft 74322sec preferred_lft 74322sec
inet 192.168.1.100/32 scope global eno16777736
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fedb:f7b8/64 scope link
valid_lft forever preferred_lft forever
发现vip地址并没飘逸
回到manager查看manager.log文件,发现并没有报错,排查到为master_ip_failover脚本并有杀死keepalived进程
解决方法将两台主机中拷贝keepalived脚本执行文件,在keepalived的tar包解压后keepalived/etc/init.d/keepalived拷贝到/etc/init.d/下,修改master_ip_failover脚本文件参数项
......
my $ssh_start_vip = "/etc/init.d/keepalived start";
my $ssh_stop_vip = "/etc/init.d/keepalived stop";
......
删除app1.failover.complete,重新定义master,启动mha,再一次宕掉master1服务进行验证
[root@master1 ~]# ps -ef | grep keep
root 26079 3646 0 19:47 pts/1 00:00:00 grep –color=auto keep
发现keepalived已经被关闭
再去观察master2的vip地址
[root@master2 ~]# ip a
......
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:4b:6a:1e brd ff:ff:ff:ff:ff:ff
inet 192.168.1.13/24 brd 192.168.1.255 scope global dynamic eno16777736
valid_lft 70169sec preferred_lft 70169sec
inet 192.168.1.100/32 scope global eno16777736
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe4b:6a1e/64 scope link
valid_lft forever preferred_lft forever
vip已经转移验证成功,注意后续重构mysql服务器
2.通过脚本实现虚拟IP切换
步骤:
①手动绑定vip
[root@master1 ~]# ifconfig eno16777736:0 192.168.1.100/24
[root@master1 ~]# ifconfig
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.12 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 fe80::20c:29ff:fedb:f7b8 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:db:f7:b8 txqueuelen 1000 (Ethernet)
RX packets 31589 bytes 8780116 (8.3 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 25911 bytes 5726902 (5.4 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eno16777736:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.100 netmask 255.255.255.0 broadcast 192.168.1.255
ether 00:0c:29:db:f7:b8 txqueuelen 1000 (Ethernet)
②修改app1.cnf文件和master_ip_failover文件
在[server default]下添加master_ip_failover_script=/scripts/master_ip_failover
在之上keepalived的master_ip_failover脚本中修改并添加四项参数
......
my $vip = '192.168.1.100';
my $key = '0';
my $ssh_start_vip = "/usr/sbin/ifconfig eno16777736:$key $vip";
my $ssh_stop_vip = "/usr/sbin/ifconfig eno16777736:$key down";
......
③重新启动MHA
[root@manager ~]# masterha_stop –conf=/etc/masterha/app1.cnf
[root@manager ~]# nohup masterha_manager –conf=/etc/masterha/app1.cnf &> /tmp/mha_manager.log &
④测试
宕掉master1
[root@master1 ~]# systemctl stop mysqld
查看slave复制状态
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.13
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000005
Read_Master_Log_Pos: 154
查看master2 vip状态
[root@master2 ~]# ifconfig
eno16777736: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.13 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 fe80::20c:29ff:fedb:f7b8 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:db:f7:b8 txqueuelen 1000 (Ethernet)
RX packets 32959 bytes 8939121 (8.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 26418 bytes 5872252 (5.6 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eno16777736:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.100 netmask 255.255.255.0 broadcast 192.168.1.255
ether 00:0c:29:db:f7:b8 txqueuelen 1000 (Ethernet)
vip已经被master2接管,验证成功
六、总结
MHA软件由两部分组成,manager工具包和node工具包
Manager工具包工具:
masterha_check_ssh 检查MHA的SSH配置状况
masterha_check_repl 检查MySQL复制状况
masterha_manager 启动MHA
masterha_master_monitor 检测master是否宕机
masterha_master_switch 控制故障转移
masterha_conf_host 添加或删除配置的server信息
Node工具包工具:
save_binary_logs 保存和复制master的二进制日志
apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的slave
filter_mysqlbinlog 去除不必要的ROLLBACK事件
purge_relay_logs 清除中继日志