文章

MySQL MHA 集群搭建

MHA 集群搭建

通过 MHA 软件实现 MySQL 高可用集群搭建,需要4台服务器

IP地址类型安装软件hostname
172.16.22.165MySQL主MySQL8.0.33、MHA nodemaster
172.16.22.166MySQL从MySQL8.0.33、MHA nodenode1
172.16.22.167MySQL从MySQL8.0.33、MHA nodenode2
172.16.22.168MHA管理节点MHA node、MHA managermanager
172.16.22.169VIP

官方wiki

1.同步时间

配置 chronyd 时间同步

yum install chrony
vi /etc/chrony.conf
# 修改 server
server ntp.aliyun.com iburst
# 立即执行一次同步
chronyc makestep
systemctl start chronyd
systemctl enable chronyd

2.安装 MySQL 软件

参考 MySQL8 安装部署,在my.cnf中修改server-id,参考如下

[mysql]
no-beep
prompt="\u@mysqldb \R:\m:\s [\d]> "
auto-rehash
default-character-set=utf8
[mysqld]
########basic settings########
server-id=1
port=3306
user = mysql
datadir=/opt/mysql/data/
character-set-server=utf8mb4
log_bin = master-bin
binlog_format='MIXED'
log-slave-updates=true
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
default_time_zone='+8:00'
lower_case_table_names=1
relay_log_purge = 0
relay-log=relay-log-bin
relay_log_recovery = 1
# 
[client]
socket=/var/lib/mysql/mysql.sock

3.配置 MySQL 集群

# master 创建同步账号
# 如果备库从主库恢复在主库建就行,否则3个节点都要创建
create user mhaslave@'%' identified with mysql_native_password by '990815@XXy';
grant replication slave on *.* to mhaslave@'%';
# 查看 binlog 名称及位置
show master status;

# 从服务器连接到主服务器
change master to master_host='172.16.22.165', master_user='mhaslave', master_password='990815@XXy', master_log_file='master-bin.000007', master_log_pos=660;
start slave;
show slave status\G

# master 创建供 mha 使用的账户,这里只在主库创建是因为集群已经搭建,账号信息会同步
create user mha@'%' identified with mysql_native_password by '990815@XXy';
grant all privileges on *.* to mha@'%';

# 从服务器设置为只读
set global read_only=1;

4.安装 MHA

# 各节点执行
# 1.安装依赖
yum install epel-release -y
yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN
# 2.安装 node 包
tar -zxvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58
perl Makefile.PL
make
make install
# 3.配置环境变量
vim /etc/profile
# 添加
if [ -d "/usr/local/bin" ] ; then
    PATH="/usr/local/bin:$PATH"
fi

source /etc/profile

# manager 安装 mha4mysql-manager
yum -y install perl-Module-Install
perl Makefile.PL
make
make install

manager 安装成功后 /usr/local/bin 下生成的文件:

masterha_check_ssh            #检查 MHA 的 SSH 配置状况
masterha_check_repl           #检查 MySQL 复制状况
masterha_manger               #启动 manager的脚本
masterha_check_status         #检测当前 MHA 运行状态
masterha_master_monitor       #检测 master 是否宕机
masterha_master_switch        #控制故障转移(自动或者 手动)
masterha_conf_host            #添加或删除配置的 server 信息
masterha_stop                 #关闭manager

node 安装成功后 /usr/local/bin 下生成的文件:

save_binary_logs              #保存和复制 master 的二进制日志
apply_diff_relay_logs         #识别差异的中继日志事件并将其差异的事件应用于其他的slave
filter_mysqlbinlog            #去除不必要的 ROLLBACK 事件(MHA 已不再使用这个工具)
purge_relay_logs              #清除中继日志(不会阻塞 SQL 线程)

5.配置服务器免密登录

# manager
# 一直回车
ssh-keygen -t rsa
# 复制 SSH 公钥复制到远程服务器
ssh-copy-id 172.16.22.165
ssh-copy-id 172.16.22.166
ssh-copy-id 172.16.22.167
# 测试
ssh 172.16.22.165

# master
ssh-keygen -t rsa
ssh-copy-id 172.16.22.166
ssh-copy-id 172.16.22.167
# node1
ssh-keygen -t rsa
ssh-copy-id 172.16.22.165
ssh-copy-id 172.16.22.167
# node2
ssh-keygen -t rsa
ssh-copy-id 172.16.22.165
ssh-copy-id 172.16.22.166

6.配置 manager

6.1 复制脚本

master_ip_failover  		#自动切换时 VIP 管理的脚本
master_ip_online_change 	#在线切换时 vip 的管理脚本
power_manager 				#故障发生后关闭主机的脚本
send_report 				#因故障切换后发送报警的脚本
cp -rp /opt/mha4mysql-manager-0.58/samples/scripts/ /usr/local/bin/

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin/

vim /usr/local/bin/master_ip_failover
#完整内容

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);
#############################添加内容部分#########################################
my $vip = '172.16.22.169';              #指定vip的地址
my $brdc = '172.16.22.255';             #指定vip的广播地址
my $ifdev = 'ens33';                    #指定vip绑定的网卡
my $key = '1';                          #指定vip绑定的虚拟网卡序列号
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";   #代表此变量值为ifconfig ens33:1 172.16.22.169
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";    #代表此变量值为ifconfig ens33:1 172.16.22.169 down
my $exit_code = 0;                                                                                      #指定退出状态码为0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
###################################################################################
GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
#############################添加内容部分#########################################
  print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
###################################################################################
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
#############################添加内容部分#########################################
      print "Disabling the VIP on old master: $orig_master_host \n";
      &stop_vip();
###################################################################################
      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
#############################添加内容部分#########################################
      print "Enabling the VIP - $vip on the new master - $new_master_host \n";
      &start_vip();
###################################################################################
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      print "Creating app user on the new master..\n";
#      FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
#      FIXME_xxx;

      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}
#############################添加内容部分#########################################
sub start_vip() {
  `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
  `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
###################################################################################
sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

6.2 创建 MHA 软件目录并拷贝配置文件

参数说明

参数说明
manager_log指定manager日志路径
manager_workdir指定manager工作目录
master_binlog_dir指定master保存binlog的位置
master_ip_failover_script设置自动failover时候的切换脚本
master_ip_online_change_script设置手动切换脚本
usermha访问数据库的账号
passwordmha访问数据库的密码
ping_interval心跳检测间隔,默认是3秒,尝试三次没有回应的时候自动进行failover
remote_workdirmha在远程节点上的工作目录
repl_user设置主从复制的用户
repl_password设置主从复制的用户密码
report_script设置发生故障切换的时候发送邮件提醒
secondary_check_script指定检查的从服务器IP地址
shutdown_script设置故障发生后关闭故障主机脚本(该脚本的主要作用是关闭主机防止发生脑裂)
ssh_user设置ssh的登录用户名
candidate_master设置为候选master,设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个从库不是集群中最新的slave
check_repl_delay默认情况下如果一个slave落后master 超过100M的relay logs的话,MHA将不会选择该slave作为一个新的master, 因为对于这个slave的恢复需要花费很长时间;通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master
# 所有节点执行
mkdir -p /opt/mysql-mha/mha-node
# manager 节点执行
mkdir -p /opt/mysql-mha/mha
# 复制 /opt/mha4mysql-manager-0.58/samples/conf/app1.cnf 到 /opt/mysql-mha/ 或直接新建 mysql_mha.cnf 配置文件,写入:

[server default]
manager_log=/opt/mysql-mha/manager.log
manager_workdir=/opt/mysql-mha/mha
master_binlog_dir=/opt/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/scripts/master_ip_online_change
user=mha
password=990815@XXy
port=3306
ping_interval=1
remote_workdir=/opt/mysql-mha/mha-node
repl_user=mhaslave
repl_password=990815@XXy
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 172.16.22.166 -s 172.16.22.167
shutdown_script=""
ssh_user=root

[server1]
hostname=172.16.22.165
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=172.16.22.166
port=3306

[server3]
hostname=172.16.22.167
port=3306

# 第一次设置需要在 master 节点手动设置 VIP
/sbin/ifconfig ens33:1 172.16.22.169/24

6.3 测试连接

manager 节点执行

  1. ssh 连接测试
masterha_check_ssh -conf=/opt/mysql-mha/mysql_mha.cnf

  1. mysql 主从连接
masterha_check_repl -conf=/opt/mysql-mha/mysql_mha.cnf

6.4 启动 MHA

# nohup 启动
nohup masterha_manager \
--conf=/opt/mysql-mha/mysql_mha.cnf \
--remove_dead_master_conf \
--ignore_last_failover < /dev/null > /var/log/mha_manager.log 2>&1 &

# systemd 托管
cat <<EOF >/etc/systemd/system/mha.service
[Unit]
Description=MySQL MHA Manager
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/bash -c '/usr/bin/masterha_manager --conf=/opt/mysql-mha/mysql_mha.cnf --remove_dead_master_conf --ignore_last_failover >> /var/log/mha_manager.log 2>&1'
ExecStop=/usr/local/bin/masterha_stop --conf=/opt/mysql-mha/mysql_mha.cnf
Restart=always

[Install]
WantedBy=multi-user.target
EOF

6.5 一些命令

# 停止 MHA Manager
masterha_stop --conf=/opt/mysql-mha/mysql_mha.cnf
# 查看状态
masterha_check_status --conf=/opt/mysql-mha/mysql_mha.cnf

6.6 报错处理

报错:bash: apply_diff_relay_logs: command not found,检查发现节点配置正常,通过以下方法解决报错

# 3个节点都执行
ln -s /usr/local/bin/apply_diff_relay_logs  /usr/bin/
ln -s /usr/local/bin/save_binary_logs   /usr/bin/

报错Can't exec "mysqlbinlog": No such file or directory

# 3个节点都执行
ln -s /opt/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /opt/mysql/bin/mysql /usr/bin/mysql

报错Bareword "FIXME_xxx" not allowed while "strict subs" in use at /usr/local/bin/master_ip_failover line 93. 注释 /usr/local/bin/master_ip_failover 存在 FIXME_xxx 的行

警告relay_log_purge=0 is not set on slave 在数据库节点my.cnf添加 relay_log_purge=0,重启数据库。 当 relay_log_purge=1时,旧relay logs会在SQL线程执行完毕后被自动删除,当 relay_log_purge=0 时,旧的 relay log 则会被保留。

报错[/usr/local/share/perl5/MHA/MasterFailover.pm, ln1612] Failed to activate master IP address for 172.16.22.166(172.16.22.166:3306) with return code 10:0 注释 /usr/local/bin/master_ip_failover 脚本中 FIXME_xxx相关内容。

解决中英字不兼容报错的问题 dos2unix /usr/local/bin/master_ip_failover

License:  CC BY 4.0