文章目录
  1. 1. MHA Overview
    1. 1.1. INSTALLATION
      1. 1.1.1. INIT ENV
      2. 1.1.2. INSTALL MHA
    2. 1.2. CONFIGURE MHA
      1. 1.2.1. configure file
      2. 1.2.2. 配置多个HA组
        1. 1.2.2.1. 配置一个全局的配置文件
        2. 1.2.2.2. HA group 1:
        3. 1.2.2.3. HA group 2 :
    3. 1.3. 验证MHA
      1. 1.3.1. CHECK SSH
      2. 1.3.2. CHECK REPL
    4. 1.4. 问题和总结

MHA Overview

MHA performs automating master failover and slave promotion with minimal downtime, usually within 10-30 seconds. MHA prevents replication consistency problems and saves on expenses of having to acquire additional servers. All this with zero performance degradation, no complexity (easy-to-install) and requiring no change to existing deployments.

INSTALLATION

INIT ENV

DB node :

192.168.19.134   MASTER
192.168.19.135   SLAVE
192.168.19.136   SLAVE

MHA manager node :

192.168.19.133  manager 

Centos 6.x :

1
2
yum  -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles  ncftp perl-Params-Validate  perl-CPAN perl-Test-Mock-LWP.noarch perl-LWP-Authen-Negotiate.noarch perl-devel 
yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker

Centos 7

1
2
3
4
yum -y install perl-CPAN
perl -MCPAN -e "install Config::Tiny"
perl -MCPAN -e "install Log::Dispatch"
perl -MCPAN -e "install Parallel::ForkManager"

INSTALL MHA

  • download the rpm packages
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
            mha4mysql-manager-0.56-0.el6.noarch.rpm
    mha4mysql-node-0.56-0.el6.noarch.rpm
    ```
    - rpm -ivh pk-name to install
    ```
    [root@mytaskdb ~]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
    Preparing... ########################################### [100%]
    1:mha4mysql-node ########################################### [100%]
    [root@mytaskdb ~]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
    Preparing... ########################################### [100%]
    1:mha4mysql-manager ########################################### [100%]
    [root@mytaskdb ~]#
  • Centos 7
    1
    2
    3
    4
    5
    tar zxvf mha4mysql-manager-0.56.tar.gz  
    cd mha4mysql-manager-0.56/
    perl Makefile.PL
    make
    make install

CONFIGURE MHA

每个DB节点创建用户 :

1
2
grant all privileges on *.* to mha_test@'192.168.19.%' identified by '123456' ; 
flush privileges ;

configure file

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
[server default]
manager_workdir=/root/mhatest
manager_log=/root/mhatest/manager.log
ssh_user=root
user=mha_test
password=123456
ssh_user=root
repl_user=repl
repl_password=repl
ping_interval=2
shutdown_script=""
master_ip_online_change_script=""
report_script=""
[server1]
hostname=192.168.19.134
master_binlog_dir=/log/mysql3306/binlog
candidate_master=1
[server2]
hostname=192.168.19.132
master_binlog_dir=/log/mysql3306/binlog
candidate_master=1
[server3]
hostname=192.168.19.135
master_binlog_dir=/log/mysql3306/binlog
candidate_master=1

配置多个HA组

配置一个全局的配置文件

file: /etc/masterha_default.cnf

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[server default]
ssh_user=root
user=mha_test
password=123456
repl_user=repl
repl_password=4E1uJaIuc
ping_interval=2
master_ip_online_change_script=""
master_binlog_dir= /log/mysql3306/binlog
remote_workdir=/data/log/masterha
secondary_check_script= masterha_secondary_check -s remote_host1 -s remote_host2
master_ip_failover_script=/script/masterha/master_ip_failover
shutdown_script= /script/masterha/power_manager
report_script= /script/masterha/send_master_failover_mail
HA group 1:

cat /etc/app1.cnf

1
2
3
4
5
6
7
8
9
10
11
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/app1.log
[server1]
hostname=host1
candidate_master=1
[server2]
hostname=host2
candidate_master=1
[server3]
hostname=host3
HA group 2 :

cat /etc/app2.cnf

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[server default]
manager_workdir=/var/log/masterha/app2
manager_log=/var/log/masterha/app2/app2.log
[server1]
hostname=host11
candidate_master=1
[server2]
hostname=host12
candidate_master=1
[server3]
hostname=host13
[server4]
hostname=host14
no_master=1

验证MHA

使用以下命令来进行验证:

1
2
3
masterha_check_ssh  --conf=/root/mhatest/app1.cnf
masterha_check_repl --conf=/root/mhatest/app1.cnf
masterha_manager --conf=/root/mhatest/app1.cnf

CHECK SSH

masterha_check_ssh –conf=/root/mhatest/app1.cnf

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[root@masterham mhatest]# masterha_check_ssh  --conf=/root/mhatest/app1.cnf  
Wed Mar 4 09:12:48 2015 - [info] Reading default configuration from /etc/masterha_default.cnf..
Wed Mar 4 09:12:48 2015 - [info] Reading application default configuration from /root/mhatest/app1.cnf..
Wed Mar 4 09:12:48 2015 - [info] Reading server configuration from /root/mhatest/app1.cnf..
Wed Mar 4 09:12:48 2015 - [info] Starting SSH connection tests..
Wed Mar 4 09:12:49 2015 - [debug]
Wed Mar 4 09:12:48 2015 - [debug] Connecting via SSH from root@192.168.19.134(192.168.19.134:22) to root@192.168.19.135(192.168.19.135:22)..
Wed Mar 4 09:12:48 2015 - [debug] ok.
Wed Mar 4 09:12:48 2015 - [debug] Connecting via SSH from root@192.168.19.134(192.168.19.134:22) to root@192.168.19.136(192.168.19.136:22)..
Wed Mar 4 09:12:49 2015 - [debug] ok.
Wed Mar 4 09:12:49 2015 - [debug]
Wed Mar 4 09:12:48 2015 - [debug] Connecting via SSH from root@192.168.19.135(192.168.19.135:22) to root@192.168.19.134(192.168.19.134:22)..
Wed Mar 4 09:12:49 2015 - [debug] ok.
Wed Mar 4 09:12:49 2015 - [debug] Connecting via SSH from root@192.168.19.135(192.168.19.135:22) to root@192.168.19.136(192.168.19.136:22)..
Wed Mar 4 09:12:49 2015 - [debug] ok.
Wed Mar 4 09:12:50 2015 - [debug]
Wed Mar 4 09:12:49 2015 - [debug] Connecting via SSH from root@192.168.19.136(192.168.19.136:22) to root@192.168.19.134(192.168.19.134:22)..
Wed Mar 4 09:12:49 2015 - [debug] ok.
Wed Mar 4 09:12:49 2015 - [debug] Connecting via SSH from root@192.168.19.136(192.168.19.136:22) to root@192.168.19.135(192.168.19.135:22)..
Wed Mar 4 09:12:50 2015 - [debug] ok.
Wed Mar 4 09:12:50 2015 - [info] All SSH connection tests passed successfully.

CHECK REPL

masterha_check_repl –conf=/root/mhatest/app1.cnf

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
[root@masterham mhatest]# masterha_check_repl --conf=/root/mhatest/app1.cnf
Wed Mar 4 09:13:49 2015 - [info] Reading default configuration from /etc/masterha_default.cnf..
Wed Mar 4 09:13:49 2015 - [info] Reading application default configuration from /root/mhatest/app1.cnf..
Wed Mar 4 09:13:49 2015 - [info] Reading server configuration from /root/mhatest/app1.cnf..
Wed Mar 4 09:13:49 2015 - [info] MHA::MasterMonitor version 0.56.
Wed Mar 4 09:13:50 2015 - [info] GTID failover mode = 0
Wed Mar 4 09:13:50 2015 - [info] Dead Servers:
Wed Mar 4 09:13:50 2015 - [info] Alive Servers:
Wed Mar 4 09:13:50 2015 - [info] 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:13:50 2015 - [info] 192.168.19.135(192.168.19.135:3306)
Wed Mar 4 09:13:50 2015 - [info] 192.168.19.136(192.168.19.136:3306)
Wed Mar 4 09:13:50 2015 - [info] Alive Slaves:
Wed Mar 4 09:13:50 2015 - [info] 192.168.19.135(192.168.19.135:3306) Version=5.6.19-67.0-log (oldest major version between slaves) log-bin:enabled
Wed Mar 4 09:13:50 2015 - [info] Replicating from 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:13:50 2015 - [info] 192.168.19.136(192.168.19.136:3306) Version=5.6.19-67.0-log (oldest major version between slaves) log-bin:enabled
Wed Mar 4 09:13:50 2015 - [info] Replicating from 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:13:50 2015 - [info] Current Alive Master: 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:13:50 2015 - [info] Checking slave configurations..
Wed Mar 4 09:13:50 2015 - [info] Checking replication filtering settings..
Wed Mar 4 09:13:50 2015 - [info] binlog_do_db= , binlog_ignore_db=
Wed Mar 4 09:13:50 2015 - [info] Replication filtering check ok.
Wed Mar 4 09:13:50 2015 - [info] GTID (with auto-pos) is not supported
Wed Mar 4 09:13:50 2015 - [info] Starting SSH connection tests..
Wed Mar 4 09:13:51 2015 - [info] All SSH connection tests passed successfully.
Wed Mar 4 09:13:51 2015 - [info] Checking MHA Node version..
Wed Mar 4 09:13:52 2015 - [info] Version check ok.
Wed Mar 4 09:13:52 2015 - [info] Checking SSH publickey authentication settings on the current master..
Wed Mar 4 09:13:52 2015 - [info] HealthCheck: SSH to 192.168.19.134 is reachable.
Wed Mar 4 09:13:52 2015 - [info] Master MHA Node version is 0.56.
Wed Mar 4 09:13:52 2015 - [info] Checking recovery script configurations on 192.168.19.134(192.168.19.134:3306)..
Wed Mar 4 09:13:52 2015 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/log/mysql3306/binlog --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=binlog.000005
Wed Mar 4 09:13:52 2015 - [info] Connecting to root@192.168.19.134(192.168.19.134:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /log/mysql3306/binlog, up to binlog.000005
Wed Mar 4 09:13:52 2015 - [info] Binlog setting check done.
Wed Mar 4 09:13:52 2015 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Mar 4 09:13:52 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_test' --slave_host=192.168.19.135 --slave_ip=192.168.19.135 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.19-67.0-log --manager_version=0.56 --relay_dir=/log/mysql3306/binlog --current_relay_log=relaylog.3794333 --slave_pass=xxx
Wed Mar 4 09:13:52 2015 - [info] Connecting to root@192.168.19.135(192.168.19.135:22)..
Checking slave recovery environment settings..
Relay log found at /log/mysql3306/binlog, up to relaylog.3794334
Temporary relay log file is /log/mysql3306/binlog/relaylog.3794334
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Mar 4 09:13:52 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_test' --slave_host=192.168.19.136 --slave_ip=192.168.19.136 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.19-67.0-log --manager_version=0.56 --relay_dir=/log/mysql3306/binlog --current_relay_log=relaylog.3794333 --slave_pass=xxx
Wed Mar 4 09:13:52 2015 - [info] Connecting to root@192.168.19.136(192.168.19.136:22)..
Checking slave recovery environment settings..
Relay log found at /log/mysql3306/binlog, up to relaylog.3794334
Temporary relay log file is /log/mysql3306/binlog/relaylog.3794334
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Mar 4 09:13:52 2015 - [info] Slaves settings check done.
Wed Mar 4 09:13:52 2015 - [info]
192.168.19.134(192.168.19.134:3306) (current master)
+--192.168.19.135(192.168.19.135:3306)
+--192.168.19.136(192.168.19.136:3306)
Wed Mar 4 09:13:52 2015 - [info] Checking replication health on 192.168.19.135..
Wed Mar 4 09:13:52 2015 - [info] ok.
Wed Mar 4 09:13:52 2015 - [info] Checking replication health on 192.168.19.136..
Wed Mar 4 09:13:52 2015 - [info] ok.
Wed Mar 4 09:13:52 2015 - [warning] master_ip_failover_script is not defined.
Wed Mar 4 09:13:52 2015 - [warning] shutdown_script is not defined.
Wed Mar 4 09:13:52 2015 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.

masterha_manager –conf=/root/mhatest/app1.cnf
日志:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
Wed Mar  4 09:15:00 2015 - [info] MHA::MasterMonitor version 0.56.
Wed Mar 4 09:15:01 2015 - [info] GTID failover mode = 0
Wed Mar 4 09:15:01 2015 - [info] Dead Servers:
Wed Mar 4 09:15:01 2015 - [info] Alive Servers:
Wed Mar 4 09:15:01 2015 - [info] 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:15:01 2015 - [info] 192.168.19.135(192.168.19.135:3306)
Wed Mar 4 09:15:01 2015 - [info] 192.168.19.136(192.168.19.136:3306)
Wed Mar 4 09:15:01 2015 - [info] Alive Slaves:
Wed Mar 4 09:15:01 2015 - [info] 192.168.19.135(192.168.19.135:3306) Version=5.6.19-67.0-log (oldest major version between slaves) log-bin:enabled
Wed Mar 4 09:15:01 2015 - [info] Replicating from 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:15:01 2015 - [info] 192.168.19.136(192.168.19.136:3306) Version=5.6.19-67.0-log (oldest major version between slaves) log-bin:enabled
Wed Mar 4 09:15:01 2015 - [info] Replicating from 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:15:01 2015 - [info] Current Alive Master: 192.168.19.134(192.168.19.134:3306)
Wed Mar 4 09:15:01 2015 - [info] Checking slave configurations..
Wed Mar 4 09:15:01 2015 - [info] Checking replication filtering settings..
Wed Mar 4 09:15:01 2015 - [info] binlog_do_db= , binlog_ignore_db=
Wed Mar 4 09:15:01 2015 - [info] Replication filtering check ok.
Wed Mar 4 09:15:01 2015 - [info] GTID (with auto-pos) is not supported
Wed Mar 4 09:15:01 2015 - [info] Starting SSH connection tests..
Wed Mar 4 09:15:02 2015 - [info] All SSH connection tests passed successfully.
Wed Mar 4 09:15:02 2015 - [info] Checking MHA Node version..
Wed Mar 4 09:15:03 2015 - [info] Version check ok.
Wed Mar 4 09:15:03 2015 - [info] Checking SSH publickey authentication settings on the current master..
Wed Mar 4 09:15:03 2015 - [info] HealthCheck: SSH to 192.168.19.134 is reachable.
Wed Mar 4 09:15:03 2015 - [info] Master MHA Node version is 0.56.
Wed Mar 4 09:15:03 2015 - [info] Checking recovery script configurations on 192.168.19.134(192.168.19.134:3306)..
Wed Mar 4 09:15:03 2015 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/log/mysql3306/binlog --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=binlog.000005
Wed Mar 4 09:15:03 2015 - [info] Connecting to root@192.168.19.134(192.168.19.134:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /log/mysql3306/binlog, up to binlog.000005
Wed Mar 4 09:15:03 2015 - [info] Binlog setting check done.
Wed Mar 4 09:15:03 2015 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Mar 4 09:15:03 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_test' --slave_host=192.168.19.135 --slave_ip=192.168.19.135 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.19-67.0-log --manager_version=0.56 --relay_dir=/log/mysql3306/binlog --current_relay_log=relaylog.3794333 --slave_pass=xxx
Wed Mar 4 09:15:03 2015 - [info] Connecting to root@192.168.19.135(192.168.19.135:22)..
Checking slave recovery environment settings..
Relay log found at /log/mysql3306/binlog, up to relaylog.3794334
Temporary relay log file is /log/mysql3306/binlog/relaylog.3794334
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Mar 4 09:15:03 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_test' --slave_host=192.168.19.136 --slave_ip=192.168.19.136 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.19-67.0-log --manager_version=0.56 --relay_dir=/log/mysql3306/binlog --current_relay_log=relaylog.3794333 --slave_pass=xxx
Wed Mar 4 09:15:03 2015 - [info] Connecting to root@192.168.19.136(192.168.19.136:22)..
Checking slave recovery environment settings..
Relay log found at /log/mysql3306/binlog, up to relaylog.3794334
Temporary relay log file is /log/mysql3306/binlog/relaylog.3794334
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Mar 4 09:15:04 2015 - [info] Slaves settings check done.
Wed Mar 4 09:15:04 2015 - [info]
192.168.19.134(192.168.19.134:3306) (current master)
+--192.168.19.135(192.168.19.135:3306)
+--192.168.19.136(192.168.19.136:3306)
Wed Mar 4 09:15:04 2015 - [warning] master_ip_failover_script is not defined.
Wed Mar 4 09:15:04 2015 - [warning] shutdown_script is not defined.
Wed Mar 4 09:15:04 2015 - [info] Set master ping interval 2 seconds.
Wed Mar 4 09:15:04 2015 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Wed Mar 4 09:15:04 2015 - [info] Starting ping health check on 192.168.19.134(192.168.19.134:3306)..
Wed Mar 4 09:15:04 2015 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

问题和总结

遇到问题:

1
2
3
4
1. 使用的虚拟机 其中一台ssh非常慢,后来发现是 反向解析的问题 
修改 /etc/ssh/sshd_config 的 UseDNS YES 改为 no
2. 本次部署MySQL 环境的时候,进行过一次虚拟机的clone 网络进行了重新设置,db只修改了server_id ,
没有修改uuid ,导致check——repl 的时候,总是会有问题。

MHA的配置还是相对比较简单的,接下来会进行相关场景的模拟测试

文章目录
  1. 1. MHA Overview
    1. 1.1. INSTALLATION
      1. 1.1.1. INIT ENV
      2. 1.1.2. INSTALL MHA
    2. 1.2. CONFIGURE MHA
      1. 1.2.1. configure file
      2. 1.2.2. 配置多个HA组
        1. 1.2.2.1. 配置一个全局的配置文件
        2. 1.2.2.2. HA group 1:
        3. 1.2.2.3. HA group 2 :
    3. 1.3. 验证MHA
      1. 1.3.1. CHECK SSH
      2. 1.3.2. CHECK REPL
    4. 1.4. 问题和总结