[toc]
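High-availability MariaDB topology diagram
I. Design prerequisites
1. Time synchronization: # ntpdate 172.16.0.1 or # chronyc sources
2. IP address resolution must work for every host, and each hostname must match the output of # uname -n
Therefore /etc/hosts contains the following entries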
172.16.23.10 node1.rj.com node1
172.16.23.11 node2.rj.com node2
172.16.23.12 node3.rj.com node3
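The name-resolution requirement can be checked mechanically. The following is a minimal sketch, assuming a file in /etc/hosts format; the function name `hosts_has_entry` is hypothetical and not part of any tool used in this article.

```shell
# hosts_has_entry FILE IP NAME
# Succeeds if FILE (in /etc/hosts format) maps IP to NAME, either as the
# canonical name or as an alias. Comment lines are ignored.
hosts_has_entry() {
    awk -v ip="$2" -v name="$3" '
        /^[ \t]*#/ { next }
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # a trailing comment starts here
                if ($i == name) found = 1
            }
        }
        END { exit !found }
    ' "$1"
}

# Example (run on each node, with that node's own address):
#   hosts_has_entry /etc/hosts 172.16.23.10 "$(uname -n)" && echo ok
```

Passing the output of `uname -n` as NAME covers both halves of the prerequisite at once: the name resolves, and it is the name the node reports for itself.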
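II. Setting up the environment and installing corosync, pacemaker, and crmsh
Install ansible on all three machines (ansible manages the hosts over SSH, so key-based trust between them is required; node1 serves as the bastion host)
Installation and configuration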
# ssh-keygen
# for i in {10..12}; do ssh-copy-id -i 172.16.23.$i ; done ;
# This command pushes the public key to all three machines, including the bastion host itself
# vim /etc/ansible/hosts
[mariadb]
172.16.23.10
172.16.23.11
172.16.23.12
# ansible mariadb -m ping
# Test connectivity between the bastion host and the three hosts
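The inventory group can also be generated instead of typed by hand, which helps when the address range grows. This is a sketch only; `make_inventory` is a hypothetical helper, and it assumes the hosts occupy a contiguous range in the last octet, as they do here.

```shell
# make_inventory GROUP PREFIX FIRST LAST
# Prints an INI-style ansible inventory group whose hosts are
# PREFIX.FIRST through PREFIX.LAST (inclusive).
# The body runs in a subshell so the loop variable does not leak.
make_inventory() (
    printf '[%s]\n' "$1"
    i=$3
    while [ "$i" -le "$4" ]; do
        printf '%s.%s\n' "$2" "$i"
        i=$((i + 1))
    done
)

# Example:
#   make_inventory mariadb 172.16.23 10 12 >> /etc/ansible/hosts
```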
# vim /etc/hosts   # hostname resolution configuration
172.16.23.10 node1.rj.com node1
172.16.23.11 node2.rj.com node2
172.16.23.12 node3.rj.com node3
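# ansible mariadb -m command -a "ntpdate 172.16.0.1"   # synchronize the time on all three hosts
# ansible mariadb -m yum -a "name=pacemaker,mariadb-server state=present"   # install mariadb, corosync, and pacemaker on all three hosts
Note: when pacemaker is installed with yum, corosync is pulled in automatically as a dependency
corosync configuration
# vim /etc/corosync/corosync.conf
Add the following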
totem {
    version: 2
    crypto_cipher: aes256
    crypto_hash: sha1
    interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0
        mcastaddr: 239.255.23.1
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: node1.rj.com
        nodeid: 1
    }
    node {
        ring0_addr: node2.rj.com
        nodeid: 2
    }
    node {
        ring0_addr: node3.rj.com
        nodeid: 3
    }
}
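Because corosync.conf is built from nested blocks, a missing closing brace is an easy mistake to make when editing it by hand. The sketch below is a rough sanity check to run before distributing the file; `brace_depth` is a hypothetical helper, and it only counts braces, so it does not validate the configuration itself or catch braces that are balanced but misplaced.

```shell
# brace_depth FILE
# Prints the net brace depth of FILE: 0 means every "{" has a matching "}".
# A positive number means that many blocks were left open; a negative
# number means there are surplus closing braces.
brace_depth() {
    awk '
        {
            line = $0
            sub(/#.*/, "", line)               # ignore comments
            opens  += gsub(/[{]/, "", line)    # gsub returns the match count
            closes += gsub(/[}]/, "", line)
        }
        END { print opens - closes }
    ' "$1"
}

# Example:
#   [ "$(brace_depth /etc/corosync/corosync.conf)" = "0" ] || echo "unbalanced braces"
```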
# corosync-keygen
# ansible mariadb -m copy -a "src=/etc/corosync/authkey dest=/etc/corosync/"
# ansible mariadb -m copy -a "src=/etc/corosync/corosync.conf dest=/etc/corosync/"
# ansible mariadb -m service -a "name=corosync state=started"
# ansible mariadb -m service -a "name=pacemaker state=started"
# tcpdump -i eno16777736 -nn port 5405
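Packet capture analysis
tcpdump can be used to watch the heartbeat messages exchanged among the three hosts
Note: for the cluster resource configuration, mariadb must be enabled at boot
so that the cluster can recognize its unit file; it must not be started before the resource is configured, so the service has to be stopped at this point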
# ansible mariadb -m service -a "name=mariadb enabled=on"
# ansible mariadb -m service -a "name=corosync enabled=on"
# ansible mariadb -m service -a "name=pacemaker enabled=on"
# mkdir rpm && cd rpm
# wget 172.18.0.1/pub/Sources/7.x86_64/crmsh/crmsh-2.1.4-1.1.x86_64.rpm   # download crmsh and the rpm packages it depends on
# wget 172.18.0.1/pub/Sources/7.x86_64/crmsh/pssh*.rpm
# wget 172.18.0.1/pub/Sources/7.x86_64/crmsh/python-pssh*.rpm
# ansible mariadb -m copy -a "src=/root/rpm dest=/root/"
# for i in {1..3}; do ssh node$i yum -y install /root/rpm/*; done
Check whether the corosync engine started correctly
# ansible mariadb -m command -a "grep -e 'Corosync Cluster Engine' -e 'configuration file' /var/log/messages"
Feb 13 11:27:35 node1 systemd: Stopped Corosync Cluster Engine.
Feb 13 14:08:03 node1 systemd: Starting Corosync Cluster Engine...
Feb 13 14:08:04 node1 corosync: Starting Corosync Cluster Engine (corosync): [ OK ]
Feb 13 14:08:04 node1 systemd: Started Corosync Cluster Engine.
Feb 13 14:28:16 node1 systemd: Mounted NFSD configuration filesystem.
Feb 13 14:28:44 node1 smartd[787]: Opened configuration file /etc/smartmontools/smartd.conf
Feb 13 14:32:12 node1 systemd: Starting Corosync Cluster Engine...
Feb 13 14:32:13 node1 corosync: Starting Corosync Cluster Engine (corosync): [ OK ]
Feb 13 14:32:13 node1 systemd: Started Corosync Cluster Engine.
Feb 13 14:43:03 node1 systemd: Started Corosync Cluster Engine.
Check whether the membership notifications between nodes look normal
# ansible mariadb -m command -a "grep TOTEM /var/log/messages"
Check whether any errors were logged during startup
# ansible mariadb -m command -a "grep 'ERROR' /var/log/messages"
Check whether pacemaker started correctly
# ansible mariadb -m command -a "grep 'pacemaker' /var/log/messages"
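The log checks above can be folded into a single pass/fail test per node. This is a minimal sketch; `corosync_log_ok` is a hypothetical helper, and it relies only on the "Started Corosync Cluster Engine" message shown in the sample output and on the literal string ERROR.

```shell
# corosync_log_ok LOGFILE
# Succeeds when LOGFILE contains a "Started Corosync Cluster Engine" line
# and no line containing "ERROR".
corosync_log_ok() {
    grep -q 'Started Corosync Cluster Engine' "$1" || return 1
    if grep -q 'ERROR' "$1"; then
        return 1
    fi
    return 0
}

# Example:
#   corosync_log_ok /var/log/messages && echo "corosync looks healthy"
```

Note that /var/log/messages also holds entries from earlier boots, so an old ERROR line makes this check fail even if the current start was clean.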
Use the following command to view the node status
# crm status
Last updated: Mon Feb 13 15:43:08 2017
Last change: Mon Feb 13 14:33:58 2017 by hacluster via crmd on node3.rj.com
Stack: corosync
Current DC: node3.rj.com (version 1.1.13-10.el7-44eb2dd) - partition with quorum
3 nodes and 0 resources configured
Online: [ node1.rj.com node2.rj.com node3.rj.com ]
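For scripted checks, the number of online nodes can be pulled out of the status output. This sketch assumes the `Online: [ ... ]` line format shown in the output above; `online_count` is a hypothetical helper, not a crmsh command.

```shell
# online_count
# Reads `crm status` output on stdin and prints how many node names are
# listed between "Online: [" and "]". Prints 0 if no such line is found.
online_count() {
    sed -n 's/.*Online: \[ *\([^]]*\)].*/\1/p' | head -n 1 | wc -w | tr -d ' '
}

# Example:
#   [ "$(crm status | online_count)" -eq 3 ] || echo "not all nodes are online"
```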
View the processes started by pacemaker and corosync
# ansible mariadb -m command -a "ps auxf " | grep pacemaker
root 1720 0.0 1.3 130484 6384 ? Ss 14:33 0:00 /usr/sbin/pacemakerd -f
haclust+ 1729 0.0 2.7 132816 13268 ? Ss 14:33 0:01 \_ /usr/libexec/pacemaker/cib
root 1730 0.0 1.4 133968 6956 ? Ss 14:33 0:00 \_ /usr/libexec/pacemaker/stonithd
root 1731 0.0 0.8 102936 4108 ? Ss 14:33 0:00 \_ /usr/libexec/pacemaker/lrmd
haclust+ 1732 0.0 1.3 124780 6736 ? Ss 14:33 0:00 \_ /usr/libexec/pacemaker/attrd
haclust+ 1733 0.0 0.7 114896 3668 ? Ss 14:33 0:00 \_ /usr/libexec/pacemaker/pengine
haclust+ 1734 0.0 1.5 143160 7484 ? Ss 14:33 0:00 \_ /usr/libexec/pacemaker/crmd
# ansible mariadb -m command -a "ps auxf" | grep corosync
root 1483 0.6 7.9 134848 38436 ? Ssl 14:32 0:28 corosync
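III. Using crmsh to configure the cluster's IP address resource and mariadb resource
To list all the resource agents available in a given class, use commands like the following
Installation and configuration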
#crm ra list lsb
#crm ra list ocf heartbeat
#crm ra list ocf pacemaker
#crm ra list stonith
Configure the VIP
# crm configure property stonith-enabled=false   # disable STONITH
# crm configure primitive DBIP ocf:heartbeat:IPaddr params ip=172.16.23.23   # add the VIP resource
# crm configure verify   # check for errors
# crm configure commit   # commit the configuration
# crm configure show   # display the configuration
node 1: node1.rj.com
node 2: node2.rj.com
node 3: node3.rj.com
primitive DBIP IPaddr \
params ip=172.16.23.23
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.13-10.el7-44eb2dd \
cluster-infrastructure=corosync \
stonith-enabled=false
# crm node standby node1.rj.com   # put node1 into standby
# ansible mariadb -m command -a "ip addr list"   # see which host the VIP has moved to
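Rather than reading the address listing by eye, the presence of the VIP can be tested directly. This is a sketch under the assumption that the VIP appears as an `inet <address>/<prefix>` line in `ip addr` output; `has_vip` is a hypothetical helper.

```shell
# has_vip VIP
# Reads `ip addr` style output on stdin; succeeds if VIP appears as an
# "inet VIP/prefix" address on any interface.
has_vip() {
    # Escape the dots so they match literally, and anchor on the trailing
    # "/" so that 172.16.23.23 does not match 172.16.23.230.
    grep -Eq "inet[[:space:]]+$(printf '%s' "$1" | sed 's/\./\\./g')/"
}

# Example (run on a node):
#   ip addr list | has_vip 172.16.23.23 && echo "this node holds the VIP"
```

After node1 goes into standby, exactly one of the remaining nodes should report the VIP.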
# crm ra info systemd:mariadb   # show the syntax of the systemd-class mariadb resource
systemd unit file for mariadb (systemd:mariadb)
MariaDB database server
Operations' defaults (advisory minimum):
start timeout=15
stop timeout=15
status timeout=15
restart timeout=15
monitor timeout=15 interval=15 start-delay=15
VIP resource
When node1 is put into standby with # crm node standby node1.rj.com
Define the mariadb resource and set up monitoring
# crm configure primitive MDB systemd:mariadb op start timeout=15s op stop timeout=15s op monitor interval=15s timeout=15s
# crm configure group DBservice DBIP MDB
# crm configure verify
# crm configure commit
# crm configure show
node 1: node1.rj.com \
attributes standby=on
node 2: node2.rj.com \
attributes standby=off
node 3: node3.rj.com \
attributes standby=off
primitive DBIP IPaddr \
params ip=172.16.23.23
primitive MDB systemd:mariadb \
op start timeout=15s interval=0 \
op stop timeout=15s interval=0 \
op monitor interval=15s timeout=15s
group DBservice DBIP MDB
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.13-10.el7-44eb2dd \
cluster-infrastructure=corosync \
stonith-enabled=false \
no-quorum-policy=ignore
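Add test databases to the mariadb service and configure a remote user so that remote access can be tested
Note: at this point mariadb is already running on one node as a cluster resource; on the other two nodes the mariadb service must be started before their databases can be modified,
and once the changes are made, mariadb must be stopped again on those two nodes
Run on node1, node2, and node3: # mysql -e "GRANT ALL ON *.* TO 'root'@'%.%.%.%' IDENTIFIED BY 'centos.123';"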
node1 # mysql -e "CREATE DATABASE NODE1"
node2 # mysql -e "CREATE DATABASE NODE2"
node3 # mysql -e "CREATE DATABASE NODE3"
Bring up a test machine and run the test
node1
# systemctl stop corosync.service pacemaker.service
# when the services on node1 are stopped
# systemctl stop corosync.service pacemaker.service
# when the services on node2 are stopped
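During a failover test the service is briefly unreachable while the resources move, so a retry loop on the test machine is more useful than a single connection attempt. The following is a generic sketch; `wait_until` is a hypothetical helper, and the example command assumes the VIP 172.16.23.23 and the root account granted above.

```shell
# wait_until TRIES DELAY CMD [ARGS...]
# Runs CMD up to TRIES times, sleeping DELAY seconds after each failed
# attempt. Succeeds as soon as CMD succeeds; fails if every attempt fails.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    n=0
    while [ "$n" -lt "$tries" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 1
}

# Example from the test machine:
#   wait_until 30 1 mysql -h 172.16.23.23 -uroot -p'centos.123' -e 'SHOW DATABASES'
```

Because each node was given a differently named database (NODE1, NODE2, NODE3), the output of `SHOW DATABASES` through the VIP shows which node is currently serving.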