RHCA436: Building a pacemaker + corosync cluster on CentOS 8
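Cluster components:
pacemaker: the cluster software (cluster resource manager)
corosync: the component that handles communication between cluster nodes
pcs: the command-line tool
pcsd: the cluster's background service (daemon)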
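Notes:
1. Hostname resolution
Cluster nodes can reach each other by IP address or by hostname. Provide the hostnames through /etc/hosts rather than DNS, because the DNS server may go down. These hostnames are only used inside the cluster and do not need to be resolvable from outside.
If DNS is down or slow to respond, the cluster may conclude that the heartbeat has failed.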
[root@nodea ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# External network
172.25.254.254 classroom.example.com classroom
172.25.254.254 content.example.com content
172.25.254.254 materials.example.com materials
172.25.250.9 workstation.lab.example.com workstation
172.25.250.254 bastion.lab.example.com bastion
172.25.250.10 nodea.lab.example.com nodea
172.25.250.11 nodeb.lab.example.com nodeb
172.25.250.12 nodec.lab.example.com nodec
172.25.250.13 noded.lab.example.com noded
172.25.250.15 storage.lab.example.com storage
# Internal network, cannot be pinged from outside
192.168.0.10 nodea.private.example.com
192.168.0.11 nodeb.private.example.com
192.168.0.12 nodec.private.example.com
192.168.0.13 noded.private.example.com
# Internal network, cannot be pinged from outside
192.168.1.10 nodea.san01.example.com
192.168.1.11 nodeb.san01.example.com
192.168.1.12 nodec.san01.example.com
192.168.1.13 noded.san01.example.com
192.168.1.15 storage.san01.example.com
# Internal network, cannot be pinged from outside
192.168.2.10 nodea.san02.example.com
192.168.2.11 nodeb.san02.example.com
192.168.2.12 nodec.san02.example.com
192.168.2.13 noded.san02.example.com
192.168.2.15 storage.san02.example.com
# Service/VIP addresses
172.25.250.80 www.lab.example.com
172.25.250.81 db.lab.example.com
172.25.250.82 nfs.lab.example.com
# BMC IP addresses per node (for fencing)
192.168.0.101 bmc-nodea.private.example.com bmc-nodea
192.168.0.102 bmc-nodeb.private.example.com bmc-nodeb
192.168.0.103 bmc-nodec.private.example.com bmc-nodec
192.168.0.104 bmc-noded.private.example.com bmc-noded
192.168.0.100 bmc-chassis.private.example.com bmc-chassis
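Because the cluster relies on /etc/hosts instead of DNS, it can be worth checking on every node that each cluster name really has an entry there. The following is only a minimal sketch: the function name missing_hosts is made up for illustration, and it only looks at the file it is given, not at what the resolver actually returns.

```shell
# Report names that have no entry in a hosts-format file.
# Usage: missing_hosts FILE NAME...
# Prints each NAME that is missing; exit status is 1 if any is missing.
# (missing_hosts is a hypothetical helper, not part of pcs or the OS.)
missing_hosts() (
    file=$1
    shift
    rc=0
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the address.
        if ! awk -v n="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$file"; then
            echo "$name"
            rc=1
        fi
    done
    exit "$rc"
)
```

On a node this could be called as: missing_hosts /etc/hosts nodea.private.example.com nodeb.private.example.com nodec.private.example.com. No output means every name was found.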
2. yum repository configuration
[root@nodea yum.repos.d]# ls
ha-sap.repo redhat.repo rhel-dvd.repo rhel-updates.repo
[root@nodea yum.repos.d]# cat ha-sap.repo # cluster software
rhel-dvd.repo # built-in yum repository
rhel-updates.repo # packages pending update
ha-sap.repo # cluster software
# Cluster software: http://172.25.254.250/rhel8.3/x86_64/rhel8-additional/rhel-8-for-x86_64-highavailability-rpms/
# Storage packages: http://content.example.com/rhel8.3/x86_64/rhel8-additional/rhel-8-for-x86_64-sap-netweaver-rpms
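The contents of ha-sap.repo are not shown in these notes. As a rough illustration of what a repository definition for the high-availability channel might look like, the sketch below writes a separate, hypothetical file. The file name ha-example.repo, the repository id, the name string, and gpgcheck=0 are all assumptions; only the baseurl comes from the notes above.

```shell
# Write a hypothetical repo definition for the high-availability channel.
# Usage: write_ha_repo DIR   (DIR would be /etc/yum.repos.d on a real node)
# The repo id, name, and gpgcheck=0 are illustrative assumptions.
write_ha_repo() {
    cat > "$1/ha-example.repo" <<'EOF'
[rhel-8-ha-example]
name=RHEL 8 High Availability (example)
baseurl=http://172.25.254.250/rhel8.3/x86_64/rhel8-additional/rhel-8-for-x86_64-highavailability-rpms/
enabled=1
gpgcheck=0
EOF
}
```

After writing such a file into /etc/yum.repos.d, yum repolist should list the new repository if the URL is reachable.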
3. Create the cluster
- Install the packages on all cluster nodes. Two or three machines are enough for the lab; this lab uses nodea, nodeb, and nodec.
[root@nodea ~]# yum -y install pcs fence-agents-all.x86_64
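- Open the firewall; run on all nodes
Heartbeats and data transfer between cluster nodes both require the firewall to allow this traffic.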
[root@nodeb ~]# firewall-cmd --permanent --add-service=high-availability
success
[root@nodeb ~]# firewall-cmd --reload
success
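To confirm that the rule took effect after the reload, the list of enabled services can be checked for high-availability. A minimal sketch follows; ha_service_listed is a hypothetical helper name, and it only inspects the text it receives on standard input.

```shell
# Succeeds if "high-availability" appears in a whitespace-separated
# service list read from standard input.
# (ha_service_listed is a hypothetical helper name.)
ha_service_listed() {
    tr -s ' \t' '\n\n' | grep -qx 'high-availability'
}
```

On a node it could be fed from firewall-cmd, for example: firewall-cmd --list-services | ha_service_listed && echo "high-availability is allowed".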
- Start the pcsd service; run on all nodes
[root@nodea ~]# systemctl enable --now pcsd
- Set a password for the hacluster user; run on all nodes
[root@nodea ~]# echo redhat | passwd --stdin hacluster
- Cluster nodes must authenticate to each other before they can join the cluster
[root@nodea ~]# pcs host auth nodea.private.example.com nodeb.private.example.com
Username: hacluster
Password:
nodeb.private.example.com: Authorized
nodea.private.example.com: Authorized
# The username and password can also be given with -u username -p password
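The non-interactive form with -u and -p is handy in scripts. The sketch below wraps it in a small helper with a dry-run switch so the command line can be reviewed before it is executed. The names run, auth_hosts, DRY_RUN, HA_USER, and HA_PASS are hypothetical; the defaults simply mirror the user and password used in these notes. Note that a password passed on the command line is visible in the process list, so this suits a lab more than production.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Authenticate the listed hosts to each other without prompting.
# HA_USER and HA_PASS are hypothetical variable names; the defaults
# are the lab values used in these notes.
auth_hosts() {
    run pcs host auth "$@" -u "${HA_USER:-hacluster}" -p "${HA_PASS:-redhat}"
}
```

For example, DRY_RUN=1 auth_hosts nodea.private.example.com nodeb.private.example.com only prints the pcs command; without DRY_RUN it runs it.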
- Create and start the cluster; run on any one node
[root@nodea ~]# pcs cluster setup mucluster --start nodea.private.example.com nodeb.private.example.com
No addresses specified for host 'nodea.private.example.com', using 'nodea.private.example.com'
No addresses specified for host 'nodeb.private.example.com', using 'nodeb.private.example.com'
Destroying cluster on hosts: 'nodea.private.example.com', 'nodeb.private.example.com'...
nodea.private.example.com: Successfully destroyed cluster
nodeb.private.example.com: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'nodea.private.example.com', 'nodeb.private.example.com'
nodea.private.example.com: successful removal of the file 'pcsd settings'
nodeb.private.example.com: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'nodea.private.example.com', 'nodeb.private.example.com'
nodea.private.example.com: successful distribution of the file 'corosync authkey'
nodea.private.example.com: successful distribution of the file 'pacemaker authkey'
nodeb.private.example.com: successful distribution of the file 'corosync authkey'
nodeb.private.example.com: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'nodea.private.example.com', 'nodeb.private.example.com'
nodea.private.example.com: successful distribution of the file 'corosync.conf'
nodeb.private.example.com: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
Starting cluster on hosts: 'nodea.private.example.com', 'nodeb.private.example.com'...
- Make the nodes join the cluster automatically at boot
[root@nodea ~]# pcs cluster enable --all
nodea.private.example.com: Cluster Enabled
nodeb.private.example.com: Cluster Enabled
- Check the cluster status
[root@nodea ~]# pcs cluster status
Cluster Status:
Cluster Summary:
* Stack: corosync
* Current DC: nodeb.private.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
* Last updated: Wed Feb 9 12:29:19 2022
* Last change: Wed Feb 9 12:26:27 2022 by hacluster via crmd on nodeb.private.example.com
* 2 nodes configured
* 0 resource instances configured
Node List:
* Online: [ nodea.private.example.com nodeb.private.example.com ]
PCSD Status:
nodea.private.example.com: Online
nodeb.private.example.com: Online
Note: if different nodes report different states, for example nodea shows nodea online and nodeb offline while nodeb shows nodeb online and nodea offline, the cluster is in a split-brain state.
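One way to compare the views of two nodes is to save the status output from each and compare the "Online" lists. This is only a sketch: online_nodes and same_view are hypothetical helper names, the parsing assumes the "* Online: [ ... ]" line format shown above, and a mismatch is a hint to investigate rather than proof of split brain (a node may simply still be starting).

```shell
# Extract the node names from the "* Online: [ ... ]" line of
# `pcs cluster status` or `pcs status` output read from standard input,
# one per line, sorted.
online_nodes() {
    sed -n 's/^[[:space:]]*\* Online: \[ *\(.*[^ ]\) *\]$/\1/p' |
        tr -s ' ' '\n' | sort
}

# Succeeds when two saved status outputs report the same online set.
# Usage: same_view FILE_FROM_NODE_A FILE_FROM_NODE_B
same_view() {
    [ "$(online_nodes < "$1")" = "$(online_nodes < "$2")" ]
}
```

For instance, run pcs cluster status > /tmp/nodea.txt on nodea and the same on nodeb into /tmp/nodeb.txt, copy both files to one machine, and call same_view /tmp/nodea.txt /tmp/nodeb.txt.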
[root@nodea ~]# pcs status
Cluster name: mucluster
WARNINGS:
No stonith devices and stonith-enabled is not false
Cluster Summary:
* Stack: corosync
* Current DC: nodeb.private.example.com (version 2.0.4-6.el8-2deceaa3ae) - partition with quorum
* Last updated: Wed Feb 9 12:29:56 2022
* Last change: Wed Feb 9 12:26:27 2022 by hacluster via crmd on nodeb.private.example.com
* 2 nodes configured
* 0 resource instances configured
Node List:
* Online: [ nodea.private.example.com nodeb.private.example.com ]
Full List of Resources:
* No resources
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
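The "Daemon Status" section at the end of this output is a quick health indicator: each of the three daemons should be active/enabled once the cluster is started and enabled at boot. A minimal sketch for picking out any daemon that is not in that state follows; unhealthy_daemons is a hypothetical helper name and it assumes the "name: state" layout shown above.

```shell
# Print the name of every daemon in the "Daemon Status:" section of
# `pcs status` output (standard input) whose state is not "active/enabled".
# (unhealthy_daemons is a hypothetical helper name.)
unhealthy_daemons() {
    awk '
        /^Daemon Status:/ { in_section = 1; next }
        in_section && NF == 2 && $2 != "active/enabled" {
            sub(/:$/, "", $1)
            print $1
        }'
}
```

Typical use would be pcs status | unhealthy_daemons; empty output means corosync, pacemaker, and pcsd are all active and enabled.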
- Start the cluster
[root@nodea ~]# pcs cluster start --all
nodea.private.example.com: Starting Cluster...
nodec.private.example.com: Starting Cluster...
nodeb.private.example.com: Starting Cluster...
- Stop the cluster
[root@nodea ~]# pcs cluster stop --all
nodea.private.example.com: Stopping Cluster (pacemaker)...
nodec.private.example.com: Stopping Cluster (pacemaker)...
nodeb.private.example.com: Stopping Cluster (pacemaker)...
nodea.private.example.com: Stopping Cluster (corosync)...
nodeb.private.example.com: Stopping Cluster (corosync)...
nodec.private.example.com: Stopping Cluster (corosync)...