Linux Pacemaker – the new HA (High Availability)


When running critical applications that cannot afford downtime caused by hardware failures, it becomes necessary to run a cluster of hardware nodes, so that the application can be migrated automatically to another running node if the original node fails for any reason.

Pacemaker is a high-availability cluster resource manager that enables the following:

  • Constantly check the status of all the hardware systems running in the cluster.
  • Start a configured application (e.g. httpd) on one of the running nodes, based on the preference set in the cluster configuration file.
  • Move the application to another node when the present node becomes unavailable, for reasons such as hardware failure, network failure, or resource (memory, CPU, etc.) over-consumption.
  • Enable the administrator to move the application to a different node in a seamless manner.

Pacemaker can be considered the newer replacement for the HA application that was available in older Linux versions.

Pacemaker – Installation procedure to create a TWO node cluster

Install the packages on both nodes
[root@star ~]# yum install pacemaker pcs fence-agents -y


Verify that the hacluster user has been created on both nodes

[root@star ~]# cat /etc/passwd | grep  hacluster


Set a password for the hacluster user (e.g. mypass) on both nodes

[root@star ~]# passwd hacluster


Start and enable the pcsd service on both nodes
[root@star ~]#  systemctl start pcsd
[root@star ~]#  systemctl enable pcsd


This example uses two nodes, addressed as ha_node_1 and ha_node_2. Add an entry for each node to the local hosts file, /etc/hosts, on both nodes, with each line pairing the node's IP address with its name.

[root@star ~]#  vi /etc/hosts
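
The hosts entries can also be added without an editor. The following is only a sketch: the helper name add_host_entry, the HOSTS_FILE variable and the 192.0.2.x addresses are placeholders made up for this illustration, not values from this setup. By default it writes to a scratch file so it can be tried safely.

```shell
# Append "IP NAME" to a hosts file only if NAME is not already listed.
# Usage: add_host_entry FILE IP NAME   (hypothetical helper)
add_host_entry() {
    file=$1 ip=$2 name=$3
    # -w matches NAME as a whole word, -q suppresses output
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Placeholder addresses (documentation range); replace with the real node IPs.
# HOSTS_FILE defaults to a scratch file; set HOSTS_FILE=/etc/hosts and run as
# root on both nodes once the values are correct.
HOSTS_FILE=${HOSTS_FILE:-/tmp/hosts.sample}
add_host_entry "$HOSTS_FILE" 192.0.2.11 ha_node_1
add_host_entry "$HOSTS_FILE" 192.0.2.12 ha_node_2
```

Running it a second time leaves the file unchanged, because names that are already present are skipped.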


Disable NetworkManager during startup in both the nodes

[root@star ~]#  systemctl disable NetworkManager


Run the following commands in ha_node_1

[root@star ~]# pcs cluster auth ha_node_1 ha_node_2 -u hacluster

[root@star ~]# pcs cluster setup --name Cluster ha_node_1 ha_node_2

Destroying cluster on nodes: ha_node_1, ha_node_2...
ha_node_1: Stopping Cluster (pacemaker)...
ha_node_2: Stopping Cluster (pacemaker)...
ha_node_2: Successfully destroyed cluster
ha_node_1: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'ha_node_1', 'ha_node_2'
ha_node_1: successful distribution of the file 'pacemaker_remote authkey'
ha_node_2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
ha_node_1: Succeeded
ha_node_2: Succeeded

Synchronizing pcsd certificates on nodes ha_node_1, ha_node_2...
ha_node_1: Success
ha_node_2: Success
Restarting pcsd on the nodes in order to reload the certificates...
ha_node_1: Success
ha_node_2: Success
[root@localhost ~]#


Verify in both nodes:
[root@star ~]# cat /etc/corosync/corosync.conf
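
Rather than reading the whole file, the node names can be pulled out of it. This is a rough sketch that assumes the generated corosync.conf lists each node with a ring0_addr line; the function name list_corosync_nodes is made up for this example.

```shell
# Print every ring0_addr value found in a corosync.conf-style file.
# (hypothetical helper; assumes lines such as "ring0_addr: ha_node_1")
list_corosync_nodes() {
    sed -n 's/^[[:space:]]*ring0_addr:[[:space:]]*//p' "$1"
}

CONF=/etc/corosync/corosync.conf
if [ -r "$CONF" ]; then
    list_corosync_nodes "$CONF"
else
    echo "$CONF not readable; run this on a cluster node" >&2
fi
```

On a correctly configured cluster, both ha_node_1 and ha_node_2 should be printed on each node.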


Run the command in ha_node_1

[root@star ~]# pcs cluster start --all
ha_node_1: Starting Cluster (corosync)...
ha_node_2: Starting Cluster (corosync)...
ha_node_1: Starting Cluster (pacemaker)...
ha_node_2: Starting Cluster (pacemaker)...


Check the status in both nodes
[root@star ~]# pcs status

====== =======
Cluster name: Cluster

No stonith devices and stonith-enabled is not false

Stack: corosync
Current DC: ha_node_2 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Mon Aug 24 13:37:59 2020
Last change: Mon Aug 24 13:37:48 2020 by hacluster via crmd on ha_node_2

2 nodes configured
0 resources configured

Online: [ ha_node_1 ha_node_2 ]

No resources
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
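[root@localhost ~]#
====== =======

The daemon status shows corosync and pacemaker as active/disabled. Enable them so that the cluster starts at boot on all nodes:
[root@star ~]# pcs cluster enable --all
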
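
The pcs status output warns that no stonith (fencing) devices are configured while stonith-enabled is not false. In that state Pacemaker will generally refuse to start resources. On a throwaway test cluster only, fencing can be switched off; a production cluster should configure a real fence device instead. The sketch below only previews the command. The helper names run and disable_fencing_for_lab and the DRY_RUN variable are made up for this illustration and are not part of pcs.

```shell
# Echo the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Test clusters only: tell Pacemaker not to require a fence device.
disable_fencing_for_lab() {
    run pcs property set stonith-enabled=false
}

# Preview the command; unset DRY_RUN on a cluster node to apply it.
DRY_RUN=1
disable_fencing_for_lab
```

After applying the change on a real node, the stonith warning should no longer appear in pcs status.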
Now configure the cluster with Apache and a virtual IP. Run the commands on the first node.

[root@star ~]# pcs resource create VirtIP IPAddr ip= cidr_netmask=24 op monitor interval=30
Assumed agent name 'ocf:heartbeat:IPaddr' (deduced from 'IPAddr')
[root@localhost ~]#
[root@star ~]# pcs resource create Httpd apache configfile="/etc/httpd/conf/httpd.conf" op monitor interval=30

If the above does not work, try the same command with --force:
[root@star ~]#  pcs resource create Httpd apache configfile="/etc/httpd/conf/httpd.conf" op monitor interval=30 --force

If the Httpd resource has already been created, you can update it instead:

[root@star ~]#  pcs resource update Httpd configfile="/etc/httpd/conf/httpd.conf" op monitor interval=30 --force


Verify the status of the Virtual IP and httpd

[root@star ~]#  pcs status resources

Now try accessing the web server through the virtual IP. You may want to switch off each node in turn while the other node is running. The website should remain accessible regardless of which node is down.
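
To watch the failover while a node is switched off, the site can be polled from a third machine. The following is a minimal sketch that assumes curl is installed; the function names probe and watch_url are made up for this example, and the URL must be built from whatever address was given to the VirtIP resource.

```shell
# Return success if URL answers within 2 seconds (requires curl).
probe() {
    curl --silent --fail --max-time 2 --output /dev/null "$1"
}

# Probe URL COUNT times, DELAY seconds apart (default 1),
# printing one "attempt url state" line per probe.
watch_url() {
    url=$1 count=$2 delay=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        if probe "$url"; then state=up; else state=down; fi
        printf '%s %s %s\n' "$i" "$url" "$state"
        i=$((i + 1))
        if [ "$i" -le "$count" ]; then
            sleep "$delay"
        fi
    done
    return 0
}

# Example (VIP is a placeholder for the virtual IP of the VirtIP resource):
#   watch_url "http://$VIP/" 60 1
```

A short run of "down" lines while the resources move to the other node is to be expected; the state should return to "up" once the surviving node has taken over.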