Tuesday, 22 March 2016

Active/Passive Cluster with Pacemaker and GFS2 on CentOS 7

This time I would like to share the procedure that I followed to set up a Tomcat cluster on CentOS 7. It is a bit tricky, but it is about an hour's job if you know the right procedure.

The environment I used was two CentOS 7 VMs running on VMware vSphere.
Node1: 192.168.1.10
Node2: 192.168.1.11
VMware vSphere Server: 192.168.1.15

Here I have used CLVM with GFS2 to store application data that needs to be accessed from both the nodes for successful load balancing or fail-over. For this to function, you will need shared raw storage such as a SAN. However, I don't have a SAN in my test lab, hence I used DRBD. You may skip sections 2 and 3 if you have shared storage.

Section 1: DNS
Set the host name of each server as per the cluster configuration. Here we use the names node1 and node2. Put the node name in /etc/hostname on the respective server and reboot the server after the change.
Before you begin with the cluster setup, make sure the /etc/hosts file has the right entries. Pacemaker is highly dependent on name resolution, so correct entries in /etc/hosts are key to a successful configuration. Here is what my /etc/hosts looks like.

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.10 node1
192.168.1.11 node2
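Since a wrong entry here is easy to miss, a quick check can help. The snippet below is only a sketch: check_host is a helper name I made up for illustration, and all it does is look up which address a hosts file assigns to a given name.

```shell
#!/bin/sh
# check_host FILE NAME IP: succeed only if FILE maps NAME to exactly IP.
# (check_host is a hypothetical helper, not a standard tool.)
check_host() {
    file=$1 name=$2 want=$3
    # Collect every address that lists NAME as a hostname or alias,
    # skipping comment lines.
    got=$(awk -v n="$name" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) print $1 }
    ' "$file" | sort -u)
    if [ "$got" = "$want" ]; then
        echo "OK: $name -> $want"
    else
        echo "MISMATCH: $name -> ${got:-<none>} (expected $want)"
        return 1
    fi
}

# Usage on each node, with the addresses used in this post:
#   check_host /etc/hosts node1 192.168.1.10
#   check_host /etc/hosts node2 192.168.1.11
```

A name that is missing, or that is mapped to more than one address, is reported as a mismatch.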


Section 2: Add an additional hard disk
Add an additional hard disk to both the nodes. These hard disks will be used by DRBD.

For the new hard disk I am using 16GB in each node.


Section 3: Setup DRBD
Run these commands on both nodes.

$ rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org 
$ rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm 
$ yum install -y kmod-drbd84 drbd84-utils bash-completion
$ fdisk -l
In my case the newly added Hard Disk is detected as /dev/sdb

$ vi /etc/drbd.d/clusterdisk.res
resource clusterdisk {
        protocol C;
        startup {
                become-primary-on both;
        }
        disk {
                fencing resource-and-stonith;
                resync-rate 500M;
        }
        handlers {
                fence-peer              "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target     "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "233hgfghfGHFHGF5665465465465";
                timeout 180;
                ping-int 3;
                ping-timeout 9;
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        on node1 {
                device /dev/drbd1;
                disk /dev/sdb;
                address 192.168.1.10:7788;
                meta-disk internal;
        }
        on node2 {
                device /dev/drbd1;
                disk /dev/sdb;
                address 192.168.1.11:7788;
                meta-disk internal;
        }
}

$ drbdadm create-md clusterdisk
$ drbdadm up clusterdisk
$ service drbd restart 
Run this command on node1 only.
$ drbdadm primary --force clusterdisk
$ chkconfig drbd on
$ service drbd status
Wait until the status shows UpToDate on both nodes. Note that one node shows "Secondary". This means that you cannot perform any file operations on this disk on Node2 until it becomes Primary.
[root@node1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
m:res       cs         ro               ds                 p  mounted  fstype
1:clusterdisk  Connected  Primary/Secondary  UpToDate/UpToDate  C
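Instead of re-running the status command by hand, the wait can be scripted. This is a minimal sketch that assumes the /proc/drbd layout of DRBD 8.4, where each device has a line such as " 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----". drbd_in_sync is a helper name of my own, not part of DRBD.

```shell
#!/bin/sh
# drbd_in_sync MINOR [FILE]: succeed when device MINOR reports
# ds:UpToDate/UpToDate in FILE (default: /proc/drbd).
# Assumes the DRBD 8.4 /proc/drbd format; hypothetical helper.
drbd_in_sync() {
    minor=$1 file=${2:-/proc/drbd}
    awk -v m="$minor:" '
        # The device line starts with "<minor>:" as its first field.
        $1 == m { found = 1; ok = ($0 ~ /ds:UpToDate\/UpToDate/) }
        END { exit !(found && ok) }
    ' "$file"
}

# Usage (device /dev/drbd1 is minor 1 in this post):
#   until drbd_in_sync 1; do sleep 10; done
```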
Now go to Node2 and run this command so that both nodes become Primary:
$ drbdadm primary --force clusterdisk
[root@node1 ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
m:res       cs         ro               ds                 p  mounted  fstype
1:clusterdisk  Connected  Primary/Primary  UpToDate/UpToDate  C

Section 4: Configure Pacemaker and CLVM
Run these commands on both the nodes.
$ yum install -y pacemaker pcs psmisc policycoreutils-python lvm2-cluster gfs2-utils fence-agents-all
It is important that we disable SELinux and IPTables during the setup. Any network obstruction will create problems in the cluster setup. Note that the commands below only switch SELinux to permissive mode and flush the IPTables rules until the next reboot. You will need to create exceptions or disable them completely afterwards. Help is available on Google.
$ setenforce 0
$ iptables --flush
$ systemctl start pcsd.service
$ systemctl enable pcsd.service
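If you would rather create exceptions than flush the firewall, something like the following may work. It is only a sketch: it assumes firewalld is in use (the CentOS 7 default) and relies on its predefined high-availability service plus the DRBD port 7788 from the resource file above. run and open_cluster_ports are helper names of my own, and by default the commands are only printed.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to really execute them as root.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Assumption: firewalld is the active firewall on both nodes.
open_cluster_ports() {
    # Predefined firewalld service for pcsd, corosync, pacemaker and dlm.
    run firewall-cmd --permanent --add-service=high-availability
    # DRBD replication port used in clusterdisk.res.
    run firewall-cmd --permanent --add-port=7788/tcp
    # Load the permanent rules into the running firewall.
    run firewall-cmd --reload
}

open_cluster_ports
```

Review the printed commands first, then run the script again with DRY_RUN=0 on both nodes.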
Set the password for the hacluster account on both the nodes. Use the same password on both the nodes.
$ passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
$ pcs cluster auth node1 node2
Username: hacluster
Password:
node1: Authorized
node2: Authorized
$ pcs cluster setup --name mycluster node1 node2
[root@node1 ~]# pcs cluster start --all
node1: Starting Cluster...
node2: Starting Cluster...

Now you may start creating the resources. First we will create a fencing device for STONITH. You may create any fencing device depending on the resources available. Here I have created a VMware SOAP fencing device. Run this command on node1 only:

$ pcs stonith create vmware_soap fence_vmware_soap ipaddr=192.168.1.15 ipport=443 ssl_insecure=1 inet4_only=1 login="root" passwd="vmwareorootpass" action=reboot pcmk_host_list="VM1_node1,VM2_node2" power_wait=3 op monitor interval=60s
In the above command, ipaddr is the IP of the vSphere Server. For login I have used the root login of the vSphere here; it is recommended that you create a separate user account with the minimum permissions possible. pcmk_host_list is the names of the VMs in vSphere.

[root@node1 ~]# pcs status
Cluster name: mycluster
Last updated: Tue Mar 22 16:08:14 2016          Last change: Fri Mar 18 11:33:07 2016 by root via cibadmin on gfs2
Stack: corosync
Current DC: gfs2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 1 resource configured

Online: [ node1 node2 ]

Full list of resources:

 vmware_soap    (stonith:fence_vmware_soap):    Started node1

PCSD Status:
  node1: Online
  node2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
Once the vmware_soap resource is started, you may proceed with creating the rest of the resources. Run these commands on node1 only.

$ pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence clone interleave=true ordered=true
$ pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence clone interleave=true ordered=true
Run the 'pcs status' command on any node and wait until the resources are started. Once the dlm and clvmd resources are started on both the nodes, you may create the clustered volume.
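The waiting can also be scripted. The sketch below assumes the 'pcs status' layout shown in this post (a "Clone Set:" line followed by a "Started: [ ... ]" line); other pcs versions may print it differently. clone_started is a helper name I made up.

```shell
#!/bin/sh
# clone_started CLONE NODE...: read `pcs status` text on stdin and succeed
# only if every NODE appears in the "Started: [ ... ]" line of CLONE.
# (Hypothetical helper; depends on the output layout shown in this post.)
clone_started() {
    clone=$1; shift
    started=$(awk -v c="$clone" '
        $1 == "Clone" && $2 == "Set:" { in_set = ($3 == c); next }
        in_set && $1 == "Started:"    { print; exit }
    ')
    for node in "$@"; do
        case " $started " in
            *" $node "*) ;;          # node is listed as started
            *) return 1 ;;           # node missing from the Started line
        esac
    done
}

# Usage:
#   until pcs status | clone_started dlm-clone node1 node2 &&
#         pcs status | clone_started clvmd-clone node1 node2
#   do sleep 5; done
```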
$ pvcreate /dev/drbd1
$ vgcreate -Ay -cy cluster_vg /dev/drbd1
$ lvcreate -L5G -n cluster_lv cluster_vg
$ mkfs.gfs2 -j2 -p lock_dlm -t mycluster:fs-data /dev/cluster_vg/cluster_lv
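Two options in the mkfs.gfs2 command deserve a note. -j2 creates two journals, one for each node that mounts the filesystem, and the -t value has the form clustername:fsname, where the cluster name must match the name given to 'pcs cluster setup' (mycluster here). As a sketch, the lock table name can be built from the corosync configuration instead of being typed by hand. lock_table is a hypothetical helper, and the path /etc/corosync/corosync.conf is the location I assume pcs wrote the configuration to.

```shell
#!/bin/sh
# lock_table FSNAME [CONF]: print "<cluster_name>:<FSNAME>" using the
# cluster_name entry found in the corosync configuration file.
# (Hypothetical helper; CONF defaults to /etc/corosync/corosync.conf.)
lock_table() {
    fsname=$1 conf=${2:-/etc/corosync/corosync.conf}
    cluster=$(awk '$1 == "cluster_name:" { print $2; exit }' "$conf")
    if [ -z "$cluster" ]; then
        echo "cluster_name not found in $conf" >&2
        return 1
    fi
    echo "${cluster}:${fsname}"
}

# Usage:
#   mkfs.gfs2 -j2 -p lock_dlm -t "$(lock_table fs-data)" /dev/cluster_vg/cluster_lv
```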
The next steps depend on the purpose of the cluster or the application that you want to configure. A huge number of applications are supported on Pacemaker. I am using Tomcat in my test lab.
For the resource below, make sure you use a free IP from the same subnet as your server LAN.

$ pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=32 op monitor interval=30s
Here I have created a mount point for storing the variable files of my tomcat application. You can use this mount point to store website home folder in case you are planning to use Apache.
$ pcs resource create fs-data Filesystem device="/dev/cluster_vg/cluster_lv" directory="/var/apphome/application-data/varfiles/" fstype="gfs2" "options=noatime" op monitor interval=10s on-fail=fence clone interleave=true
$ pcs resource create tomcat ocf:heartbeat:tomcat java_home="/opt/java/jre/" catalina_home="/opt/tomcat/" tomcat_user="tomcat" catalina_pid="/opt/tomcat/work/catalina.pid" op monitor interval="30s" on-fail=fence
In the above command, take note of java_home. You will need to change the paths to match your environment. In case you need Tomcat to run on both the nodes, just add 'clone interleave=true ordered=true' at the end of the above command. Ensure that you have installed Tomcat and a JRE prior to creating this resource. Help is available on Google.

Now let's create ordering constraints so that the resources are started in the right order.

$ pcs constraint order start dlm-clone then clvmd-clone
$ pcs constraint order start clvmd-clone then fs-data-clone
$ pcs constraint order start fs-data-clone then tomcat
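Ordering constraints only control the start sequence; they do not decide where a resource runs. If you also want each resource placed on a node where the one it depends on is running, colocation constraints are one way to do it. The pairs below are my assumption of a sensible layout for the resource names used in this post, not something the setup strictly requires, and colocation_cmds is a made-up helper that only prints the commands.

```shell
#!/bin/sh
# Print one `pcs constraint colocation add` command per dependency.
# Each pair is "dependent:target": the dependent resource is kept on
# the node(s) where the target is running. (Illustrative pairs only.)
colocation_cmds() {
    for pair in clvmd-clone:dlm-clone \
                fs-data-clone:clvmd-clone \
                tomcat:fs-data-clone \
                ClusterIP:tomcat; do
        echo "pcs constraint colocation add ${pair%%:*} with ${pair#*:}"
    done
}

colocation_cmds
```

After reviewing the output, the commands can be applied on node1 with 'colocation_cmds | sh'.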

If you need Apache, just run the command below. In case you need Apache to run on both the nodes, add 'clone interleave=true ordered=true' at the end of the command.

$ pcs resource create Apachehttpd ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf op monitor timeout="1m" interval="10"
Now we need to tell Pacemaker to start the Apache service on the same node as Tomcat. (This doesn't apply if you are using an Active/Active cluster.)

$ pcs constraint colocation add Apachehttpd with tomcat
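If everything goes well, you should be able to see something like this in 'pcs status'. Note that in this sample output the filesystem and Tomcat resources are listed as JiraFS and JiraService; with the commands above they will appear as fs-data and tomcat.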
[root@node1 ~]# pcs status
Cluster name: mycluster
Last updated: Tue Mar 22 16:36:08 2016          Last change: Fri Mar 18 11:33:07 2016 by root via cibadmin on gfs2
Stack: corosync
Current DC: gfs2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 9 resources configured

Online: [ node1 node2 ]

Full list of resources:
 
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started node1
 vmware_soap    (stonith:fence_vmware_soap):    Started node1
 Clone Set: dlm-clone [dlm]
     Started: [ node1 node2 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ node1 node2 ]
 Clone Set: JiraFS-clone [JiraFS]
     Started: [ node1 node2 ]
 JiraService    (ocf::heartbeat:tomcat):        Started node1
 Apachehttpd    (ocf::heartbeat:apache):        Started node1
PCSD Status:
  node1: Online
  node2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
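Once everything is started, it is worth checking that fail-over really works. The sketch below parses the 'pcs status' layout shown above to report where a resource is running; resource_node is a helper name of my own. The standby commands in the usage notes are the pcs syntax I assume for the version shipped with CentOS 7 (newer pcs releases use 'pcs node standby' instead).

```shell
#!/bin/sh
# resource_node RESOURCE: read `pcs status` text on stdin and print the
# node on which RESOURCE is reported as Started (nothing if not running).
# (Hypothetical helper; depends on the output layout shown in this post.)
resource_node() {
    awk -v r="$1" 'NF >= 3 && $1 == r && $(NF - 1) == "Started" { print $NF }'
}

# Possible fail-over test, run on a cluster node:
#   pcs status | resource_node tomcat    # note the current node
#   pcs cluster standby node1            # move resources off node1
#   pcs status | resource_node tomcat    # should now report node2
#   pcs cluster unstandby node1          # bring node1 back
```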
The above setup with DRBD is for testing and not for production. The design is not very stable with DRBD, so if you are planning a production setup, make sure you have decent shared storage.
To use Apache in front of Tomcat you have two options: mod_proxy and mod_jk. Pick the one that fits your requirement and configure it accordingly. Help is available on Google.