Installing Distributed Replicated Block Device (DRBD) on CentOS 7


Distributed Replicated Block Device, commonly called DRBD, is a distributed replicated storage system for the Linux platform. It provides block-level replication (similar to RAID 1) over the network.

Listed below are the steps involved in creating a setup in which a partition on two hosts is replicated.

Lab setup:

  • Two CentOS Linux hosts (hostnames DRBD1 and DRBD2, for example). The output of the hostname command on each host must match the name used in the DRBD configuration.
  • Both systems have a partition /dev/sda6 that will hold the data to be replicated.
  • Each host has two NICs: one connected to the local network, and the other to a separate network (the DRBD VLAN).
  • Both hosts have the hostnames of both nodes in the /etc/hosts file so they can connect to each other by name.
  • The IPs of the NICs connected to the local LAN are 20.10.0.29 and 20.10.0.30.
  • The IPs of the NICs connected to the DRBD VLAN are 192.168.0.1 and 192.168.0.2.
  • Ensure SELinux is disabled.
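For reference, the /etc/hosts entries described above might look like the sketch below on both nodes. Mapping the node names to the replication-VLAN addresses (rather than the LAN ones) is an assumption made here so that name-based traffic stays on the dedicated link; the DRBD resource file itself uses raw IPs either way.

```
# /etc/hosts additions on both nodes (sketch; adjust to your own naming policy)
192.168.0.1   DRBD1
192.168.0.2   DRBD2
```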

Steps:

Install the ELRepo repository on both nodes, as DRBD packages are not available in the default CentOS repositories:
]# rpm -ivh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

Ensure the kernel is up to date on both hosts:
]# yum update kernel -y

 

Install the following packages on both nodes:
]# yum -y install drbd84-utils kmod-drbd84
Note: Some versions of DRBD do not work well with certain newer kernels, so you may need to find the matching combination and install the correct version of kmod-drbd84, for example “kmod-drbd84-8.4.7” instead of just “kmod-drbd84”.

Load the module into the kernel on both nodes:
]# /sbin/modprobe drbd

Verify if the module is loaded in both the hosts:
]# lsmod | grep drbd
drbd                  405309  0
libcrc32c              12644  2 xfs,drbd
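The modprobe above does not persist across reboots by itself. The drbd service normally loads the module when it starts, but as an optional extra a systemd modules-load.d drop-in can be used to load it at boot:

```
# /etc/modules-load.d/drbd.conf
drbd
```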

Create a resource file for the clustered device on the first host:
DRBD1]# vi /etc/drbd.d/data1.res

resource data1 {
        protocol C;
        on DRBD1 {
                device /dev/drbd0;
                disk /dev/sda6;
                address 192.168.0.1:7788;
                meta-disk internal;
        }
        on DRBD2 {
                device /dev/drbd0;
                disk /dev/sda6;
                address 192.168.0.2:7788;
                meta-disk internal;
        }
}

Copy the same file to the second host too:
DRBD1]# scp /etc/drbd.d/data1.res root@DRBD2:/etc/drbd.d/data1.res

Initialize the DRBD metadata on both nodes, one by one:
]# drbdadm create-md data1

You should see output similar to the following:

--==  Thank you for participating in the global usage survey  ==--
The server’s response is:
you are the 10680th user to install this version
initializing activity log
NOT initializing bitmap
Writing meta data…
New drbd meta data block successfully created.
success

Start DRBD in both the hosts:
]# systemctl start drbd
]# systemctl enable drbd

Both nodes will be configured as Secondary by default. Check this on both hosts:
]# cat /proc/drbd
]# drbd-overview

Note: Do not panic if you see the message “cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----”

Now you should make one node the primary. Let us do this on node 1:
DRBD1]# drbdadm primary data1
If the above command returns an error, you may want to try:
DRBD1]# drbdadm primary data1 --force

Now check the status on both hosts:
]# cat /proc/drbd
Identify which node is Primary and which is Secondary.
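As a small sketch of how the /proc/drbd status line can be read from a script, the snippet below extracts the local role. The sample line is hard-coded here (a typical format after promotion) so the snippet is self-contained; on a real node you would read /proc/drbd instead.

```shell
# Hard-coded sample of a /proc/drbd status line (as it might appear after promotion)
line='0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----'
# Extract the local role: the token between "ro:" and the "/"
role=$(printf '%s\n' "$line" | sed -n 's/.*ro:\([A-Za-z]*\)\/.*/\1/p')
echo "local role: $role"
```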

Format the drbd partition in the PRIMARY node.
DRBD1]# mkfs.xfs /dev/drbd0

Create a directory on DRBD1 and mount the partition on it:
DRBD1]# mkdir /data
DRBD1]# mount /dev/drbd0 /data
DRBD1]# touch /data/test1.txt

Now let us check whether the file was replicated to the DRBD2 node. First, unmount the filesystem and demote DRBD1:
DRBD1]# umount /data
DRBD1]# drbdadm secondary data1

Perform the following on the 2nd node:
DRBD2]# drbdadm primary data1
DRBD2]# mkdir /data
DRBD2]# mount /dev/drbd0 /data
DRBD2]# ls -l /data

The file created on the 1st node should be present here.

 

Manual split brain recovery

==================

At times, if there is a bad disturbance in the network and both nodes think they have the latest data, the cluster gets into a split-brain state.

This can be identified when “cat /proc/drbd” reports “Secondary/Unknown” or “Unknown/Primary” on the nodes.
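A quick way to script this check is to grep the status for a disconnected peer role. The sample line below is hard-coded so the sketch is self-contained; on a real node you would read /proc/drbd instead.

```shell
# Hard-coded sample of a status line during a suspected split brain
status='0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----'
# A peer role of "Unknown" suggests the nodes are no longer talking to each other
if printf '%s\n' "$status" | grep -qE 'ro:[A-Za-z]+/Unknown'; then
    echo "possible split brain: peer state unknown"
fi
```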

Here we have to identify which host has the latest data and make that node the primary. Let us say in this example DRBD2 has the latest data. Then we should do the following on the hosts:

DRBD1]# drbdadm secondary data1
DRBD1]# drbdadm -- --discard-my-data connect data1

DRBD2]# drbdadm connect data1

=============== OR ===============

On the node decided to have the older data, run the following commands after unmounting the drbd0 device:

drbdadm secondary all
drbdadm disconnect all
drbdadm -- --discard-my-data connect all

On the node to be primary, run the following:

drbdadm primary all
drbdadm disconnect all
drbdadm connect all
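The two recovery sequences above can be collected into a small script. The sketch below only echoes each command (DRY_RUN=echo) so the sequence can be reviewed before running it for real; it assumes the drbd 8.4 command syntax used in this article.

```shell
# Dry run: print the recovery commands instead of executing them.
# Set DRY_RUN="" (and run as root) to actually apply them.
DRY_RUN=echo

# On the node that must discard its data (the split-brain "victim"):
$DRY_RUN drbdadm secondary all
$DRY_RUN drbdadm disconnect all
$DRY_RUN drbdadm -- --discard-my-data connect all

# On the node that keeps its data (the "survivor"):
$DRY_RUN drbdadm primary all
$DRY_RUN drbdadm disconnect all
$DRY_RUN drbdadm connect all
```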

==============================

There could be a situation where DRBD is not enabled to start at boot-up and both hosts have been shut down. When booting back up, if one node does not come up, DRBD will not start on the other node. In this case we have to do the following on the surviving node:

DRBD2]# modprobe drbd
DRBD2]# drbdadm create-md data1
DRBD2]# drbdadm up data1
DRBD2]# drbdadm primary data1 (or drbdadm primary data1 --force)