
Dual Primary DRBD on CentOS 6: GFS2 & Pacemaker

by Justin Silver · Published February 27, 2014 · Updated February 8, 2018

This guide describes how to create a pair of redundant file servers using DRBD for replication, RedHat GFS2 (Global File System), and Pacemaker for cluster management. In this case we are also using RackSpace Cloud Servers and associated OpenStack features, so we will use the nova client to create the networks, servers, and storage before logging on to finish the configuration.

Once completed, you will have a dual primary DRBD configuration that allows reads and writes on both nodes at the same time, enabling load-balanced NFS, for example.

Network Architecture

This is a rough diagram of the network architecture. Each file server is attached to a block storage device, and the file servers synchronize their disks using DRBD on a private storage network. On a different internal network, the NFS servers are fronted by one or more load balancers, to which your application servers connect. Those servers are in turn fronted by load balancers, firewalls, etc., or serve content to your internal network. You could also have clients connect directly to the NFS share exposed on the load balancer.

Create RackSpace Cloud Network

First we will need to create a private Cloud Network that we can dedicate to replication. We will call the network “storage-replication” and give it the 192.168.1.0/24 range. You will need to capture the id that is returned so that you can attach it to a Cloud Server.

nova network-create storage-replication 192.168.1.0/24

+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| cidr     | 192.168.1.0/24                       |
| id       | 7c99ba74-c28c-4c52-9c5a-xxxxxxxxxxxx |
| label    | storage-replication                  |
+----------+--------------------------------------+
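
Copying the id by hand is error-prone, so it can help to pull it out of the table with a small filter. The following is a minimal sketch, assuming the nova client prints the two-column Property/Value table shown above; the helper name table_value is hypothetical and is not part of the nova client.

```shell
# Hypothetical helper: read a nova-style ASCII table on stdin and print
# the Value column of the row whose Property column equals "$1".
table_value() {
    awk -F'|' -v key="$1" '
        NF >= 3 {
            prop = $2
            val = $3
            # Trim the padding around both cells.
            gsub(/^[ \t]+|[ \t]+$/, "", prop)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (prop == key) { print val; exit }
        }'
}

# Usage (assumes the nova client is installed and authenticated):
#   NETWORK_ID=$(nova network-create storage-replication 192.168.1.0/24 | table_value id)
#   echo "$NETWORK_ID"
```

The same filter should work for the server and volume tables later in this guide, since they use the same layout.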

Create RackSpace Cloud Servers

Now that we have a network set up, let’s create two servers and assign them to it. You will need to specify an --image (the one below is for CentOS 6.5 PVHVM); you can get a list of available images using nova image-list. We are going to use a 4GB standard Cloud Server, but again you can use a different --flavor value from the results of nova flavor-list. The --nic net-id is where you place your private network id, and the --file option lets us inject an SSH public key so we don’t have to bother with the password after the server is built. We sleep for 30 seconds between builds to encourage the second VM to be provisioned on a different hypervisor for additional redundancy. When all is said and done we have two new Cloud Servers named fileserver-1 and fileserver-2, and once again you should capture the ids for later use.

for i in 1 2; \
	do nova boot \
	--image 41e59c5f-530b-423c-86ec-13b23de49288 \
	--flavor 5 \
	--nic net-id=7c99ba74-c28c-4c52-9c5a-xxxxxxxxxxxx \
	--file /root/.ssh/authorized_keys=/Users/justinsilver/.ssh/id_dsa.pub \
	fileserver-${i}; \
	sleep 30; \
done

+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
| status                 | BUILD                                |
| updated                | 2014-02-26T08:16:59Z                 |
| OS-EXT-STS:task_state  | scheduling                           |
| key_name               | None                                 |
| image                  | CentOS 6.5 (PVHVM)                   |
| hostId                 |                                      |
| OS-EXT-STS:vm_state    | building                             |
| flavor                 | 4GB Standard Instance                |
| id                     | 69f66617-44e2-4cbc-8a34-xxxxxxxxxxxx |
| user_id                | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx     |
| name                   | fileserver-1                         |
| adminPass              | xxxxxxxxxxxx                         |
| tenant_id              | xxxxxx                               |
| created                | 2014-02-26T08:16:58Z                 |
| OS-DCF:diskConfig      | MANUAL                               |
| accessIPv4             |                                      |
| accessIPv6             |                                      |
| progress               | 0                                    |
| OS-EXT-STS:power_state | 0                                    |
| config_drive           |                                      |
| metadata               | {}                                   |
+------------------------+--------------------------------------+
+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
| status                 | BUILD                                |
| updated                | 2014-02-26T08:17:32Z                 |
| OS-EXT-STS:task_state  | scheduling                           |
| key_name               | None                                 |
| image                  | CentOS 6.5 (PVHVM)                   |
| hostId                 |                                      |
| OS-EXT-STS:vm_state    | building                             |
| flavor                 | 4GB Standard Instance                |
| id                     | 5f0a5c3e-0dfa-4583-bddc-xxxxxxxxxxxx |
| user_id                | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx     |
| name                   | fileserver-2                         |
| adminPass              | xxxxxxxxxxxx                         |
| tenant_id              | xxxxxx                               |
| created                | 2014-02-26T08:17:31Z                 |
| OS-DCF:diskConfig      | MANUAL                               |
| accessIPv4             |                                      |
| accessIPv6             |                                      |
| progress               | 0                                    |
| OS-EXT-STS:power_state | 0                                    |
| config_drive           |                                      |
| metadata               | {}                                   |
+------------------------+--------------------------------------+

Create Cloud Block Storage

Next we can create a pair of Cloud Block Storage devices to attach to our newly created Cloud Servers. We are going to use SSD instead of SATA for improved read performance and size it to 200GB. You can choose whatever value you need for your purposes, and since we will be using LVM we can resize later, though some care should be taken up front. Again, pay attention to the ids that are returned so that we can attach them to the Cloud Servers.

for i in 1 2; \
	do nova volume-create \
	--display-name fileserver-${i} \
	--volume-type SSD 200; \
done;

+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| status              | available                            |
| display_name        | fileserver-1                         |
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| created_at          | 2014-02-26T07:11:37.000000           |
| display_description | None                                 |
| volume_type         | SSD                                  |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| size                | 200                                  |
| id                  | db75fdd8-da9f-48df-861a-xxxxxxxxxxxx |
| metadata            | {}                                   |
+---------------------+--------------------------------------+
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| status              | available                            |
| display_name        | fileserver-2                         |
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| created_at          | 2014-02-26T07:11:40.000000           |
| display_description | None                                 |
| volume_type         | SSD                                  |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| size                | 200                                  |
| id                  | 28a2905e-49e0-426c-8b33-xxxxxxxxxxxx |
| metadata            | {}                                   |
+---------------------+--------------------------------------+

Now use the IDs of the Cloud Servers and Cloud Block Storage to attach each storage device to the appropriate server.

nova volume-attach 69f66617-44e2-4cbc-8a34-xxxxxxxxxxxx db75fdd8-da9f-48df-861a-xxxxxxxxxxxx /dev/xvdb && \
nova volume-attach 5f0a5c3e-0dfa-4583-bddc-xxxxxxxxxxxx 28a2905e-49e0-426c-8b33-xxxxxxxxxxxx /dev/xvdb

+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/xvdb                            |
| serverId | 69f66617-44e2-4cbc-8a34-xxxxxxxxxxxx |
| id       | db75fdd8-da9f-48df-861a-xxxxxxxxxxxx |
| volumeId | db75fdd8-da9f-48df-861a-xxxxxxxxxxxx |
+----------+--------------------------------------+
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/xvdb                            |
| serverId | 5f0a5c3e-0dfa-4583-bddc-xxxxxxxxxxxx |
| id       | 28a2905e-49e0-426c-8b33-xxxxxxxxxxxx |
| volumeId | 28a2905e-49e0-426c-8b33-xxxxxxxxxxxx |
+----------+--------------------------------------+

DRBD Cloud Server Configuration

Once the servers are provisioned, connect to them both to continue the configuration. Since this is a dual primary DRBD configuration you will eventually be able to read and write to both, however for the initial sync we will use fileserver-1 as the primary.

As these are brand new servers it’s recommended to fetch all updates, allow traffic on your private network via iptables and reboot in case there were any kernel patches. Make sure to save your iptables changes or they will be lost after rebooting.

DRBD is in the ELRepo repository, which is not enabled by default but can be added via an RPM. Once this RPM has been installed you can use yum to install both the DRBD kernel module and utils.

rpm -ivh http://elrepo.org/elrepo-release-6-5.el6.elrepo.noarch.rpm
yum -y update
iptables -I INPUT -m iprange --src-range 192.168.1.1-192.168.1.255 -j ACCEPT
service iptables save
yum install -y kmod-drbd84 drbd84-utils
modprobe drbd

Protect yourself from DNS outages, lookups, and the pain of using the wrong interface by assigning a hostname in the hosts file of each server. By entering the hostnames here we can ensure that corosync and cman use the correct interface for cluster management.

192.168.1.1 fileserver-1
192.168.1.2 fileserver-2
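
If you script the server setup, these entries can be added idempotently so that re-running the script does not duplicate them. This is a minimal sketch, assuming the addresses above; add_host_entry is a hypothetical helper, and the hosts file is passed as a parameter so the function can be tried against a scratch file first.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already present as a hostname on a non-comment line.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    if ! awk -v n="$name" '
            !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"
    then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Usage on both servers (as root):
#   add_host_entry /etc/hosts 192.168.1.1 fileserver-1
#   add_host_entry /etc/hosts 192.168.1.2 fileserver-2
```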

Create a partition on both hosts of the same size. You can use the full Cloud Block Storage or a piece of it (hint: 1GB will sync a lot faster if you’re just doing a test) but for the rest of this guide make sure to assign it to /dev/xvdb1.

In this guide we are going to use LVM to manage the volume so that we can shrink/grow it as needed. To do this you will need to press “t” after the partition is created and then choose “8e” for “Linux LVM”. Press “w” to save and exit or just “q” to exit and lose your changes.

fdisk /dev/xvdb

Disk /dev/xvdb: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1568dcd9

	Device Boot      Start         End      Blocks   Id  System
/dev/xvdb1               1       26108   209712478+  8e  Linux LVM

pvcreate /dev/xvdb1
	Physical volume "/dev/xvdb1" successfully created
vgcreate fileserver /dev/xvdb1
	Volume group "fileserver" successfully created
lvcreate --name r0 --size 50G fileserver
	Logical volume "r0" created

Now we can create a DRBD resource called r0 using a configuration file called /etc/drbd.d/r0.res. Copy this file onto both nodes and make sure it is the same including whitespace. For the “on” directive you need to use the actual hostname of the server that was entered in /etc/hosts.

resource r0 {
        protocol C;
        startup {
                become-primary-on both;
        }
        disk {
                fencing resource-and-stonith;
        }
        handlers {
                fence-peer              "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target     "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "DRBD Super Secret Password";
                timeout 180;
                ping-int 3;
                ping-timeout 9;
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        on fileserver-1 {
                device /dev/drbd1;
                disk /dev/fileserver/r0;
                address 192.168.1.1:7788;
                meta-disk internal;
        }
        on fileserver-2 {
                device /dev/drbd1;
                disk /dev/fileserver/r0;
                address 192.168.1.2:7788;
                meta-disk internal;
        }
}
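
Since the file has to be byte-for-byte identical on both nodes, a quick comparison can catch whitespace differences before DRBD is started. This is a minimal sketch, assuming you can SSH from fileserver-1 to fileserver-2; matches_local_file is a hypothetical helper built on the standard cmp utility.

```shell
# Hypothetical helper: succeed only if the bytes on stdin are identical
# to the named local file (whitespace differences count as a mismatch).
matches_local_file() {
    cmp -s - "$1"
}

# Usage on fileserver-1 (assumes SSH access to fileserver-2):
#   if ssh fileserver-2 cat /etc/drbd.d/r0.res | matches_local_file /etc/drbd.d/r0.res; then
#       echo "r0.res is identical on both nodes"
#   else
#       echo "r0.res differs between nodes"
#   fi
```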

We are now ready to start the DRBD service, but we want to make sure it doesn’t start on boot as it will be managed by pacemaker.

chkconfig drbd off
service drbd start
drbdadm create-md r0

Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success

On fileserver-1 only, run the following command to force it to become primary and start the initial synchronization from its disk. Check the status and wait until the sync is 100% complete before continuing.

drbdadm primary --force r0
service drbd status

drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
m:res          cs          ro               ds                     p  mounted  fstype
...            sync'ed:    18.3%            (33656/41188)M
1:r0           SyncSource  Primary/Secondary UpToDate/Inconsistent  C
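
Rather than re-running the status command by hand, you can poll until both disks report UpToDate. This is a minimal sketch, assuming the column layout of the service drbd status output shown above (the disk states are the fourth field of the resource line); drbd_in_sync is a hypothetical helper.

```shell
# Hypothetical helper: read "service drbd status" output on stdin and
# succeed when the line for resource "$1" shows UpToDate/UpToDate.
drbd_in_sync() {
    awk -v res="$1" '
        # Resource lines look like: 1:r0  Connected  Primary/Primary  UpToDate/UpToDate  C
        $1 ~ (":" res "$") && $4 == "UpToDate/UpToDate" { ok = 1 }
        END { exit !ok }'
}

# Usage: wait for the initial sync of r0 to finish.
#   until service drbd status | drbd_in_sync r0; do
#       sleep 10
#   done
```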

Now we are ready to promote both DRBD nodes to primary status. You can check the status again and once the sync has completed you are ready to move on.

drbdadm adjust r0
drbdadm primary r0
service drbd status

drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
m:res          cs         ro               ds                 p  mounted  fstype
1:r0           Connected  Primary/Primary  UpToDate/UpToDate  C

Create Clustered Filesystem

Now that DRBD is configured we need to install a clustered filesystem, GFS2 in this case. It handles the details of your filesystem being written and read on multiple nodes at the same time without getting trashed. A non-clustered filesystem such as EXT3/4 just won’t work properly when mounted on both nodes.

yum -y install gfs2-utils cman pacemaker pacemaker-cli fence-agents resource-agents openais

We are now going to create a /etc/cluster/cluster.conf configuration for a cluster named “pacemaker” (note the max length is 15 characters for corosync.conf). Because DRBD only supports two nodes we can’t reach a quorum in the traditional sense – we must set the special two_node="1" parameter that lets the cluster reach quorum even if a node has failed. Specify that we want to use pacemaker for fencing and we should be good to go. Note the fileserver-1 and fileserver-2 from /etc/hosts.

<?xml version="1.0"?>
<cluster config_version="1" name="pacemaker">
    <cman two_node="1" expected_votes="1"/>
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="fileserver-1" nodeid="1" votes="1">
            <fence>
                <method name="pcmk-redirect">
                    <device name="pcmk" port="fileserver-1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="fileserver-2" nodeid="2" votes="1">
            <fence>
                <method name="pcmk-redirect">
                    <device name="pcmk" port="fileserver-2"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <fencedevices>
        <fencedevice name="pcmk" agent="fence_pcmk"/>
    </fencedevices>
</cluster>
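
To see why two_node="1" is needed, it helps to work through the quorum arithmetic. This is a minimal sketch, assuming cman’s usual majority rule of expected_votes / 2 + 1 with integer division; quorum_for is a hypothetical helper for illustration only.

```shell
# Hypothetical helper: votes needed for quorum under a simple majority
# rule, given the total number of expected votes.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

quorum_for 2   # prints 2: with two nodes, losing either one loses quorum
quorum_for 3   # prints 2: a three-node cluster survives one failure
```

With two_node="1" and expected_votes="1" the cluster stays quorate with a single vote, which is why fencing has to decide which node survives a split.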

This is a fairly standard /etc/corosync/corosync.conf configuration; however, I had to remove the mcastaddr parameter from the interface block and replace it with broadcast: yes to get the nodes to see each other. We want to bind to 192.168.1.0, the network address of the replication network.

totem {
    version: 2
    token: 3000
    token_retransmits_before_loss_const: 10
    join: 60
    consensus: 3600
    vsftype: none
    max_messages: 20
    clear_node_high_bit: yes
    secauth: off
    threads: 0
    rrp_mode: none
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        broadcast: yes
        mcastport: 5405
    }
}
amf {
    mode: disabled
}
service {
    ver:       1
    name:      pacemaker
}
aisexec {
    user:   root
    group:  root
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
        tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}
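
The bindnetaddr value is the network address of the interface, not the host address. If your replication network uses a different range, the value can be derived from a node’s IP and prefix length. This is a minimal sketch for IPv4 only; network_address is a hypothetical helper, and it runs in a subshell so the temporary IFS change does not leak.

```shell
# Hypothetical helper: print the IPv4 network address for IP "$1" and
# prefix length "$2" (for example 192.168.1.1 and 24).
network_address() (
    ip=$1
    prefix=$2
    IFS=.
    set -- $ip      # split the address into its four octets
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    net=$(( addr & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 ))
)

network_address 192.168.1.1 24   # prints 192.168.1.0
```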

Starting pacemaker on one node at a time will allow you to monitor the status with cman_tool status and cman_tool nodes. Once everything is started up we can create the GFS2 filesystem on the DRBD resource; run mkfs.gfs2 on one node only, since the device is replicated to the other. Make sure to specify -p lock_dlm for locking management and one journal per node with -j 2. Mounting the devices with -o noatime,nodiratime will give us a performance boost as we don’t really care about access times.

chkconfig pacemaker on
service pacemaker start
mkfs.gfs2 -t pacemaker:storage -p lock_dlm -j 2 /dev/drbd1
mkdir -p /mnt/storage
mount -t gfs2 -o noatime,nodiratime /dev/drbd1 /mnt/storage
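
The -t argument to mkfs.gfs2 is the lock table name in the form clustername:fsname, and the cluster name has to match the one in /etc/cluster/cluster.conf. To avoid a typo, the name can be read from the file. This is a minimal sketch, assuming the cluster element is written on one line as in the configuration above; cluster_name and lock_table are hypothetical helpers.

```shell
# Hypothetical helper: print the name attribute of the <cluster> element.
cluster_name() {
    sed -n 's/.*<cluster[[:space:]][^>]*name="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: build the clustername:fsname lock table string.
lock_table() {
    printf '%s:%s\n' "$(cluster_name "$1")" "$2"
}

# Usage (on one node only):
#   mkfs.gfs2 -t "$(lock_table /etc/cluster/cluster.conf storage)" -p lock_dlm -j 2 /dev/drbd1
```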

STONITH: Shoot The Other Node In The Head

STONITH is not a suggestion when dealing with Dual Primaries, it is an absolute requirement.

– LINBIT support

So now that we have everything up and running, there is the not-so-small task of making sure we are able to deliver uncorrupted data. What happens when the synchronization is lost between nodes? How can we ensure that the misbehaving node is taken out of service? This is what fencing and STONITH accomplish: making sure that data isn’t written to the good node until the bad node is out of service, ensuring data integrity.

Install CRM Shell

We are going to use the CRM Shell to control Pacemaker, and we can get it from yum if we install the proper repository.

cd /etc/yum.repos.d/
wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
yum -y install crmsh graphviz

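Start by disabling STONITH while the resources are being defined; the final configuration step turns it back on.

crm configure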
property stonith-enabled="false"
commit
exit

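Next, define the DRBD master/slave resource, the cloned GFS2 filesystem, the colocation and ordering constraints, and the fencing primitives, then re-enable STONITH.

crm configure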
primitive p_drbd_r1 ocf:linbit:drbd \
	params drbd_resource="r0" \
	op start interval="0" timeout="240" \
	op stop interval="0" timeout="100" \
	op monitor interval="29s" role="Master" \
	op monitor interval="31s" role="Slave"
ms ms_drbd_r1 p_drbd_r1 \
	meta master-max="2" \
	master-node-max="1" \
	clone-max="2" \
	clone-node-max="1" \
	notify="true"
primitive p_fs_r1 ocf:heartbeat:Filesystem \
	params device="/dev/drbd1" \
	directory="/mnt/storage" \
	fstype="gfs2" \
	op start interval="0" timeout="60" \
	op stop interval="0" timeout="60" \
	op monitor interval="60" timeout="40"
clone cl_fs_r1 p_fs_r1 meta interleave="true"
colocation co_fs_with_drbd inf: cl_fs_r1 ms_drbd_r1:Master
order o_drbd_before_fs inf: ms_drbd_r1:promote cl_fs_r1
primitive stonith_fence_virsh_fileserver1 stonith:fence_virsh \
	params action="reboot" ipaddr="192.168.1.1" \
	login="root" identity_file="/root/.ssh/id_rsa" \
	port="fileserver-1"
primitive stonith_fence_virsh_fileserver2 stonith:fence_virsh \
	params action="reboot" ipaddr="192.168.1.2" \
	login="root" identity_file="/root/.ssh/id_rsa" \
	port="fileserver-2"
location l_stonith_fence_virsh_machine1_noton_fileserver1 \
	stonith_fence_virsh_fileserver1 -inf: fileserver-1
location l_stonith_fence_virsh_machine1_noton_fileserver2 \
	stonith_fence_virsh_fileserver2 -inf: fileserver-2
property stonith-enabled="true"
commit
exit

Validate the Cluster

Check to make sure that everything is good to go. Now is a good time to make sure things are actually failing over properly before moving to production.

[root@fileserver-1]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
m:res    cs         ro               ds                 p  mounted           fstype
1:r0     Connected  Primary/Primary  UpToDate/UpToDate  C  /mnt/storage      gfs2

[root@fileserver-1]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    152   2014-02-27 22:23:29  fileserver-1
   2   M     76   2014-02-27 22:23:29  fileserver-2

[root@fileserver-2]# cman_tool status
Version: 6.2.0
Config Version: 3
Cluster Name: pacemaker
Cluster Id: 62570
Cluster Member: Yes
Cluster Generation: 152
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Node votes: 1
Quorum: 1
Active subsystems: 8
Flags: 2node
Ports Bound: 0
Node name: fileserver-1
Node ID: 1
Multicast addresses: 239.192.244.95
Node addresses: 192.168.1.1

[root@fileserver-1]# fence_tool ls
fence domain
member count  2
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 2
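
These checks can also be scripted, for example in a monitoring job. This is a minimal sketch, assuming the cman_tool nodes layout shown above, where the second column is the status and “M” means the node is a member; all_nodes_members is a hypothetical helper.

```shell
# Hypothetical helper: read "cman_tool nodes" output on stdin and succeed
# only if at least one node is listed and every node has status "M".
all_nodes_members() {
    awk '
        NR > 1 && NF { total++; if ($2 == "M") members++ }
        END { exit !(total > 0 && members == total) }'
}

# Usage:
#   if cman_tool nodes | all_nodes_members; then
#       echo "all cluster nodes are members"
#   else
#       echo "at least one node is not a cluster member"
#   fi
```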

NFS Server

Install the NFS packages on both file servers, then start the services and enable them at boot.

yum -y install 'nfs*'
service rpcbind start
chkconfig rpcbind on
service nfs start
chkconfig nfs on

Export the share by adding the following line to /etc/exports on both file servers.

/mnt/storage     192.168.1.0/24(rw,async,no_root_squash,no_all_squash,no_subtree_check)

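/mnt/storage – the exported directory

192.168.1.0/24 – IP range of clients allowed to mount the share

rw – read / write access

async – the server replies before writes are committed to disk; faster, but data can be lost if the server crashes

no_root_squash – root on the client keeps root privileges on the share instead of being mapped to the anonymous user

no_all_squash – other users keep their own UIDs and GIDs instead of being mapped to the anonymous user

no_subtree_check – disables subtree checking, which improves performance but lowers security because parent directory permissions are not checked when accessing the share.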

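On each client, create the mount point and mount the share. The example below connects directly to fileserver-1; if the NFS servers sit behind a load balancer, use its address instead.

mkdir -p /mnt/storage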
mount -t nfs -o rsize=32768,wsize=32768,noatime,nodiratime,async fileserver-1:/mnt/storage /mnt/storage/


Igor says:

August 23, 2016 at 4:04 pm

You will need virtual ip resource as well, in current config if fileserver-1 dies the clients are screwed.

- Justin Silver says:
  
  August 27, 2016 at 5:24 pm
  
  Hi Igor, that’s correct, although the end of the post isn’t very clear since I just connect to fileserver-1. If you look at the diagram at the top of the post in this setup both NFS servers are fronted by a load balancer, this is what the clients should *actually* connect to. Thanks for your comment!
  
norman choe says:

March 5, 2015 at 7:41 am

seriously an excellent resource that i’ve actually pointed the folks at DRBD to, and they agreed that you’ve done it right.

I have another Q though in the CRM section (and thanks for your explanation; I see that I don’t need to make changes on “both nodes”, and that when I make changes on one I can see them on the other):
```
location l_stonith_fence_virsh_machine1_noton_fileserver1 stonith_fence_virsh_fileserver1 -inf: host1
location l_stonith_fence_virsh_machine1_noton_fileserver2 stonith_fence_virsh_fileserver2 -inf: host2
```
what’s the virsh_machine1 mean there? I realize it’s just a name, and is it just arbitrary?

Also:
```
primitive stonith_fence_virsh_fileserver1 stonith:fence_virsh \
    params action="reboot" ipaddr="vm-host" \
```
… is the “vm-host” relevant? I don’t know the vm-host for a Rackspace machine.

OK, heading back to read more about CRMSH. RTFM, I know!

- Justin Silver says:
  
  May 4, 2015 at 11:50 am
  
  Hi Norman,
  
  Sorry for the delayed response – just got back from a 2 month road trip and I didn’t keep up with all my email. For your first question, the name should be arbitrary as it’s just an identifier.
  
  The second item is a bug in the guide – I was using a “real” configuration as a guide but copied and pasted to make sure nothing proprietary made it to the Internet and I forgot to update that section. The ipaddr attribute should be the IP address of each cluster, used to connect to and restart the machine if necessary.
  
  Thanks!
  
norman choe says:

February 25, 2015 at 1:49 pm

you mention /dev/dvdb1 a couple of times but it’s /dev/xvdb1 in the snippets; also, could you fill in the STONITH part a bit? Does that configuration go on both nodes?

- Justin Silver says:
  
  February 25, 2015 at 2:53 pm
  
  Sorry, that should have been `/dev/xvdb1`, I’ve updated the post.
  
  As for STONITH, you are using CRMSH to manage the cluster, not a particular node. That means that when you save the configuration, it should be applied to all nodes in your cluster.
  
  - norman says:
    
    February 26, 2015 at 10:57 am
    
    ahh, okay. however, since i don’t allow root logins, i’m going to have to edit that a bit.
    
    or i guess i could allow ssh logins from a single host, but that’s a bit of a pain. maybe something with sudo.
    
    All in all though, this was a super helpful howto!
    
    - Justin Silver says:
      
      February 26, 2015 at 11:27 am
      
      It doesn’t have to be root per se, just a user that is allowed to execute `reboot`. This is how one node can “shoot the other one in the head” by taking it offline and assuming the master role until the other node can be synced up. If you only allow logins to this user via your local network interface, it should be fairly secure.
      
      Thanks for reading!
      
Richard says:

November 2, 2014 at 1:43 pm

I’m about to build this out with bare hardware to play with it.
I guess the one thing I’m missing from the read through is if I want to use the cluster for HA as well as load balancing…
I like the idea of splitting the nfs load between the 2 servers. Instead of just having the second one mirroring.
But if I have server-a pointed to server-1 and server-b pointed to server-2. (1 and 2 being the nfs cluster) and server-1 goes down, how does server-a fail over to server-2 to continue operating without down time??

- Justin Silver says:
  
  November 3, 2014 at 10:05 am
  
  Richard – as I mentioned in my email the HA/failover is going to depend on your specific setup. If you use Heartbeat it will handle the assignment of shared IP addresses, so if Server1 is 10.0.0.10 and Server2 is 10.0.0.20 then they might have a shared IP address of 10.0.0.30 that the clients actually connect to. You can logically spread the load, but technically all clients are mapped to a single server. One big advantage here is that you don’t have to have dual-primaries – this can get pretty complicated.
  
  I ended up going with Pacemaker with a hardware load balancer in front of it since we needed the load balanced performance. Both servers were active in a dual primary configuration, with all the clients connected to an NFS share on the load balancer. If a server went down, it was removed from the pool by the load balancer and the clients were none the wiser.
  
  Some HA testing was done, but not extensively.
  
Edward says:

May 28, 2014 at 6:28 am

hello, I need to assemble a cluster to increase processing for quantum computation. Could you instruct me how to synchronize the machines for this? I use CentOS 6.5.
If you can, I would be grateful. My email is willicamarg at yahoo dot com dot br.

- Justin Silver says:
  
  May 28, 2014 at 11:51 am
  
  Hi Edward, the guide above should work for CentOS 6.5 in terms of a clustered filesystem shared with NFS. I used a hardware load balancer to perform the health checks for the NFS servers that serve the DRBD replicated block storage.
  

