Saturday 5 March 2016

Pacemaker: Observations on Building Clusters

Pacemaker is the cluster resource manager used in Red Hat Enterprise Linux/CentOS 6.5+ and 7. This is a short post on considerations for building a server cluster with Pacemaker; it is not a tutorial.

Clustering

A cluster is a democracy for computers. The primary purpose of a cluster member (node) is to vote on the availability and viability of the other members. A node does not necessarily have to do anything other than vote, and not all nodes in a cluster need to do the same thing.

Quorum

A cluster member is quorate when it can see more than 50% of the total membership. Any node that is inquorate should fence itself off from the cluster. The cluster should attempt to fence any resource that it decides is unavailable.

There is no meaningful quorum with fewer than three nodes, and there should be an odd number of nodes to avoid tied votes.
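
How an inquorate partition behaves is controlled by cluster-wide properties. The snippet below is a minimal sketch using the pcs shell that ships with CentOS 6.5+/7; adjust the values to suit your cluster.

  # Require fencing before resources are recovered elsewhere
  pcs property set stonith-enabled=true

  # An inquorate partition should stop all of its resources
  pcs property set no-quorum-policy=stop

  # Inspect membership and which nodes are online
  pcs status nodes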

Fencing and STONITH

Fencing removes "bad" nodes or resources from the cluster. "Shoot the other node in the head" (STONITH) is the most extreme form of fencing. STONITH will power off or reboot a node.

There is little point ensuring service continuity if the underlying data is toast.

Do I Need STONITH?

Almost always, the answer is Yes!

- Andrew Beekhof on Highly Available Data Corruption

STONITH devices range from power switches with an IP address (not to be confused with network switches) to the iDRAC interfaces built into Dell servers. STONITH may mean budgeting for hardware and rack units.
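
For example, an iDRAC can usually be driven as an IPMI device. The following is a hedged sketch using pcs and the fence_ipmilan agent; the node name, address and credentials are all hypothetical.

  # A STONITH resource that can power-cycle node1 via its iDRAC
  pcs stonith create fence-node1 fence_ipmilan \
      pcmk_host_list="node1" ipaddr="10.0.0.101" \
      login="root" passwd="secret" lanplus="1" \
      op monitor interval=60s

  # Don't let node1 be responsible for fencing itself
  pcs constraint location fence-node1 avoids node1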

If your fence device is not supported, you may need to write your own fence agent or source one from elsewhere.

Fence agents are installed as /usr/sbin/fence_* on CentOS 6.x.
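
To see which agents are available, and what parameters a given agent accepts (a useful starting point if you do end up writing your own), something along these lines should work with the pcs shell:

  ls /usr/sbin/fence_*
  pcs stonith list
  pcs stonith describe fence_ipmilan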

Resource Agents

A resource can be practically anything. Typically resources are services like Apache HTTP Server or "virtual" IP addresses (VIPs) that move between nodes during failover. Resource agents are scripts that start, stop and monitor the state of resources.

Pacemaker supports three basic categories of resource (a sketch of each with pcs follows the list):

  • Default - run on a single node at a time
  • Clones - run on multiple nodes at a time
  • Multi-state - a specialization of cloning for things like master/slave
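
A rough pcs sketch of each category. IPaddr2 and ping are stock agents shipped with the resource-agents and pacemaker packages; ocf:custom:foo is a hypothetical agent used only to illustrate the multi-state case.

  # Default (primitive): a VIP that runs on one node at a time
  pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.0.10 cidr_netmask=24 \
      op monitor interval=30s

  # Clone: a connectivity check that runs on every node
  pcs resource create ping ocf:pacemaker:ping host_list=192.168.0.1 \
      op monitor interval=30s --clone

  # Multi-state: a master/slave pair built on top of a primitive
  pcs resource create foo ocf:custom:foo op monitor interval=30s
  pcs resource master foo-master foo master-max=1 clone-max=2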

It is important to understand how resource agents work when building complex clusters.

Consider an asymmetric 4-node cluster with two master/slave pairs that use the same underlying service for different purposes (a sketch of the constraints with pcs follows the list):

  • foo - the init.d script started with service foo start
  • Bar - a resource defining a master/slave pair that relies on foo
    • Has constraints to prefer nodes 1 & 2
    • Has constraints to avoid nodes 3 & 4 at -INFINITY (never)
  • Baz - another resource defining a master/slave pair that relies on foo
    • Has constraints to prefer nodes 3 & 4
    • Has constraints to avoid nodes 1 & 2 at -INFINITY (never)
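
A hedged sketch of those constraints with pcs, assuming Bar and Baz have been created as master/slave resources named Bar-master and Baz-master (the names are illustrative):

  # Bar prefers nodes 1 & 2 and must never run on nodes 3 & 4
  pcs constraint location Bar-master prefers node1=100 node2=100
  pcs constraint location Bar-master avoids node3=INFINITY node4=INFINITY

  # Baz is the mirror image
  pcs constraint location Baz-master prefers node3=100 node4=100
  pcs constraint location Baz-master avoids node1=INFINITY node2=INFINITY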

If all Bar's resource agent does to check that Bar is stopped on nodes 3 & 4 is call service foo status, it will see Baz's foo process, report Bar as running, and Pacemaker will then try to stop it, mistakenly killing Baz's process.

For complex clusters you may need to write your own resource agents.
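
As a sketch of how a custom agent can avoid the trap above, Bar's agent might run its own copy of foo with a dedicated pid file and only ever look at that instance. Everything below (paths, names) is illustrative rather than a drop-in OCF agent:

  #!/bin/sh
  # Fragment of a hypothetical OCF resource agent for Bar.
  : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
  . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

  # Bar's instance of foo writes its pid here; Baz uses a different file,
  # so monitor never confuses the two processes.
  PIDFILE="/var/run/foo-bar.pid"

  bar_monitor() {
      [ -f "$PIDFILE" ] || return $OCF_NOT_RUNNING
      kill -0 "$(cat "$PIDFILE")" 2>/dev/null && return $OCF_SUCCESS
      return $OCF_NOT_RUNNING
  }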

Resource agents are located under /usr/lib/ocf/resource.d/ on CentOS 6.x.

Pacemaker Resources
