Reference

Q & A

How does Crowbar automatically claim block devices for DRBD when applying the Pacemaker barclamp?

Unlike SBD, where a specific block device must be specified, the Pacemaker barclamp simply grabs the first unclaimed block device it finds and uses it for DRBD. Unfortunately, the Bugzilla entries tracking this behavior are currently only visible to SUSE employees.

https://bugzilla.suse.com/show_bug.cgi?id=1031065

https://bugzilla.suse.com/show_bug.cgi?id=924315

It is worth noting that this is not really a bug but rather a UX/design issue, since the barclamp is working as intended. As a workaround, you can edit the Chef node data via the knife command to manually allocate a specific block device for use by DRBD, as sketched below.
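
A hedged sketch of the inspection side of that workaround is shown below: it dumps the node's attributes with `knife node show` and looks for the claimed-disk data. The node name and the `crowbar_wall`/`claimed_disks` attribute path are assumptions that vary between Crowbar releases, so verify the real layout first; the actual reassignment would then be done interactively with `knife node edit`.

```python
# Hedged sketch: inspect which block devices Crowbar/Chef reports as claimed
# on a node before reassigning one to DRBD. Only `knife node show` is used;
# the node name and the "crowbar_wall" -> "claimed_disks" attribute layout
# are assumptions that may differ between Crowbar releases.
import json
import subprocess

NODE = "d52-54-00-12-34-56.example.com"   # hypothetical Crowbar node name

raw = subprocess.run(
    ["knife", "node", "show", NODE, "-l", "-F", "json"],
    check=True, capture_output=True, text=True,
).stdout
attrs = json.loads(raw)

# Depending on the knife version, attributes may sit at the top level or under
# precedence sections such as "normal"/"automatic".
candidates = [attrs] + [attrs.get(k, {}) for k in ("normal", "automatic", "default")]
claimed = next(
    (c["crowbar_wall"]["claimed_disks"]
     for c in candidates
     if isinstance(c.get("crowbar_wall", {}).get("claimed_disks"), dict)),
    {},
)
for device, info in claimed.items():
    print(f"{device} claimed by {info.get('owner', 'unknown')}")

# The claim entry would then be adjusted interactively with `knife node edit <node>`.
```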

Vagrant

Getting Started - Vagrant by HashiCorp (https://www.vagrantup.com/intro/getting-started/)

Getting Started with Vagrant (https://semaphoreci.com/community/tutorials/getting-started-with-vagrant)

OpenStack HA

OpenStack High Availability Guide (https://docs.openstack.org/ha-guide/)

Deployment topologies for HA with OpenStack (https://www.mirantis.com/blog/understanding-options-deployment-topologies-high-availability-ha-openstack/)

High Availability for OpenStack (https://www.slideshare.net/kamesh001/high-available-for-openstack)

OpenStack High Availability – Controller Stack (http://behindtheracks.com/2014/04/openstack-high-availability-controller-stack/)

OpenStack HA - Concept

https://godleon.github.io/blog/2015/05/14/OpenStack-HA-Concept

- To avoid SPOF (Single Point of Failure)

- Stateless vs. Stateful services

  - Stateless service examples: nova-api / nova-conductor / glance-api / keystone-api / neutron-api / nova-scheduler

  - Stateful service examples: database / message queue

- Active/Passive

  - Stateless vs. Stateful (Pacemaker & Corosync)

- Active/Active

  - Stateless (HAProxy) vs. Stateful

Terminologies

Distributed Replicated Block Device (DRBD)

A system for mirroring block devices (file systems, VM images, and so on) across the network, typically a dedicated LAN, but also capable of WAN replication. It is the general replication technology most commonly used by members of the Linux-HA community.

DRBD is a distributed replicated storage system for the Linux platform. It is implemented as a kernel driver, several userspace management applications, and some shell scripts. DRBD is traditionally used in high availability (HA) computer clusters, but beginning with DRBD version 9, it can also be used to create larger software defined storage pools with a focus on cloud integration.

DRBD layers logical block devices (conventionally named /dev/drbdX, where X is the device minor number) over existing local block devices on participating cluster nodes. Writes to the primary node are transferred to the lower-level block device and simultaneously propagated to the secondary node(s). The secondary node(s) then transfer the data to their corresponding lower-level block devices. All read I/O is performed locally unless read-balancing is configured.
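
As a rough illustration of the write path just described (DRBD itself is a kernel driver, so this is only a toy model in Python, with made-up node objects standing in for the lower-level block devices): writes on the primary go to its local device and are simultaneously shipped to each secondary, while reads stay local.

```python
# Toy model (not DRBD code) of the write path described above: a write on the
# primary goes to its local backing device and is simultaneously shipped to
# each secondary, which applies it to its own backing device; reads are local.
class Node:
    def __init__(self, name):
        self.name = name
        self.backing_device = {}   # block number -> data, stands in for /dev/sdX

    def local_write(self, block, data):
        self.backing_device[block] = data

    def local_read(self, block):
        return self.backing_device.get(block)

class DrbdPrimary(Node):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries   # replication targets, e.g. over a dedicated LAN

    def write(self, block, data):
        self.local_write(block, data)    # lower-level local device
        for peer in self.secondaries:    # propagated to the secondary node(s)
            peer.local_write(block, data)

    def read(self, block):
        return self.local_read(block)    # read I/O is performed locally

secondary = Node("node-b")
primary = DrbdPrimary("node-a", [secondary])
primary.write(0, b"hello")
assert primary.read(0) == secondary.local_read(0) == b"hello"
```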

Should the primary node fail, a cluster management process promotes the secondary node to a primary state. This transition may require a subsequent verification of the integrity of the file system stacked on top of DRBD, by way of a filesystem check or a journal replay. When the failed ex-primary node returns, the system may (or may not) raise it to primary level again, after device data resynchronization. DRBD's synchronization algorithm is efficient in the sense that only those blocks that were changed during the outage must be resynchronized, rather than the device in its entirety.

DRBD is often deployed together with the Pacemaker or Heartbeat cluster resource managers, although it does integrate with other cluster management frameworks. It integrates with virtualization solutions such as Xen, and may be used both below and on top of the Linux LVM stack.

Pacemaker

Pacemaker is an open-source high-availability resource manager used on computer clusters since 2004. Until about 2007 it was part of the Linux-HA project, after which it was split out into its own project.

It implements several APIs for controlling resources, but its preferred API for this purpose is the Open Cluster Framework resource agent API.

Pacemaker relies on the Corosync messaging layer for reliable cluster communications. Corosync implements the Totem single-ring ordering and membership protocol. It also provides UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker.

Pacemaker does not inherently understand the applications it manages. Instead, it relies on resource agents (RAs): scripts that encapsulate the knowledge of how to start, stop, and check the health of each application managed by the cluster.

These agents must conform to one of the OCF, SysV Init, Upstart, or Systemd standards.

Pacemaker allows services to recover from hardware and software failures automatically, often within seconds. Pacemaker coordinates this recovery: it is the “Director” of the Linux high-availability cluster stack. The software actively watches and controls services in a cluster, handling the starting, stopping, and monitoring of services, and it uses tools like Heartbeat or Corosync to communicate failures automatically.
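
As a conceptual illustration only (this is not how Pacemaker is implemented), the sketch below models the watch-and-recover behavior described above: a resource agent's monitor action is checked and, on failure, the resource is stopped everywhere and started on another node. The agent callable and node names are placeholders.

```python
# Conceptual illustration only -- not Pacemaker internals. It models the
# watch-and-recover behavior described above: check a resource via its agent's
# "monitor" action and, if it is not healthy anywhere, stop it everywhere and
# start it on the first node that succeeds.
OCF_SUCCESS, OCF_ERR_GENERIC, OCF_NOT_RUNNING = 0, 1, 7   # standard OCF return codes

def recover(resource, nodes, agent):
    """agent(action, node) -> OCF return code; a stand-in for a real resource agent."""
    for node in nodes:
        if agent("monitor", node) == OCF_SUCCESS:
            return node                       # healthy somewhere, nothing to do
    for node in nodes:
        agent("stop", node)                   # make sure it is down everywhere
    for node in nodes:                        # naive failover policy
        if agent("start", node) == OCF_SUCCESS:
            return node
    raise RuntimeError(f"{resource}: recovery failed on all nodes")

# Usage with a fake agent where only "node2" is able to run the service.
state = {"node2": False}
def fake_agent(action, node):
    if action == "monitor":
        return OCF_SUCCESS if state.get(node) else OCF_NOT_RUNNING
    if action == "start":
        if node == "node2":                   # pretend only node2 can start it
            state[node] = True
            return OCF_SUCCESS
        return OCF_ERR_GENERIC
    if action == "stop":
        state[node] = False
        return OCF_SUCCESS
    return OCF_ERR_GENERIC

print(recover("my-service", ["node1", "node2"], fake_agent))   # -> node2
```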

High-availability clusters

High-availability clusters (also known as HA clusters or failover clusters) are groups of computers that support server applications that can be reliably utilized with a minimum amount of downtime. They operate by using high-availability software to harness redundant computers in groups or clusters that provide continued service when system components fail.

Without clustering, if a server running a particular application crashes, the application will be unavailable until the crashed server is fixed. HA clustering remedies this situation by detecting hardware/software faults, and immediately restarting the application on another system without requiring administrative intervention, a process known as failover.

HA cluster implementations attempt to build redundancy into a cluster to eliminate single points of failure, for example through multiple network connections and data storage that is redundantly connected via storage area networks.

Node configurations

Active/active — Traffic intended for the failed node is either passed onto an existing node or load balanced across the remaining nodes. This is usually only possible when the nodes use a homogeneous software configuration.

Active/passive — Provides a fully redundant instance of each node, which is only brought online when its associated primary node fails.[2] This configuration typically requires the most extra hardware.

N+1 — Provides a single extra node that is brought online to take over the role of the node that has failed.

HAProxy

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for very high-traffic websites and powers quite a number of the world's most visited ones. Over the years it has become the de facto standard open-source load balancer, is now shipped with most mainstream Linux distributions, and is often deployed by default in cloud platforms.
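
HAProxy itself is configured through its own configuration file rather than code, but the round-robin balancing idea it implements can be sketched in a few lines of Python; the backend addresses and the crude TCP health check below are made up for illustration.

```python
# Toy sketch of round-robin backend selection with a crude health check, to
# illustrate the load-balancing idea HAProxy implements (HAProxy itself is
# configured via haproxy.cfg, not Python). Backend addresses are placeholders.
import itertools
import socket

BACKENDS = [("10.0.0.11", 8080), ("10.0.0.12", 8080), ("10.0.0.13", 8080)]
_rotation = itertools.cycle(BACKENDS)      # persistent round-robin order

def healthy(addr, timeout=1.0):
    """Crude TCP health check: can we open a connection to the backend?"""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend():
    """Return the next healthy backend in round-robin order."""
    for _ in range(len(BACKENDS)):
        addr = next(_rotation)
        if healthy(addr):
            return addr
    raise RuntimeError("no healthy backends")
```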

Libvirt

Libvirt is an open-source API, daemon, and management tool for managing platform virtualization. It can be used to manage KVM, Xen, VMware ESX, QEMU, and other virtualization technologies. These APIs are widely used in the orchestration layer of hypervisors in the development of cloud-based solutions.

Libvirt is a toolkit to manage virtualization hosts.
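
As a minimal sketch of the management API, assuming the libvirt-python bindings are installed and a local system-level QEMU/KVM hypervisor is reachable at qemu:///system, the following lists the defined domains and whether they are running.

```python
# Minimal sketch using the libvirt Python bindings (the libvirt-python package)
# to connect to a local QEMU/KVM hypervisor and list its domains. The
# connection URI assumes a local system-level libvirtd; adjust as needed.
import libvirt

conn = libvirt.open("qemu:///system")      # raises libvirt.libvirtError on failure
try:
    for dom in conn.listAllDomains():      # running and defined-but-stopped domains
        state = "running" if dom.isActive() else "shut off"
        print(f"{dom.name()}: {state}")
finally:
    conn.close()
```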

Nova

Nova is the OpenStack project that provides a way to provision compute instances (aka virtual servers). Nova supports creating virtual machines and bare metal servers (through the use of Ironic), and has limited support for system containers. Nova runs as a set of daemons on top of existing Linux servers to provide that service.

It requires the following additional OpenStack services for basic function:

1. Keystone: This provides identity and authentication for all OpenStack services.

2. Glance: This provides the compute image repository. All compute instances launch from glance images.

3. Neutron: This is responsible for provisioning the virtual or physical networks that compute instances connect to on boot.

OpenStack Nova is a component within the OpenStack open source cloud computing platform developed to provide on-demand access to compute resources by provisioning and managing large networks of virtual machines (VMs).
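
A hedged sketch of provisioning an instance with the openstacksdk client library is shown below; the cloud name (taken from clouds.yaml) and the image, flavor, and network names are placeholders. Keystone supplies the authentication, the image comes from Glance, and the network from Neutron, matching the list above.

```python
# Hedged sketch using the openstacksdk library to boot a Nova instance. The
# cloud name ("mycloud", from clouds.yaml) and the image/flavor/network names
# are assumptions for illustration only.
import openstack

conn = openstack.connect(cloud="mycloud")            # credentials from clouds.yaml

image = conn.image.find_image("ubuntu-22.04")        # Glance image
flavor = conn.compute.find_flavor("m1.small")        # Nova flavor
network = conn.network.find_network("private")       # Neutron network

server = conn.compute.create_server(
    name="demo-instance",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)        # block until ACTIVE
print(server.id, server.status)
```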

OCF Resource Agents

A resource agent is an executable that manages a cluster resource. No formal definition of a cluster resource exists, other than "anything a cluster manages is a resource." Cluster resources can be as diverse as IP addresses, file systems, database services, and entire virtual machines — to name just a few examples.

Any Open Cluster Framework (OCF) compliant cluster management application is capable of managing resources using the resource agents described in this document. At the time of writing, two OCF compliant cluster management applications exist for the Linux platform:

1. Pacemaker, a cluster manager supporting both the Corosync and Heartbeat cluster messaging frameworks. Pacemaker evolved out of the Linux-HA project.

2. RGmanager, the cluster manager bundled in Red Hat Cluster Suite. It supports the Corosync cluster messaging framework exclusively.

An OCF compliant resource agent can be implemented in any programming language. The API is not language specific. However, most resource agents are implemented as shell scripts, which is why this guide primarily uses example code written in shell language.

A resource agent is a standardized interface for a cluster resource. It translates a standard set of operations into steps specific to the resource or application, and interprets their results as success or failure. The OCF specification is essentially an extension of the definitions for LSB resource agents. OCF resource agents are those found in /usr/lib/ocf/resource.d/<provider>.
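
Real OCF resource agents are usually shell scripts installed under /usr/lib/ocf/resource.d/<provider>/; the following Python sketch only illustrates the same start/stop/monitor contract and the standard OCF return codes, with a hypothetical pidfile-managed daemon standing in for a real resource (the mandatory meta-data action is omitted for brevity).

```python
#!/usr/bin/env python3
# Sketch of the OCF resource agent contract described above. Real agents are
# usually shell scripts; Python is used here only for illustration, and the
# daemon being managed (binary, pidfile, --pidfile flag) is hypothetical.
import os
import subprocess
import sys

# Standard OCF return codes understood by the cluster manager.
OCF_SUCCESS, OCF_ERR_GENERIC, OCF_NOT_RUNNING = 0, 1, 7

PIDFILE = "/var/run/mydaemon.pid"        # hypothetical service pidfile
DAEMON = "/usr/sbin/mydaemon"            # hypothetical service binary

def monitor():
    """Report whether the resource is running or cleanly stopped."""
    try:
        with open(PIDFILE) as f:
            os.kill(int(f.read().strip()), 0)   # signal 0 only checks existence
        return OCF_SUCCESS
    except (OSError, ValueError):
        return OCF_NOT_RUNNING

def start():
    if monitor() == OCF_SUCCESS:
        return OCF_SUCCESS                       # already running counts as success
    rc = subprocess.call([DAEMON, "--pidfile", PIDFILE])
    return OCF_SUCCESS if rc == 0 else OCF_ERR_GENERIC

def stop():
    if monitor() == OCF_NOT_RUNNING:
        return OCF_SUCCESS                       # already stopped counts as success
    with open(PIDFILE) as f:
        os.kill(int(f.read().strip()), 15)       # SIGTERM
    return OCF_SUCCESS

if __name__ == "__main__":
    action = sys.argv[1] if len(sys.argv) > 1 else ""
    actions = {"start": start, "stop": stop, "monitor": monitor}
    sys.exit(actions.get(action, lambda: OCF_ERR_GENERIC)())
```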

ICMP

The Internet Control Message Protocol (ICMP) is a supporting protocol in the Internet protocol suite. It is used by network devices, including routers, to send error messages and operational information indicating, for example, that a requested service is not available or that a host or router could not be reached. ICMP differs from transport protocols such as TCP and UDP in that it is not typically used to exchange data between systems, nor is it regularly employed by end-user network applications (with the exception of some diagnostic tools like ping and traceroute).