IBM Cloud Docs
Implementing a RHEL HA Add-On cluster on IBM Power Virtual Server

Use the following information and procedures to implement a Red Hat Enterprise Linux (RHEL) High Availability (HA) cluster. The cluster uses instances in IBM® Power® Virtual Server as cluster nodes.

The information describes how to transform the individual virtual server instances into a cluster.

These procedures include installing the high availability packages and agents on each cluster node and configuring the fencing devices.

This information is intended for architects and specialists who are planning a high availability deployment of SAP applications on Power Virtual Server. It is not intended to replace existing SAP or Red Hat documentation.

Before you begin

Review the general requirements, product documentation, support articles, and SAP notes listed in Implementing High Availability for SAP Applications on IBM Power Virtual Server References.

Creating virtual server instances for the cluster

Use the instructions in Creating instances for a high availability cluster on IBM® Power® Virtual Server to create the virtual server instances that you want to use as cluster nodes.

Gathering parameters for the cluster configuration

Parameters that are required for fencing agent configuration include the Cloud Resource Name (CRN) of the Power Virtual Server workspace and the instance IDs of the virtual server instances. Some extra parameters need to be derived from the CRN. The fencing agent also uses the API Key of the Service ID to authenticate with the Power Virtual Server API.

The uppercase names in the following section indicate parameters that you set as environment variables on the virtual server instances to simplify the cluster setup.

  1. Log in to Workspaces - Power Virtual Server.

  2. The list contains the names and CRNs of your workspaces.

    Locate your workspace. Click Copy next to the CRN and paste it into a temporary document.

    A CRN has multiple sections that are divided by a colon. The base format of a CRN is:

    crn:version:cname:ctype:service-name:location:scope:service-instance:resource-type:resource

    service-name
    The fifth field of the workspace CRN is always power-iaas, the service name.
    location
    The sixth field is the location, which needs to be mapped to the region.
    scope
    The seventh field is the Tenant ID.
    service-instance
    The eighth field is the Cloud Instance ID or GUID.
  3. Set IBMCLOUD_CRN to the full CRN and GUID to the content of the service-instance field. One way to extract these values with shell commands is shown in the sketch after this list.

  4. Set CLOUD_REGION to the prefix that represents the geographic area of your service instance so that you target the correct Power Cloud API endpoint.

    CLOUD_REGION when you use a public network
    For a public network, map the location to its respective geographic area (us-east, us-south, eu-de, lon, tor, syd, or tok).
    CLOUD_REGION when you use a private network
    For a private network, map the location to its respective geographic area (us-east, us-south, eu-de, eu-gb, ca-tor, au-syd, jp-tok, jp-osa, br-sao, or ca-mon).
  5. On the tile for the workspace, click View Instances.

  6. In the list of virtual server instances, click each of the cluster nodes and note its ID.

  7. Set these IDs as POWERVSI_01 and POWERVSI_02.
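The following sketch shows one way to derive GUID and the location from the CRN with standard shell tools. It assumes that you paste the full CRN into IBMCLOUD_CRN first; the mapping from location to region in the comment is an example only.

# Sketch: derive GUID and the location from the CRN (fields are separated by colons)
export IBMCLOUD_CRN=<IBMCLOUD_CRN>                        # full CRN of the workspace
export GUID=$(echo "${IBMCLOUD_CRN}" | cut -d: -f8)       # eighth field: service-instance (GUID)
echo "Location: $(echo "${IBMCLOUD_CRN}" | cut -d: -f6)"  # sixth field, for example eu-de-2
# Map the location to the region manually, for example eu-de-1 and eu-de-2 map to eu-de
export CLOUD_REGION=<CLOUD_REGION>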

Preparing the nodes for RHEL HA Add-On installation

The following section describes basic preparation steps on the cluster nodes. Make sure that you follow the steps on both nodes.

Log in as the root user to each of the cluster nodes.

Populating entries for each node in the hosts file

On both nodes, add the IP addresses and hostnames of both cluster nodes to the /etc/hosts file.

For more information, see Setting up /etc/hosts files on RHEL cluster nodes.
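The following entries are an illustrative sketch only; the IP addresses and hostnames are placeholders that you replace with the values of your environment.

# Example /etc/hosts entries for the two cluster nodes (placeholder values)
10.51.0.11   node1.example.com   node1
10.51.0.12   node2.example.com   node2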

Preparing environment variables

To simplify the setup process, prepare the following environment variables for the root user on both nodes.

On both nodes, create a file with the following environment variables and adapt the values to your environment.

export CLUSTERNAME=SAP_CLUSTER         # Cluster Name

export APIKEY=<APIKEY>                 # API Key of the ServiceID
export IBMCLOUD_CRN=<IBMCLOUD_CRN>     # CRN of workspace
export GUID=<GUID>                     # GUID of workspace
export CLOUD_REGION=<CLOUD_REGION>     # Region of workspace
export PROXY_IP=<IP_ADDRESS>           # IP address of proxy server

export NODE1=<HOSTNAME_01>             # Hostname of virtual server instance 1
export NODE2=<HOSTNAME_02>             # Hostname of virtual server instance 2

export POWERVSI_01=<POWERVSI_01>       # ID of virtual server instance 1
export POWERVSI_02=<POWERVSI_02>       # ID of virtual server instance 2
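Source the file in every shell session in which you run the cluster setup commands. The file name in the following example is an assumption; use the name that you chose.

# Example only: load the environment variables into the current shell session
source ./cluster-env
# Quick check that the variables are set
env | grep -E 'CLUSTERNAME|IBMCLOUD_CRN|GUID|CLOUD_REGION|POWERVSI'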

Installing and configuring a RHEL HA Add-On cluster

Use the following steps to set up a two-node cluster on IBM Power Virtual Server.

The instructions are based on the Red Hat product documentation and articles that are listed in Implementing High Availability for SAP Applications on IBM Power Virtual Server References.

You need to perform some steps on both nodes and some steps on either NODE1 or NODE2 only.

Installing RHEL HA Add-On software

Install the required software packages.

Checking the RHEL HA repository

Check that the RHEL High Availability repository is enabled.

On both nodes, use the following command.

dnf repolist
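To check specifically for the high availability repository, you can filter the output, for example:

# Example: list only repositories whose ID contains highavailability
dnf repolist | grep -i highavailability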

If the high availability repository is missing, use the following command to enable it on RHEL 8.

subscription-manager repos \
    --enable="rhel-8-for-ppc64le-highavailability-e4s-rpms"

For RHEL 9, use this command.

subscription-manager repos \
    --enable="rhel-9-for-ppc64le-highavailability-e4s-rpms"

Then refresh the repository metadata and check the repository list again.

dnf clean all
dnf repolist

Installing the RHEL HA Add-On software packages

Install the required software packages.

On both nodes, run the following command.

dnf install -y pcs pacemaker fence-agents-ibm-powervs

Make sure that you install at least the following minimal version of the fence-agents-ibm-powervs package for your Red Hat Enterprise Linux release:

RHEL 8
fence-agents-ibm-powervs-4.2.1-121.el8
RHEL 9
fence-agents-ibm-powervs-4.10.0-55.el9

Configuring a RHEL HA Add-On cluster

Configuring firewall services

Add the high availability service to the RHEL firewall if firewalld.service is installed and enabled.
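If you are not sure whether firewalld is installed and active on your nodes, you can check it first, for example:

# Example: check whether the firewalld service is installed and running
systemctl status firewalld.service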

On both nodes, run the following commands.

firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload

Starting the PCS daemon

Start the PCS daemon that is used for controlling and configuring RHEL HA Add-On clusters through PCS.

On both nodes, run the following commands.

systemctl enable --now pcsd.service

Make sure that the PCS service is running:

systemctl status pcsd.service

Setting a password for hacluster user ID

Set the password for the hacluster user ID.

On both nodes, run the following command.

passwd hacluster
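The passwd command prompts you for the new password. In scripted setups, you can set the password non-interactively instead; the following sketch assumes the --stdin option of the RHEL passwd command and a placeholder password.

# Example only: set the hacluster password non-interactively
echo '<HACLUSTER_PASSWORD>' | passwd --stdin hacluster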

Authenticating the cluster nodes

Use the following command to authenticate user hacluster to the PCS daemon on the nodes in the cluster. The command prompts you for the password that you set in the previous step.

On NODE1, run the following command.

pcs host auth ${NODE1} ${NODE2} -u hacluster

If you get an error message similar to Error: Unable to communicate with {NODE2}, check whether any proxy variables are set in your environment (env | grep -i proxy). If they are, either unset these variables or define a no_proxy variable that excludes the cluster nodes:

export no_proxy=${NODE1},${NODE2},$no_proxy

Configuring and starting the cluster nodes

Create the cluster configuration file and synchronize the configuration to the specified nodes.

The --start option also starts the cluster service on the nodes.

On NODE1, run the following commands.

pcs cluster setup ${CLUSTERNAME} --start ${NODE1} ${NODE2}
pcs status

Creating the fencing device

STONITH is an acronym for "Shoot The Other Node In The Head" and protects your data from corruption in a split-brain situation.

You must enable STONITH (fencing) for a RHEL HA Add-On production cluster.

Fence agent fence_ibm_powervs is the only supported agent for a STONITH device on Power Virtual Server clusters.

The fence agent connects to the Power Cloud API by using the parameters APIKEY, IBMCLOUD_CRN, CLOUD_REGION, GUID, and the instance IDs POWERVSI_01 and POWERVSI_02. You can test the agent invocation by using the parameters that you gathered in the Gathering parameters for the cluster configuration section.

Identifying the virtual server instances for fencing

Use the list option of fence_ibm_powervs to identify or verify the instance IDs of the two cluster nodes:

On any node, run the following command.

fence_ibm_powervs \
    --token=${APIKEY} \
    --crn=${IBMCLOUD_CRN} \
    --instance=${GUID} \
    --region=${CLOUD_REGION} \
    --api-type=public \
    -o list

If the virtual server instances have access to only a private network, you must use the --api-type=private option, which also requires an extra --proxy option.

Example:

fence_ibm_powervs \
    --token=${APIKEY} \
    --crn=${IBMCLOUD_CRN} \
    --instance=${GUID} \
    --region=${CLOUD_REGION} \
    --api-type=private \
    --proxy=http://${PROXY_IP}:3128 \
    -o list

Continue by using --api-type=private in the following examples.

Checking the status of both virtual server instances

On both nodes, run the following commands.

time fence_ibm_powervs \
    --token=${APIKEY} \
    --crn=${IBMCLOUD_CRN} \
    --instance=${GUID} \
    --region=${CLOUD_REGION} \
    --plug=${POWERVSI_01} \
    --api-type=private \
    --proxy=http://${PROXY_IP}:3128 \
    -o status
time fence_ibm_powervs \
    --token=${APIKEY} \
    --crn=${IBMCLOUD_CRN} \
    --instance=${GUID} \
    --region=${CLOUD_REGION} \
    --plug=${POWERVSI_02} \
    --api-type=private \
    --proxy=http://${PROXY_IP}:3128 \
    -o status

The status action of the fence agent against a virtual server instance {pvm_instance_id} displays its power status.

On both nodes, the two commands must report Status: ON.

The output of the time command might be useful later when you choose timeouts for the STONITH device.

You can add the -v flag for verbose output, which shows more information about connecting to the Power Cloud API and querying virtual server power status.

Creating a stonith device

The following command shows the device-specific options for the fence_ibm_powervs fencing agent.

pcs stonith describe fence_ibm_powervs

Create the stonith device for both virtual server instances.

On NODE1, run the following command.

pcs stonith create res_fence_ibm_powervs fence_ibm_powervs \
    token=${APIKEY} \
    crn=${IBMCLOUD_CRN} \
    instance=${GUID} \
    region=${CLOUD_REGION} \
    api_type=private \
    proxy=http://${PROXY_IP}:3128 \
    pcmk_host_map="${NODE1}:${POWERVSI_01};${NODE2}:${POWERVSI_02}" \
    pcmk_reboot_timeout=600 \
    pcmk_monitor_timeout=600 \
    pcmk_status_timeout=60

Although the fence_ibm_powervs agent uses api-type as an option when started from the command line, the stonith resource needs to be created by using api_type.

Verify the configuration with the following commands.

pcs config
pcs status
pcs stonith config
pcs stonith status

Setting the stonith-action cluster property

To speed up failover times in an IBM Power Virtual Server cluster, you can change the cluster property stonith-action to off. When the cluster performs a fencing action, it triggers a power off operation instead of a reboot for the fenced instance.

After this change, you always need to log in to the IBM Cloud console and manually start an instance that was fenced by the cluster.

pcs property set stonith-action=off

Verify the change.

pcs config

Testing fencing operations

To test the STONITH configuration, you need to manually fence the nodes.

When fencing is manually triggered through pcs stonith fence, the stonith-action cluster attribute is not used and the node is restarted.

On NODE1, run the following commands.

pcs stonith fence ${NODE2}
pcs status

As a result, NODE2 restarts.

After NODE2 is running again, start the cluster on NODE2 and try to fence NODE1.

On NODE2, run the following commands.

pcs cluster start
pcs status
pcs stonith status
pcs stonith fence ${NODE1}

NODE1 restarts.

After the node is running, start the cluster on NODE1 again.

On NODE1, run the following commands.

pcs cluster start
pcs status
pcs stonith status