IBM Cloud Docs
Host failure recovery policies

If a host fails unexpectedly and cannot be recovered, the virtual server instances on the failed host are automatically restarted on a healthy host. You can change this policy so that an instance is not restarted, either when you create the instance or later on an existing instance.

IBM Cloud® continuously monitors your infrastructure to verify that all hosts are healthy and responsive, which helps prevent disruption to your workloads during unplanned host failures. A host issue is detected within 30 seconds of its occurrence. If a host failure is detected and the host cannot be immediately recovered, IBM Cloud® initiates the selected failure recovery policy within 5 minutes for all virtual servers on the affected host.

Available recovery policies

The default host failure policy of an instance is 'restart'. You can change the policy at any time without disrupting the instance. The policy takes effect only during recovery from a host failure event.

Table 1. Host failure recovery policies
Host failure policy   Policy description
restart               The instance is automatically restarted on another compute host
stop                  The instance will not be restarted on another compute host

Restart

When the host failure policy is set to restart and the host fails, the instance is relocated to another available host and restarted. The restarted instance uses the same boot volume and the same data volumes as the original instance. The restarted instance is assigned the same floating IP, static IPs, and dynamic IP addresses on the new host.

Stop

When the host failure policy is set to stop and the host fails, the instance is not restarted. The status of the instance changes to stopped. You can restart the stopped instance by issuing an instance start action. When restarted, the instance is placed on an available host.
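The two policies described above can be summarized as a small decision function. This is an illustrative sketch only, not IBM Cloud code: the `Outcome` class and the `outcome_after_host_failure` function are hypothetical names, and using `running` as the status after a successful restart is a simplifying assumption.

```python
from dataclasses import dataclass

# The two policy values documented in Table 1.
VALID_POLICIES = ("restart", "stop")


@dataclass
class Outcome:
    """Hypothetical summary of what happens to one instance after a host failure."""
    status: str      # instance status once the recovery policy has been applied
    relocated: bool  # True if the platform moved the instance to another host


def outcome_after_host_failure(policy: str = "restart") -> Outcome:
    """Return the documented result of a host failure for the given policy.

    The default mirrors the documented default policy, 'restart'.
    """
    if policy not in VALID_POLICIES:
        raise ValueError(f"policy must be one of {VALID_POLICIES}, got {policy!r}")
    if policy == "restart":
        # Relocated to another available host and restarted (status assumed 'running').
        return Outcome(status="running", relocated=True)
    # 'stop': the instance is not restarted and its status changes to stopped.
    return Outcome(status="stopped", relocated=False)
```

With the stop policy, the instance stays stopped until a user issues an instance start action, at which point it is placed on an available host.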

Setting the recovery policy by using the UI

  • To set the failure recovery policy when you provision an instance, find Advanced options on the provisioning page. In Advanced options, locate 'Host failure auto restart' and toggle it on or off.
  • To set the failure recovery policy for an existing instance, complete the following steps.

    1. In the IBM Cloud console, click the Navigation menu icon > VPC Infrastructure > Compute > Virtual server instances.
    2. On the Virtual server instances page, click the Actions icon for the instance that you want to manage.
    3. From the instance details page, locate 'Host failure auto restart'. Click the pencil icon and choose Enabled or Disabled to turn the host recovery policy on or off.

For more information, see Creating virtual server instances by using the UI and Managing virtual server instances.

Setting the recovery policy by using the CLI

When you create or update an instance, use the following attributes to set the host failure policy.

For more information, see Creating virtual server instances by using the CLI and Managing virtual server instances.

Table 2. Recovery policy attributes
Host failure policy   Policy attribute
restart               '--host-failure-policy restart'
stop                  '--host-failure-policy stop'
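The attributes in Table 2 lend themselves to scripting. The following is a minimal sketch, assuming Python 3.8 or later: the helper name `instance_update_command` is hypothetical, and it only assembles the argument list for the ibmcloud is instance-update command described in this topic. It does not run the command or contact IBM Cloud.

```python
import shlex
from typing import List

# The two values accepted by --host-failure-policy, per Table 2.
VALID_POLICIES = ("restart", "stop")


def instance_update_command(instance_id: str, policy: str) -> List[str]:
    """Build the argv for updating an instance's host failure policy (hypothetical helper)."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"policy must be one of {VALID_POLICIES}, got {policy!r}")
    return [
        "ibmcloud", "is", "instance-update", instance_id,
        "--host-failure-policy", policy,
    ]


# Example with the instance ID used elsewhere in this topic.
cmd = instance_update_command("7284_683902df-85ce-4546-808c-3675247074d8", "stop")
print(shlex.join(cmd))
# ibmcloud is instance-update 7284_683902df-85ce-4546-808c-3675247074d8 --host-failure-policy stop
```

Validating the policy value before building the command catches a mistyped value such as start before the CLI is invoked. The resulting list could be passed to subprocess.run on a machine where the ibmcloud CLI is installed and logged in.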

Create an instance with host failure policy

You can create an instance in your IBM Cloud VPC and set the availability policy on host failure by using the command-line interface (CLI). Run the ibmcloud is instance-create command and set the --host-failure-policy option to restart or stop. If you omit the option, the host failure policy defaults to restart. The following example uses inc, the short alias for instance-create.

ibmcloud is inc test r006-a0162c41-6a75-4a04-aabb-da1c78539531 us-south-2 bx2-2x8 7284-47efd8c6-0efc-462e-89c0-e0457119f90b --image r006-63363662-a4ee-4ba4-a6c4-92e6c78c6b58 --host-failure-policy stop
Creating instance test under account VPC1 as user myuser@mycompany.com...

ID                                    7284_683902df-85ce-4546-808c-3675247074d8
Name                                  test
CRN                                   crn:v1:bluemix:public:is:us-south-2:a/a1234567::instance:7284_683902df-85ce-4546-808c-3675247074d8
Status                                pending
Availability policy on host failure   stop
Startable                             true
Profile                               bx2-2x8
Architecture                          amd64
vCPU Manufacturer                     Intel
vCPUs                                 2
Memory(GiB)                           8
Bandwidth(Mbps)                       4000
Image                                 ID                                          Name
                                      r006-63363662-a4ee-4ba4-a6c4-92e6c78c6b58   ibm-centos-7-9-minimal-amd64-3

VPC                                   ID                                          Name
                                      r006-a0162c41-6a75-4a04-aabb-da1c78539531   cli-vpc-1

Zone                                  us-south-2
Resource group                        ID                                 Name
                                      11caaa983d9c4beb82690daab08717e9   Default

Created                               2021-10-25T16:39:30+05:30
Boot volume                           ID   Name           Attachment ID                               Attachment name
                                      -    PROVISIONING   7284-69923add-65e2-4b93-bee4-a4bca3836696   collector-reverb-exiting-swinging

Update an instance with host failure policy

You can update an instance in your IBM Cloud VPC and change the availability policy on host failure by using the command-line interface (CLI). Run the ibmcloud is instance-update command and set the --host-failure-policy option to restart or stop. The following example uses inu, the short alias for instance-update.

ibmcloud is inu 7284_683902df-85ce-4546-808c-3675247074d8 --host-failure-policy restart
Updating instance 7284_683902df-85ce-4546-808c-3675247074d8 under account VPC1 as user myuser@mycompany.com...

ID                                    7284_683902df-85ce-4546-808c-3675247074d8
Name                                  test
CRN                                   crn:v1:bluemix:public:is:us-south-2:a/a1234567::instance:7284_683902df-85ce-4546-808c-3675247074d8
Status                                running
Availability policy on host failure   restart
Startable                             true
Profile                               bx2-2x8
Architecture                          amd64
vCPU Manufacturer                     Intel
vCPUs                                 2
Memory(GiB)                           8
Bandwidth(Mbps)                       4000
Image                                 ID                                          Name
                                      r006-63363662-a4ee-4ba4-a6c4-92e6c78c6b58   ibm-centos-7-9-minimal-amd64-3

VPC                                   ID                                          Name
                                      r006-a0162c41-6a75-4a04-aabb-da1c78539531   cli-vpc-1

Zone                                  us-south-2
Resource group                        ID                                 Name
                                      11caaa983d9c4beb82690daab08717e9   Default

Created                               2021-10-25T16:39:30+05:30
Boot volume                           ID                                          Name                            Attachment ID                               Attachment name
                                      r006-780e6d41-b8c0-4023-b81f-2dcabf0b834f   aardvark-matrix-tidy-fragment   7284-69923add-65e2-4b93-bee4-a4bca3836696   collector-reverb-exiting-swinging

Setting the recovery policy by using the API

When you create or update a virtual server instance, use the host_failure sub-property of availability_policy to set the host failure policy. Specify restart or stop to set the action that is taken if the compute host fails.

Table 3. Recovery policy API
Host failure policy   Attribute value
restart               'restart'
stop                  'stop'

For more information, see Create an instance and Managing virtual server instances.
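As an illustration of the API property described in this section, the following sketch builds the JSON fragment that carries the policy. It assumes that host_failure is nested under availability_policy in the request body, as the description above suggests. The helper name `availability_policy_body` is hypothetical, and the endpoint, authentication, and other required request fields are deliberately omitted; take those from the API reference.

```python
import json

# The two values listed in the recovery policy API table.
VALID_POLICIES = ("restart", "stop")


def availability_policy_body(policy: str) -> dict:
    """Return the request-body fragment that sets the host failure policy (hypothetical helper)."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"policy must be one of {VALID_POLICIES}, got {policy!r}")
    # Assumed nesting: availability_policy -> host_failure.
    return {"availability_policy": {"host_failure": policy}}


print(json.dumps(availability_policy_body("stop")))
# {"availability_policy": {"host_failure": "stop"}}
```

For an update, this fragment alone would be the body of the request. For a create request, it would be merged into the rest of the instance definition.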

Next steps

For more information about planned and unplanned host outages, see the FAQ 'In what cases is my virtual server migrated to a different host?'