Understanding high availability for Transit Gateway

High availability (HA) is the ability of a service or workload to withstand failures and continue providing processing capability according to some predefined service level. For services, availability is defined in the Service Level Agreement. Availability includes both planned and unplanned events, such as maintenance, failures, and disasters. HA is a core discipline in an IT infrastructure to keep your apps up and running, even after a partial or full site failure. The main purpose of high availability is to eliminate potential points of failure in an IT infrastructure.

IBM Cloud® Transit Gateway is highly available within any IBM Cloud region (for example, Dallas or Washington, DC). However, recovering from disasters that affect an entire region requires planning and preparation.

You are responsible for understanding your configuration, customization, and usage of the service. You are also responsible for being ready to recreate an instance of the service and restore your data in a new region.

IBM Cloud supports high availability with no single point of failure. The service achieves high availability automatically and transparently through the multi-zone region (MZR) feature provided by IBM Cloud.

When you create a transit gateway instance in a particular region, the system automatically enables multiple zones, which do not share a single point of failure.

Transit gateway GRE connections require the gateway owner to configure HA to meet their specific needs. A GRE connection is a point-to-point connection with no built-in redundancy, so a single GRE connection is a single point of failure. When you configure a GRE connection on a transit gateway, you must specify the availability zone. For a more robust solution, either use a Redundant GRE, which requires you to configure at least 2 tunnels, or configure multiple GRE connections that use different availability zones.
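The guidance to spread GRE connections across different availability zones can be checked against an inventory of your connections. The following Python sketch is illustrative only: the `GreConnection` class, its fields, and the connection names are hypothetical and do not reflect the Transit Gateway API schema, and the zone names are example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GreConnection:
    """Hypothetical, simplified record of a GRE connection (not the service's API schema)."""
    name: str
    zone: str  # availability zone specified when the connection was configured


def zones_used(connections):
    """Return the set of distinct availability zones covered by the connections."""
    return {c.zone for c in connections}


def is_zone_redundant(connections, min_zones=2):
    """Return True when the connections span at least `min_zones` distinct zones."""
    return len(zones_used(connections)) >= min_zones


# Example inventories (names and zones are assumptions for illustration).
single = [GreConnection("gre-a", "us-south-1")]
same_zone = [GreConnection("gre-a", "us-south-1"), GreConnection("gre-b", "us-south-1")]
spread = [GreConnection("gre-a", "us-south-1"), GreConnection("gre-b", "us-south-2")]

print(is_zone_redundant(single))     # False: one connection is a single point of failure
print(is_zone_redundant(same_zone))  # False: two connections, but both in the same zone
print(is_zone_redundant(spread))     # True: connections in two different zones
```

Counting distinct zones rather than connections reflects the point that two GRE connections in the same availability zone still share that zone as a point of failure.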

See How IBM Cloud ensures high availability and disaster recovery to learn more about the high availability and disaster recovery standards in IBM Cloud. You can also find information about Service Level Agreements.

Responsibilities

To learn how responsibilities for using Transit Gateway are divided between IBM and the customer, see Understanding your responsibilities when using Transit Gateway.

What level of availability do I need?

You can achieve high availability on different levels in your IT infrastructure and within different components of your network. The level of availability that is right for you depends on several factors, such as your business requirements, the service level agreements (SLAs) that you have with your customers, and the resources that you want to expend.

What level of availability does IBM Cloud offer?

The level of availability that you set up for your network configuration impacts your coverage under the IBM Cloud high availability service level agreement terms.

Service level objectives (SLOs) describe the design points that the IBM Cloud services are engineered to meet. Transit Gateway is designed to achieve the following availability target.

SLO for Transit Gateway
Availability target	Target value
Availability %	99.999%

The SLO is not a warranty and IBM will not issue credits for failure to meet an objective. Refer to the SLAs for commitments and credits that are issued for failure to meet any committed SLAs. For a summary of all SLOs, see IBM Cloud service level objectives.
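To put the availability target in concrete terms, the percentage can be converted into a downtime budget. The following Python sketch is plain arithmetic, not an IBM-provided calculation; the function name and the 365-day and 30-day periods are assumptions chosen for illustration. At 99.999%, the budget works out to roughly 5.26 minutes per 365-day year, or about 26 seconds per 30-day month.

```python
def downtime_budget_minutes(availability_percent, period_days):
    """Return the minutes of downtime a period allows at a given availability target."""
    period_minutes = period_days * 24 * 60
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return period_minutes * unavailable_fraction


# Periods are illustrative assumptions: a 365-day year and a 30-day month.
per_year = downtime_budget_minutes(99.999, 365)
per_month = downtime_budget_minutes(99.999, 30)

print(f"{per_year:.2f} minutes per 365-day year")
print(f"{per_month * 60:.1f} seconds per 30-day month")
```

Because the SLO is a design target and not a commitment, treat these figures as a planning aid rather than a guaranteed limit.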

Locations

For more information about service availability within regions and data centers, see Service and infrastructure availability by location.