Event Notifications high availability and disaster recovery

High availability (HA) is the ability of IT services to withstand all outages and continue providing processing capability according to some predefined service level. Covered outages include both planned events, such as maintenance and backups, and unplanned events, such as software failures, hardware failures, power failures, and disasters. HA is a core discipline in an IT infrastructure to keep your apps up and running, even after a partial or full site failure. The main purpose of high availability is to eliminate potential points of failure in an IT infrastructure.

IBM Event Notifications is a highly available, multi-tenant, regional service.

Service High Availability (HA)

An availability zone is a logically and physically isolated location within an IBM Cloud region where your data is processed and hosted.

  • An availability zone has independent power, cooling, and network infrastructures that are isolated from other zones to strengthen fault tolerance by avoiding single points of failure between zones.
  • An availability zone offers high bandwidth and low inter-zone latency within a region.

A region (location) is a geographically and physically separate group of one or more availability zones with independent electrical and network infrastructures that are isolated from other regions.

  • Regions are designed to remove shared single points of failure with other regions and provide low inter-zone latency within the region.
  • Each region has three different data centers (DC) for redundancy.
  • In each supported region, traffic is load balanced across infrastructure in multiple availability zones, with no single point of failure.
  • If all the data centers in a region fail, Event Notifications becomes unavailable in that region.
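The value of spreading a service across several zones can be illustrated with a short calculation. The following is a minimal sketch, not a statement about actual IBM Cloud failure rates: it assumes that zones fail independently of each other, and the per-zone failure probability used in the example is a hypothetical figure chosen only for illustration.

```python
def region_outage_probability(zone_failure_probability: float, zones: int = 3) -> float:
    """Return the chance that every zone in a region is down at the same time.

    Assumption: zone failures are statistically independent, so the
    probabilities multiply. Real outages can be correlated.
    """
    if not 0.0 <= zone_failure_probability <= 1.0:
        raise ValueError("zone_failure_probability must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    return zone_failure_probability ** zones


# Hypothetical figure: each zone is unavailable 0.1% of the time.
single_zone = region_outage_probability(0.001, zones=1)
three_zones = region_outage_probability(0.001, zones=3)
print(f"one zone: {single_zone:.3e}, three zones: {three_zones:.3e}")
```

Under these assumptions, a region is unavailable only when all of its zones fail together, which is why the service remains reachable after the loss of a single zone but not after the loss of the whole region.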

Availability zones for Event Notifications

IBM Cloud Event Notifications service is a highly available, regional service. In each supported region, the service exists in multiple availability zones with no single point of failure.

The following table lists the high-availability (HA) status for the regions (locations) where the IBM Cloud Event Notifications service is available:

HA status for the regions

Geography        Region                HA status
Asia-Pacific     Sydney (au-syd)       MZR
Europe           London (eu-gb)        MZR
Europe           Frankfurt (eu-de)     MZR
Europe           Madrid (eu-es)        MZR
North America    Dallas (us-south)     MZR

Where:

  • A geography is a geographic area or larger political body that contains one or more regions.
  • A region is a defined geographic territory.
    • A region might be a specific postal code area, a town, a city, a state, a group of states, or even a group of countries.
    • A region contains multiple availability zones to meet local access, low latency, and security requirements for the region.
  • MZR means multi-zone region.

Locations

For more information about service availability within regions and data centers, see Service and infrastructure availability by location.

Responsibilities

For more information about your responsibilities when using Event Notifications, see Shared responsibilities.

What level of availability do I need?

You can achieve high availability on different levels in your IT infrastructure and within different components of your cluster. The level of availability that is correct for you depends on several factors, such as your business requirements, the service level agreements (SLAs) that you have with your customers, and the resources that you want to expend.

What level of availability does IBM Cloud offer?

The level of availability that you set up for your cluster impacts your coverage under the IBM Cloud high availability service level agreement terms.

Service level objectives (SLOs) describe the design points that the IBM Cloud services are engineered to meet. Event Notifications is designed to achieve the following availability target.

SLO for Event Notifications

Availability target    Target value
Availability %         99.99%

The SLO is not a warranty, and IBM does not issue credits for failure to meet an objective. Refer to the SLAs for commitments and for credits that are issued for failure to meet any committed SLA. For a summary of all SLOs, see IBM Cloud service level objectives.
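To put the 99.99% availability target in concrete terms, it can help to convert it into a downtime budget. The following is a minimal sketch of that arithmetic; the function name and the choice of a 30-day month and a 365-day year are illustrative assumptions, not part of any IBM definition of how the SLO is measured.

```python
from datetime import timedelta


def downtime_budget(availability_percent: float, period: timedelta) -> timedelta:
    """Return the downtime that an availability target allows over a period."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return timedelta(seconds=period.total_seconds() * unavailable_fraction)


SLO_PERCENT = 99.99  # availability target from the SLO table

# Illustrative measurement windows (assumptions, not an IBM definition).
for label, period in (
    ("30-day month", timedelta(days=30)),
    ("365-day year", timedelta(days=365)),
):
    budget = downtime_budget(SLO_PERCENT, period)
    print(f"{label}: about {budget.total_seconds() / 60:.1f} minutes")
```

With these windows, the target corresponds to roughly 4.3 minutes of downtime in a 30-day month and roughly 52.6 minutes in a year.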

Disaster recovery (DR) for Event Notifications service in a region

Event Notifications is a regional service and does not offer automatic cross-regional failover or cross-regional disaster recovery. If all of the availability zones in a region fail, Event Notifications becomes unavailable in that region.

If an entire MZR becomes inoperative (usually due to a catastrophic disaster or failure), IBM runs disaster-recovery plans to restore service within 24 hours. IBM restores the service in an alternative MZR from an IBM-managed backup. Existing DNS names are migrated to the backup deployment. When the disaster recovery process is complete, API traffic resumes automatically.

When the primary MZR is restored, the secondary deployment is migrated back to the primary site. After the migration is complete, the DNS is restored to its original routing.

If you need zero downtime during a regional disaster recovery, create and maintain backup instances in other regions. To synchronize a service instance in one region with an instance in a different region, you can use the Event Notifications APIs.
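One way to reason about keeping a backup instance in step with a primary instance is to compare their configurations and derive the list of changes to apply. The following is a minimal sketch of that comparison only. It assumes you have already retrieved the configuration of each instance (topics are used as the example here) through the service APIs. The function `plan_sync`, the `SyncPlan` type, and the dictionary shapes are hypothetical and are not part of any IBM SDK; retrieving the data and applying the changes are not shown.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    """Names of items to create, update, or delete in the backup instance."""
    to_create: List[str]
    to_update: List[str]
    to_delete: List[str]


def plan_sync(primary: Dict[str, dict], backup: Dict[str, dict]) -> SyncPlan:
    """Compare two name-to-settings mappings and plan the backup changes."""
    to_create = sorted(name for name in primary if name not in backup)
    to_update = sorted(
        name for name in primary
        if name in backup and primary[name] != backup[name]
    )
    to_delete = sorted(name for name in backup if name not in primary)
    return SyncPlan(to_create, to_update, to_delete)


# Hypothetical configuration snapshots, keyed by topic name.
primary_topics = {
    "billing-alerts": {"description": "Billing threshold events"},
    "security-findings": {"description": "High-severity findings"},
}
backup_topics = {
    "security-findings": {"description": "Findings"},
    "old-topic": {"description": "No longer used"},
}

plan = plan_sync(primary_topics, backup_topics)
print(plan)
```

Running a comparison like this on a schedule, and applying the resulting plan through the APIs, is one possible approach to keeping the backup instance ready to take traffic. Whether deletions should be propagated automatically is a design choice that depends on your recovery requirements.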