Creating a routing configuration resilient to a regional disaster

IBM Cloud Metrics Routing is a highly available, multi-tenant, regional service. However, you can also configure routing to a backup target in a second region to mitigate data loss if a regional disaster occurs.

For more information about IBM Cloud Metrics Routing availability and recovery that is provided by the service, see Understanding high availability and disaster recovery.

Understanding targets, routes, and rules

Before you configure routing to a backup region, you need to understand targets, routes, rules, and account settings.

Targets are created within a region but are global resources. For more information, see Managing targets.
Routes are global under an account and are evaluated in all regions where IBM Cloud Metrics Routing is deployed. For more information, see Managing routes.
Rules specify what metrics are routed in a region and where to route the metrics. For more information, see Defining routing rules.
The account settings configuration defines information such as default targets where metrics are collected in the account, types of endpoints that are allowed to manage the configuration, configuration metadata locations, and allowed locations to store the data in the account. For more information, see Configuring account settings.

If both the primary metadata region and the backup metadata region configured in the account settings are unavailable, no metrics will be routed.

Routing to a backup target in a different region

You can configure a backup target, running in a different region from your primary target, for the data that is routed by IBM Cloud Metrics Routing. You can then route all data to both your primary and backup targets so that the two targets stay in sync. If a regional disaster occurs, you can switch to the backup with no downtime and minimal data loss.

Creating a second target for backup purposes results in extra charges for running the backup target instance.

Example of a routing configuration that creates a backup of all metrics to a second target in a different region

In this example, the source of the metrics is in the Toronto region (ca-tor). Metrics from the IBM Cloud service are sent by IBM Cloud Metrics Routing to an IBM Cloud Monitoring instance (Target 1) in the Dallas region (us-south). A disaster-resilient routing configuration also routes the metrics to a second IBM Cloud Monitoring instance (Target 2) in the Washington region (us-east). All metrics are sent to both the target in the Dallas region (us-south) and the target in the Washington region (us-east).

Target 2 provides the user with historical metrics in the Washington region (us-east). If the Dallas region (us-south) is not available, users have Toronto (ca-tor) metrics available in the Washington region (us-east).

For users without a disaster-resilient routing configuration, no historical metrics are available in a second region.
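
The failover in this example can be pictured with a small sketch. It is illustrative only and is not part of IBM Cloud Metrics Routing: the function name, the target labels, and the idea of passing in a set of available regions are all assumptions made for this example. It shows only that, with two targets in sync, a reader of the Toronto metrics can fall back to the Washington target when Dallas is unavailable.

```python
from typing import Optional, Set

# Regions come from the example above; the target labels are hypothetical.
TARGETS_BY_PRIORITY = [
    ("us-south", "Target 1 (primary, Dallas)"),
    ("us-east", "Target 2 (backup, Washington)"),
]


def pick_read_target(available_regions: Set[str]) -> Optional[str]:
    """Return the first target whose region is available, or None."""
    for region, target in TARGETS_BY_PRIORITY:
        if region in available_regions:
            return target
    return None


# Dallas is down, Washington is up: read the ca-tor metrics from Target 2.
print(pick_read_target({"us-east"}))
```

Without a backup target, the list would have a single entry and the function would return None as soon as the Dallas region became unavailable.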

For more information about configuring routes, see Managing routes. When you configure the routing rules for your routes, make sure that the rules route the same metrics to both target locations so that each location has the same data.
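
One way to check this requirement before you apply a route is sketched below. The dictionary layout (a route with a list of rules, each with a list of targets identified by an id), the route name, and the target IDs are assumptions made for illustration; see Managing routes for the actual route definition format. The helper reports every rule that does not send to both targets.

```python
def rules_missing_target(rules, required_target_ids):
    """Return the indexes of rules that do not send to every required target."""
    missing = []
    for index, rule in enumerate(rules):
        rule_targets = {target["id"] for target in rule.get("targets", [])}
        # A rule is complete only if it includes all required target IDs.
        if not required_target_ids <= rule_targets:
            missing.append(index)
    return missing


# Hypothetical route: the second rule forgets the backup target.
route = {
    "name": "ca-tor-dr-route",
    "rules": [
        {"targets": [{"id": "primary-us-south"}, {"id": "backup-us-east"}]},
        {"targets": [{"id": "primary-us-south"}]},
    ],
}

required = {"primary-us-south", "backup-us-east"}
print(rules_missing_target(route["rules"], required))
```

Any rule that the helper reports would leave the backup location with a subset of the metrics, so the two targets would no longer hold the same data.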

In addition, you must define a backup metadata region in your account settings. The backup metadata region must be different from your primary metadata region.
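
The constraint on the two metadata regions can be expressed as a simple pre-check. This is a sketch under assumptions: the settings are modeled as a plain dictionary, and the key names primary_metadata_region and backup_metadata_region are used here for illustration and should be confirmed against Configuring account settings.

```python
def validate_metadata_regions(settings):
    """Return a list of problems with the metadata region settings."""
    problems = []
    # Key names are assumed for this sketch.
    primary = settings.get("primary_metadata_region")
    backup = settings.get("backup_metadata_region")
    if not primary:
        problems.append("primary metadata region is not set")
    if not backup:
        problems.append("backup metadata region is not set")
    elif backup == primary:
        problems.append("backup metadata region must differ from the primary")
    return problems


# Hypothetical settings that reuse the same region for both roles.
print(validate_metadata_regions({
    "primary_metadata_region": "us-south",
    "backup_metadata_region": "us-south",
}))
```

An empty list means that the settings satisfy the rule stated above; it does not confirm that either region is currently available.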

Security considerations in an environment with two targets

When you configure an environment with a backup target, you need to consider the following:

Context-based restrictions give account owners and administrators the ability to define and enforce access restrictions for IBM Cloud resources based on a rule's criteria. The criteria include the network location of access requests, the endpoint type from where the request is sent, and sometimes the API that the request tries to access. These restrictions work with traditional IAM policies, which are based on identity, to provide an extra layer of protection. For more information, see What are context-based restrictions?

If context-based restriction rules are configured in the account, make sure that they are defined for both the primary and backup locations.

You can configure context-based restriction rules for IBM Cloud Monitoring targets.

For a full list of services supporting context-based restrictions, see Services integrated with context-based restrictions.
IBM Cloud® Identity and Access Management (IAM) enables you to securely control access to all cloud resources consistently in the IBM Cloud. The IAM permissions and authorizations must allow the service to route metrics to both the primary and backup targets.

Automatic disaster management

You can choose to allow IBM Cloud Metrics Routing to handle a regional disaster as described in Understanding high availability and disaster recovery.

In this case, you are not charged for a second target instance. However, you accept the following risks:

No access is available to any historical data from the region that incurred the disaster.
Data is lost during the time that the existing instance is not available and you are configuring a new instance.
Any metrics that are routed to an IBM Cloud Monitoring target and then streamed to an IBM® Event Streams for IBM Cloud® instance are retained only up to the buffer size for 24 hours. After that, data can be lost. For more information, see Understanding your responsibilities when you use Event Streams.