Resiliency Design for VPC virtual servers

See the Resiliency in IBM Cloud Solution Guide, which is a general guide on resiliency in IBM Cloud. That guide takes the perspective of IBM clients, their solution planners, architects, and builders, and the resilient solutions that they create on the IBM Cloud platform. This guide focuses on information specific to VPC virtual server instances (VSIs).

The key backup and restore architecture elements are shown in the following diagram.

IBM Cloud VPC VSI Backup and Restore

IBM Cloud VPC Block Storage Snapshots

IBM Cloud VPC Block Storage Snapshots provide point-in-time copies of Block Storage volumes attached to virtual server instances. Snapshots are stored regionally in IBM Cloud Object Storage and can be used for data protection, disaster recovery, and creating new volumes from a known good state.

The following table lists the key capabilities of IBM Cloud VPC Block Storage Snapshots.

IBM Cloud VPC Block Storage Snapshots features
Feature Description
Fast snapshot creation Snapshots are created quickly using copy-on-write technology without impacting volume performance
Space-efficient storage Only changed blocks are stored after the initial full snapshot, minimizing storage costs
Regional availability Snapshots are stored within the same region as the source volume and can be used to create volumes in any zone within that region
Bootable volume support Snapshots of boot volumes can be used to create new virtual server instances with identical configurations
Consistency groups Create crash-consistent snapshots across multiple volumes attached to the same instance (available for certain configurations)
Fast restore snapshot clones By keeping a clone of the data in a zone within your VPC region and not in a separate regional storage repository, this feature can achieve a recovery time objective (RTO) quicker than restoring from a regular snapshot
Cross-regional snapshot copies Copies a snapshot from one region to another region. This feature can be used in disaster recovery scenarios when you need to start your virtual server instance and data volumes in a different region. The snapshot is created as normal, and when the snapshot is stable, a copy of the snapshot is created in the regional storage repository in the target region. When the snapshot copy in the remote region is stable, you can use and manage it independently from the parent volume or the original snapshot. Creating the copy in the remote region takes time. For example, creating a full snapshot of a 3 TB volume in a remote region can take up to 12.5 hours.
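
The 3 TB example in the table implies an effective copy rate of roughly 0.24 TB per hour. As a rough planning aid, the following Python sketch extrapolates that reference point to other volume sizes. It assumes that copy time scales linearly with volume size, which this guide does not state. The function name is illustrative, and actual copy times depend on factors such as data change rate and network conditions.

```python
# Reference point from the feature table: a full snapshot copy of a
# 3 TB volume to a remote region can take up to 12.5 hours.
REFERENCE_SIZE_TB = 3.0
REFERENCE_HOURS = 12.5


def estimate_copy_hours(volume_size_tb: float) -> float:
    """Rough upper-bound estimate, ASSUMING time scales linearly with size."""
    if volume_size_tb < 0:
        raise ValueError("volume size must be non-negative")
    hours_per_tb = REFERENCE_HOURS / REFERENCE_SIZE_TB
    return volume_size_tb * hours_per_tb


if __name__ == "__main__":
    for size_tb in (0.5, 1.5, 3.0):
        hours = estimate_copy_hours(size_tb)
        print(f"{size_tb} TB -> up to about {hours:.1f} hours")
```

Treat the result as an input to recovery point planning, not as a guaranteed duration.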

The use cases applicable to these features include the following.

  • Pre-change backups before system updates or configuration changes
  • Creating golden images for rapid virtual server deployment
  • Cross-region disaster recovery using snapshot copies
  • Development and test environment provisioning from production snapshots
  • Multi-volume crash-consistent backups using consistency groups

IBM Cloud VPC Block Storage Snapshots Limitations

  • Cumulative size of all snapshots for a volume cannot exceed 10 TB
  • Creating crash-consistent snapshots of multiple volumes leads to short-lived I/O suspension that can last from a few milliseconds to a few seconds, depending on the size and quantity of volumes
  • No application-aware quiescing (snapshots are crash-consistent)
  • Individual snapshot management (not policy-driven without Backup for VPC service)
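
The 10 TB cumulative limit is simple to check before you take another snapshot. The following Python sketch shows one way to do that. The function names and the example sizes are hypothetical, and the sketch assumes that you already know the size of each existing snapshot for the volume.

```python
# Limit stated in the list above: cumulative snapshot size per volume.
MAX_CUMULATIVE_SNAPSHOT_TB = 10.0


def remaining_capacity_tb(snapshot_sizes_tb):
    """Return the snapshot capacity left for one volume, never below zero."""
    used_tb = sum(snapshot_sizes_tb)
    return max(0.0, MAX_CUMULATIVE_SNAPSHOT_TB - used_tb)


def fits_within_limit(snapshot_sizes_tb, next_snapshot_tb):
    """Check whether one more snapshot of the given size stays within the limit."""
    total_tb = sum(snapshot_sizes_tb) + next_snapshot_tb
    return total_tb <= MAX_CUMULATIVE_SNAPSHOT_TB


if __name__ == "__main__":
    existing = [4.0, 2.5, 1.5]  # hypothetical snapshot sizes in TB
    print(remaining_capacity_tb(existing))
    print(fits_within_limit(existing, 1.0))
    print(fits_within_limit(existing, 2.5))
```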

For more information, see About Block Storage for VPC snapshots.

IBM Cloud Backup for VPC

IBM Cloud Backup for VPC provides a policy-driven approach to snapshot lifecycle management, allowing automated backup of VPC Block Storage volumes with configurable schedules and retention policies.

The following table details each backup policy component for IBM Cloud Backup for VPC.

IBM Cloud Backup for VPC backup policy components
Backup policy component Description
Backup plan Defines the cron-based schedule and retention rules for backups
Backup policy Container for one or more backup plans with target resource selection via user tags
Backup jobs Automated execution of snapshot operations based on defined schedules
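
To illustrate how tag-based target selection works conceptually, the following Python sketch matches the tags in a backup policy against user tags on volumes. It is a local model only and does not call any IBM Cloud API. The volume names and tags are hypothetical, and the sketch assumes that a single matching tag is enough to select a volume.

```python
def volumes_targeted_by(policy_tags, volumes):
    """Return the names of volumes that carry at least one policy tag.

    ASSUMPTION: one matching tag is enough to select a volume.
    `volumes` maps a volume name to its list of user tags.
    """
    wanted = set(policy_tags)
    return [name for name, tags in volumes.items() if wanted & set(tags)]


if __name__ == "__main__":
    # Hypothetical volume names and user tags.
    volumes = {
        "db-data": ["env:prod", "backup:daily"],
        "web-boot": ["env:prod"],
        "scratch": ["env:dev"],
    }
    print(volumes_targeted_by(["backup:daily"], volumes))
    print(volumes_targeted_by(["env:prod"], volumes))
```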

The following table lists the key capabilities of IBM Cloud Backup for VPC.

IBM Cloud Backup for VPC features
Feature Description
Backup policies Create backup policies with up to four plans to automate backups on daily, weekly, monthly, or yearly schedules
Automated retention management Configure retention of backups based on age or total count, with automatic deletion of expired backups
Tag-based automation Target Block Storage volumes for backup using tags configured in backup policy that match user-provided tags on volumes
Consistency group support Automate multi-volume snapshot consistency groups for crash-consistent backups across multiple volumes
Cross-region snapshot copies Integrate with cross-region snapshot copy feature for geographic disaster recovery
Centralized management Manage backup policies and monitor backup status through IBM Cloud console, CLI, API, or Terraform
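
Retention by age or by total count can be modeled in a few lines. The following Python sketch selects the backups that such rules would expire. It is an illustration, not the service's implementation. The function name is hypothetical, and when both rules are supplied the sketch expires a backup if either rule applies, which is an assumption.

```python
from datetime import datetime, timedelta, timezone


def select_expired(created_times, now, max_age_days=None, max_count=None):
    """Return the creation times of backups that the given rules would expire.

    ASSUMPTION: when both rules are set, a backup expires if either applies.
    """
    newest_first = sorted(created_times, reverse=True)
    expired = set()
    if max_age_days is not None:
        cutoff = now - timedelta(days=max_age_days)
        expired.update(t for t in newest_first if t < cutoff)
    if max_count is not None:
        # Keep the newest `max_count` backups and expire the rest.
        expired.update(newest_first[max_count:])
    return sorted(expired)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    backups = [now - timedelta(days=d) for d in (1, 2, 3, 8)]
    print(select_expired(backups, now, max_age_days=7))
    print(select_expired(backups, now, max_count=2))
```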

The use cases applicable to these features include the following.

  • Automated daily/weekly/monthly backups for production virtual server instances
  • Compliance and regulatory data retention requirements
  • Operational recovery from accidental deletion or corruption
  • Multi-volume application backups using consistency groups
  • Geographic disaster recovery with cross-region snapshot copies

The following table compares manual snapshots with IBM Cloud Backup for VPC.

Comparison with Manual Snapshots
Feature Manual Snapshots Backup for VPC
Automation Manual or external scripting Policy-based automation
Scheduling External orchestration required Built-in cron-based scheduling
Retention management Manual deletion Automatic retention policy enforcement
Volume targeting Direct volume selection Tag-based automatic targeting
Use case Ad-hoc backups, golden images Ongoing operational backups
Management Per-snapshot management Policy-driven centralized management

The following list describes best practices for IBM Cloud Backup for VPC.

  • Schedule automated backup policies during off-peak hours to minimize performance impact from I/O suspension
  • Use a consistent tagging strategy across volumes to simplify backup policy application
  • Combine Backup for VPC with cross-region snapshot copies for comprehensive disaster recovery
  • Monitor backup job status and configure alerts for failed backup operations
  • Test restore procedures regularly to validate recovery time objectives (RTO)
  • Consider combining with IBM Cloud Backup and Recovery for application-aware backups and file-level recovery

For more information, see About Backup for VPC.

IBM Cloud Backup and Recovery

IBM Cloud Backup and Recovery is a provider-managed backup service for file, folder, and database servers (MS SQL Server and SAP HANA) in VPC environments running on IBM Cloud. With this service, you define backup schedules to routinely protect data sources by using a secure, agent-based, application-consistent backup service. The backup infrastructure is managed by IBM. The service consists of the following components.

IBM Cloud Backup and Recovery services
Service Description
IBM Cloud Backup and Recovery service Managed by IBM. After you provision the service from the IBM Cloud catalog, you access it through a web browser to manage your backup policies, download the agents, and restore data.
VPE Gateway To improve performance, use a virtual private endpoint (VPE) gateway to access the service instead of the native connection. To create one or more VPE gateways, use the IBM Cloud catalog to order a VPE gateway and configure it to use the Backup and Recovery service.
Data Source Connector Installed from the IBM Cloud catalog, which deploys a VSI in your VPC. Install one or more data source connectors (at least two are recommended for high availability) and add more as needed to increase backup throughput. Data source connectors establish connectivity between your source VSI and the service. They also interact with the service's IBM Cloud Object Storage bucket, where the backups are stored. This bucket is managed by the provider and is not contained within your account.
Agent An agent is IBM Cloud Backup and Recovery software installed on the VSI that interacts locally with the operating system and source data being protected. The agent communicates with the Data Source Connector and Backup and Recovery instance during backup and recovery operations. Windows and Linux agents are currently available, with support for additional agent types planned for the future.

The following list describes the key capabilities of IBM Cloud Backup and Recovery.

  • Agent-based backup for virtual server instances
  • Support for file-level and folder-level backups
  • Integration with IBM Cloud Object Storage for long-term retention
  • Scheduled and on-demand backup operations
  • Centralized management through IBM Cloud console:
    • Scheduled backups - Customize backup plans to run at daily, weekly or custom intervals
    • Policy-based backup - Use policies to define how and when the objects and files in a source are protected based on your use case. Define parameters such as the data to be protected, backup frequency, and how long to retain the backup copy
    • Security - Take advantage of granular role-based access control to stop unauthorized actors from modifying or deleting data
    • Application-consistent backup - Capture backups of your application data in a consistent state, allowing for clean restoration to a specific point in time without data corruption or loss.

For more information, see Getting started with Backup and Recovery.

Veeam Backup & Replication with Agent-Based Backup

Veeam Backup & Replication (VBR) is an enterprise-grade backup and disaster recovery solution that provides comprehensive data protection for physical servers, virtual machines, and cloud workloads. For IBM Cloud VPC Virtual Server Instances, Veeam utilizes agent-based backup to provide application-aware, image-level backup and recovery capabilities.

Veeam Backup & Replication with agents delivers centralized management, flexible recovery options, and advanced data protection features including immutable backups, ransomware protection, and cloud-native integration with IBM Cloud Object Storage.

Veeam is not available directly in the IBM Cloud catalog. However, you can use Veeam software to back up your data on a VPC VSI and protect the following resources:

  • Individual volumes
  • Folders and files
  • Enterprise application databases: Veeam Plug-Ins for Enterprise Applications extend Veeam Backup & Replication with transactionally consistent backups of SAP HANA, Oracle, and Microsoft SQL Server databases

Veeam services
Service Description
Veeam Licenses You can order a Veeam license for the use of Veeam Agent and Veeam Backup and Replication software through the Veeam website or through the process described at Ordering Veeam stand-alone licenses from the IBM Cloud console.
Veeam Backup Server The core component that serves as the configuration and control center for the entire backup infrastructure. The backup server manages job scheduling, resource allocation, and centralized administration of all backup operations. The Veeam Backup and Replication software can be installed only on a Microsoft Windows operating system. See Installing and operating the Veeam Backup and Replication software.
Veeam Agents Lightweight software installed on protected computers that performs data backup operations such as creating volume snapshots, reading backed-up data, and transferring data to target locations. Supported Linux® distributions include CentOS, RHEL, Ubuntu, and Debian. With the Veeam Agent for Linux® and the Veeam Agent for Microsoft™ Windows™, you can create backups and perform restores. See Installing and operating the Veeam Agent.

For more information, see About Veeam.

Protection Groups - Containers in the Veeam Backup & Replication inventory that specify computers on which Veeam Agents should be installed and managed, with selection based on individual IP addresses, Active Directory objects, or CSV files.

Backup Repository - Storage location where backup files are stored. Repositories can be:

  • VPC Block storage
  • IBM Cloud Object Storage (S3-compatible)

Veeam features - Veeam Backup & Replication offers automated deployment and management of Veeam Agents, allowing administrators to perform deployment, administration, data protection, and disaster recovery tasks remotely from the Veeam Backup & Replication console without installing and configuring agents on every computer individually.

Veeam features
Feature Description
Centralized Deployment
  • Automatic agent discovery and deployment to VPC virtual server instances
  • Manual deployment using installation packages for restricted environments
  • Pre-installed agent management for third-party deployment tools
  • Remote upgrade and patch management for deployed agents
Centralized Job Management
  • Agent backup jobs run on the backup server, which schedules the jobs, allocates infrastructure resources, and manages job execution
  • Single backup job can process multiple protection groups and individual computers
  • Backup policies for scheduling agent jobs directly on protected computers
  • Unified console for monitoring all backup operations
Centralized Backup Management
  • Restore data from agent backups through the Veeam console
  • Copy backups to secondary repositories for 3-2-1 compliance
  • Export backups to standalone files for archival
  • Import existing agent backups into Veeam infrastructure
Application-Aware Processing
  • For Windows-based computers, Veeam Agent leverages Microsoft VSS technology to create VSS snapshots for transactionally consistent backups
  • Support for VSS-aware applications including Microsoft SQL Server, Exchange, Active Directory, and Oracle databases
  • For Linux systems, support for Oracle, MySQL, and PostgreSQL database processing to create transactionally consistent backups
  • Application item-level restore (database, mailbox, Active Directory objects)
Advanced Data Protection
  • Immutable, direct-to-object storage backups that naturally scale with needs
  • Inline malware detection during backup operations
  • Encryption in-flight and at-rest with AES-256
  • Built-in deduplication and compression to reduce backup file sizes and data traffic
  • WAN acceleration for remote site backups
Flexible Backup Targets
  • Local backup repositories (fast local recovery)
  • IBM Cloud Object Storage for long-term retention and offsite protection
  • Copy jobs to create secondary backup copies (3-2-1 rule compliance)
Granular Recovery
  • File-level restore from image-level backups without full system restore. Note that image-level refers to the OS image, not the VSI image.
  • Application item-level restore including databases, mailboxes, and specific application objects
  • Volume-level restore for partial system recovery
  • Restore to original or alternate locations
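
The 3-2-1 rule that is mentioned in the table is commonly stated as at least three copies of the data, on at least two types of media, with at least one copy offsite. The following Python sketch checks a planned set of copies against that rule. It is independent of Veeam and uses no Veeam API. The class, field names, and example values are hypothetical, and the list of copies is assumed to include the production data itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    location: str  # hypothetical label, for example a region or site name
    media: str     # hypothetical label, for example "block" or "object"
    offsite: bool  # True if stored away from the production site


def satisfies_3_2_1(copies):
    """Check the 3-2-1 rule over all copies, production data included."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        DataCopy("primary-region", "block", offsite=False),   # production data
        DataCopy("primary-region", "block", offsite=False),   # local repository
        DataCopy("secondary-region", "object", offsite=True), # object storage copy
    ]
    print(satisfies_3_2_1(plan))
```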

Third-Party Backup Solutions

Various third-party backup solutions provide alternatives for backing up VPC VSIs. They are available as self-managed deployments under bring-your-own-license (BYOL) models. Additional third-party backup solutions compatible with IBM Cloud VPC include:

  • Commvault - Enterprise backup and recovery with application-aware capabilities
  • Rubrik - Cloud data management and ransomware protection
  • Veritas NetBackup - Enterprise data protection across hybrid environments
  • Cohesity - Data management platform with backup, DR, and archival capabilities

These solutions typically support agent-based backup for virtual server instances, providing flexibility based on organizational requirements and existing tool investments.

Wanclouds VPC+ DRaaS (Disaster Recovery as a Service)

Wanclouds VPC+ DRaaS is a comprehensive SaaS-based Disaster Recovery as a Service (DRaaS) solution that enables IBM Cloud customers to back up their entire Virtual Private Cloud resources, including network, compute, and storage, and restore them across different regions in IBM Cloud. This approach eliminates the need for expensive standby environments, replacing them with flexible on-demand recovery. The service is available directly from the IBM Cloud catalog. Key differentiators:

  • On-Demand Recovery Model: Instead of maintaining a constantly running replica of your production environment, you can create an on-demand disaster recovery scenario, minimizing costs and maximizing operational efficiency
  • Single Pane of Glass: Consolidated view of all cloud accounts and backups (VPC configs, VSIs, Data, COS buckets) under a single management interface

Wanclouds VPC+ DRaaS can back up and restore the entire IBM Cloud Virtual Private Cloud construct, configurations, and resources, including:

Wanclouds VPC+ DRaaS components
Component Description
Network Components
  • VLANs and Subnets
  • Security Groups and Network ACLs
  • VPN Gateways
  • Load Balancers (ALB and NLB)
  • Floating IPs
  • Public Gateways
  • Virtual Private Endpoints (VPE)
Compute Resources
  • Virtual Server Instances (VSIs) - back up different versions of Windows and Linux OS flavors including Ubuntu, Red Hat Enterprise Linux (RHEL), Debian, and CentOS
  • Attached Data Volumes (Block Storage)
  • Instance configurations and metadata
Storage
  • Cloud Object Storage Buckets (COS Buckets) - back up data from one COS bucket to other buckets on demand
  • Volume attachments and configurations

Wanclouds VPC+ DRaaS provides flexible restore options:

Wanclouds VPC+ DRaaS restore options
Restore option Description
Cross-Region Restore
  • Restore or replicate infrastructure on demand in the same or across different regions
  • Restore data on-demand within or across regions for ultimate flexibility and security
  • Restore backed-up workloads, resources, applications, and data into an existing VPC or create a new VPC and restore on-demand in any region across IBM Cloud
Granular Restore
  • Restore entire VPC infrastructure
  • Individual resource restoration
  • Restore on-demand in the same VPC, Region, or across different VPCs and regions

Wanclouds VPC+ DRaaS also includes:

Wanclouds VPC+ DRaaS other features
Other features Description
Automated Discovery
  • Automatically discover VPC resources and track infrastructure across multiple regions
  • Real-time inventory of all cloud resources
Topology Visualization
  • Visualize VPC topology and inter-resource relationships
  • Visualize resource relationships for faster diagnostics
  • Dependency mapping for impact analysis
Compliance Management
  • Set up and apply compliance policies
  • Discover and track all resources with compliance and visualization features
  • Audit trails for backup and restore operations
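
Dependency mapping for impact analysis can be pictured as a graph traversal. The following Python sketch finds every resource that depends, directly or indirectly, on a failed resource. It is a generic illustration and does not represent how Wanclouds implements the feature. The resource names and the dependency map are hypothetical.

```python
from collections import deque


def impacted_by(failed, depends_on):
    """Return every resource that directly or indirectly depends on `failed`.

    `depends_on` maps a resource to the resources that it requires.
    """
    # Invert the map: resource -> resources that depend on it.
    dependents = {}
    for resource, requirements in depends_on.items():
        for required in requirements:
            dependents.setdefault(required, set()).add(resource)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


if __name__ == "__main__":
    # Hypothetical VPC resources and their requirements.
    depends_on = {
        "subnet-a": ["vpc-1"],
        "sg-web": ["vpc-1"],
        "vsi-web": ["subnet-a", "sg-web"],
        "lb-front": ["vsi-web"],
    }
    print(sorted(impacted_by("subnet-a", depends_on)))
```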

For more information, see IBM documentation and VPC+ DRaaS (VPC+ Disaster Recovery as a Service)

RackWare DR

RackWare RMM for VPC VSI Cross-Region Disaster Recovery

The RackWare Management Module (RMM) platform integrates with IBM Cloud VPC to enable intelligent provisioning, workload mobility, and cross-region DR planning. RackWare enables both hot and warm standby deployments with rapid failover and rollback capabilities across IBM Cloud VPC with policy-driven DR strategies.

The RackWare Management Module (RMM) is deployed in IBM Cloud VPC and provides a centralized interface for managing, scheduling, and automating migration and disaster recovery tasks. The following list includes the features of RMM.

  • Deployed as VSI in IBM Cloud VPC
  • Uses Floating IP for external GUI access
  • SSH connectivity to source VSIs (primary region)
  • API access to IBM Cloud VPC for auto-provision in DR region
  • Available in IBM Cloud Catalog with seamless deployment
  • Agentless approach supporting any current version of Windows and Linux
  • Autoprovision capabilities with matching source specifications
  • TNG sync for improved RPO
  • Policy-driven automation for failover and failback
  • GUI-based management with real-time monitoring

The following table lists the disaster recovery configuration models.

RMM disaster recovery configuration models
Disaster recovery configuration model Description

Passthrough Mode, where RMM acts as a proxy between source and target servers
  • Enabled by default
  • Source and target VSIs cannot communicate directly
  • Cross-region networking not configured
  • Additional security layer desired
  • Centralized traffic monitoring required
  • No direct connectivity required between regions
  • Centralized control and monitoring
  • Simplified firewall rules
  • All data flows through RMM
  • RMM bandwidth becomes bottleneck for large datasets
  • Additional network hop adds latency
Direct Mode
  • Source and target VSIs both use Floating IPs
  • Transit Gateway configured between regions
  • High-bandwidth requirements
  • Minimize network latency
  • Better performance for large datasets
  • Reduced load on RMM server
  • Lower latency for data transfer
  • Network connectivity between regions (Transit Gateway or Floating IPs)
  • Firewall rules allowing SSH between source and target
  • Security considerations for cross-region traffic
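
The choice between the two configuration models largely comes down to whether the source and target VSIs can reach each other. The following Python sketch captures that decision in a simplified form. The function and parameter names are hypothetical, the sketch is not part of RMM, and a real decision also weighs the bandwidth, latency, and security points that are listed in the table.

```python
def choose_sync_mode(direct_connectivity: bool, ssh_allowed: bool) -> str:
    """Suggest a configuration model from two simplified conditions.

    direct_connectivity: source and target VSIs can reach each other,
        for example through Floating IPs or a Transit Gateway.
    ssh_allowed: firewall rules allow SSH between source and target.
    """
    if direct_connectivity and ssh_allowed:
        return "direct"
    # Passthrough is the default model: all data flows through RMM.
    return "passthrough"


if __name__ == "__main__":
    print(choose_sync_mode(direct_connectivity=True, ssh_allowed=True))
    print(choose_sync_mode(direct_connectivity=False, ssh_allowed=True))
```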

RackWare provides flexible scheduling with options for hot and warm standby of replicated workloads.

RackWare capabilities
Aspect Hot Standby Warm Standby
DR VSIs State Running continuously Powered off
Failover Time Minutes 5-15 minutes (boot time)
Cost Higher (running compute) Lower (storage only)
Use Cases Mission-critical, RTO < 5 min Important workloads, RTO < 30 min
Sync Impact Faster (VSI already running) Must boot before final sync
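
The use-case row in the table can be read as a simple selection rule based on the recovery time objective (RTO). The following Python sketch encodes that reading. The function name is hypothetical, and the thresholds are taken from the table as rough guidance only. Cost, workload criticality, and measured failover times should also inform the decision.

```python
# Threshold taken from the "Use Cases" row of the table above.
HOT_STANDBY_RTO_MINUTES = 5  # hot standby: "RTO < 5 min"


def suggest_standby(rto_minutes: float) -> str:
    """Suggest a standby model for a target RTO, following the table loosely.

    ASSUMPTION: any RTO of 5 minutes or more maps to warm standby, which
    the table lists for RTO < 30 min and which costs less than hot standby.
    """
    if rto_minutes <= 0:
        raise ValueError("RTO must be a positive number of minutes")
    if rto_minutes < HOT_STANDBY_RTO_MINUTES:
        return "hot"
    return "warm"


if __name__ == "__main__":
    for rto in (2, 5, 20):
        print(f"RTO {rto} min -> {suggest_standby(rto)} standby")
```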

Next steps

Now that you understand the resiliency design options for VPC virtual servers, explore these related topics: