Securing multiple landing zones with a transit VPC and advanced security capabilities

A few common approaches can be used in IBM Cloud to provide centralized traffic inspection capabilities that use firewall appliances in a transit VPC, an approach that is referred to as "hub-and-spoke".

"Hub-and-spoke" is also known as "hub-spoke" or "spoke-hub" architecture.

Benefits and concept of a transit VPC design

Using a transit VPC in a hub-and-spoke architecture is a popular architectural pattern for several reasons:

  • Improved security posture: By routing all VPC-to-VPC and external traffic through a central inspection point, a transit VPC enables consistent enforcement of security policies and simplifies audit and compliance across environments.

  • Simplified network management: A hub-and-spoke design removes the need to manage appliances in each VPC. Instead, network policies can be configured centrally and new VPCs can be onboarded as extra spokes with minimal changes, avoiding the rapid growth in routing configurations that direct VPC-to-VPC connections would require.

  • Cost efficiency: Centralizing security and networking appliances in the transit VPC eliminates the need for duplicating them in each spoke VPC, reducing cost for cloud consumption and management of the environment.
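
The effect on routing configuration can be illustrated with a small sketch. It assumes a simplified model in which a design without a hub needs one connection per pair of VPCs, while a hub-and-spoke design needs one connection per spoke. The function names are illustrative only.

```python
def full_mesh_links(vpc_count: int) -> int:
    """Pairwise connections needed if every VPC connects to every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_links(spoke_count: int) -> int:
    """Connections needed when each spoke attaches only to the transit hub."""
    return spoke_count


for n in (3, 10, 50):
    print(f"{n} workload VPCs: full mesh={full_mesh_links(n)}, "
          f"hub-and-spoke={hub_and_spoke_links(n)}")
```

In this model, the number of connections grows quadratically for a full mesh and linearly for hub-and-spoke.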

Transit VPC Concept

The following drawing illustrates the high-level view of a hub-and-spoke design that uses a transit VPC.

  • All traffic into the cloud environment flows through a transit VPC that sits between external components and the VPCs hosting customer workloads.
  • Firewall appliances are hosted in the transit VPC to centrally inspect traffic.
  • The transit VPC is connected by using a transit gateway to the other VPCs hosting the actual workload. These connections are the "spokes". Inter-VPC traffic (flowing between spoke VPCs) can optionally also be directed through the firewall for inspection.

Transit VPC high level view

Architecture approaches

A few different approaches can be used to provide the described traffic inspection in the IBM Cloud transit VPC.

  1. FortiGate Virtual firewalls (VSIs) with SDN Connector.
  2. Virtual firewalls (VSIs) with BGP-based traffic management.
  3. Virtual firewalls (VSIs) with Network Load Balancer for traffic management.
  4. Virtual firewalls on VPC Bare Metal instances (SmartNIC VNI mobility).

Each option has its own capabilities and major considerations. A comparison table of all approaches is provided in the "Comparison at a glance" section.

Option 1: FortiGate Virtual firewalls with SDN connector

This approach uses Fortinet’s FortiGate next-generation firewall appliances to provide centralized traffic and security management. It uses Fortinet’s SDN connector to provide dynamic updates for changing IP addresses, for example when a failover between two appliances occurs.

Architecture overview

This drawing shows a high-level view of the architecture and components that are used for this approach:

  • External resources are connected to a transit VPC.
  • Two FortiGate firewall appliances are deployed on VPC VSIs in the transit VPC.
  • The FortiGate appliances are configured as an active/passive HA (high availability) pair.
  • The transit VPC is connected with TGW to spoke VPCs (hosting workloads).

High level architecture option 1

Capabilities

Using the FortiGate Virtual firewalls with SDN connector architecture allows users to take advantage of the following capabilities.

Supported firewall vendors

Here we identify which firewall vendors are supported.

This pattern is intended to be used with Fortinet (FortiOS) virtual appliances. It relies on the use of a Fortinet SDN connector that integrates with the IBM Cloud API to provide seamless failover between appliances. For details on using FortiGate next-generation firewalls in IBM Cloud, see About FortiGate for IBM Cloud. You can use an automation asset to deploy a pair of FortiGate appliances as an active/passive HA pair by using Terraform.

Supported Connectivity Options

This pattern supports inspection for the following connectivity options:

  • Public connectivity: the active firewall appliance is configured to use a public floating IP address or Public Address Range, allowing the inspection of public traffic.
  • Private connectivity with Direct Link: inspection of private traffic that is entering the environment with Direct Link is supported.
  • Private connectivity with VPN: inspection of private traffic that is entering the environment with VPN is supported.
  • East-west traffic: inspection of traffic that is flowing between the spoke VPCs is supported.

Architecture option 1

Connectivity guidance

Many variations that use the preceding connectivity options are possible. Each variation requires its own considerations and adjustments to the configuration. Covering all these variations is outside the scope of this document. Instead, a working example and general guidance are provided.

East-west inspection

East-west traffic inspection can be achieved by using a combination of an egress route in the spokes and an ingress route on the TGW.

When a failover of the firewalls occurs, only route tables of the VPC hosting the firewalls are updated. The TGW ingress route of the transit VPC is updated to reflect the new active firewall, while egress routes of the spoke VPCs are not updated.

With the combination of the two route tables:

  • The egress route sends traffic toward the transit VPC (by specifying an IP of the transit VPC as the next hop).
  • The TGW ingress route of the transit VPC then forwards the traffic to the correct (active) firewall as next hop.

Without the TGW ingress route, communication works until a firewall failover occurs.
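
The interaction of the two route tables can be sketched as a toy model. All addresses, the `TransitVpc` class, and the `firewall_reached` helper are hypothetical and only illustrate the behavior described above; this is not IBM Cloud or Fortinet API code.

```python
from dataclasses import dataclass, field


@dataclass
class TransitVpc:
    # Hypothetical firewall addresses, for illustration only.
    firewall_ips: dict = field(
        default_factory=lambda: {"fw-a": "10.0.1.10", "fw-b": "10.0.2.10"}
    )
    active: str = "fw-a"
    # TGW ingress route: traffic arriving from the transit gateway goes to the active firewall.
    ingress_next_hop: str = "10.0.1.10"

    def fail_over(self) -> None:
        """Model the failover: switch the active node and update the ingress route only."""
        self.active = "fw-b" if self.active == "fw-a" else "fw-a"
        self.ingress_next_hop = self.firewall_ips[self.active]


# Spoke egress route: static, points at an IP in the transit VPC, never updated on failover.
SPOKE_EGRESS_NEXT_HOP = "10.0.1.10"


def firewall_reached(transit: TransitVpc, use_ingress_route: bool) -> str:
    """Return the firewall IP that east-west traffic from a spoke ends up on."""
    if use_ingress_route:
        return transit.ingress_next_hop  # the ingress route redirects to the active firewall
    return SPOKE_EGRESS_NEXT_HOP  # only the static hop from the spoke is used


transit = TransitVpc()
transit.fail_over()
print(firewall_reached(transit, use_ingress_route=True))   # 10.0.2.10 (new active firewall)
print(firewall_reached(transit, use_ingress_route=False))  # 10.0.1.10 (now passive firewall)
```

In this model the spoke egress routes never change. Only the ingress route inside the transit VPC follows the active firewall, which is why it is needed for traffic to survive a failover.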

High availability

IBM Cloud VPC users can deploy their FortiGate virtual appliances in an active/passive HA configuration.

The HA configuration uses FortiGate’s native clustering mechanism together with FortiGate’s SDN connector for IBM Cloud integration. With this connector, a failover automatically triggers routing changes and floating IP address reassignment through the IBM Cloud API to reflect the new path. For details on this integration, see FortiGate SDN connector. This failover mechanism enables:

  • Fast failover between FortiGate nodes.
  • Automatic route updates and FIP reassignment by using SDN connector.
  • Stateful failover, when session sync (session pickup) is configured.
  • Zonal and regional (cross-AZ) redundancy configurations. Regional configurations require Public Address Ranges to be used for public traffic.

As of 10 September 2025, the SDN connector does not officially support regional failover. Check out FortiGate SDN connector for updated information.

Other considerations

Other criteria to consider for your architecture include performance and operational considerations.

Performance

Using virtual server instances to inspect traffic provides flexible performance options. You can select instance profiles with up to 25 Gbps of network bandwidth. The actual performance depends on multiple factors, such as enabled features (for example, TLS termination) or licensing restrictions.

Operational considerations

Each approach that is described comes with its own set of operational considerations. For this deployment option, keep in mind:

  • FortiGate Skills. This option works with FortiGate appliances only.
  • SDN plug-in and manual routes. While failover is automated, configuration and maintenance of manual routes is typically required.

Option 2: Virtual firewall appliances with BGP Over GRE

This solution approach uses virtual firewall appliances that are deployed within the Virtual Private Cloud (VPC) to enable centralized traffic orchestration, inspection, and security policy enforcement. It offers enhanced flexibility by supporting a wide range of network firewall appliances available on IBM Cloud. The deployment of firewall appliances across multiple availability zones helps ensure HA and fault tolerance, strengthening the resilience of multizone workloads against potential zone-level failures.

The solution primarily implements BGP-over-GRE tunneling between the IBM Cloud Transit Gateway and independently deployed virtual firewall appliances in each availability zone. This architecture enables highly resilient and scalable connectivity between workloads that are hosted on IBM Cloud and customer environments, including on-premises infrastructure and remote external users. The use of BGP-over-GRE facilitates dynamic route exchange and fault-tolerant communication paths, enhancing overall network reliability and performance.

An "at-a-glance" comparison table of all approaches is provided in the "Comparison at a glance" section.

Architecture overview

The following diagram illustrates the high-level architecture and key components of the proposed solution:

  • External connectivity. External resources, including customer on-premises infrastructure and remote users, are connected to the IBM Cloud with a dedicated transit VPC.
  • Security. A firewall appliance is deployed in each availability zone within the transit VPC. A minimum of two availability zones is recommended to help ensure fault tolerance and service continuity.
  • Deployment model. Each firewall appliance operates as an autonomous instance in its respective zone.
  • Workload integration. The transit VPC is connected to multiple spoke VPCs that host application workloads. This connectivity is established through IBM Cloud transit gateway, enabling secure and scalable east-west and north-south traffic flow in the environment.

High level architecture option 2

Supported firewall vendors

The pattern that is covered in this approach can universally be used with any firewall vendor that supports BGP. IBM Cloud provides a comprehensive range of virtual firewall solutions and vendor-neutral appliances, which are designed to address a wide spectrum of enterprise security and connectivity needs.

Supported connectivity options

This pattern supports inspection of traffic for multiple connectivity models that are tailored to distinct workload and security requirements:

  • Public inbound access is managed through firewalls that are hosted in the transit VPC and fronted by Cloud Internet Services (CIS), which provides edge protection (DDoS protection, WAF, TLS, CDN, policy enforcement) and traffic control (global load balancing). External-facing firewall appliances in the transit VPC are configured with floating IPs or Public Address Ranges (PAR) to enforce ingress and egress policies and to integrate with CIS for centralized edge protection.
  • Private connectivity to on-premises environments through BGP over GRE tunnels on IBM Cloud Direct Link.
  • Secure access for remote users with SSL/TLS VPN terminated at the firewall appliances.
  • East-west traffic flow between spoke VPCs within the IBM Cloud environment, enabling controlled communication across VPCs by using Transit Gateway, segmentation, inter-VPC routing, and firewall zoning.

Architecture option 2

Connectivity guidance

Each connectivity model can be implemented in different ways, and each variation might require specific configuration changes and architectural considerations that depend on workload characteristics, security policies, and operational requirements.

VPN connectivity

This pattern benefits from an end-to-end BGP-managed (dynamic routing) approach. While the other options utilize VPNaaS approaches with a VPC VPN Gateway, this pattern uses VPN connections that are terminated on the firewall appliances themselves. This termination is due to the nature of the VPC VPN Gateway, which currently relies on a combination of static route tables and policies. Mixing static and dynamic routing in this scenario introduces complexity that can be avoided by terminating the VPN connection on the firewall appliances and by using BGP for all traffic.

GRE tunnel termination and ASNs

This pattern uses GRE tunnels between the firewalls and the on-premises environment that terminate on the on-premises network devices. It is also technically feasible to terminate these tunnels on the outbound transit gateway. However, doing so results in GRE tunnels connecting from the same firewalls to two separate transit gateways (inbound between the firewalls and the spoke VPCs and outbound between the firewalls and the on-premises environment). Because all transit gateways in a zone have the same ASN, this configuration is seen as a loop by the firewall and requires rewriting the ASN to a unique value. To avoid this complexity, the GRE tunnel is terminated at the on-premises device.
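
The loop condition can be illustrated with the generic BGP AS_PATH check: a route is treated as looping for an autonomous system if that system's ASN already appears in the route's AS_PATH. The sketch below models only this generic check. The ASN values and function names are hypothetical, and where exactly the check fires (on the firewall or on a gateway) depends on the implementation.

```python
def accepts_route(asn: int, as_path: list[int]) -> bool:
    """Generic AS_PATH loop check: a route is rejected if the path already contains the ASN."""
    return asn not in as_path


def rewrite_asn(as_path: list[int], old_asn: int, new_asn: int) -> list[int]:
    """Replace one ASN in the path with a unique value (the workaround mentioned above)."""
    return [new_asn if asn == old_asn else asn for asn in as_path]


TGW_ASN = 64512       # hypothetical: shared by both transit gateways
FIREWALL_ASN = 65001  # hypothetical

# Route learned from one transit gateway and passed on through the firewall to the other.
path = [FIREWALL_ASN, TGW_ASN]
print(accepts_route(TGW_ASN, path))                                # False
print(accepts_route(TGW_ASN, rewrite_asn(path, TGW_ASN, 64999)))   # True
```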

High availability

High availability for the firewall functionality is achieved through the inherent capability of the BGP protocol to detect path failures and dynamically reroute traffic through alternative paths. This approach does not rely on extra mechanisms like SDN plug-ins, static routes, or firewall clustering. Instead, you provide multiple possible paths, use BGP to prioritize traffic over these paths, and allow BGP to dynamically detect path failures and select alternative paths based on BGP attributes.
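
A heavily simplified sketch of attribute-based path selection follows. It considers only reachability, local preference, and AS path length, which is a small subset of the real BGP decision process. The `Path` type and the zone names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    next_hop: str      # firewall appliance that advertises the route
    local_pref: int    # higher is preferred
    as_path_len: int   # shorter is preferred
    reachable: bool = True


def best_path(paths: list[Path]) -> Optional[Path]:
    """Pick the preferred reachable path: highest local preference, then shortest AS path."""
    candidates = [p for p in paths if p.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda p: (p.local_pref, -p.as_path_len))


paths = [
    Path("fw-zone-1", local_pref=200, as_path_len=2),
    Path("fw-zone-2", local_pref=100, as_path_len=2),
]
print(best_path(paths).next_hop)  # fw-zone-1

# The preferred path fails; traffic moves to the remaining path.
paths[0] = Path("fw-zone-1", local_pref=200, as_path_len=2, reachable=False)
print(best_path(paths).next_hop)  # fw-zone-2
```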

BGP path failure detection times can vary greatly depending on the type of failure and the implementation, for example whether Bidirectional Forwarding Detection (BFD) is configured. Testing failover scenarios and the associated failover times is always advised.
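
As a rough illustration of why BFD matters, the sketch below compares worst-case detection times for a silently failing peer. The timer values are illustrative assumptions, not recommendations or IBM Cloud defaults, and other failure types (for example, a link-down event) can be detected faster.

```python
def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """BFD declares a session down after `multiplier` consecutive missed packets."""
    return interval_ms * multiplier


def bgp_hold_detection_time_s(hold_time_s: int) -> int:
    """Without BFD, a silent peer is only declared down when the BGP hold timer expires."""
    return hold_time_s


# Assumed example timers: BFD at 300 ms x 3, BGP hold time of 90 s.
print(bfd_detection_time_ms(300, 3))   # 900 (milliseconds)
print(bgp_hold_detection_time_s(90))   # 90 (seconds)
```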

Performance

Using virtual server instances to inspect traffic provides flexible performance options. You can select instance profiles with up to 25 Gbps of network bandwidth. The actual performance depends on multiple factors, such as enabled features (for example, TLS termination) or licensing restrictions.

Operational considerations

BGP allows for dynamic routing, so the environment can adjust dynamically to expansions and changes in your network, reducing manual configuration and making connections more reliable. It facilitates dynamic path selection and policy-based control over traffic flows.

At the same time, it requires BGP skills, for example to configure BGP peering and path filtering and to set ASN attributes. In addition, the exchange of BGP routing information relies on GRE tunnels that require extra configuration effort in the cloud environment and on on-premises networks.

Option 3: Virtual firewall appliances with network load balancer for traffic management

This approach uses an IBM Cloud VPC Network Load Balancer (NLB) in front of Virtual Network Function (VNF) devices to achieve a highly resilient architecture. This setup requires enabling the routing mode feature of the IBM Cloud Network Load Balancer for VPC.

An "at-a-glance" comparison table of all approaches is provided in the "Comparison at a glance" section.

Architecture overview

This drawing shows a high-level view of the architecture and components that are used for this approach:

  • A transit VPC hosts a pair of transparent VNFs (for example, a pair of FortiGate appliances). An NLB in routing mode is also deployed in the same VPC.
  • Spoke VPCs are deployed to host the VSIs where the workloads run. Each spoke VPC contains a VPC Application Load Balancer with the VSIs as members of its back-end pool.
  • A Transit Gateway is used to interconnect the VPCs.
  • The VNF appliances are configured as an active/active HA pair.

High level architecture option 3

Capabilities

This option supports the following virtual firewall appliances and connectivity options.

Supported virtual firewall appliances

IBM Cloud currently offers a catalog of virtual firewall solutions that are designed to meet diverse enterprise security and connectivity requirements.

Supported connectivity options

Here we identify which connectivity options are supported.

This pattern supports private connectivity only. Public connectivity is not supported with this option because a Network Load Balancer in route mode can be provisioned only as a private instance.

  • Private connectivity. Inspection of private traffic that is entering the environment with Direct Link or VPN is supported.
  • East-west traffic. Inspection of traffic that is flowing between the spoke VPCs is supported as well.

Architecture option 3

Connectivity guidance

High availability

IBM Cloud VPC users can deploy their VNF appliances in either an active/active or an active/passive HA configuration.

If you are using an active/passive configuration, add each VNF appliance to its own NLB back-end pool. Add the active VNF to the primary pool, then create a failsafe policy that routes traffic to the passive VNF.
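
The intent of the primary pool plus failsafe policy can be sketched as follows. This is a toy model of the selection logic only, with hypothetical pool member names. It is not the NLB's actual implementation or API.

```python
from typing import Optional


def pick_vnf(primary_pool: list[str], failsafe_pool: list[str],
             healthy: set[str]) -> Optional[str]:
    """Return a healthy member of the primary pool, else fall back to the failsafe pool."""
    for pool in (primary_pool, failsafe_pool):
        for member in pool:
            if member in healthy:
                return member
    return None


primary = ["vnf-active"]
failsafe = ["vnf-passive"]

print(pick_vnf(primary, failsafe, healthy={"vnf-active", "vnf-passive"}))  # vnf-active
print(pick_vnf(primary, failsafe, healthy={"vnf-passive"}))                # vnf-passive
```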

In both cases, the VNF appliances should be configured in transparent mode so that traffic is inspected without applying routing or network address translation. Because the VNF is transparent, the user makes a TCP request directly to the target virtual server instance (the destination), for example 10.1.0.2 in spoke VPC1, instead of to the firewall IP address.

Because the NLB is configured with routing mode enabled, TCP requests on all ports are forwarded automatically to their destination. Because the VNFs are in the NLB pool, they are the next hop after the NLB.

In an active/active configuration, an ingress route for the transit gateway (as a source) is also required to help ensure that the return packet from the target will hop through the NLB on the return trip, then through the same VNF it was sent through, and finally back to the client.
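
The text does not state how the NLB keeps both directions of a flow on the same VNF. One generic technique for this kind of flow symmetry is a direction-independent hash of the two endpoints, sketched below purely as an illustration. The function name, addresses, and VNF names are hypothetical, and this is not a description of the NLB's internals.

```python
import hashlib


def pick_vnf_symmetric(src_ip: str, dst_ip: str, vnfs: list[str]) -> str:
    """Choose a VNF from a hash that is the same for both directions of a flow."""
    # Sorting the endpoints makes (a, b) and (b, a) produce the same key.
    key = "|".join(sorted((src_ip, dst_ip))).encode()
    digest = hashlib.sha256(key).digest()
    return vnfs[int.from_bytes(digest[:8], "big") % len(vnfs)]


vnfs = ["vnf-1", "vnf-2"]
forward = pick_vnf_symmetric("192.168.10.5", "10.1.0.2", vnfs)
reverse = pick_vnf_symmetric("10.1.0.2", "192.168.10.5", vnfs)
print(forward == reverse)  # True
```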

Performance

Using the Network Load Balancer in route mode is not a good solution if high network traffic is expected, because the NLB is limited to about 10 Gbps of bandwidth in route mode. This limitation is worth considering when you are evaluating this option for workloads with intense performance requirements.

Option 4: Virtual firewalls on VPC Bare Metal servers (SmartNIC VNI mobility)

This approach uses two IBM Cloud VPC Bare Metal servers to run two virtual firewall appliances in active/passive failover mode. This approach is vendor neutral, providing the highest level of performance and a transparent failover mechanism. If the active virtual firewall appliance fails, the bare metal’s data plane virtual network interface (VNI) IP address moves seamlessly to the new active firewall appliance through gratuitous address resolution protocol (ARP).

An "at-a-glance" comparison table of all approaches is provided in the "Comparison at a glance" section.

Architecture overview

This drawing shows a high-level view of the architecture and components that are used for this approach:

  • External resources are connected to a transit VPC.
  • A transit VPC hosts a pair of VPC bare metal servers that are acting as hypervisors.
  • The virtual firewall appliances, which are configured as an active/passive pair, run on the transit VPC bare metal hypervisors.
  • The transit VPC bare metal’s smart NIC provides the VNIs needed by the firewall virtual appliances, including the floating data plane VNI.
  • Spoke VPCs host the actual workloads (VSIs or other).
  • A transit gateway interconnects the transit and spoke VPCs.

High level architecture option 4

Supported firewall vendors

Here we identify which firewall vendors are supported.

This pattern is vendor neutral. It relies on the use of VPC Bare Metal servers to host the virtual firewall appliances. It provides a transparent failover mechanism: if there is a failover, the data plane VNI moves from the bare metal server and virtual firewall appliance that were active to the ones that were passive.

Currently, the VPC bare metal (hypervisor) virtual firewall appliance stack has to be built and configured manually. The recommended hypervisor is QEMU-KVM, as it has the necessary functions and is an industry standard for which most firewall vendors provide virtual appliance images.

There are different ways to pass the VNIs to the virtual firewall appliances. With the recommended QEMU-KVM hypervisor, either MacVTap or PCIe pass-through can be used. The PCIe pass-through approach allows for higher performance. However, it requires a special “ionic” driver in the virtual firewall appliance, which not every vendor offers.

Supported connectivity options

This pattern supports inspection for the following connectivity options:

  • Public connectivity. The active firewall appliance is configured to use a public floating IP address or a public address range that targets the data plane VNI associated to the active virtual firewall appliance, allowing the inspection of public traffic.
  • Private connectivity with Direct Link. Inspection of private traffic that is entering the environment with Direct Link is supported.
  • Private connectivity with VPN. Inspection of private traffic that is entering the environment with VPN is supported.
  • East-west traffic. Inspection of traffic that is flowing between the spoke VPCs is supported.

Architecture option 4

Connectivity guidance

Many variations that use the preceding connectivity options are possible. Each variation requires its own considerations and adjustments to the configuration. This section looks at a working example of one method and provides general guidance.

East-west inspection

East-west traffic inspection is achieved by defining a custom egress route in the spoke VPCs that specifies the data plane VNI IP address as the next hop.
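
How such an egress route steers traffic can be sketched with a longest-prefix-match lookup. The prefixes, the VNI address, and the explicit "local" entry are hypothetical values chosen for illustration. The sketch models generic route selection, not the VPC routing implementation.

```python
import ipaddress
from typing import Optional

# Hypothetical spoke egress routes: (destination prefix, next hop).
SPOKE_EGRESS_ROUTES = [
    ("10.0.0.0/8", "10.0.1.100"),  # private traffic -> data plane VNI IP (assumed address)
    ("10.1.0.0/24", "local"),      # the spoke's own subnet, modeled as an explicit entry
]


def next_hop(destination: str, routes=SPOKE_EGRESS_ROUTES) -> Optional[str]:
    """Return the next hop of the most specific route that contains the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop("10.2.0.5"))  # 10.0.1.100 (another spoke, sent to the firewall VNI)
print(next_hop("10.1.0.7"))  # local
```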

High availability

IBM Cloud VPC users can deploy their virtual appliances in an active/passive HA configuration. The HA configuration uses the virtual firewall's native clustering mechanism together with the IBM Cloud VPC Bare Metal SmartNIC VNI floating capability. If there is a failover, the passive virtual appliance sends a gratuitous ARP packet with the data plane VNI IP address. This gratuitous ARP is detected by the IBM Cloud VPC layer 2 networking, which then associates the data plane VNI with the interface of the bare metal server where the previously passive virtual firewall appliance is running.

This failover mechanism enables:

  • Rapid failover between the virtual firewall appliances.
  • Automatic failover with no need for any route update.
  • Session synchronization between the active and passive appliances (“stateful” failover).
  • Intra-zone failover.
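
The VNI mobility described above can be pictured with a toy model. The class, server names, and IP address are hypothetical. The point of the sketch is that the VNI's IP address, and therefore every route that uses it as next hop, stays the same while the attachment moves.

```python
class DataPlaneVni:
    """Toy model of a floating data plane VNI that follows gratuitous ARP announcements."""

    def __init__(self, ip: str, attached_to: str) -> None:
        self.ip = ip
        self.attached_to = attached_to  # bare metal server that currently owns the VNI

    def on_gratuitous_arp(self, announced_ip: str, from_server: str) -> None:
        """Re-associate the VNI when another server announces the VNI's IP address."""
        if announced_ip == self.ip:
            self.attached_to = from_server


vni = DataPlaneVni(ip="10.0.1.100", attached_to="bm-server-1")
spoke_route_next_hop = vni.ip  # static egress route in the spokes

# The previously passive appliance on bm-server-2 takes over.
vni.on_gratuitous_arp("10.0.1.100", from_server="bm-server-2")
print(vni.attached_to)       # bm-server-2
print(spoke_route_next_hop)  # 10.0.1.100 (unchanged, no route update needed)
```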

Performance

VPC Bare Metal servers are equipped with 100 Gbps smart NIC interfaces and are available with multiple CPU and memory options. Among the scenarios presented here, this one has the highest known level of performance.

An important factor in the performance level that can be achieved is the way the data plane VNI is passed to the virtual firewall appliance. With the recommended QEMU-KVM hypervisor, two main ways to pass the VNI to the virtual firewall appliance are possible:

  • MacVTap – compatible with any virtual firewall appliance type.
  • PCIe pass-through, where the second physical SmartNIC present on the bare metal server is directly passed to the virtual firewall appliance. This option provides higher performance but requires a special “ionic” driver in the virtual firewall appliance. Check that your virtual firewall appliance supports an ionic driver.

The PCIe pass-through approach allows for better performance because the virtual firewall appliance directly manages the physical network interface of the host.

The actual performance also depends on factors like the enabled features (for example, TLS termination or Layer 7 inspection, among others) or any licensing restrictions.

Operational considerations

For this deployment option, keep in mind:

  • Vendor neutral. It relies on the use of VPC Bare Metals to host the virtual firewall appliances.
  • Zonal resiliency only. The active and passive virtual firewall appliances must be in the same zone, because only intra-zone failover is supported.

Additional considerations and use cases

This section covers considerations for general use cases.

Use case - Inspection of VPE traffic (East-West)

Special consideration is required for scenarios that require inspection of VPE traffic that flows between spokes.

Requirement

East-west inspection must include inspection of VPE traffic between spokes.

Problem

VPEs currently do not consider VPC custom egress routes that are normally utilised to force east-west traffic through the firewall. As shown in figure 10, this results in return traffic bypassing the firewall in the transit VPC.

VPE Return Traffic Issue (bypassing the firewall)

Solution

To avoid this problem, you can front the VPE with a route mode NLB (in bypass mode, that is, with no back-end pool configured) and let the route mode NLB handle (and honour) the VPC custom egress routes, as shown in figure 11. A custom ingress route for the VPC hosting the VPE routes traffic that is destined for the VPE through the NLB. This ensures that the NLB handles the traffic before it leaves the VPC, respects the egress routes, and forwards return traffic correctly to the firewall.

For example, a VPC route table with Source: Transit Gateway (and/or) Direct Link; Destination: VPE IP; Next hop: NLB IP.
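
The shape of such an ingress route can be written down as data, as in the following sketch. The addresses and the `IngressRoute` type are hypothetical. Only the source, destination, and next-hop roles come from the example above, and this is not IBM Cloud API or Terraform syntax.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRoute:
    sources: tuple    # traffic sources that the routing table applies to
    destination: str  # CIDR that covers the VPE IP
    next_hop: str     # IP of the route mode NLB


# Hypothetical addresses, for illustration only.
vpe_route = IngressRoute(
    sources=("transit_gateway", "direct_link"),
    destination="10.2.0.10/32",
    next_hop="10.2.0.20",
)


def next_hop_for(route: IngressRoute, source: str, dst_ip: str) -> str:
    """Return the NLB IP when the route applies, else deliver directly to the destination."""
    in_prefix = ipaddress.ip_address(dst_ip) in ipaddress.ip_network(route.destination)
    return route.next_hop if (source in route.sources and in_prefix) else dst_ip


print(next_hop_for(vpe_route, "transit_gateway", "10.2.0.10"))  # 10.2.0.20 (via the NLB)
print(next_hop_for(vpe_route, "transit_gateway", "10.2.0.11"))  # 10.2.0.11 (route does not apply)
```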

VPE return traffic flowing through FW using NLB in routing mode

Example

A common scenario is access to ROKS master nodes through a VPE from a system in another spoke. Without the route mode NLB, the return traffic is not inspected by the transit VPC firewall.

Comparison at a glance

This table provides a side-by-side comparison of the described options to help select the appropriate approach for your deployment based on the listed criteria.

Comparison table

| Features | Option 1: FortiGate (SDN connector) | Option 2: BGP failover | Option 3: NLB failover | Option 4: VPC Bare Metal with SmartNIC VNI mobility |
|---|---|---|---|---|
| Supported firewall vendor | Fortinet only | Vendor independent | Vendor independent | Vendor independent |
| Public traffic | Yes (PAR required for regional failover) | Yes (use a GLB for load balancing) | No (route mode NLB is private only) | Yes |
| Private traffic with Direct Link* | Yes | Yes | Yes | Yes |
| Private traffic with VPNaaS* | Yes | No (VPNaaS is not yet integrated with BGP; use VPN to firewall) | Yes | Yes |
| East-west (spoke VPC to spoke VPC) traffic inspection | Yes | Yes | Yes | Yes |
| Zonal firewall failover | Yes | Yes | Yes | Yes |
| Regional firewall failover (cross zones) | Yes (with upcoming SDN connector and PAR) | Yes | No | No |
| Route failover mechanism | FortiGate cluster and SDN connector | BGP | NLB | VNI mobility with gratuitous ARP |
| Failover speed | Fast | Slower (BGP, can use BFD) | Fast | Fastest |
| HA mode: active/active (A-A) or active/passive (A-P) | A-P | A-A with BGP path prioritization | A-A and A-P | A-P |
| Setup and operational considerations | Static routes and plug-in configuration | GRE tunnel setup, BGP peering configuration | Static routes | QEMU-KVM setup on bare metal, static routes |
| Firewall performance (throughput) | High (VSI dependent) | Lower (VSI dependent, with GRE overhead) | Lower (NLB route mode limitations) | Highest |
| Terraform provisioning support | Yes | Yes | Yes | Partial (Ansible needed for hypervisor configuration) |
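
As a rough aid for working with the comparison, a few of its yes/no rows can be encoded and filtered. The dictionary below transcribes only a subset of the table, and the key names and helper function are illustrative. The caveats in the table (for example, regional failover for option 1 depending on the upcoming SDN connector and PAR) are not captured by the boolean values.

```python
# Subset of the comparison table, transcribed as booleans (table caveats are omitted).
OPTIONS = {
    "Option 1 (FortiGate, SDN connector)": {
        "vendor_independent": False, "public_traffic": True, "regional_failover": True,
    },
    "Option 2 (BGP over GRE)": {
        "vendor_independent": True, "public_traffic": True, "regional_failover": True,
    },
    "Option 3 (NLB)": {
        "vendor_independent": True, "public_traffic": False, "regional_failover": False,
    },
    "Option 4 (Bare Metal, VNI mobility)": {
        "vendor_independent": True, "public_traffic": True, "regional_failover": False,
    },
}


def matching_options(**required: bool) -> list[str]:
    """Return the options whose features equal every required value."""
    return [
        name for name, features in OPTIONS.items()
        if all(features[key] == value for key, value in required.items())
    ]


print(matching_options(vendor_independent=True, public_traffic=True))
print(matching_options(vendor_independent=True, regional_failover=True))
```

A filter like this only narrows the candidates. Performance, failover speed, and operational effort still need to be weighed against the full table.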

References

  • Direct Link Dedicated
  • Direct Link Connect