| Description | This article describes how a FortiGate HA failover in Google Cloud can delete SDN routes without recreating them if the failover is interrupted, and provides guidance on why it happens and how to prevent it. |
| Scope | FortiGate, FortiOS v7.0+. |
| Solution | FortiGate High Availability in Google Cloud Platform.
FortiGate High Availability (HA) in Google Cloud Platform (GCP) operates fundamentally differently from on-premises environments because the underlying GCP Virtual Private Cloud (VPC) networking fabric does not support the Layer 2 (L2) features that traditional HA relies on, such as Gratuitous ARP (GARP), shared MAC addresses, or floating IPs.
To manage failovers, FortiGate uses the GCP SDN Connector (Software-Defined Network Connector). This feature integrates with the GCP API to dynamically update custom static VPC routes so that traffic is always directed to the active FortiGate instance. In an Active-Passive (A-P) cluster, the Primary (Active) node is the only one authorized to make these route changes.
When a failover occurs, the newly promoted Primary node makes API calls through the SDN Connector to repoint the existing custom route, so that the route's next hop becomes its own internal IP address and traffic is steered away from the failed instance. Because GCP does not offer an in-place route update, this change is carried out as a route deletion followed by a route creation.
This dynamic routing mechanism allows fast failover but carries the risk of a split-brain condition, where a communication failure between the two nodes leads both to briefly assume the Primary role and compete for control over the VPC routes.
Short Description of FortiGate HA in GCP.
When a node becomes primary:
During a FortiGate HA failover in GCP, the newly promoted primary node begins with the SDN routing update workflow:
Because of this, HA behavior and routing stability depend on:
Problem Scenario.
Because the route-creation step is interrupted:
Because FortiOS does not keep a complete local record of the routes it deleted, it cannot automatically recreate the missing routes once the HA state returns to normal. As a result, the routes remain missing and traffic becomes blackholed.
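The failure mode described above can be illustrated with a small, self-contained simulation. This is a sketch under stated assumptions, not FortiOS code and not the GCP API: `FakeVpc`, `Route`, `take_over_routes`, and the `still_primary` callback are hypothetical names, and the delete-all-then-create-all ordering is assumed from the description in this article.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    name: str
    dest_range: str
    next_hop_ip: str


@dataclass
class FakeVpc:
    """In-memory stand-in for a VPC route table (hypothetical, not the GCP API)."""
    routes: dict = field(default_factory=dict)

    def delete(self, name):
        # Remove the route and hand it back to the caller.
        return self.routes.pop(name)

    def insert(self, route):
        self.routes[route.name] = route


class Interrupted(Exception):
    """Raised when the node loses the primary role in the middle of an update."""


def take_over_routes(vpc, names, my_ip, still_primary):
    """Repoint routes by delete-then-create (assumed workflow)."""
    # Phase 1: delete the routes that point at the old primary.
    deleted = [vpc.delete(n) for n in names]
    # Phase 2: recreate each route with this node as the next hop.
    for old in deleted:
        if not still_primary():
            # 'deleted' exists only inside this call, so the record of
            # what was removed is lost once the exception propagates.
            raise Interrupted(old.name)
        vpc.insert(Route(old.name, old.dest_range, my_ip))


def demo():
    """Simulate a node that is demoted right after the deletion phase."""
    vpc = FakeVpc()
    vpc.insert(Route("default-via-fgt", "0.0.0.0/0", "10.0.1.2"))
    try:
        take_over_routes(vpc, ["default-via-fgt"], "10.0.1.3",
                         still_primary=lambda: False)
    except Interrupted:
        pass
    return vpc.routes  # no route is left, so traffic has no next hop
```

In this model, calling `demo()` leaves the route table empty: the deletion succeeded, the creation never ran, and nothing outside the interrupted call knows which routes to restore.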
Workarounds to mitigate this issue.
Slowing down failovers prevents half-completed SDN updates.
If the cluster experiences temporary CPU spikes, short hbdev interruptions, or brief network instability, the secondary node may temporarily promote itself to primary, trigger route deletions, and then return to the secondary role before the route-creation step completes.
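The reasoning behind slowing down failovers can be sketched as a hold-down check: route changes start only after the node has held the primary role continuously for some period, so a brief, spurious promotion never reaches the deletion step. The class below is a conceptual illustration only. `PromotionHoldDown`, `hold_seconds`, and `observe` are hypothetical names and do not represent how FortiOS implements its HA timers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromotionHoldDown:
    """Gate route updates until the primary role has been stable (hypothetical)."""
    hold_seconds: float
    _primary_since: Optional[float] = None

    def observe(self, is_primary: bool, now: float) -> bool:
        """Return True once the node has been primary for hold_seconds without a break."""
        if not is_primary:
            # Any drop back to secondary resets the stability clock.
            self._primary_since = None
            return False
        if self._primary_since is None:
            self._primary_since = now
        return now - self._primary_since >= self.hold_seconds
```

With `hold_seconds=5.0`, a node that is primary at t=0 and t=2, secondary at t=3, and primary again from t=4 is first cleared to update routes at t=9. The two-second flap at the start never triggers a deletion.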
Note: AWS, Azure, and OCI support true route updates via their APIs, so they are not impacted by this specific route deletion/insertion edge case.
Related documents: |