Connectivity loss during HA failover (FortiGate 100F – v7.2.8)
Good morning everyone,
I’m currently facing an issue with two FortiGate 100F clusters, both configured in HA (Active–Passive) mode, running firmware version 7.2.8.
Scenario
Each FortiGate pair is configured as an Active–Passive HA cluster.
Both clusters are interconnected through two point-to-point IPsec tunnels (using WAN1 and WAN2 interfaces).
Dynamic routing is handled via OSPF, while multicast traffic is managed using IGMP + PIM-Sparse Mode.
In the HA configuration, HA1 and HA2 are defined as heartbeat interfaces, both with the same priority (I’m not sure whether that’s ideal), and WAN1 is the only monitored interface that triggers failover.
Port 1 on both clusters is used for the internal LAN, connected to a switch distributing local traffic.
For OSPF, both IPsec tunnels are used for route propagation, with higher priority assigned to the tunnel over WAN1, and LAN interfaces set as passive.
Observed behavior
Under normal conditions everything works as expected:
OSPF adjacency forms correctly.
Routes are properly learned at both ends.
Security policies allow unicast and multicast traffic without issue.
However, during failover testing, I observed the following:
When WAN2 (secondary) is disconnected, there is no noticeable impact on traffic.
When WAN1 (primary, monitored for HA) goes down, HA failover occurs and the backup unit becomes the new master.
Multicast traffic shows minimal disruption (1–2 seconds).
Unicast traffic (tested with ICMP ping between remote hosts) initially behaves similarly, losing only 1–2 packets. After a few seconds (~5 ICMP packets), however, connectivity drops again for about 20–30 seconds before recovering and stabilizing.
When I repeat the same test with one FortiGate from each cluster powered off (so no HA role change is possible), the issue does not occur: traffic switches between WAN links within ~2 seconds. This strongly suggests the issue is related to HA role switching or session synchronization.
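To put numbers on the two outage phases, the ping results can be reduced to a list of loss windows. This is a minimal Python sketch, assuming one ping per second. The function name `outage_windows` and the trace are hypothetical: the trace is illustrative data shaped like the behavior described above, not a capture from my setup.

```python
from typing import List, Tuple


def outage_windows(replies: List[bool], interval_s: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start_time_s, duration_s) for each run of consecutive lost pings.

    replies[i] is True if ping number i got a reply, False if it was lost.
    interval_s is the assumed spacing between pings (1 second by default).
    """
    windows = []
    run_start = None  # index of the first lost ping in the current run
    for i, ok in enumerate(replies):
        if not ok and run_start is None:
            run_start = i
        elif ok and run_start is not None:
            windows.append((run_start * interval_s, (i - run_start) * interval_s))
            run_start = None
    # A run of losses that lasts until the end of the trace
    if run_start is not None:
        windows.append((run_start * interval_s, (len(replies) - run_start) * interval_s))
    return windows


# Illustrative trace only: 10 replies, 2 lost, 5 replies, 25 lost, 10 replies
trace = [True] * 10 + [False] * 2 + [True] * 5 + [False] * 25 + [True] * 10
print(outage_windows(trace))  # [(10.0, 2.0), (17.0, 25.0)]
```

Comparing these windows between the HA failover test and the single-unit test makes it easy to see whether the second, longer drop appears only when the HA role changes.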
Questions
Could anyone help me understand the reason for this unicast connectivity loss during HA master failover?
Is there a way to mitigate or tune this behavior (e.g., adjusting convergence timers, session sync parameters, or OSPF handling during failover)?
Is it expected that connectivity briefly restores and then drops again a few seconds after the failover event?
Any insights or similar experiences would be greatly appreciated.
Best regards.