Good morning everyone,
I’m currently facing an issue with two FortiGate 100F clusters running firmware version 7.2.8.
Each cluster is a FortiGate pair configured in HA Active–Passive mode.
Both clusters are interconnected through two point-to-point IPsec tunnels (using WAN1 and WAN2 interfaces).
Dynamic routing is handled via OSPF, while multicast traffic is managed using IGMP + PIM-Sparse Mode.
In the HA configuration, HA1 and HA2 are defined as heartbeat interfaces (both with the same priority — I’m not sure if that’s ideal), and WAN1 is configured as the only monitored interface to trigger failover.
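As a rough sketch, the HA setup described here might look like the fragment below. The interface names "ha1", "ha2" and "wan1" and the heartbeat priority value 50 are assumptions for illustration, not copied from the real configuration. Lines starting with # are annotations only and should be left out when pasting.

```
config system ha
    set mode a-p
    # Both heartbeat interfaces with the same priority (50 is an assumed value)
    set hbdev "ha1" 50 "ha2" 50
    # Only WAN1 is monitored, so only a WAN1 link failure triggers failover
    set monitor "wan1"
end
```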
Port 1 on both clusters is used for the internal LAN, connected to a switch distributing local traffic.
For OSPF, both IPsec tunnels are used for route propagation, with higher priority assigned to the tunnel over WAN1, and LAN interfaces set as passive.
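A minimal sketch of that OSPF layout follows, assuming the preference for the WAN1 tunnel is expressed through a lower OSPF cost. The tunnel interface names "vpn-wan1" and "vpn-wan2", the entry names and the cost values are all hypothetical; area and network statements are omitted. Lines starting with # are annotations only.

```
config router ospf
    # LAN interface does not form adjacencies
    set passive-interface "port1"
    config ospf-interface
        edit "tunnel-wan1"
            set interface "vpn-wan1"
            # Lower cost, so this tunnel is preferred (assumed value)
            set cost 10
        next
        edit "tunnel-wan2"
            set interface "vpn-wan2"
            # Higher cost, used as backup path (assumed value)
            set cost 100
        next
    end
end
```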
Under normal conditions everything works as expected:
OSPF adjacency forms correctly.
Routes are properly learned at both ends.
Security policies allow unicast and multicast traffic without issue.
However, during failover testing, I observed the following:
When WAN2 (secondary) is disconnected, there is no noticeable impact on traffic.
But when WAN1 (primary, monitored for HA) goes down, the failover occurs, and the backup becomes the new master.
Multicast traffic shows minimal disruption (1–2 seconds).
Unicast traffic (tested via ICMP ping between remote hosts) initially behaves similarly, losing only 1–2 packets, but after a few seconds (~5 ICMP packets), the connection drops again for about 20–30 seconds before recovering and stabilizing.
When performing the same test with one FortiGate from each cluster powered off (no HA role change possible), this issue does not occur — traffic switches between WAN links within ~2 seconds. This strongly suggests the issue is related to HA role switching or session synchronization.
Could anyone help me understand the reason for this unicast connectivity loss during HA master failover?
Is there a way to mitigate or tune this behavior (e.g., adjusting convergence timers, session sync parameters, or OSPF handling during failover)?
Is it expected that connectivity briefly restores and then drops again a few seconds after the failover event?
Any insights or similar experiences would be greatly appreciated.
Best regards.
Check the following:
1) Make sure "set session-pickup enable" is configured under HA.
2) Check whether UDP/connectionless session sync is enabled under HA.
3) For IPsec, configure "set ha-sync-esp-seqno enable" under phase1.
Hello again,
In my initial configuration I did include the set session-pickup enable command under the HA settings, and the tunnels also have set ha-sync-esp-seqno enable configured. However, I haven’t been able to find the “udp-connectionless sync” option — I’m not sure where that specific setting should be applied. Additionally, under HA I currently have the following parameters set:
set session-pickup-connectionless enable
set session-pickup-expectation enable
set session-pickup-delay disable
Thank you very much for your time and assistance.
Regards.
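For reference, the settings discussed so far would sit in the configuration roughly as shown below. This is only a sketch: the tunnel name "to-siteB-wan1" is a placeholder, and each phase1 tunnel would need the same setting. The "udp-connectionless sync" suggestion most likely refers to session-pickup-connectionless, which extends session pickup to UDP and ICMP sessions. Lines starting with # are annotations only.

```
config system ha
    set session-pickup enable
    # Covers connectionless (UDP and ICMP) sessions
    set session-pickup-connectionless enable
    set session-pickup-expectation enable
    set session-pickup-delay disable
end

config vpn ipsec phase1-interface
    # "to-siteB-wan1" is a hypothetical tunnel name
    edit "to-siteB-wan1"
        set ha-sync-esp-seqno enable
    next
end
```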
The fix should be to enable graceful restart under "config router ospf":
set restart-mode graceful-restart
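In context, the setting would be placed as in the sketch below, presumably on both clusters so that each side can act as a graceful-restart helper for the other. The reasoning is an interpretation and is not confirmed in this thread: traffic resuming briefly and then dropping for 20–30 seconds is consistent with the new primary forwarding on synchronized routes until the OSPF adjacency resets and the routes have to be relearned.

```
config router ospf
    set restart-mode graceful-restart
end
```

After a test failover, running "get router info ospf neighbor" on the remote cluster should show whether the adjacency stayed in Full state or had to be rebuilt.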
Hello AEK.
I haven’t configured set restart-mode graceful-restart in OSPF. I’ll review the documentation you shared and test with that configuration to see if it helps the failover process work properly.
Again, thank you for your time and assistance.
Regards.