Created on 03-27-2025 03:57 AM
Edited on 09-09-2025 12:04 AM
By Jean-Philippe_P
Description |
This article describes the failover-hold-time function for High Availability (HA) FortiGate clusters as well as an expected behavior regarding HA failovers when the following conditions are met:
Important: The failover-hold-time setting is not meant to prevent HA failovers when monitored interfaces are flapping (rapidly going down and coming back online); it is meant to reduce how often failovers occur. The exact mechanism behind this feature is discussed in further detail below. |
Scope | FortiGate. |
Solution |
As a primer, the failover-hold-time setting under config system ha was first added in FortiOS v7.0.0. It specifies how long, in seconds, the Primary FortiGate waits after a monitored interface changes state (either going down OR coming back up) before it potentially triggers an HA failover. See also: Technical Tip: How to configure HA failover delay for monitored ports.
However, it is important to understand that two separate actions occur whenever an HA monitored interface changes state, and these actions are triggered after the failover-hold-time expires following the most recent change in interface state:
Additionally, as per the HA primary unit selection criteria documentation, the default process for electing HA Primary units is based on Monitored Interfaces -> Uptime -> Priority -> Serial Number, but the order can be changed to Monitored Interfaces -> Priority -> Uptime -> Serial Number if override is enabled for one or more of the FortiGate cluster members. What this ultimately means is that the failover-hold-time setting can produce different effects depending on whether or not override has been enabled for one of the cluster members.
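The two election orders described above can be sketched in Python as a pair of sort keys. This is an illustrative model only, not FortiOS code: the Member class, its field names, and the sample units FGT-A and FGT-B are hypothetical, the sketch assumes that the larger value wins at every step (including the serial number tie-break), and it ignores any tolerance the real election may apply to small uptime differences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical model of one cluster member (not a FortiOS structure)."""
    serial: str
    monitored_up: int  # number of monitored interfaces currently up
    uptime_s: int      # HA uptime in seconds
    priority: int      # configured HA priority


def election_key(member: Member, override: bool) -> tuple:
    """Sort key for the election; the member with the largest key wins."""
    if override:
        # Monitored Interfaces -> Priority -> Uptime -> Serial Number
        return (member.monitored_up, member.priority, member.uptime_s, member.serial)
    # Monitored Interfaces -> Uptime -> Priority -> Serial Number
    return (member.monitored_up, member.uptime_s, member.priority, member.serial)


def elect_primary(members: list[Member], override: bool) -> Member:
    return max(members, key=lambda m: election_key(m, override))


# FGT-A has the higher priority but a much shorter HA uptime than FGT-B.
fgt_a = Member(serial="FGT-A", monitored_up=2, uptime_s=10, priority=200)
fgt_b = Member(serial="FGT-B", monitored_up=2, uptime_s=86_400, priority=100)

print(elect_primary([fgt_a, fgt_b], override=False).serial)  # FGT-B
print(elect_primary([fgt_a, fgt_b], override=True).serial)   # FGT-A
```

With identical inputs, the only thing that changes the winner is the position of priority relative to uptime in the key, which is why the same hold timer can end in different outcomes.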
Consider the following scenario where a) failover-hold-time is set to 5 seconds, b) a monitored interface on the Primary FortiGate goes down and then recovers within 2 seconds, and c) the Primary has a superior/higher priority to the Secondary:
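A minimal Python sketch of this scenario follows. It is a simplified model rather than FortiOS behavior taken from source code: the function names and the priority and uptime values are hypothetical, and the sketch assumes that the interface flap reset the original Primary's HA uptime, so that its uptime is far lower than the Secondary's when the assessment runs.

```python
def assessment_time(state_change_times_s, hold_time_s):
    """Return when the HA assessment runs: hold_time_s after the latest state change."""
    return max(state_change_times_s) + hold_time_s


def role_holder_after_flap(override, primary_priority=200, secondary_priority=100):
    # Assumption: the flap reset the original Primary's HA uptime, so at
    # assessment time it is far below the Secondary's.
    primary_uptime_s, secondary_uptime_s = 7, 86_400
    # Both units have all monitored interfaces up again, so that criterion
    # ties and is left out of the comparison.
    if override:
        primary_key = (primary_priority, primary_uptime_s)
        secondary_key = (secondary_priority, secondary_uptime_s)
    else:
        primary_key = (primary_uptime_s, primary_priority)
        secondary_key = (secondary_uptime_s, secondary_priority)
    return "original Primary" if primary_key > secondary_key else "original Secondary"


# Interface goes down at t=0 s and recovers at t=2 s; failover-hold-time is 5 s.
print(assessment_time([0, 2], hold_time_s=5))  # 7
print(role_holder_after_flap(override=False))  # original Secondary
print(role_holder_after_flap(override=True))   # original Primary
```

Under these assumptions the assessment runs 5 seconds after the recovery, not after the initial failure, and only the override case leaves the role with the original Primary.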
Example Scenario:
The example below further demonstrates how the FortiGate behaves when a monitored interface goes down and then recovers before the failover hold timer expires. For reference, the following HA configuration has override disabled and failover-hold-time set to 60 seconds:
config system ha
    set group-id 2
end
In the following System Event logs, the monitored interface can be seen going down and then recovering within 17 seconds:
date="2025-02-06" time="14:03:45" id=7468291105147062624 bid=48863959 dvid=107 itime=1738847025 euid=3 epid=3 dsteuid=3 dstepid=3 logver=700140601 logid="0100020099" type="event" subtype="system" level="warning" action="interface-stat-change" msg="Link monitor: Interface LACP-1 was turned down" logdesc="Interface status changed" status="DOWN" eventtime=1738847025636361195 tz="+0100" devid="FG100XXX" vd="root" devname="FR530"
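When many such events need to be reviewed, the key="value" layout of the log line can be parsed with a few lines of Python. The sketch below is one possible approach, not a Fortinet tool: parse_event_log is a hypothetical helper, and the sample string is an abridged copy of the log entry above.

```python
import shlex


def parse_event_log(line: str) -> dict[str, str]:
    """Split a key="value" log line into a dictionary of fields."""
    fields = {}
    # shlex.split keeps quoted values that contain spaces in a single token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Abridged copy of the System Event log entry shown above.
sample = (
    'date="2025-02-06" time="14:03:45" logid="0100020099" type="event" '
    'subtype="system" level="warning" action="interface-stat-change" '
    'msg="Link monitor: Interface LACP-1 was turned down" status="DOWN"'
)

event = parse_event_log(sample)
print(event["date"], event["time"], event["action"], event["status"])
# 2025-02-06 14:03:45 interface-stat-change DOWN
```

Filtering the parsed entries on action and status makes it easier to line up each DOWN and UP event against the election entries in the HA history.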
Likewise, the output of diagnose sys ha history read shows similar information. Note that a failover was triggered 60 seconds after the monitored interface came back online. This is expected, because any state change on a monitored interface (going down or coming back online) restarts the failover-hold-time timer.
<2025-02-06 14:05:03> FG100FTKXXXX is elected as the cluster primary of 2 member <---- Failover happens exactly 1 minute after the monitored interface comes up.
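The timing can be checked with simple arithmetic on the timestamps shown above. The short Python sketch below uses only the interface-down time from the System Event log, the election time from the HA history, and the configured 60-second hold time; the variable names are illustrative.

```python
from datetime import datetime, timedelta

HOLD_TIME = timedelta(seconds=60)                  # configured failover-hold-time
interface_down = datetime(2025, 2, 6, 14, 3, 45)   # from the System Event log
failover = datetime(2025, 2, 6, 14, 5, 3)          # from the HA history

# Working backwards, the timer that expired at the failover was started here.
last_state_change = failover - HOLD_TIME
# If the recovery had not restarted the timer, it would have expired here.
expiry_if_not_restarted = interface_down + HOLD_TIME

print(last_state_change.time())                               # 14:04:03
print(expiry_if_not_restarted.time())                         # 14:04:45
print((last_state_change - interface_down).total_seconds())   # 18.0
```

The election at 14:05:03 is 18 seconds later than a timer started by the down event alone would allow, which matches the recovery time described above once whole-second timestamps are taken into account. This is consistent with the hold timer being restarted when the interface came back up.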
Conclusion: As per the above demonstration, failover-hold-time is not meant to directly prevent failovers when a monitored interface goes down and then recovers quickly (also known as 'interface flapping'). Instead, it only adds a delay so that the HA cluster does not constantly perform a cluster election assessment. To fully prevent HA failovers caused by monitored interface flapping, the override setting must also be enabled so that the Primary FortiGate can retain the HA primary role after the failover-hold-time expires (even though the override setting and failover-hold-time are not directly related to one another). |