Created on 01-13-2021 09:56 AM
Edited on 10-03-2025 06:40 AM
By Jean-Philippe_P
Description
This article describes how to change HA heartbeat timers to prevent false or unwanted failover from occurring.
Scope
FortiGate, HA clusters.
Solution
If a cluster unit CPU in an HA cluster becomes very busy, the cluster unit may not be able to send heartbeat packets before the heartbeat timer elapses.
When heartbeat packets are not sent in time, the cluster may experience a failover as other units report that the busy cluster unit did not respond.
A cluster unit CPU may become very busy if the cluster is subject to a SYN flood attack, if network traffic is very heavy, or for other similar reasons.
Use the following configuration parameters to tune the timers for HA heartbeat packets:
hb-lost-threshold <threshold_integer>
The lost heartbeat threshold is the number of consecutive heartbeat packets that must not be received from another cluster unit before the unit is assumed to have failed.
The default value is 6, although this can differ by model (for example, on VM models the default can be 20). A value of 6 means that if 6 consecutive heartbeat packets are not received from a cluster unit, that cluster unit is considered to have failed. The range is 1 to 60 packets.
If the primary cluster unit does not receive a heartbeat packet from a subordinate unit before the heartbeat threshold expires, the primary unit assumes that the subordinate unit has failed.
The same occurs in reverse if the subordinate unit does not receive a heartbeat packet from the primary unit, which causes the subordinate unit to begin negotiating to become the new primary unit.
The lower the lost heartbeat threshold, the faster the cluster responds to a failure. However, the lost heartbeat threshold can be increased if repeated failovers occur because cluster units cannot send heartbeat packets quickly enough.
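As an illustration, a cluster that fails over repeatedly under CPU load could raise the threshold as sketched below. The value 12 is an arbitrary example within the documented range of 1 to 60, not a recommendation. With the default heartbeat interval of 2 (200 ms), it would extend the failure detection time to 12 × 200 ms = 2.4 seconds.

```
config system ha
    set hb-lost-threshold 12
end
```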
hello-holddown <holddown_integer>
The hello state hold-down time is the number of seconds that a cluster unit waits before changing from 'hello' state to 'work' state. A cluster unit changes from the hello state to the work state when it starts up.
The hello state hold-down time range is 5 to 300 seconds. The hello state hold-down time default is 20 seconds.
hb-interval <interval_integer>
The heartbeat interval is the time between sending heartbeat packets.
The heartbeat interval range is 1 to 20, in units of 100 ms. The heartbeat interval default is 2 (200 ms).
A heartbeat interval of 2 means the time between heartbeat packets is 200 ms. Changing the heartbeat interval to 5 changes the time between heartbeat packets to 500 ms.
HA heartbeat packets consume more bandwidth if the hb-interval is short. However, if the hb-interval is very long, the cluster is not as sensitive to topology and other network changes.
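For example, lengthening the interval to 500 ms could look like the sketch below. The value 5 is illustrative only. If the lost heartbeat threshold stays at the default of 6, a failure would then be detected after 6 × 500 ms = 3 seconds instead of 1.2 seconds.

```
config system ha
    set hb-interval 5
end
```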
An example configuration is provided below. If VDOMs are enabled, run it from the global context:
config system ha
    set hb-lost-threshold 6
    set hello-holddown 20
    set hb-interval 2
end
In this configuration example, if a unit does not receive 6 consecutive heartbeat packets (6 × 200 ms = 1.2 seconds) from another cluster unit, that cluster unit is considered to have failed.
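To check which heartbeat values are currently in effect, including defaults that a plain show omits, the HA settings can be listed and filtered. This is a minimal sketch that assumes the CLI grep filter is available on the firmware in use. The exact output varies by model and version, so none is shown here.

```
show full-configuration system ha | grep hb-
```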
Note: Some configurations may contain the setting set hb-interval-in-milliseconds 100ms, which can be too strict. Unset this command and use set hb-interval 2 instead, because if the millisecond interval remains active, the hb-interval setting will not take effect.
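The note above can be sketched as the following sequence, which uses only the two commands it names. Whether unset is accepted for this option may depend on the firmware version, so treat this as an illustration rather than a verified procedure.

```
config system ha
    unset hb-interval-in-milliseconds
    set hb-interval 2
end
```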
Copyright 2025 Fortinet, Inc. All Rights Reserved.