FortiArt
Staff
Article Id 378884
Description This article presents a possible root cause for instability in an HA cluster configured with monitored interface(s) triggering repeated failovers.
Scope FortiGate.
Solution

Introduction:

When a monitored interface in an HA cluster goes down, it triggers a failover among the cluster members. If the monitored interface flaps up and down, it triggers frequent failovers, which makes the cluster unstable. This may in turn affect system resources such as memory and CPU, especially if the session-pickup setting is enabled.

 

Scenario:

Here, it is assumed that the FortiGate is configured as follows. A system link-monitor is configured to use wan1 to ping an external server, for example, 8.8.8.8.

 

config system link-monitor
    edit "wan1-ping-server"
        set srcintf "wan1"
        set server "8.8.8.8"
        set update-cascade-interface enable
        set update-static-route enable
    next
end

 

Checking the system link-monitor (diagnose sys link-monitor status) shows that its status is flapping between alive and dead. This indicates a reachability problem, which may be caused by an ISP issue or by a routing problem on an intermediate router in the path to the destination.
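
As a rough illustration of what "flapping" means here, the following Python sketch counts state changes in a series of link-monitor status readings. The function names, the threshold of 3, and the sample readings are all hypothetical; they are not taken from FortiOS output, and the readings are assumed to have been noted down from repeated status checks.

```python
def count_transitions(samples):
    """Count how many times consecutive status readings differ."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def is_flapping(samples, threshold=3):
    """Treat the monitor as flapping once it has changed state
    `threshold` times or more (the threshold is an arbitrary assumption)."""
    return count_transitions(samples) >= threshold


# Hypothetical readings, oldest first.
samples = ["alive", "dead", "alive", "alive", "dead", "alive"]
print(count_transitions(samples))  # 4
print(is_flapping(samples))        # True
```

A single alive-to-dead change points to an outage, while several changes in a short window point to the unstable reachability described above.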

 

The system HA cluster is configured as per the following (port1 is the monitored interface):

 

config system ha
    set group-name "FGT-HA"
    set mode a-p
    set monitor "port1"
end

 

The next step is to relate the flapping of the system link-monitor on wan1 to the repeated failovers in the HA cluster, which monitors port1.

 

Root Cause:

Check the system interface settings of the source interface used by the system link-monitor, i.e., wan1. Confirm whether the fail-detect setting is enabled and which interface(s) it acts on through the fail-alert-interfaces setting. In the following configuration, wan1 has fail-detect enabled with port1 as its fail-alert interface, so each time the link-monitor server is detected as unreachable, port1 is brought down. Because port1 is the monitored interface in the HA configuration, the flapping link-monitor is the source of the problem: every flap is passed on to port1 and triggers an HA failover.

 

config system interface
    edit "wan1"
        set ip 192.168.1.254 255.255.255.0
        set fail-detect enable
        set fail-detect-option detectserver link-down
        set fail-alert-method link-down
        set fail-alert-interfaces "port1"
    next
end
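
The same cross-check can be sketched as a small Python script that scans configuration text for interfaces whose fail-alert-interfaces include an HA monitored interface. This is only a minimal sketch: parse_sets and ha_monitor_conflicts are hypothetical helper names, the parser handles only flat config/edit/set/next/end blocks like the ones in this article, and the configuration text is assumed to come from a saved copy of the configuration.

```python
import shlex


def parse_sets(config_text):
    """Map (section, entry) -> {option: [values]} for every 'set' line.

    Handles only flat 'config / edit / set / next / end' blocks,
    not nested config sections.
    """
    settings = {}
    section, entry = None, None
    for raw in config_text.splitlines():
        tokens = shlex.split(raw)  # strips the double quotes around values
        if not tokens:
            continue
        keyword = tokens[0]
        if keyword == "config":
            section, entry = " ".join(tokens[1:]), None
        elif keyword == "edit" and len(tokens) > 1:
            entry = tokens[1]
        elif keyword == "next":
            entry = None
        elif keyword == "end":
            section, entry = None, None
        elif keyword == "set" and len(tokens) > 1 and section:
            settings.setdefault((section, entry), {})[tokens[1]] = tokens[2:]
    return settings


def ha_monitor_conflicts(config_text):
    """Return (interface, alert_interface) pairs where fail-detect is enabled
    and the fail-alert interface is also an HA monitored interface."""
    settings = parse_sets(config_text)
    monitored = set(settings.get(("system ha", None), {}).get("monitor", []))
    conflicts = []
    for (section, entry), opts in settings.items():
        if section != "system interface":
            continue
        if opts.get("fail-detect") != ["enable"]:
            continue
        for alert in opts.get("fail-alert-interfaces", []):
            if alert in monitored:
                conflicts.append((entry, alert))
    return conflicts


# Configuration text taken from the blocks shown in this article.
SAMPLE = '''
config system interface
    edit "wan1"
        set fail-detect enable
        set fail-detect-option detectserver link-down
        set fail-alert-method link-down
        set fail-alert-interfaces "port1"
    next
end
config system ha
    set group-name "FGT-HA"
    set mode a-p
    set monitor "port1"
end
'''

print(ha_monitor_conflicts(SAMPLE))  # [('wan1', 'port1')]
```

An empty result would suggest that fail-detect is not the link between the link-monitor and the HA monitored interface, in which case other causes should be investigated.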

 

Additionally, the HA monitored interface status can be verified with the 'get system ha status' command by checking the 'mondev' status in the output.

Technical Tip: 'get system ha status' showing warning with 'mondev down' message 

 

Note: There may be other causes that trigger the flapping behavior for the system HA cluster units. This article shows only one possible root cause.

 

Related articles:

Technical Tip: Configuring HA Monitored Interfaces for Failover

Technical Tip: Possible cause of HA monitor interface not triggering HA failover 

Technical Tip: Troubleshooting unexpected High Availability (HA) failover