FortiGate
wcruvinel
Staff
Article Id 393792
Description

This article describes the challenges of link failure detection on LACP (802.3ad) interfaces in FortiGate environments and proposes methods to accelerate HA failover in such scenarios.

Unlike physical interfaces, which rely on immediate electrical signaling to detect link loss, LACP interfaces operate as logical bundles and depend on LACPDU heartbeat messages exchanged at timed intervals.
This mechanism introduces delays in detecting link failures:
  • Slow mode (default): heartbeats every 30 seconds (failure detection can take up to 90 seconds, that is, three missed heartbeats)
  • Fast mode: heartbeats every 1 second (failure detection can take up to 3 seconds)

Because of this protocol-level dependency, failover events on LACP links can occur more slowly than with physical interfaces.
The article outlines strategies to mitigate this delay and enhance HA responsiveness.

Scope

FortiGate units in an HA cluster using aggregate (LACP) interfaces on FortiOS 7.x and above.

Solution

Why LACP detects failures more slowly than physical interfaces:

  1. LACP (802.3ad) is a logical interface: Logical interfaces are slower to detect failures than physical interfaces because they rely on protocol-level monitoring and timers rather than immediate electrical link state changes.

  2. Multicast-based protocol: LACP relies on LACPDUs (Link Aggregation Control Protocol Data Units) exchanged via multicast to maintain link status. A failure is recognized only after the expected LACPDUs stop arriving, which introduces latency in failure detection.

  3. Timer dependency: As mentioned in the description, the LACP heartbeat interval is defined by lacp-speed:
  • Slow mode (default): 30 seconds
  • Fast mode: 1 second

Even in fast mode, up to 3 seconds may elapse before a failure is recognized.

  4. Physical ports detect failures instantly: Physical links detect issues electrically and report them immediately to FortiOS, allowing fail-detect mechanisms to trigger HA failover in milliseconds.
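
The lacp-speed timer described in the list above is set on the aggregate interface itself. A minimal sketch, assuming the aggregate is named "agg1" (the name used in the configuration examples in this article) and assuming the connected switch is also configured for the fast LACP rate:

```
config system interface
    edit "agg1"
        # Exchange LACPDUs every 1 second instead of every 30 seconds
        set lacp-speed fast
    next
end
```

Fast mode shortens the detection window, but the timer dependency itself remains, which is why the additional measures below are still proposed.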

 

The following proposal presents methods to improve and accelerate failure detection and failover in scenarios where LACP (802.3ad) and HA are in place.

 

topology.png

 

  1. Enable fail-detect on aggregate and physical interfaces: Enabling fail-detect ensures rapid failover when a critical link, physical or logical, fails. When a monitored interface goes down, the fail-detect mechanism brings the associated fail-alert interface down as well. Reaction time is approximately 2 seconds after an interface goes down and 1 second after it comes back up, as detailed in the Fortinet article "Technical Tip: What is the reaction time of fail-detect". This feature is active only on the primary unit in a high availability (HA) cluster. On the secondary unit, even if monitored interfaces go down, the corresponding fail-alert interfaces remain up.

 

Related documents:

Technical Tip: FortiGate behavior when combining FGCP monitor interface and fail-detect
Introduction to the FGCP cluster


 

In HA environments, this mechanism helps synchronize the failover process properly, avoiding asymmetric routing issues. The secondary unit will take full control of both LAN and Internet traffic until the primary unit recovers.

 

config system interface
    edit "agg1"
        set vdom "root"
        set fail-detect enable
        set fail-alert-method link-down
        set fail-alert-interfaces "x3"              <- Here, the ISP/WAN interface can be attached.
        set type aggregate
        set member "x1" "x2"
    next
    edit "x3"
        set vdom "root"
        set fail-detect enable
        set fail-alert-method link-down
        set fail-alert-interfaces "agg1"            <- Here, the LAN aggregate can be attached.
    next
end

 

  2. Disable LACP-based HA role switching: Prevent unintended HA role transitions by setting lacp-ha-secondary to disable on the aggregate interface, so that secondary decisions are not based on LACP state:

 

 

config system interface
    edit "agg1"
        set lacp-ha-secondary disable
    next
end
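
To confirm how the aggregate is negotiating after this change, the LACP state of the bundle can be inspected from the CLI. A short sketch, assuming the aggregate name "agg1"; the output layout varies between FortiOS versions, so no sample output is shown here:

```
# List all aggregate interfaces known to the unit
diagnose netlink aggregate list

# Show LACP mode, speed and per-member state for one aggregate
diagnose netlink aggregate name agg1
```

Running the same commands on both cluster members makes it easier to compare the state of the primary and secondary units.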

 

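  3. Monitor the member interfaces in HA: In the HA configuration, add the individual member interfaces to the monitored interfaces in addition to the LACP aggregate itself. As discussed, LACP is a logical interface and has slower failure detection than physical interfaces, so monitoring the members lets HA react to a physical link loss directly.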

 

config system ha

    set group-name "Test"
    set mode a-p 
    set override disable  

    set monitor "x1" "x2" "x3" "agg1"

end
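
Once the monitored interfaces are configured, the cluster state can be checked from the CLI. A minimal sketch; the exact fields shown depend on the FortiOS version:

```
# Show cluster members, roles and HA health as seen by this unit
get system ha status

# Show the HA settings currently applied, including the monitor list
show system ha
```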

 

  4. Enable HA link monitoring: Configure link-monitor so that failover is also triggered by loss of Layer 3 connectivity to the ISP gateway or to the Internet.

 

config system link-monitor
    edit "monitor-internet"
        set srcintf "x3"                        <----- Replace with the WAN interface name.
        set server "8.8.8.8" "8.8.4.4"          <----- Trusted public IPs to monitor Internet reachability.
        set interval 5000                       <----- Probe every 5 seconds (the value is in milliseconds on FortiOS 7.x).
        set failtime 3                          <----- Fail after 3 missed probes.
        set recoverytime 5                      <----- Recover after 5 successful probes.
        set ha-priority 5                       <----- HA priority weight used in failover threshold calculation.
    next
    edit "monitor-isp-gateway"
        set srcintf "x3"                        <----- Replace with the WAN interface name.
        set server "12.34.56.78"                <----- Replace with the ISP-provided reliable IP.
        set interval 5000
        set failtime 3
        set recoverytime 5
        set update-cascade-interface disable
        set ha-priority-recovery enable         <----- Allow HA to revert to primary when recovered.
        set ha-priority 5                       <----- HA priority weight used in failover threshold calculation.
    next
end

config system ha
    set group-name "Test"
    set mode a-p
    set override disable                       <----- Disable automatic preemption.
    set pingserver-monitor-interface "x3"      <----- Interface used for link-monitor checks.
    set pingserver-failover-threshold 5        <----- Trigger failover when the priority score reaches this value.
end
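
The ha-priority and pingserver-failover-threshold values work together. In the configuration above, each monitor carries a weight of 5 and the threshold is 5, so the failure of either monitor alone is enough to trigger failover. As a hedged variant that is not part of the original recommendation, and assuming failover is triggered once the summed ha-priority of the failed monitors reaches the threshold, raising the threshold to 10 would require both monitors to fail:

```
config system ha
    # 5 (monitor-internet) + 5 (monitor-isp-gateway) = 10
    set pingserver-failover-threshold 10
end

# Check the current state of the configured link monitors
diagnose sys link-monitor status
```

The stricter threshold trades faster failover for fewer failovers caused by a single unreachable probe target; which value is appropriate depends on how reliable the monitored servers are.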

 

Expected Behavior and Proposed Recommendations:

  1. Failover will occur if a physical interface, the aggregate link, or Layer 3 connectivity fails, covering failure scenarios beyond standard LACP (802.3ad) detection.
  2. HA role decisions will no longer depend on LACP status.
  3. Link-monitor will trigger failover when connectivity to the Internet or to the ISP gateway is lost.
  4. Together, these recommendations help prevent asymmetric routing issues and accelerate HA failover, avoiding the longer detection delays inherent to LACP timers.

 

Related articles:

Technical Tip: Aggregate link configuration topologies in a High Availability cluster

Technical Tip: FortiGate behavior when combining FGCP monitor interface and fail-detect

Technical Tip: Additional time is taken to detect a link-monitor status after HA-failover

Troubleshooting Tip: LACP issue

Technical Tip: Initial troubleshooting steps for LACP (Link Aggregation - 802.3ad)

Technical Tip: FortiGate HA A-P (Active-Passive) cluster connected to a L2 switch with LACP (802.3ad...

Technical Tip: How to set HA ping server threshold

Technical Tip: Link-Monitor Explained