|
Why LACP detects failures more slowly than physical interfaces:
- An LACP (802.3ad) aggregate is a logical interface: Logical interfaces detect failures more slowly than physical interfaces because they rely on protocol-level monitoring and timers rather than on immediate electrical link-state changes.
- Multicast-Based Protocol: LACP maintains link status through LACPDUs (Link Aggregation Control Protocol Data Units) exchanged via multicast. Because a member is declared failed only after the expected LACPDUs stop arriving, this introduces latency in failure detection.
- Timer Dependency: As mentioned in the description, the LACP heartbeat interval is defined by lacp-speed:
  - Slow mode (default): 30 seconds
  - Fast mode: 1 second
  A link is declared failed only after three consecutive LACPDUs are missed, so even in fast mode up to 3 seconds may elapse before a failure is recognized. In slow mode the delay can reach 90 seconds.
- Physical Ports Detect Failures Instantly: Physical links detect issues electrically and report them immediately to FortiOS, allowing fail-detect mechanisms to trigger HA failover in milliseconds.
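Since the LACP detection delay depends on lacp-speed, one option is to switch the aggregate to fast mode. The fragment below is a minimal sketch, assuming the aggregate is named agg1 (the name used in the examples that follow) and that the peer switch supports fast LACP timers. It shortens the LACP timeout but does not remove it.

```
config system interface
    edit "agg1"
        set lacp-speed fast
    next
end
```

Even with this setting, detection on the aggregate remains timer-based, which is why the measures below are still recommended.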
The following proposal presents methods to improve and accelerate failure detection and failover in scenarios where LACP (802.3ad) and HA are in place.

- Enable Fail-Detect on Aggregate and Physical Interfaces: Enabling fail-detect ensures rapid failover when a critical link, physical or logical, fails. When an interface goes down, the fail-detect mechanism triggers the associated fail-alert interface to transition to a down state. Reaction time is approximately 2 seconds after an interface goes down and 1 second after it comes back up, as detailed in the Fortinet article: Technical Tip: What is the reaction time of fail-detect. This feature is active only on the primary unit in a high availability (HA) cluster. On the secondary unit, even if monitored interfaces go down, the corresponding fail-alert interfaces remain up.
Related documents:
Technical Tip: FortiGate behavior when combining FGCP monitor interface and fail-detect
Introduction to the FGCP cluster
In HA environments, this mechanism helps synchronize the failover process properly, avoiding asymmetric routing issues. The secondary unit will take full control of both LAN and Internet traffic until the primary unit recovers.
config system interface
    edit "agg1"
        set vdom "root"
        set fail-detect enable
        set fail-alert-method link-down
        set fail-alert-interfaces "x3" <- Here, the ISP/WAN interface can be attached.
        set type aggregate
        set member "x1" "x2"
    next
    edit "x3"
        set vdom "root"
        set fail-detect enable
        set fail-alert-method link-down
        set fail-alert-interfaces "agg1" <- Here, the LAN Aggregate can be attached.
    next
end
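- Disable LACP-Based HA Role Switching: Prevent unintended HA role transitions by disabling secondary decisions based on LACP state. With lacp-ha-secondary disabled, the aggregate on the secondary unit does not take part in LACP negotiation:
config system interface
    edit "agg1"
        set lacp-ha-secondary disable
    next
end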
- Monitor Member Interfaces in HA: In the HA configuration, add the individual member interfaces to the monitored interfaces in addition to the aggregate interface. As discussed, the aggregate is a logical interface and detects failures more slowly than its physical members.
config system ha
    set group-name "Test"
    set mode a-p
    set override disable
    set monitor "x1" "x2" "x3" "agg1"
end
- Enable HA Link Monitor: Also trigger failover based on Layer 3 connectivity to the ISP gateway and to the Internet.
config system link-monitor
    edit "monitor-internet"
        set srcintf "x3" <----- Replace with the WAN interface name.
        set server "8.8.8.8" "8.8.4.4" <----- Trusted public IPs to monitor Internet reachability.
        set interval 5 <----- Ping every 5 seconds. Some FortiOS releases express this value in milliseconds; in that case use 5000.
        set failtime 3 <----- Fail after 3 missed pings.
        set recoverytime 5 <----- Recover after 5 successful pings.
        set ha-priority 5 <----- HA priority weight used in failover threshold calculation.
    next
    edit "monitor-isp-gateway"
        set srcintf "x3" <----- Replace with the WAN interface name.
        set server "12.34.56.78" <----- Replace with the ISP-provided reliable IP.
        set interval 5
        set failtime 3
        set recoverytime 5
        set update-cascade-interface disable
        set ha-priority-recovery enable <----- Allow HA to revert to primary when recovered.
        set ha-priority 5 <----- HA priority weight used in failover threshold calculation.
    next
end
config system ha
    set group-name "Test"
    set mode a-p
    set override disable <----- Disable automatic preemption.
    set pingserver-monitor-interface "x3" <----- Interface used for link-monitor checks.
    set pingserver-failover-threshold 5 <----- Trigger failover when priority score reaches this value.
end
Expected Behavior and Proposed Recommendations:
- Failover will occur if a physical interface, the aggregate link, or Layer 3 connectivity fails, covering more failure scenarios than standard LACP (802.3ad) detection alone.
- HA role decisions will no longer depend on LACP status.
- Link-Monitor will trigger failover when connectivity to the Internet, or to the ISP gateway itself, is lost.
- Taken together, these recommendations help prevent asymmetric routing and accelerate HA failover by avoiding the longer detection delays inherent to LACP timers.
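One way to check the behavior described above, once the configuration is applied, is with the FortiOS status commands below. This is only a sketch, assuming the aggregate is named agg1 as in the examples. Output format varies by FortiOS version.

```
get system ha status
diagnose netlink aggregate name agg1
diagnose sys link-monitor status
```

The first command shows the current cluster roles, the second shows the LACP state of the aggregate and its members, and the third shows the state of each link-monitor probe. Running them before and after a test failure (for example, disconnecting one member link) helps confirm which mechanism triggered the failover.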
Related articles:
Technical Tip: Aggregate link configuration topologies in a High Availability cluster
Technical Tip: FortiGate behavior when combining FGCP monitor interface and fail-detect
Technical Tip: Additional time is taken to detect a link-monitor status after HA-failover
Troubleshooting Tip: LACP issue
Technical Tip: Initial troubleshooting steps for LACP (Link Aggregation - 802.3ad)
Technical Tip: FortiGate HA A-P (Active-Passive) cluster connected to a L2 switch with LACP (802.3ad...
Technical Tip: How to set HA ping server threshold
Technical Tip: Link-Monitor Explained
|