edyrmishi
Staff
Article Id 408336
Description This article describes a scenario in which an HA split-brain causes an IPsec tunnel to flap, with repeated rekeys and ESP SPI mismatches.
Scope FortiGate HA.
Solution

If repeated 'Received ESP packet with unknown SPI' entries are observed in the event log, one possible cause to verify is a broken HA cluster state (split-brain), which leads to independent rekey events and mismatched SPIs.

 

The most common symptoms to look for:

  • Frequent IPsec tunnel flapping (tunnel up/down intermittently).
  • Event log entries such as: Received ESP packet with unknown SPI.
  • Repeated Dead Peer Detection (DPD) or IKE/ESP-related errors in logs.
  • HA status showing both units acting as primary/active, or the HA heartbeat link reported as down.
  • Tunnel rekeys happen frequently, and endpoints fail to synchronize SA/SPIs.

 

Reasons why it happens:

When the HA heartbeat link fails, the cluster can enter a split-brain state in which both members consider themselves primary. Each unit may then manage SAs and trigger rekeys independently. This matters because an ESP packet is accepted only if its SPI matches an active IPsec Security Association (SA).

 

If an endpoint has rekeyed and changed SPIs while the peer still uses the old SA, the peer will drop incoming ESP packets as 'unknown SPI'. In a split-brain, the SAs held by the two cluster members diverge in exactly this way, producing mismatched keys/SPIs and repeated tunnel teardowns.
Other causes of unknown-SPI errors include PFS/proposal/lifetime mismatches, asymmetric routing, or NAT changes, but split-brain is a common cause in clustered setups.
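
The drop behaviour described above can be illustrated with a small toy model. This is only a sketch and not FortiOS code: the Receiver class, the SPI values 0x1111 and 0x2222, and the member names are all made up for illustration. It assumes the remote peer deletes the old SA as soon as the rekey completes.

```python
# Toy model only: not FortiOS code. All names and SPI values are hypothetical.

class Receiver:
    """Remote peer that tracks the inbound SPIs for which it holds an active SA."""

    def __init__(self):
        self.active_spis = set()

    def install_sa(self, spi):
        self.active_spis.add(spi)

    def remove_sa(self, spi):
        self.active_spis.discard(spi)

    def receive_esp(self, spi):
        # An ESP packet is accepted only if its SPI matches an active SA.
        if spi in self.active_spis:
            return "accepted"
        return "dropped: unknown SPI"


peer = Receiver()
peer.install_sa(0x1111)

# Healthy cluster: both members hold the same synchronized outbound SPI.
member_a_out_spi = 0x1111
member_b_out_spi = 0x1111
before = peer.receive_esp(member_a_out_spi)

# Split-brain: member B rekeys on its own. The peer installs the new SA
# and removes the old one, but member A never learns the new SPI.
peer.install_sa(0x2222)
peer.remove_sa(0x1111)
member_b_out_spi = 0x2222

result_a = peer.receive_esp(member_a_out_spi)
result_b = peer.receive_esp(member_b_out_spi)

print(before, "|", result_a, "|", result_b)
```

In this model, traffic from member A is accepted before the rekey and dropped afterwards, while traffic from member B keeps working. If member A then rekeys in turn, the roles swap, which is consistent with the repeated flapping seen during a split-brain.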

 

To determine if an HA split-brain is the root cause, the following can be checked:

 

  • Reproduce the issue, or capture the HA state and the IPsec SA/SPI values from each member at the same time:

 

get system ha status
diagnose sys ha status

 

get vpn ipsec tunnel summary
diagnose vpn tunnel list name <tunnel-name>

 

diagnose debug reset
diagnose debug application ike -1
diagnose debug enable

Once enough output has been collected, stop the debug with diagnose debug disable.

 

  • If both HA members appear primary/active (or HA shows no synchronization) and SPIs differ between the two HA members or between the site peers, split-brain is very likely the root cause.
  • If, after restoring HA, the SPIs align and the tunnel stabilizes, this confirms the link between HA split-brain and the unknown SPI / flapping behavior.
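
The SPI comparison between the two HA members can be done by eye, but a short script helps when several tunnels are involved. The following is a minimal sketch, assuming the output of diagnose vpn tunnel list has been saved as text from each member and that it contains lines of the form "dec: spi=..." and "enc: spi=..."; the exact output layout can vary between FortiOS versions, so the pattern may need adjusting. The function names and the sample text are hypothetical.

```python
import re

# Assumption: each SA in the saved output is shown with lines such as
#   dec: spi=ef9ca700 esp=aes ...
#   enc: spi=a5e0c0ac esp=aes ...
SPI_RE = re.compile(r"^\s*(dec|enc):\s*spi=([0-9a-fA-F]+)", re.MULTILINE)


def extract_spis(output):
    """Return the set of (direction, spi) pairs found in the captured text."""
    return {(direction, spi.lower()) for direction, spi in SPI_RE.findall(output)}


def spis_match(output_a, output_b):
    """True only if both captures contain SPIs and the sets are identical."""
    spis_a = extract_spis(output_a)
    spis_b = extract_spis(output_b)
    return bool(spis_a) and spis_a == spis_b


# Made-up sample captures, for illustration only.
sample_member_a = """
  dec: spi=ef9ca700 esp=aes key=16
  enc: spi=a5e0c0ac esp=aes key=16
"""
sample_member_b = """
  dec: spi=ef9ca7ff esp=aes key=16
  enc: spi=a5e0c0ac esp=aes key=16
"""

if not spis_match(sample_member_a, sample_member_b):
    only_a = extract_spis(sample_member_a) - extract_spis(sample_member_b)
    only_b = extract_spis(sample_member_b) - extract_spis(sample_member_a)
    print("SPI mismatch between members:", sorted(only_a), sorted(only_b))
```

A mismatch reported by a script like this is only a hint. It should be read together with the HA status output, since a capture taken in the middle of a normal rekey can also show different SPIs for a short time.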

 

Prevention and Best Practices:

  • Ensure the physical connectivity of the HA heartbeat ports is robust (secure cabling and, where possible, redundant heartbeat links).
  • Monitor HA link health and configure alerts for HA link down or cluster split events.
  • When carrying out hardware replacements or link upgrades, verify HA synchronization and cluster health before allowing production traffic.
  • Configure consistent DPD/lifetime settings on both peers so rekey timing is predictable. DPD helps detect stale peers but does not fix HA split-brain.