_mribwan
Staff
Article Id 330440
Description This article describes why an IPsec tunnel appears to flap after a phase 2 rekey.
Scope FortiGate with an NP6 processor. This applies to NP6 only; NP6XLite and NP6Lite processors do not have this caching limitation.
Solution

FortiGate NP6 processors offload IPsec sessions in such a way that, when a new Child SA is created during a rekey, the NPU must flush the session from its fast-path table and re-insert it.

With anti-replay enabled (default), a very small number of in-flight packets can arrive with sequence numbers that the new offloaded SA considers 'already seen', causing a replay-window violation. The NPU silently drops these packets instead of falling back to slow-path/replay-check in software.
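
To illustrate what a replay-window violation means, the following is a minimal Python sketch of a receiver-side sliding-window check in the style of RFC 4303. It is not FortiOS or NP6 code; the class name `ReplayWindow`, the 64-packet window size, and the absence of extended sequence numbers are assumptions made for illustration only.

```python
class ReplayWindow:
    """Illustrative anti-replay window (RFC 4303 style, no ESN). Hypothetical helper."""

    def __init__(self, size=64):
        self.size = size   # window width in packets (assumed value)
        self.top = 0       # highest sequence number accepted so far
        self.bitmap = 0    # bit i set => sequence number (top - i) already seen

    def accept(self, seq):
        """Return True if the packet passes the replay check, False if it is dropped."""
        if seq == 0:
            return False                       # sequence number 0 is never valid
        if seq > self.top:
            # New highest sequence number: slide the window forward.
            shift = seq - self.top
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.top = seq
            return True
        offset = self.top - seq
        if offset >= self.size:
            return False                       # older than the window: dropped
        if self.bitmap & (1 << offset):
            return False                       # already seen: dropped as a replay
        self.bitmap |= 1 << offset             # late but new: accepted
        return True


window = ReplayWindow()
for seq in (1, 2, 3, 2, 100, 36, 37):
    print(seq, "accepted" if window.accept(seq) else "dropped")
```

In this sketch the second arrival of sequence number 2 and the out-of-window packet 36 are dropped. A packet the receiver classifies this way is discarded regardless of its payload, which is why a dropped keepalive is indistinguishable from any other dropped packet.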

 

If a BGP keepalive or TCP ACK is among the dropped packets, the overlay protocol detects a timeout. The tunnel then appears to 'flap' from the overlay's perspective, even though IKE itself reports no problem.

 

To identify the issue, run the following commands while it is occurring:

 

diagnose npu np6 dce <id>

diagnose npu np6 sse-stats <id>

 

Example:


FGT01 # diagnose npu np6 dce 0

FGT01 # diagnose npu np6 sse-stats 0

Counters        SSE0            SSE1            Total           
--------------- --------------- --------------- --------------- 
active          1517            1460            2977            
insert-total    90300944        90303092        180604036       
insert-success  90300944        90303092        180604036       
delete-total    90299427        90301632        180601059       
delete-success  90299427        90301632        180601059       
purge-total     0               0               0               
purge-success   0               0               0               
search-total    3962932969      3894114356      7857047325      
search-hit      3730297129      3585415768      7315712897      
mcast-tx        0               0               0               
--------------- --------------- --------------- --------------- 
pht-size        8421374         8421374         
oft-size        8355838         8355838         
oftfree         8355837         8355835         
PBA             2995            
drv-drift       0  

 

If the PBA value is more than 3001, refer to this KB article: Technical Tip: VPN (ESP) traffic dropped due to NP6 PBA leak.
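
When the output is collected from several units or at intervals, the PBA comparison can be scripted. The following is a minimal Python sketch, assuming the command output has already been captured as text; the function name `parse_pba` and the constant `PBA_THRESHOLD` are hypothetical, and the sample is abbreviated from the output shown above.

```python
import re

PBA_THRESHOLD = 3001  # value quoted in this article


def parse_pba(sse_stats_output):
    """Return the PBA counter from captured sse-stats output, or None if absent."""
    match = re.search(r"^PBA[ \t]+(\d+)[ \t]*$", sse_stats_output, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Abbreviated sample taken from the example output in this article.
sample = """\
oftfree         8355837         8355835
PBA             2995
drv-drift       0
"""

pba = parse_pba(sample)
if pba is None:
    print("PBA counter not found")
elif pba > PBA_THRESHOLD:
    print(f"PBA={pba}: above {PBA_THRESHOLD}, see the PBA leak article")
else:
    print(f"PBA={pba}: not above {PBA_THRESHOLD}")
```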

 

In this case, diagnose npu np6 dce 0 returns no output, so the NPU is not reporting dropped ESP packets, and the PBA value is lower than 3001, yet the tunnel still flaps. Next, check the VPN Events log and the Router Events log.

 

VPN Phase 2 rekey:

 

date=2024-05-21 time=15:35:41 eventtime=1719387341790962634 tz="+0800" logid="0101037129" type="event" subtype="vpn" level="notice" vd="root" logdesc="Progress IPsec phase 2" msg="progress IPsec phase 2" action="negotiate" remip=X.X.X.X locip=Y.Y.Y.Y remport=4500 locport=4500 outintf="port5" cookies="ce4eb0c7fbe1adb5/3adc9be4f3d96fdb" user="17.159.100.5" group="N/A" useralt="N/A" xauthuser="N/A" xauthgroup="N/A" assignip=N/A vpntunnel="SA-PJ" status="success" init="remote" exch="CREATE_CHILD" dir="outbound" role="responder" result="DONE" version="IKEv2" advpnsc=0
date=2024-05-21 time=15:35:41 eventtime=1719387341790919606 tz="+0800" logid="0101037133" type="event" subtype="vpn" level="notice" vd="root" logdesc="IPsec SA installed" msg="install IPsec SA" action="install_sa" remip=X.X.X.X locip=Y.Y.Y.Y remport=4500 locport=4500 outintf="port5" cookies="ce4eb0c7fbe1adb5/3adc9be4f3d96fdb" user="17.159.100.5" group="N/A" useralt="N/A" xauthuser="N/A" xauthgroup="N/A" assignip=N/A vpntunnel="SA-PJ" role="responder" in_spi="134b7e12"out_spi="ac3b8e53" advpnsc=0
date=2024-05-21 time=15:35:41 eventtime=1719387341790807018 tz="+0800" logid="0101037122" type="event" subtype="vpn" level="notice" vd="root" logdesc="Negotiate IPsec phase 2" msg="negotiate IPsec phase 2" action="negotiate" remip=X.X.X.X locip=Y.Y.Y.Y remport=4500 locport=4500 outintf="port5" cookies="ce4eb0c7fbe1adb5/3adc9be4f3d96fdb" user="17.159.100.5" group="N/A" useralt="N/A" xauthuser="N/A" xauthgroup="N/A" assignip=N/A vpntunnel="SA-PJ" status="success" role="responder" esptransform="ESP_AES" espauth="N/A" advpnsc=0
date=2024-05-21 time=15:35:41 eventtime=1719387341790190739 tz="+0800" logid="0101037120" type="event" subtype="vpn" level="notice" vd="root" logdesc="Negotiate IPsec phase 1" msg="negotiate IPsec phase 1" action="negotiate" remip=X.X.X.X locip=Y.Y.Y.Y remport=4500 locport=4500 outintf="port5" cookies="ce4eb0c7fbe1adb5/3adc9be4f3d96fdb" user="17.159.100.5" group="N/A" useralt="N/A" xauthuser="N/A" xauthgroup="N/A" assignip=N/A vpntunnel="SA-PJ" status="success" result="N/A" peer_notif="N/A" advpnsc=0

 

Router Events showing BGP peer Down:


date=2024-05-21 time=15:36:09 eventtime=1719387369810601915 tz="+0800" logid="0103020300" type="event" subtype="router" level="warning" vd="root" logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 10.0.5.12 Down BGP Notification FSM-ERR"
date=2024-05-21 time=15:36:09 eventtime=1719387369810596451 tz="+0800" logid="0103020304" type="event" subtype="router" level="warning" vd="root" logdesc="Routing log warning" msg="BGP: %BGP-3-NOTIFICATION: received from 10.0.5.12 4/0 (Hold Timer Expired/Unspecified Error Subcode) 0 data-bytes []"

 

Notice that the BGP peer went down 28 seconds after the phase 2 rekey (15:35:41 to 15:36:09). This can be attributed to the way NP6 processors cache inbound IPsec SAs: IPsec VPN sessions with anti-replay protection that are terminated by the FortiGate may fail the replay check and be dropped.
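
The correlation between the two logs can be checked by comparing the date and time fields of the relevant entries. The following is a minimal Python sketch, assuming the log lines are available as text; the helper name `log_time` is hypothetical, and the two sample lines are abbreviated to the fields used here.

```python
import re
from datetime import datetime


def log_time(line):
    """Extract the date= and time= fields of a log line as a datetime."""
    m = re.search(r"date=(\d{4}-\d{2}-\d{2}) time=(\d{2}:\d{2}:\d{2})", line)
    if m is None:
        raise ValueError("no date/time fields found")
    return datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M:%S")


# Abbreviated copies of the log entries shown in this article.
rekey = 'date=2024-05-21 time=15:35:41 logdesc="IPsec SA installed"'
bgp_down = 'date=2024-05-21 time=15:36:09 logdesc="BGP neighbor status changed"'

delay = (log_time(bgp_down) - log_time(rekey)).total_seconds()
print(f"BGP neighbor went down {delay:.0f} seconds after the rekey")
```

A BGP down event that repeatedly follows a phase 2 rekey by a short, similar interval points toward this behavior rather than an unrelated path failure.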

 

To keep the performance benefit of NPU offloading on the tunnel, it is recommended to disable anti-replay on the tunnel rather than to disable offloading:

 

config vpn ipsec phase2-interface
    edit SA-PJ
        set replay disable
    next
end

 

Related document:

Supporting IPsec anti-replay protection