nprakash
Staff
Article Id 199493
Description This article describes how to resolve ESP traffic being dropped due to a PBA leak.
Scope FortiOS.

Solution

In some situations, clear-text or ESP packets in IPsec sessions carry a large amount of layer 2 padding. The NP6 IPsec engine may be unable to process such packets, and the session may be blocked.

In such cases, check the enc/dec counters in the output of the 'diagnose vpn tunnel list name <name>' command:

 

dec:pkts/bytes=1/60, enc:pkts/bytes=1234/150754

 

Generate traffic across the tunnel and run this command multiple times.

Confirm if these counters increment.
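
The comparison between two runs can also be done with a small script. The following is a minimal Python sketch, assuming the counter line has exactly the format shown above; the function names and the second sample line are hypothetical and only illustrate the check.

```python
import re

# Matches the counter line shown in the article, e.g.
# "dec:pkts/bytes=1/60, enc:pkts/bytes=1234/150754"
COUNTER_RE = re.compile(
    r"dec:pkts/bytes=(\d+)/(\d+),\s*enc:pkts/bytes=(\d+)/(\d+)"
)


def parse_counters(output):
    """Return the dec/enc packet and byte counts found in the tunnel output."""
    match = COUNTER_RE.search(output)
    if match is None:
        raise ValueError("no dec/enc counter line found")
    dec_pkts, dec_bytes, enc_pkts, enc_bytes = map(int, match.groups())
    return {"dec_pkts": dec_pkts, "dec_bytes": dec_bytes,
            "enc_pkts": enc_pkts, "enc_bytes": enc_bytes}


def stalled_directions(before, after):
    """List the directions whose packet counter did not increase."""
    stalled = []
    if after["dec_pkts"] <= before["dec_pkts"]:
        stalled.append("dec")
    if after["enc_pkts"] <= before["enc_pkts"]:
        stalled.append("enc")
    return stalled


first = parse_counters("dec:pkts/bytes=1/60, enc:pkts/bytes=1234/150754")
# Hypothetical second sample, taken after generating more traffic.
second = parse_counters("dec:pkts/bytes=1/60, enc:pkts/bytes=1300/158000")
print(stalled_directions(first, second))  # prints ['dec']
```

In this example, the enc counter increases while the dec counter stays at 1, which suggests that traffic is being sent but nothing is being decrypted on this side.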

 

It is also best practice to run a packet capture for ESP traffic on the remote peer.

Confirm whether the remote peer is sending the ESP traffic to its ISP gateway.

Run a packet capture for ESP traffic on the local peer’s WAN interface.

Using Wireshark, confirm whether the traffic from the remote peer was received successfully.

 

More information on decrypting ESP traffic using Wireshark can be found in the following article:
Technical Tip: Decrypt ESP packets

 

To confirm whether the issue is actually with the NP, disable ASIC offloading. The following commands disable NPU offloading on the phase 1 interface:

config vpn ipsec phase1-interface
    edit <phase1-name>
        set npu-offload disable
    next
end

 

Ensure that the firewall policies created for the VPN tunnels have auto-asic offloading disabled.

 

config firewall policy
    edit <policy_id>
        set auto-asic-offload disable
    next
end

 

Note that disabling NPU offloading will cause the tunnel to flap.

If the traffic across the tunnel is critical, it is better to make these changes during a maintenance window.

If the traffic flows as expected after offloading is disabled, re-enable auto-asic-offload and collect the NP debug logs.

The NP6 debug commands are listed below:

 

diagnose npu np6 port-list
diagnose npu np6 dce <id>
diagnose npu np6 pdq <id>
diagnose npu np6 register <id>
diagnose npu np6 sse-stats <id>

 

** id = 0-3, depending on the hardware model. Run the above commands multiple times, and make sure to run them while the issue is occurring.

 

To confirm if there is a PBA leak:


The output of 'diag npu np6 sse-stats <id>' will have a counter 'PBA', for example:

 

diagnose npu np6 sse-stats 2
Counters        SSE0            SSE1            Total
--------------- --------------- --------------- ---------------
active          6684            6631            13315
insert-total    1020723648      1019924808      2040648456
insert-success  1020723648      1019924808      2040648456
delete-total    1020716964      1019918177      2040635141
delete-success  1020716963      1019918176      2040635139
purge-total     0               0               0
purge-success   0               0               0
search-total    1177004741      2620746558      3797751299
search-hit      1481226345      3765280856      5246507201
mcast-tx        0               0               0
--------------- --------------- --------------- ---------------
pht-size        8421376         8421376
oft-size        8355840         8355840
oftfree         8355837         8355837
PBA             3002
drv-drift       0

 

This value should be less than or equal to 3001.
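
When the output is saved to a log file, the PBA check can be scripted. The following is a minimal Python sketch, assuming the 'PBA' row appears on its own line as in the sample output; the function names are hypothetical, and the limit of 3001 is simply the value stated in this article, passed in as a parameter rather than treated as a hardware constant.

```python
import re

# Limit stated in the article; kept as a parameter so it can be adjusted.
PBA_LIMIT = 3001


def read_pba(sse_stats_output):
    """Return the PBA counter from sse-stats output, or None if it is absent."""
    match = re.search(r"^\s*PBA\s+(\d+)", sse_stats_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def pba_exceeds_limit(sse_stats_output, limit=PBA_LIMIT):
    """Return True when the PBA counter is above the given limit."""
    pba = read_pba(sse_stats_output)
    if pba is None:
        raise ValueError("PBA counter not found")
    return pba > limit


# Excerpt of the sample output shown in the article.
sample = """\
oft-size        8355840         8355840
oftfree         8355837         8355837
PBA             3002
drv-drift       0
"""
print(read_pba(sample), pba_exceeds_limit(sample))  # prints 3002 True
```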


Also, when the IPsec engine discards packets, the drop counters rise in the output of the 'diagnose npu np6 dce 0' command:

LAB1 # diagnose npu np6 dce 0
IHP0_PKTCHK :0000000000000485 [5a] IHP2_PKTCHK :0000000000000437 [5c]
XHP1_PKTCHK :0000000000000054 [5f] IPSEC1_ENGINB0 :0000000000122312 [89]
IPSEC1_ENGINB1 :0000000001129384 [8a] IPSEC1_ENGINB2 :0000000000028093 [8b]
IPSEC1_ENGINB3 :0000000000004890 [8c] IPSEC1_ENGINB4 :0000000000002929 [8d]
IPSEC1_ENGINB5 :0000000000002189 [8e] IPSEC1_ENGINB6 :0000000000000129 [8f]
IPSEC1_ENGINB7 :0000000000000020 [90] PDQ_ISW_SSE0 :0000000000000304 [97]
PDQ_ISW_SSE1 :0000000000000172 [98] PDQ_OSW_EHP0 :0000000002168054 [a1]
PDQ_OSW_EHP2 :0000000001274788 [a3] PDQ_OSW_IPSEC0O :0000000000035009 [a6]
PDQ_OSW_IPSEC1O :0000000000000026 [a8] PDQ_OSW_HRX0 :0000000000033882 [ae]
PDQ_OSW_HRX1 :0000000000029580 [af]

LAB1 # diagnose npu np6 dce 0
IPSEC1_ENGINB0 :0000000000000003 [89] IPSEC1_ENGINB1 :0000000000000019 [8a]

LAB1 # diagnose npu np6 dce 0
IPSEC1_ENGINB1 :0000000000000015 [8a]

LAB1 # diagnose npu np6 dce 0
IPSEC1_ENGINB0 :0000000000000002 [89] IPSEC1_ENGINB1 :0000000000000033 [8a]
IPSEC1_ENGINB2 :0000000000000002 [8b]

LAB1 # diagnose npu np6 dce 0
IPSEC1_ENGINB1 :0000000000000013 [8a]

 

When the IPsec engine cannot process the traffic and perform the encryption, it discards the session, which raises the IPSEC1_ENGINBx counter.
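
To pick the IPsec engine counters out of saved 'dce' output, a short parser is enough. The following is a minimal Python sketch, assuming each entry has the form 'NAME :VALUE [index]' as in the output above and that the values are decimal (an assumption; the article does not state the number base). The function names are hypothetical.

```python
import re

# One entry looks like "IPSEC1_ENGINB1 :0000000000000033 [8a]".
ENTRY_RE = re.compile(r"(\w+)\s*:(\d+)\s*\[[0-9a-fA-F]+\]")
IPSEC_ENGINE_RE = re.compile(r"IPSEC\d+_ENGINB\d+")


def parse_dce(output):
    """Map every counter name in the dce output to its value.

    Assumption: the counter values are decimal digits.
    """
    return {name: int(value) for name, value in ENTRY_RE.findall(output)}


def ipsec_engine_drops(counters):
    """Keep only the IPSECx_ENGINBy counters."""
    return {name: value for name, value in counters.items()
            if IPSEC_ENGINE_RE.fullmatch(name)}


# One of the sample outputs shown in the article.
sample = """\
IPSEC1_ENGINB0 :0000000000000002 [89] IPSEC1_ENGINB1 :0000000000000033 [8a]
IPSEC1_ENGINB2 :0000000000000002 [8b]
"""
drops = ipsec_engine_drops(parse_dce(sample))
print(drops, sum(drops.values()))
```

For the sample above, the three IPSEC1_ENGINBx counters add up to 37 dropped packets.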


To fix the PBA leak issue:

 

Enable the following settings to fix the issue.

 

config system npu
    set strip-esp-padding enable
    set strip-clear-text-padding enable
end

 

After this, reboot the unit. In an HA cluster, make sure to reboot all units in the cluster.

 

Workaround for this issue:

 

  • Disabling NPU offloading on the IPsec tunnel resolves the issue.
  • Rebooting the firewall temporarily resolves the issue; however, it returns within a couple of weeks.
  • Failing over to the secondary unit temporarily resolves the issue.

 

The issue might be related to offloading. Check whether the NP6 is dropping packets using the following debug command:

 

diagnose npu np6 dce <id>

 

This command displays the number of dropped packets for the selected NP6 processor. For example:


IHP1_PKTCHK: number of dropped IP packets.
IPSEC0_ENGINB0: number of dropped IPsec packets.


Related document:
diagnose npu np6 dce (number of dropped NP6 packets)

 

If there are drops, verify by disabling NPU offloading. Note that disabling NPU offloading will cause the tunnel to flap.

If the traffic across the tunnel is critical, it is best to make these changes during a maintenance window or after hours.
Important: With offloading disabled, the traffic is handled by the CPU.

 

  • Disable NPU offloading on phase 1 and auto-asic-offload on both firewall policies on both FortiGate units (local and remote):

 

config firewall policy <----- For the internal-to-VPN and VPN-to-internal firewall policies.
    edit <policy_id>
        set auto-asic-offload disable
    next
end

 

config vpn ipsec phase1-interface
    edit <phase1-name>
        set npu-offload disable
    next
end

 

After disabling offload, if the traffic flow is working as expected, enable offload and collect NP debug logs for further investigation and attach them to a TAC ticket.

 

diagnose npu np6 port-list
diagnose npu np6 ipsec-stats
diagnose npu np6 dce-all <----- dce-all shows all sub-engine drop counters.
diagnose npu np6 dce <id>
diagnose npu np6 register <id>
diagnose npu np6 sse-stats <id>


** id = 0-3, depending on the hardware model. Run the above commands multiple times, and make sure to run them while the issue is occurring.
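
Because the commands have to be run several times and once per NP6 processor, it can help to generate the full list up front and paste it into a logged CLI session. The following is a minimal Python sketch that only builds text from the commands listed above, written with the full 'diagnose' keyword. The function name, the number of rounds, and the choice of ids are assumptions for illustration; the script does not connect to a FortiGate.

```python
# Commands from the list above that take no NP6 id.
GLOBAL_COMMANDS = [
    "diagnose npu np6 port-list",
    "diagnose npu np6 ipsec-stats",
    "diagnose npu np6 dce-all",
]
# Commands from the list above that take an NP6 id.
PER_NP_COMMANDS = [
    "diagnose npu np6 dce {id}",
    "diagnose npu np6 register {id}",
    "diagnose npu np6 sse-stats {id}",
]


def build_batch(np_ids, rounds=3):
    """Return the commands to run, repeated `rounds` times for every NP6 id."""
    lines = []
    for _ in range(rounds):
        lines.extend(GLOBAL_COMMANDS)
        for np_id in np_ids:
            lines.extend(cmd.format(id=np_id) for cmd in PER_NP_COMMANDS)
    return lines


# Hypothetical unit with two NP6 processors (ids 0 and 1), two rounds.
batch = build_batch(np_ids=[0, 1], rounds=2)
print("\n".join(batch))
```

The valid ids depend on the hardware model, so adjust np_ids to match the unit.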

 

  • Check whether UTM is enabled on the VPN policies and, if so, disable it. If the traffic flows as expected after UTM is disabled, re-enable UTM and open a ticket with Fortinet.
  • Check the global settings with 'show full-configuration system global'.
  • If ipsec-hmac-offload is enabled, disable it on both peer firewalls and test (this can impact CPU usage).
    Important: If it is disabled, packets are processed by the CPU. If the traffic flows as expected after ipsec-hmac-offload is disabled, re-enable it and open a TAC ticket with Fortinet.

 

If the issue persists, run the following command on the originating FortiGate using PuTTY, with PuTTY logging enabled, while performing a large file transfer (for example, over SMB), and then create a FortiCare ticket:


diag sniff pack any "host x.x.x.x and port 445" 6 0 a <----- Port 445 for SMB and x.x.x.x is the host IP behind the remote firewall.

 

Also attach the results of iPerf tests, as described in the following KB article:

Technical Tip: Use cases for the diagnose traffictest command