FortiGate
seshuganesh
Staff
Article Id 204727
Description

This article describes how to distribute IPsec traffic across multiple CPU cores to distribute CPU load and avoid performance issues.

Scope

FortiGate, FortiOS.

Solution

Two key processes take place when traffic is to be sent over an IPsec tunnel: encryption and decryption.

The clear-text packet from the sender is encrypted, and when it arrives at its destination, the encrypted packet has to be decrypted.

 

When the FortiGate that performs the VPN encryption/decryption has multiple CPU cores, the work completes faster when it is shared among the cores than when one core is left to do it alone. This is referred to as packet distribution.

Sometimes a single CPU core ends up decrypting all IPsec traffic. If the data load coming through IPsec is too high, that core's softIRQ usage might reach 99 percent, and performance issues will occur.

 

Distributing packets that are to be encrypted is easier: the packet is still in plain text, so its destination information is visible to the FortiGate.

It is not the same for an IPsec packet that needs to be decrypted. Its destination information is not visible until it is decrypted, which restricts its distribution over multiple CPU cores.

 

One solution to the single-core overload described above is to distribute IPsec traffic across all cores to divide the CPU load.

To distribute IPsec traffic to all cores for decryption, enable the following setting:

 

config system global

    set ipsec-soft-dec-async enable

end

 

ipsec-soft-dec-async enables asynchronous software decryption for IPsec VPN traffic, which uses multiple CPUs to do the decryption.

 

Notes:

  • This feature is disabled by default.
  • With this enabled, IPsec packets that arrive on the same CPU core (because they have the same ESP SPI) can be distributed to multiple CPUs.

 

After running the above configuration, IPsec traffic will be distributed across all CPU cores.

This can improve performance in some environments, as the packets to be decrypted will be distributed over multiple CPU cores.

The setting can easily be turned off again (set ipsec-soft-dec-async disable) if the environment does not benefit from it.

 

One reason an environment might not benefit is packet reordering. IPsec packets vary in size, so a small packet (p2) that CPU Core02 receives after CPU Core01 has received a larger packet (p1) can still finish decryption first. This creates an out-of-order situation: p2 is released into the stream first and p1 follows behind it. Some endpoints are good at re-arranging out-of-order packets; if the receiving endpoint is one of them, this concern is already addressed.
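
The reordering effect can be illustrated with a small toy model. This is only a sketch, not how FortiOS schedules decryption: it assumes each packet is decrypted on its own idle core and that decryption time is proportional to packet size. The function name, the time unit, and the packet sizes are hypothetical.

```python
def release_order(packets, cost_per_byte=1):
    """Return packet names in the order their decryption finishes.

    packets: list of (name, arrival_time, size_bytes).
    Assumption: every packet gets its own idle core, and decryption
    takes size_bytes * cost_per_byte time units.
    """
    finished = []
    for name, arrival, size in packets:
        done = arrival + size * cost_per_byte  # completion time on its core
        finished.append((done, name))
    # Packets are released in completion order, not arrival order.
    return [name for _, name in sorted(finished)]


# p1 is large and arrives first; p2 is small and arrives slightly later.
packets = [("p1", 0, 1400), ("p2", 100, 64)]
print(release_order(packets))
```

In this model p2 finishes at time 164 while p1 finishes at time 1400, so p2 leaves first even though it arrived later. With equally sized packets the arrival order is preserved.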

 

To verify the feature, run:

 

diagnose vpn ipsec cpu

 

Sample output from the command with ipsec-soft-dec-async enabled. After the setting is enabled, IPsec traffic is distributed across all CPU cores.

   

FGT-A # diagnose vpn ipsec cpu
Software crypto CPU distributions:
CPU# enc dec-in dec dec-out
0 1398472314 537189751 537189751 537189751
1 1354792811 221738182 221738182 221738182
2 1332865254 4859445469 4859445469 4859445469
3 1318542297 722261884 722261884 722261884
4 1370775656 2652981225 2652981225 2652981225
5 1331927721 482841573 482841573 482841573
6 1447120469 1336921910 1336921910 1336921909
7 1459317941 110895121 110895121 110895121
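
To judge how evenly the work is spread, the counters above can be converted into per-core percentages. The following is a minimal sketch that assumes the command output has been saved as text. The helper names are hypothetical, and the script is an offline convenience, not a Fortinet tool.

```python
SAMPLE = """\
Software crypto CPU distributions:
CPU# enc dec-in dec dec-out
0 1398472314 537189751 537189751 537189751
1 1354792811 221738182 221738182 221738182
2 1332865254 4859445469 4859445469 4859445469
3 1318542297 722261884 722261884 722261884
4 1370775656 2652981225 2652981225 2652981225
5 1331927721 482841573 482841573 482841573
6 1447120469 1336921910 1336921910 1336921909
7 1459317941 110895121 110895121 110895121
"""


def parse_cpu_table(text):
    """Parse 'diagnose vpn ipsec cpu' output into {cpu: {column: count}}."""
    rows, header = {}, None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "CPU#":
            header = parts[1:]  # column names: enc, dec-in, dec, dec-out
        elif header and parts[0].isdigit() and len(parts) == len(header) + 1:
            rows[int(parts[0])] = dict(zip(header, map(int, parts[1:])))
    return rows


def share_percent(rows, column):
    """Return each core's share of one counter column, in percent."""
    total = sum(r[column] for r in rows.values())
    return {cpu: 100.0 * r[column] / total for cpu, r in rows.items()}


rows = parse_cpu_table(SAMPLE)
for cpu, pct in share_percent(rows, "dec").items():
    print(f"CPU{cpu}: {pct:.1f}% of dec")
```

Every core shows a non-zero dec counter in the sample, although the shares are not equal. Comparing two snapshots taken some time apart would show the current distribution more reliably than a single reading, assuming the counters are running totals.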


Note:

From v7.4.2 onward, the set ipsec-soft-dec-async command has been removed and is no longer available. Controlling load distribution on IPsec tunnels is still possible via: IPsec support for round robin and RPS distribution