stroia
Staff
Article Id 414691
Description This article describes why CPU core spikes occur after a configuration change and how to manage the undesirable side effects on a high-end FortiGate with a very large number of firewall policies.
Scope FortiGate and FortiProxy.
Solution

For each FortiGate/FortiProxy model, there are limits for the maximum number of objects, for example, the number of firewall policies.

 

Taking the FortiGate 4200F as an example, the maximum number of configurable firewall policies is 400 000. However, CPU core spikes after a configuration change can already be observed once a few tens of thousands of firewall policies (for example, more than 20 000) are configured.

 

Here is an example of CPU core spikes in a FortiGate, during the first seconds after a firewall policy is enabled:

 

FGT-4200F-01 (global)# diagnose sys top 5 10 3
Run Time: 0 days, 1 hours and 2 minutes
3U, 0N, 2S, 95I, 0WA, 0HI, 0SI, 0ST; 387725T, 360904F
             wad 17633 R 99.0 0.1 8
  cmdbsvr_iprope 18245 R 99.0 0.0 0
 cmdbsvr_cfgsave 18246 R 98.5 0.0 6
            iked 14695 R 75.1 0.1 79
           voipd 14693 R 24.3 0.1 72
        bcm.user 3605 S < 6.4 0.0 36
       ipsengine 17913 S < 0.4 0.0 74
          hatalk 14702 S < 0.4 0.0 30
            newc 18247 R < 0.4 0.0 39
             wad 14694 S 0.0 0.1 32

 

The spikes occur on cores 0, 6, 8, and 79 (the last column of the output shows the CPU core on which each process is running).
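
As a complementary check, per-core usage can be displayed alongside the process list. The following is a minimal sketch, assuming a multi-VDOM unit (as the (global) prompt above suggests) where these commands are run from the global context. The reading of the diagnose sys top arguments (refresh interval in seconds, number of lines, number of iterations) is an assumption to be confirmed in the diagnose sys top article listed under Related documents. Lines starting with # are annotations only.

```
config global
    # Per-core CPU states (overall figures plus one line per core)
    get system performance status
    # Top processes; arguments assumed to be: delay, lines, iterations
    diagnose sys top 5 10 3
end
```

Running both commands right after a configuration change makes it easier to match a busy core to the process in the last column of the diagnose sys top output.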

 

This behavior can be observed in high-end FortiGates with tens of thousands of firewall policies configured.

 

cmdbsvr_iprope and cmdbsvr_cfgsave are the daemons that manage the unit configuration. For each change, they also need to check the consistency of the rest of the configuration, so spikes are expected for them.

 

The spikes in the activity of the wad daemon (which manages traffic proxying) and the iked daemon (which manages all IPsec VPN tunnels) are caused by their need to handle two activities after each configuration change:

 

  • First activity: Read the new configuration to check whether a change requires denying or permitting packet flows.
  • Second activity: Update the entire session table.

 

 

The time required for the second activity can be reduced by instructing the FortiGate to check only new connections, as explained in Technical Tip: Information about firewall-session-dirty.

 

This change to the default FortiGate behavior must be carefully evaluated: with the check-new option, if a firewall policy is urgently added or modified to block previously allowed malicious traffic, FortiGate will not deny or drop that traffic until the existing session that permits it expires.
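
The setting discussed above can be sketched as follows. This is a minimal example, assuming a multi-VDOM unit and a VDOM named root (a placeholder); on a unit without VDOMs, the config vdom and edit lines and the final end are omitted. Lines starting with # are annotations only.

```
config vdom
    edit root
        # Assumption: "root" is a placeholder VDOM name
        config system settings
            # Re-evaluate only new sessions after a policy change
            set firewall-session-dirty check-new
        end
end
```

Setting the value back to check-all in the same place restores the default behavior, in which existing sessions are re-evaluated after a policy change.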

 

The wad peaks can also be observed in FortiProxy, and CPU spikes can also be observed in other daemons, such as voipd.


The duration of the spikes depends on multiple factors:

  1. The number of firewall policies configured.
  2. The quantity of traffic the unit is inspecting.
  3. The FortiOS firmware version running on the unit.
  4. The number of IPsec VPN tunnels configured.

 

Other factors can also contribute. The spikes should last between 15 seconds and a couple of minutes.
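
The factors listed above can be roughly gauged from the CLI. The commands below are a sketch only, assuming they are run in the context of the relevant VDOM on a multi-VDOM unit. The policy count is an approximation, because it simply counts the lines containing "edit " in the policy configuration, and the session statistics are used here only as a rough indicator of the traffic load. Lines starting with # are annotations only.

```
# Factor 1: approximate number of firewall policies (counts "edit" lines)
show firewall policy | grep -c "edit "
# Factor 2: session table statistics, as a rough indicator of traffic load
diagnose sys session stat
# Factor 3: running FortiOS version
get system status
# Factor 4: configured IPsec tunnels
get vpn ipsec tunnel summary
```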

 

Regarding point 3, the most recent FortiOS versions contain several enhancements:

 

  • New Feature 0898200 introduced the decoupling of the cmdb daemon. It is included in FortiOS v7.4.2 GA and all newer releases.
  • The fix for bug 1096537 distributes the cmdbsvr child workers across different CPU cores. It is included in FortiOS v7.4.9 GA, v7.6.4 GA, and all newer releases.
  • The fix for bug 1173177 speeds up the installation of the iprope4/6 tables in the kernel, and the fix for bug 1190688 optimizes the configuration change checks performed by the iked daemon. Both are included in FortiOS v7.4.9 GA, v7.6.5 GA, and all newer releases.
  • Starting from the FortiOS v8.0.0 GA releases, two processes will manage VoIP: the voipd parent may be busy when there is a large number of configuration changes, but traffic is not impacted because it is handled by the voipd worker process.

 

The wad, iked, and voipd peaks can cause different issues, such as:

  • Packet loss for proxied traffic, with the highest impact on real-time applications such as Microsoft Teams.
  • Frequent rekeying of IPsec VPN tunnels, causing flaps in any routing protocols, such as BGP and OSPF, running over them. In SD-WAN deployments, this also causes SD-WAN behavior changes, such as a Performance SLA going down without underlay degradation and traffic being matched intermittently by two different SD-WAN rules.
  • SIP service interruptions, when VoIP traffic is inspected by the SIP ALG or the SIP session helper.
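
To correlate these symptoms with a configuration change, the state of the tunnels, routing adjacencies, and SD-WAN health checks can be inspected right after the change. The commands below are a sketch only, assuming a recent FortiOS version where the SD-WAN diagnostics are available under diagnose sys sdwan (older versions use a different command name) and assuming BGP and OSPF are actually configured on the unit. Lines starting with # are annotations only.

```
# IKE gateways (phase 1) and their current state
diagnose vpn ike gateway list
# Summary of IPsec tunnels
get vpn ipsec tunnel summary
# BGP neighbor state and uptime
get router info bgp summary
# OSPF neighbor state
get router info ospf neighbor
# SD-WAN Performance SLA (health check) state
diagnose sys sdwan health-check
```

Tunnel or neighbor uptimes that reset shortly after each configuration change, without any underlay event, would be consistent with the side effects described above.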

 

Another side effect concerns a push from FortiManager to the FortiGate of a mass rename of firewall addresses (with tens of thousands of firewall policies using them): the activity can become extremely slow because, in addition to updating the configuration, the cmdb daemons need to find and update each object reference.

 

When this activity is performed on a FortiGate running FortiOS 7.4.9 GA, 7.6.5 GA, or newer, the process is still long, but it requires less time because the FortiGate benefits from the fixes mentioned above.

 

For FortiGates in High Availability (FGCP cluster), peaks are observed on all cluster members. Only on the primary unit, however, do they last longer and involve additional daemons beyond the cmdb and HA daemons.
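
To compare the behavior of the cluster members, the same process view can be collected on each unit. The following is a sketch only, assuming the commands are run from the global context of the primary unit; <cluster_index> and <admin_username> are placeholders to be replaced with the values of the environment. Lines starting with # are annotations only.

```
# Identify the primary and secondary units
get system ha status
# Process view on the primary unit
diagnose sys top 5 10 3
# Connect to another cluster member (placeholders, not literal values)
execute ha manage <cluster_index> <admin_username>
# Process view on that member
diagnose sys top 5 10 3
```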

 

 

Here is a list of precautions to mitigate this issue:

 

  1. Upgrade the FortiGate(s) and the FortiManager to the most recent Mature release.
  2. Periodically review FortiGate configurations, deleting unused firewall policies and objects.
  3. In very large network environments, such as international institutions, banks, or companies, avoid using a single FortiGate for two or more of these purposes: Data Center Firewall, Perimeter Firewall, SD-WAN Hub, Controller of thousands of FortiSwitches and FortiAPs, root of a Fortinet Fabric with thousands of units, and so on.
  4. If a FortiGate HA cluster is used for different purposes, manage them with different VDOMs, and use the High Availability Virtual Clusters feature so that different cluster members act as the primary unit for different VDOMs.
  5. Make all FortiGate configuration changes in FortiManager, and push them to the FortiGates during business hours only when the changes are urgent.
  6. If more than one configuration push must be executed consecutively, configure the FortiGate option check-new for firewall sessions before starting the FortiManager pushes, and return to the check-all option after the pushes are finished.

 

Related documents:

Troubleshooting Tip: WAD CPU spikes due to configuration changes

Maximum number of objects configurable for each FortiGate model: Fortinet Max Value Table and Technical Tip: FortiGate maximum values table.

For more information on how the command 'diagnose sys top' works, see Technical Tip: Using the diagnose sys top CLI command.

Activities performed by the most important FortiGate daemons: Technical Tip: Short list of processes on the FortiGate.

How to check the FortiGate firewall policies table: Technical Tip: iprope policies group.

High CPU Troubleshooting guide: Troubleshooting Tip: How high CPU usage should be investigated, and Technical Tip: VoIP and SIP configuration and troubleshooting resource lists.

What is meant by the term ‘ga version’: Technical Tip: FortiOS firmware version terminology.

What is a mature release: Firmware maturity levels 

Here is more info regarding session timeouts: Technical Tip: Default session timeout value (session-ttl).