FortiGate
FortiGate Next Generation Firewall utilizes purpose-built security processors and threat intelligence security services from FortiGuard labs to deliver top-rated protection and high performance, including encrypted traffic.
kanand
Staff
Article Id 364102
Description

 

This article describes how to understand and analyze high softIRQ CPU utilization on FortiGate.

 

Scope

 

All FortiGate models and versions.

 

Solution

 

SoftIRQs (software interrupts) play a critical role in handling system events efficiently. This article delves into the potential causes of high softIRQ utilization, focusing on excessive traffic and a high number of sessions as primary contributors.

 

CPU time is spent mainly in three distinct spaces:

  • User Space: Time spent running application daemons and user-level programs.
  • System Space: Time spent executing kernel instructions.
  • Interrupts (IRQ/SoftIRQs): Time spent handling hardware and software interrupts.

 

Monitoring CPU usage can provide insights into which space is contributing the most to system load. Recommended commands to start the investigation include:

 

get sys performance status
diagnose sys mpstat <refresh seconds>

 

These commands display the percentage of CPU usage for each category on a per-core basis, which helps identify the primary type of usage.

Once the main type of usage has been identified, the investigation can focus on that specific case.
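
As a rough illustration of this first triage step, the Python sketch below picks the dominant non-idle category from one line of per-core CPU statistics. The sample line and its field layout are assumptions modelled on typical 'get sys performance status' output and may differ between FortiOS versions. 'parse_cpu_line' and 'dominant_category' are hypothetical helper names, not FortiOS tools.

```python
import re

# Assumed sample line; the exact layout can differ between FortiOS versions.
SAMPLE = "CPU0 states: 2% user 3% system 0% nice 40% idle 0% iowait 1% irq 54% softirq"

def parse_cpu_line(line):
    """Return (label, {category: percent}) for one 'CPUn states:' line."""
    label, _, rest = line.partition(" states:")
    usage = {name: int(pct) for pct, name in re.findall(r"(\d+)% (\w+)", rest)}
    return label.strip(), usage

def dominant_category(usage):
    """Return the busiest category, ignoring idle time."""
    busy = {name: pct for name, pct in usage.items() if name != "idle"}
    return max(busy, key=busy.get)

label, usage = parse_cpu_line(SAMPLE)
print(label, dominant_category(usage))
```

If the dominant category is irq or softirq, the interrupt-focused steps in the rest of this article apply.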

 

Interrupts are special functions triggered by specific events, instructing the CPU to pause its current task and handle the interrupt. Interrupts can be categorized as:

  1. Hardware Interrupts (IRQ): Handle incoming MSI-X (Message Signaled Interrupts Extended) notifications from devices. These handlers are lightweight: they primarily schedule softIRQ processing and consume minimal CPU time.
  2. Software Interrupts (SoftIRQs): Process incoming packets using polling. Each execution processes as many packets as possible, up to a fixed limit. Any packets left over re-trigger the softIRQ for further processing.

In network devices, the main source of interrupts is the network packets received on interfaces.
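
The polling behavior described above can be sketched as a small simulation. This is a simplified model, not FortiOS kernel code: the per-execution limit of 64 packets is an arbitrary assumption, and 'run_softirq' and 'drain' are hypothetical names. It shows why a larger backlog of packets results in more softIRQ executions.

```python
from collections import deque

def run_softirq(queue, budget):
    """One softIRQ execution: poll up to `budget` packets from the queue.

    Returns True when packets remain, meaning the softIRQ must run again.
    """
    handled = 0
    while queue and handled < budget:
        queue.popleft()  # stand-in for real packet processing
        handled += 1
    return len(queue) > 0

def drain(packet_count, budget):
    """Count how many softIRQ executions are needed to empty the queue."""
    queue = deque(range(packet_count))
    executions = 1
    while run_softirq(queue, budget):
        executions += 1
    return executions

# 64 is an assumed per-execution limit, chosen only for illustration.
print(drain(packet_count=1000, budget=64))
```

With these assumed values, 1000 queued packets need 16 executions: 15 full runs of 64 packets plus one run for the remaining 40.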

 

Causes of High IRQ/SoftIRQ.

High interrupt utilization can result from several factors:

  1. L2 Issues: Examples include broadcast storms or Layer 2 loops.
  2. Non-Offloaded Sessions: Sessions that are not offloaded to hardware acceleration (for example, to an NPU) and must therefore be processed by the CPU.
  3. High Session Setup Rates: Rapid establishment of new sessions.
  4. Excessive Traffic: A high volume of packets requiring processing.


To view hardware interrupts:


diagnose hardware sysinfo interrupts

 

Or:


fnsysctl cat /proc/interrupts


fnsysctl cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
...
142: 3506701 0 0 0 0 0 0 0 PCI-MSI-edge np6_0-tx-rx0
143: 1 742138 0 0 0 0 0 0 PCI-MSI-edge np6_0-tx-rx1
144: 1 0 3850634 0 0 0 0 0 PCI-MSI-edge np6_0-tx-rx2
145: 1 0 0 3319842 0 0 0 0 PCI-MSI-edge np6_0-tx-rx3

...
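
To check how interrupts are spread across cores, output like the sample above can be summarized with a short script. The following Python sketch parses the four sample rows shown in this article and reports which CPU handled most of each interrupt. 'busiest_cpu_per_irq' is a hypothetical helper name, and the 8-core column layout is taken from the sample, so it may need adjusting for other models.

```python
SAMPLE = """\
142: 3506701 0 0 0 0 0 0 0 PCI-MSI-edge np6_0-tx-rx0
143: 1 742138 0 0 0 0 0 0 PCI-MSI-edge np6_0-tx-rx1
144: 1 0 3850634 0 0 0 0 0 PCI-MSI-edge np6_0-tx-rx2
145: 1 0 0 3319842 0 0 0 0 PCI-MSI-edge np6_0-tx-rx3
"""

def busiest_cpu_per_irq(text, cpu_count=8):
    """Map each interrupt name to (busiest CPU index, count on that CPU)."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < cpu_count + 2 or not fields[0].endswith(":"):
            continue  # skip headers, ellipses, and short lines
        counts = [int(value) for value in fields[1:1 + cpu_count]]
        name = fields[-1]
        cpu = counts.index(max(counts))
        result[name] = (cpu, counts[cpu])
    return result

for name, (cpu, count) in busiest_cpu_per_irq(SAMPLE).items():
    print(f"{name}: CPU{cpu} handled {count} interrupts")
```

In the sample, each np6_0-tx-rx queue is serviced almost entirely by a single core, so a traffic imbalance between queues shows up as an imbalance between cores.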

 

Investigating SoftIRQ Subsystems.

To determine which softIRQ subsystem is consuming high CPU, analyze '/proc/softirqs':


fnsysctl cat /proc/softirqs

 

[Image kb2.PNG: sample output of 'fnsysctl cat /proc/softirqs']

 

  • HI: High-priority tasks.
  • NET_TX: Packets generated locally.
  • TIMER: Periodic timer functions.
  • NET_RX: Incoming traffic (user packets).
  • SCHEDULER: Task scheduling functions.

 

Note: The NET_RX counter is the cumulative number of softIRQ executions, not the number of packets. The faster the counter increases, the more packets are being processed.

 

To derive per-second values, run the command twice (for example, 10 seconds apart) and calculate the difference between the two readings.

 

[Image Capturekb3.PNG: two outputs of 'fnsysctl cat /proc/softirqs' taken 10 seconds apart]

 

NET_RX DIFFERENCE FOR CPU0:
427470114 - 425157559 = 2312555
2312555 / 10 ≈ 231255

 

This calculation indicates that NET_RX executed approximately 231,255 times per second on CPU0, suggesting a high volume of incoming packets.
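
The same arithmetic can be applied to every core at once. The Python sketch below computes per-second rates from two snapshots of a cumulative counter. 'per_second_rates' is a hypothetical helper name, and the lists hold one value per CPU, copied by hand from the two command outputs. Only the CPU0 values from the example above are used here.

```python
def per_second_rates(first, second, interval_seconds):
    """Return per-CPU execution rates from two cumulative counter snapshots."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return [(b - a) / interval_seconds for a, b in zip(first, second)]

# CPU0 NET_RX values from the example above, taken 10 seconds apart.
before = [425157559]
after = [427470114]
print(per_second_rates(before, after, 10))  # [231255.5]
```

Comparing the resulting rates across cores shows whether the load is concentrated on a few CPUs or spread evenly.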

 

Related article:

Troubleshooting Tip: Check SoftIrq increments (recommended when experiencing high CPU usage)