Created on 11-20-2022 10:27 PM Edited on 04-29-2024 08:50 AM By Stephen_G
Description
This article describes how to troubleshoot high CPU issues.
Scope
FortiGate.
Solution
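The first step should always be to run 'get sys perf status'. The output breaks CPU usage down by state (user, system, softirq, and others), overall and per core, so its length depends on the number of cores the unit has. The states that matter most are covered in turn: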
User space: if this section is high, look only at 'diag sys top'. User-space usage is the CPU consumed by the processes visible in 'diag sys top'.
System: this is the kernel's own CPU usage, e.g. processes related to running the operating system. Do not use 'diag sys top' to troubleshoot further at this point.
Instead, run the CPU profiler with the commands below:
diagnose sys profile start
<wait 5-10 seconds>
diagnose sys profile stop
diagnose sys profile show detail
diagnose sys profile sysmap
Now check which process appears at the top of the output of the last command. That process is the problematic one and is causing the high CPU usage. If its name is unclear, search for tickets or bugs that mention that process.
Softirq: this usage comes from the firewall processing packets. If the firewall is receiving a high number of packets on its interfaces, or is under a DoS attack, high softirq usage will be visible. Again, do not use 'diag sys top' to troubleshoot further at this point.
Instead, rely on the interface widget/stats to see whether any particular interface is receiving too much traffic. Run an open sniffer to find the traffic type that is causing it. For example, a large number of incoming ARP packets could indicate a layer 2 loop inside the user network.
If that does not help, disable one interface at a time to see which one brings the CPU usage down; that interface was most likely receiving the high traffic.
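The interface comparison described above amounts to taking two counter snapshots and looking for the outlier. A minimal sketch in Python, assuming the receive-packet counters have already been collected by some other means; the interface names, counter values and the 10-second interval are all invented for illustration:

```python
def packets_per_second(before, after, interval_s):
    """Per-interface receive rate from two counter snapshots taken interval_s apart."""
    return {name: (after[name] - before[name]) / interval_s for name in before}

# Hypothetical RX packet counters sampled 10 seconds apart.
before = {"port1": 1_000_000, "port2": 500_000, "port3": 20_000}
after = {"port1": 1_004_000, "port2": 2_900_000, "port3": 20_500}

rates = packets_per_second(before, after, 10)
busiest = max(rates, key=rates.get)  # interface with the highest receive rate
print(busiest, rates[busiest])
```

The interface that stands out here would be the first candidate for a sniffer capture, or for the disable-one-at-a-time test.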
It can also be helpful to run the CPU profiler as described for high system usage above. High softirq can point towards offloading issues, and the profiler can give some insight. For example, encryption/decryption that is not being offloaded can cause high softirq on a single CPU core.
Note: If CPU usage is high in system space or softirq and 'diag sys top' also shows high CPU usage, the latter command is giving misleading information.
For example, if 'diag sys top' shows IPS at 99 percent while user space accounts for only 10 percent of total usage, the 99 percent figure is not accurate: IPS is taking 99 percent of the 10 percent that user space has out of total usage. It is not actually using 99 percent of the whole CPU core.
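The arithmetic behind this note can be written out as a small sketch in Python. It assumes, following the reasoning above, that the percentage shown in 'diag sys top' is relative to the user-space share; the function name is hypothetical:

```python
def share_of_core(user_space_pct, process_pct_in_top):
    """Scale a 'diag sys top' figure by the user-space share of the core."""
    return user_space_pct * process_pct_in_top / 100

# 99 percent of a 10 percent user-space share
print(share_of_core(10, 99))
```

With these numbers the process accounts for roughly a tenth of the core, so the rest of the load has to be found in system space or softirq.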
This is why it is important not to rely solely on 'diag sys top' and to look beyond that command. On a KVM deployment with DPDK enabled, DPDK is designed to be 100% busy polling. DPDK runs inside the IPS processes, so IPS will always appear busy when DPDK is enabled.
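The overall triage in this article can be summarised as a short sketch in Python. The sample line is an assumption about the shape of the CPU summary in 'get sys perf status', with invented values, and the 50 percent threshold is arbitrary; neither comes from Fortinet documentation:

```python
import re

# Assumed shape of one CPU summary line; the values are invented.
SAMPLE = "CPU states: 5% user 62% system 0% nice 30% idle 0% iowait 0% irq 3% softirq"

def parse_cpu_states(line):
    """Return {state: percent} for every '<n>% <state>' pair on the line."""
    return {name: int(pct) for pct, name in re.findall(r"(\d+)%\s+(\w+)", line)}

def next_step(states, threshold=50):
    """Pick the follow-up for the busiest of user/system/softirq."""
    advice = {
        "user": "check 'diag sys top' for the busy process",
        "system": "run the CPU profiler; do not rely on 'diag sys top'",
        "softirq": "check interface traffic and offloading; do not rely on 'diag sys top'",
    }
    busiest = max(advice, key=lambda s: states.get(s, 0))
    if states.get(busiest, 0) < threshold:
        return "no single category is above the threshold"
    return f"{busiest}: {advice[busiest]}"

print(next_step(parse_cpu_states(SAMPLE)))
```

For the sample line, system usage dominates, so the sketch points to the profiler rather than 'diag sys top'.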