Description
This article describes how to troubleshoot high CPU or high memory usage.
Scope
FortiGate.
Solution
Access the FortiGate via the CLI and run the following commands while the issue is occurring, so that the output reflects the problem state:
Command 1:
diag sys top 1 10
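This command shows the 10 daemons with the highest usage on the FortiGate, refreshed every second. The daemon causing the high CPU or high memory usage will be shown. After the process name, PID, and state, the next two numeric columns are the CPU usage percentage and the memory usage percentage (the 4th and 5th columns from the left; a '<' flag after the state, as in the ipsengine rows below, shifts them by one).
Sample Result:
Run Time: 22 days, 2 hours and 13 minutes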
0U, 0N, 0S, 100I, 0WA, 0HI, 0SI, 0ST; 16064T, 11481F
miglogd 269 S 0.4 0.1
ipsengine 286 S < 0.0 0.7
ipsengine 287 S < 0.0 0.6
ipsengine 292 S < 0.0 0.6
ipsengine 289 S < 0.0 0.6
ipsengine 288 S < 0.0 0.6
ipsengine 290 S < 0.0 0.6
ipsengine 291 S < 0.0 0.6
updated 209 S 0.0 0.3
miglogd 184 S 0.0 0.2
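When the output is saved to a file for later review, ranking the rows by CPU or memory can be scripted. The following is a minimal sketch in Python, not a Fortinet tool: the function names parse_top and top_consumers are hypothetical, and it assumes rows shaped like the sample above, where the last two fields of each process row are the CPU and memory percentages.

```python
# Rows copied from the sample output above (header lines included on purpose).
SAMPLE = """\
Run Time: 22 days, 2 hours and 13 minutes
0U, 0N, 0S, 100I, 0WA, 0HI, 0SI, 0ST; 16064T, 11481F
miglogd 269 S 0.4 0.1
ipsengine 286 S < 0.0 0.7
updated 209 S 0.0 0.3
"""


def parse_top(text):
    """Return (name, pid, state, cpu_pct, mem_pct) tuples for process rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A process row has at least 5 fields and a numeric PID in field 2.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        try:
            # Read from the end so an optional '<' flag does not shift columns.
            cpu, mem = float(fields[-2]), float(fields[-1])
        except ValueError:
            continue
        rows.append((fields[0], int(fields[1]), fields[2], cpu, mem))
    return rows


def top_consumers(rows, key="cpu", n=3):
    """Return the n rows with the highest CPU ('cpu') or memory ('mem') usage."""
    index = 3 if key == "cpu" else 4
    return sorted(rows, key=lambda row: row[index], reverse=True)[:n]


if __name__ == "__main__":
    parsed = parse_top(SAMPLE)
    print("Highest CPU:", top_consumers(parsed, "cpu", 1))
    print("Highest memory:", top_consumers(parsed, "mem", 1))
```

Reading the percentages from the end of each row is a design choice for this sample format only; output that carries additional trailing columns would need the indexes adjusted.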
Command 2:
get sys perf stat
This command shows the total CPU and memory usage percentages as well as the session counts of the FortiGate. It is advisable to run this command five times so that the values can be compared across runs.
Sample Result:
CPU states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU0 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU1 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU2 states: 1% user 0% system 0% nice 99% idle 0% iowait 0% irq 0% softirq
CPU3 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU4 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU5 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU6 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU7 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
Memory: 16450308k total, 4570304k used (27%), 11880004k free (73%)
Average network usage: 87 / 77 kbps in 1 minute, 173 / 163 kbps in 10 minutes, 1213 / 1203 kbps in 30 minutes
Average sessions: 200 sessions in 1 minute, 215 sessions in 10 minutes, 253 sessions in 30 minutes
Average session setup rate: 4 sessions per second in last 1 minute, 3 sessions per second in last 10 minutes, 3 sessions per second in last 30 minutes
Average NPU sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 0 sessions in last 30 minutes
Average nTurbo sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 0 sessions in last 30 minutes
Virus caught: 0 total in 1 minute
IPS attacks blocked: 0 total in 1 minute
Uptime: 22 days, 2 hours, 17 minutes
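If several runs of this command are collected, the CPU and memory figures can be extracted with a short script. The sketch below is illustrative only: the function name summarize and the regular expressions are assumptions based on the line layout of the sample above, not an official parser.

```python
import re

# Lines copied from the sample output above.
SAMPLE = """\
CPU states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU2 states: 1% user 0% system 0% nice 99% idle 0% iowait 0% irq 0% softirq
Memory: 16450308k total, 4570304k used (27%), 11880004k free (73%)
"""

# Capture the CPU label ("CPU" for the aggregate line, "CPU0", "CPU1", ...)
# and the idle percentage on that line.
CPU_RE = re.compile(r"^(CPU\d*) states:.*?\s(\d+)% idle", re.M)
MEM_RE = re.compile(r"^Memory: (\d+)k total, (\d+)k used", re.M)


def summarize(text):
    """Return ({cpu_label: busy_pct}, memory_used_pct or None)."""
    # Busy percentage is taken as 100 minus the idle percentage.
    cpu_busy = {name: 100 - int(idle) for name, idle in CPU_RE.findall(text)}
    mem_used_pct = None
    match = MEM_RE.search(text)
    if match:
        total_k, used_k = int(match.group(1)), int(match.group(2))
        mem_used_pct = round(100 * used_k / total_k, 1)
    return cpu_busy, mem_used_pct


if __name__ == "__main__":
    cpu, mem = summarize(SAMPLE)
    print("CPU busy %:", cpu)
    print("Memory used %:", mem)
```

The memory percentage is recomputed from the raw kilobyte counters, so it can differ slightly from the whole-number percentage printed by the FortiGate (27% in the sample).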
Command 3:
diag debug crashlog read
This command shows whether there are any crash logs for the daemon that is causing the high CPU or high memory usage on the FortiGate.
Sample Result:
290: 2019-11-18 18:20:42 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
291: 2019-11-18 18:20:42 <00207> scanunit=manager str="Success loading anti-virus database."
292: 2019-11-18 19:21:52 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
293: 2019-11-18 19:21:52 <00207> scanunit=manager str="Success loading anti-virus database."
294: 2019-11-18 19:26:17 the killed daemon is /bin/pyfcgid: status=0x0
295: 2019-11-18 20:20:23 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
296: 2019-11-18 20:20:23 <00207> scanunit=manager str="Success loading anti-virus database."
297: 2019-11-18 22:20:20 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
298: 2019-11-18 22:20:20 <00207> scanunit=manager str="Success loading anti-virus database."
299: 2019-11-18 23:47:06 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
300: 2019-11-18 23:47:07 <00207> scanunit=manager str="Success loading anti-virus database."
301: 2019-11-18 23:57:28 the killed daemon is /bin/pyfcgid: status=0x100
302: 2019-11-19 00:42:31 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
303: 2019-11-19 00:42:31 <00207> scanunit=manager str="Success loading anti-virus database."
304: 2019-11-19 00:52:18 the killed daemon is /bin/pyfcgid: status=0x100
305: 2019-11-19 02:20:40 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
306: 2019-11-19 02:20:40 <00207> scanunit=manager str="Success loading anti-virus database."
307: 2019-11-19 04:20:22 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
308: 2019-11-19 04:20:22 <00207> scanunit=manager str="Success loading anti-virus database."
309: 2019-11-19 06:21:25 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
310: 2019-11-19 06:21:25 <00207> scanunit=manager str="Success loading anti-virus database."
311: 2019-11-19 08:20:22 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
312: 2019-11-19 08:20:22 <00207> scanunit=manager str="Success loading anti-virus database."
313: 2019-11-19 10:20:41 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
314: 2019-11-19 10:20:42 <00207> scanunit=manager str="Success loading anti-virus database."
315: 2019-11-19 12:20:32 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
316: 2019-11-19 12:20:32 <00207> scanunit=manager str="Success loading anti-virus database."
317: 2019-11-19 14:20:24 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
318: 2019-11-19 14:20:24 <00207> scanunit=manager str="Success loading anti-virus database."
319: 2019-11-19 16:20:46 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
320: 2019-11-19 16:20:46 <00207> scanunit=manager str="Success loading anti-virus database."
321: 2019-11-19 18:20:19 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
322: 2019-11-19 18:20:19 <00207> scanunit=manager str="Success loading anti-virus database."
323: 2019-11-19 18:47:02 scanunit=manager pid=207 str="AV database changed (1); restarting workers"
324: 2019-11-19 18:47:02 <00207> scanunit=manager str="Success loading anti-virus database."
325: 2019-11-19 18:51:36 the killed daemon is /bin/pyfcgid: status=0x0
Crash log interval is 3600 seconds
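In a long crash log, the 'killed daemon' entries are easy to miss among the routine AV database messages. As a rough sketch (the function name killed_daemons and the regular expression are assumptions derived from the sample lines above), a saved crash log can be filtered and counted per daemon:

```python
import re
from collections import Counter

# Lines copied from the sample output above.
SAMPLE = """\
293: 2019-11-18 19:21:52 <00207> scanunit=manager str="Success loading anti-virus database."
294: 2019-11-18 19:26:17 the killed daemon is /bin/pyfcgid: status=0x0
301: 2019-11-18 23:57:28 the killed daemon is /bin/pyfcgid: status=0x100
304: 2019-11-19 00:52:18 the killed daemon is /bin/pyfcgid: status=0x100
"""

# Capture the daemon path and the raw status value as printed in the log.
KILLED_RE = re.compile(r"the killed daemon is (\S+): status=(0x[0-9a-fA-F]+)")


def killed_daemons(text):
    """Count 'killed daemon' entries per daemon path."""
    return Counter(match.group(1) for match in KILLED_RE.finditer(text))


def killed_statuses(text):
    """Return (daemon, status) pairs in log order, without interpreting them."""
    return [(m.group(1), m.group(2)) for m in KILLED_RE.finditer(text)]


if __name__ == "__main__":
    for daemon, count in killed_daemons(SAMPLE).most_common():
        print(f"{daemon}: killed {count} time(s)")
```

The script only counts and lists entries; the meaning of each status value is left for Fortinet TAC to interpret.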
Enable Interface Bandwidth monitoring on the FortiGate Dashboard:
1. In the left column, select Dashboard -> Status, then select 'Add widget'.
2. Select Interface Bandwidth.
3. Select the interface that is used on the FortiGate.
4. Go to Dashboard to see the interfaces with the bandwidth usage widget. (In this scenario: the WAN interface.)
The purpose of the Interface Bandwidth widget is to show whether the traffic passing through the FortiGate exceeds the throughput it supports.
This information may be useful in determining the cause of high CPU or high memory consumption.
Command 4:
diagnose hardware sysinfo memory
By using 'diagnose hardware sysinfo memory', all of the memory counters involved in the conserve mode and kernel conserve mode calculations can be seen.
Consider the following example:
diagnose hardware sysinfo memory
total: used: free: shared: buffers: cached: shm:
Mem: 260435968 146337792 114098176 0 221184 65974272 59985920
Swap: 0 0 0
MemTotal: 254332 kB
MemFree: 111424 kB
MemShared: 0 kB
Buffers: 216 kB
Cached: 64428 kB
SwapCached: 0 kB
Active: 26844 kB
Inactive: 37856 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 254332 kB (2)
LowFree: 111424 kB (1)
SwapTotal: 0 kB
SwapFree: 0 kB
The 'Cached', 'Active', and 'Inactive' values may account for a significant amount of memory:
Cached = Active + Inactive
This is information cached by the FortiGate for its own system use (essentially I/O buffering). The inactive part is reclaimed by the system when it requires more memory.
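The counters in the example can be checked with simple arithmetic. The sketch below is a hypothetical helper (parse_meminfo is not a FortiOS function) that reads the 'Name: value kB' lines and computes the used-memory percentage and the Active + Inactive sum. With the sample values, Active + Inactive is 64700 kB against a Cached value of 64428 kB, so the relation holds approximately rather than exactly in this output.

```python
# Counter lines copied from the example output above.
SAMPLE = """\
MemTotal: 254332 kB
MemFree: 111424 kB
Buffers: 216 kB
Cached: 64428 kB
Active: 26844 kB
Inactive: 37856 kB
"""


def parse_meminfo(text):
    """Map each 'Name: <number> ...' line to {name: number}."""
    counters = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines whose first value after the colon is an integer.
        if sep and parts and parts[0].isdigit():
            counters[name.strip()] = int(parts[0])
    return counters


def used_percent(counters):
    """Used memory as a percentage of MemTotal, from MemTotal and MemFree."""
    used = counters["MemTotal"] - counters["MemFree"]
    return round(100 * used / counters["MemTotal"], 1)


if __name__ == "__main__":
    c = parse_meminfo(SAMPLE)
    print("Used %:", used_percent(c))
    print("Active + Inactive (kB):", c["Active"] + c["Inactive"])
    print("Cached (kB):", c["Cached"])
```

Comparing the Inactive counter with MemFree over several captures gives a rough idea of how much memory the system could still reclaim.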
Submit all of the gathered information to Fortinet TAC by logging in at support.fortinet.com.