Hello community,
Since I configured some mail alerts with automation on our FortiGate-VM, I have noticed that the "High CPU" message gets triggered every day, exactly every 8 hours (0:00, 8:00, 16:00). The CPU spikes only for a very short time, though: I don't even see the spike in the CPU widget or the GUI process monitor.
I tried moving the generation of the local reports to times other than those three, but the spikes still happen.
I also checked when the IPS and AV database updates ran (the setting is automatic), but in the last few weeks they never coincided with the CPU spikes.
Log file rotation only happens at 0:00, so it does not explain the spikes at the other two times.
I also run some CLI commands to monitor system performance whenever a mail alarm is sent. Here is the output:
========== #1, 2025-08-14 08:00:07 ==========
FW-SAAS config global
FW-SAAS (global) get system performance status
CPU states: 83% user 14% system 0% nice 2% idle 0% iowait 0% irq 1% softirq
CPU0 states: 79% user 19% system 0% nice 1% idle 0% iowait 0% irq 1% softirq
CPU1 states: 87% user 9% system 0% nice 3% idle 0% iowait 0% irq 1% softirq
Memory: 6106124k total, 3371604k used (55.2%), 2023048k free (33.1%), 711472k freeable (11.7%)
Average network usage: 3347 / 3433 kbps in 1 minute, 3355 / 3431 kbps in 10 minutes, 3241 / 3363 kbps in 30 minutes
Maximal network usage: 8078 / 8106 kbps in 1 minute, 22526 / 21361 kbps in 10 minutes, 33772 / 34138 kbps in 30 minutes
Average sessions: 835 sessions in 1 minute, 849 sessions in 10 minutes, 856 sessions in 30 minutes
Maximal sessions: 844 sessions in 1 minute, 879 sessions in 10 minutes, 879 sessions in 30 minutes
Average session setup rate: 3 sessions per second in last 1 minute, 3 sessions per second in last 10 minutes, 3 sessions per second in last 30 minutes
Maximal session setup rate: 11 sessions per second in last 1 minute, 18 sessions per second in last 10 minutes, 18 sessions per second in last 30 minutes
Average NPU sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 0 sessions in last 30 minutes
Maximal NPU sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 0 sessions in last 30 minutes
Virus caught: 0 total in 1 minute
IPS attacks blocked: 0 total in 1 minute
Uptime: 8 days, 0 hours, 1 minutes
FW-SAAS (global) diagnose sys top 1 20 1
08:00:09 AM up 8 days, 0 hours and 1 minutes 50U, 0N, 18S, 22I, 0WA, 0HI, 10SI, 0ST; 5963T, 1966F
cmdbsvr_writeco 19348 R 32.0 1.2 0
node 19163 S 18.0 1.5 0
node 19161 S 16.0 1.6 1
node 19162 S 16.0 1.4 1
node 19164 S 16.0 1.4 1
node 3444 S 4.0 1.9 1
wad 3678 S 3.5 5.8 1
ipsengine 4272 S 3.0 3.8 1
scanunitd 19079 S < 2.0 1.3 1
scanunitd 19139 S < 2.0 1.2 1
wad 3679 S 1.5 4.0 0
ipsengine 4273 S 1.5 3.7 1
iked 3510 S 1.0 1.0 0
snmpd 3459 S 1.0 0.5 0
forticron 3433 S 0.5 0.9 1
autod 3491 S 0.5 0.8 1
csfd 3490 S 0.5 0.7 1
merged_daemons 3431 S 0.5 0.4 1
ipshelper 4222 S 0.0 1.9 0
miglogd 3442 S 0.0 1.6 1
FW-SAAS (global) diagnose sys top-mem 20
wad (3678): 255307kB
wad (3679): 149389kB
ipsengine (4272): 123766kB
ipsengine (4273): 120407kB
node (3444): 90443kB
node (19161): 77566kB
node (19163): 67553kB
node (19162): 64019kB
node (19164): 63042kB
ipshelper (4222): 55238kB
updated (3456): 50341kB
cid-scan (3486): 44150kB
wad (3671): 38943kB
miglogd (3578): 35645kB
miglogd (3442): 34988kB
scanunitd (19079): 31467kB
reportd (3443): 30985kB
cmdbsvr (3352): 30591kB
scanunitd (19139): 29623kB
cw_acd (3483): 23125kB
Top-20 memory used: 1416588kB
======= end of #1, 2025-08-14 08:00:10 ======
The top CPU-consuming daemons are always "node", but this daemon seems to do a lot of different things, so it is hard to tell what exactly is happening.
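One way to make the "node" observation more concrete is to sum the `diagnose sys top` rows by process name, offline, on the mailed capture. The Python below is only a sketch: `cpu_by_process` is a made-up helper name, and it assumes that the third field from the right is the CPU percentage (that matches the values in the capture, but the column layout is not confirmed here).

```python
from collections import defaultdict

# First ten rows copied from the `diagnose sys top 1 20 1` capture in this post.
SAMPLE = """\
cmdbsvr_writeco 19348 R 32.0 1.2 0
node 19163 S 18.0 1.5 0
node 19161 S 16.0 1.6 1
node 19162 S 16.0 1.4 1
node 19164 S 16.0 1.4 1
node 3444 S 4.0 1.9 1
wad 3678 S 3.5 5.8 1
ipsengine 4272 S 3.0 3.8 1
scanunitd 19079 S < 2.0 1.3 1
scanunitd 19139 S < 2.0 1.2 1
"""


def cpu_by_process(top_output):
    """Sum the assumed CPU% column per process name and collect the PIDs."""
    totals = defaultdict(float)
    pids = defaultdict(list)
    for line in top_output.splitlines():
        fields = line.split()
        # Expect at least: name, pid, state, cpu, mem, last column.
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        try:
            # Count from the right so an extra flag such as "<" is skipped.
            cpu = float(fields[-3])
        except ValueError:
            continue
        totals[fields[0]] += cpu
        pids[fields[0]].append(int(fields[1]))
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, total, pids[name]) for name, total in ordered]


ranked = cpu_by_process(SAMPLE)
for name, total, pid_list in ranked:
    print(f"{name:16s} {total:5.1f}%  pids={pid_list}")
```

On this capture the five "node" rows add up to 70%. Four of them have PIDs 19161 to 19164, far higher than the long-running 3444, which may mean they were started shortly before the capture, although the output alone does not prove that.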
Does anyone have an idea where the spikes could come from, or where else I could look? It would also be nice to have a "high CPU average in the last X minutes" event to trigger a mail alarm, but I could not find one. Does anyone have a solution for holding back the alarm for a few minutes, so that it does not trigger on very short spikes?
Thanks for the help in advance :)
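As an illustration of the "high CPU average in the last X minutes" idea from the question, here is a minimal Python sketch of a sliding-window filter. It is not a FortiOS feature: it assumes an external monitoring host already collects CPU busy percentages at regular intervals (how they are collected is left open), and the class name `SustainedCpuAlert`, the 90% threshold, and the 300-second window are all hypothetical choices.

```python
from collections import deque


class SustainedCpuAlert:
    """Alert only when the average CPU over a sliding window is high."""

    def __init__(self, threshold_pct=90, window_s=300, min_span_s=240):
        self.threshold_pct = threshold_pct  # average busy % that counts as high
        self.window_s = window_s            # averaging window in seconds
        self.min_span_s = min_span_s        # history required before alerting
        self.samples = deque()              # (timestamp_s, cpu_busy_pct) pairs

    def add(self, timestamp_s, cpu_busy_pct):
        """Record one sample and return True if an alert should be sent now."""
        self.samples.append((timestamp_s, cpu_busy_pct))
        # Forget samples older than the window.
        while self.samples[0][0] < timestamp_s - self.window_s:
            self.samples.popleft()
        # Without enough history, a single spike would dominate the average.
        if timestamp_s - self.samples[0][0] < self.min_span_s:
            return False
        average = sum(pct for _, pct in self.samples) / len(self.samples)
        return average >= self.threshold_pct


# One 98% sample at t=300 among 20% samples, polled every 30 seconds.
alert = SustainedCpuAlert()
short_spike = [alert.add(t, 98 if t == 300 else 20) for t in range(0, 601, 30)]
print("short spike raised an alert:", any(short_spike))

# CPU held at 95% for the whole period.
alert = SustainedCpuAlert()
sustained = [t for t in range(0, 601, 30) if alert.add(t, 95)]
print("sustained load first alerts at t =", sustained[0], "s")
```

With these settings a single short spike never raises the average enough to alert, while a sustained load alerts once enough history has accumulated. The trade-off is that real problems are reported a few minutes later.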
Hi Svenkund,
I am Bill from Fortinet. I am preparing a script to capture CPU and system information using SecureCRT or Tera Term. I will share the script with you via email once it is ready.
In the meantime, could you please send the following to my email, bhoang@fortinet.com:
This information will help me analyze the issue. Thank you for your assistance.
Bill
Hello Bill,
Sorry for the late reply, I was on holiday.
I just sent you an email. Thanks for the help!
I don't have any direct answers to your issue. But based on this old cookbook:
https://docs.fortinet.com/document/fortigate/6.2.0/cookbook/702937/execute-a-cli-script-based-on-cpu...
you can run CLI commands like "diag sys top-summary" and "diag sys top" in an automation stitch. You probably wouldn't have a way to press "Shift-P" to sort the list by CPU usage, but if you set the table size big enough, you should still be able to capture the processes taking up the most CPU time.
I've never built an automation stitch myself, so I'm not 100% sure it would work, but it likely would.
Toshi
Hello Toshi,
That is exactly what I am doing: the CLI output in my post above comes from a mail action that triggers when the CPU spikes. "diagnose sys top 1 20 1" sorts by CPU by default, but the processes with the most CPU usage seem to change between captures. Since the issue started, I have noticed that in "get system performance status" the "user" category always accounts for most of the CPU, but I have not been able to figure out what causes this.
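To compare the "user" share across several mailed captures, the "CPU states" line can be parsed into numbers. This is a small Python sketch that only assumes the line keeps the `NN% name` layout seen in the capture earlier in this thread; `parse_cpu_states` is an illustrative name.

```python
import re

# Line copied from the `get system performance status` capture in this thread.
LINE = "CPU states: 83% user 14% system 0% nice 2% idle 0% iowait 0% irq 1% softirq"


def parse_cpu_states(line):
    """Return {category: percent} for one 'CPU states' line."""
    # Drop the "CPU states:" / "CPU0 states:" label, then read "NN% name" pairs.
    _, _, rest = line.partition(":")
    return {name: int(pct) for pct, name in re.findall(r"(\d+)%\s+(\w+)", rest)}


states = parse_cpu_states(LINE)
busy = 100 - states["idle"]
# Largest category other than idle.
dominant = max((k for k in states if k != "idle"), key=states.get)
print(f"busy={busy}% dominant={dominant} ({states[dominant]}%)")
```

Running this over each capture would show whether "user" really dominates every time, or whether some spikes are driven by "system" or "softirq" instead.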
I would open a ticket with TAC and include your findings.
Toshi
Have you looked at the collection evaluation queues? We had a collection, with incremental updates, that was poorly written and would spike the CPU for over 4 minutes. Because the incremental collection update option was ticked, it was running every 5 minutes. It wasn't seen until we started to look at the collection evaluation queue where it shows the time it takes each collection to be evaluated.
I think I will share a script to monitor CPU usage with Svenkund and cross-check it with the system logs to identify the root cause. Thanks
Bill