Created on 05-24-2023 10:53 PM Edited on 02-26-2024 07:05 AM By Kate_M
Hello. I have some test switches (248E-FPOE). I put them into standalone mode (factory reset and then reconfigured). Some of the switches are on 7.2.1 and others on 7.2.4. The switches are connected to each other, and only each other. There are no redundant links.
The problem is that about every three or so hours the switches' CPU load spikes to ~90% for 20 to 30 minutes and then goes back down. I tried disconnecting one of the switches from the others, and it still does this. Looking at the logs, the CPU just seems to spike and then CPU utilization logs get generated.
Not sure what might be causing this. I'll look into one more thing tomorrow, but I have my doubts. I'm curious whether anyone has run into anything similar.
Note: I am only having this issue after putting them in standalone mode.
Hello,
Thank you for using the Community Forum. I will seek to get you an answer or help. We will reply to this thread with an update as soon as possible.
Thanks,
Hello,
We are still looking for someone to help you.
We will come back to you ASAP.
Regards,
Hello,
May I ask for some support on another post I made? It is hyperlinked here -> other post. It received a reply, but I was unable to resolve the issue with it.
Thanks.
Hello,
Check for network loops or excessive broadcast storms within your network. These issues can lead to increased CPU utilization as the switches try to handle the excessive traffic. Ensure that there are no redundant links or misconfigured spanning tree settings that could cause loops.
Use network monitoring tools or packet capture utilities to analyze the network traffic passing through the switches during the CPU spikes. Look for any unusual or excessive traffic patterns that could be causing the high CPU utilization. Identify the source of the traffic and investigate if it's normal or requires further troubleshooting.
Regards,
Shilpa C P
Currently these switches are only connected to each other. I am certain there are no redundant links or loops. I would like to note that these spikes happen every four hours and last roughly 30 minutes. I have attached an image of the CPU performance from the GUI. Also note that the traffic (bandwidth) is stable throughout each spike (the bandwidth was low at the beginning because I had temporarily disabled the ports).
I conducted a packet capture and there isn't much. Mostly just LLDP broadcasts roughly every three seconds or so. There were a few BOOTP packets coming from the internal interfaces as they have no IP address, yet are set to use 'DHCP' (there is no dhcp server).
After doing this I thought to use some commands to see the performance. The data here was interesting.
When CPU usage was 'normal':
SW2 # get system performance stat
CPU states: 7% user 33% system 0% nice 60% idle
Memory states: 48% used
Uptime: 14 days, 2 hours, 3 minutes
SW2 # get system performance top
Run Time: 14 days, 2 hours and 2 minutes
8U, 28S, 64I; 487T, 231F
lldpmedd 1091 S 3.7 1.9
alertd 1038 S 3.1 1.5
ctrld 1081 S 2.7 1.6
stpd 1082 S 2.3 1.8
fortilinkd 1099 S 1.9 1.7
l2d 1086 S 0.5 1.6
lpgd 1083 S 0.5 1.6
dmid 1092 S 0.5 1.5
newcli 590 R 0.5 1.5
poed 1057 S 0.3 1.4
sshd 562 S 0.1 1.8
acld 1044 S 0.1 1.6
l2dbg 1087 S 0.1 1.6
pyfcgid 1025 S N 0.0 8.4
cmdbsvr 939 S 0.0 2.7
cu_swtpd 1097 S 0.0 2.3
httpsd 1028 S N 0.0 2.3
initXXXXXXXXXXX 1 S 0.0 2.2
newcli 563 S 0.0 2.1
httpsd 1248 S N 0.0 2.1
When CPU usage was spiking:
SW2 # get system performance stat
CPU states: 13% user 55% system 0% nice 32% idle
Memory states: 48% used
Uptime: 14 days, 2 hours, 33 minutes
SW2 # get system performance top
Run Time: 14 days, 2 hours and 33 minutes
10U, 59S, 31I; 487T, 232F
snmpd 1589 S 31.5 2.0
lldpmedd 1091 S 3.7 1.9
ctrld 1081 S 2.9 1.6
fortilinkd 1099 S 2.7 1.7
alertd 1038 S 2.7 1.5
stpd 1082 S 1.1 1.8
l2dbg 1087 S 0.9 1.6
newcli 1440 R 0.9 1.5
cu_swtpd 1097 S 0.3 2.3
l2d 1086 S 0.3 1.6
initXXXXXXXXXXX 1 S 0.1 2.2
ipconflictd 1088 S 0.1 2.0
igmpsnoopingd 1042 S N 0.1 1.9
sshd 1390 S 0.1 1.8
lpgd 1083 S 0.1 1.6
lfgd 1036 S 0.1 1.5
dmid 1092 S 0.1 1.5
poed 1057 S 0.1 1.4
pyfcgid 1025 S N 0.0 8.4
cmdbsvr 939 S 0.0 2.7
I would like to test disabling snmp to see if that fixes it, or at least disabling the agent. I will most likely disable the agent later today.
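For anyone comparing snapshots like the two above, here is a minimal Python sketch that diffs two `get system performance top` outputs and ranks processes by how much their CPU share grew. It assumes the columns are process name, PID, state, an optional `N` flag, CPU %, and memory %, which the output itself does not label. The function names and the abbreviated sample rows are only for illustration.

```python
import re

# One process row, e.g. "pyfcgid 1025 S N 0.0 8.4".
# Assumed columns: name, PID, state, optional "N" flag, CPU %, memory %.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+([A-Z])(?:\s+N)?\s+(\d+\.\d+)\s+(\d+\.\d+)$")


def cpu_by_process(top_output):
    """Return {process name: summed CPU %} for every process row found."""
    usage = {}
    for line in top_output.splitlines():
        match = ROW.match(line.strip())
        if match:  # header lines such as "8U, 28S, 64I; ..." are skipped
            name = match.group(1)
            usage[name] = usage.get(name, 0.0) + float(match.group(4))
    return usage


def biggest_increases(before, after, count=3):
    """Rank processes by CPU % growth between two snapshots."""
    old, new = cpu_by_process(before), cpu_by_process(after)
    deltas = {n: new.get(n, 0.0) - old.get(n, 0.0) for n in set(old) | set(new)}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:count]


# Abbreviated rows copied from the two snapshots in this post.
NORMAL = """\
8U, 28S, 64I; 487T, 231F
lldpmedd 1091 S 3.7 1.9
alertd 1038 S 3.1 1.5
ctrld 1081 S 2.7 1.6
stpd 1082 S 2.3 1.8
fortilinkd 1099 S 1.9 1.7
"""

SPIKE = """\
10U, 59S, 31I; 487T, 232F
snmpd 1589 S 31.5 2.0
lldpmedd 1091 S 3.7 1.9
ctrld 1081 S 2.9 1.6
fortilinkd 1099 S 2.7 1.7
alertd 1038 S 2.7 1.5
stpd 1082 S 1.1 1.8
"""

if __name__ == "__main__":
    for name, delta in biggest_increases(NORMAL, SPIKE):
        print(f"{name}: {delta:+.1f}")
```

With these rows, snmpd comes out on top because it does not appear in the normal snapshot at all, which matches what stands out when reading the two listings by eye.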
Also these switches are stand-alone, so I am unsure why the fortilink process shows in the top.
I would also like to ask about something else: while I am in the GUI, the 'nice' CPU state sometimes increases to over 20%, making the overall CPU usage quite high. I believe this usage falls under the 'pyfcgid' process, but I am unsure. Is this normal?
Here's an example of this:
CPU states: 13% user 51% system 35% nice 1% idle
Memory states: 48% used
Uptime: 14 days, 2 hours, 15 minutes
Run Time: 14 days, 2 hours and 13 minutes
17U, 70S, 13I; 487T, 229F
snmpd 1125 R 27.1 2.1
pyfcgid 1082 S N 24.4 9.1
lldpmedd 1149 S 3.5 2.0
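To put the three `CPU states` lines from this post side by side, here is a small Python sketch that treats everything except idle (user + system + nice) as busy. The helper names and sample labels are hypothetical. As on a typical Linux system, `nice` is assumed to mean time spent in low-priority processes, which would be consistent with the `N` flag shown next to pyfcgid in the top output, though I cannot confirm that for this platform.

```python
import re

# Pulls pairs such as ("33", "system") out of a "CPU states:" line.
STATES = re.compile(r"(\d+)% (user|system|nice|idle)")


def parse_cpu_states(line):
    """Turn 'CPU states: 7% user 33% system ...' into {'user': 7, ...}."""
    return {name: int(value) for value, name in STATES.findall(line)}


def busy_percent(line):
    """Busy share is taken here as user + system + nice (all but idle)."""
    states = parse_cpu_states(line)
    return states["user"] + states["system"] + states["nice"]


# The three "CPU states" lines quoted in this post; labels are illustrative.
SAMPLES = {
    "normal": "CPU states: 7% user 33% system 0% nice 60% idle",
    "snmpd spike": "CPU states: 13% user 55% system 0% nice 32% idle",
    "in the GUI": "CPU states: 13% user 51% system 35% nice 1% idle",
}

if __name__ == "__main__":
    for label, line in SAMPLES.items():
        print(f"{label}: {busy_percent(line)}% busy")
```

This gives 40% busy for the normal sample, 68% during the snmpd spike, and 99% for the GUI example, where the nice share alone accounts for 35 points.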
Thank you for your response.
After disabling snmp (as seen in the gui) the cpu usage became stable. Is there a reason snmp would have such an excessive use of resources?
May I ask what further troubleshooting steps I could take? Upgrading to 7.4.0 seems to have corrected the issue on three of the five switches I am testing with.
I pasted the configs into these switches (a couple hundred lines at a time) from a switch that was managed by a fortigate. These switches I am testing on are standalone. When pasting in the configs I did remove the line to enable fortilink and also disabled auto-network. Could pasting this config be part of the issue?
By reconfiguring the switches mostly from scratch (still reusing vlan and port configurations) I was able to solve this.
My best guess is that there is some configuration from managed switches that will cause CPU spikes if applied to a standalone switch.
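If someone wants to narrow down which leftover configuration might be involved, one possible approach is to list the top-level `config ...` sections that exist in the pasted configuration but not in a cleanly rebuilt one. The Python sketch below does that. The two sample configs are made-up placeholders rather than the actual configs from these switches, and the result only shows where the configs differ, not which section causes the spikes.

```python
def top_level_sections(config_text):
    """Collect the names of top-level `config ... end` blocks."""
    sections = set()
    depth = 0
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("config "):
            if depth == 0:  # ignore blocks nested inside another block
                sections.add(line[len("config "):])
            depth += 1
        elif line == "end" and depth > 0:
            depth -= 1
    return sections


def extra_sections(pasted, baseline):
    """Sections present in the pasted config but missing from the baseline."""
    return sorted(top_level_sections(pasted) - top_level_sections(baseline))


# Hypothetical sample configs, for illustration only.
BASELINE = """\
config system global
    set hostname "SW2"
end
config switch vlan
    edit 10
    next
end
"""

PASTED = BASELINE + """\
config system snmp community
    edit 1
        config hosts
            edit 1
            next
        end
    next
end
"""

if __name__ == "__main__":
    for section in extra_sections(PASTED, BASELINE):
        print(section)
```

Sections that show up only in the pasted config could then be removed one at a time to see whether the spikes stop.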