Support Forum
BingleHopper
New Contributor III

FortiSwitch CPU Load Spiking every few hours.

Hello.  I have some test switches (248E-FPOE).  I put them into standalone mode (factory reset and then reconfigured).  Some of the switches are on 7.2.1 and others on 7.2.4.  The switches are connected to each other, and only each other.  There are no redundant links.

 

The problem is that about every three or so hours the switches' CPU load spikes to ~90% for 20 or 30 minutes and then goes back down.  I tried disconnecting one of the switches from the others, and it still does this.  Looking at the logs, the CPU just seems to spike and then CPU utilization logs get generated.

 

Not sure what might be causing this.  I'll look into one more thing tomorrow, but I have my doubts.  I was curious whether anyone has run into anything similar.

 

Note: I am only having this issue after putting them in standalone mode.

1 Solution
BingleHopper

By reconfiguring the switches mostly from scratch (still reusing vlan and port configurations) I was able to solve this.  

My best guess is that there is some configuration from managed switches that will cause CPU spikes if applied to a standalone switch.


8 Replies
Anthony_E
Community Manager

Hello,


Thank you for using the Community Forum. I will seek to get you an answer or help. We will reply to this thread with an update as soon as possible.


Thanks,

Anthony-Fortinet Community Team.
Anthony_E
Community Manager

Hello,

 

We are still looking for someone to help you.

We will come back to you ASAP.


Regards,

Anthony-Fortinet Community Team.
BingleHopper

Hello,

 

May I ask for some support on another post I made?  It is linked here: other post.  It received a reply, but I was unable to resolve the issue with it.

 

Thanks.

Shilpa1
Staff

Hello,

Check for network loops or excessive broadcast storms within your network. These issues can lead to increased CPU utilization as the switches try to handle the excessive traffic. Ensure that there are no redundant links or misconfigured spanning tree settings that could cause loops.

Use network monitoring tools or packet capture utilities to analyze the network traffic passing through the switches during the CPU spikes. Look for any unusual or excessive traffic patterns that could be causing the high CPU utilization. Identify the source of the traffic and investigate if it's normal or requires further troubleshooting.
Regards,
Shilpa C P

BingleHopper
New Contributor III

Currently these switches are only connected to each other.  I am certain there are no redundant links or loops.  I would like to note that these spikes happen every four hours and last roughly 30 minutes.  I have attached an image of the CPU performance from the GUI.  Also note that the traffic (bandwidth) is stable throughout each spike (the bandwidth was low at the beginning because I had temporarily disabled the ports).

[Image: Note CPU relative to Bandwidth]
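The interval and duration described above can be checked against exported CPU samples rather than read off a graph. The following is a minimal Python sketch under stated assumptions: the samples are synthetic (made up to mimic a 30-minute spike every four hours, not measured on these switches), the 80% threshold is arbitrary, and `find_spikes` and `spike_intervals` are hypothetical helper names.

```python
def find_spikes(samples, threshold=80):
    """Return (start_minute, end_minute) for each run of samples at or above threshold.

    samples: list of (minute, cpu_percent) tuples sorted by time.
    end_minute is the time of the last sample that was still above threshold.
    """
    spikes = []
    start = prev = None
    for minute, cpu in samples:
        if cpu >= threshold:
            if start is None:
                start = minute
            prev = minute
        elif start is not None:
            spikes.append((start, prev))
            start = None
    if start is not None:  # spike still running at the end of the data
        spikes.append((start, prev))
    return spikes


def spike_intervals(spikes):
    """Minutes between the starts of consecutive spikes."""
    starts = [start for start, _ in spikes]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Synthetic data: one sample every 10 minutes for 12 hours,
# 90% CPU for 30 minutes every 240 minutes, 40% otherwise.
SAMPLES = [(m, 90 if (m - 60) % 240 < 30 else 40) for m in range(0, 720, 10)]

spikes = find_spikes(SAMPLES)
print("spikes:", spikes)
print("minutes between spike starts:", spike_intervals(spikes))
```

If the gaps between spike starts are nearly constant, that points toward something timer-driven on the switch rather than toward traffic, which would match the stable bandwidth noted above.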

I conducted a packet capture and there isn't much in it.  Mostly just LLDP frames roughly every three seconds.  There were a few BOOTP packets coming from the internal interfaces, as they have no IP address yet are set to use 'DHCP' (there is no DHCP server).

 

After doing this I thought to use some commands to see the performance.  The data here was interesting. 

When CPU usage was 'normal':

SW2 # get system performance stat
CPU states: 7% user 33% system 0% nice 60% idle
Memory states: 48% used
Uptime: 14 days, 2 hours, 3 minutes

 

SW2 # get system performance top

Run Time: 14 days, 2 hours and 2 minutes
8U, 28S, 64I; 487T, 231F
lldpmedd 1091 S 3.7 1.9
alertd 1038 S 3.1 1.5
ctrld 1081 S 2.7 1.6
stpd 1082 S 2.3 1.8
fortilinkd 1099 S 1.9 1.7
l2d 1086 S 0.5 1.6
lpgd 1083 S 0.5 1.6
dmid 1092 S 0.5 1.5
newcli 590 R 0.5 1.5
poed 1057 S 0.3 1.4
sshd 562 S 0.1 1.8
acld 1044 S 0.1 1.6
l2dbg 1087 S 0.1 1.6
pyfcgid 1025 S N 0.0 8.4
cmdbsvr 939 S 0.0 2.7
cu_swtpd 1097 S 0.0 2.3
httpsd 1028 S N 0.0 2.3
initXXXXXXXXXXX 1 S 0.0 2.2
newcli 563 S 0.0 2.1
httpsd 1248 S N 0.0 2.1

 

When CPU usage was spiking:

SW2 # get system performance stat
CPU states: 13% user 55% system 0% nice 32% idle
Memory states: 48% used
Uptime: 14 days, 2 hours, 33 minutes

 

SW2 # get system performance top

Run Time: 14 days, 2 hours and 33 minutes
10U, 59S, 31I; 487T, 232F
snmpd 1589 S 31.5 2.0
lldpmedd 1091 S 3.7 1.9
ctrld 1081 S 2.9 1.6
fortilinkd 1099 S 2.7 1.7
alertd 1038 S 2.7 1.5
stpd 1082 S 1.1 1.8
l2dbg 1087 S 0.9 1.6
newcli 1440 R 0.9 1.5
cu_swtpd 1097 S 0.3 2.3
l2d 1086 S 0.3 1.6
initXXXXXXXXXXX 1 S 0.1 2.2
ipconflictd 1088 S 0.1 2.0
igmpsnoopingd 1042 S N 0.1 1.9
sshd 1390 S 0.1 1.8
lpgd 1083 S 0.1 1.6
lfgd 1036 S 0.1 1.5
dmid 1092 S 0.1 1.5
poed 1057 S 0.1 1.4
pyfcgid 1025 S N 0.0 8.4
cmdbsvr 939 S 0.0 2.7

 

I would like to test disabling snmp to see if that fixes it, or at least disabling the agent.  I will most likely disable the agent later today.
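The two `get system performance top` snapshots above can also be compared mechanically instead of by eye. This is a minimal Python sketch, not a Fortinet tool: it assumes each process line has the shape `name pid state [flag] cpu mem`, which is inferred from the output pasted in this post rather than from vendor documentation, and `parse_top` and `cpu_delta` are hypothetical names. Only a few lines from each snapshot are included to keep it short.

```python
def parse_top(text):
    """Map process name -> total CPU% from pasted 'get system performance top' output.

    Assumed line shape (inferred from this thread): name pid state [flag] cpu mem
    Header, 'Run Time' and summary lines do not fit that shape and are skipped.
    """
    cpu_by_name = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5 or not parts[1].isdigit():
            continue
        try:
            cpu = float(parts[-2])
            float(parts[-1])  # memory column must also be numeric
        except ValueError:
            continue
        # Sum by name because the same process name can appear more than once.
        cpu_by_name[parts[0]] = cpu_by_name.get(parts[0], 0.0) + cpu
    return cpu_by_name


def cpu_delta(before, after):
    """Return (name, cpu_change) pairs, largest increase first."""
    names = set(before) | set(after)
    deltas = [(n, round(after.get(n, 0.0) - before.get(n, 0.0), 1)) for n in names]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


# Excerpts copied from the two snapshots in this post.
NORMAL = """Run Time: 14 days, 2 hours and 2 minutes
8U, 28S, 64I; 487T, 231F
lldpmedd 1091 S 3.7 1.9
alertd 1038 S 3.1 1.5
stpd 1082 S 2.3 1.8
pyfcgid 1025 S N 0.0 8.4
"""

SPIKE = """Run Time: 14 days, 2 hours and 33 minutes
10U, 59S, 31I; 487T, 232F
snmpd 1589 S 31.5 2.0
lldpmedd 1091 S 3.7 1.9
alertd 1038 S 2.7 1.5
stpd 1082 S 1.1 1.8
pyfcgid 1025 S N 0.0 8.4
"""

for name, change in cpu_delta(parse_top(NORMAL), parse_top(SPIKE))[:3]:
    print(f"{name}: {change:+.1f}")
```

With these excerpts, snmpd has the largest increase because it does not appear in the normal snapshot at all, which is what motivates testing with SNMP disabled.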

 

Also, these switches are standalone, so I am unsure why the fortilinkd process shows up in the top output.

 

I would also like to ask about something else: while I am in the GUI, the 'nice' CPU state sometimes increases to over 20%, making the CPU usage quite high.  I believe this usage falls under the 'pyfcgid' process, but I am unsure.  Is this normal?

Here's an example of this:

CPU states: 13% user 51% system 35% nice 1% idle
Memory states: 48% used
Uptime: 14 days, 2 hours, 15 minutes

 

Run Time: 14 days, 2 hours and 13 minutes
17U, 70S, 13I; 487T, 229F
snmpd 1125 R 27.1 2.1
pyfcgid 1082 S N 24.4 9.1
lldpmedd 1149 S 3.5 2.0
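To put the three `CPU states` lines from this post side by side, here is a small Python sketch that treats everything except idle as busy time. The helper names are hypothetical. One assumption worth stating: the `N` next to pyfcgid in the top output is read here as the usual ps/top marker for a low-priority (niced) process, which would be consistent with its CPU time being counted under 'nice'; whether FortiSwitchOS uses the flag that way is not confirmed in this thread.

```python
import re


def parse_cpu_states(line):
    """Turn 'CPU states: 13% user 51% system ...' into {'user': 13, 'system': 51, ...}."""
    return {name: int(pct) for pct, name in re.findall(r"(\d+)%\s+(\w+)", line)}


def busy_percent(states):
    """Everything that is not idle counts as busy, including 'nice' time."""
    return 100 - states["idle"]


# The three lines are copied from the output in this post.
LINES = {
    "normal": "CPU states: 7% user 33% system 0% nice 60% idle",
    "spike": "CPU states: 13% user 55% system 0% nice 32% idle",
    "spike with GUI open": "CPU states: 13% user 51% system 35% nice 1% idle",
}

for label, line in LINES.items():
    states = parse_cpu_states(line)
    print(f"{label}: busy {busy_percent(states)}%, of which nice {states['nice']}%")
```

Read this way, the GUI case adds 35 points of 'nice' time on top of a spike that is otherwise similar to the one without the GUI open.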

 

Thank you for your response.

BingleHopper

After disabling snmp (as seen in the gui) the cpu usage became stable.  Is there a reason snmp would have such an excessive use of resources? 

[Image: CPU is now stable at around 38%]

[Image: Where SNMP was disabled]

BingleHopper
New Contributor III

May I ask what further troubleshooting steps I could take?  Upgrading to 7.4.0 seems to have corrected the issue on three of the five switches I am testing with.

 

I pasted the configs into these switches (a couple hundred lines at a time) from a switch that was managed by a FortiGate.  The switches I am testing on are standalone.  When pasting in the configs I removed the line that enables FortiLink and also disabled auto-network.  Could pasting this config be part of the issue?
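One way to narrow down which pasted lines matter is to diff the pasted configuration against one taken from a cleanly configured standalone switch and review only the lines that were added. The sketch below uses Python's standard `difflib`; the two config texts are placeholders with invented section and option names (they are not real FortiSwitchOS settings), and `added_lines` is a hypothetical helper.

```python
import difflib


def added_lines(clean_config, pasted_config):
    """Return the lines that appear in pasted_config but not in clean_config, in order."""
    diff = difflib.ndiff(clean_config.splitlines(), pasted_config.splitlines())
    # ndiff prefixes lines that exist only in the second input with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


# Placeholder configs: the section and option names are invented for illustration.
CLEAN = """config placeholder-a
    set mode standalone
end
"""

PASTED = """config placeholder-a
    set mode standalone
end
config placeholder-b
    set leftover enable
end
"""

for line in added_lines(CLEAN, PASTED):
    print(line)
```

In practice the inputs would be the saved configuration of a from-scratch standalone switch and that of a switch that received the pasted config; the added lines are the candidates to remove and retest one block at a time.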

BingleHopper

By reconfiguring the switches mostly from scratch (still reusing vlan and port configurations) I was able to solve this.  

My best guess is that there is some configuration from managed switches that will cause CPU spikes if applied to a standalone switch.
