CPU Profiling is a utility that allows users to perform advanced code-level CPU analysis.
However, interpreting the output can be challenging because access to the code is available only to the development team. This article will explain as many functions as possible. By understanding what each function does, a possible workaround can be implemented to reduce CPU usage.
Broadly classified, CPUs can run at kernel/system space or at user space. Kernel space is strictly used for running operating system (kernel) executables. The following command outputs need to be collected when the CPU usage is high in the kernel space. These commands will not help in cases where the CPU usage is high due to a particular process such as IPS engine/WAD.
CPU profiling is applicable for cases that have high CPU in the system and high CPU in softirq, although softirq is related to traffic processing (open pcap will give more data as well).
get system performance status
CPU states: 0% user 99% system 0% nice 1% idle 0% iowait 0% irq 0% softirq CPU0 states: 0% user 99% system 0% nice 1% idle 0% iowait 0% irq 0% softirq CPU1 states: 0% user 98% system 0% nice 2% idle 0% iowait 0% irq 0% softirq
Or:
get system performance status CPU36 states: 0% user 0% system 0% nice 0% idle 0% iowait 0% irq 100% softirq CPU37 states: 0% user 0% system 0% nice 0% idle 0% iowait 0% irq 100% softirq CPU38 states: 0% user 0% system 0% nice 0% idle 0% iowait 0% irq 100% softirq CPU39 states: 0% user 0% system 0% nice 0% idle 0% iowait 0% irq 100% softirq
Collect the following during a period of high CPU usage:
diag sys profile cpumask 15 <----- To profile a specific CPU core, in this case, CPU15. Do not use cpumask if the requirement is to profile all CPU cores.
diag sys profile start
diag sys profile stop <----- Wait for the profiler to collect data for 30-60 seconds before stopping.
diag sys profile show order
diag sys profile show detail
diag sys profile sysmap
The output will be different for different cases, but it will appear similar to the following:
diag sys profile show order
0xffffffffa007990f: 2322 fqdn_ip_check+0x4f/0xa0 0xffffffffa0078537: 547 oal_ioctl+0xa57/0x1070 0xffffffff803ac8ec: 100 __write_lock_failed+0xc/0x20 0xffffffffa00323bb: 57 __iprope_check_one_policy+0x32b/0x21c0 0xffffffffa002bd4b: 38 timer_function+0xffb/0x2e70 0xffffffffa009c287: 37 __iprope_tree_check+0x1f7/0x3e0
See the table below for a rough guide to the functions seen above.
There are two types of functions: one is the standard Linux functions, the codes of which can be found on this website.
The meaning of each code can be quickly verified by searching on Google or consulting another source of knowledge.
The other sets of functions are Fortinet proprietary. Fortinet Developer feedback is necessary to understand the meaning of each function.
Function Name
|
Short Meaning
|
Troubleshooting step
|
poll_idle
|
The CPU is not doing anything
|
Can be ignored
|
__inet_hash_connect , __inet_check_established
|
Used to open sockets
|
Seen in explicit proxy cases, in which Fortigate is not able to open outgoing source ports. Increase the source port/source IP as per this article, “config web-proxy explicit”
|
inet_frag_find, ip_defrag
|
Fragmentation
|
Run commands related to fragmentation (sniffer, diag commands) after confirmation, check the MTU of customer network
|
is_csf_valid
|
ISDB
|
Related to ISDB processing, and SDWAN. Check for large number of ISDB entries inside the Fortigate config file.
|
read_ses_rate
|
SNMP
|
Related to bulk SNMP get request against the Fortigate for specific OID
|
__local_bh_enable_ip
|
|
|
rb_next, fqdn_check
|
FQDN
|
Check for large number of FQDNs in the related policy
|
strncasecmp
|
FTP
|
Related to FTP session helper in old hardware revision of SOC4 device (100F)
|
queue_work_on, process_one_work
|
Internal loop inside linux kernel
|
N/A
|
__write_lock_failed
|
Common linux function, in which another process or thread is waiting for the lock release
|
N/A
|
sal_no_sleep
|
Possible FIB changes
|
To check if there is high number of routing changes on the Fortigate, BGP/OSPF/IPsec tunnel up/down events
|
__posix_lock_file
|
Config / IPSengine
|
When updating config or signatures high CPU is observed, add/edit/delete local users
|
fqdn_ip_check, oal_ioctl, __write_lock_failed
|
Denied traffic
|
DDoS attack, or large number of packets that are denied by firewall policy
|
ip_session_mark_one_dirty, ip6_session_mark_one_dirty
|
Session being dirtied
|
Check for config change or ‘diag sys cmbd info’, check for fcnacd debugs/ forticlient endpoints trying to register, used to cause session dirty
|
release_pages,
pagevec_lru_move_fn
|
Related to WAD sending data to IPS
|
Related to deep packet inspection, through WAD + IPS
|
do_gettimeoffset_tsc
|
Related to the use of IPS
|
IPS inspection related
|
unix_find_other.constprop
|
Config change
|
Collect
fnsysctl cat /proc/net/unix
But it is related to configuration change such as moving a firewall policy
|
add_ip_mcast_session
|
Multicast traffic processing
|
Related to multicast traffic processing
|
|