Technical Tip: High CPU/memory with FSSO and authd
Description
This article describes how to troubleshoot high CPU or memory usage by the authd daemon in an FSSO deployment.
Scope
All firmware levels.
Solution
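When a Collector Agent is in use, high CPU usage is generally caused by the authd daemon repeatedly trying, and failing, to connect, which overwhelms the FortiGate with repeated SSL sessions. In that case, authd appears at the top of the process list: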
PID RSS ^CPU% MEM% FDS TIME+ NAME
* 97 15M 79.6 0.8 47 30:20 authd
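When many samples of the process list have to be reviewed, a small script can pick out the authd rows. The following is a minimal Python sketch, assuming the column layout shown above (PID, RSS, CPU%, MEM%, FDS, TIME+, NAME); the function names and the 50% threshold are hypothetical, and the column layout may differ between firmware versions.

```python
from typing import Optional

# Column names taken from the header line of the sample output above.
HEADER = ["PID", "RSS", "CPU%", "MEM%", "FDS", "TIME+", "NAME"]


def parse_top_row(line: str) -> Optional[dict]:
    """Parse one process row; return None for headers or unrecognized lines."""
    # Drop the leading '*' marker seen in the sample row, then split on whitespace.
    fields = line.replace("*", " ").split()
    if len(fields) != len(HEADER) or not fields[0].isdigit():
        return None
    row = dict(zip(HEADER, fields))
    row["PID"] = int(row["PID"])
    row["CPU%"] = float(row["CPU%"])
    row["MEM%"] = float(row["MEM%"])
    return row


def busy_authd(output: str, cpu_threshold: float = 50.0) -> list:
    """Return authd rows whose CPU% is at or above the (assumed) threshold."""
    hits = []
    for line in output.splitlines():
        row = parse_top_row(line)
        if row and row["NAME"] == "authd" and row["CPU%"] >= cpu_threshold:
            hits.append(row)
    return hits


SAMPLE = """PID RSS ^CPU% MEM% FDS TIME+ NAME
* 97 15M 79.6 0.8 47 30:20 authd"""

for row in busy_authd(SAMPLE):
    print(f"authd PID {row['PID']} is using {row['CPU%']}% CPU")
```

The PID reported by the script is the value to use if authd has to be killed later in this procedure.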
diagnose debug crashlog read
16348: 2014-09-03 13:43:59 <02587> application authd
16349: 2014-09-03 13:43:59 <02587> *** signal 11 (Segmentation fault) received ***
16350: 2014-09-03 13:43:59 <02587> Register dump:
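Crash logs can be long, so it may help to filter them for authd crashes only. The sketch below is one possible approach in Python, assuming the line format of the excerpt above (entry number, timestamp, a tag in angle brackets, then the message) and assuming that the "application" line precedes the signal line for the same tag. The function name is hypothetical.

```python
import re

# "<entry>: <date> <time> <tag> message", as in the excerpt above.
LINE_RE = re.compile(
    r"^\s*\d+:\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+<(\d+)>\s+(.*)$"
)
SIGNAL_RE = re.compile(r"signal (\d+) \(([^)]+)\) received")


def authd_crashes(crashlog: str) -> list:
    """Return (timestamp, signal number, signal name) for each authd crash."""
    app_by_tag = {}  # tag in angle brackets -> application name
    crashes = []
    for line in crashlog.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        timestamp, tag, message = match.groups()
        if message.startswith("application "):
            app_by_tag[tag] = message.split(None, 1)[1].strip()
            continue
        signal = SIGNAL_RE.search(message)
        if signal and app_by_tag.get(tag) == "authd":
            crashes.append((timestamp, int(signal.group(1)), signal.group(2)))
    return crashes


SAMPLE_LOG = """16348: 2014-09-03 13:43:59 <02587> application authd
16349: 2014-09-03 13:43:59 <02587> *** signal 11 (Segmentation fault) received ***
16350: 2014-09-03 13:43:59 <02587> Register dump:"""

for timestamp, number, name in authd_crashes(SAMPLE_LOG):
    print(f"{timestamp}: authd received signal {number} ({name})")
```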
diagnose debug authd memory <----- Shows authd memory usage information.
diagnose debug application authd -1 <----- Shows authd debug output. Check it for timeouts and possible errors.
- Disable NTLM (if used with FSSO) for testing.
NTLM is heavy and can create peaks of memory, especially with lots of users and/or with polling mode on Collector Agent.
High CPU usage by authd can be caused by a large number of problematic authentication requests (for example, NTLM credentials are not provided, or NTLM requests are started by system processes) that flood the system with repeated attempts to send logon information.
- Try to optimize the Collector Agent:
  - Make sure that the cache is enabled.
  - Raise the worker thread count to 512 (Advanced settings -> Worker thread count).
  - Switch to DC Agent mode if polling mode is used.
- Check for a possible architecture mismatch between the Windows server and the FSSO Agent (64-bit versus 32-bit).
- Try to kill authd:
diagnose sys kill 11 <authd_PID_int>
If the issue persists and assistance from the support team is required, open a ticket through the support portal.
Execute the commands listed below, collect the outputs, and attach them to the ticket along with the configuration file and the FSSO Agent version information.
diagnose debug crashlog read
diagnose sys top 5 40 <----- To sort by CPU usage, press Shift + P. To sort by memory usage, press Shift + M.
get sys status
get sys performance status <----- Run it 3 times.
diagnose debug reset
diagnose debug enable
diagnose debug authd fsso list
diagnose debug authd fsso server
diagnose firewall auth list
diagnose sniffer packet any 'port 8000 and host <collector agent ip>' 4
Run the capture for 2 minutes and then stop it with Ctrl + C.
diagnose debug disable
diagnose debug reset
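To avoid typing mistakes in the sniffer filter, the FSSO-specific part of the sequence above can be generated for a given Collector Agent address. The following is a minimal Python sketch. The function name is hypothetical, port 8000 is taken from the sniffer command in this article, and the script only prints the commands; it does not connect to the FortiGate.

```python
import ipaddress


def fsso_capture_commands(collector_ip: str, port: int = 8000) -> list:
    """Return the FSSO debug and capture commands from this article, in order."""
    # Raises ValueError if the address is not a valid IP address.
    ip = ipaddress.ip_address(collector_ip)
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return [
        "diagnose debug reset",
        "diagnose debug enable",
        "diagnose debug authd fsso list",
        "diagnose debug authd fsso server",
        "diagnose firewall auth list",
        f"diagnose sniffer packet any 'port {port} and host {ip}' 4",
        "diagnose debug disable",
        "diagnose debug reset",
    ]


# 192.0.2.10 is a documentation address used here as a placeholder.
for command in fsso_capture_commands("192.0.2.10"):
    print(command)
```

The sniffer still has to be stopped manually with Ctrl + C after about 2 minutes before the last two commands are entered.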