Created on
‎06-17-2024
07:50 AM
Edited on
‎02-25-2025
04:35 AM
By
Stephen_G
Description | This article describes how to troubleshoot collectors. |
Scope | FortiSIEM Collector node. |
Solution |
In some cases, a collector that registered successfully in the past starts to show a critical health status or no longer seems to receive or transmit events. Follow the steps below to identify the source of the issue:
To get status details, hover the mouse over the collector status and review the popup. It shows what needs attention, such as:
Check the 'Last Status Updated' and 'Last File Received' fields to see whether the status is up to date and when the issue started. The timing may correlate with external changes such as a new firewall policy or a link disconnection.
If the / root disk needs to be checked:
du -h --max-depth=1 /mnt/ | sort -rh
umount /mnt #After checking directories
If /opt disk needs to be checked:
Identify the directories that are taking up space. It could be because of system logs at:
Check for SNMP packets: tcpdump udp and port 162 -vvv -w /tmp/collector.pcap
Check for Flow packets: tcpdump 'udp and (port 2055 or port 6343)' -vvv -w /tmp/collector.pcap
Be aware that not all Flows are supported. See Flow Support. If the source device is sending Netflow to FortiSIEM but FortiSIEM cannot decode it, make sure that the source device sends the Netflow Template regularly.
If the tcpdump command shows Got 0, no matching packets have reached the collector: review the network or source device configuration. The capture file /tmp/collector.pcap can be checked for further analysis.
Make sure phParser is listening on those ports to process the UDP packets:
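When the capture is scripted rather than watched interactively, the packet count can be read from the summary that tcpdump prints when it stops. The following is a minimal sketch, assuming the usual "N packets captured" summary line; the function name captured_count is illustrative and not part of FortiSIEM.

```shell
# Hypothetical helper: read tcpdump's closing summary on stdin and print
# the number taken from the "N packets captured" line.
captured_count() {
    sed -n 's/^\([0-9][0-9]*\) packets\{0,1\} captured$/\1/p'
}
```

The summary is written to stderr, so a time-limited capture could be piped through it along the lines of `timeout 30 tcpdump udp and port 162 -w /tmp/collector.pcap 2>&1 | captured_count`, assuming tcpdump still prints its summary when stopped by timeout.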
netstat -tulpn | egrep '514|162|2055|6343'
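The egrep pattern above also matches those digits inside other values (port 5140 or a PID containing 162, for instance), so a stricter check can help. The following is a minimal sketch; missing_ports is an illustrative name, and it assumes the local address is the fourth column, as in netstat -tuln output.

```shell
# Hypothetical helper: read `netstat -tuln`-style output on stdin and print
# each expected port (given as arguments) that has no listener.
missing_ports() {
    listing=$(cat)
    for port in "$@"; do
        # Match the port only at the end of the local-address column,
        # e.g. 0.0.0.0:514 or :::514, but not 0.0.0.0:5140.
        if ! printf '%s\n' "$listing" |
            awk -v p="$port" '$4 ~ (":" p "$") { found = 1 } END { exit !found }'; then
            echo "$port"
        fi
    done
}
```

Running `netstat -tuln | missing_ports 514 162 2055 6343` prints nothing when every port has a listener.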
Check the EPS count on the collector:
tail -f /opt/phoenix/log/phoenix.log | grep PH_SYSTEM_PERF_EVENTS_PER_SEC
Check that the configuration is in sync on the collector:
cat /opt/phoenix/config/phoenix_config.txt | egrep 'APP_SERVER_HOST'
cat /opt/phoenix/config/phoenix_super.txt
Sample output of the second command:
10.5.8.35,
Make sure that IPs or FQDN listed as Event Upload Workers are reachable without filtering, proxy or SSL inspection.
From the collector CLI:
curl -vk https://super_or_worker_address
Sample output (truncated):
</head> <br><br><br><br><br><br> </body>
* Connection #0 to host super_or_worker_address left intact
If an SSL certificate is configured on the super and workers, make sure the common name matches with the Fully Qualified Domain Name of the machine.
Check the events sending on the super or worker. From the super or worker CLI: tail -f /var/log/httpd/ssl_request_log | grep evthandler | grep Collector_IP_or_Collector_ID
If the HTTP code is 200, the events have been received. If it is another code such as 401 or 403, the collector has authorization issues. Renew the registration with the phProvision script on the collector, using the --update option.
Then, for each supervisor address, run the following from the collector CLI as root with the noted collector ID (e.g. 10001):
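The interpretation of the status codes can be captured in a small helper when log review is scripted. This is only a sketch of the rule stated above; the function name classify_upload_status and its output labels are illustrative.

```shell
# Hypothetical helper: map an HTTP status code to the interpretation
# given in this article.
classify_upload_status() {
    case "$1" in
        200)     echo "received" ;;
        401|403) echo "authorization" ;;   # renew registration with phProvision --update
        *)       echo "other" ;;           # not covered here, investigate further
    esac
}
```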
agent_id=`cat /opt/phoenix/config/phoenix_config.txt | grep agent_id | sed 's#agent_id=##g'`
srvpass=`phLicenseTool --showServicePassword`
Check the command output and /tmp/systemConfig.xml. Each super address should be resolvable, the command should not return a 4xx or 5xx code, and /tmp/systemConfig.xml should contain configuration data from the super.
If it is an authentication issue, renew the registration using the phProvision script on the collector with the --update option.
If /opt/phoenix/cache/NO_EVENT_UPLOAD_FILE or NO_SVN_FILE_UPLOAD are present, those files behave like flags.
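A quick way to see whether either flag is set is to test for the files directly. The following is a minimal sketch; list_upload_flags is an illustrative name, and it assumes both flag files live in the same cache directory.

```shell
# Hypothetical helper: print the upload flag files found in a cache directory.
# Both flag names come from this article; that they share one directory
# is an assumption.
list_upload_flags() {
    cache_dir=${1:-/opt/phoenix/cache}
    for flag in NO_EVENT_UPLOAD_FILE NO_SVN_FILE_UPLOAD; do
        if [ -e "$cache_dir/$flag" ]; then
            echo "$cache_dir/$flag"
        fi
    done
}
```

With no argument it checks /opt/phoenix/cache; empty output means neither flag is present.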
Check for explicit errors in the logs at /opt/phoenix/log/phoenix.log.
Check for .err files under /opt/phoenix/cache/parser/upload/svn:
cd /opt/phoenix/cache/parser/upload/svn
Remove those files with the following commands: cd /opt/phoenix/cache/
This result must be close to 0. If the figure keeps growing, either the connection to the super/worker is failing or it is too slow for the volume of incoming events. Details of events in the cache can be checked with the following command: find /opt/phoenix/cache/parser -type f
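To judge whether the cache is growing, the number of cached files can be sampled twice. The following is a minimal sketch built on the find command above; the function names and the 60-second default interval are illustrative choices, not FortiSIEM tooling.

```shell
# Hypothetical helper: count regular files under a cache directory.
count_cached_files() {
    find "${1:-/opt/phoenix/cache/parser}" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Take two samples some seconds apart and report both counts.
# The exit status is non-zero when the second count is higher (backlog grew).
backlog_trend() {
    dir=$1
    interval=${2:-60}
    first=$(count_cached_files "$dir")
    sleep "$interval"
    second=$(count_cached_files "$dir")
    echo "first=$first second=$second"
    [ "$second" -le "$first" ]
}
```

For example, `backlog_trend /opt/phoenix/cache/parser 120 || echo "cache is growing"` compares two counts taken two minutes apart.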
agent_id=`cat /opt/phoenix/config/phoenix_config.txt | grep agent_id | sed 's#agent_id=##g'`
srvpass=`phLicenseTool --showServicePassword`
export IFS=','
for super_address in `cat /opt/phoenix/config/phoenix_super.txt`; do echo "Testing access to $super_address"; curl -u "${agent_id}:${srvpass}" -kL "https://${super_address}/ContentUpgrade"; done
Sample output:
Testing access to 10.5.8.44
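When only the HTTP result matters, the same loop can print the status code per address instead of the response body. The following is a sketch under assumptions: list_supers and probe_supers are illustrative names, and the credentials are passed in the same way as in the commands above.

```shell
# Hypothetical helper: print one Supervisor address per line from a
# comma-separated file, skipping empty entries such as a trailing comma.
list_supers() {
    tr ',' '\n' < "${1:-/opt/phoenix/config/phoenix_super.txt}" |
        sed 's/[[:space:]]//g' | grep -v '^$'
}

# Print "<address> <http_code>" for each Supervisor address.
# Arguments: agent ID, service password, optional path to the address file.
probe_supers() {
    list_supers "$3" | while read -r super_address; do
        code=$(curl -s -o /dev/null -w '%{http_code}' -u "$1:$2" -kL \
            "https://${super_address}/ContentUpgrade")
        echo "$super_address $code"
    done
}
```

Called as `probe_supers "$agent_id" "$srvpass"`, it prints one line per address; curl reports 000 when no HTTP response was received at all.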
tail -f /opt/phoenix/log/phoenix.log | grep PHL_ERROR
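For a log that already holds many errors, a per-type count can be easier to read than a live tail. The following is a minimal sketch; summarize_errors is an illustrative name, and it assumes each entry carries a bracketed event type immediately before [eventSeverity], as in the log samples in this article.

```shell
# Hypothetical helper: count PHL_ERROR log lines per event type.
# Assumes lines of the form
#   ... procName[pid]: [PH_SOME_EVENT]:[eventSeverity]=PHL_ERROR,...
summarize_errors() {
    grep 'PHL_ERROR' "${1:-/opt/phoenix/log/phoenix.log}" |
        sed -n 's/.*\[\(PH_[A-Z0-9_]*\)\]:\[eventSeverity\].*/\1/p' |
        sort | uniq -c | sort -rn
}
```

The most frequent event type is listed first, which helps decide which of the known errors to look at.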
Other known errors:
PH_HTTP_RESPONSE_FAILURE:
2024-05-28T01:01:09.649952+02:00 collector phEventPackager[90872]: [PH_HTTP_RESPONSE_FAILURE]:[eventSeverity]=PHL_WARNING,[procName]=phEventPackager,[fileName]=phHttpClient.cpp,[lineNumber]=614,[errorNo]=500,[phLogDetail]=HTTP response code failure
Reason: The collector is not able to send the events to the worker or the super. Resolution: Check the application server process on the Super.
PH_PARSER_FILE_STAT_FAILURE:
2023-06-16T08:49:54.775570+03:00 collector phParser[4251]: [PH_PARSER_FILE_STAT_FAILURE]:[eventSeverity]=PHL_ERROR,[procName]=phParser,[fileName]=phAgentEventProcessor.cpp,[lineNumber]=401,[filePath]=/opt/phoenix/cache/parser/upload/win/TL5tpQ.gzs,[errorNoInt]=2,[phLogDetail]=Failed to stat file
Reason: The parser process does not have the correct permissions to process the file. Resolution: Restore proper file permissions with the following commands from the collector CLI as root:
find /opt/phoenix/cache/parser/ -ls
chown -R admin:admin /opt/phoenix/cache/parser/
chmod -R 755 /opt/phoenix/cache/parser
chmod -R 700 /opt/phoenix/cache/parser/fwdupload/
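Before or after changing ownership, it can be useful to list only the entries that do not belong to the expected user. The following is a minimal sketch; list_wrong_owner is an illustrative name, and the default owner admin is taken from the chown command above.

```shell
# Hypothetical helper: list entries under a directory that are not owned
# by the expected user (name or numeric ID).
list_wrong_owner() {
    find "${1:-/opt/phoenix/cache/parser}" ! -user "${2:-admin}" -print
}
```

Empty output means every file and directory under the path already belongs to the expected user.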
PH_UTIL_XML_HANDLING_ERROR:
2024-03-08T01:19:00.818830+00:00 collector phParser[100925]: [PH_UTIL_XML_HANDLING_ERROR]:[eventSeverity]=PHL_ERROR,[procName]=phParser,[fileName]=phBaseXmlParser.cpp,[lineNumber]=332,[errReason]=Exception: Expected end of tag 'data', Level: Fatal error, Line No: 1, Column No: 10668,[phLogDetail]=Failed to handle XML
Reason: An incorrect parser definition has been applied. Resolution: In the FortiSIEM GUI, go to Admin -> Device Support -> Parsers, find the most recently changed parser, deactivate it, and select the 'Apply' button to sync with the collectors. Then correct its definition, activate the parser again, and sync with 'Apply'.
PH_GENERIC_CRITICAL (Upload service thread 0 seems stuck on task):
2025-01-17T16:00:01.810575+05:30 collector1 phEventPackager[3357]: [PH_GENERIC_CRITICAL]:[eventSeverity]=PHL_CRITICAL,[procName]=phEventPackager,[fileName]=phEventPKGProcess.cpp,[lineNumber]=1280,[phLogDetail]=Upload service thread 0 seems stuck on task [ ip: 10.1.2.22, path: /evthandler2?10003, filePath: /opt/phoenix/cache/parser/events/evt_1736814170_1_9362.dat, counter: 22, upload seconds: 656
Reason: The collector cannot connect to port 443 of Supervisor/worker IP/hostname configured in ADMIN -> Settings -> Cluster Config -> Event Upload Worker list. Use curl -kv https://hostname_or_IP/ to verify connectivity. Resolution: Fix connectivity from collector to Supervisor/worker IP/hostname over port 443. Make sure DNS resolution also works if the hostname is used in ADMIN -> Settings -> Cluster Config -> Event Upload Worker list.
For PH_EVT_PACKAGER_FILE_UPLOAD_FAILURE: See details at How to resolve Collector Event Upload errors. |