FSSO Collector Agent showing 'unprocessed logon event' in debug file (log level warning)
Scenario: FortiGate FG600E cluster, v6.4.7. 3 Microsoft Windows Domain Controller servers (W2016), 2 Microsoft Windows Collector Agent servers (W2012): DC Agent and Collector Agent in Advanced Mode.
From time to time, especially when DC or CA servers go down and come back up, 'unprocessed logon event' messages appear continuously on both Collector Agent servers. Some logon events are missing in the Collector Agent, and therefore missing in the FortiGate. As a result, several users have Internet access problems.
There isn't any specific procedure to recover normal operation. Even restarting the Collector Agents does not solve the problem. Sometimes it seems to recover when modifying some part of the Collector Agent configuration, or when modifying some part of the DC FSSO configuration and then restarting the Collector Agents...
Any help is really appreciated.
Many thanks in advance.
Hey IAC,
'unprocessed logon event' messages in Collector Agent mean it is building up a queue of login events it hasn't processed yet (meaning those logins are neither added to the logon user list nor shared with FortiGate; in effect the users are not considered authenticated by FortiGate).
This usually happens if the Collector Agent is extremely busy with other processes, such as DNS lookup for workstations or polling events from domain controllers.
- are you using DC Agent mode or polling mode?
-> if DC Agent mode, you could try enabling DNS lookup on the DC Agents (if it isn't already) to take that load off the Collector Agent
-> if polling mode, you could check through the debug log (look for EvtID) to get a rough idea of how many events it may be polling in a minute or so; if you're looking at several thousand or tens of thousands, you could consider modifying the polling event ID filter a bit to exclude some IDs to reduce the load.
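To get that rough per-minute figure, something along these lines could work. It is only a sketch: the timestamp prefix (`MM/DD/YYYY HH:MM:SS`) and the sample lines are assumptions made up for illustration, not the actual Collector Agent log format, so the pattern needs adjusting to whatever your debug file really looks like.

```python
import re
from collections import Counter

# Assumed line prefix "MM/DD/YYYY HH:MM:SS"; adjust the pattern to your log.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}):\d{2}")


def events_per_minute(lines, marker="EvtID"):
    """Count lines containing `marker`, grouped by the minute they were logged."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            # group(1) is the timestamp truncated to the minute
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, for illustration only.
sample = [
    "05/12/2021 10:15:02 polled event EvtID:4768 user:alice",
    "05/12/2021 10:15:40 polled event EvtID:4624 user:bob",
    "05/12/2021 10:16:01 polled event EvtID:4768 user:carol",
    "05/12/2021 10:16:05 some other line",
]

for minute, count in sorted(events_per_minute(sample).items()):
    print(minute, count)
```

Passing `marker="unprocessed logon event"` would count the warning messages per minute instead, which gives a way to see whether the queue is growing or draining.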
In addition, if you haven't already, you could increase the worker count.
Thank you for your quick answer and suggestions.
We are using DC Agent Mode, and the two Collector Agents are not busy regarding CPU and memory.
In the beginning, the DNS check was disabled in the 3 DCs (by default). However, the unprocessed logon event problem happened several times.
Currently, DNS check is disabled in the 3 DCs. If it is enabled in the DCs, we think that DNS resolution is performed again in the Collector Agents (double resolution).
We have noticed some DNS resolution problems when receiving event logs, related to the format of the workstation names: i.e. name entries like est. (instead of est), est.a (incomplete domain, instead of est). However, it is not clear whether these sporadic DNS problems are related to the unprocessed logon event scenario. Our DNS servers are located in our Intranet (in fact, DNS servers are installed in the 3 DCs). DNS service is always up and ready.
Around 2500 simultaneous users are logged in our Windows domain.
Yes, we can increase the worker count. We assume this configuration can be performed in the Advanced Settings option in our 2 Collector Agents. It seems that 128 is the default value. Any new value suggested? If required, we can increase CPU and memory of the 2 Collector Agent servers (Windows 2012), but it does not seem to be necessary for the time being.
Thanks a lot.
Regards, Ignacio.
Hey Ignacio,
if you enable DNS on DC Agents, Collector Agent will receive logins with username and IP, not just username and workstation, so it can add the logins to its table immediately without having to do another DNS lookup.
This WILL shift some burden from the Collector Agent.
Collector Agent performs DNS lookups in two cases:
- it has a login event with username and workstation, and needs to resolve the workstation to an IP (if DC Agents already provide the IP, this does NOT happen)
- it verifies the IP of a workstation with an existing login
-> this IP verification happens every minute by default, and is in place to handle roaming users if/when their IPs change (like going from WiFi to Ethernet)
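As a rough illustration of those two cases (not the Collector Agent's actual code), the logic can be sketched like this. The `LogonEvent` type and both function names are hypothetical, and the per-minute estimate assumes one lookup per active login per interval, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogonEvent:
    user: str
    workstation: str
    ip: Optional[str] = None  # set when the DC Agent already resolved the name


def needs_dns_lookup(event: LogonEvent) -> bool:
    """Case 1: a lookup is only needed when the event arrives without an IP."""
    return event.ip is None


def verification_lookups_per_minute(active_logons: int, interval_minutes: int) -> float:
    """Case 2: periodic IP verification; an interval of 0 means it is disabled."""
    if interval_minutes == 0:
        return 0.0
    # Simplifying assumption: one lookup per active login per interval.
    return active_logons / interval_minutes


print(needs_dns_lookup(LogonEvent("alice", "PC01")))              # no IP supplied
print(needs_dns_lookup(LogonEvent("alice", "PC01", "10.0.0.5")))  # IP supplied
print(verification_lookups_per_minute(2500, 1))
print(verification_lookups_per_minute(2500, 0))
```

With DNS enabled on the DC Agents the first case drops out, and with the verification interval set to 0 the second one does too.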
Regarding the worker count, you could increase it to 256 or 512 and see if the Collector Agent stabilizes somewhat. If you're not seeing any issues with CPU/memory, then you can leave them as they are.
Depending on how frequent DNS failures are due to incomplete domain names, for example, this could slow down Collector Agent quite significantly.
Regarding the user count, Collector Agent can usually handle a few thousand users just fine; the main point is how many login events those users generate.
-> if every user generates 10 login events per second for whatever reason, then with your 2500 users Collector Agent would have to process 25000 events per second, and depending on version, worker thread count and a few other factors (such as DNS failures) it might start to struggle.
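The arithmetic behind that example, as a small sketch. The 2500 users and the worker counts come from this thread; dividing the load evenly across workers is an assumption made purely for illustration, not a statement about how Collector Agent schedules its work.

```python
def total_events_per_second(users: int, events_per_user_per_second: float) -> float:
    """Overall login event rate the Collector Agent would have to absorb."""
    return users * events_per_user_per_second


def events_per_worker(total_rate: float, workers: int) -> float:
    """Assumed even split of the event rate across worker threads."""
    return total_rate / workers


rate = total_events_per_second(2500, 10)
print("total events/s:", rate)

# 128 is the default mentioned in this thread; 256 and 512 are the suggested values.
for workers in (128, 256, 512):
    print(workers, "workers:", events_per_worker(rate, workers), "events/s each")
```

The point is only that the event rate, not the user count by itself, is what drives the load each worker sees.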
I would suggest the following:
- enable DNS in the DC Agents again to take at least some load off the Collector Agent
- if you don't have roaming users (changing IPs/subnets), you can disable the IP verification (set interval to 0)
- increase the worker count to 256 or 512
Then monitor the situation for a bit to see if the unprocessed queue goes down.
We have applied your suggestions on the two Collector Agent servers,
- enable DNS in the DC Agents again to take at least some load off the Collector Agent
- if you don't have roaming users (changing IPs/subnets), you can disable the IP verification (set interval to 0)
- increase the worker count to 256 or 512
and it seems that the problem has been solved, at least in the scenario where a Collector Agent server is down for several minutes and recovers afterwards. Some 'unprocessed logon event' messages still appear, but only a few of them, and after a few minutes no more warning messages are seen in the log of the Collector Agent servers.
There is one scenario still to be checked: when a Windows DC server is down for several minutes. We will keep an eye on it when this scenario happens again.
Thanks a lot.
Regards, Ignacio.