IAC
New Member
April 28, 2022
Solved

FSSO - Collector Agent - Unprocessed Logon Event

FSSO Collector Agent showing 'unprocessed logon event' in debug file (log level warning)

 

Scenario: FortiGate FG600E cluster running v6.4.7, 3 Microsoft Windows Domain Controller servers (W2016), and 2 Microsoft Windows Collector Agent servers (W2012), using DC Agent and Collector Agent in Advanced Mode.

 

From time to time, especially when DC or CA servers go down and come back up, 'unprocessed logon event' messages appear continuously on both Collector Agent servers. Some logon events are missing in the Collector Agent, and are therefore missing in FortiGate. As a result, Internet access problems arise for several users.

 

There isn't any specific procedure to recover normal operation. Even restarting the Collector Agents does not solve the problem. Sometimes it seems to recover when we modify some part of the Collector Agent configuration, or when we modify some part of the DC FSSO configuration and then restart the Collector Agents...

 

Any help is really appreciated.

Many thanks in advance.

 

Best answer by Debbie_FTNT (reply of April 29, 2022, below)

3 replies

Debbie_FTNT
Staff & Editor
April 28, 2022

Hey IAC,

'unprocessed logon event' messages in Collector Agent mean it's building up a queue of login events it hasn't processed yet (those logins are neither added to the logon user list nor shared with FortiGate; in effect, FortiGate does not consider those users authenticated).

This usually happens if the Collector Agent is extremely busy with other processes, such as DNS lookup for workstations or polling events from domain controllers.

- are you using DC Agent mode or polling mode?
-> if DC Agent mode, you could try enabling DNS lookup on the DC Agents (if it isn't already) to take that load off the Collector Agent

-> if polling mode, you could check the debug log to get a rough idea of how many events it polls per minute (look for EvtID); if you're looking at several thousand or tens of thousands, you could consider modifying the polling event ID filter to exclude some IDs and reduce the load.
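
To get the rough per-minute figure mentioned above, something like the following sketch could be run against a copy of the debug log. It relies only on polled events appearing on lines that contain `EvtID`; the timestamp layout (`MM/DD/YYYY HH:MM:SS` at the start of each line) and the sample lines are hypothetical, so adjust the pattern to whatever your log actually shows.

```python
import re
from collections import Counter

# ASSUMPTION: each log line starts with "MM/DD/YYYY HH:MM:SS".
# This layout is hypothetical; adapt the pattern to your actual debug log.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}):\d{2}")


def evtid_per_minute(lines):
    """Count lines mentioning EvtID, grouped by the minute they were logged."""
    counts = Counter()
    for line in lines:
        if "EvtID" not in line:
            continue  # not a polled-event line
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1  # key is "MM/DD/YYYY HH:MM"
    return counts


# Hypothetical sample lines for illustration only (not real Collector Agent output).
sample = [
    "04/28/2022 10:15:01 polled EvtID 4768",
    "04/28/2022 10:15:42 polled EvtID 4624",
    "04/28/2022 10:16:03 polled EvtID 4768",
    "04/28/2022 10:16:05 some other message",
]

for minute, count in sorted(evtid_per_minute(sample).items()):
    print(minute, count)
```

For a real log, pass the open file object instead of `sample`; minutes with counts in the thousands would be the ones worth a closer look.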

 

In addition, if you haven't already, you could increase the worker count.

IAC
Author
New Member
April 28, 2022

Thank you for your quick answer and suggestions.

 

We are using DC Agent Mode, and the two Collector Agents are not busy regarding CPU and memory.

 

In the beginning, the DNS check was disabled in the 3 DCs (by default). However, the unprocessed logon event problem happened several times.

 

Currently, the DNS check is disabled in the 3 DCs. If it were enabled in the DCs, we think DNS resolution would be performed again in the Collector Agents (double resolution).

 

We have noticed some DNS resolution problems, related to the format of the workstation names, when receiving event logs: e.g. name entries like est. (instead of est) or est.a (incomplete domain, instead of est). However, it is not clear whether these sporadic DNS problems are related to the unprocessed logon event scenario. Our DNS servers are located in our intranet (in fact, the DNS servers are installed on the 3 DCs). The DNS service is always up and ready.

 

Around 2500 simultaneous users are logged in our Windows domain.

 

Yes, we can increase the worker count. We assume this configuration can be performed in the Advanced Settings option in our 2 Collector Agents. It seems that 128 is the default value. Any new value suggested? If required, we can increase CPU and memory of the 2 Collector Agent servers (Windows 2012), but it does not seem to be necessary for the time being.

 

Thanks a lot.

Regards, Ignacio.

Debbie_FTNT
Staff & Editor
April 29, 2022

Hey Ignacio,

If you enable DNS lookup on the DC Agents, the Collector Agent will receive logins with username and IP, not just username and workstation, so it can add the logins to its table immediately without having to do another DNS lookup.

This WILL take some of the burden off the Collector Agent.

Collector Agent performs DNS lookups in two cases:
- it has a login event with username and workstation, and needs to resolve the workstation to an IP (if DC Agents already provide the IP, this does NOT happen)

- it verifies the IP of a workstation with an existing login

-> this IP verification happens every minute by default, and is in place to handle roaming users if/when their IPs change (like going from WiFi to Ethernet)

 

Regarding the worker count, you could increase it to 256 or 512 and see if the Collector Agent stabilizes somewhat. If you're not seeing any issues with CPU/memory, then you can leave the server resources as they are.

 

Depending on how frequent DNS failures are (due to incomplete domain names, for example), they could slow down the Collector Agent quite significantly.

Regarding the user count, Collector Agent can usually handle a few thousand users just fine; the main point is how many login events those users generate.

-> if every user generates 10 login events per second for whatever reason, then with 2500 users the Collector Agent would have to process 25000 events per second, and depending on version, worker thread count and a few other factors (such as DNS failures), it might start to struggle.
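
As a back-of-the-envelope check of that arithmetic, here is a minimal sketch. The user count and per-user rate come from this thread; the processing rate is a made-up number used only to show how an unprocessed queue grows whenever events arrive faster than they are handled, and it is not a measured Collector Agent figure.

```python
def events_per_second(users, events_per_user_per_second):
    """Total login events arriving at the Collector Agent each second."""
    return users * events_per_user_per_second


def backlog_after(seconds, arrival_rate, processing_rate):
    """Unprocessed events after `seconds`, assuming constant rates and an empty start."""
    return max(0, (arrival_rate - processing_rate) * seconds)


arrival = events_per_second(2500, 10)
print(arrival)  # 25000

# ASSUMPTION: 20000 events/s is a hypothetical processing rate for illustration.
print(backlog_after(60, arrival, 20000))  # 300000
```

If the processing rate is at or above the arrival rate, the backlog stays at 0, which is the state the suggestions below aim for.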

I would suggest the following:

- enable DNS in the DC Agents again to take at least some load off the Collector Agent

- if you don't have roaming users (changing IPs/subnets), you can disable the IP verification (set interval to 0)

- increase the worker count to 256 or 512

Then monitor the situation for a bit to see if the unprocessed queue goes down.

gagandeeps
Staff
December 25, 2024

The Collector Agent can be optimized as follows:

1.) Set Max Worker Threads to 576, based on the formula Max Worker Threads = 512 + ((Logical CPUs − 4) × 16).
2.) Set the log level to Debug and the log size to 100 MB.
3.) Set the user group result cache to 60.
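
The formula in step 1 can be sketched as a small helper. The function name is hypothetical, and the 8 logical CPUs in the example is an inference (it is the value that yields 576), not something stated above. How the formula should behave with fewer than 4 logical CPUs is also not stated, so the sketch simply applies it as written.

```python
def max_worker_threads(logical_cpus):
    """Apply the formula: 512 + ((Logical CPUs - 4) * 16)."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 512 + (logical_cpus - 4) * 16


# ASSUMPTION: 8 logical CPUs, inferred because it reproduces the 576 from step 1.
print(max_worker_threads(8))  # 576
```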