Description
This article describes general troubleshooting and considerations when deploying the FortiAuthenticator to handle FSSO operation in high-performance environments.
Scope
FortiAuthenticator is used as an FSSO Collector.
Solution
Table of contents:
FSSO Concept
Methods
Design requirements
Technical Requirements
Considerations on FortiAuthenticator setup
Troubleshooting
FSSO Concept:
It is important to understand how FSSO works in order to implement it efficiently.
FSSO is a user database built on information collected via various means and methods. The user database consists mainly of IP+User+Group, and it is forwarded to FortiGate (except for users that do not match configured group filters, which are then filtered out).
An example entry on the FortiGate:
FGT # diagnose firewall auth list
192.168.95.11, FORTI1
type: fsso, id: 0, duration: 46, idled: 9
server: FAC
packets: in 75 out 61, bytes: in 14601 out 16205
group_id: 33554481 33554434 33554461
group_name: CN=FORTIGROUP,CN=USERS,DC=FORTI,DC=LAB CN=DOMAIN USERS,CN=USERS,DC=FORTI,DC=LAB
This firewall user entry shows an IP with its assigned user and two groups of that user. The FortiAuthenticator will maintain the entry, including removal if required. FortiGate will reflect what the connected Collector has stored in its user database (filtering out what does not match the group filter).
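To post-process such entries outside FortiGate (for example, to compare them with the FortiAuthenticator session list), the CLI output can be parsed into a structure. The following is a minimal Python sketch, assuming the output format matches the example above; the function name and dictionary keys are illustrative and not part of any Fortinet tool.

```python
import re

SAMPLE = """\
192.168.95.11, FORTI1
type: fsso, id: 0, duration: 46, idled: 9
server: FAC
packets: in 75 out 61, bytes: in 14601 out 16205
group_id: 33554481 33554434 33554461
group_name: CN=FORTIGROUP,CN=USERS,DC=FORTI,DC=LAB CN=DOMAIN USERS,CN=USERS,DC=FORTI,DC=LAB
"""

def parse_auth_entry(text):
    """Parse one entry of 'diagnose firewall auth list' into a dict."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    ip, user = [part.strip() for part in lines[0].split(",", 1)]
    entry = {"ip": ip, "user": user, "group_ids": [], "group_names": []}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "type":
            entry["type"] = value.split(",")[0].strip()
        elif key == "server":
            entry["server"] = value
        elif key == "group_id":
            entry["group_ids"] = [int(x) for x in value.split()]
        elif key == "group_name":
            # Group DNs may contain spaces, so split only before a new "CN=".
            entry["group_names"] = re.split(r"\s+(?=CN=)", value)
    return entry

entry = parse_auth_entry(SAMPLE)
print(entry["ip"], entry["user"], entry["group_names"])
```

Splitting the group names only in front of "CN=" is a simplification that fits the example; distinguished names that start with another attribute would need a different rule.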
In order to create such a firewall user entry on FortiGate, FortiAuthenticator collects the users via the configured methods. Each method is handled quite differently and needs to be considered on its own.
In these methods, two components are used: the workstation (name or IP address) and the username.
Both components will be complemented with more information:
- The workstation will be used to get the IP address of the station, if not already present as an IP. It will also be used for periodic DNS lookups as the IP might change (Typical use case: Laptops dock and undock from a wired docking station, connecting to Wi-Fi).
- The username will be used to do an LDAP group lookup, which in turn will be used on the firewall policies (directly or with FSSO groups).
Once this information is complete, an entry can be added to the user database and forwarded to FortiGate.
FortiAuthenticator will show the collected users in GUI -> Monitor -> SSO -> SSO Sessions.
FortiGate will show the resulting and current database in GUI -> Dashboard -> Assets & Identities -> Firewall Users (or on CLI as per example).
Methods:
The methods on FortiAuthenticator to collect these two components are:
- Event log polling - FortiAuthenticator reads (collects) certain Event IDs on configured DCs.
- DCAgent pushes user logons from the DC to FortiAuthenticator as a Collector.
- TSAgent pushes terminal server user logons (this method adds another component to discern multiple users on a single workstation (= IP), and assigns a different outgoing port range (or ranges) to each logged-on user).
- RADIUS Accounting - FortiAuthenticator can parse RADIUS Accounting messages with freely defined mapping.
- Syslog - FortiAuthenticator can parse logs received via syslog with freely defined mapping.
- FortiNAC SSO.
- FortiClient SSO Mobility Agent (SSOMA).
- Portal login - authenticate directly to FortiAuthenticator using its self-service portal (Kerberos, SAML).
- SSO REST API via FortiAuthenticator's REST API endpoint /ssoauth.
Mixing methods is possible, but it has to be considered what each method adds to the user database.
Having multiple methods for maintaining the same user entry may be detrimental to performance or, in the worst case, break the entry. If a valid entry already exists, there is no need to have another method overwrite the entry.
Understanding how FSSO works and what it offers makes it easier to go through the design requirements for this environment.
Design Requirements:
- What will FSSO be used for? Which FSSO groups are to be used on the FortiGate? How many and to which destinations?
Typically, a small, ideally a minimal subset of the available domain groups is configured.
- The number of users to be expected for FSSO. The more users, the more important resource provisioning is. Which users are intended to be users for FSSO? Consider filtering unneeded users, for example, service accounts, or IP ranges that would not be required in the SSO session list.
- Time of day for FSSO users. Will users appear in a 24/7 shift operation, or will all FSSO users start in regular business hours? On shift start, a peak of users is to be expected, as well as during or rather after lunch. This also depends on how strictly business operation is defined. If a shift start is set to be at 9 AM sharp, the peak will be very short, but also very high. If users slowly start 'around' 9, the peak will be much longer, but also lower.
- Type of FSSO users (Terminal server users, regular domain users, users supplied via syslog/RADIUS Accounting (typically from a wireless controller), SSO mobility agent (or SSOMA) users). Methods should not be mixed.
- Domain Policy: Do user group memberships change often and need to be reflected in the SSO concept? The Cache settings (DNS+LDAP) must reflect this requirement.
- Type of machines: Are these stationary workstations or mobile devices, like laptops? If IP addresses can change, the settings on FortiAuthenticator must allow discovery (DNS).
- Is DNS propagation working fast for DNS discovery of changed IP addresses? If not, FSSO may give a bad experience to end users who change the IP, typically docking/undocking a laptop. SSO Mobility Agent may be the alternative.
- Geographic locations that need to be covered by FSSO. Is FortiAuthenticator handling a certain region only, multiple regions, or are there LB-HA nodes handling the SSO sessions? Do the regions mix users, or can they be considered separate realms?
- How many Domain Controllers are handling the users? Each domain controller may have a subset of users. Add DCs as event log sources as needed, in order to cover the required user base. Do not add DCs to the setup if they do not add additional information to the SSO session list.
- Is installing software on a domain controller acceptable? The answer to the question may exclude the DCAgent from the start. Installing and upgrading a DCAgent will require reboots and, as such, may be considered 'expensive' in maintenance planning.
The design requirements may be subject to some technical restrictions of this environment.
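The time-of-day consideration above can be put into rough numbers. The Python sketch below assumes that logins are spread evenly across the start window, which real environments only approximate; the user count and window lengths are made-up illustration values.

```python
def average_logins_per_second(users, window_minutes):
    """Average login rate if `users` log in evenly within the window."""
    return users / (window_minutes * 60)

# Hypothetical site with 6000 FSSO users.
sharp_start = average_logins_per_second(6000, 5)     # everyone within 5 minutes
gradual_start = average_logins_per_second(6000, 60)  # spread over one hour
print(f"sharp: {sharp_start:.1f}/s, gradual: {gradual_start:.1f}/s")
```

With these values, the sharp shift start produces twelve times the login rate of the gradual one for the same user base. Since a single user typically causes several logon events, the event rate seen by the collector will be higher than the user rate.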
Technical Requirements:
- Because of how FSSO operates, the expected peak times of operation matter: at what time of day will users cause a large number of logins? Firewall traffic is irrelevant, but user logins are relevant. User logins can be either interactive (entering credentials at a login prompt) or non-interactive (automated logins). Examples of the latter:
- Navigating network drives.
- Background services operation using service accounts, Antivirus software, EDR services, and other device management (MDM) software.
- Scripts with different user permissions or general scripts that intend to imitate logins.
- Response times from FSSO supplying services (feeding the collector with events):
- SSOMA - This is a service executed from the client side. An agent installed on the client will directly contact FortiAuthenticator to supply its current IP and workstation, along with the user. This is the fastest way to receive login information from workstations.
In addition, FortiAuthenticator will not need to do a DNS lookup for the stations, which makes SSOMA a requirement in some environments where DNS is not reliably fast propagating. SSOMA can come as a FortiClient component, but also as a standalone agent. Both cannot be installed at the same time.
- DCAgent - This is a DLL, rather than a service, that is hooked into lsass.exe and pushes logon events to the collector. The packets are small. DCAgent can be configured to resolve workstation names itself, but it will then rely on the system DNS and wait for the result before sending the event to the collector. This blocks a thread until the DNS response arrives.
- Response times from the FSSO background operation (required to process an event):
Each user login event, even if it appears multiple times, will need to be looked up, unless otherwise configured. Events to be processed are stored in a queue until processed. Threads will pick up events from the working queue and process each one by one.
An event cannot be created unless the lookups are finished. A lookup will block the processing thread, which cannot be available for that time, so a fast response will allow faster processing:
- LDAP Lookups: The user group membership is required, so it is looked up. If a user is a member of several hundred groups, then these will be returned each time. Each time, the server needs to spend CPU cycles to process the query and return the result over the network, and the collector needs to process the result as well. This may lead to an overload of the components and, as a consequence, to delays.
- DNS: If workstations are to be looked up in DCAgent mode or Event log polling, a thread will do so and handle the query and response.
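Because each lookup blocks its processing thread, the achievable event rate is bounded by the number of threads divided by the time spent per event. The Python sketch below illustrates this relationship only; the thread count is a hypothetical value, as the actual number of processing threads on FortiAuthenticator is not stated in this article.

```python
def max_events_per_second(threads, ldap_seconds, dns_seconds):
    """Upper bound on processed events when every event waits for both lookups."""
    return threads / (ldap_seconds + dns_seconds)

def seconds_to_drain(queued_events, threads, ldap_seconds, dns_seconds):
    """Time needed to work off a backlog at that maximum rate."""
    return queued_events / max_events_per_second(threads, ldap_seconds, dns_seconds)

# Hypothetical: 8 threads, LDAP answers in 30 ms, DNS in 10 ms versus 1 s.
print(max_events_per_second(8, 0.03, 0.01))
print(max_events_per_second(8, 0.03, 1.0))
print(seconds_to_drain(10000, 8, 0.03, 1.0))
```

The comparison shows why a single slow dependency, here DNS, dominates the processing rate even when the other lookup is fast.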
Considerations on FortiAuthenticator setup:
These setup considerations are in detail:
- Design:
This article is not meant to explain each setting, only the ones that are related to performance. Review the current design requirements. The settings in GUI -> Fortinet SSO -> Settings -> Methods give visibility of the methods used in this installation:
SSO Methods
SSO Method Settings
In working installations, the monitoring section can also give hints, for example, the status of the DC event log polling:
SSO Event log poll monitor
- Resource provisioning:
Some installations will do more than FSSO alone. It is advisable to follow this Technical Tip: Best practices on hardening FortiAuthenticator environments, not only to have the resources provisioned correctly but also to avoid having them used up by unintended access or authentication attempts. It already helps to disable interface services that are not required and to check the RADIUS server configuration, which by default may cause additional load on FortiAuthenticator.
- Caching:
Caching avoids repeated lookups. It is quite normal for a station to generate multiple logins; more extreme cases may easily result in 2000 logins of a station per hour. While this amount is possible, the source of the logins should be investigated to see whether they can be avoided, which saves processing power in the end. With DCAgent and Event log polling in particular, each login is processed in full, including a DNS lookup and an LDAP lookup. Caching helps to avoid duplicate queries/lookups. For this reason, it is recommended to keep caching enabled and to keep cached entries for as long as possible.
- DNS Cache:
This is found in GUI -> Network -> DNS and can be enabled or disabled. The TTL setting specifies how long each entry is to be kept available locally. The default setting '0' will use the TTL that comes with the DNS response (set on the DNS server). Note that the results (positive and negative, except timeouts) of the last DNS query are kept.
DNS Cache
The DNS server configuration means that the primary is always used, but if there is no response after 5 seconds, the secondary will be contacted. If there is a result from the secondary, it will then be kept in the Cache. Note that if Cache is used, IP changes will not be detected until the respective entries' cache expiry.
- LDAP Cache:
The LDAP Cache settings are found at GUI -> Fortinet SSO -> Settings -> User Group Membership:
LDAP Cache
The settings here greatly influence performance, not just because of the Cache. The settings in detail:
- AD Server discovery:
Automatic discovery will discover all domain controllers available for LDAP lookups via the SRV record.
The sub-option to restrict the auto-discovered domain controllers to configured ones means that only the LDAP servers configured in GUI -> Authentication -> Remote Auth. Servers -> LDAP are used.
Manual discovery will allow the choice of specific servers.
This option also influences the SSO domains, visible in GUI -> Monitor -> SSO -> Domains.
- Group cache mode:
Fixed Lifetime will expire a cached entry after the time set in 'Group cache item lifetime'. The first new logon that appears after the cache item has expired triggers a new lookup to the server. There is an option to clear the cache in case new results are needed.
Active refresh will actively refresh cached entries after the specified timer at the update period interval below.
- 'Always fetch groups from the AD server for these sources on a new logon event': This will bypass the LDAP cache settings above for certain methods. As per requirements, if domain group memberships change often and this needs to be reflected in FSSO, then the caching will interfere with this, as new group memberships wouldn't be noticed due to caching.
- 'Restrict user groups to groups defined in global pre-filter if configured': This option is not related to the caching function, but makes the list of SSO session users more manageable.
If the global pre-filter has the groups defined for filtering at GUI -> Fortinet SSO -> Filtering -> FortiGate, choose the option to 'Forward FSSO information for users from the following subset of users/groups/containers only' and define the respective groups. Processing groups and names will be faster, and the SSO Session Monitor becomes more readable for maintenance.
Note: Caching will avoid lookups. In a few use cases, lookups are required, for example, when the IP address is changed on purpose or the group membership changes often and needs to be reflected on the firewall entry. Adapt the caching to the maximum tolerable time. Disabling the caches will have a performance impact on both FortiAuthenticator and the DNS/LDAP servers/domain controllers.
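The effect of a fixed-lifetime cache on lookup volume can be sketched in a few lines of Python. This is a simplified model for illustration, not FortiAuthenticator's implementation; the class name, the `lookup` callable, and the injectable `clock` are assumptions made for the example.

```python
import time

class FixedLifetimeCache:
    """Cache lookup results for a fixed lifetime and count real lookups."""

    def __init__(self, lookup, lifetime_seconds, clock=time.monotonic):
        self._lookup = lookup              # e.g. an LDAP group or DNS query
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._items = {}                   # key -> (expires_at, value)
        self.misses = 0                    # number of lookups sent to the server

    def get(self, key):
        now = self._clock()
        cached = self._items.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]               # still valid, no server query
        self.misses += 1
        value = self._lookup(key)
        self._items[key] = (now + self._lifetime, value)
        return value

    def clear(self):
        """Drop all entries, forcing fresh lookups on the next logon."""
        self._items.clear()
```

In this model, 2000 logons of the same user within the lifetime cause one lookup instead of 2000. The trade-off is the one described above: a changed group membership or IP address stays unnoticed until the entry expires or the cache is cleared.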
- Other settings that may have a performance impact:
- GUI -> Fortinet SSO -> Settings -> Methods.
- Windows Event log polling:
- Configure Events [button]: This makes FortiAuthenticator filter for and read only the configured events. Which ones to use depends on the environment. The most common, though, are 4624, 4768, and 4776. The fewer event IDs needed to cover the intended user base, the better.
- DNS lookup to get IP from workstation name: If the logon event comes with a workstation, FortiAuthenticator can attempt to do a DNS A-record lookup to get the IP. This is required in most cases. This will be repeated periodically in order to detect IP changes for the workstations. If in doubt, check the respective Event IDs in the Security Event log viewer on the respective DC (run the command 'eventvwr.msc').
- Directly use domain DNS suffix in lookup: This will append the domain name, taken from the DC being polled, to the workstation name to make a DNS query with the FQDN instead of the hostname. May be required.
- Reverse DNS lookup to get the workstation name from IP: If the logon event does not come with a workstation name but with an IP, it is complete and usable. However, if a user changes the workstation's IP, for example, by undocking the laptop, FortiAuthenticator would be unable to track this. Use this option to query for a PTR-record for the IP in order to get the workstation name. FortiAuthenticator can then periodically query for the A-record. If in doubt, check the respective Event IDs in the Security Event log viewer on the respective DC (run the command 'eventvwr.msc').
- Do one more DNS lookup to get a full list of IPs after a reverse lookup of the workstation name: A workstation may have multiple IPs by which it may be known. Pick them all up. Normally not required.
- Include account name ending with $ (usually computer account): Use this to include machine names in FSSO, in case of automated tasks that cause logins that are supposed to be used with FSSO. Normally not required.
- FortiClient SSO Mobility Agent Service:
- Require client certificate in TLS connection: Use this to secure the connection between FortiClient/Agent and FortiAuthenticator - the FortiClient will then have to supply a certificate whose issuer FortiAuthenticator trusts. Alternatively, use 'Enable Authentication', which adds a field to define a preshared key.
- Keep-alive interval: FortiClient/Agent will send a keep-alive to FortiAuthenticator at every defined interval. If FortiAuthenticator receives no keep-alive, it will expire the workstation's session after the 'Idle timeout'.
As FortiClient will instantly communicate its new address to FortiAuthenticator, keep-alive intervals of 10 minutes or longer are valid. Be sure to adjust the 'Idle timeout' to a sensible value. FortiAuthenticator informs the client at check-in when to report again.
- DC/TS Agent Clients:
- DNS lookup to get IP from workstation name: This setting behaves the same as in Windows event log polling. Note that this is not required if the DCAgent is set to resolve the workstations to IPs. Depending on whether the events contain the workstation name, this setting affects either the DNS server load (when enabled) or the user experience (when disabled): if there is only an IP in the event, IP changes may not be detected in time. It is recommended to enable this to cover the use case of IP changes. Otherwise, disable it.
- Ignore the workstation name that is not a full DNS name: Use this option to drop workstation names without the DNS suffix. This will avoid querying the DNS server for an A-record of <hostname> when the server would only know the answer by querying the FQDN. It is recommended to enable this option, but only when the DNS server is known to not respond to hostname queries, only to FQDN queries.
- Reverse DNS lookup to get workstation name from IP: As in event log polling, query the DNS server for a PTR-record in case the workstation is unknown. When DNS lookup to get IP from the workstation name is enabled, no workstation will be known, and for all workstations, PTR-records will be queried.
- Windows Active Directory workstation IP verification: FortiAuthenticator will try to validate whether the user is still logged on to the workstation, or not. This requires WMI traffic from FortiAuthenticator to the workstation. The sub-options will attempt to use a new IP if found via A-record lookup. Normally, this is not needed as in many environments the traffic from FortiAuthenticator to the workstation is not allowed or intended.
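The distinction used by 'Ignore the workstation name that is not a full DNS name' can be illustrated with a small Python check. How FortiAuthenticator decides this internally is not described in this article; the sketch assumes that a 'full DNS name' simply means a name with at least two non-empty labels, and the function name is illustrative.

```python
def is_full_dns_name(workstation):
    """Return True for names like 'pc01.forti.lab', False for bare hostnames."""
    name = workstation.strip().rstrip(".")   # tolerate a trailing root dot
    labels = name.split(".")
    return len(labels) >= 2 and all(labels)

for candidate in ("pc01.forti.lab", "PC01", "pc01."):
    print(candidate, is_full_dns_name(candidate))
```

A check of this kind can be used on exported logon events to estimate how many events would be dropped before the option is enabled.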
- GUI -> Fortinet SSO -> Methods -> Windows Event Log -> <server-entry> -> LDAP Lookup.
LDAP lookup in event log source
This directs LDAP group lookup traffic by priority setting. Primary sources will be queried for user information via LDAP. If configured as a secondary source, this server will only be queried if all other configured primary source servers fail. If this source is disabled, the source will not be used for lookups. It is possible to disable all configured event log sources when used in conjunction with GUI -> Fortinet SSO -> Settings -> User Group Membership -> 'Restrict auto-discovered domain controllers to configured Windows event log sources and remote LDAP servers'.
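The priority logic described above can be sketched as follows in Python. This models only the behavior stated in this article (primary first, secondary only if all primaries fail, disabled never); the data layout and the `is_reachable` callable are assumptions for illustration.

```python
def ldap_servers_to_query(sources, is_reachable):
    """Return the servers eligible for a group lookup.

    `sources` is a list of (host, role) tuples with role "primary",
    "secondary", or "disabled". `is_reachable` is a callable(host) -> bool.
    """
    primaries = [h for h, role in sources if role == "primary" and is_reachable(h)]
    if primaries:
        return primaries
    # Secondary sources are consulted only when every primary source fails.
    return [h for h, role in sources if role == "secondary" and is_reachable(h)]

# Hypothetical hosts for illustration.
sources = [("dc1", "primary"), ("dc2", "primary"), ("dc3", "secondary"), ("dc4", "disabled")]
print(ldap_servers_to_query(sources, lambda host: host != "dc1"))
```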
- Secure connections:
This does not refer to one specific setting. Unencrypted LDAP or SSOMA traffic is acceptable if the network is trusted.
If the network is not trusted and unsolicited/unwanted packet captures could be taken inside the network, enable secure connections where required. This will, however, add to every connection the overhead of the TLS protocol, and add to response time and general processing.
TLS also needs to consider maintenance, as required certificates for use with TLS/STARTTLS always have a validity period. Certificates must be renewed before expiry, otherwise a downtime is possible due to failing connections to the other node, like the LDAP server.
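Certificate expiry can be tracked with a small helper as part of maintenance planning. The Python sketch below assumes the expiry date is available in the text format that Python's ssl module reports for a certificate's notAfter field; the 30-day warning threshold and the function names are arbitrary illustration choices.

```python
import ssl
import time

def days_until_expiry(not_after, now=None):
    """Days left until the certificate's notAfter time (negative if expired).

    `not_after` uses the format reported by Python's ssl module,
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400

def needs_renewal(not_after, warn_days=30, now=None):
    """True if the certificate expires within `warn_days` days."""
    return days_until_expiry(not_after, now) <= warn_days
```

Running such a check regularly against the certificates used for LDAPS/STARTTLS and SSOMA gives advance notice before connections start to fail.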
- GUI -> Fortinet SSO -> Settings -> Log config.
This setting seems inconspicuous, but it is highly recommended to keep the log level at 'Warning'.
Levels 'Info' or 'Debug' may help for troubleshooting, but for FSSO handling, the intensive logging, especially on 'Debug', may cause performance issues itself. Processing threads on FortiAuthenticator itself have a significant overhead writing the logs.
- GUI -> Fortinet SSO -> Filtering -> IP Rules.
By default, all events are admitted to FortiGate. Use this setting to block certain IPs or ranges. See this document: IP rules.
- Fine-grained controls:
This uses the SSO Users and SSO Groups section and configured objects to apply restrictions to users, maximum sessions per user, or block the user out entirely. For more information, see this document: Fine-grained controls.
Note that the above only reflects settings that are highly relevant to performance. Other functional settings are omitted and already explained in the documentation.
Troubleshooting:
Troubleshooting SSO relies on understanding the design, the requirements, and the problem that is to be solved.
Here are some common issues:
- Resource problem, unstable behavior:
Check https://fac-ip/debug/ and find 'Kernel'. Check for errors, specifically 'out of memory' or other obvious repeated errors. Out of memory would indicate that this machine has too little memory for its operation. Compare the machine resources with the datasheet.
- Event log processing is slow or stuck:
Check how FortiAuthenticator receives its logins and how it processes them (LDAP+DNS).
A packet capture on the LDAP port(s) over a longer duration will help; the same is true for DNS.
Use this procedure to create the captures (assuming LDAP traffic is not encrypted), one by one, not simultaneously, as FortiAuthenticator only stores one capture:
exec tcpdumpfile -i any port 389 or port 3268
Or:
exec tcpdumpfile -i any port 53
Alternatively, this can also be done on the firewall that forwards this traffic from FortiAuthenticator to the LDAP or DNS server. This would allow for parallel captures, making DNS and LDAP traffic for a single event visible.
Leave the capture(s) running for about 20 minutes; longer is better and more representative. When done or confirmed that the issue is present, stop the capture with CTRL-C. Download the capture at https://fac-ip/debug/pcap-dump.
Open the resulting capture with Wireshark.
For LDAP, go to the toolbar and open Statistics, then to 'Service Response Time' (SRT) and there to 'LDAP'. A new Window opens which will show a quick analysis like this:
LDAP Service Response Time
In this example, the service response time is fast: over 220438 LDAP searches, a maximum response time of 420ms was calculated.
That particular search may be slow, but it is far from the average, which is 33ms.
For DNS, employ a simple filter:
dns.time > 1
This shows every packet where the DNS response is taking longer than one second. This will contribute to delays. If there are results, these have to be investigated. The "Time" field in the DNS responses is calculated by Wireshark and is usable for this filter.
This filter can be narrowed down further to exclude the local caching mechanism, which will also appear slow when the responding server is slow. FortiAuthenticator consults a local caching DNS server, which responds directly or requests the information from the configured DNS server. The traffic to the local cache is not needed for this analysis. Filter these packets out with this example filter:
dns.time > 1 && ip.addr != 127.0.0.1
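For long captures, the same check can be scripted. The Python sketch below assumes the DNS responses were first exported to tab-separated text on an analysis workstation, for example with `tshark -r capture.pcap -Y "dns.flags.response == 1" -T fields -e ip.src -e dns.qry.name -e dns.time` (the file name is a placeholder); the function name and the column order are choices made for this example.

```python
def slow_dns_responses(tsv_lines, threshold=1.0, ignore_servers=("127.0.0.1",)):
    """Return (server, query name, seconds) for responses slower than threshold.

    Each input line is expected to hold: ip.src <TAB> dns.qry.name <TAB> dns.time
    """
    slow = []
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or not fields[2]:
            continue                      # no response time recorded
        server, name, seconds = fields[0], fields[1], float(fields[2])
        if server in ignore_servers:
            continue                      # skip answers from the local cache
        if seconds > threshold:
            slow.append((server, name, seconds))
    return sorted(slow, key=lambda row: row[2], reverse=True)

# Hypothetical export lines for illustration.
sample = [
    "10.0.0.53\tpc01.forti.lab\t0.004",
    "10.0.0.53\tpc02.forti.lab\t2.5",
    "127.0.0.1\tpc02.forti.lab\t2.6",
]
print(slow_dns_responses(sample))
```

Sorting by response time puts the worst offenders first, and grouping the result by server afterwards shows whether one DNS server is responsible for most of the delays.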
LDAP delays may not be spotted this way if the server does not respond, or responds slowly, to the TCP handshake, as there is then no LDAP response to measure. The packet capture should therefore be analyzed thoroughly for each of the target destinations. One method is to sort the capture by destination IP and scroll through the ordered list of IPs.
The list of expected servers may also differ from what FortiAuthenticator lists as expected servers. Previously in this article, the section 'AD Server discovery' referred to the SSO domains in GUI -> Monitor -> SSO -> Domains. This section will display what LDAP servers FortiAuthenticator will contact. The list is populated by an SRV record query by default and influenced by the AD Server Discovery options. The SRV record may return stale DC entries, DCs that have existed but are no longer reachable. FortiAuthenticator may try to contact those, and will time out. Crosscheck what the DNS server returns on such a query with either a packet capture or:
execute dig +short srv _ldap._tcp.forti.lab
Check the SSO domains and, if they are incorrect, go to the 'Domain Manager' section in https://fac-ip/debug and check the logs, which state the current SSO domain layout. To refresh the domain layout if it is incorrect, click the [button] to rebuild the SSO domains. If the structure continues to show unexpected domain controllers, check the DNS SRV records or the AD server discovery configuration on FortiAuthenticator. The configuration may also contain a stale LDAP server configuration in GUI -> Authentication -> Remote Auth. Servers -> LDAP. Crosscheck each server's validity, as well as each entry's secondary server IPs/FQDNs.
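To compare the SRV answer against the domain controllers that are known to be in service, the short-form dig output can be checked with a few lines of Python. The sketch assumes one "priority weight port target" record per line, as standard dig prints with +short for SRV records; the host names are placeholders.

```python
def parse_srv_targets(dig_output):
    """Extract target host names from 'dig +short srv' style output."""
    targets = []
    for line in dig_output.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue                      # not a "priority weight port target" line
        targets.append(parts[3].rstrip(".").lower())
    return targets

def stale_candidates(dig_output, known_good):
    """Return SRV targets that are not in the list of known, active DCs."""
    known = {host.rstrip(".").lower() for host in known_good}
    return [t for t in parse_srv_targets(dig_output) if t not in known]

# Hypothetical answer and inventory for illustration.
answer = "0 100 389 dc1.forti.lab.\n0 100 389 olddc.forti.lab.\n"
print(stale_candidates(answer, ["dc1.forti.lab"]))
```

Any host reported here is only a candidate: it may be a valid DC that is missing from the inventory, so each one should be verified before DNS records are cleaned up.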
- SSOMA connection failures after some time of operation:
The FSSO debug logs show:
"FCT: reached maximum client number, cannot accept new connection from ...".
FortiAuthenticator can accept a maximum of 2048 simultaneous connections from SSO mobility agents. That may not sound like much, but the connections are short-lived. If there is a delay in traffic, the connections stay up longer than intended and cannot be closed fast enough. At a shift start, the same may occur if all users try to connect at the same time.
This will typically not result in a bad user experience, as an SSOMA that failed to connect will shortly retry to deliver its message, by which time the connection queue will have emptied. Capture the clients' traffic on FortiAuthenticator for some minutes and check whether delays can be spotted. The keep-alive timer mentioned earlier in the article can influence the ongoing and repeated connections from the client.
If there is a delay in processing (LDAP lookup), the same logs will be written.
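Why delays exhaust the connection limit can be estimated with Little's law: the average number of open connections equals the arrival rate multiplied by the time each connection stays open. The Python sketch below uses the limit of 2048 stated above; the arrival rate and durations are hypothetical values.

```python
MAX_SSOMA_CONNECTIONS = 2048  # limit stated in this article

def concurrent_connections(arrivals_per_second, seconds_open):
    """Average number of simultaneously open connections (Little's law)."""
    return arrivals_per_second * seconds_open

def exceeds_limit(arrivals_per_second, seconds_open):
    return concurrent_connections(arrivals_per_second, seconds_open) > MAX_SSOMA_CONNECTIONS

# Hypothetical shift start with 500 new connections per second.
print(exceeds_limit(500, 0.2))  # connections close after 200 ms
print(exceeds_limit(500, 5.0))  # connections held open for 5 s by slow lookups
```

At the same arrival rate, the limit is reached only once the connections are held open for longer, which matches the observation that the log message appears when traffic or lookups are delayed.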
- The workstation's IP address changed, but FortiAuthenticator does not see the change, or sees it only after a long delay:
Have a workstation (and its name and current IP) ready that can reproduce the problem. On FortiAuthenticator CLI, run:
exec nslookup workstation.forti.lab
The result should reflect the current IP.
Change the IP on the client with the regular method that produces the reported problem.
Keep repeating on FortiAuthenticator CLI:
exec nslookup workstation.forti.lab
until the IP address reflects the new address of the workstation.
When this is the case, check the SSO Sessions monitor in the FortiAuthenticator GUI to compare how long it takes for the session to be updated.
If the DNS result of nslookup is delayed, check whether the DNS cache is enabled on System -> Network -> DNS. If it is not, but the delay is still visible (which is likely), run the DNS packet capture on CLI (described above) and use the timestamps to see whether the DNS server returns the wrong address. If so, investigate the server. If the address is updated fast, but FortiAuthenticator is delaying the update, go back to checking the resources of FortiAuthenticator and the logs available at https://fac-ip/debug.
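The repeated manual lookup can also be timed from an administrative workstation that uses the same DNS server as FortiAuthenticator. The following Python sketch is an assumption-based helper, not a FortiAuthenticator feature; the polling interval, timeout, and IP address in the usage example are illustrative values.

```python
import socket
import time

def wait_for_ip(hostname, expected_ip, resolve=socket.gethostbyname,
                interval=5.0, timeout=600.0,
                clock=time.monotonic, sleep=time.sleep):
    """Poll DNS until `hostname` resolves to `expected_ip`.

    Returns the elapsed seconds, or None if the timeout is reached first.
    """
    start = clock()
    while True:
        try:
            current = resolve(hostname)
        except OSError:
            current = None                # treat resolution errors as "not yet"
        elapsed = clock() - start
        if current == expected_ip:
            return elapsed
        if elapsed >= timeout:
            return None
        sleep(interval)
```

A call such as `wait_for_ip("workstation.forti.lab", "10.0.1.9")`, started right after the IP change on the client, returns how long DNS propagation took. Note that the local resolver of the workstation running the script may cache answers itself, which would distort the measurement.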
- RADIUS Accounting with random failures:
Like SYSLOG, RADIUS Accounting is rather flexible and reliable. It is typically used by, but not limited to, Wireless controllers.
If the reported problem can be reproduced, or pinpointed to a certain group or subnet, a packet capture combined with the logs at https://fac-ip/debug and the FortiGate user event logs is likely to provide an explanation. RADIUS Accounting messages come in two types relevant for this setup: a START message and a STOP message.
The START message will, if all mapping of attributes is set up correctly, create a regular user session, whereas the STOP message will end it.
The packet capture should be generated in one of these ways:
exec tcpdumpfile -i any port 1813
exec tcpdump -nnvvi any port 1813
The former creates a packet capture in PCAP format. When done or confirmed that the issue is present, stop the capture with CTRL-C. Download the capture at https://fac-ip/debug/pcap-dump. The latter writes everything as less readable text to the CLI; nevertheless, it records the username and the message type (start or stop).
There have been occasions where the controller initiating the Accounting messages sends unexpected STOP messages for some reason. If the packet capture confirms this, the reason needs to be understood. Possible reasons include roaming or a bad signal to the connected access point; the WLC may deem the client to be disconnected.
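Once the START and STOP messages have been extracted from the capture, unexpectedly short sessions can be listed with a small script. The Python sketch below assumes the messages are already available as (timestamp in seconds, username, type) tuples; the 60-second threshold and the function name are arbitrary illustration choices.

```python
def short_sessions(events, min_seconds=60.0):
    """Return (user, duration) for sessions stopped sooner than `min_seconds`.

    `events` is an iterable of (timestamp, username, kind) tuples,
    with kind being "start" or "stop".
    """
    started = {}
    suspicious = []
    for timestamp, user, kind in sorted(events):
        if kind == "start":
            started[user] = timestamp
        elif kind == "stop" and user in started:
            duration = timestamp - started.pop(user)
            if duration < min_seconds:
                suspicious.append((user, duration))
    return suspicious

# Hypothetical events for illustration.
events = [
    (0.0, "alice", "start"), (3600.0, "alice", "stop"),
    (10.0, "bob", "start"), (12.0, "bob", "stop"),
]
print(short_sessions(events))
```

Users that show up repeatedly in the result are good candidates for correlating with the wireless controller logs, for example to check for roaming events at the same timestamps.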
Further investigation:
If further investigation is required, contact TAC for support.
When working with TAC support, be sure to reproduce the issue with logs and packet capture that reflect the occurrence of the issue. Additionally, as a supplement, download the FortiAuthenticator debug reports in GUI -> Logging -> Log Access -> Log and there, select the Download dropdown and download:
An example:
FAC debug logs
If further debugging is to be done on-site, see the following related articles:
Technical Tip: Explaining FSSO - a primer
Technical Tip: How FSSO works and how to troubleshoot FSSO
Troubleshooting Tip: How to debug FortiAuthenticator Services
Troubleshooting Tip: FSSO Complete troubleshooting for TAC tickets