Description
This article expands upon the FortiAuthenticator administration guide High Availability section, in particular setting up a load-balancing (Layer 3) cluster.
Related document:
High availability
Scope
FortiAuthenticator.
Solution
FortiAuthenticator offers two different clustering modes – active-passive (Layer 2), and load-balancing (Layer 3):
With active-passive clustering, two FortiAuthenticators appear as a single device to the wider network, much like a FortiGate cluster, with an HA link and shared IP addresses on which they are reachable.
With a load-balancing setup, two (or more) FortiAuthenticators essentially act as standalone devices, but synchronize a large part of their databases.
They can act as backup to each other, but may be geographically separate and utilize completely different networks.
In a load-balancing setup, a single FortiAuthenticator (or an active-passive cluster) serves as the primary unit, and up to ten single FortiAuthenticators may be added as load-balancing nodes.
There is no automated failover between them; devices that use the FortiAuthenticators must be pointed at a different IP address to reach a different node, as if switching to a completely independent device.
Not all configuration is synchronized between the primary and load-balancing nodes. Changes are only synchronized from the primary node to load-balancing nodes, not the other way around; the primary node essentially holds the authoritative database version, and load-balancing nodes copy it. Any changes made on load-balancing nodes are not reflected on the primary, and may be overwritten on the next sync with the primary.
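The one-way behavior described above can be illustrated with a minimal Python sketch. This is not how FortiAuthenticator implements synchronization; the dictionaries, table names, and the sync_from_primary helper are hypothetical and only model the rule that the primary's copy of a synced table replaces whatever the node holds.

```python
import copy


def sync_from_primary(primary_db, node_db, synced_tables):
    """Overwrite each synced table on the node with the primary's copy (one-way)."""
    for table in synced_tables:
        node_db[table] = copy.deepcopy(primary_db.get(table, {}))
    return node_db


# Hypothetical data: the same user exists on both devices,
# but was edited locally on the load-balancing node.
primary = {"local_users": {"alice": {"email": "alice@example.com"}}}
node = {
    "local_users": {"alice": {"email": "changed-on-node@example.com"}},
    # Not part of the synced tables, so it stays local to the node.
    "radius_clients": {"fw1": {"ip": "192.0.2.10"}},
}

sync_from_primary(primary, node, synced_tables=["local_users"])

# The local edit on the node is gone; the primary was never modified.
print(node["local_users"]["alice"]["email"])
print(primary["local_users"]["alice"]["email"])
```

In this model, the edit made on the node is lost after the sync, while configuration outside the synced tables remains untouched on the node.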
Up to and including firmware version 6.5, only the following is synchronized:
- Tokens and seeds.
- Local user database.
- Remote user database.
- Group mappings.
- Token and user mappings.
- Certificates included in Certificate Management -> End Entities -> Local Services, excluding firmware (Fortinet) certificates.
- Certificate Management -> Certificate Authorities -> Local CAs, including firmware (Fortinet) certificates.
- Certificate binding settings for local/remote user accounts.
- SAML configurations:
  - IdP settings are configured in Authentication -> SAML IdP -> General.
  - SP settings are configured in Authentication -> SAML IdP -> Service Providers.
Note:
Realm tables are not synchronized, but the default realm selection (radio button) is.
Note:
Admin users (local or remote users with the admin role) are NOT synced by default. Starting in 6.2, each admin user has a toggle that enables synchronization of that user to the load-balancing nodes. This toggle needs to be set manually.
All other settings are NOT synchronized.
This includes any remote authentication servers, RADIUS/LDAP service settings, policies, etc.
This in turn means that some configuration must be completed on the load-balancing nodes before clustering and synchronization can be set up correctly.
Starting in firmware version 6.6, more configuration can be synchronized. This is optional; settings are available under System > Administration > High Availability on the primary node. By default, the synchronization behavior is the same as in 6.5 and earlier.
Note:
Any load-balancing nodes should have the same licensing as the primary node/cluster if it is a VM, or should be the same hardware model as the primary node/cluster. They also need to run the same firmware version. Load-balancing clusters CAN form between different models, but there may be issues with synchronization, especially if the number of supported users is different.
- Preparations.
Start by fully configuring the standalone primary (or the active-passive cluster); add remote servers, users, RADIUS policies, etc.
Note:
If possible, do not add FortiTokens to the FortiAuthenticator(s) yet; they can complicate initial synchronization if already present on the primary node(s) and assigned to users.
Take note of all configuration other than the user/group, certificate, and SAML settings: any RADIUS/LDAP configuration, SSO configuration, Portal configuration, etc. should be recorded.
These config items need to be recreated manually on any load-balancing node(s), up to and including firmware 6.5.
In firmware version 6.6 and higher, SOME of this configuration (like remote servers, realms, RADIUS policies) can be synced, but other config (like SSO, Portals) cannot be synced.
Any config that is NOT synced needs to be recreated manually on the load-balancing node, using identical names.
When config items (like users/groups) are synced, they may contain references to objects that are not synced; if those objects do not exist on the load-balancing node, the synchronization will fail. For example, if a user belonging to server LDAP1 is synced to a load-balancing node, but the load-balancing node has no entry for a server LDAP1, this will result in an error.
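The dependency problem described above can be sketched in Python as a pre-check. The data layout, table names, and the find_missing_references helper are hypothetical and do not correspond to any FortiAuthenticator API; the sketch only shows the idea of comparing the names that synced items refer to against the names that exist on the load-balancing node.

```python
def find_missing_references(synced_items, local_objects):
    """Return (item, table, name) for every reference that has no
    identically named object on the load-balancing node."""
    missing = []
    for item in synced_items:
        for table, name in item["references"].items():
            if name not in local_objects.get(table, set()):
                missing.append((item["name"], table, name))
    return missing


# Hypothetical example: two remote users synced from the primary.
synced_users = [
    {"name": "jdoe", "references": {"ldap_servers": "LDAP1"}},
    {"name": "asmith", "references": {"ldap_servers": "LDAP2"}},
]

# Objects that were created manually on the load-balancing node.
node_objects = {"ldap_servers": {"LDAP2"}}

for user, table, name in find_missing_references(synced_users, node_objects):
    print(f"{user}: no entry named {name!r} in {table} on the load-balancing node")
```

Here only the user that references LDAP1 is reported, which mirrors the example in the text: the node needs an entry named LDAP1 before that user can be synced.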
If the planned setup consists of an active-passive cluster and load-balancing nodes, not just a standalone primary with nodes, please note there are additional routing considerations:
- The load-balancing node must be able to reach the active-passive cluster HA interfaces.
- The active-passive cluster must be able to reach the load-balancing node via HA interfaces.
If the primary and load-balancing nodes utilize the same infrastructure (the same LDAP/RADIUS servers, the same clients, etc.), you can work around needing to manually recreate the configuration as follows (provided console access to the load-balancing node is in place):
- Take a backup of the primary.
- Restore it on the load-balancing node.
- Correct the IP/network settings via the console (same CLI syntax as FortiGate: 'config system interface' and 'config router static').
- Log into the GUI of the load-balancing node.
- Correct the hostname, alias, DNS settings, and anything else that should be unique on the load-balancing node.
- Upload the correct VM licence if the load-balancing node is a VM.
- If HA settings are already in place, disable HA and save the change.
Routing and Network considerations:
- The load-balancing node must be able to reach the standalone primary on any specific interface, or the cluster primary on its HA interface.
- The primary must be able to reach the load-balancing node on any specific interface; if the primary is a cluster node, it must be able to reach the load-balancing node via the HA interface.
- The HA link between the primary and the load-balancing node(s) is handled via an OpenVPN tunnel that the load-balancing node initiates. UDP ports 720, 721, and 1194 must be allowed between the primary and the load-balancing node.
- If the link between the primary and the load-balancing node goes through a VPN (like a site-to-site IPsec tunnel), this frequently leads to fragmentation issues and thus unstable HA links. Setting a lower MTU on FortiAuthenticator interfaces (identical to FortiGate, 'config system interface > edit > set mtu') or setting tcp-mss on the policies the OpenVPN/IPsec traffic passes through can help.
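As a rough aid for choosing a lower MTU or tcp-mss value, the arithmetic can be sketched in Python. The 20-byte IPv4 and 20-byte TCP header sizes (without options) are standard; the tunnel overhead is an assumption that depends on the VPN type and ciphers in use, so the value of 100 bytes below is only a placeholder and should be replaced with a figure measured for the actual path.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def usable_mtu(link_mtu, tunnel_overhead):
    """MTU left for inner packets once the tunnel encapsulation is subtracted."""
    return link_mtu - tunnel_overhead


def tcp_mss_for(mtu):
    """Largest TCP payload that fits into one unfragmented IPv4 packet of this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


# Plain Ethernet path without a tunnel.
print(tcp_mss_for(1500))

# Path through a tunnel; 100 bytes of overhead is a placeholder assumption.
inner_mtu = usable_mtu(1500, tunnel_overhead=100)
print(inner_mtu, tcp_mss_for(inner_mtu))
```

Values derived this way are starting points only; whether the HA link stays stable still needs to be confirmed on the actual path.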
- Enabling High Availability.
Once the primary is fully configured, and the load-balancing nodes have the prerequisite configuration in place, High Availability can be enabled.
On the primary, set the following under System > Administration > High Availability:
- Enable HA.
- Set the Role to 'Standalone Primary'.
- Set a pre-shared key to be used by the cluster.
- If an active-passive cluster serves as primary, skip the three steps above (enabling HA, setting the role, and setting the pre-shared key).
- Add the IP addresses of any load-balancing nodes in the 'Load Balancers' table. Up to ten may be added.
- If an active-passive cluster serves as primary, add a route to the load-balancing node IP on the primary; set the HA interface as outgoing interface.
A routing diagram of an active-passive cluster with load-balancing node:
Configuration on the primary (this is automatically mirrored to the secondary)
On the load-balancing nodes, set the following under System > Administration > High Availability:
- Enable HA.
- Set the Role to 'Load Balancer'.
- Add the primary IP address under 'Load Balancing primary IP address'.
- If the primary is an active-passive cluster, set the HA interface IP of the primary unit (priority high) in the cluster.
- Set the same pre-shared key as on the primary (or used by the primary cluster).
Set a route to the primary on the load-balancing node IF the primary is not reachable via the default route.
Note:
No HA interface needs to be configured on the load-balancing unit; FortiAuthenticator will use the interface as indicated by its routing table to communicate with other nodes in the load-balancing setup.
If a cluster serves as primary node, leave the cluster HA settings untouched and only add the load-balancing nodes with their IP addresses.
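Before enabling HA, the planned settings can be cross-checked against the requirements stated in this article. The Python sketch below is purely illustrative: the dictionaries, field names, addresses, and version strings are hypothetical planning data that would be filled in by hand, and nothing here reads from or writes to a FortiAuthenticator.

```python
MAX_LOAD_BALANCERS = 10  # limit stated in this article


def validate_plan(primary, nodes):
    """Return a list of problems found in a planned load-balancing setup."""
    problems = []
    if len(nodes) > MAX_LOAD_BALANCERS:
        problems.append(
            f"{len(nodes)} load-balancing nodes planned; at most {MAX_LOAD_BALANCERS} may be added"
        )
    for node in nodes:
        if node["psk"] != primary["psk"]:
            problems.append(f"{node['name']}: pre-shared key differs from the primary")
        if node["firmware"] != primary["firmware"]:
            problems.append(f"{node['name']}: firmware version differs from the primary")
        if node["primary_ip"] != primary["ip"]:
            problems.append(f"{node['name']}: 'Load Balancing primary IP address' does not point to the primary")
    return problems


# Hypothetical plan. For an active-passive cluster, "ip" would be the
# HA interface IP of the primary unit in that cluster.
primary = {"ip": "192.0.2.1", "psk": "example-key", "firmware": "6.5.3"}
nodes = [
    {"name": "lb-1", "primary_ip": "192.0.2.1", "psk": "example-key", "firmware": "6.5.3"},
    {"name": "lb-2", "primary_ip": "192.0.2.1", "psk": "wrong-key", "firmware": "6.5.3"},
]

for problem in validate_plan(primary, nodes):
    print(problem)
```

With this sample data, only the mismatched pre-shared key on lb-2 is reported.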
If not done before, FortiTokens may now be added to the primary device/cluster. If tokens were in place before High Availability was enabled, this can cause synchronization issues with the FortiToken/user mappings.
Trial tokens are removed in a load-balancing cluster setup! If they are in use, this can lead to synchronization issues and/or errors when trying to enable load-balancing HA.
- Viewing High Availability status and Troubleshooting.
The HA status is visible under System > Dashboard > HA status. Depending on the device on which the HA status is viewed, it lists either the primary or the load-balancing nodes. It also shows the synchronization status; clicking on a node provides details on which items are or are not synchronized. If there are errors, the icons for the affected items show as orange or red, and clicking on an icon should show additional information on where the error is.
A few common issues that can crop up in a load-balancing setup:
Error: 'Foreign Key error: Entry for name=<> not found in <> on local server' or something similar.
Solution: The entry with the specified name is not present on the load-balancing node in the specified table (for example, a remote LDAP server is missing).
Create an appropriate entry with a name identical to the one on the primary, add it to realms/portals/policies as required, then synchronize again.
Error: '<> with reference ID <> does not exist'.
Solution: The specified item (like a user or FortiToken) was not synchronized, or was synchronized but with a different reference in the database. This can sometimes happen if a backup was taken from the primary and restored on the secondary with things like User/FortiToken associations already in place. Deleting the appropriate section (like remote users) and synchronizing it again from the primary usually resolves this.
General troubleshooting steps for synchronization issues:
The following actions can help resolve issues with load-balancing clusters:
- Rebuilding HA tables. This can be done under System > Dashboard > HA Status. It causes the load-balancing node to rebuild the database from information synced from the primary. While this is in progress, the load-balancing node may be slowed down. This does not trigger a reboot and does not affect the primary.
- Checking logs and /debug section. Some details on synchronization issues may show in FortiAuthenticator GUI logs under Logging -> Log Access -> Logs. Typing 'High Availability' into the search bar in the upper right will limit the display to relevant logs. There is a debug section available under https://<FortiAuthenticator>/debug. Debug sections may be selected in the drop-down menu in the upper left corner. The “LB” and “LB HA Sync” sections are particularly relevant. These logs can also be downloaded.
- Recalculating checksums. If the configuration appears to be in sync (same number of users, same groups present etc.), but HA status still shows as out of sync, recalculating the cluster checksums can resolve this. The cluster checksums can only be recalculated on the primary node under https://<FortiAuthenticator>/debug/lb_sync