Description
This article describes how to use and configure Remote Link Monitoring in combination with an FGCP HA cluster.
Scope
FortiGate.
Solution
Basic steps to implement a Remote Link Monitor:
There is no real limitation on how Remote Link Monitoring can be applied. It can be implemented either in conjunction with Port Monitoring or in standalone mode, and it can involve one or several remote devices reachable from one or several ports of the FGCP HA cluster (a variant probing several remote servers from a single port is sketched after the example below).
The following information is to be specified (see the numbered annotations in the configuration example below):
Example of a Remote Link Monitoring configuration as defined above:
show system link-monitor
config system link-monitor
edit ha-link-monitor
set server 10.10.10.10 <------------- 1. IP address of the remote device to be monitored.
set srcintf port1 <------------- 2. Port of the cluster from which the remote device is reachable.
set protocol ping <------------- 3. Probing protocol (ping is the default setting).
set ha-priority 1 <------------- 4. 'Nominal' PENALTY added for HA purposes when this monitor fails (1 is the default value).
set interval 5 <------------- 5. Time, in seconds, between two consecutive probes (5 is the default value).
set failtime 2 <------------- 6. Number of consecutive failed probes before the monitor is flagged as dead (5 is the default value).
end
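As mentioned earlier, a single Remote Link Monitor entry can also probe several remote servers from the same port. The sketch below is a hedged illustration of that variant; the second server address (10.10.10.11) is purely illustrative:
config system link-monitor
edit ha-link-monitor
set server 10.10.10.10 10.10.10.11 <------------- several remote servers probed by the same monitor
set srcintf port1
set protocol ping
set ha-priority 1
end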
Once a Remote Link Monitor (also referred to as Remote IP Monitoring or PING server Monitoring in the documentation) has been defined, it can be integrated into the cluster HA configuration by specifying the following information (see the numbered annotations in the example below):
show system ha
config system ha
…
set pingserver-monitor-interface port1 <------------- 1. Interface(s) whose Remote Link Monitor results are taken into account by the HA process.
set pingserver-failover-threshold 0 <------------- 2. Failover threshold the 'global' PENALTY is compared against (0 is the default value).
set pingserver-secondary-force-reset disable <------------- 3. Controls whether the 'preferred' primary 'global' PENALTY is reset when the FLIP timer elapses (option enabled by default).
set pingserver-flip-timeout 60 <------------- 4. FLIP timer duration (by default set to 60 minutes).
…
end
Note: Having the 'pingserver-failover-threshold' variable set to '0' is a means to trigger an HA failover as soon as a remote link failure is detected.
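As an illustration of a non-zero threshold, the sketch below is a hedged example (interface names, addresses, and values are purely illustrative) with two Remote Link Monitors, each contributing a 'nominal' PENALTY of 1, and 'pingserver-failover-threshold' set to 2, so that, following the mechanism described below, a failover is only considered once both monitors have failed:
config system link-monitor
edit mon-port1
set server 10.10.10.10
set srcintf port1
set ha-priority 1
next
edit mon-port2
set server 10.10.20.10
set srcintf port2
set ha-priority 1
next
end
config system ha
set pingserver-monitor-interface port1 port2
set pingserver-failover-threshold 2 <------------- reached only once both monitors are down (1 + 1 = 2)
end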
Note: Link monitoring is a mechanism to activate the FGCP HA election process. The decision to trigger a failover or not is ultimately taken by the HA process itself, based on HA parameter values such as whether the 'override' parameter is enabled, the HA 'priority' value set on each cluster unit, and so on.
The scenario detailed below is based on the assumption that the HA 'override' parameter is enabled and the cluster 'preferred' primary is set with a higher 'priority' value than the secondary. This type of setting is typically used when one of the cluster units must act, as far as possible, as a 'preferred' primary unit.
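For reference, a minimal sketch of the HA settings assumed in this scenario is shown below; the group name, heartbeat interface, and priority values are illustrative only. It would be applied on the 'preferred' primary, the secondary being configured with a lower 'priority' value such as 100:
config system ha
set group-name Example-cluster
set mode a-p
set hbdev port3 50
set override enable <------------- the unit with the highest 'priority' is re-elected primary whenever possible
set priority 200 <------------- higher than the 'priority' of the secondary
end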
Using Remote Link Monitoring in conjunction with FGCP cluster High Availability:
Each time a remote link monitoring failure is detected by the HA cluster primary unit, the 'global' PENALTY, which is set to '0' by default, is incremented by the 'nominal' PENALTY value (the 'ha-priority' parameter value) and compared to the failover threshold value (the 'pingserver-failover-threshold' variable value).
When the threshold value is reached, the 'global' PENALTY value of the primary is compared with that of the secondary, and if it is higher, the FLIP timer is started and a failover occurs. The new primary starts monitoring the remote link on its own and handles any remote link monitoring failure as described above, i.e., its 'global' PENALTY is incremented by its 'nominal' PENALTY value, up to the point where the failover threshold value is reached. The action taken when the FLIP timer elapses then depends on the 'pingserver-secondary-force-reset' variable value, as detailed in the two cases below.
With 'pingserver-secondary-force-reset' enabled (the default setting): when the FLIP timer elapses, the 'preferred' primary 'global' PENALTY is reset. Regardless of the remote link monitoring status on the new primary, the cluster automatically returns to normal operation, i.e., a failover occurs since the HA 'override' parameter is enabled and the HA 'priority' of the 'preferred' primary is higher than that of the new primary. The FLIP timer is started, and the 'preferred' primary unit starts remote link monitoring again. If the remote link is restored, the cluster continues to operate normally. If, however, the remote link is still down, the remote link failure causes the cluster to fail over again when the FLIP timer expires.
This sequence, known as a FLIP-FLOP failover, repeats each time the FLIP timer expires, until the failed remote link is restored.
With 'pingserver-secondary-force-reset' disabled: the 'preferred' primary 'global' PENALTY is not reset when the FLIP timer elapses. This way, there is no FLIP-FLOP failover as long as the new primary does not detect any remote link failure, the drawback being that the 'preferred' primary never gets a chance to become primary again, even if the remote link is restored on its side.
Only a manual failover (typically performed once connectivity to the ping server has been restored) or a remote link failure on the new primary side can then trigger a failover. Indeed, if the new primary also experiences a remote link failure, its 'global' PENALTY is increased and becomes equal to that of the 'preferred' primary, thus causing the HA election process to start. In this case, the 'preferred' primary takes the cluster ownership back since the HA 'override' parameter is enabled and the HA 'priority' of the 'preferred' primary is higher than that of the new primary.
Verifying and controlling the Link Monitor can be done using the following command set:
diagnose debug application link-monitor -1
diagnose debug console timestamp enable
diagnose debug enable
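Depending on the FortiOS version, the overall link monitor state can also be checked directly, and the debug output stopped once the verification is complete. The commands below are a hedged list; their availability and output format may vary between releases:
diagnose sys link-monitor status
diagnose debug application link-monitor 0
diagnose debug disable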
Note: In an FGCP HA cluster, only the primary unit can perform remote link monitoring.
By design, a secondary unit cannot perform any monitoring since it has no active routing table. This can be verified from the following debug output excerpt, which was recorded on an FGCP HA cluster configured with the link monitor and HA settings defined previously.
In the example below, FGT1 is configured as 'preferred' primary. It is primary at the beginning of the test and becomes secondary after it loses connectivity with the remote ping server.
1) 08:37:14: the link monitor PING test towards the remote server (10.219.5.237) is performed and is successful.
2) 08:37:19: 5 seconds later (cf. 'interval' variable setting), another PING test is performed and is successful.
3) 08:37:24: 5 seconds later, another PING test is performed and is successful.
4) -> a loss of connectivity between FGT1 and the remote ping server is simulated.
5) 08:37:29: 5 seconds later, another PING test is performed but fails. It is retried a second time (cf. 'failtime' variable setting) and also fails.
6) 08:37:31: the link monitor is flagged as non-operational (cf. 'ha-link-monitor is dead' message).
7) 08:37:33: the routing table is deactivated on FGT1 - a failover occurs (FGT2 becomes primary).
8) 08:37:37: the PING test cycle is re-initiated, but no packets are actually sent since the routing table is inactive.
9) 08:37:42: same as step 8.
10) 08:37:47: same as step 8.
FGT1 # diagnose debug application link-monitor -1
FGT1 # diag debug console timestamp enable
FGT1 # diag debug enable
2019-07-03 08:37:14 lnkmtd::ping_send_msg(256): --> ping 10.219.5.237 seq_no=24344, icmp id=0, send 40 bytes
2019-07-03 08:37:14 lnkmtd:: ha-link-monitor send check request, try 1
2019-07-03 08:37:14 lnkmtd::ping_match(71): try matching ping response 10.219.5.237
2019-07-03 08:37:14 lnkmtd::ping_do_addr_up(57): ha-link-monitor->10.219.5.237(10.219.5.237), rcvd
2019-07-03 08:37:14 monitor_peer_recv-1790: lnkmtd: ha-link-monitor send time 1562135834s 205177us, revd time 1562135834s 206031us
2019-07-03 08:37:14 lnkmtd: ha-link-monitor all servers are probed after 1 times
2019-07-03 08:37:14 policy route related to the monitor(ha-link-monitor) may be added
2019-07-03 08:37:14 lnkmt_ha_mstate_build-182: monitor=ha-link-monitor, state=0,send sz=78
2019-07-03 08:37:14 rcvd cmd = 0
2019-07-03 08:37:19 lnkmtd::ping_send_msg(256): --> ping 10.219.5.237 seq_no=24345, icmp id=0, send 40 bytes
2019-07-03 08:37:19 lnkmtd:: ha-link-monitor send check request, try 1
2019-07-03 08:37:19 lnkmtd::ping_match(71): try matching ping response 10.219.5.237
2019-07-03 08:37:19 lnkmtd::ping_do_addr_up(57): ha-link-monitor->10.219.5.237(10.219.5.237), rcvd
2019-07-03 08:37:19 monitor_peer_recv-1790: lnkmtd: ha-link-monitor send time 1562135839s 205390us, revd time 1562135839s 206305us
2019-07-03 08:37:19 lnkmtd: ha-link-monitor all servers are probed after 1 times
2019-07-03 08:37:19 policy route related to the monitor(ha-link-monitor) may be added
2019-07-03 08:37:19 lnkmt_ha_mstate_build-182: monitor=ha-link-monitor, state=0,send sz=78
2019-07-03 08:37:19 rcvd cmd = 0
2019-07-03 08:37:24 lnkmtd::ping_send_msg(256): --> ping 10.219.5.237 seq_no=24346, icmp id=0, send 40 bytes
2019-07-03 08:37:24 lnkmtd:: ha-link-monitor send check request, try 1
2019-07-03 08:37:24 lnkmtd::ping_match(71): try matching ping response 10.219.5.237
2019-07-03 08:37:24 lnkmtd::ping_do_addr_up(57): ha-link-monitor->10.219.5.237(10.219.5.237), rcvd
2019-07-03 08:37:24 monitor_peer_recv-1790: lnkmtd: ha-link-monitor send time 1562135844s 205655us, revd time 1562135844s 206595us
2019-07-03 08:37:24 lnkmtd: ha-link-monitor all servers are probed after 1 times
2019-07-03 08:37:24 policy route related to the monitor(ha-link-monitor) may be added
2019-07-03 08:37:24 lnkmt_ha_mstate_build-182: monitor=ha-link-monitor, state=0,send sz=78
2019-07-03 08:37:24 rcvd cmd = 0
2019-07-03 08:37:29 lnkmtd::ping_send_msg(256): --> ping 10.219.5.237 seq_no=24347, icmp id=0, send 40 bytes
2019-07-03 08:37:29 lnkmtd:: ha-link-monitor send check request, try 1
2019-07-03 08:37:30 lnkmtd::ping_send_msg(256): --> ping 10.219.5.237 seq_no=24348, icmp id=0, send 40 bytes
2019-07-03 08:37:30 lnkmtd:: ha-link-monitor send check request, try 2
2019-07-03 08:37:30 lnkmtd: ha-link-monitor have tried 2 times, and will restart after 3 seconds
2019-07-03 08:37:31 lnkmtd: ha-link-monitor is dead.
2019-07-03 08:37:31 policy route related to the monitor(ha-link-monitor) may be removed
2019-07-03 08:37:31 lnkmt_ha_mstate_build-182: monitor=ha-link-monitor, state=1,send sz=78
2019-07-03 08:37:31 rcvd cmd = 0
2019-07-03 08:37:33 lnkmt_proute_refresh-582
2019-07-03 08:37:37 rcvd cmd = 0
2019-07-03 08:37:42 rcvd cmd = 0
2019-07-03 08:37:47 rcvd cmd = 0
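After the failover, the new role of each unit can be confirmed from either cluster member, for example with the command below; its output format varies between FortiOS versions but lists the current Primary and Secondary units:
get system ha status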