FortiGate
Mono_FTNT
Staff
Article Id 289056
Description

This article describes how, after an HA cluster failover, the link-monitor status is only updated once a cold start timer of 10 seconds has elapsed.
Detecting a link-monitor status change after a failover therefore takes the time derived from the configured link-monitor timers plus 10 seconds.

Scope

FortiGate v7.0.11, v7.2.5, v7.4.0, or later versions.

Solution

For example, a link-monitor has been configured on HA as follows:

 

KB_DIAGRAM.gif

 

In this situation, if Router-1 goes down, the link-monitor failure is detected after 4 seconds.

  • Time = 'interval' * 'failtime' + 'probe-timeout'
  • Time = 1 * 3 + 1 = 4 seconds

 

Here is the output of link-monitor's debug log:

 

2023-12-12 23:34:06 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=13447, icmp id=1, send 20 bytes
2023-12-12 23:34:06 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)
2023-12-12 23:34:06 lnkmtd::ping_do_addr_up(116): ---> 1->10.10.10.254(10.10.10.254), rcvd
2023-12-12 23:34:06 lnkmtd::monitor_peer_recv(1992): ---> 1 send time 1702391646s 118905us, revd time 1702391646s 119003us
2023-12-12 23:34:07 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=13448, icmp id=1, send 20 bytes
2023-12-12 23:34:07 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)
2023-12-12 23:34:08 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=13449, icmp id=1, send 20 bytes
2023-12-12 23:34:08 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(1)
2023-12-12 23:34:09 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=13450, icmp id=1, send 20 bytes
2023-12-12 23:34:09 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(2)
2023-12-12 23:34:10 lnkmtd::monitor_ppeer_fail(1682): ---> 1(10.10.10.254 ping) is dead.
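
As a rough cross-check of the 4-second figure, the delay can be measured from the debug log itself. The following is a minimal Python sketch, not a Fortinet tool: the function names are illustrative, and it assumes that every line starts with a 'YYYY-MM-DD HH:MM:SS' timestamp, that 'ping_do_addr_up' marks a received reply, and that 'is dead' marks the status change, as in the output above.

```python
from datetime import datetime

# Lines copied from the debug output above (probe send lines trimmed).
SAMPLE_LOG = """\
2023-12-12 23:34:06 lnkmtd::ping_do_addr_up(116): ---> 1->10.10.10.254(10.10.10.254), rcvd
2023-12-12 23:34:07 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)
2023-12-12 23:34:08 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(1)
2023-12-12 23:34:09 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(2)
2023-12-12 23:34:10 lnkmtd::monitor_ppeer_fail(1682): ---> 1(10.10.10.254 ping) is dead.
"""


def line_time(line):
    """Parse the leading 19-character timestamp of a debug line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def seconds_until_dead(log_text):
    """Seconds between the last received reply and the 'is dead' event."""
    last_reply = None
    for line in log_text.splitlines():
        if "ping_do_addr_up" in line:
            last_reply = line_time(line)
        elif "is dead" in line and last_reply is not None:
            return (line_time(line) - last_reply).total_seconds()
    return None  # no reply followed by a 'dead' event was found


print(seconds_until_dead(SAMPLE_LOG))
```

For the sample above, this prints 4.0, which matches the calculated detection time. Because the log timestamps have one-second resolution, a measured value can differ slightly from the formula.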

 

If L2_switch-1 goes down, an HA failover occurs and the link-monitor failure is detected after 14 seconds.

  • Time = 'interval' * 'failtime' + 'probe-timeout' + 'lnkmtd cold start'
  • Time = 1 * 3 + 1 + 10 = 14 seconds
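
The two calculations above can be reproduced with a small Python sketch. The function and parameter names are illustrative and not part of FortiOS. All values are in seconds, as in the formulas, and the 10-second cold start is the value given in this article.

```python
# Illustrative helper, not a FortiOS API: estimates how long link-monitor
# needs to declare a server dead, following the formulas in this article.
LNKMTD_COLD_START_S = 10  # lnkmtd cold start after an HA failover


def detection_time(interval_s, failtime, probe_timeout_s, after_failover=False):
    """Return the estimated detection time in seconds."""
    base = interval_s * failtime + probe_timeout_s
    return base + LNKMTD_COLD_START_S if after_failover else base


print(detection_time(1, 3, 1))                       # Router-1 goes down
print(detection_time(1, 3, 1, after_failover=True))  # L2_switch-1 goes down
```

With the example timers (interval 1, failtime 3, probe-timeout 1), this prints 4 and 14.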

 

Here is the output of link-monitor's debug log:

 

2023-12-12 23:36:11 ha_sync_handle_reset()-471: num_peers=1, local_ip=169.254.0.1
2023-12-12 23:36:11 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=2215, icmp id=709, send 20 bytes
2023-12-12 23:36:11 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)
2023-12-12 23:36:12 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=2216, icmp id=709, send 20 bytes
2023-12-12 23:36:12 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)

<SNIP>
2023-12-12 23:36:20 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=2224, icmp id=709, send 20 bytes
2023-12-12 23:36:20 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)
2023-12-12 23:36:21 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=2225, icmp id=709, send 20 bytes
2023-12-12 23:36:21 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(0)
2023-12-12 23:36:22 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=2226, icmp id=709, send 20 bytes
2023-12-12 23:36:22 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(1)
2023-12-12 23:36:23 lnkmtd::ping_send_msg(409): ---> ping 10.10.10.254 seq_no=2227, icmp id=709, send 20 bytes
2023-12-12 23:36:23 lnkmtd::monitor_proto_peer_send_request(605): ---> 1(10.10.10.254:ping) send probe packet, fail count(2)
2023-12-12 23:36:24 lnkmtd::monitor_ppeer_fail(1682): ---> 1(10.10.10.254 ping) is dead.

 

This cold start mechanism was introduced in v7.0.11, v7.2.5, and v7.4.0 because, after a failover, it takes some time for the virtual MAC addresses, Elastic IP, or Public Static IP (in cloud environments) to move to the new primary. If link-monitor evaluated probes during that window, the packet loss on those probes could cause a failover back to the former primary.

 

Note:

The previous example covers an HA failover in which the link-monitor status becomes dead, but the same applies to status=alive. After an HA failover, it takes 10 seconds plus the timers configured under 'config system link-monitor' for the link-monitor status to be updated to alive.

 

Starting from v7.0.14, v7.2.8, and v7.4.2, the seq_no does not increase during the cold start period, so probes that fail in that window are not counted in the link-monitor packet loss statistics.

 

2025-06-09 11:53:25 HA event
2025-06-09 11:53:25 ha_sync_handle_reset()-559: num_peers=1, local_ip=20.1.2.10
2025-06-09 11:53:25 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3086, icmp id=7, send 20 bytes
2025-06-09 11:53:25 lnkmtd::monitor_proto_peer_send_request(698): ---> L_M_Port1(89.180.243.203:ping) send probe packet, fail count(0)
2025-06-09 11:53:25 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3086, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::monitor_proto_peer_send_request(698): ---> L_M_Port1(89.180.243.203:ping) send probe packet, fail count(0)
2025-06-09 11:53:35 lnkmtd::ping_do_addr_up(136): ---> L_M_Port1->89.180.243.203(89.180.243.203), rcvd
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3086, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::monitor_proto_peer_send_request(698): ---> L_M_Port1(89.180.243.203:ping) send probe packet, fail count(0)
2025-06-09 11:53:35 lnkmtd::ping_do_addr_up(136): ---> L_M_Port1->89.180.243.203(89.180.243.203), rcvd
...
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3087, icmp id=7, send 20 bytes
<----- 10 seconds have passed since the failover; the cold start timer has finished and probes start to count.
2025-06-09 11:53:35 lnkmtd::monitor_proto_peer_send_request(698): ---> L_M_Port1(89.180.243.203:ping) send probe packet, fail count(0)
2025-06-09 11:53:35 lnkmtd::ping_do_addr_up(136): ---> L_M_Port1->89.180.243.203(89.180.243.203), rcvd
2025-06-09 11:53:35 lnkmtd::monitor_peer_recv(2219): ---> L_M_Port1 send time 1749466415s 821681us, revd time 1749466415s 867711us
2025-06-09 11:53:35 lnkmtd::monitor_proute_cmdb_set(1147): ---> policy routes or internet service routes related to the monitor(L_M_Port1) may be added
2025-06-09 11:53:35 lnkmtd::lnkmt_addr_mode_do_downgateway4(369): ---> added route vd(root), oif=port1(3) gateway(0.0.0.0) for subnet(0.0.0.0/0)
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3088, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::monitor_proto_peer_send_request(698): ---> L_M_Port1(89.180.243.203:ping) send probe packet, fail count(0)
2025-06-09 11:53:35 lnkmtd::ping_do_addr_up(136): ---> L_M_Port1->89.180.243.203(89.180.243.203), rcvd
2025-06-09 11:53:35 lnkmtd::monitor_peer_recv(2219): ---> L_M_Port1 send time 1749466415s 842915us, revd time 1749466415s 889214us
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3089, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::monitor_proto_peer_send_request(698): ---> L_M_Port1(89.180.243.203:ping) send probe packet, fail count(0)
2025-06-09 11:53:35 lnkmtd::ping_do_addr_up(136): ---> L_M_Port1->89.180.243.203(89.180.243.203), rcvd
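
The frozen seq_no during cold start can also be spotted programmatically. The following Python sketch is an illustration only: the helper names are hypothetical, and treating a repeated seq_no as a cold start probe is an assumption based on the behavior described in this article. The sample contains only the 'ping_send_msg' lines visible above, so the elided section is not represented.

```python
import re

# 'ping_send_msg' lines copied from the log above; the elided part is omitted.
SAMPLE_LOG = """\
2025-06-09 11:53:25 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3086, icmp id=7, send 20 bytes
2025-06-09 11:53:25 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3086, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3086, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3087, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3088, icmp id=7, send 20 bytes
2025-06-09 11:53:35 lnkmtd::ping_send_msg(435): ---> ping 89.180.243.203 seq_no=3089, icmp id=7, send 20 bytes
"""


def seq_numbers(log_text):
    """Extract every seq_no value in log order."""
    return [int(n) for n in re.findall(r"seq_no=(\d+)", log_text)]


def repeated_probes(seqs):
    """Count probes that reuse the previous probe's seq_no."""
    return sum(1 for prev, cur in zip(seqs, seqs[1:]) if cur == prev)


seqs = seq_numbers(SAMPLE_LOG)
print(seqs)
print(repeated_probes(seqs))
```

For this sample, seq_no stays at 3086 for the first three probes and only starts incrementing with 3087, so two probes are reported as repeats.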

 

Mon Jun 9 11:53:35 WEST 2025
AWS-HA-Active # diag sys link-monitor status
Link Monitor: L_M_Port1, Status: dead, Server num(1), cfg_version=0 HA state: local(dead), shared(dead)

 

Mon Jun 9 11:53:36 WEST 2025
AWS-HA-Active # diag sys link-monitor status
Link Monitor: L_M_Port1, Status: alive

 

The status is updated at 11:53:35. In this example, Time = 0.02 * 2 + 0.1 + 10 = 10.14 seconds.