Technical Tip: How the "pingserver-secondary-force-reset" setting impacts the HA cluster remote link monitoring behaviour

saleha · ‎06-04-2025

Description

This article describes the behavior of an HA cluster that uses link-monitor as a failover criterion. An example and illustration of this deployment can be found in Technical Tip: Combining Remote Link Monitoring with FGCP cluster High Availability.

Scope

FortiOS - HA Cluster - Link-Monitor.

Solution

One of the main advantages of deploying an HA cluster with link-monitor (remote monitoring) is that it allows the firewall administrator to monitor a link through a VLAN interface, which regular HA interface monitoring does not support.

By default, 'pingserver-secondary-force-reset' is enabled once an interface is selected in the 'pingserver-monitor-interface' option. For example:

config system ha
.....
set pingserver-monitor-interface "port1"
set pingserver-secondary-force-reset enable
.....
end

This command's behavior depends on the HA cluster's primary election method: uptime or priority. When the 'override' option is enabled on any member of the cluster, FortiOS first compares the members' priority values. When override is disabled, uptime is the first election criterion.

For more details about the HA cluster failover election method, see Technical Tip: FortiGate HA Primary unit selection process when override is disabled vs enabled.
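
The ordering described above can be sketched in Python as a sort key. This is a simplified illustration only, not the actual FortiOS election code: the 'Member' class and the 'elect_primary' function are hypothetical names, and the sketch deliberately ignores the other election criteria (such as monitored-interface failures and tie-breakers) covered in the referenced article.

```python
from dataclasses import dataclass


@dataclass
class Member:
    # Hypothetical model of a cluster member, for illustration only.
    serial: str
    priority: int
    uptime: int  # seconds


def elect_primary(members, override):
    """Pick the primary using the simplified ordering described above.

    override=True  -> compare priority first, then uptime.
    override=False -> compare uptime first, then priority.
    """
    if override:
        key = lambda m: (m.priority, m.uptime)
    else:
        key = lambda m: (m.uptime, m.priority)
    return max(members, key=key)


members = [
    Member("FGT-A", priority=130, uptime=100),
    Member("FGT-B", priority=100, uptime=900),
]
print(elect_primary(members, override=True).serial)   # FGT-A
print(elect_primary(members, override=False).serial)  # FGT-B
```

With the same two members, the result changes only because the comparison order changes, which is why the two scenarios below behave differently after a link-monitor failover.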

The following scenarios illustrate how the HA cluster elects a primary under each method when link-monitor goes down on the primary. To save time, 'pingserver-flip-timeout' is set to its minimum value of 6 minutes.

First Scenario: Override is disabled.

In this scenario, the primary member of the cluster is the one with the longest uptime, as shown in the output of the command 'get system ha status'.

The following is the link-monitor and HA configuration:

config system link-monitor
edit "ha-monitor"
set srcintf "port2"
set server "8.8.8.8"
set failtime 2
set ha-priority 5
set update-cascade-interface disable
set update-policy-route disable
next
end

config system ha
set group-id 103
set group-name "remote-mon"
set mode a-p
set password ENC pblkR52GmqtF+jcUMjpP/cbYKVdH4H7AXFYfwVn0rKln2/NlvVpxqDEFxa++M0cmqIxwQn2Qz+RrJWZIOJGISP0wmtV5S0pkcZktee9IKFWW7uBpBll36t7ETBlj5ulSjM5lH1DkTqI7Y3fMKyE3/CVyCWsdMcaBYvtwWTeKgoQkIGV0Hiq+NjGNk7F6Vqu4kgd8U1lmMjY3dkVA
set hbdev "port5" 50
set session-pickup enable
set session-pickup-connectionless enable
set ha-mgmt-status enable
config ha-mgmt-interfaces
edit 1
set interface "port3"
set gateway 10.9.31.254
next
end
set override disable
set priority 130
set monitor "port2"
set pingserver-monitor-interface "port2"
set pingserver-secondary-force-reset enable
set pingserver-failover-threshold 4
set pingserver-flip-timeout 6
end

The next step is to force a failover when the link-monitor for port2 is down:

After link-monitor forces a failover, the flip timer starts counting down:

diagnose sys ha dump-by group
HA information.
group-id=103, group-name='remote-mon'
has_no_aes128_gcm_sha256_member=0

gmember_nr=2
'FGVM02TM23001313': ha_ip_idx=1, hb_packet_version=4, last_hb_jiffies=9225072, linkfails=1, weight/o=0/0, support_aes128_gcm_sha256=1
hbdev_nr=1: port5(mac=0043..05, last_hb_jiffies=9225072, hb_lost=0),
'FGVM02TM23001318': ha_ip_idx=0, hb_packet_version=38, last_hb_jiffies=0, linkfails=0, weight/o=0/0, support_aes128_gcm_sha256=1

vcluster_nr=1
vcluster-1: start_time=1748958892(2025-06-04 01:54:52), state/o/chg_time=3(standby)/2(work)/1748958892(2025-06-04 01:54:52)
pingsvr_flip_timeout/expire=360s/263s <-----
mondev: port2(prio=50,is_aggr=0,status=1)
'FGVM02TM23001313': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, mem_failover=0, uptime/reset_cnt=792/1
'FGVM02TM23001318': ha_prio/o=1/1, link_failure=0, pingsvr_failure=5, flag=0x00000000, mem_failover=0, uptime/reset_cnt=0/4
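
When watching the countdown across repeated runs of this command, it can help to extract the flip-timer values programmatically. The following is a minimal Python sketch that assumes the line format shown in the output above; 'parse_flip_timer' is a hypothetical helper name, and the format may differ between FortiOS versions.

```python
import re

# Matches the flip-timer line, e.g. "pingsvr_flip_timeout/expire=360s/263s".
FLIP_RE = re.compile(r"pingsvr_flip_timeout/expire=(\d+)s/(\d+)s")


def parse_flip_timer(dump_text):
    """Return (timeout_seconds, remaining_seconds), or None if the line is absent."""
    match = FLIP_RE.search(dump_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "pingsvr_flip_timeout/expire=360s/263s <-----"
timeout_s, remaining_s = parse_flip_timer(sample)
print(timeout_s // 60, "minute timeout,", remaining_s, "seconds remaining")
```

The 360s timeout corresponds to the 6 minutes configured with 'set pingserver-flip-timeout 6'.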

After the flip timer expires, no further failover occurs unless the link-monitor on the current primary goes down, even if the current primary has a lower uptime than the secondary member of the cluster.

Second Scenario: Override is enabled.

This is the case where the primary election method is highest priority:

config system ha
.....
set override enable
.....
end

Status of the cluster before any failover by link-monitor:

When the link-monitor for port2 goes down, the cluster fails over, as expected, to the secondary with the lower priority and starts the flip timer:

However, in this case, once the flip timer reaches 0, the cluster performs another failover back to the member with the highest priority, as illustrated by the following command output:

diagnose sys ha dump-by group
HA information.
group-id=103, group-name='remote-mon'
has_no_aes128_gcm_sha256_member=0

gmember_nr=2
'FGVM02TM23001313': ha_ip_idx=1, hb_packet_version=10, last_hb_jiffies=17700913, linkfails=1, weight/o=0/0, support_aes128_gcm_sha256=1
hbdev_nr=1: port5(mac=0043..05, last_hb_jiffies=17700913, hb_lost=0),
'FGVM02TM23001318': ha_ip_idx=0, hb_packet_version=90, last_hb_jiffies=0, linkfails=0, weight/o=0/0, support_aes128_gcm_sha256=1

vcluster_nr=1
vcluster-1: start_time=1748976732(2025-06-04 06:52:12), state/o/chg_time=3(standby)/2(work)/1749043349(2025-06-05 01:22:29)
pingsvr_flip_timeout/expire=360s/0s <-----
'FGVM02TM23001313': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, mem_failover=0, uptime/reset_cnt=0/2
'FGVM02TM23001318': ha_prio/o=1/1, link_failure=0, pingsvr_failure=0, flag=0x00000000, mem_failover=0, uptime/reset_cnt=66253/7

This is why, in cases where override is enabled, it is critically important to fine-tune the 'pingserver-flip-timeout' value. If the flip timer expires while the member with the higher priority still cannot reach the ISP, the cluster still fails back to that member and the flip timer restarts. The cluster will not fail over to the lower-priority member again until that timer counts down to 0. This causes a flip-flop effect that repeats until the member with the higher priority has restored its connection to the ISP.
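
To see why the timer value matters, the repeated role changes can be modelled with a short Python sketch. This is one reading of the behaviour described above, under stated assumptions, and not a reproduction of FortiOS logic: override is enabled, member "A" has the higher priority and loses its monitored link at minute 0, member "B" has the lower priority, a role change can only happen when the flip timer expires, and 'simulate_flip_flop' is a hypothetical function name.

```python
def simulate_flip_flop(flip_timeout_min, link_restored_at_min):
    """Return a list of (minute, new_primary) role changes.

    Assumed model: "A" (higher priority) loses its link at minute 0 and
    regains it at link_restored_at_min. Every role change restarts the
    flip timer, and the next change can only occur when it expires.
    """
    events = [(0, "B")]  # initial failover away from A; flip timer starts
    primary = "B"
    t = 0
    while True:
        t += flip_timeout_min  # flip timer counts down to 0
        link_ok = t >= link_restored_at_min
        if primary == "B":
            # Override returns the role to A, whether or not its link is back.
            primary = "A"
            events.append((t, "A"))
            if link_ok:
                return events  # A is healthy and stays primary
        else:
            if link_ok:
                return events  # A recovered while primary; no further change
            primary = "B"  # A is still failing: fail over again
            events.append((t, "B"))


print(simulate_flip_flop(6, 15))
print(simulate_flip_flop(60, 15))
```

In this model, a 6-minute timer with a 15-minute ISP outage produces four role changes, including a period where "A" is primary with a failed link, while a 60-minute timer produces only the initial failover and a single failback.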
