Created on 04-02-2020 12:10 AM Edited on 11-26-2024 02:45 AM By Jean-Philippe_P
Description
This article describes how to verify the individual uptime of HA cluster members.
The output of 'get system ha status' shows that FGVM04TM19-----3 was selected as Master because it has the largest uptime value:
Master (global) # get system ha status
HA Health Status: OK
Model: FortiGate-VM64
Mode: HA A-P
Group: 100
Debug: 0
Cluster Uptime: 0 days 0:39:28
Cluster state change time: 2020-03-30 17:18:09
Master selected using:
<2020/03/30 17:18:09> FGVM04TM19-----3 is selected as the master because it has the largest value of uptime.
<2020/03/30 17:15:46> FGVM04TM19-----4 is selected as the master because it has the largest value of uptime.
<2020/03/30 16:39:47> FGVM04TM19-----3 is selected as the master because it has the largest value of override priority.
<2020/03/30 16:39:00> FGVM04TM19-----3 is selected as the master because it's the only member in the cluster.
ses_pickup: enable, ses_pickup_delay=disable
override: enable
Configuration Status:
FGVM04TM19-----3(updated 4 seconds ago): in-sync
FGVM04TM19-----4(updated 2 seconds ago): in-sync
System Usage stats:
FGVM04TM19-----3(updated 4 seconds ago):
sessions=25, average-cpu-user/nice/system/idle=0%/0%/0%/99%, memory=63%
FGVM04TM19-----4(updated 2 seconds ago):
sessions=16, average-cpu-user/nice/system/idle=0%/0%/0%/99%, memory=61%
HBDEV stats:
FGVM04TM19-----3(updated 4 seconds ago):
port10: physical/10000full, up, rx-bytes/packets/dropped/errors=8409460/31503/0/0, tx=35385838/37462/0/0
FGVM04TM19-----4(updated 2 seconds ago):
port10: physical/10000full, up, rx-bytes/packets/dropped/errors=33122327/35969/0/0, tx=8409529/31458/0/0
MONDEV stats:
FGVM04TM19-----3(updated 4 seconds ago):
port3: physical/10000full, up, rx-bytes/packets/dropped/errors=10471996/40295/0/0, tx=840/14/0/0
FGVM04TM19-----4(updated 2 seconds ago):
port3: physical/10000full, up, rx-bytes/packets/dropped/errors=4344523/15513/0/0, tx=360/6/0/0
Master: Master , FGVM04TM19-----3, cluster index = 1
Slave : Slave , FGVM04TM19-----4, cluster index = 0
number of vcluster: 1
vcluster 1: work 169.254.0.2
Master: FGVM04TM19-----3, operating cluster index = 0
Slave : FGVM04TM19-----4, operating cluster index = 1
However, 'get system performance status' reports an identical uptime of 44 minutes on both FGVM04TM19-----3 and FGVM04TM19-----4. How, then, does the HA cluster select FGVM04TM19-----3 for having the higher uptime?
Master (global) # get system performance status | grep Uptime
Uptime: 0 days, 0 hours, 44 minutes
Slave (global) # get system performance status | grep Uptime
Uptime: 0 days, 0 hours, 44 minutes
Scope
FortiGate.
Solution
Use the command 'diagnose sys ha dump-by group' to verify the individual uptime of each HA member:
Master (global) # diagnose sys ha dump-by group
HA information.
group-id=100, group-name='fortigate'
gmember_nr=2
'FGVM04TM19-----3': ha_ip_idx=1, hb_packet_version=6, last_hb_jiffies=0, linkfails=0, weight/o=0/0
'FGVM04TM19-----4': ha_ip_idx=0, hb_packet_version=8, last_hb_jiffies=236404, linkfails=0, weight/o=0/0
hbdev_nr=1: port10(mac=000c..05, last_hb_jiffies=236404, hb_lost=0),
vcluster_nr=1
vcluster_0: start_time=1585558721(2020-03-30 16:58:41), state/o/chg_time=2(work)/3(standby)/1585559889(2020-03-30 17:18:09)
pingsvr_flip_timeout/expire=3600s/3572s
mondev: port3(prio=50,is_aggr=0,status=1)
'FGVM04TM19-----3': ha_prio/o=0/1, link_failure=0, pingsvr_failure=0, flag=0x00000001, uptime/reset_cnt=1167/1
'FGVM04TM19-----4': ha_prio/o=1/0, link_failure=0, pingsvr_failure=0, flag=0x00000000, uptime/reset_cnt=0/1
According to 'start_time', the uptime of FGVM04TM19-----3 is counted from 2020-03-30 16:58:41. Check the same value on FGVM04TM19-----4:
Slave (global) # diagnose sys ha dump-by group
HA information.
group-id=100, group-name='fortigate'
gmember_nr=2
'FGVM04TM19-----3': ha_ip_idx=1, hb_packet_version=6, last_hb_jiffies=236702, linkfails=0, weight/o=0/0
hbdev_nr=1: port10(mac=000c..4f, last_hb_jiffies=236702, hb_lost=0),
'FGVM04TM19-----4': ha_ip_idx=0, hb_packet_version=8, last_hb_jiffies=0, linkfails=0, weight/o=0/0
vcluster_nr=1
vcluster_0: start_time=1585559888(2020-03-30 17:18:08), state/o/chg_time=3(standby)/2(work)/1585559889(2020-03-30 17:18:09)
pingsvr_flip_timeout/expire=3600s/3541s
mondev: port3(prio=50,is_aggr=0,status=1)
'FGVM04TM19-----3': ha_prio/o=0/0, link_failure=0, pingsvr_failure=0, flag=0x00000001, uptime/reset_cnt=1167/1
'FGVM04TM19-----4': ha_prio/o=1/1, link_failure=0, pingsvr_failure=0, flag=0x00000000, uptime/reset_cnt=0/1
The uptime of FGVM04TM19-----4 is actually counted from 2020-03-30 17:18:08. Therefore, FGVM04TM19-----3 is selected as HA Master due to its higher uptime.
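When the output has been saved to a file, the relevant fields can be pulled out with a short script. The following is a minimal sketch, not a Fortinet tool: the function name parse_ha_dump and the regular expressions are illustrative assumptions based only on the line layout shown in the output above, and other FortiOS versions may format these lines differently.

```python
import re

# Sample lines copied from the Master's 'diagnose sys ha dump-by group' output above.
SAMPLE = """\
vcluster_0: start_time=1585558721(2020-03-30 16:58:41), state/o/chg_time=2(work)/3(standby)/1585559889(2020-03-30 17:18:09)
'FGVM04TM19-----3': ha_prio/o=0/1, link_failure=0, pingsvr_failure=0, flag=0x00000001, uptime/reset_cnt=1167/1
'FGVM04TM19-----4': ha_prio/o=1/0, link_failure=0, pingsvr_failure=0, flag=0x00000000, uptime/reset_cnt=0/1
"""

# Assumed patterns: they match the layout of the sample only.
START_RE = re.compile(r"start_time=(\d+)\(")
MEMBER_RE = re.compile(
    r"^\s*'([^']+)':.*uptime/reset_cnt=(\d+)/(\d+)", re.MULTILINE
)


def parse_ha_dump(text):
    """Return (start_time, {serial: (uptime, reset_cnt)}) from dump text.

    parse_ha_dump is a hypothetical helper name used for illustration.
    """
    start = START_RE.search(text)
    members = {
        serial: (int(uptime), int(resets))
        for serial, uptime, resets in MEMBER_RE.findall(text)
    }
    return (int(start.group(1)) if start else None), members


start_time, members = parse_ha_dump(SAMPLE)
for serial, (uptime, resets) in members.items():
    print(f"{serial}: uptime={uptime}s reset_cnt={resets}")
```

Running the same function on the output collected from each member gives the two 'start_time' values that are compared in this article.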
The time difference between the two units is 1167 seconds (17:18:08 minus 16:58:41), which matches the 'uptime/reset_cnt' value reported for FGVM04TM19-----3.
This is higher than the default 'ha-uptime-diff-margin' of 300 seconds, therefore FGVM04TM19-----3 will be selected as Master when override is not enabled.
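The arithmetic can be checked with a few lines of Python. This is a simplified sketch using the two 'start_time' epoch values from the outputs above. The helper name uptime_decides is hypothetical, and the strict 'greater than' comparison against the margin is an assumption about boundary behavior, not a statement of the exact FortiOS election logic.

```python
# start_time values taken from the two 'diagnose sys ha dump-by group' outputs above.
START_TIME_3 = 1585558721  # FGVM04TM19-----3 (2020-03-30 16:58:41)
START_TIME_4 = 1585559888  # FGVM04TM19-----4 (2020-03-30 17:18:08)
DEFAULT_MARGIN = 300       # default 'ha-uptime-diff-margin', in seconds


def uptime_decides(start_a, start_b, margin=DEFAULT_MARGIN):
    """Return (difference in seconds, whether it exceeds the margin).

    Hypothetical helper: models only the margin check, nothing else
    in the election process.
    """
    diff = abs(start_a - start_b)
    return diff, diff > margin


diff, decisive = uptime_decides(START_TIME_3, START_TIME_4)
print(f"difference={diff}s, exceeds margin: {decisive}")
```

With these values the difference is 1167 seconds, which exceeds the 300-second margin, so uptime is the deciding factor between the two members.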
Device uptime indicates how long the member has been up.
Cluster uptime indicates how long at least one member of the cluster has been able to handle traffic. This timer is not reset even if a failover occurs.
HA uptime is a timer used in the election of the primary device in an A-P cluster. Its value indicates how long a device has been primary without an event that would trigger a new election.
A dedicated article covers HA uptime for chassis devices (FortiGate 6000 and 7000):
Technical Tip: Understanding the HA uptime for Chassis based device
Copyright 2024 Fortinet, Inc. All Rights Reserved.