dkonate
New Contributor II

HA Issues

Hello Everyone,

 

We have a problem with our HA configuration. The cluster is configured and synchronized, and the master works well. But as soon as there is a problem on the master and we switch to the slave, no traffic passes through the slave and we lose all internet access until the master is restored.

 

LACP has been configured (the master and the slave belong to the same LACP aggregate on the switch side).

Initially, when I plugged in the ports, they were all up, but the slave ports went down later, after an LACP negotiation I guess.

 

 

Architecture.PNG

https://docs.fortinet.com/document/fortigate/6.4.15/administration-guide/666376 
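
One way to see what LACP actually negotiated on each unit is the aggregate diagnostic, shown here as a hedged sketch and not something that was run in this thread. The interface name LAN_GENES is taken from the HA status output posted further down; the same check would apply to TOR-DATACENTER and WAN-RENATER. The fw1/fw2 prompts are the hostnames seen in those outputs. Fortinet's HA guidance for aggregate interfaces generally expects each cluster unit's ports to sit in their own LAG on the switch, so the "same LACP aggregate" detail above may be worth verifying.

```
fw1 # diagnose netlink aggregate name LAN_GENES
fw2 # diagnose netlink aggregate name LAN_GENES
```

Comparing the per-port LACP state on the two units should show whether the slave's ports were negotiated into the aggregate or left down.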

Toshi_Esumi

Why do you want to connect the slave ports first? In A-P HA, the slave/secondary FGT(s) don't pass any packets. You want to connect the primary first to bring up the connection through the master/primary FGT. Then, after you've confirmed it's working, you want to bring up the secondary LACP.

 

Toshi

dkonate
New Contributor II

The problem is that HA hasn't worked since we configured it: when the slave ports are plugged in, we lose internet access. So we disconnected the slave ports to recover internet access while we work on resolving the HA problem.

Toshi_Esumi

What did you see in "get sys ha status" when you connected the secondary/slave? Did the slave take over the primary role?

 

Toshi

Toshi_Esumi

If it didn't take over, is there any log indicating the cause? And what was in the routing table ("get router info routing-table all")? Did it still show the same default route?

AEK

Can you confirm that the cluster nodes are not both active at the same time? A split-brain situation can lead to IP conflicts, and from there to loss of internet access and other network access.

AEK
dkonate
New Contributor II

Hello,

 

Sorry for the delay. I'm coming back to you with some new information. We reconfigured LACP on the switch because there was an error in how the switch ports were assigned to the LACP groups. Once the reconfiguration was done, all the master and slave ports were connected to the switch without any problem. But today we wanted to do an HA failover test: we rebooted the master, and as soon as the reboot started we completely lost internet access, which means that the secondary didn't take over. We regained internet access once the master had finished rebooting.

Toshi_Esumi

You need to share 1) the ha config on both units "config system ha", 2) "get sys ha status" on both units when both are up, then 3) "get sys ha status" on the secondary when the primary is down.


Or just open a ticket at TAC to get it looked into, which would be much faster.

Toshi
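
For reference, the three items requested above might be collected roughly as follows. This is a sketch under assumptions: `show system ha` is assumed here as the way to display the `config system ha` section, and the fw1/fw2 prompts are the hostnames that appear in the outputs posted in this thread.

```
fw1 # show system ha
fw2 # show system ha
fw1 # get system ha status
fw2 # get system ha status
```

For the third item, the last command would be repeated on fw2 while fw1 is rebooting or powered off, so the output shows whether fw2 reports itself as primary at that moment.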

dkonate
New Contributor II

output of primary:

 

fw1 # diagnose sys link-monitor status

fw1 # get system ha status
HA Health Status: OK
Model: FortiGate-1101E
Mode: HA A-P
Group: 0
Debug: 0
Cluster Uptime: 0 days 19:29:18
Cluster state change time: 2024-11-19 18:07:55
Primary selected using:
<2024/11/19 18:07:55> FG10E1 is selected as the primary because it has the largest value of override priority.
<2024/11/19 18:03:58> FG10E1 is selected as the primary because it's the only member in the cluster.
ses_pickup: disable
override: disable
Configuration Status:
FG10E1(updated 3 seconds ago): in-sync
FG10E1(updated 2 seconds ago): in-sync
System Usage stats:
FG10E1(updated 3 seconds ago):
sessions=89418, average-cpu-user/nice/system/idle=3%/0%/5%/90%, memory=49%
FG10E1(updated 2 seconds ago):
sessions=0, average-cpu-user/nice/system/idle=1%/0%/0%/98%, memory=32%
HBDEV stats:
FG10E1(updated 3 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=241182508/586184/0/0, tx=525872520/1425049/0/0
FG10E1(updated 2 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=525154181/1423780/0/0, tx=238300951/548820/0/0
MONDEV stats:
FG10E1(updated 3 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=361372988422/1011492711/0/0, tx=780920810340/1254371795/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=508650687759/1102948589/0/0, tx=455731739669/1055327810/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=418812907934/387705802/0/0, tx=109625066959/239398427/0/0
FG10E1(updated 2 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=1453997498/7880114/0/0, tx=504064/3938/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=2287558/9356/0/0, tx=256/2/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=1084836/4676/0/0, tx=0/0/0/0
Primary : fw1 , FG10E1, HA cluster index = 0
Secondary : fw2 , FG10E1, HA cluster index = 1
number of vcluster: 1
vcluster 1: work 169.254.0.1
Primary: FG10E1, HA operating index = 0
Secondary: FG10E1, HA operating index = 1

fw1#

dkonate
New Contributor II

Output of secondary:

 

fw2 # get system ha status
HA Health Status: OK
Model: FortiGate-1101E
Mode: HA A-P
Group: 0
Debug: 0
Cluster Uptime: 0 days 19:41:47
Cluster state change time: 2024-11-19 18:07:54
Primary selected using:
<2024/11/19 18:07:54> FG10E1 is selected as the primary because it has the largest value of override priority.
ses_pickup: disable
override: disable
Configuration Status:
FG10E1(updated 0 seconds ago): in-sync
FG10E1(updated 1 seconds ago): in-sync
System Usage stats:
FG10E1(updated 0 seconds ago):
sessions=0, average-cpu-user/nice/system/idle=1%/0%/0%/98%, memory=32%
FG10E1(updated 1 seconds ago):
sessions=81950, average-cpu-user/nice/system/idle=4%/0%/1%/93%, memory=49%
HBDEV stats:
FG10E1(updated 0 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=530485858/1446580/0/0, tx=240841928/554830/0/0
FG10E1(updated 1 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=243754157/592591/0/0, tx=531204179/1447836/0/0
MONDEV stats:
FG10E1(updated 0 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=1653489178/8443691/0/0, tx=507136/3962/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=2312008/9456/0/0, tx=256/2/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=1096436/4726/0/0, tx=0/0/0/0
FG10E1TB20901351(updated 1 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=367972245673/1030470996/0/0, tx=815278854782/1290561502/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=518143935132/1121640655/0/0, tx=462417285441/1070853807/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=443046215087/406783224/0/0, tx=112743901116/246391323/0/0
Secondary : fw2-wan-saclay , FG10E1, HA cluster index = 1
Primary : fw1-wan-saclay , FG10E1, HA cluster index = 0
number of vcluster: 1
vcluster 1: standby 169.254.0.1
Secondary: FG10E1, HA operating index = 1
Primary: FG10E1, HA operating index = 0

fw2#

Hemin88
New Contributor III

Hi @dkonate 


Can you share the output of the following commands?

1. di sys link-monitor status
2. get sys ha status

IP Network Engineer