Hello Everyone,
We have a problem with our HA configuration. The cluster is configured and synchronized, and the master works well. But as soon as there is a problem on the master and we fail over to the slave, no traffic passes through the slave and we lose all internet access until the master is restored.
LACP has been set up (the master and the slave belong to the same LACP aggregate on the switch side).
Initially, when I plugged in the ports, they were all up, but the slave ports went down later, after an LACP negotiation I guess.
https://docs.fortinet.com/document/fortigate/6.4.15/administration-guide/666376
Created on 11-13-2024 08:58 AM Edited on 11-13-2024 08:58 AM
Why do you want to connect the slave ports first? In A-P HA, the slave/secondary FGT(s) don't pass any packets. Connect the primary first to bring up the connection through the master/primary FGT. Then, once you have confirmed it's working, bring up the secondary LACP.
Toshi
The problem is that HA has not worked since we configured it: when the slave ports are plugged in, we lose internet access. To recover it, we disconnected the slave ports while we work on resolving the HA problem.
Created on 11-13-2024 09:08 AM Edited on 11-13-2024 09:08 AM
What did you see in "get sys ha status" when you connected the secondary/slave? Did the slave take over as primary?
Toshi
If it didn't take over, is there any log indicating the cause? And what was in the routing table ("get router info routing-table all")? Did it still show the same default route?
Can you confirm that the cluster nodes are not both active at the same time? A split-brain situation can cause IP conflicts, which can lead to loss of internet access and other network access.
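One rough way to check for this is to capture "get system ha status" on each unit and compare the vcluster state each one reports. The Python sketch below is only an illustration: it assumes the output contains lines of the form "vcluster 1: work 169.254.0.1" or "vcluster 1: standby 169.254.0.1", as in the outputs posted in this thread, and the function names are hypothetical helpers, not part of any Fortinet tool.

```python
import re

# Matches lines such as "vcluster 1: work 169.254.0.1"
# or "vcluster 1: standby 169.254.0.1".
VCLUSTER_RE = re.compile(r"^\s*vcluster\s+(\d+):\s+(work|standby)\b", re.MULTILINE)


def vcluster_states(status_text):
    """Return {vcluster_id: state} parsed from one unit's HA status output."""
    return {int(num): state for num, state in VCLUSTER_RE.findall(status_text)}


def looks_like_split_brain(outputs_by_unit):
    """Return True if more than one unit reports 'work' for the same vcluster."""
    workers = {}
    for unit, text in outputs_by_unit.items():
        for vc, state in vcluster_states(text).items():
            if state == "work":
                workers.setdefault(vc, []).append(unit)
    return any(len(units) > 1 for units in workers.values())


# Excerpts in the format shown by the two units in this thread.
fw1 = "number of vcluster: 1\nvcluster 1: work 169.254.0.1\n"
fw2 = "number of vcluster: 1\nvcluster 1: standby 169.254.0.1\n"
print(looks_like_split_brain({"fw1": fw1, "fw2": fw2}))
```

If both units report "work" for the same vcluster, they are probably not seeing each other's heartbeats, and the heartbeat links are the first thing to check.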
Created on 11-19-2024 11:19 AM Edited on 11-19-2024 11:42 AM
Hello,
Sorry for the delay. I'm coming back to you with some new information. We reconfigured the LACP on the switch because there was an error in how the switch ports were assigned to the LACP groups. Once the reconfiguration was done, all the master and slave ports were connected to the switch without any problem. Today we wanted to do an HA failover test, so we rebooted the master. As soon as the reboot started we completely lost internet access, which means the secondary didn't take over, and we only regained internet access once the master had finished rebooting.
Created on 11-19-2024 11:28 AM Edited on 11-19-2024 11:32 AM
You need to share 1) the ha config on both units "config system ha", 2) "get sys ha status" on both units when both are up, then 3) "get sys ha status" on the secondary when the primary is down.
Or just open a ticket at TAC to get it looked into, which would be much faster.
Toshi
Output of primary:
fw1 # diagnose sys link-monitor status
fw1 # get system ha status
HA Health Status: OK
Model: FortiGate-1101E
Mode: HA A-P
Group: 0
Debug: 0
Cluster Uptime: 0 days 19:29:18
Cluster state change time: 2024-11-19 18:07:55
Primary selected using:
<2024/11/19 18:07:55> FG10E1 is selected as the primary because it has the largest value of override priority.
<2024/11/19 18:03:58> FG10E1 is selected as the primary because it's the only member in the cluster.
ses_pickup: disable
override: disable
Configuration Status:
FG10E1(updated 3 seconds ago): in-sync
FG10E1(updated 2 seconds ago): in-sync
System Usage stats:
FG10E1(updated 3 seconds ago):
sessions=89418, average-cpu-user/nice/system/idle=3%/0%/5%/90%, memory=49%
FG10E1(updated 2 seconds ago):
sessions=0, average-cpu-user/nice/system/idle=1%/0%/0%/98%, memory=32%
HBDEV stats:
FG10E1(updated 3 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=241182508/586184/0/0, tx=525872520/1425049/0/0
FG10E1(updated 2 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=525154181/1423780/0/0, tx=238300951/548820/0/0
MONDEV stats:
FG10E1(updated 3 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=361372988422/1011492711/0/0, tx=780920810340/1254371795/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=508650687759/1102948589/0/0, tx=455731739669/1055327810/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=418812907934/387705802/0/0, tx=109625066959/239398427/0/0
FG10E1(updated 2 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=1453997498/7880114/0/0, tx=504064/3938/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=2287558/9356/0/0, tx=256/2/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=1084836/4676/0/0, tx=0/0/0/0
Primary : fw1 , FG10E1, HA cluster index = 0
Secondary : fw2 , FG10E1, HA cluster index = 1
number of vcluster: 1
vcluster 1: work 169.254.0.1
Primary: FG10E1, HA operating index = 0
Secondary: FG10E1, HA operating index = 1
fw1#
Output of secondary:
fw2 # get system ha status
HA Health Status: OK
Model: FortiGate-1101E
Mode: HA A-P
Group: 0
Debug: 0
Cluster Uptime: 0 days 19:41:47
Cluster state change time: 2024-11-19 18:07:54
Primary selected using:
<2024/11/19 18:07:54> FG10E1 is selected as the primary because it has the largest value of override priority.
ses_pickup: disable
override: disable
Configuration Status:
FG10E1(updated 0 seconds ago): in-sync
FG10E1(updated 1 seconds ago): in-sync
System Usage stats:
FG10E1(updated 0 seconds ago):
sessions=0, average-cpu-user/nice/system/idle=1%/0%/0%/98%, memory=32%
FG10E1(updated 1 seconds ago):
sessions=81950, average-cpu-user/nice/system/idle=4%/0%/1%/93%, memory=49%
HBDEV stats:
FG10E1(updated 0 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=530485858/1446580/0/0, tx=240841928/554830/0/0
FG10E1(updated 1 seconds ago):
ha: physical/1000auto, up, rx-bytes/packets/dropped/errors=243754157/592591/0/0, tx=531204179/1447836/0/0
MONDEV stats:
FG10E1(updated 0 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=1653489178/8443691/0/0, tx=507136/3962/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=2312008/9456/0/0, tx=256/2/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=1096436/4726/0/0, tx=0/0/0/0
FG10E1TB20901351(updated 1 seconds ago):
LAN_GENES: aggregate/00, up, rx-bytes/packets/dropped/errors=367972245673/1030470996/0/0, tx=815278854782/1290561502/0/0
TOR-DATACENTER: aggregate/00, up, rx-bytes/packets/dropped/errors=518143935132/1121640655/0/0, tx=462417285441/1070853807/0/0
WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=443046215087/406783224/0/0, tx=112743901116/246391323/0/0
Secondary : fw2-wan-saclay , FG10E1, HA cluster index = 1
Primary : fw1-wan-saclay , FG10E1, HA cluster index = 0
number of vcluster: 1
vcluster 1: standby 169.254.0.1
Secondary: FG10E1, HA operating index = 1
Primary: FG10E1, HA operating index = 0
fw2#
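To compare the monitored interfaces of the two units without reading the counters by eye, the MONDEV lines above can be parsed with a few lines of Python. This is a minimal sketch, not an official tool. It assumes the line layout shown in these outputs, that the link field is either "up" or "down", and that the tx counters follow the same bytes/packets/dropped/errors order as rx. `parse_mondev_line` is a hypothetical helper name.

```python
import re

# One monitored-interface line, for example:
# "WAN-RENATER: aggregate/00, up, rx-bytes/packets/dropped/errors=1096436/4726/0/0, tx=0/0/0/0"
MONDEV_RE = re.compile(
    r"^\s*(?P<name>[\w.-]+):\s+(?P<kind>[^,]+),\s+(?P<link>up|down),\s+"
    r"rx-bytes/packets/dropped/errors=(?P<rx>\d+/\d+/\d+/\d+),\s+"
    r"tx=(?P<tx>\d+/\d+/\d+/\d+)\s*$"
)


def parse_mondev_line(line):
    """Return name, link state and packet counters, or None if the line does not match."""
    m = MONDEV_RE.match(line)
    if m is None:
        return None
    rx = [int(x) for x in m.group("rx").split("/")]
    tx = [int(x) for x in m.group("tx").split("/")]  # assumed same field order as rx
    return {
        "name": m.group("name"),
        "link": m.group("link"),
        "rx_packets": rx[1],
        "tx_packets": tx[1],
    }


# Line taken from the secondary's output above.
line = ("WAN-RENATER: aggregate/00, up, "
        "rx-bytes/packets/dropped/errors=1096436/4726/0/0, tx=0/0/0/0")
print(parse_mondev_line(line))
```

Lines that the terminal has wrapped need to be rejoined before parsing. The HBDEV lines use the same layout, so the same function should handle them too. A tx packet count of zero on the secondary's WAN aggregate is consistent with the earlier point that a secondary in A-P mode does not pass traffic.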
Hi @dkonate
Can you share the output of the following commands:
1. diagnose sys link-monitor status
2. get sys ha status