Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
CSKUM
New Contributor

Fortigate HA Active-Passive out of sync all the time

Hello,

 

A few days ago we started having trouble with our Active-Passive cluster of two FortiGate 1000F units running firmware 7.2.10.

 

After we make changes on the primary unit, those changes do not propagate to the secondary, and after a few minutes the HA cluster shows as out of sync. We waited a couple of hours, but the units did not synchronize.

 

The only way to get them back in sync is to force it manually from the CLI:

 

diagnose sys ha checksum recalculate

execute ha synchronize start

 

After executing those commands a couple of times on both the primary and the secondary, the cluster becomes synchronized.
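
For readers hitting the same symptom: before forcing a resync, the checksums on both units can be compared to see which part of the configuration differs. This is a minimal sketch, assuming a single-VDOM setup whose VDOM is named root (the thread does not say). The lines starting with # are annotations, not commands to type.

```
# Compare the checksums of all cluster members (run on the primary)
diagnose sys ha checksum cluster

# Drill into the scope whose checksum differs; run on both units and compare
diagnose sys ha checksum show global
# "root" is an assumed VDOM name
diagnose sys ha checksum show root
```

If only one scope shows a mismatch, that narrows down which configuration objects to inspect on each unit.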

 

Any ideas what happened?

Szymon Malinowski
15 REPLIES
Toshi_Esumi

I see you're using port7 for heartbeat:

    set hbdev "port7" 512 

But those ports are 10 Gbps ports by default. Are you sure you're using a CAT6 patch cable?

Please show us the interface config from the commands below:
config sys int
  edit port7

and the output of the command below:
diag hard deviceinfo nic port7

By the way, even the HA port is a 2.5 Gbps multi-gig port.

Toshi

CSKUM

Ports 1-8 were used to connect to our switches after we bought both FortiGates and before we received the 10 Gbit DAC cables, so they were all set up as 1 Gbit uplinks. The link on port7 between the two FortiGates is therefore also set to 1 Gbit.

 

I have some spare 10 Gbit DAC cables, and I can connect both FortiGates with one tomorrow and set up HA on that port, but I don't think the patch cord is the problem. It was replaced yesterday, and everything had been working fine for over a year (we bought the devices in November 2023). The trouble only started a few weeks ago.

Szymon Malinowski
Toshi_Esumi

FG1000F's port1-8 are RJ45 ports. If you want to use a DAC cable, you have to move the heartbeat port.

 

And unless you upgraded the firmware a few weeks ago, or changed the HA-related config at that time, it's unlikely to be a software issue if the problem started all of a sudden.


Toshi

Toshi_Esumi
SuperUser

By the way, based on what I'm seeing in my lab environment, the "WARN" message itself seems to be normal. When I changed the config on the primary side, it was copied over to the secondary, and I got the message below when the transaction completed.
60EPOE-0239 # <hasync:WARN> conn=0x63c0940, peer closed the connection: dst=169.254.0.2, sync_type=10(cli-command)

But your "WARN" message is based on (diff). So it shouldn't keep happening repeatedly. That indicates a problem. What are you seeing on the secondary side in the same hasync app debugging?
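
For reference, hasync debugging on the secondary is typically enabled along the lines below. This is only a sketch: the HA member index (1) and the account name (admin) are placeholders, and the real index has to be taken from what `execute ha manage ?` lists on the actual cluster. The lines starting with # are annotations, not commands to type.

```
# From the primary, log in to the secondary over the HA link
# (index 1 and account "admin" are assumed placeholders)
execute ha manage 1 admin

# On the secondary: show hasync debug messages with timestamps
diagnose debug reset
diagnose debug console timestamp enable
diagnose debug application hasync -1
diagnose debug enable

# Turn debugging off again when done
diagnose debug disable
diagnose debug reset
```

Making a small config change on the primary while this is running should show whether the secondary receives and applies it.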

Toshi

Toshi_Esumi

Looks like these "WARN" messages are on the secondary side. So they don't tell us what might be the problem.

Toshi

CSKUM
New Contributor

The problem stopped on its own after we upgraded the firmware to 7.4.7. A few hours after the upgrade the devices synchronized, and they have stayed in sync ever since, even after configuration changes on the primary. We've since run into another bug, with traffic shaping, which was causing a kernel panic on our 1000F, but that's a completely different problem.

Szymon Malinowski