CSKUM
New Member
February 7, 2025
Question

Fortigate HA Active-Passive out of sync all the time


Hello,

 

A few days ago we started having trouble with our active-passive cluster of two FortiGate 1000F units running firmware 7.2.10.

After we make changes on the primary unit, they do not propagate to the secondary, and after a few minutes the HA cluster shows as out of sync. We have waited a couple of hours, but the units did not synchronize.

The only way to get them back in sync is to force it manually from the CLI:

 

diagnose sys ha checksum recalculate

execute ha synchronize start

 

After running those commands a couple of times on both the primary and the secondary, the cluster becomes synchronized.

 

Any ideas what happened?

3 replies

CSKUM
Author
New Member
February 10, 2025

I've tried everything from the link you provided except rebuilding the HA from scratch (resetting the secondary to factory defaults). I've replaced the cable connecting the two FortiGates. I even switched the port used for HA from the HA port to port7. Same result.

Almost every time I get synchronization back manually and then add something new on the primary unit, the secondary goes out of sync. New objects added on the primary unit don't show up on the secondary. Sometimes it works, but not very often. A few minutes ago I ran a test: I added 3 address objects on the primary unit one by one and checked whether they showed up on the secondary. They did. But when I removed them from the primary, they weren't removed on the secondary, and HA went out of sync again.

When I debug HA from the CLI I get multiple WARN messages, but I don't know whether that is normal:

 

2025-02-10 13:00:56 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:01 <hasync:WARN> conn=0xc4d1c20, peer closed the connection: dst=169.254.0.2, sync_type=18(byod)
2025-02-10 13:01:03 <hasync:WARN> conn=0xc536520, peer closed the connection: dst=169.254.0.2, sync_type=12(auth)
2025-02-10 13:01:04 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188864
2025-02-10 13:01:06 <hasync:WARN> conn=0xc4f9cd0, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:14 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188874
2025-02-10 13:01:16 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:24 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188884
2025-02-10 13:01:26 <hasync:WARN> conn=0xc4d1c20, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:32 <hasync:WARN> conn=0xc536520, peer closed the connection: dst=169.254.0.2, sync_type=18(byod)
2025-02-10 13:01:34 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188894
2025-02-10 13:01:36 <hasync:WARN> conn=0xc4f9cd0, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:44 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188904
2025-02-10 13:01:46 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:54 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188914
2025-02-10 13:01:56 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:02:03 <hasync:WARN> conn=0xc4d1c20, peer closed the connection: dst=169.254.0.2, sync_type=18(byod)
2025-02-10 13:02:04 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188924
2025-02-10 13:02:06 <hasync:WARN> conn=0xc56dc20, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:02:14 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188934
2025-02-10 13:02:16 <hasync:WARN> conn=0xc4f9cd0, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:02:24 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188944
2025-02-10 13:02:26 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:02:33 <hasync:WARN> conn=0xc4d1c20, peer closed the connection: dst=169.254.0.2, sync_type=18(byod)
2025-02-10 13:02:34 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188954
2025-02-10 13:02:36 <hasync:WARN> conn=0xc56dc20, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:02:44 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188964
2025-02-10 13:02:46 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:02:54 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188974
2025-02-10 13:02:56 <hasync:WARN> conn=0xc56dc20, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
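
For anyone reading along, a rough way to summarize a capture like the one above is to tally the hasync warnings by sync type. This is only a sketch in Python, not a Fortinet tool: the function name `tally_sync_types` and the regular expression are my own assumptions, based purely on the line format shown in this post.

```python
import re
from collections import Counter

# Assumed line format, taken from the debug output in this thread:
# <timestamp> <hasync:WARN> conn=..., peer closed the connection: dst=..., sync_type=14(diff)
WARN_RE = re.compile(r"<hasync:WARN>.*sync_type=(\d+)\(([\w-]+)\)")


def tally_sync_types(log_text):
    """Count hasync WARN lines per sync type name (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WARN_RE.search(line)
        if match:
            counts[match.group(2)] += 1  # group(2) is the name in parentheses
    return counts


# A few lines copied verbatim from the capture above.
sample = """\
2025-02-10 13:01:01 <hasync:WARN> conn=0xc4d1c20, peer closed the connection: dst=169.254.0.2, sync_type=18(byod)
2025-02-10 13:01:03 <hasync:WARN> conn=0xc536520, peer closed the connection: dst=169.254.0.2, sync_type=12(auth)
2025-02-10 13:01:04 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188864
2025-02-10 13:01:06 <hasync:WARN> conn=0xc4f9cd0, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:14 <hatalk> vcluster_1: ha_prio=1(secondary), state/chg_time/now=3(standby)/1739187627/1739188874
2025-02-10 13:01:16 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
"""

for name, count in tally_sync_types(sample).most_common():
    print(name, count)
```

On these six sample lines it counts diff twice and byod and auth once each; the hatalk lines are ignored. Run against the full capture, the same tally shows that diff dominates.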

Toshi_Esumi
SuperUser
February 10, 2025

When you ran the command below on both units as described in the KB, what did you see in the hatalk application debug output? Didn't the command recover the sync?

  fnsysctl killall hasync

 

It might be another HA-related bug in 7.2.x, possibly one that has already been reported. You should open a ticket with TAC to get it evaluated.


Toshi

Toshi_Esumi
SuperUser
February 10, 2025

By the way, based on what I'm seeing in my lab environment, the "WARN" message itself seems to be normal. When I changed the config on the primary side, it was copied over to the secondary, and the line below was shown once the transaction completed.
60EPOE-0239 # <hasync:WARN> conn=0x63c0940, peer closed the connection: dst=169.254.0.2, sync_type=10(cli-command)

But your "WARN" messages are for sync_type (diff), so they shouldn't keep recurring. That indicates a problem. What do you see on the secondary side with the same hasync application debug enabled?
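
To put a number on "repeatedly", the spacing between consecutive (diff) warnings can be measured from the timestamps in the capture posted earlier in this thread. The following is just a sketch in Python under my own assumptions: `diff_intervals` is a hypothetical helper, and the pattern assumes the timestamp and `sync_type=14(diff)` layout seen in that capture.

```python
import re
from datetime import datetime

# Assumed layout: "YYYY-MM-DD HH:MM:SS <hasync:WARN> ... sync_type=14(diff)"
DIFF_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) <hasync:WARN>.*sync_type=14\(diff\)"
)


def diff_intervals(log_text):
    """Return seconds between consecutive (diff) warnings (hypothetical helper)."""
    stamps = []
    for line in log_text.splitlines():
        match = DIFF_RE.match(line.strip())
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Lines copied verbatim from the capture posted earlier in the thread.
capture = """\
2025-02-10 13:01:06 <hasync:WARN> conn=0xc4f9cd0, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:16 <hasync:WARN> conn=0xc4e4440, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:26 <hasync:WARN> conn=0xc4d1c20, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
2025-02-10 13:01:32 <hasync:WARN> conn=0xc536520, peer closed the connection: dst=169.254.0.2, sync_type=18(byod)
2025-02-10 13:01:36 <hasync:WARN> conn=0xc4f9cd0, peer closed the connection: dst=169.254.0.2, sync_type=14(diff)
"""

print(diff_intervals(capture))
```

For these lines every gap is 10 seconds, and the byod line is skipped. In other words, the (diff) warning recurs at a fixed interval rather than only after a config change.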

Toshi

Toshi_Esumi
SuperUser
February 10, 2025

Looks like these "WARN" messages are on the secondary side. So they don't tell us what might be the problem.

Toshi

CSKUM
Author
New Member
March 2, 2025

The problem went away on its own after upgrading the firmware to 7.4.7. A few hours after the upgrade the devices synchronized, and they have stayed in sync ever since, even after configuration changes on the primary. We've encountered another bug with traffic shaping that was causing a kernel panic on our 1000F, but that's a completely different problem.