Solution
- A FortiGate in an Active-Passive HA setup running v7.2.11 reports that the configuration is out of sync. In the output of 'get system ha status', the checksum of the peer device is all zeros and the peer device's hostname is not displayed.
chameleon-kvm04 # get system ha status
Primary selected using:
HA Health Status: OK
Model: FortiGate-VM64-KVM
Mode: HA A-P
Group Name: Lab
Group ID: 10
Debug: 0
Cluster Uptime: 0 days 1:12:51
Cluster state change time: 2025-06-12 09:45:55
    <2025/06/12 09:45:55> vcluster-1: FGYYYYYYYYYYYYY is selected as the primary because its uptime is larger than peer member FGXXXXXXXXXXXXX.
ses_pickup: disable
override: disable
Configuration Status:
    FGXXXXXXXXXXXXX(updated 4 seconds ago): out-of-sync
    FGXXXXXXXXXXXXX chksum dump: 01 c3 5b df bb 27 3a f2 d2 be 9f e7 42 13 5c b2
    FGYYYYYYYYYYYYY(updated 1749716749 seconds ago): in-sync
    FGYYYYYYYYYYYYY chksum dump: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ----> Checksum of the peer device
System Usage stats:
    FGXXXXXXXXXXXXX(updated 4 seconds ago):
        sessions=0, average-cpu-user/nice/system/idle=32%/0%/17%/51%, memory=37%
    FGYYYYYYYYYYYYY(updated 1749716749 seconds ago):
        sessions=0, average-cpu-user/nice/system/idle=0%/0%/0%/0%, memory=0%
HBDEV stats:
    FGXXXXXXXXXXXXX(updated 4 seconds ago):
        port2: physical/10000full, up, rx-bytes/packets/dropped/errors=7734008/16786/0/0, tx=5723188/14660/0/0
    FGYYYYYYYYYYYYY(updated 1749716749 seconds ago):
Secondary : chameleon-kvm04 , FGXXXXXXXXXXXXX, HA cluster index = 0
Primary : , FGYYYYYYYYYYYYY, HA cluster index = 1 ----> Not displaying the Hostname of the peer unit.
number of vcluster: 1
vcluster 1: standby 169.254.0.2
Secondary: FGXXXXXXXXXXXXX, HA operating index = 1
Primary: FGYYYYYYYYYYYYY, HA operating index = 0
- The FortiGate shows no symptoms of split-brain: the output above shows that it can detect the peer device. However, configuration synchronization is not working.
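The two symptoms described above (an all-zero peer checksum and a missing peer hostname) can be spotted in saved 'get system ha status' output with a short script. The following is a minimal sketch, not a Fortinet tool: the function names and regular expressions are hypothetical and assume the line layout shown in the sample output above.

```python
import re

# Sample lines copied from the 'get system ha status' output shown above.
SAMPLE_STATUS = """\
FGXXXXXXXXXXXXX chksum dump: 01 c3 5b df bb 27 3a f2 d2 be 9f e7 42 13 5c b2
FGYYYYYYYYYYYYY chksum dump: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Secondary : chameleon-kvm04 , FGXXXXXXXXXXXXX, HA cluster index = 0
Primary : , FGYYYYYYYYYYYYY, HA cluster index = 1
"""

# "<serial> chksum dump: <hex byte> <hex byte> ..." (assumed layout).
CHKSUM_RE = re.compile(r"(\S+) chksum dump:((?:[ \t]+[0-9a-fA-F]{2}\b)+)")
# "<role> : <hostname> , <serial>, HA cluster index = N" (assumed layout).
MEMBER_RE = re.compile(
    r"(?:Primary|Secondary)\s*:\s*([^,]*?)\s*,\s*(\S+?),\s*HA cluster index"
)


def zero_checksum_members(status_text):
    """Return serial numbers whose checksum dump consists only of 00 bytes."""
    return [
        serial
        for serial, dump in CHKSUM_RE.findall(status_text)
        if set(dump.split()) == {"00"}
    ]


def members_without_hostname(status_text):
    """Return serial numbers listed with an empty hostname field."""
    return [serial for host, serial in MEMBER_RE.findall(status_text) if not host]


if __name__ == "__main__":
    print("All-zero checksum:", zero_checksum_members(SAMPLE_STATUS))
    print("Missing hostname:", members_without_hostname(SAMPLE_STATUS))
```

If the same serial number appears in both lists while 'HA Health Status' is still OK, the output matches the condition described in this article.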
- Check the iprope list in the HA vsys (vsys_ha) to confirm that the local-in iprope entries required for HA sync communication are present. The example below shows how to check and which iprope entries should be present. After checking the iprope list, the administrator must run 'exec enter root' to return to the root VDOM.
chameleon-kvm04 # exec enter vsys_ha
current vdom=vsys_ha:1
chameleon-kvm04 # diagnose firewall iprope list
Policy Group 00100004
policy index=4294967295 uuid_idx=0 action=accept
flag (10100): nat master
flag2 (80): skip_unauth
flag3 (20): link-local
schedule()
cos_fwd=0 cos_rev=0
group=00100004 av=00000000 au=00000000 split=00000000
host=1 chk_client_info=0x0 app_list=0 ips_view=0
misc=0
zone(1): 20 -> zone(1): 0
source(1): 169.254.0.0-169.254.0.63, uuid_idx=0,
dest(1): 0.0.0.0-255.255.255.255, uuid_idx=0,
service(1):
    [0:0x0:0/(0,65535)->(0,65535)] flags:0 helper:auto
Policy Group 0010000e
policy index=4294967295 uuid_idx=35 action=accept
flag (0):
schedule()
cos_fwd=0 cos_rev=0
group=0010000e av=00000000 au=00000000 split=00000000
host=1 chk_client_info=0x0 app_list=0 ips_view=0
misc=0
zone(1): 0 -> zone(1): 0
source(1): 169.254.0.1-169.254.0.62, uuid_idx=0,
dest(1): 0.0.0.0-255.255.255.255, uuid_idx=0,
service(2):
    [17:0x0:0/(0,65535)->(0,65535)] flags:0 helper:auto
    [6:0x0:0/(0,65535)->(0,65535)] flags:0 helper:auto
chameleon-kvm04 # exec enter root
current vdom=root:0
- On a device affected by this issue, the administrator will not see the iprope entries shown above.
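The iprope comparison above can be automated against captured 'diagnose firewall iprope list' output from the vsys_ha VDOM. This is a minimal sketch that assumes the two policy groups shown in the example (00100004 and 0010000e) are the ones to look for; the function name and the set of expected groups are illustrative, not an official list.

```python
import re

# Policy groups taken from the example output above (assumption: these are
# the local-in groups that HA sync communication relies on).
EXPECTED_GROUPS = {"00100004", "0010000e"}


def missing_ha_iprope_groups(iprope_text, expected=EXPECTED_GROUPS):
    """Return the expected policy groups that do not appear in the output."""
    found = {
        group.lower()
        for group in re.findall(r"Policy Group\s+([0-9a-fA-F]{8})", iprope_text)
    }
    return sorted(expected - found)


if __name__ == "__main__":
    healthy = "Policy Group 00100004\npolicy index=4294967295\nPolicy Group 0010000e\n"
    affected = ""  # an affected unit lists none of the entries
    print("Healthy unit, missing groups:", missing_ha_iprope_groups(healthy))
    print("Affected unit, missing groups:", missing_ha_iprope_groups(affected))
```

An empty result means both groups were listed; any returned group ID is absent from the captured output.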
- The fix for this issue is tracked under reported issue 1136097.
- As a workaround, recalculate the checksum with 'diagnose sys ha checksum recalculate', stop HA synchronization with 'execute ha synchronize stop', and then start HA synchronization again by restarting the hatalk daemon with 'fnsysctl killall hatalk'. Note: Restarting the hatalk daemon may cause a split-brain situation in an Active-Passive HA setup, so the administrator must take the required precautions before running the command.
- The issue is marked as resolved in 7.4.8 (and the upcoming 7.6.4).
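To check a firmware version against the fixed releases named above, a comparison along the following lines may help. It is a sketch under stated assumptions: the version table contains only the two releases mentioned in this article (7.6.4 is described as upcoming), the helper names are hypothetical, and any branch without a listed fix returns None because no fixed build is named for it here.

```python
# Fixed releases per branch, as stated in this article only.
FIXED_IN = {(7, 4): (7, 4, 8), (7, 6): (7, 6, 4)}


def parse_version(text):
    """Turn a string such as 'v7.2.11' or '7.4.8' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def has_fix(version_text):
    """Return True/False for listed branches, None when no fix is listed."""
    version = parse_version(version_text)
    fixed = FIXED_IN.get(version[:2])
    if fixed is None:
        return None  # the article names no fixed build for this branch
    return version >= fixed


if __name__ == "__main__":
    for candidate in ("v7.2.11", "7.4.7", "7.4.8", "7.6.4"):
        print(candidate, "->", has_fix(candidate))
```

Comparing tuples of integers avoids the string-comparison pitfall where "7.4.10" would sort before "7.4.8".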