Fortinet Forum
howardsinc
New Contributor

HA Cluster Link-monitoring Failover kinda working....

I have a two-unit HA cluster (override disabled) and am working with remote link failover.

 

http://docs.fortinet.com/uploaded/files/2177/fortigate-ha-526.pdf

(page 243)

 

I am able to successfully fail the cluster over whenever I break the Link-Monitor. But once the Slave becomes the new Master, and after the 'pingserver-flip-timeout 6' expires, if I kill the Link-Monitor AGAIN it will not fail back to the original Master.

 

The behavior I expect is that if I just leave the ping monitor broken (failed state), the HA cluster should keep failing itself over to the other unit every 6 minutes.

 

---

config system link-monitor

HA2 (test1) # get
name : test1
srcintf : port1
server:
== [ 172.20.40.1 ]
address: 172.20.40.1
protocol : ping
gateway-ip : 172.20.40.1
source-ip : 0.0.0.0
interval : 1
timeout : 1
failtime : 1
recoverytime : 3
ha-priority : 10
update-cascade-interface: disable
update-static-route : enable
status : enable

----

HA1 (test1) # get
name : test1
srcintf : port1
server:
== [ 172.20.40.1 ]
address: 172.20.40.1
protocol : ping
gateway-ip : 172.20.40.1
source-ip : 0.0.0.0
interval : 1
timeout : 1
failtime : 1
recoverytime : 3
ha-priority : 10
update-cascade-interface: disable
update-static-route : enable
status : enable

--

 

config system ha
    set group-id 99
    set group-name "test1"
    set mode a-p
    set password P@ssword123
    set hbdev "port23" 1 "port24" 1
    set hb-interval 1
    set hb-lost-threshold 2
    set session-pickup enable
    set session-pickup-connectionless enable
    set ha-mgmt-status enable
    set ha-mgmt-interface "mgmt2"
    set ha-mgmt-interface-gateway 172.16.1.1
    set override disable
    set monitor "port1" "port2" "port3" "port4"
    set pingserver-monitor-interface "port1"
    set pingserver-failover-threshold 10 <-------------------
    set pingserver-slave-force-reset disable
    set pingserver-flip-timeout 6 <-------------------
    set ha-direct enable
end

 

edit "port1"
    set vdom "dmz"
    set ip 172.20.40.2 255.255.255.192
    set allowaccess ping fgfm
    set fail-detect enable
    set fail-detect-option detectserver
    set type physical
    set alias "dmz"
    set snmp-index 5
    set secondary-IP enable
    config ipv6
        set ip6-allowaccess ping
        set ip6-address 2707:b200:f303:3::100/64
    end

==================

 

 

Has anyone ever seen this type of behavior? 

 

Thanks!

JNCIA, CCNP R/S, NSE4 , NSE7, Associate of (ISC)²

6 REPLIES
howardsinc
New Contributor

OK, so to answer my own post: after testing with different HA settings (config sys ha), it wasn't until I enabled:

 

'set pingserver-slave-force-reset enable'

 

that the cluster worked as expected. Once the Link-monitor failed, the cluster would fail over to the Slave unit. If I left the Link-monitor in a failed state, the cluster would then fail over again, back to the original Master.
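
For anyone landing here later, a minimal sketch of that change, assuming the rest of the HA configuration stays as in the original post. The interface, threshold, and flip-timeout values are simply copied from that post and are not recommendations:

```
config system ha
    set pingserver-monitor-interface "port1"
    set pingserver-failover-threshold 10
    set pingserver-flip-timeout 6
    set pingserver-slave-force-reset enable
end
```

Only the last `set` line differs from the configuration posted above; the flip timeout is still what spaces out the repeated failovers.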

 

I could not find any documentation on this command, though.

 

Regards,

JNCIA, CCNP R/S, NSE4 , NSE7, Associate of (ISC)²

jc83419

I have the same question. I couldn't find any documentation that mentions this CLI command.

jc83419

I found it. It is in the 5.4 CLI reference. Holy crap.

 

pingserver-slave-force-reset  

Enable/disable force reset of slave after PING server failure. In 5.4, the default is enable.

 

sonanio
New Contributor

I have experienced a similar situation on 2x 100D-POE and 2x FS108D.

I have "set override enable" on primary unit and "set pingserver-slave-force-reset enable".

under "config system link-monitor" I have "set ha-priority 5".

 

When ping fails to pingserver on primary unit, it triggers an HA failover to slave unit, and after that, no matter what I do, it never fails back to primary unit.

I have been searching for answers to this problem and no luck yet.

mscheiber

Any update on this? 

elfaran_FTNT

Maybe this is what you are looking for?

 

If you have enabled override, you can disable pingserver-slave-force-reset to reduce the number of failovers. If override is enabled and a remote link failover has occurred, after the flip timeout, even if the current primary unit is not experiencing a remote link failure, if pingserver-slave-force-reset is enabled, override causes the cluster to negotiate and select the FortiGate with the highest priority to become the primary unit. Then, if the remote link has not been restored for the FortiGate with the highest priority, remote link failover may cause another failover. But with override enabled, if pingserver-slave-force-reset is disabled, as long as the current primary unit is not experiencing a remote link failure, the cluster will not renegotiate. In brief, disabling pingserver-slave-force-reset prevents repeated failovers if the remote link is not restored for both FortiGates when the current primary unit experiences a remote link failure.

 

https://help.fortinet.com/fos60hlp/60/Content/FortiOS/fortigate-high-availability/HA_failoverRemoteLink.htm
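
As a rough sketch of the combination that passage describes, for a cluster that uses override and wants to avoid repeated failovers while the remote link stays down. Both settings are taken from the quoted text; whether fewer failovers is actually the desired behavior depends on the deployment:

```
config system ha
    set override enable
    set pingserver-slave-force-reset disable
end
```

With these values, per the quoted documentation, the cluster should not renegotiate after the flip timeout as long as the current primary unit is not itself experiencing a remote link failure.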