pnobels
New Contributor II

HA split brain logging

Hi,

 

We suffered a split-brain scenario on our FortiGate 6.7 cluster yesterday, due to a datacenter fibre connection split. We know that in such a scenario on Checkpoint, the active cluster member remains active and the standby does nothing.

 

I'm wondering how this works in a Fortinet HA setup. According to the docs, a FortiGate HA split brain results in two active members. I believe this is correct because, when connectivity was restored, I got an email telling me the standby had been promoted to primary, and at about the same time another email telling me the original primary had become primary again. I assume that once both cluster members could talk again, priority kicked in, so the original standby member returned to standby.

 

Since I wasn't able to log in to one cluster member, I wonder whether any logs can be retrieved via the CLI on the host itself. I'm interested in seeing what the unreachable member actually did during the split brain. FortiAnalyzer does not show me much more than 'heartbeat packet lost' and 'Virtual cluster member dead'. When I run 'execute log display' I get:

 

0 logs found

0 logs returned

 

I would expect this node to keep some logs locally since FortiAnalyzer was not reachable.  Is this not the case?

3 REPLIES
ede_pfau
Esteemed Contributor III

Well, I wonder what the reasoning behind CP's way to handle a split-brain is.

Split-brain will occur when the cluster members lose contact. The primary goal of an HA cluster is to maintain connectivity of its networks. So, if I am a cluster member, and am fully synced, and then my co-unit disappears, I will dutifully declare myself the 'primary' and continue to serve my networks.

Of course, the other unit does the same reasoning and action.

So, how does Checkpoint obtain (additional) information on the situation to make one unit a primary, but shut down the second one? What if the second one is the only one? Network outage? Strange.

 

Now to your questions:

You can log to a FAZ and to memory at the same time. I doubt you will see anything helpful though. You know you can re-enact the split-brain situation at any time by pulling the HA link(s). You could watch the HA processes on the console port while being split, with

diag deb en

diag deb app hat -1

diag deb app has -1
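
For logging to memory alongside the FAZ, a minimal sketch could look like the following. It assumes the model supports memory logging; the severity level shown is only an illustrative choice, and memory logs are lost on reboot, so check the result on your own firmware.

```
config log memory setting
    set status enable
end
config log memory filter
    set severity information
end
```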

 

BTW, I would not expect to be able to talk to both members of a split-brain HA cluster in that situation, as both use the same IP and MAC addresses. Any mgmt traffic would reach either unit 1 or unit 2, more or less at random.

 


Ede

"Kernel panic: Aiee, killing interrupt handler!"
Pascal
New Contributor

Checkpoint verifies that both units have the same "HA Group" "HA ID" and "HA Password" and if they are set in A-P and both are primary, then it logs and sends an alert.   

I've used Fortinet terms for this in order to make sense to those who read this forum. 


vsahu
Staff

Hello pnobels,

 

You can check the HA status on both devices. It can give an idea of when the split brain occurred, since both devices should have been acting as primary at that time. You can match the timestamps; there will be a few entries to verify. Otherwise, you can check the HA event logs (memory or disk log) on the primary and secondary to get a better understanding.

get system ha status

diag sys ha history read
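
To read the locally stored event logs from the CLI, something along these lines may help. This is only a sketch: the device and category numbers are assumptions that can differ per model and firmware (0 is commonly memory and category 1 is commonly event), so list the available values first by running 'execute log filter device' and 'execute log filter category' without a value.

```
execute log filter device 0
execute log filter category 1
execute log filter dump
execute log display
```

If this still returns "0 logs found", the unit may simply not have been logging to that device during the outage.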

Regards,
Vishal