Alby23
Contributor II

File System Check in a HA environment

Hi all,

 

After a power outage, an Active-Passive cluster of FortiGates shows the "File System Check" message at login.

 

If I perform the File System Check, will only one of the firewalls reboot and check for errors, or will both of them go offline? I'd like to avoid a service interruption, but I don't know whether that is possible in this case.

 

2 Solutions
jhouvenaghel_FTNT

Hello,

 

If you had a power outage on your cluster, I assume both units were affected by it.

 

Now, if you connect to your cluster using the cluster management IP address (shared between both units), you are in fact connecting to your master unit. So logically the "File System Check" message is reported by this unit, and if you perform the FS check, I would expect your master unit to reboot, so a failover would occur.

 

While the original master unit is rebooting, the original slave takes over as master, and if you connect at that time you may see the same message (FS check).

Once the original master unit has rebooted, then depending on your HA settings (override enabled and so on), it may be elected master again. In that case you should no longer see the message, because you are connected to the master unit, which has already been checked. The slave unit may still need the FS check.

 

If you connect to your cluster using the HA dedicated management interface (one IP address per unit) and see this message on your slave, then performing an FS check should reboot this slave.
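To confirm which unit a given session has landed on before starting the check, something like the following could be run first. This is only a rough sketch: the output layout differs between FortiOS versions, and the arguments accepted by `execute ha manage` have changed over time, so the `?` help is used here rather than a specific argument form. The lines starting with `#` are annotations for the reader and are not meant to be typed.

```
# Show the cluster members, their serial numbers and current roles
get system ha status

# List the other cluster member(s) reachable from this CLI session
execute ha manage ?
```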


ede_pfau

...without causing a failover in this case.

 

But, as both cluster units will have been affected by the outage, you will need to fsck the master unit anyway.

To keep the number of failovers/network interruptions to a minimum, I'd:

- disable any HA override on the master

- set both HA priorities to the same value

- connect locally to the slave and fsck (will reboot - no failover)

- connect to the master/cluster and fsck (will reboot - failover); it will stay slave after recovering
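The first two steps could look roughly like this on the FortiOS CLI. Treat it as a sketch rather than a tested procedure: the value 128 is only an example (it is the usual default priority), and as far as I know the override and priority settings are not synchronized between cluster members, so they would have to be entered on each unit separately. The lines starting with `#` are annotations for the reader and are not meant to be typed.

```
# Run on each cluster unit (example values, adjust to your setup)
config system ha
    set override disable
    set priority 128
end
```

With override disabled and equal priorities, a recovered unit should not take the master role back straight away, which is what avoids the second failover.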

 

The amount of effort you are willing to invest depends on the sensitivity of your network, as always.


Ede

"Kernel panic: Aiee, killing interrupt handler!"

3 Replies
digimetrica
New Contributor

Nobody answered, but may I know if you found the answer by other means? :)

I am interested in this issue since I have the same situation.
