Alby23
New Member
November 22, 2016
Solved

File System Check in a HA environment

  • 2 replies

Hi all,

After a power outage, an Active-Passive cluster of FGs shows the "File System Check" message at login.

If I perform the File System Check, will only one of the firewalls reboot and check for errors, or will both of them be offline at once? I'd like to avoid a service interruption, but I don't know whether that is possible in this case.

 

    2 replies

    digimetrica
    New Member
    March 14, 2017

    Nobody answered, but may I know if you found the answer by other means? :)

    I am interested in this issue since I have the same situation.

    jhouvenaghel_FTNT
    Staff
    March 14, 2017

    Hello,

     

    If your cluster had a power outage, I assume that both units were affected.

    If you connect to your cluster using the cluster management IP address (shared between both units), you are in fact connecting to the master unit. So logically the "File System Check" message is reported by that unit, and if you perform the FS check, I would expect the master unit to reboot, so a failover would occur.

    While the original master unit is rebooting, the original slave takes over as master, and if you connect at that time you may see the same message (FS check), now reported by the former slave.

    Once the original master unit has rebooted, then depending on your HA settings (override enabled and so on), it may be elected master again. You should then no longer see the message, because you are connected to the master unit, which has already been checked. The slave unit may still need the FS check.

    If you connect using the dedicated HA management interface (one IP address per unit) and see this message on the slave, then performing an FS check should reboot only that slave.

    ede_pfau
    SuperUser
    Answer
    March 14, 2017

    ...without causing a failover in this case.

     

    But since both cluster units will have been affected by the outage, you will need to fsck the master unit anyway.

    To keep the number of failovers/network interruptions to a minimum, I'd

    - disable any HA override on the master

    - set both HA priorities to the same value

    - connect locally to the slave and run fsck (it will reboot, no failover)

    - connect to the master/cluster and run fsck (it will reboot and cause a failover); the former master will then stay slave after recovering

     

    The amount of effort you are willing to invest depends on the sensitivity of your network, as always.
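
The reasoning behind the steps above can be sketched with a small toy model. This is only an illustration, not FortiOS code: `Unit`, `elect` and `count_failovers` are hypothetical names, and the election rule is a simplified assumption (with override disabled, the unit with the longest uptime wins, then priority; with override enabled, priority wins first). Monitored interfaces, the uptime margin and serial-number tie-breaks are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    priority: int
    uptime: int = 1000  # seconds since boot; stands in for HA "age" (assumption)
    up: bool = True


def elect(units, override):
    """Pick the master among the units that are currently up (simplified rule)."""
    candidates = [u for u in units if u.up]
    if override:
        # Assumed order with override enabled: priority first, then uptime.
        return max(candidates, key=lambda u: (u.priority, u.uptime))
    # Assumed order with override disabled: uptime first, then priority.
    return max(candidates, key=lambda u: (u.uptime, u.priority))


def count_failovers(units, override, fsck_order, reboot_seconds=600):
    """Reboot each named unit in turn for fsck and count master changes."""
    by_name = {u.name: u for u in units}
    master = elect(units, override)
    failovers = 0
    for name in fsck_order:
        target = by_name[name]
        # The unit goes down for the check; the survivor keeps ageing.
        target.up = False
        for u in units:
            if u.up:
                u.uptime += reboot_seconds
        # Re-elect once while the unit is down, then again after it rejoins.
        for rejoin in (False, True):
            if rejoin:
                target.up = True
                target.uptime = 0
            new_master = elect(units, override)
            if new_master is not master:
                failovers += 1
                master = new_master
    return failovers, master.name


# Suggested plan: override off, equal priorities, slave (B) first, then master (A).
plan = [Unit("A", 128), Unit("B", 128)]
print(count_failovers(plan, override=False, fsck_order=["B", "A"]))  # (1, 'B')

# Override left on with a higher priority on A: A takes the master role back.
naive = [Unit("A", 200), Unit("B", 100)]
print(count_failovers(naive, override=True, fsck_order=["B", "A"]))  # (2, 'A')
```

Under these assumptions the suggested order produces a single failover and the former master stays slave afterwards, while leaving override enabled on a higher-priority master produces a second failover when it rejoins.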