nalexiou
Staff

Purpose


This document describes how to configure a FortiGate HA cluster and its OSPF settings (using the graceful-restart feature) so that each router (the FortiGate and its peers) keeps the OSPF routes in its routing table, avoiding traffic interruption during an HA failover.


Scope

 

- All FortiOS.
- FortiGate running in NAT and HA mode.


Expectations, Requirements

An HA cluster with one or more OSPF peers will fail over without traffic interruption.

Configuration

On a FortiGate HA cluster, the OSPF router daemon process runs only on the Primary (Master) unit. When an HA failover occurs, a new OSPF process is launched on the newly elected master.

Even though the FortiGate has all the routes, if the peer sees the FortiGate as unresponsive, it will remove all the routes from its routing table and traffic will be interrupted.

Avoiding traffic interruption therefore involves three parts, each detailed below:
1) Check that the remote peer will not delete the routes.
2) Check that the FortiGate cluster will keep the OSPF routes in the kernel ('# get router info kernel' command).
3) Fine tune timers.

1) Check that the remote peer will not delete the routes.

This can be achieved with OSPF graceful restart. 'Graceful Restart' is an OSPF capability. It is an Internet standard defined in RFC 3623. This capability needs to be configured on both peers.

Configuration snapshot.

# config router ospf
    set router-id 30.1.1.2
    set restart-mode graceful-restart
    config area
        edit 0.0.0.0
        next
    end
    config network
        edit 1
            set prefix 30.1.1.0 255.255.255.0
        next
        edit 2
            set prefix 60.1.1.0 255.255.255.0
        next
    end
    config redistribute "connected"
    end
    config redistribute "static"
    end
    config redistribute "rip"
    end
    config redistribute "bgp"
    end
    config redistribute "isis"
    end
end

 

2) Check that the FortiGate cluster will keep the OSPF routes in the kernel ('# get router info kernel' command).

When the FortiGate is configured in an HA cluster, all the routes will be synchronized to the slave units.
The synchronized routes on the slave will have a limited lifetime and a lower priority.

The lifetime of these routes and the timing of their synchronization can be configured through the 'route-ttl', 'route-wait' and 'route-hold' parameters in the system ha configuration.

 

# config system ha
    set route-ttl 60
    set route-wait 60
    set route-hold 60
end

 

route-ttl:
Controls how long HA routes are kept in the FIB of a cluster unit after it has been promoted to Master.
The route-ttl range is 5 to 3600 seconds. The default route-ttl time is 10 seconds.

route-wait:
The time the primary unit waits after receiving a routing table update before sending the update to the subordinate units in the cluster.
The route-wait range is 0 to 3600 seconds. The default route-wait time is 0 seconds.

route-hold:
The time that the primary unit waits between sending routing table updates to subordinate units in a cluster.
The route-hold range is 0 to 3600 seconds. The default route-hold time is 10 seconds.
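
The ranges above can be checked before a change is pushed to the cluster. The following is a minimal Python sketch, not a Fortinet tool: the function name render_ha_route_timers and the dictionary layout are hypothetical, and the ranges are simply the ones quoted in this article. It validates the three values and prints the matching 'config system ha' snippet.

```python
# Ranges (in seconds) as quoted in this article; adjust if your FortiOS
# release documents different limits.
HA_ROUTE_TIMER_RANGES = {
    "route-ttl": (5, 3600),
    "route-wait": (0, 3600),
    "route-hold": (0, 3600),
}


def render_ha_route_timers(timers):
    """Return a 'config system ha' snippet for the given timers.

    'timers' maps a parameter name to a value in seconds. Parameters that
    are not supplied are left out, so they keep their current value.
    Raises ValueError if a supplied value is outside the documented range.
    """
    lines = ["config system ha"]
    for name, (low, high) in HA_ROUTE_TIMER_RANGES.items():
        if name not in timers:
            continue
        value = timers[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high} seconds")
        lines.append(f"    set {name} {value}")
    lines.append("end")
    return "\n".join(lines)


if __name__ == "__main__":
    # Same values as the configuration example in this article.
    print(render_ha_route_timers(
        {"route-ttl": 60, "route-wait": 60, "route-hold": 60}
    ))
```

The snippet is only rendered as text; applying it to a unit is left to whatever change process is already in place.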

3) Fine tune timers.

Another main timer can be tuned:

restart-period (default 120):
Time needed for neighbours to restart, in seconds.

This is the number of seconds to wait for the Hello message before removing the routes.
restart-period should be less than or equal to the route-ttl.

Consider tuning these timers according to the following two criteria:
- The time within which a real OSPF peer failure should be detected
- The maximum time allowed for a restart

Note:
When graceful-restart is enabled, detection of a real network or peer failure is delayed, which can result in downtime as long as the route-ttl.

Therefore it is important to set these timers to values that suit the network requirements.
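
The relationship between restart-period and route-ttl can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: the function name check_graceful_restart_timers and the max_outage parameter (the longest downtime the network can tolerate on a real failure) are hypothetical and are not FortiOS settings. The two rules it applies are the ones given in this section.

```python
def check_graceful_restart_timers(restart_period, route_ttl, max_outage=None):
    """Return a list of warnings for the timer rules described in this section.

    All values are in seconds. 'max_outage' is a hypothetical, site-specific
    limit on acceptable downtime; pass None to skip that check.
    """
    warnings = []
    # Rule from this article: restart-period should not exceed route-ttl.
    if restart_period > route_ttl:
        warnings.append(
            f"restart-period ({restart_period}s) is greater than "
            f"route-ttl ({route_ttl}s)"
        )
    # With graceful restart enabled, a real failure can cause downtime
    # as long as the route-ttl.
    if max_outage is not None and route_ttl > max_outage:
        warnings.append(
            f"route-ttl ({route_ttl}s) exceeds the acceptable outage "
            f"({max_outage}s)"
        )
    return warnings


if __name__ == "__main__":
    # Default restart-period (120) against a route-ttl of 60.
    for message in check_graceful_restart_timers(120, 60, max_outage=30):
        print("WARNING:", message)
```

Note that the default restart-period of 120 seconds is flagged against a route-ttl of 60 seconds, so at least one of the two values would need adjusting to satisfy the rule above.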

Verification

In the output of the following CLI command, check the graceful restart capabilities:
FGT# get router info ospf status
 Routing Process "ospf 0" with ID 30.1.1.1
 Process uptime is 1 minute
 Process bound to VRF default
 Conforms to RFC2328, and RFC1583Compatibility flag is disabled
 Supports only single TOS(TOS0) routes
 Supports opaque LSA
 Supports Graceful Restart
 SPF schedule delay 5 secs, Hold time between two SPFs 10 secs
 Refresh timer 10 secs
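
When this check has to be repeated on several units, the status text can be scanned automatically. The following Python sketch assumes the output has already been captured as a string (how it is collected is out of scope here) and that the capability appears on a line reading exactly 'Supports Graceful Restart', as in the sample above. The function name is hypothetical.

```python
def supports_graceful_restart(status_output):
    """Return True if the captured 'get router info ospf status' text
    contains a 'Supports Graceful Restart' line."""
    return any(
        line.strip() == "Supports Graceful Restart"
        for line in status_output.splitlines()
    )


if __name__ == "__main__":
    # Excerpt of the sample output shown in this article.
    sample = """\
 Routing Process "ospf 0" with ID 30.1.1.1
 Supports opaque LSA
 Supports Graceful Restart
 Refresh timer 10 secs
"""
    print(supports_graceful_restart(sample))
```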

 

Check the timers in the output of the following CLI command:
FGT# get router info ospf neighbor a.b.c.d
OSPF process 0:
 Neighbor 20.1.1.2, interface address 20.1.1.2
    In the area 0.0.0.0 via interface wan2
    Neighbor priority is 1, State is Full, 6 state changes
    DR is 20.1.1.2, BDR is 20.1.1.1
    Options is 0x42 (*|O|-|-|-|-|E|-)
    Dead timer due in 00:00:32
    Neighbor is up for 00:03:18
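
The neighbor state and the remaining dead timer can be pulled out of this output for monitoring during a failover test. The Python sketch below is illustrative only: it assumes the text format shown in the sample above ('State is ...' and 'Dead timer due in HH:MM:SS'), and the function name parse_ospf_neighbor is hypothetical.

```python
import re


def parse_ospf_neighbor(neighbor_output):
    """Extract the adjacency state and dead timer (in seconds) from the
    captured text of 'get router info ospf neighbor a.b.c.d'.

    Fields that cannot be found are returned as None.
    """
    state_match = re.search(r"State is (\w+)", neighbor_output)
    dead_match = re.search(
        r"Dead timer due in (\d+):(\d+):(\d+)", neighbor_output
    )
    dead_seconds = None
    if dead_match:
        hours, minutes, seconds = (int(part) for part in dead_match.groups())
        dead_seconds = hours * 3600 + minutes * 60 + seconds
    return {
        "state": state_match.group(1) if state_match else None,
        "dead_timer_seconds": dead_seconds,
    }


if __name__ == "__main__":
    # Excerpt of the sample output shown in this article.
    sample = """\
 Neighbor 20.1.1.2, interface address 20.1.1.2
    Neighbor priority is 1, State is Full, 6 state changes
    Dead timer due in 00:00:32
"""
    print(parse_ospf_neighbor(sample))
```

Polling this during a test failover gives a quick way to see whether the adjacency stays in the Full state while the new primary takes over.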



Troubleshooting

Check other OSPF-related information with:

FGT# get router info ospf neighbor
FGT# get router info ospf status
FGT# get router info ospf route
FGT# get router info routing-table database
For more in-depth debugging:
# diag ip router ospf level info
# diag ip router ospf all enable
