Created on 02-03-2010 01:19 AM Edited on 11-22-2023 01:34 AM By Jean-Philippe_P
Description
This article describes the HA and BGP configuration needed on a FortiGate HA cluster so that each router (the FortiGate and its peers) keeps the BGP routes in its routing table, which avoids traffic interruption during an HA failover.
Scope
FortiGate running in NAT and HA mode.
Solution
Expectations, Requirements.
An HA cluster with one or more BGP peers will fail over without traffic interruption.
Configuration:
On a FortiGate HA cluster, the BGP router daemon process runs only on the Primary (Master) unit. When an HA failover occurs, a new BGP process is launched on the newly elected master.
Even though the FortiGate has all the routes, if the peer sees the FortiGate as unresponsive, it will remove all the routes from its routing table and traffic will be interrupted.
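Avoiding traffic interruption therefore involves three parts, detailed in the rest of this article: keeping the routes on the BGP peers (BGP graceful restart), keeping the routes on the newly elected master (HA route synchronization and route-ttl), and tuning the BGP timers.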
The peers can be made to keep the routes during a failover with BGP graceful restart. 'Graceful Restart' is a BGP capability defined in RFC 4724. This capability needs to be configured on both peers.
To be effective on the FortiGate, graceful restart needs to be enabled both at the global BGP level (set graceful-restart enable) and on each neighbor (set capability-graceful-restart enable).
Configuration snapshot:
config router bgp
    set as 65111
    set graceful-restart enable
    config neighbor
        edit "10.2.3.4"
            set capability-graceful-restart enable
            set remote-as 65000
            set weight 20
        next
    end
    config network
        edit 1
            set prefix 172.31.0.0 255.255.0.0
        next
    end
end
When the FortiGate is part of an HA cluster, all the routes are synchronized to the slave devices. The synchronized routes on a slave have a limited lifetime and a lower priority.
The lifetime of these routes can be configured through the 'route-ttl' parameter in the system ha configuration:
config system ha
    set route-ttl 30
end
The default value is 10 seconds.
Below are the main timers that can be tuned:
holdtime-timer (default 180): Number of seconds to mark the peer as dead.
This is the maximum number of seconds allowed between keepalive, update, or notification messages before the connection to the peer is considered closed.
graceful-restart-time (default 120): Time needed for neighbors to restart (sec).
This is the number of seconds to wait for the OPEN message before removing the stale routes.
graceful-restart-time should be less than or equal to holdtime-timer.
graceful-update-delay (default 120): Route advertisement/selection delay after restart (sec).
After an HA failover, route advertisement and selection on the new master are delayed by this number of seconds.
graceful-stalepath-time (default 360): Time to hold stale paths of restarting neighbor (sec).
This is the maximum total time that a stale route is kept before being deleted.
CLI Syntax:
config router bgp
set graceful-restart enable
set graceful-restart-time <integer value> --> graceful-restart-time, Enter an integer value from <1> to <3600> (default = <120>).
set graceful-stalepath-time <integer value> --> graceful-stalepath-time, Enter an integer value from <1> to <3600> (default = <360>).
set holdtime-timer <integer value> --> holdtime-timer, Enter an integer value from <3> to <65535> or (special = <0>) (default = <180>).
set graceful-update-delay <integer value> --> graceful-update-delay, Enter an integer value from <1> to <3600> (default = <120>).
end
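Consider tuning these timers with two criteria in mind: they must be long enough to cover the HA failover and the start of the BGP process on the new master, and short enough that a real peer or network failure is still detected within an acceptable time.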
Note:
When graceful-restart is enabled, the detection of a real network or peer failure is delayed, which can result in downtime as long as the graceful-restart-time.
It is therefore important to set these timers to values that suit the network requirements. Also, do not expect the BGP peering to keep the uptime it had on the previous primary device once the failover is finished. The BGP session is re-established, so a BGP 'flap' will be visible. This is expected behavior.
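The constraints described above can be sketched as a small offline check, for example before pushing a change. This is only an illustration and not a Fortinet tool: the function name check_gr_timers and the RANGES table are hypothetical, the ranges are copied from the CLI syntax in this article, and skipping the comparison when holdtime-timer is 0 is an assumption.

```python
from typing import List

# Allowed ranges in seconds, copied from the CLI syntax shown in this article.
RANGES = {
    "graceful-restart-time": (1, 3600),
    "graceful-stalepath-time": (1, 3600),
    "holdtime-timer": (3, 65535),  # 0 is also accepted as a special value
    "graceful-update-delay": (1, 3600),
}


def check_gr_timers(holdtime: int = 180, restart_time: int = 120,
                    stalepath_time: int = 360, update_delay: int = 120) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    values = {
        "graceful-restart-time": restart_time,
        "graceful-stalepath-time": stalepath_time,
        "holdtime-timer": holdtime,
        "graceful-update-delay": update_delay,
    }
    problems = []
    for name, value in values.items():
        if name == "holdtime-timer" and value == 0:
            continue  # special value listed in the CLI syntax
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name} {value} is outside {low}-{high}")
    # Rule from this article: graceful-restart-time <= holdtime-timer.
    # Assumption: the comparison is skipped when holdtime-timer is 0.
    if holdtime != 0 and restart_time > holdtime:
        problems.append(
            f"graceful-restart-time {restart_time} exceeds holdtime-timer {holdtime}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_gr_timers(holdtime=90, restart_time=120):
        print(problem)
```

With the default values from this article the function returns an empty list. Lowering holdtime-timer to 90 while leaving graceful-restart-time at 120 reports the ordering rule as violated.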
Verification.
Run the following CLI command:
FGT # get router info bgp neighbor a.b.c.d
Check the graceful restart capabilities in the output:
For address family: IPv4 Unicast:
BGP table version 1, neighbor version 0
Index 1, Offset 0, Mask 0x2
AF-dependant capabilities:
Graceful restart: advertised, received
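When the output has been saved to a file or collected by a script, the same check can be automated. The sketch below is only an illustration: the function names are hypothetical, it assumes the 'Graceful restart:' line has exactly the format shown above, and it only inspects the first such line (the first address family).

```python
import re


def graceful_restart_state(neighbor_output: str) -> set:
    """Return the states listed on the first 'Graceful restart:' line."""
    match = re.search(r"^\s*Graceful restart:\s*(.+)$", neighbor_output, re.MULTILINE)
    if match is None:
        return set()
    return {word.strip() for word in match.group(1).split(",") if word.strip()}


def graceful_restart_negotiated(neighbor_output: str) -> bool:
    """True only if the capability was both advertised and received."""
    return {"advertised", "received"} <= graceful_restart_state(neighbor_output)


# Sample text copied from the output shown in this article.
SAMPLE = """\
For address family: IPv4 Unicast:
BGP table version 1, neighbor version 0
Index 1, Offset 0, Mask 0x2
AF-dependant capabilities:
Graceful restart: advertised, received
"""

if __name__ == "__main__":
    print(graceful_restart_negotiated(SAMPLE))
```

If only one of the two states is present, the capability has not been negotiated in both directions, and the configuration of the FortiGate or of the peer should be reviewed.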
Check the timers in the output of the same CLI command:
FGT # get router info bgp neighbor a.b.c.d
Routes in the FIB can be validated with the following command:
diag ip route list
or:
get router info kernel
Troubleshooting:
A packet capture filtered on the BGP peer IP address is helpful.
Check other BGP-related information with:
FGT # get router info bgp neighbor
FGT # get router info bgp summary
FGT # get router info bgp network
FGT # get router info routing-table database