Dear Sir,
I have a FortiGate cluster of two units in the main center and another FortiGate cluster of two units in the backup center. There is also a single FortiGate unit at a branch location. IPsec VPN tunnels are established between the branch, the main center, and the backup center, and iBGP routing runs on these devices. I have observed that when the primary node in the main center fails, the BGP routes on the branch FortiGate switch to the primary node in the backup center, and it takes several tens of seconds for them to switch back to the main center node. Is it possible to configure the branch FortiGate so that it does not switch its BGP routes during the main center's HA failover? Alternatively, can the downtime during the switch be reduced to less than 5 seconds?
Hi Bruce,
Please see "Technical Tip : Configuring FortiGate HA and BGP g..." on the Fortinet Community; it should be helpful.
Best regards,
Jin
I have referred to this article, and the following is my configuration. Please point out which part I still need to pay attention to. It seems that the routes learned by the branch Fortigate through BGP from the main center still disappear for several tens of seconds.
Based on the information provided in the link, "Note. When graceful-restart is enabled it will delay the time at which a real network/peer failure will be detected, and as a consequence this will end up in a downtime that can be as long as the graceful-restart-time." Does this mean that the routes on the branch Fortigate will always be disconnected and relearned?
Does this architecture differ in terms of VM or hardware devices?
I triggered the switch from FG-HQ2 to FG-HQ1.
Here are my log files. Please help me clarify the issue.
<Primary>FG_HQ2 # 2023-06-17 18:17:38 BGP: NSM Message Header
2023-06-17 18:17:38 BGP: 11.25.0.33-Outgoing [FSM] State: Established Event: 2
2023-06-17 18:17:38 BGP: BGP VRF 0 leaking 10.71.0.0/24 afi 1, safi 1
2023-06-17 18:17:38 BGP: VRF 0 NSM withdraw: 10.71.0.0/24
2023-06-17 18:17:38 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Peer being deleted"
2023-06-17 18:17:38 BGP: BGP VRF 0 leaking 172.16.11.0/24 afi 1, safi 1
2023-06-17 18:17:38 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor FGHQVPN-PEERS Down Peer being deleted"
2023-06-17 18:17:52 secondary succeeded to sync external files with primary
------------------------------------------------------------------------------------------------------------------------
<Secondary>FG_HQ1 # 2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Member added to peer group"
2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Member added to peer group"
2023-06-17 18:17:55 BGP: bgp_keepalive_proc: notif_rcv 4-4
2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Up "
diagnose debug disable
------------------------------------------------------------------------------------------------------------------------
FG_REMOTESITE #
2023-06-17 18:17:35 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:17:35 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 37 KAlive msg(s) sent
2023-06-17 18:17:37 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:17:37 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:17:37 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:17:39 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:17:39 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 38 KAlive msg(s) sent
2023-06-17 18:17:43 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:17:43 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 39 KAlive msg(s) sent
2023-06-17 18:17:48 BGP: [RIB] Scanning BGP Network Routes...
2023-06-17 18:17:48 BGP: [RIB] Scanning BGP RIB...
2023-06-17 18:17:48 BGP: [NSM] Verified NH 11.25.0.1 with NSM
2023-06-17 18:17:48 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:17:48 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 40 KAlive msg(s) sent
2023-06-17 18:17:51 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 10
2023-06-17 18:17:51 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 3
2023-06-17 18:17:51 BGP: %BGP-3-NOTIFICATION: sending to 11.25.0.1 4/0 (Hold Timer Expired/Unspecified Error Subcode) 0 data-bytes []
2023-06-17 18:17:51 BGP: BGP VRF 0 leaking 172.16.11.0/24 afi 1, safi 1
2023-06-17 18:17:51 BGP: VRF 0 NSM withdraw: 172.16.11.0/24
2023-06-17 18:17:51 BGP: [GRST] Timer Announce Defer: Check
2023-06-17 18:17:51 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.1 Down Hold Timer Expired"
2023-06-17 18:17:51 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.1 Down BGP Notification FSM-ERR"
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: Idle Event: 3
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [NETWORK] FD=26, Sock Status: 0-Success
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: Connect Event: 17
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 1
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Open: Ver 4 MyAS 65002 Holdtime 15
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Open: Msg-Size 71
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 1, length 71
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open: Optional param len 42
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 1, Cap Len 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 1, Cap Len 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 2
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 128, Cap Len 0
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: RR Cap(old) for all address-families
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 2
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 2, Cap Len 0
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: RR Cap(new) for all address-families
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 65, Cap Len 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 8
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 64, Cap Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Cap GR: Restart Flag On, Restart Time 15
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Cap GR: AFI/SAFI 1/1 Fwd-state Flag 1, action: Set
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: OpenSent Event: 19
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 41 KAlive msg(s) sent
2023-06-17 18:17:55 BGP: bgp_keepalive_proc: notif_rcv 4-4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: OpenConfirm Event: 26
2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.1 Up "
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 34
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 2
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [ENCODE] Attr IP-Unicast: Tot-attr-len 21
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [ENCODE] Update: Msg #3 Size 48
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 2
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [ENCODE] Update: Msg #4 Size 23
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 2, length 48
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [DECODE] Update: Starting UPDATE decoding... Bytes To Read (29), msg_size (29)
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [DECODE] Update: NLRI Len(4)
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 27
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [RIB] Update: Received Prefix 172.16.11.0/24 path_id 0
2023-06-17 18:17:56 BGP: BGP VRF 0 leaking 172.16.11.0/24 afi 1, safi 1
2023-06-17 18:17:56 BGP: VRF 0 NSM announce: 172.16.11.0/24
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 2, length 23
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [DECODE] Update: Starting UPDATE decoding... Bytes To Read (4), msg_size (4)
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 27
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [FSM] Update: IPv4 Unicast End-Of-Rib Marker Received
2023-06-17 18:17:56 BGP: 11.25.0.1-Outgoing [FSM] Process End-of-RIB: Received for afi/safi: 1/1
2023-06-17 18:17:59 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:17:59 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 42 KAlive msg(s) sent
2023-06-17 18:18:01 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:01 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:01 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:03 BGP: [RIB] Scanning BGP Network Routes...
2023-06-17 18:18:03 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:03 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 43 KAlive msg(s) sent
2023-06-17 18:18:06 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:06 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:06 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:08 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:08 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 44 KAlive msg(s) sent
2023-06-17 18:18:11 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:11 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:11 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:13 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:13 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 45 KAlive msg(s) sent
2023-06-17 18:18:15 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:15 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:15 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:18 BGP: [RIB] Scanning BGP Network Routes...
2023-06-17 18:18:18 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:18 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 46 KAlive msg(s) sent
2023-06-17 18:18:19 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:19 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:19 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:22 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 34
2023-06-17 18:18:22 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:22 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 47 KAlive msg(s) sent
2023-06-17 18:18:24 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:24 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:24 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:26 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:26 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 48 KAlive msg(s) sent
2023-06-17 18:18:28 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:28 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:28 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:31 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:31 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 49 KAlive msg(s) sent
2023-06-17 18:18:32 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:32 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:32 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:33 BGP: [RIB] Scanning BGP Network Routes...
2023-06-17 18:18:35 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:35 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 50 KAlive msg(s) sent
2023-06-17 18:18:37 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:37 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:37 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:40 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:40 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 51 KAlive msg(s) sent
2023-06-17 18:18:41 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:41 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:41 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:44 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:44 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 52 KAlive msg(s) sent
2023-06-17 18:18:45 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:45 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:45 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
diagnose debug disable
FG_REMOTESITE # 2023-06-17 18:18:48 BGP: [RIB] Scanning BGP Network Routes...
2023-06-17 18:18:48 BGP: [RIB] Scanning BGP RIB...
2023-06-17 18:18:48 BGP: [NSM] Verified NH 11.25.0.1 with NSM
2023-06-17 18:18:48 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:48 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 53 KAlive msg(s) sent
2023-06-17 18:18:49 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 34
2023-06-17 18:18:50 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:50 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:50 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
diagnose debug disable
FG_REMOTESITE # 2023-06-17 18:18:52 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:52 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 54 KAlive msg(s) sent
2023-06-17 18:18:55 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:55 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:55 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
2023-06-17 18:18:57 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 4
2023-06-17 18:18:57 BGP: 11.25.0.1-Outgoing [ENCODE] Keepalive: 55 KAlive msg(s) sent
2023-06-17 18:18:59 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 4, length 19
2023-06-17 18:18:59 BGP: 11.25.0.1-Outgoing [DECODE] KAlive: Received!
2023-06-17 18:18:59 BGP: 11.25.0.1-Outgoing [FSM] State: Established Event: 26
diagnose ip router bgp level none
FG_REMOTESITE #
Hi,
Yes, it is possible to optimize BGP route switching and HA failover with BGP over IPSEC in your FortiGate setup. Here are some steps you can follow to improve the failover time:
1. Configure BGP over IPSEC redundancy: In your FortiGate setup, you can configure BGP over IPSEC redundancy by creating multiple IPSec VPN tunnels between the main center, backup center, and branch location. By creating multiple tunnels, you can ensure that BGP routes are always available, even in the event of a tunnel or device failure.
2. Configure BGP maximum-paths: By default, FortiGate devices only use a single BGP path for a given destination. However, you can increase this value by configuring the BGP maximum-paths parameter. This will allow the FortiGate device to use multiple paths for a given destination, which can improve failover time.
3. Configure BGP graceful restart: BGP graceful restart is a mechanism that allows BGP peers to maintain their BGP sessions during a planned restart or failover. By enabling BGP graceful restart, you can minimize the impact of failover events on BGP routes.
4. Configure BFD: BFD (Bidirectional Forwarding Detection) is a protocol that provides fast failure detection for network paths. By configuring BFD on your FortiGate devices, you can detect and respond to failover events more quickly.
5. Configure BGP fast external fallover: BGP fast external fallover is a mechanism that allows BGP peers to quickly detect and respond to external link failures. By enabling BGP fast external fallover, you can minimize the impact of external link failures on BGP routes.
By following these steps, you can optimize BGP route switching and HA failover with BGP over IPSEC in your FortiGate setup. It is recommended to test these configurations in a lab environment before applying them to your production network.
I hope this helps! Let me know if you have any further questions or if there's anything else I can assist you with.
Dear Sir,
These seem more like Link Failover, but in our case, we are assuming Cluster Failover. We are just curious why BGP Pickup takes so long. In your experience, how many seconds would be considered normal for Cluster Failover? Based on what conditions?
Bruce Liu
Dear Sir,
If you look at the logs I posted on 17th June, when the primary FG switched over to the secondary FG, it disconnected the BGP session and withdrew its own BGP route almost instantly, as shown below:
2023-06-17 18:17:38 BGP: VRF 0 NSM withdraw: 10.71.0.0/24
2023-06-17 18:17:38 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Peer being deleted"
However, the BGP session stayed down until the secondary FG picked it up, which took 17 seconds, as the logs below show. That is far too long compared with a route processor card switchover on a router such as the Cisco ASR9K, which picks up the BGP session in a very short period. I think the FG should behave like a router here, but apparently it does not.
2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Member added to peer group"
2023-06-17 18:17:55 BGP: bgp_keepalive_proc: notif_rcv 4-4
2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Up "
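The 17-second figure can be checked directly from the ADJCHANGE lines quoted above. The following is only a minimal sketch in Python; the helper names (`adjchange_events`, `outage_seconds`) are made up for illustration and are not part of any Fortinet tool, and the parsing assumes the log layout seen in this thread.

```python
from datetime import datetime

# ADJCHANGE lines copied from the HQ debug output quoted above.
LOG_LINES = [
    '2023-06-17 18:17:38 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Peer being deleted"',
    '2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Down Member added to peer group"',
    '2023-06-17 18:17:55 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.33 Up "',
]


def adjchange_events(lines, neighbor):
    """Yield (timestamp, state) for each ADJCHANGE line about `neighbor`."""
    marker = f"neighbor {neighbor} "  # trailing space avoids prefix matches
    for line in lines:
        if "ADJCHANGE" not in line or marker not in line:
            continue
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        # The word right after the neighbor address is "Down" or "Up".
        state = line.split(marker, 1)[1].split()[0].rstrip('"')
        yield stamp, state


def outage_seconds(lines, neighbor):
    """Seconds from the first 'Down' to the first later 'Up', or None."""
    went_down = None
    for stamp, state in adjchange_events(lines, neighbor):
        if state == "Down" and went_down is None:
            went_down = stamp
        elif state == "Up" and went_down is not None:
            return int((stamp - went_down).total_seconds())
    return None


hq_gap = outage_seconds(LOG_LINES, "11.25.0.33")
print(hq_gap)
```

The log timestamps only have one-second resolution, so the result is accurate to about a second at best.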
On the other hand, as you can see, the remote-site FG was waiting for Keepalive messages from the primary/secondary FG at HQ. After missing 3 consecutive messages, it finally disconnected the BGP session with the HQ FG and attempted to form a new session by sending an OPEN message.
The timestamp (2023-06-17 18:17:55) exactly matches the time at which the BGP session came back on the HQ FG.
2023-06-17 18:17:51 BGP: %BGP-3-NOTIFICATION: sending to 11.25.0.1 4/0 (Hold Timer Expired/Unspecified Error Subcode) 0 data-bytes []
2023-06-17 18:17:51 BGP: BGP VRF 0 leaking 172.16.11.0/24 afi 1, safi 1
2023-06-17 18:17:51 BGP: VRF 0 NSM withdraw: 172.16.11.0/24
2023-06-17 18:17:51 BGP: [GRST] Timer Announce Defer: Check
2023-06-17 18:17:51 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.1 Down Hold Timer Expired"
2023-06-17 18:17:51 id=20300 logdesc="BGP neighbor status changed" msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 11.25.0.1 Down BGP Notification FSM-ERR"
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: Idle Event: 3
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [NETWORK] FD=26, Sock Status: 0-Success
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: Connect Event: 17
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Msg-Hdr: Type 1
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Open: Ver 4 MyAS 65002 Holdtime 15
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [ENCODE] Open: Msg-Size 71
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Msg-Hdr: type 1, length 71
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open: Optional param len 42
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 1, Cap Len 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 1, Cap Len 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 2
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 128, Cap Len 0
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: RR Cap(old) for all address-families
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 2
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 2, Cap Len 0
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: RR Cap(new) for all address-families
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 65, Cap Len 4
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Opt: Option Type 2, Option Len 8
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Open Cap: Cap Code 64, Cap Len 6
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Cap GR: Restart Flag On, Restart Time 15
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [DECODE] Cap GR: AFI/SAFI 1/1 Fwd-state Flag 1, action: Set
2023-06-17 18:17:55 BGP: 11.25.0.1-Outgoing [FSM] State: OpenSent Event: 19
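To put numbers on the remote-site timeline, here is a small Python sketch that uses only timestamps copied from the FG_REMOTESITE debug output in this thread. The hold time of 15 comes from the OPEN message in the log. The keepalive interval of 5 seconds is an assumption (one third of the hold time), since the configuration itself is not shown here.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from the FG_REMOTESITE debug output in this thread.
last_keepalive_received = datetime.strptime("2023-06-17 18:17:37", FMT)
hold_timer_expired = datetime.strptime("2023-06-17 18:17:51", FMT)
session_up_again = datetime.strptime("2023-06-17 18:17:55", FMT)
route_reannounced = datetime.strptime("2023-06-17 18:17:56", FMT)

HOLD_TIME = 15          # from "Open: Ver 4 MyAS 65002 Holdtime 15"
KEEPALIVE_INTERVAL = 5  # assumption: one third of the hold time


def seconds_between(start, end):
    """Whole seconds elapsed from start to end."""
    return int((end - start).total_seconds())


# How long the remote site heard nothing before giving up on the peer.
silence_before_expiry = seconds_between(last_keepalive_received, hold_timer_expired)
# How long it then took until the session was Up again.
reconnect_delay = seconds_between(hold_timer_expired, session_up_again)
# How long 172.16.11.0/24 was withdrawn before being announced again.
route_gap = seconds_between(hold_timer_expired, route_reannounced)
# Keepalives that fit into one hold time under the assumed interval.
keepalives_per_hold = HOLD_TIME // KEEPALIVE_INTERVAL

print(silence_before_expiry, reconnect_delay, route_gap, keepalives_per_hold)
```

The silence before expiry is roughly in line with the 15-second hold time. The log has only one-second resolution, and the exact timer behaviour of the FG is not visible from this output.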
So why does the FG take so long to reconnect the BGP session when an HA failover happens? From my perspective, to shorten BGP downtime,
the secondary FG should send a BGP Notification at HA failover to force all BGP sessions to disconnect, and then send OPEN messages right away to reconnect them as quickly as possible, rather than waiting for the remote FG to send the OPEN message.
This leads to 2 questions:
To minimize BGP route switching during the main center's HA failover, you can explore BGP route dampening to prevent unnecessary route fluctuations. Additionally, tuning your BGP timers, such as the hold time and keepalive interval, can help reduce the downtime to less than 5 seconds. It's advisable to fine-tune these settings based on your network's requirements and performance considerations.
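As a rough check of the timer suggestion, the Python sketch below tests whether a given keepalive/hold-time pair could detect a silent peer within a target window. It models only the hold timer. The function names are hypothetical, and the keepalive value of 5 for the current setup is an assumption (the log in this thread only shows a hold time of 15).

```python
MIN_NONZERO_HOLD_TIME = 3  # RFC 4271: hold time is either 0 or at least 3 seconds


def validate_timers(keepalive, hold_time):
    """Raise ValueError if the pair is not a sensible BGP timer setting."""
    if hold_time == 0:
        return  # hold timer disabled; nothing more to check
    if hold_time < MIN_NONZERO_HOLD_TIME:
        raise ValueError("hold time must be 0 or at least 3 seconds")
    if keepalive <= 0 or keepalive >= hold_time:
        raise ValueError("keepalive must be positive and shorter than the hold time")


def detects_within(keepalive, hold_time, target_seconds):
    """True if a silent peer is declared down in under target_seconds.

    Only the hold timer is modelled: a peer that stops sending is noticed
    at most hold_time seconds after its last message.
    """
    validate_timers(keepalive, hold_time)
    if hold_time == 0:
        return False  # no hold timer, so silence is never detected this way
    return hold_time < target_seconds


# Hold time 15 as seen in the log; keepalive 5 is an assumption.
current = detects_within(keepalive=5, hold_time=15, target_seconds=5)
# The smallest non-zero hold time the protocol allows.
tightest = detects_within(keepalive=1, hold_time=3, target_seconds=5)
print(current, tightest)
```

Detection is only one part of the downtime. In the logs posted in this thread, the HQ side took 17 seconds to bring the session back after the switchover, and shorter timers on the branch would not by themselves shorten that.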
Copyright 2024 Fortinet, Inc. All Rights Reserved.