nalexiou
Staff & Editor
November 12, 2024

Technical Tip: How to investigate BGP neighborship flap

Description This article describes how to troubleshoot BGP interruptions.
Scope FortiGate.
Solution

A BGP neighborship is torn down by a Notification packet, which carries an error code and subcode that indicate why the session was closed.
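As an illustration of what the Notification packet carries, the following minimal Python sketch decodes the error code and subcode from one raw BGP message. The error code names follow RFC 4271 and the Cease subcode names follow RFC 4486; the function name decode_notification and the returned dictionary layout are hypothetical and are not part of FortiOS.

```python
import struct

# Error codes defined in RFC 4271, section 4.5.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

# Cease subcodes defined in RFC 4486.
CEASE_SUBCODES = {
    1: "Maximum Number of Prefixes Reached",
    2: "Administrative Shutdown",
    3: "Peer De-configured",
    4: "Administrative Reset",
    5: "Connection Rejected",
    6: "Other Configuration Change",
    7: "Connection Collision Resolution",
    8: "Out of Resources",
}

BGP_HEADER_LEN = 19      # 16-byte marker + 2-byte length + 1-byte type
NOTIFICATION_TYPE = 3    # BGP message type value for NOTIFICATION


def decode_notification(message: bytes) -> dict:
    """Decode one raw BGP message (hypothetical helper, not a FortiOS API)."""
    if len(message) < BGP_HEADER_LEN + 2:
        raise ValueError("too short to be a BGP NOTIFICATION")
    # The length and type fields follow the 16-byte marker.
    length, msg_type = struct.unpack("!HB", message[16:19])
    if msg_type != NOTIFICATION_TYPE:
        raise ValueError(f"not a NOTIFICATION (type {msg_type})")
    code, subcode = message[19], message[20]
    # Only Cease subcodes are named in this sketch; others stay numeric.
    subcode_name = CEASE_SUBCODES.get(subcode, "Unknown") if code == 6 else None
    return {
        "code": code,
        "subcode": subcode,
        "code_name": ERROR_CODES.get(code, "Unknown"),
        "subcode_name": subcode_name,
        "data": message[21:length],
    }
```

For example, a message whose last two bytes are 6 and 2 decodes to Cease / Administrative Shutdown, which corresponds to the '6/2' notation used in the BGP debug output.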

 

If only a single flap was observed and the BGP neighborship is currently stable, check the router event logs in the GUI under Log & Report -> System Events -> Router Events.

 

As a filter, LOG ID 20304 can be used:

 

KB1.PNG

 

In this example, the logs show different reasons why the neighborship was torn down, e.g., 'Hold Timer Expired' and 'Administratively Shutdown'.

 

When the neighborship is not stable and flaps are still occurring, live troubleshooting can be performed to identify what is causing the issue.

 

To debug the BGP process:

 

diagnose ip router bgp all enable

diagnose ip router bgp level info

diagnose debug enable

 

To disable debugs:

 

diagnose ip router bgp all disable

diagnose ip router bgp level none

diagnose debug reset

 

Note:

Starting from v7.2.0, it is possible to collect BGP debugs for a specific neighbor by using the filter command 'diagnose ip router bgp set-filter neighbor <neighbor address>'. For more information, see Technical Tip: Capture BGP debugs for a specific neighbor.

 

In the debug output, the notification packets can be identified, and the reason is displayed:

 

BGP: %BGP-3-NOTIFICATION: received from 10.191.19.33 6/2 (Cease/Administratively Shutdown.) 0 data-bytes []

BGP: 10.191.19.33-Outgoing [FSM] State: Established Event: 25

BGP: 10.191.19.33-Outgoing [FSM] BGP Notification received

id=20300 msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 10.191.19.33 Down BGP Notification FSM-ERR"

 

BGP: %BGP-3-NOTIFICATION: sending to 10.191.19.33 4/0 (Hold Timer Expired/Unspecified Error Subcode) 0 data-bytes []

id=20300 msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 10.191.19.33 Down Hold Timer Expired"

id=20300 msg="BGP: %BGP-5-ADJCHANGE: VRF 0 neighbor 10.191.19.33 Down BGP Notification FSM-ERR"
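When a long debug session has been saved to a file, the NOTIFICATION lines can be pulled out with a short script. The following Python sketch is one possible approach; the regular expression is an assumption based only on the sample lines shown above, so it may need adjusting for other FortiOS versions, and the function name parse_notifications is hypothetical.

```python
import re

# Pattern inferred from the sample debug lines in this article (assumption).
NOTIFICATION_RE = re.compile(
    r"%BGP-3-NOTIFICATION: (?P<direction>received from|sending to) "
    r"(?P<peer>\S+) (?P<code>\d+)/(?P<subcode>\d+) \((?P<reason>[^)]*)\)"
)


def parse_notifications(debug_text: str) -> list:
    """Return one dict per NOTIFICATION line found in the debug text."""
    events = []
    for match in NOTIFICATION_RE.finditer(debug_text):
        events.append({
            # "received" means the peer closed the session, "sent" means
            # the local unit closed it.
            "direction": "received" if match["direction"].startswith("received") else "sent",
            "peer": match["peer"],
            "code": int(match["code"]),
            "subcode": int(match["subcode"]),
            "reason": match["reason"],
        })
    return events
```

Grouping the result by direction and reason gives a quick view of which side is tearing down the session and why.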

 

The notification packet can also be analyzed by opening a packet capture in Wireshark:


notification.PNG

 

To capture the BGP packets, use port 179 as a filter. See Troubleshooting Tip: Packet Capture on FortiOS GUI.

 

By collecting the BGP debug and the packet capture while the flaps are occurring, additional analysis can be performed on the packets exchanged between the peers to identify the cause of the issue. 

 

Note:

  • The message 'Down BGP Notification FSM-ERR' indicates a BGP adjacency change event, specifically that a BGP neighbor has gone down due to a BGP Notification FSM (Finite State Machine) error.
  • Several different problems can be reported as an FSM error:
    1. 'Hold Timer Expired', which is usually associated with packet loss or MTU issues between the BGP peers.
    2. An unexpected message received in a particular FSM state.
    3. A problem with the underlying TCP connection (network problems).
  • The message 'Hold Timer Expired/Unspecified Error Subcode' is generated when the hold timer expires, which means that no keepalive (or update) message was received from the peer within the hold time.
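To relate the hold timer to a packet capture, the arrival times of keepalive and update messages from the peer can be compared against the hold time. The following Python sketch shows the idea; the function name find_hold_timer_expiries is hypothetical, and the 180-second default is only an assumed example value, so use the hold time actually negotiated between the peers.

```python
def find_hold_timer_expiries(arrival_times, hold_time=180.0):
    """Return (previous, next) arrival-time pairs whose gap exceeds hold_time.

    arrival_times: seconds at which a keepalive or update message arrived
    from the peer, for example taken from packet capture timestamps.
    hold_time: hold time in seconds (180 is an assumed example value).
    """
    ordered = sorted(arrival_times)
    return [
        (previous, current)
        for previous, current in zip(ordered, ordered[1:])
        if current - previous > hold_time
    ]
```

For instance, with arrivals at 0, 60, 120, 330 and 390 seconds and a hold time of 180 seconds, the only gap that exceeds the hold time is the one between 120 and 330 seconds, which is where a 'Hold Timer Expired' notification would be expected.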