FortiSwitch
riteshpv
Staff
Article Id 332740
Description This article describes how to identify and resolve a network loop that leads to FortiLink instability.
Scope Any supported FortiSwitch model.
Solution
One potential cause of FortiLink instability is a network loop or a configuration error on the FortiSwitch or FortiGate, which may result in high CPU usage, packet loss, or unstable connectivity between network devices.
 
While the issue is occurring, the following message repeats in the FortiGate event log:
 
msg="FortiLink: internal echo reply timing out echo-miss(10)"
 
The FortiSwitch logs repeatedly show the following messages:
 
action=caputp-disconnect msg="Switch-Controller: CAPWAP Tunnel down with event 26 old_state 13"
msg="FortiLink: port31 echo reply timing out echo-miss(10)"
msg="FortiLink: port31 left Fortigate-uplink"
msg="FortiLink: port31 joined Fortigate-uplink trunk-id(7)"
event="stp disabled on interface" msg="user FortiLink disabled STP on primary interface GT81FTK300XXXXX"
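
To gauge how often the FortiLink uplink is flapping, the messages above can be counted in an exported log. The following is a minimal Python sketch, not a Fortinet tool. The function name, the event labels, and the regular expressions are assumptions derived only from the message texts shown in this article; other firmware versions may word these messages differently.

```python
import re
from collections import Counter

# Sample lines copied from the FortiSwitch log excerpt in this article.
SAMPLE_LOG = '''\
msg="FortiLink: port31 echo reply timing out echo-miss(10)"
msg="FortiLink: port31 left Fortigate-uplink"
msg="FortiLink: port31 joined Fortigate-uplink trunk-id(7)"
'''

# Hypothetical event labels mapped to patterns that match only the
# message texts shown above (assumption: the wording is stable).
PATTERNS = {
    "echo_miss": re.compile(r"FortiLink: (\S+) echo reply timing out"),
    "left_uplink": re.compile(r"FortiLink: (\S+) left Fortigate-uplink"),
    "joined_uplink": re.compile(r"FortiLink: (\S+) joined Fortigate-uplink"),
}


def count_fortilink_events(log_text):
    """Return a Counter keyed by (event, port) for each matching log line."""
    counts = Counter()
    for line in log_text.splitlines():
        for event, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(event, match.group(1))] += 1
    return counts


events = count_fortilink_events(SAMPLE_LOG)
for (event, port), total in sorted(events.items()):
    print(f"{port} {event}: {total}")
```

A steadily growing count of left and joined events on the same port suggests that the uplink is flapping rather than failing once.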
 
In this scenario, a physical cabling loop creates a loop in the network, which leads to FortiLink instability.
 
Topology:
 
topology-loop.jpg
 
As shown above, a physical loop exists between port5 and port6 on SW2.
 
As a result, both the TX and RX rates have spiked on port5 and port6.
 
Additionally, the TX rate has spiked on most of the other ports, which indicates that the network is experiencing a loop.
 
The following output is from SW2:
 
diagnose switch physical-port linerate up
 
Rate Display Mode: DATA_RATE
Port       |  TX Packets          |  TX Rate        ||  RX Packets          |  RX Rate        |
-----------------------------------------------------------------------------------------------
     port1 |            590655166 |    77.3760 Mbps ||              5841197 |     0.0670 Mbps |
     port2 |            580928153 |    77.2803 Mbps ||                18416 |     0.0000 Mbps |
     port3 |           3135152434 |    77.3992 Mbps ||               225659 |     0.0877 Mbps |
     port5 |           2437815563 |    38.6389 Mbps ||           2437862039 |    38.6338 Mbps |  
     port6 |           2437922339 |    38.6388 Mbps ||           2437875859 |    38.6390 Mbps |  
    port20 |            297318967 |     3.8312 Mbps ||                    0 |     0.0000 Mbps |
    port31 |            580846802 |    77.2800 Mbps ||                  178 |     0.0000 Mbps |
    port33 |            562374882 |    72.7812 Mbps ||               825446 |     0.7287 Mbps |
    port43 |            598784937 |    77.9103 Mbps ||             39351059 |     1.9095 Mbps |
    port44 |            554430250 |    72.9052 Mbps ||               220688 |     0.0000 Mbps |
    port47 |            564256009 |    77.2800 Mbps ||               218465 |     0.0000 Mbps |
    port48 |            603631938 |    78.8473 Mbps ||             39252313 |     4.2850 Mbps |
  internal |             31412326 |     0.4811 Mbps ||               263017 |     0.0023 Mbps |
-----------------------------------------------------------------------------------------------
                                  |   770.6495 Mbps ||                      |    84.3529 Mbps |
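
The pattern in the SW2 output (TX and RX rates that are both elevated and nearly equal on port5 and port6) can be checked programmatically. The following Python sketch parses a subset of the rows above and flags such ports. The function names, the 1 Mbps floor, and the 1% tolerance are illustrative assumptions, not Fortinet-defined thresholds, and a symmetric rate alone does not prove a loop.

```python
import re

# A subset of rows copied from the SW2 output in this article.
SW2_LINERATE = """\
     port1 |            590655166 |    77.3760 Mbps ||              5841197 |     0.0670 Mbps |
     port5 |           2437815563 |    38.6389 Mbps ||           2437862039 |    38.6338 Mbps |
     port6 |           2437922339 |    38.6388 Mbps ||           2437875859 |    38.6390 Mbps |
    port33 |            562374882 |    72.7812 Mbps ||               825446 |     0.7287 Mbps |
"""

# Matches one data row: port | TX packets | TX rate || RX packets | RX rate
ROW = re.compile(
    r"^\s*(\S+)\s*\|\s*(\d+)\s*\|\s*([\d.]+) Mbps\s*\|\|"
    r"\s*(\d+)\s*\|\s*([\d.]+) Mbps"
)


def parse_linerate(text):
    """Parse data rows into dicts; header, separator, and total lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "port": m.group(1),
                "tx_pkts": int(m.group(2)),
                "tx_mbps": float(m.group(3)),
                "rx_pkts": int(m.group(4)),
                "rx_mbps": float(m.group(5)),
            })
    return rows


def symmetric_ports(rows, min_mbps=1.0, tolerance=0.01):
    """Return ports whose TX and RX rates both exceed min_mbps and differ
    by no more than `tolerance` (as a fraction of the larger rate)."""
    flagged = []
    for row in rows:
        tx, rx = row["tx_mbps"], row["rx_mbps"]
        if min(tx, rx) >= min_mbps and abs(tx - rx) <= tolerance * max(tx, rx):
            flagged.append(row["port"])
    return flagged


rows = parse_linerate(SW2_LINERATE)
print(symmetric_ports(rows))
```

With the sample rows, only port5 and port6 are flagged, which matches the looped ports in the topology.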
   
To determine which FortiSwitch has the loop or is causing the issue, check the line rate on each switch and note which port has the high RX rate.
 
The following output is from SW1:
 
diagnose switch physical-port linerate up
 
Rate Display Mode: LINE_RATE
Port       |  TX Packets          |  TX Rate        ||  RX Packets          |  RX Rate        |
-----------------------------------------------------------------------------------------------
     port1 |              1203866 |     0.0589 Mbps ||               607952 |     0.0572 Mbps |
     port2 |               926709 |     0.0096 Mbps ||               271735 |     0.0066 Mbps |
    port15 |               946147 |    10.1138 Mbps ||               555959 |     0.6932 Mbps |
    port31 |           2317423405 |    95.4970 Mbps ||             11871670 |     0.0005 Mbps |
    port33 |             25184668 |     6.8820 Mbps ||           2354407837 |   111.5847 Mbps | 
    port34 |               495277 |     0.0681 Mbps ||               237767 |     0.2867 Mbps |
    port37 |           2317423405 |    95.4970 Mbps ||             11871670 |     0.0005 Mbps |
    port38 |           2326454809 |    95.4972 Mbps ||              6727966 |     0.0004 Mbps |
    port39 |           2314043581 |    95.4972 Mbps ||               221024 |     0.0005 Mbps |
    port40 |           2317457156 |    95.4972 Mbps ||              2477727 |     0.0004 Mbps |
    port41 |           2315731504 |    95.6839 Mbps ||              1457394 |     0.0005 Mbps |
    port42 |           2317193800 |    95.4973 Mbps ||              2085409 |     0.0004 Mbps |
  internal |               178992 |     0.0328 Mbps ||               256514 |     0.0321 Mbps |
-----------------------------------------------------------------------------------------------
                                  |   703.1987 Mbps ||                      |   130.0114 Mbps |
  
Here, port33 of SW1 is connected to port33 of SW2. Since SW2 has the loop, the RX rate on port33 of SW1 is high, and multiple other ports on SW1 show a similarly high TX rate.
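
The reasoning above (one port receives the looped traffic and the switch floods it out of the other ports) can be expressed as a short check. The following Python sketch uses a subset of the SW1 rates shown earlier. The function names and the 0.5 ratio are illustrative assumptions, not values taken from Fortinet documentation.

```python
# (tx_mbps, rx_mbps) per port, copied from a subset of the SW1 output above.
SW1_RATES = {
    "port1": (0.0589, 0.0572),
    "port15": (10.1138, 0.6932),
    "port31": (95.4970, 0.0005),
    "port33": (6.8820, 111.5847),
    "port37": (95.4970, 0.0005),
    "port38": (95.4972, 0.0004),
}


def loop_facing_port(rates):
    """Return the port with the highest RX rate (the likely ingress of looped traffic)."""
    return max(rates, key=lambda port: rates[port][1])


def flooded_ports(rates, source_port, ratio=0.5):
    """Return ports, other than source_port, whose TX rate is at least
    `ratio` times the RX rate seen on source_port."""
    source_rx = rates[source_port][1]
    return [
        port
        for port, (tx, _rx) in rates.items()
        if port != source_port and tx >= ratio * source_rx
    ]


source = loop_facing_port(SW1_RATES)
print(source, flooded_ports(SW1_RATES, source))
```

With these values, port33 is reported as the ingress port, so the neighbor connected to port33 (SW2 in this topology) is the next switch to inspect.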
 
To verify which switches are connected to a FortiSwitch, run the following commands on it:
 
get switch lldp neighbors-summary
get switch lldp neighbors-detail <port-no>
 
After identifying the FortiSwitch that is causing the issue, remove the looped connection.
 
As a safety measure, enable loop-guard on these ports:
 
config switch interface
edit <interface_name>
set loop-guard {enabled | disabled}
next
end
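
When loop-guard needs to be enabled on several ports, the CLI block above can be generated rather than typed by hand. The following Python sketch builds the block for the two looped ports in this example. The helper name is hypothetical, and it emits only the commands shown in this article; review the output before pasting it into a FortiSwitch CLI session.

```python
def loop_guard_config(ports, state="enabled"):
    """Build a 'config switch interface' block that sets loop-guard on each port."""
    if state not in ("enabled", "disabled"):
        raise ValueError("state must be 'enabled' or 'disabled'")
    lines = ["config switch interface"]
    for port in ports:
        # One edit/next pair per interface, as in the syntax shown above.
        lines += [f"edit {port}", f"set loop-guard {state}", "next"]
    lines.append("end")
    return "\n".join(lines)


# port5 and port6 are the looped ports on SW2 in this article's topology.
config_text = loop_guard_config(["port5", "port6"])
print(config_text)
```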