This issue may occur during FGCP member failover or as a result of various routing changes, and it can lead to traffic outages due to misrouting. Follow the steps below to verify the scenario. Note: '10.0.0.1' is used as the destination server IP in the test commands.
Check routing for the destination IP:
test-fgt # get router info routing-table details 10.0.0.1
Routing table for VRF=0
Routing entry for 10.0.0.0/8
Known via "bgp", distance 20, metric 0, best
Last update 06w1d14h ago
* vrf 0 192.168.1.1 priority 1 (recursive is directly connected, V10)
* vrf 0 192.168.1.2 priority 1 (recursive is directly connected, V10)
Run debug flow.
diagnose debug flow filter addr 10.0.0.1
diagnose debug flow show iprope enable
diagnose debug flow show function-name enable
diagnose debug enable
diagnose debug flow trace start 100
Check the debug flow output, which shows that traffic is forwarded to VLAN 200 (V200) instead of VLAN 10 (V10).
test-fgt # id=65308 trace_id=121 func=print_pkt_detail line=5894 msg="vd-root:0 received a packet(proto=6, 10.5.5.100:56547->10.0.0.1:1234) tun_id=0.0.0.0 from V100. flag [S], seq 2556196302, ack 0, win 64240"
id=65308 trace_id=121 func=init_ip_session_common line=6080 msg="allocate a new session-1a0bb231, tun_id=0.0.0.0"
id=65308 trace_id=121 func=iprope_dnat_check line=5281 msg="in-[V100], out-[]"
id=65308 trace_id=121 func=iprope_dnat_tree_check line=824 msg="len=0"
id=65308 trace_id=121 func=iprope_dnat_check line=5293 msg="result: skb_flags-02000000, vid-0, ret-no-match, act-accept, flag-00000000"
id=65308 trace_id=121 func=__vf_ip_route_input_rcu line=1990 msg="find a route: flag=00000000 gw-192.168.100.1 via V200"
id=65308 trace_id=121 func=iprope_fwd_check line=768 msg="in-[V100], out-[V200], skb_flags-02000000, vid-0, app_id: 0, url_cat_id: 0"
Check kernel routes.
test-fgt # get router info kernel
tab=254 vf=1 scope=0 type=1 proto=19 prio=2147483649 0.0.0.0/0.0.0.0/0->10.0.0.0/29 pref=0.0.0.0
gwy=192.168.100.1 flag=04 hops=0 oif=102(V200)
gwy=192.168.100.2 flag=04 hops=0 oif=102(V200)
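The routing table and the FIB (kernel routes) differ from each other: the routing table holds the BGP route 10.0.0.0/8 via V10, while the kernel holds a more specific route, 10.0.0.0/29, via V200. Because the kernel route is the longest match for 10.0.0.1, FortiGate forwards the traffic to V200, as shown in the debug flow output. The kernel routing proto is 19, which means 'HA Route on Secondary FortiGate'.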
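The comparison above can also be done offline against saved CLI output. The following Python sketch is not a Fortinet tool. It assumes the 'get router info kernel' output has the same line layout as the sample in this article, and the names kernel_routes_for, ENTRY_RE, OIF_RE and HA_SECONDARY_PROTO are hypothetical helpers introduced only for illustration. It lists the kernel entries that cover a destination IP, longest prefix first, so that an unexpected proto 19 entry or outgoing interface is easy to spot.

```python
import ipaddress
import re

# Kernel proto value described in this article as 'HA Route on Secondary FortiGate'.
HA_SECONDARY_PROTO = 19

# Assumed layout of an entry line: "... proto=19 ... ->10.0.0.0/29 ..."
ENTRY_RE = re.compile(r"proto=(?P<proto>\d+).*?->(?P<dst>\d+\.\d+\.\d+\.\d+/\d+)")
# Assumed layout of a next-hop: "... oif=102(V200)"
OIF_RE = re.compile(r"oif=\d+\((?P<name>[^)]+)\)")


def kernel_routes_for(kernel_output, destination):
    """Return kernel entries whose prefix covers destination, longest prefix first."""
    dest = ipaddress.ip_address(destination)
    entries = []
    current = None
    for line in kernel_output.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            # A new route entry starts on this line.
            current = {
                "prefix": ipaddress.ip_network(match.group("dst"), strict=False),
                "proto": int(match.group("proto")),
                "interfaces": [],
            }
            entries.append(current)
        if current is not None:
            # Next-hop lines (and the entry line itself) may carry oif=... fields.
            for oif in OIF_RE.finditer(line):
                current["interfaces"].append(oif.group("name"))
    covering = [e for e in entries if dest in e["prefix"]]
    return sorted(covering, key=lambda e: e["prefix"].prefixlen, reverse=True)


# Sample copied from the kernel route output in this article.
SAMPLE = """\
tab=254 vf=1 scope=0 type=1 proto=19 prio=2147483649 0.0.0.0/0.0.0.0/0->10.0.0.0/29 pref=0.0.0.0
gwy=192.168.100.1 flag=04 hops=0 oif=102(V200)
gwy=192.168.100.2 flag=04 hops=0 oif=102(V200)
"""

if __name__ == "__main__":
    expected_interface = "V10"  # interface reported by the routing table
    for route in kernel_routes_for(SAMPLE, "10.0.0.1"):
        stale = (route["proto"] == HA_SECONDARY_PROTO
                 and expected_interface not in route["interfaces"])
        print(route["prefix"], route["proto"], route["interfaces"],
              "suspect" if stale else "ok")
```

Sorting by prefix length mirrors longest-prefix matching, which is why the first entry returned is the one the kernel would use for the destination.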
All the details about kernel routing can be found on the page Technical Tip: FortiGate - Viewing FIB/RIB routing information in CLI.
Workaround:
Add a test route for the subnet seen in the kernel (10.0.0.0/29 in this example) and then delete it. If adding and removing the route does not resolve the issue, reboot the FortiGate.
Fixed release:
The issue is fixed with the v7.4.8 GA release.