Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
chr1zzo
New Contributor II

On-Prem Data Gateway Loses Connection

Hey all,

 

My on-prem data gateway connects to Office 365 via port 443 (no inspection), but after a few minutes it loses the connection.

The issue is definitely the FortiGate. With a direct uplink the connection is persistent.

 

How can I debug this? What kind of trace can I run to find the issue? Any ideas?
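On a FortiGate, one common starting point is a debug flow trace filtered on the gateway's traffic. A minimal sketch, with x.x.x.x standing in for the gateway's IP (adjust the filter and trace count to your setup):

```
diagnose debug reset
diagnose debug flow filter addr x.x.x.x
diagnose debug flow filter port 443
diagnose debug flow show function-name enable
diagnose debug flow trace start 100
diagnose debug enable
```

Reproduce the disconnect, then stop the trace with "diagnose debug disable".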

 

Thanks and best regards

1 Solution
hbac

@chr1zzo,

 

Is the on-prem data gateway connected directly to the FortiGate or through a switch? You can connect it directly to rule out a switch issue.

 

Are you using SD-WAN? It would be useful to collect a packet capture when it is disconnecting.
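A capture at the moment of the drop can be taken with the FortiGate's built-in sniffer; a sketch, assuming x.x.x.x is the gateway's IP (verbosity 4 prints interface names, count 0 runs until stopped with Ctrl-C):

```
diagnose sniffer packet any "host x.x.x.x and port 443" 4 0 l
```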

 

Regards, 


5 REPLIES
hbac
Staff

Hi @chr1zzo,

 

What is the FortiGate firmware version? Do you notice high CPU/memory usage when the connection is lost? Does it only happen with Office 365?
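For reference, a CPU/memory/session snapshot can be pulled on the FortiGate CLI with:

```
get system performance status
```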

 

Regards, 

chr1zzo
New Contributor II

Hi @hbac ,

 

There are two FortiGates (601F) in an active-passive cluster.

 

The Firmware is "Version: FortiGate-601F v7.0.13,build0566,231024 (GA.M)"

 

I haven't noticed any CPU/memory peaks.

CPU states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU0 states: 1% user 0% system 0% nice 99% idle 0% iowait 0% irq 0% softirq
CPU1 states: 0% user 1% system 0% nice 99% idle 0% iowait 0% irq 0% softirq
CPU2 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU3 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU4 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU5 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU6 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU7 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU8 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU9 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU10 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
CPU11 states: 0% user 0% system 0% nice 100% idle 0% iowait 0% irq 0% softirq
Memory: 16394768k total, 4162252k used (25.4%), 11440372k free (69.8%), 792144k freeable (4.8%)
Average network usage: 68861 / 68984 kbps in 1 minute, 69618 / 69476 kbps in 10 minutes, 66378 / 66171 kbps in 30 minutes
Maximal network usage: 163840 / 165348 kbps in 1 minute, 244627 / 242009 kbps in 10 minutes, 244627 / 246152 kbps in 30 minutes
Average sessions: 7916 sessions in 1 minute, 6863 sessions in 10 minutes, 6278 sessions in 30 minutes
Maximal sessions: 8424 sessions in 1 minute, 8434 sessions in 10 minutes, 8434 sessions in 30 minutes
Average session setup rate: 81 sessions per second in last 1 minute, 91 sessions per second in last 10 minutes, 90 sessions per second in last 30 minutes
Maximal session setup rate: 150 sessions per second in last 1 minute, 207 sessions per second in last 10 minutes, 256 sessions per second in last 30 minutes
Average NPU sessions: 1005 sessions in last 1 minute, 855 sessions in last 10 minutes, 900 sessions in last 30 minutes
Maximal NPU sessions: 1062 sessions in last 1 minute, 1113 sessions in last 10 minutes, 1390 sessions in last 30 minutes
Average nTurbo sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 0 sessions in last 30 minutes
Maximal nTurbo sessions: 0 sessions in last 1 minute, 0 sessions in last 10 minutes, 1 sessions in last 30 minutes
Virus caught: 0 total in 1 minute
IPS attacks blocked: 0 total in 1 minute

Office 365 is the only service affected so far. There are established sessions, but the connection seems not to be there:

 

session info: proto=6 proto_state=06 duration=8 expire=2 timeout=3600 flags=00000000 socktype=0 sockport=0 av_idx=0 use=4
origin-shaper=
reply-shaper=
per_ip_shaper=
class_id=0 ha_id=0 policy_dir=0 tunnel=/ vlan_cos=0/255
state=log may_dirty npu synced f00
statistic(bytes/packets/allow_err): org=563/7/1 reply=2367/3/1 tuples=2
tx speed(Bps/kbps): 0/0 rx speed(Bps/kbps): 0/0
orgin->sink: org pre->post, reply pre->post dev=59->9/9->59 gwy=x.x.x.x/0.0.0.0
hook=post dir=org act=snat x.x.x.x:51566->x.x.x.x:80(x.x.x.x:51566)
hook=pre dir=reply act=dnat x.x.x.x:80->x.x.x.x:51566(x.x.x.x:51566)
pos/(before,after) 0/(0,0), 0/(0,0)
src_mac=x:x:x:x:x:x
misc=0 policy_id=22 pol_uuid_idx=1181 auth_info=0 chk_client_info=0 vd=0
serial=013d1fda tos=ff/ff app_list=0 app=0 url_cat=0
rpdb_link_id=80000000 ngfwid=n/a
npu_state=0x4000c00 ofld-O ofld-R
npu info: flag=0x81/0x81, offload=9/9, ips_offload=0/0, epid=130/176, ipid=176/130, vlan=0x0028/0x0000
vlifid=176/130, vtag_in=0x0028/0x0000 in_npu=1/1, out_npu=1/1, fwd_en=0/0, qid=6/4

 

session info: proto=6 proto_state=06 duration=9 expire=2 timeout=3600 flags=00000000 socktype=0 sockport=0 av_idx=0 use=4
origin-shaper=
reply-shaper=
per_ip_shaper=
class_id=0 ha_id=0 policy_dir=0 tunnel=/ vlan_cos=0/255
state=log may_dirty npu synced f00
statistic(bytes/packets/allow_err): org=707/10/1 reply=2367/3/1 tuples=2
tx speed(Bps/kbps): 0/0 rx speed(Bps/kbps): 0/0
orgin->sink: org pre->post, reply pre->post dev=59->9/9->59 gwy=x.x.x.x/0.0.0.0
hook=post dir=org act=snat x.x.x.x:51563->x.x.x.x:80(x.x.x.x:51563)

hook=pre dir=reply act=dnat x.x.x.x:80->x.x.x.x:51563(x.x.x.x:51563)
pos/(before,after) 0/(0,0), 0/(0,0)
src_mac=x:x:x:x:x:x
misc=0 policy_id=22 pol_uuid_idx=1181 auth_info=0 chk_client_info=0 vd=0
serial=013d1fc9 tos=ff/ff app_list=0 app=0 url_cat=0
rpdb_link_id=80000000 ngfwid=n/a
npu_state=0x4000c00 ofld-O ofld-R
npu info: flag=0x81/0x81, offload=9/9, ips_offload=0/0, epid=130/176, ipid=176/130, vlan=0x0028/0x0000
vlifid=176/130, vtag_in=0x0028/0x0000 in_npu=1/1, out_npu=1/1, fwd_en=0/0, qid=1/6

 

session info: proto=6 proto_state=01 duration=178 expire=3595 timeout=3600 flags=00000000 socktype=0 sockport=0 av_idx=0 use=4
origin-shaper=
reply-shaper=
per_ip_shaper=
class_id=0 ha_id=0 policy_dir=0 tunnel=/ vlan_cos=0/255
state=log may_dirty npu synced f00
statistic(bytes/packets/allow_err): org=3999/31/1 reply=13522/37/1 tuples=2
tx speed(Bps/kbps): 23/0 rx speed(Bps/kbps): 47/0
orgin->sink: org pre->post, reply pre->post dev=59->9/9->59 gwy=x.x.x.x/0.0.0.0
hook=post dir=org act=snat x.x.x.x:51558->x.x.x.x:443(x.x.x.x:51558)
hook=pre dir=reply act=dnat x.x.x.x:443->x.x.x.x:51558(x.x.x.x:51558)
pos/(before,after) 0/(0,0), 0/(0,0)
src_mac=x:x:x:x:x:x
misc=0 policy_id=22 pol_uuid_idx=1181 auth_info=0 chk_client_info=0 vd=0
serial=013cc7ef tos=ff/ff app_list=0 app=0 url_cat=0
rpdb_link_id=80000000 ngfwid=n/a
npu_state=0x4000c00 ofld-O ofld-R
npu info: flag=0x81/0x81, offload=9/9, ips_offload=0/0, epid=130/176, ipid=176/130, vlan=0x0028/0x0000
vlifid=176/130, vtag_in=0x0028/0x0000 in_npu=1/1, out_npu=1/1, fwd_en=0/0, qid=7/6
total session 3
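For reference, session entries like the above can be pulled with the session filter commands; a sketch, with x.x.x.x standing for the Office 365 endpoint:

```
diagnose sys session filter clear
diagnose sys session filter dst x.x.x.x
diagnose sys session filter dport 443
diagnose sys session list
```

In the output, proto_state=01 is an established TCP session, while the proto_state=06 entries with expire=2 are sessions already in teardown.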

chr1zzo
New Contributor II

I saw something strange while performing a debug flow trace:

id=20085 trace_id=5996 func=init_ip_session_common line=6117 msg="allocate a new session-013b9d00, tun_id=0.0.0.0"
id=20085 trace_id=5996 func=__vf_ip_route_input_rcu line=1999 msg="find a route: flag=00000000 gw-x.x.x.x via port3"
id=20085 trace_id=5996 func=get_new_addr line=1227 msg="find SNAT: IP-x.x.x.x(from IPPOOL), port-51547"
id=20085 trace_id=5996 func=fw_forward_handler line=888 msg="Allowed by Policy-22: SNAT"
id=20085 trace_id=5996 func=ip_session_confirm_final line=3185 msg="npu_state=0x4000000, hook=4"
id=20085 trace_id=5996 func=__ip_session_run_tuple line=3529 msg="SNAT x.x.x.x->x.x.x.x:51547"
[2062] fap_fsw_lst_req: buf of https is too small: 581

 

792831: [2062] fap_fsw_lst_req: buf of https is too small: 853 debug message appears in console when upgrading to certain builds.

 

But I can't see how it connects to my issue. Could this be related?


chr1zzo
New Contributor II

Got it.

There are several FortiSwitches behind it, managed by the FortiGate, and I think someone configured LACP directly on the switches and the config got lost.

So it was a mix of misconfigured/missing LACP config and spanning tree.

Thanks a lot for the hint. It would have taken a while to discover.
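For anyone hitting something similar: on the FortiGate side, LAG member state and managed FortiSwitch connectivity can be checked with commands like these ("lag1" is a placeholder for your aggregate interface name):

```
diagnose netlink aggregate list
diagnose netlink aggregate name lag1
execute switch-controller get-conn-status
```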
