andybarker
New Contributor II

FortiOS 7.4.2 Bug Causes IPsec VPN Tunnel Phase 2 Instability

I had many site-to-site IPsec tunnels working fine for several years until I upgraded to FortiOS 7.4.2. Shortly afterward, traffic began dropping on random Phase 2 connections. Many times I have had to bring down the affected Phase 2 or the entire tunnel to get traffic flowing again. I opened a ticket with Fortinet and worked with three technicians at various times, but none found a solution.

 

I finally downgraded to 7.4.1 and all my problems went away. There is obviously a bug in 7.4.2 and I hope Fortinet finds and acknowledges it and fixes it for the next release.

BillH_FTNT
Staff

Hi andybarker,

 

Please share more details about your network and device. What is your device version? What kind of traffic runs through your VPN tunnel? If it is okay with you, please share the ticket number so we can take a look at your configuration. Thanks.

Regards

Bill

andybarker

Thanks for your reply, Bill. We have two 200Fs in HA Active-Passive mode on both ends of the tunnel. This tunnel had been in place and working flawlessly for over two years. I worked with three different support techs on ticket 9118289. They tried many things, including disabling DPD and replay detection on the Phase 2s. They also disabled NPU offload. None of that helped at all. Only downgrading to 7.4.1 fixed everything.

BillH_FTNT

Hi Andybarker

 

1. Check for replay error counters on the NPU. The command depends on your NP model, so you can run both to be sure:

diag npu np6 dce 0

diag npu np6xlite dce 0

 2. Check the RX errors (rxe) reported on your IPsec interface. Look at the "rxe" counter in the "stat" line:

           diag netlink interface list xxx

 3. You can test with anti-replay disabled:

config vpn ipsec phase2(-interface)

 edit …

 set replay { enable* | disable } 

 next

end

4. To check anti-replay, use the command below. replaywin=0 means anti-replay is disabled; any other value means it is still enabled.

diag vpn tunnel list name "xxx"

 

5. You can try disabling ipsec-inbound-cache. 

config system npu

 set ipsec-inbound-cache disable

end

 6. Try reducing the MTU:

FGT # config system interface

FGT (interface) # edit vpninterface

FGT (vpninterface) # set mtu-override enable

FGT (vpninterface) # set mtu 1300

FGT (vpninterface) # end
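
As a rough check on how much headroom an MTU of 1300 leaves, the sketch below estimates the largest inner packet that fits in a single ESP tunnel-mode packet. It assumes IPv4, an AES-CBC style cipher (16-byte IV and block size) and a 16-byte ICV; the function name and defaults are illustrative only, and the proposals actually negotiated on a given tunnel may differ.

```python
def max_inner_packet(path_mtu, iv_len=16, icv_len=16, block=16, nat_t=False):
    """Largest inner IP packet that fits in one ESP tunnel-mode packet.

    Assumed defaults: AES-CBC (16-byte IV, 16-byte block) with a 16-byte ICV.
    """
    outer_ip = 20             # outer IPv4 header without options
    udp = 8 if nat_t else 0   # UDP header added by NAT traversal
    esp_header = 8            # SPI + sequence number
    room = path_mtu - outer_ip - udp - esp_header - iv_len - icv_len
    padded = (room // block) * block  # encrypted part is padded to the block size
    return padded - 2         # ESP trailer: pad-length byte + next-header byte


print(max_inner_packet(1500))              # 1438
print(max_inner_packet(1500, nat_t=True))  # 1422
```

Under these assumptions, 1300 sits well below either limit on a 1500-byte path, so it is a conservative test value rather than a tuned one.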

 

Please apply these outside business hours, then monitor the result and roll back after testing.
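
For the monitoring part, a small script can pull the counters mentioned in steps 2 and 4 out of saved CLI output. This is only a sketch: it assumes the output contains tokens of the form `replaywin=<number>` and `rxe=<number>`, as described above, and the helper names are made up for illustration. The sample strings are minimal placeholders, not real FortiGate output.

```python
import re


def counter_values(cli_output, key):
    """Return every integer found as `key=<number>` in pasted CLI output."""
    pattern = re.compile(r"\b" + re.escape(key) + r"=(\d+)")
    return [int(m.group(1)) for m in pattern.finditer(cli_output)]


def anti_replay_enabled(tunnel_list_output):
    """True if any replaywin value is non-zero, False if all are 0,
    None if no replaywin token was found at all."""
    values = counter_values(tunnel_list_output, "replaywin")
    if not values:
        return None
    return any(v != 0 for v in values)


def rxe_increase(before, after):
    """Difference between the first rxe counter in two snapshots."""
    b = counter_values(before, "rxe")
    a = counter_values(after, "rxe")
    if not b or not a:
        return None
    return a[0] - b[0]


# Placeholder snippets standing in for saved command output.
print(anti_replay_enabled("replaywin=0"))
print(rxe_increase("rxe=3", "rxe=17"))
```

Taking one snapshot before a change and one after a stall makes it easier to tell whether the rxe counter is actually climbing.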

HTH

Bill

BrettAtNOAH

We have FortiGate 200Fs at all of our locations.  We recently upgraded to 7.4.3 and are having the same VPN tunnel issues.

 

What's worse is that I don't have any npu configuration options in the CLI at all.  I also can't enable diagnose commands.  It errors every time I try to enable it.  So I can't make any of your suggested changes.

wendelin
New Contributor II

Hi andybarker, hi Bill,

I experienced the exact same behaviour on an FG101F running OS 7.4.2 talking to a Sophos appliance. To restart the data flow I have to bring down the stalled Phase 2 tunnel manually (!); the tunnel then restarts automatically.

The FG did not recognize the stopped data flow by itself; there is no automatic detection/restart, and the subtunnel is shown as green until it is stopped manually.

We have 5 Phase 2 tunnels active on that connection; only the most used one stops working from time to time. The time between stops varies from a minute to several hours.
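
Because the stalled Phase 2 still shows as up, any detection has to come from probing traffic through it. The sketch below shows one possible way to turn a log of probe results into stall intervals. How the probes are collected (for example, pings to a host behind the affected Phase 2) is left out, and the function name and the three-failure threshold are assumptions for illustration.

```python
def find_stalls(samples, min_failures=3):
    """Find runs of consecutive failed probes.

    samples: list of (timestamp_seconds, ok) tuples in time order.
    Returns a list of (first_failure_ts, last_failure_ts) for every run
    of at least `min_failures` failed probes in a row.
    """
    stalls = []
    run = []  # timestamps of the current run of failures
    for ts, ok in samples:
        if ok:
            if len(run) >= min_failures:
                stalls.append((run[0], run[-1]))
            run = []
        else:
            run.append(ts)
    # A stall that is still ongoing at the end of the log also counts.
    if len(run) >= min_failures:
        stalls.append((run[0], run[-1]))
    return stalls


probes = [(0, True), (10, False), (20, False), (30, False), (40, True), (50, False), (60, True)]
print(find_stalls(probes))  # [(10, 30)]
```

Requiring several failures in a row avoids flagging a single lost probe as a stall; the threshold would need tuning to the probe interval.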

The same behaviour has been seen (but only a few times) on another IPsec tunnel talking to an FG60E (OS 7.0.13).

regards

Michael

maulishshah

@wendelin, can you please confirm whether you are still on version 7.4.2?

 

If yes, could you please disable NPU offloading for the affected tunnel (the npu-offload setting is on its phase1-interface) and monitor whether the issue persists?

 

config vpn ipsec phase1-interface

edit <name of phase1>

     set npu-offload disable

end

 

If the issue is resolved after this change, it means the problem is with the NPU, and we will need to collect logs for it.

Maulish Shah
andybarker
New Contributor II

Thanks for this, wendelin! You seem to be describing exactly what we experienced shortly after moving to 7.4.2. It was extremely frustrating!

wendelin
New Contributor II

@maulish, yes, I am still on 7.4.2, collecting information to figure out what's going on there.
I am unable to touch the FG while many colleagues are using it ...
@andybarker, yes, it's frustrating to work as a human watchdog.

 

Edit 01-30:

5 days ago, I followed andybarker's way and downgraded to 7.4.1.

No more problems observed until now.

maulishshah
Staff

@andybarker, could you please share your Fortinet Case number to get more information?

 

Thank you. 

Maulish Shah