severach
New Contributor

Site to site IPSec VPN Tunnel failure on restart

I have a pair of Fortigate 60 units running 3.0 MR7 Patch 2, with a site-to-site IPSec VPN set up between them. The tunnel works. If I restart one of the routers, one or both of the routers are unable to bring up the tunnel until the phase 1 keylife expires on the router that didn't restart. I can edit the phase 1 on the router that didn't restart and then bring up the tunnel from either router.

Debug logs show that one router believes the tunnel is up and the other believes it is down. When the tunnel won't come up, the router with the up tunnel refuses to discard the stuck SA. The router with the up tunnel may or may not discard the connection from the router with the down tunnel as unsolicited. The router with the down tunnel never gets a connection attempt because the router with the up tunnel communicates only over the stuck SA. Editing the phase 1 discards the phase 1 SA, after which I can bring up the tunnel from either router.

I have other Fortigate routers with a variety of firmware from 2.80 to 3.00 and all have the same IPSec VPN problem. A short keylife, DPD, auto-negotiate, and autokey keep alive are not acceptable solutions to this problem. I need Fortigate tunnels to be as reliable as Netscreen and Linksys tunnels, which don't have this problem.
emnoc
Esteemed Contributor III

DPD, auto-negotiate, and autokey keep alive are not acceptable solutions to this problem. I need Fortigate tunnels to be as reliable as Netscreen and Linksys tunnels which don't have this problem.
So why are these not acceptable solutions? Also, what does your VPN debug output show when the connection is down?

PCNSE NSE StrongSwan
red_adair
New Contributor III

#1 First it would be important to understand _why_ the VPN goes down.
#2 I don't understand why DPD is not a solution. DPD IMHO always is the choice for proper VPN handling. Despite #1, half-dead VPNs are a common problem if not using DPD.
#3 I'd upgrade to MR7_recent patch. (patch2 looks rather old)
-R.
severach

#1 First it would be important to understand _why_ the VPN goes down.
The tunnel goes down because there's a monkey at the controls pressing reset. Why it won't come up on command until the monkey fixes it by hand is the problem. The real problem is that I have tunnels with real peers that fail to come up several times per week. After fixing them by hand while only able to see one debug log at a time, I bought another Fortigate for home so I could see both debug logs and verify that this isn't a multi-vendor problem. There are several distinct problems, but I've reduced the major one to this simple demonstration between two Fortigates so others can easily replicate it. My logs don't explain why my peers drop the tunnels without notification. They won't admit to or fix anything, so the tunnels just keep going down. The Pix 501 that the Fortigate replaced handled whatever the problem is just fine.
#2 I don't understand why DPD is not a solution. DPD IMHO always is the choice for proper VPN handling. Despite #1, half-dead VPNs are a common problem if not using DPD.
Have you watched the debug logs? DPD spams the other router and the debug logs every 10-60 seconds. That would be fine for a VPN where the DPD traffic would be dwarfed by the real traffic, but my traffic is tiny files a couple of times per day. DPD would account for more than 99.9% of the traffic. Some of my peers do not support DPD and the tunnels won't work at all if I enable DPD. All other brands I've tested recover from a stuck SA without so much as a second try, and I'm steadily obtaining more to test. My peers, none of which use Fortigate, think DPD is not needed, since none have enabled it. The Cisco Pix 501 that I replaced only produced a stuck SA when I made changes, and no tunnel had DPD. Why does Fortigate require DPD just to have tunnels that work? DPD is for quickly diverting continuous-use, high-priority traffic down an alternate tunnel when the main tunnel goes down. DPD is not for coaxing functional tunnels from a defective IPSec implementation.
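To put rough numbers on the DPD overhead described above, here is a minimal sketch. It counts events only, not bytes, and it assumes a probe goes out at every interval on an otherwise idle tunnel and that "a couple of times per day" means two transfers. The function names are illustrative and not part of any Fortinet tool.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def probes_per_day(interval_s: int) -> int:
    """DPD probes sent in one day at a fixed interval (assumes an idle tunnel)."""
    return SECONDS_PER_DAY // interval_s


def probe_share(interval_s: int, transfers_per_day: int) -> float:
    """Fraction of daily tunnel events that are DPD probes, by event count."""
    probes = probes_per_day(interval_s)
    return probes / (probes + transfers_per_day)


if __name__ == "__main__":
    # The 10 s and 60 s intervals come from the post; 2 transfers/day is assumed.
    for interval in (10, 60):
        share = probe_share(interval, transfers_per_day=2)
        print(f"{interval:>2}s interval: {probes_per_day(interval)} probes/day, "
              f"{share * 100:.2f}% of events")
```

By byte count the ratio depends on the file sizes, which the post does not give, so this only shows how often the peer and the debug log would see a probe.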
#3 I'd upgrade to MR7_recent patch. (patch2 looks rather old)
I have many Fortigate routers with FortiOS versions spanning several years, including some very new ones. The behavior is identical. If Fortinet thought there was a problem it would have been fixed by now, so they must think it's not a problem.
So why are these not acceptable solutions?
All tunnels failed regularly when I copied the settings over from the Pix. I reviewed the documentation and obtained the correct keylife from each peer admin, and this improved many tunnels. I studied the debug logs recorded over several days and discovered that the keylife the peers report is not always correct. A tunnel with a Netscreen was supposed to have a 3600 second phase 2, but I noticed in the logs that it was expiring every 3300 seconds. Odd corrections like these made more tunnels reliable. Then I discovered the auto connect options, and almost all tunnels were completely reliable. No matter how well I correct the mismatches and how many options I turn on, a few tunnels still fail. More options only made them fail less often. With all the options turned on, the tunnels still fail at least once per week. Some days I get many calls. The Pix had rock solid tunnels with outrageous keylife mismatches and no special settings. The Fortigate does not, even when the keylife is carefully tuned and the settings are maxed out.
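The 3600 versus 3300 second discrepancy above was found by reading the logs by hand. A minimal sketch of the same check is below, assuming the phase 2 rekey times have already been pulled out of the debug log as plain numbers of seconds. The sample timestamps are hypothetical, and the function names are made up for illustration.

```python
from collections import Counter


def rekey_intervals(timestamps):
    """Seconds between consecutive phase 2 rekey events."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def observed_keylife(timestamps):
    """Most common interval between rekeys, or None with fewer than two events."""
    intervals = rekey_intervals(timestamps)
    if not intervals:
        return None
    return Counter(intervals).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical rekey times, in seconds since the start of the capture.
    sample = [0, 3300, 6600, 9900]
    configured = 3600  # the keylife the peer admin reported
    observed = observed_keylife(sample)
    if observed != configured:
        print(f"configured {configured}s, but the peer rekeys every {observed}s")
```

Taking the most common interval rather than the average keeps one manual reset or outage in the capture from skewing the result.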
Also, what does your VPN debug output show when the connection is down?
The router reboots without notifying the other router that the tunnels are going down. The non-rebooted router still has an active phase 1 and phase 2. When the rebooted router recovers, the non-rebooted router continues to send traffic through the orphaned phase 2 SA for the remainder of its life. I can bring down the phase 2 but it won't come back up: the non-rebooted router sends phase 2 initiations through the orphaned phase 1 for the remainder of its life. diagnose vpn tunnel flush brings down all phase 2 SAs but does not bring down phase 1. Since Fortinet doesn't give us observation and control of phase 1, I must edit the phase 1 to destroy all of the phase 1 and phase 2 SAs. Then the tunnels will come up on request.
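The sequence described above can be written down as a small state model. This is only a sketch of the behavior as reported in this thread, not of FortiOS internals. The class and method names are hypothetical, and the rule that negotiation fails whenever the two sides disagree about phase 1 is a simplification of "one or both of the routers are unable to bring up the tunnel".

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    phase1_up: bool = False
    phase2_up: bool = False

    def reboot(self) -> None:
        # A reboot loses every SA and, per the post, sends no delete notification.
        self.phase1_up = False
        self.phase2_up = False

    def flush_phase2(self) -> None:
        # Models "diagnose vpn tunnel flush": phase 2 is removed, phase 1 stays.
        self.phase2_up = False

    def edit_phase1(self) -> None:
        # Models editing the phase 1 config, which destroys all SAs.
        self.phase1_up = False
        self.phase2_up = False


def bring_up(initiator: Peer, responder: Peer) -> bool:
    """Try to bring the tunnel up. Returns True if both ends agree afterwards."""
    if initiator.phase1_up != responder.phase1_up:
        # Simplification: one side still holds an orphaned phase 1 SA,
        # so the two ends cannot agree and the tunnel stays down.
        return False
    for peer in (initiator, responder):
        peer.phase1_up = True
        peer.phase2_up = True
    return True


if __name__ == "__main__":
    a, b = Peer("A"), Peer("B")
    print("initial:", bring_up(a, b))
    b.reboot()
    print("after B reboots:", bring_up(a, b))
    a.flush_phase2()
    print("after flush on A:", bring_up(b, a))
    a.edit_phase1()
    print("after editing phase 1 on A:", bring_up(b, a))
```

In this model the only way out of the stuck state is the step that clears phase 1 on the router that did not reboot, which matches the workaround above.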