VPN thinks it's dead when it's not, causing WAN traffic to suddenly stop.
We've been having problems shifting over to our new WAN environment, which consists of FortiGate routers. The VPN connecting the branches seems to just suddenly drop dead even though the underlying network is alive.
For clarification, below is the environment that we're trying to shift over to.
- 3 branches (I'll just call them A, B, C), with Branch A being the main office
B<>A<>C (B and C only have VoIP traffic going toward each other)
- 2 FortiGate routers in each branch, running in Active/Standby mode
- The traffic that's flowing is varied, including HTTPS, VoIP, RDP, etc.
- The FortiGate model we are using is the 60F
- We have had to upgrade FortiOS numerous times.
- We have reverted to our old setup about 3 times per branch
First Incident
We were using FortiOS 6.4.8 at all branches, with Policy Based Routing (PBR) as well as SD-WAN. When the initial turnover to the new FortiGates was executed at all 3 branches, it seemed to be working fine. That was until that night, when the traffic just stopped flowing and the VPN seemed to just die, even though the underlying internet connection (2 x 1Gbps) was alive. After contacting support, we were told there was a bug with PBR on the version we were using (6.4.8), so we updated to 7.0.
Second Incident
After taking some time to update the OS to 7.0, we decided not to use PBR for the second turnover and just use SD-WAN with normal routing configured on the routers. The second turnover seemed to execute fine, just like the first, but after some time it died again in the same way. For this turnover, we only shifted the environment from Branch A to Branch B; Branch C was still using the old WAN setup.
Third Incident
This time the OS was still 7.0, but we stopped using SD-WAN. It seemed fine after shifting to the all-FortiGate environment, but after some time it died. For this turnover, we only shifted the environment from Branch A to Branch B; Branch C was still using the old WAN setup.
Fourth Incident
We suspected that 7.0 either had a bug or that our units came from a faulty lot. Since we can't tell whether it is a faulty lot, we decided to revert to version 6.4.8, and this time we did not use PBR or SD-WAN. After executing the shift, it seemed to work fine, but after some time (usually a few hours to a few days) the VPN died (underlying internet was alive) and the traffic stopped flowing. For this turnover, we only shifted the environment from Branch A to Branch B; Branch C was still using the old WAN setup.
Fifth Incident (kind of)
This time we updated FortiOS to 6.4.9 at branches B and A, and we decided not to send one particular type of traffic between A and B: video data. After making the change to the Forti setup, it seemed to work fine. After 2 weeks it's still working fine, so we figured it's that traffic that's causing the down state, but we haven't looked into it much yet.
Sixth Incident
Since we were able to make the change to the Forti setup between branches A and B, we figured A and C would work fine with the same approach. We made the shift to the Forti setup between A and C just TODAY. After the turnover it seemed to work for a few hours, but about an hour ago the VPN died again (again, the underlying internet connection was alive).
We seriously have no idea what the hell is happening. We went as far as not using any fancy features and just using the FortiGates as they were intended: routing with a bit of VPN for WAN usage.
If anyone can give us any feedback on what could be happening, we'd appreciate it. I can give you guys more information if needed. If this keeps happening, we cannot make our change to all FortiOS routers between branches.
Thanks in advance.
Edit:
When I say the VPN dies, I mean the VPN connection still seems to be alive, but after some time there is no traffic going through it whatsoever, and we don't know what triggers it.
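To pin down exactly when this happens, something like the rough Python sketch below could run on a host at one branch and log whether the internet path and the tunnel path each answer a ping. This is only an illustration, not something we have deployed: the function names are made up, the two addresses in the usage note are placeholder examples, and the ping flags assume Linux (iputils) or Windows.

```python
import platform
import subprocess
from datetime import datetime, timezone


def ping(host: str, timeout_s: int = 2) -> bool:
    """Return True if a single ICMP echo to `host` gets a reply.

    Assumption: Linux (iputils) or Windows ping flags. macOS treats -W
    as milliseconds, so the flags would need adjusting there.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 3,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def classify(underlay_ok: bool, overlay_ok: bool) -> str:
    """Map the two probe results to a label (labels are hypothetical)."""
    if underlay_ok and overlay_ok:
        return "healthy"
    if underlay_ok:
        # Internet path answers but nothing crosses the tunnel:
        # the symptom described in this post.
        return "tunnel-blackhole"
    if overlay_ok:
        # Tunnel works but the WAN probe failed, e.g. ICMP is filtered.
        return "underlay-probe-failed"
    return "underlay-down"


def check_once(underlay_host: str, overlay_host: str, probe=ping) -> str:
    """Probe both paths once and return a timestamped log line."""
    state = classify(probe(underlay_host), probe(overlay_host))
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} underlay={underlay_host} overlay={overlay_host} state={state}"
```

For example, printing `check_once("203.0.113.10", "10.0.2.1")` once a minute from cron, where the first address stands in for the remote branch's public WAN IP and the second for a host that is only reachable through the VPN, would give a timeline showing the moment the state flips to `tunnel-blackhole`, which can then be lined up with the FortiGate logs.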