kelv1n
New Contributor

Site-2-Site VPN, slow in 1 direction

Hi Guys

 

We have two locations, let's call them Site A and Site B, each with a 1Gbps link and a pair of 200Ds in HA. The sites are connected via an IPsec VPN.

 

Traffic flowing from Site A to Site B runs at about 500Mbps, but traffic flowing the other way (B to A) only hits 100-200Mbps.

 

Any idea what could be causing this?

 

This is definitely config related: I've tested it by resetting and building a new config for Site A, which got traffic flowing at 500Mbps both ways, but I couldn't identify which settings had been causing it. Everything was working fine until a few days ago, when we experienced a VPN issue (traffic started intermittently timing out). We had to fix it quickly, so we just rebuilt the VPN, and this bottleneck started reoccurring.

 

The VPN is set up with:

 

Phase 1 -

Preshared Key

IKE Version 1

Mode Main

Propose Algorithms - DES-SHA1

DH Groups 2,1

 

Phase 2 -

Propose Algorithms - DES-SHA1

Enable Replay Detection: Off

PFS: Off

Autokey Keep Alive: Off

Auto-negotiate: Off

 

I know these are not ideal from a security standpoint, but they are just for testing.

 

This is not policy related, as the VPN policies are at the top and currently have no UTM profiles active on them.

 

I've seen other people report similar issues, but the recommendation to them was to adjust the NPU settings, and I'm not sure how applicable those are to the 200 series as it has the NP lite processor.

 

Any thoughts on what might be causing this? Or how to diagnose it?

kelv1n
New Contributor

The actual config is as follows -

 

config vpn ipsec phase1-interface
    edit "VPN"
        set interface "wan1"
        set local-gw <SITE-A Public Addr>
        set proposal des-sha1
        set dhgrp 2 1
        set remote-gw <SITE-B Public Addr>
        set psksecret ENC KEY
    next
end

config vpn ipsec phase2-interface
    edit "VPN"
        set phase1name "VPN"
        set proposal des-sha1
        set pfs disable
        set replay disable
    next
end

rwpatterson
Valued Contributor III

As a further troubleshooting test, have you run speed tests (speedtest.net, for example) from both sites? Are the speeds comparable?

Bob - self proclaimed posting junkie!
See my Fortigate related scripts at: http://fortigate.camerabob.com

kelv1n

Hi Bob

 

Yes, I should have stated how I'm testing this.

 

I've set up a VIP at each site pointing to a Linux server at each end, so the Linux servers are accessible both across the VPN and via a public VIP. I then use iperf to test the throughput.
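
For reference, the VIPs are just plain static NAT mappings, roughly along these lines (the object name, interface and addresses here are placeholders rather than the exact config), plus a matching inbound policy:

config firewall vip
    edit "IPERF-TEST-VIP"
        set extip <Site Public Test Addr>
        set extintf "wan1"
        set mappedip <Internal Linux Server Addr>
    next
end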

 

Last week, pushing traffic via both the VIP and the VPN, I got 400-500Mbps in both directions. I still get 400-500Mbps via the VIP, but since reconfiguring the VPN I get anywhere between 50-200Mbps across it.

 

I don't know if it's relevant, but iperf reports a TCP window size of 85k going via the VIP and 45k going via the VPN. Also, I can get the VPN to 350-400Mbps when I do a parallel send, i.e. 6-8 threads sending/receiving concurrently.
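
In case it matters, the iperf runs themselves are nothing special, roughly like this (hostnames are placeholders):

# on the Linux server at the far end
iperf -s

# single stream from the near end, 60 second run
iperf -c <remote-server> -t 60

# parallel send, 8 concurrent streams
iperf -c <remote-server> -t 60 -P 8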

ashukla_FTNT

kelv1n wrote:

 

I don't know if it's relevant, but iperf reports a TCP window size of 85k going via the VIP and 45k going via the VPN. Also, I can get the VPN to 350-400Mbps when I do a parallel send, i.e. 6-8 threads sending/receiving concurrently.

This may be because the VPN interface MTU is typically 1436, whereas if you are just sending traffic straight out the WAN the typical MTU will be 1500.

If your PC is doing path MTU discovery, it will detect the lower MTU across the VPN, and eventually that will cause the PC to use a smaller TCP window size.

As per RFC 1191:

TCP performance can be reduced if the sender's maximum window size is not an exact multiple of the segment size in use (this is not the congestion window size, which is always a multiple of the segment size). In many systems (such as those derived from 4.2BSD), the segment size is often set to 1024 octets, and the maximum window size (the "send space") is usually a multiple of 1024 octets, so the proper relationship holds by default. If PMTU Discovery is used, however, the segment size may not be a submultiple of the send space, and it may change during a connection; this means that the TCP layer may need to change the transmission window size when PMTU Discovery changes the PMTU value. The maximum window size should be set to the greatest multiple of the segment size (PMTU - 40) that is less than or equal to the sender's buffer space size.
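
You can also confirm the lower path MTU across the tunnel yourself with a don't-fragment ping from the Linux box. The sizes below assume the usual 1436/1500 figures, so adjust them for your tunnel:

# 1408 byte payload + 28 bytes of IP/ICMP headers = 1436, should pass across the VPN
ping -M do -s 1408 <remote host via VPN>

# 1472 + 28 = 1500, should pass on the direct WAN/VIP path but fail across the VPN
ping -M do -s 1472 <remote host via VPN>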

 

So, as emnoc correctly pointed out, this is a limitation of TCP, and path MTU discovery is the actual bottleneck.

You can do the following two things (a rough sketch of both is below):

1) Do a UDP-based test

2) Disable path MTU discovery
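
On the Linux test boxes that would look roughly like this (the bandwidth and run length are just examples):

# 1) UDP test: server side, then client side pushing 600Mbps for 120 seconds
iperf -s -u
iperf -c <remote-server> -u -b 600M -t 120

# 2) Disable path MTU discovery on the sending Linux host for the duration of the test
sysctl -w net.ipv4.ip_no_pmtu_disc=1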

 

Hope this helps.

emnoc
Esteemed Contributor III

If you're using iperf for benchmarking, I would avoid TCP, imho. Re-run these tests using UDP on a slow night, maybe 5x 120-second runs, and I will bet you will have a different outcome. Try to do this maybe after 12:00PM.

 

Also, keep in mind the packets that flow from Site A to Site B might not take the exact same path back. If I had to bet, they probably DO NOT flow the same path.

 

So the path, segment size, and loading of the links locally and in between will all be factors. This is all before you even start looking at TCP sliding windows, window scaling, buffering, etc.

 

just my 2cts.

PCNSE 

NSE 

StrongSwan  

kelv1n
New Contributor

Thanks emnoc.

 

They are all valid points. The connection is very local, on a network run by a local ISP, with only a couple of hops between Site A and Site B, and like I mentioned, the VIP and VPN tests return different results even when they are run immediately after each other.

 

Whilst I can get higher speeds across the VPN running parallel threads, that is not what I was getting previously, before having to redo the VPN config. This link will be carrying DR traffic, so it will typically be just one continuous stream of data rather than traffic split into parallel streams.

 

EDIT -

 

Actually emnoc, that's sound logic, and this is different to last time. I'll try some more out-of-hours tests.

 

But if anybody else has any suggestions, I'm open :)

emnoc
Esteemed Contributor III

bingo

 

Also, it's best to conduct tests both inside and outside the VPN, due to the overhead of ESP and any CP/NPU/CPU processing. The MTU is going to have a direct impact on the TCP MSS during the setup and life of the TCP session. This is why you should use UDP for a pure path benchmark, imho.
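
To put numbers on it: with a 1436 tunnel MTU the usable MSS drops to 1436 - 40 = 1396 bytes, versus 1500 - 40 = 1460 straight out the WAN. If that mismatch does turn out to be the limiting factor, one knob people often use is clamping the MSS on the VPN policies, roughly like this (the policy ID and values are examples only, size them for your actual ESP overhead):

config firewall policy
    edit <vpn-policy-id>
        set tcp-mss-sender 1396
        set tcp-mss-receiver 1396
    next
end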

PCNSE 

NSE 

StrongSwan  

kelv1n
New Contributor

It looks like the NPU is not being used at the Site with the bandwidth issue

 

 

OFFICE-FG-200D-1 # diagnose vpn ipsec status
All ipsec crypto devices in use:
npl
null: 0 0
des: 0 0
3des: 0 0
aes: 0 0
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 0 0
sha256: 0 0
sha384: 0 0
sha512: 0 0
NPU HARDWARE
null: 0 0
des: 0 0
3des: 0 0
aes: 0 0
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 0 0
sha256: 0 0
sha384: 0 0
sha512: 0 0
CP8:
null: 0 0
des: 0 0
3des: 0 0
aes: 132977390 97061173
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 132977512 97061173
sha256: 0 0
sha384: 0 0
sha512: 0 0
SOFTWARE:
null: 0 0
des: 0 0
3des: 0 0
aes: 0 0
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 0 0
sha256: 0 0
sha384: 0 0
sha512: 0 0

 

 

But it is at the other one

 

 

DC-FG-200D-1 # diagnose vpn ipsec status
All ipsec crypto devices in use:
npl
null: 0 0
des: 0 0
3des: 0 0
aes: 74355968 106491008
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 74355968 106491008
sha256: 0 0
sha384: 0 0
sha512: 0 0
NPU HARDWARE
null: 0 0
des: 0 0
3des: 0 0
aes: 1760307 0
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 1760307 0
sha256: 0 0
sha384: 0 0
sha512: 0 0
CP8:
null: 0 0
des: 0 0
3des: 0 0
aes: 19 49937
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 19 49937
sha256: 0 0
sha384: 0 0
sha512: 0 0
SOFTWARE:
null: 0 0
des: 0 0
3des: 0 0
aes: 0 0
aria: 0 0
seed: 0 0
null: 0 0
md5: 0 0
sha1: 0 0
sha256: 0 0
sha384: 0 0
sha512: 0 0

 

Any thoughts on why this would happen? I have explicitly set the phase-1 local-gw value via CLI at each end.
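
For anyone comparing, what I'm checking at each end (exact output fields may vary a little between FortiOS builds) is the per-tunnel NPU flag in the tunnel listing and, if the build exposes it, the offload setting under the phase1-interface:

diagnose vpn tunnel list name VPN

config vpn ipsec phase1-interface
    edit "VPN"
        set npu-offload enable
    next
end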

rwpatterson
Valued Contributor III

Are all interfaces accelerated? (Did you use the same port on each end...)

Bob - self proclaimed posting junkie!
See my Fortigate related scripts at: http://fortigate.camerabob.com
