kelv1n
New Member
March 7, 2015
Question

Site-2-Site VPN, slow in 1 direction


Hi Guys

 

We have 2 locations, let's call them Site A and Site B, each with a 1Gb link and a pair of 200Ds in HA. The sites are connected via an IPsec VPN.

 

Traffic flowing from Site A to Site B runs at about 500Mbps, but traffic flowing the other way (B to A) only reaches 100-200Mbps.

 

Any idea what could be causing this?

 

This is definitely config related: I tested this by resetting Site A and building a new config, which got traffic flowing at 500Mbps both ways, but I couldn't identify which settings had been causing the problem. It was working fine until a few days ago, when we experienced a VPN issue (traffic started intermittently timing out). We had to fix it quickly, so we just rebuilt the VPN, and the bottleneck came back.

 

The VPN is set up with:

 

Phase 1 -

Preshared Key

IKE Version 1

Mode Main

Propose Algorithms - DES-SHA1

DH Groups 2,1

 

Phase 2 -

Propose Algorithms - DES-SHA1

Enable Replay Detection: Off

PFS: Off

Autokey Keep Alive: Off

Auto-negotiate: Off

 

I know these are not ideal from a security standpoint, but they are just for testing.

 

This is not policy related, as the VPN policies are at the top and currently have no UTM active on them.

 

I've seen other people report similar issues, and the recommendation to them was to look at the NPU settings, but I'm not sure how applicable these are to the 200 series, as they have the LITE processor.

 

Any thoughts on what might be causing this? Or how to diagnose it?

    4 replies

    kelv1n (Author)
    New Member
    March 7, 2015

    The actual config is as follows -

     

    config vpn ipsec phase1-interface
        edit "VPN"
            set interface "wan1"
            set local-gw <SITE-A Public Addr>
            set proposal des-sha1
            set dhgrp 2 1
            set remote-gw <SITE-B Public Addr>
            set psksecret ENC KEY
        next
    end

    config vpn ipsec phase2-interface
        edit "VPN"
            set phase1name "VPN"
            set proposal des-sha1
            set pfs disable
            set replay disable
        next
    end

    rwpatterson
    New Member
    March 7, 2015

    As a further troubleshooting test, have you run speed tests (speedtest.net, for example) from both sites? Are the speeds comparable?

    kelv1n (Author)
    New Member
    March 7, 2015

    Hi Bob

     

    Yes, I should have stated how I'm testing this.

     

    I've set up a VIP at each site pointing to a Linux server at each end, so the Linux servers are accessible both across the VPN, and a public VIP.  I then use iperf to test the throughput.

     

    Last week, pushing traffic via both the VIP and the VPN, I got 400-500Mbps in both directions. Now I still get 400-500Mbps via the VIP, but since reconfiguring the VPN I get anywhere between 50-200Mbps across it.

     

    I don't know if it's relevant, but iperf reports that going via the VIP the TCP window size is 85k, while going via the VPN it's 45k. Also, I can get the VPN to 350-400Mbps when I do a parallel send, i.e. having 6-8 threads sending/receiving concurrently.
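
A rough way to see why the window size could matter: a single TCP stream cannot have more than one window of data in flight per round trip, so its ceiling is roughly window / RTT. The Python sketch below works that out for the two window sizes iperf reported. The 2 ms round-trip time is purely an assumed figure for illustration (no RTT was measured in this thread), `window_limited_mbps` is a hypothetical helper name, and "k" is taken to mean 1024 bytes.

```python
def window_limited_mbps(window_bytes, rtt_seconds):
    """Upper bound on single-stream TCP throughput, in Mbps.

    At most one full window can be unacknowledged per round trip,
    so throughput <= window / RTT.
    """
    return window_bytes * 8 / rtt_seconds / 1e6


ASSUMED_RTT = 0.002  # 2 ms: an assumption, not a value from the thread

# Window sizes as reported by iperf in the post above.
for label, window_kib in (("via VPN", 45), ("via VIP", 85)):
    ceiling = window_limited_mbps(window_kib * 1024, ASSUMED_RTT)
    print(f"{label}: {window_kib}k window -> about {ceiling:.0f} Mbps per stream")
```

Under that assumed RTT, the smaller window alone would roughly halve the single-stream ceiling. Each parallel stream gets its own window, which would be consistent with 6-8 threads reaching a higher aggregate rate, although it does not by itself explain why the VPN path behaves differently.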

    emnoc
    New Member
    March 7, 2015

    If you're using iperf for benchmarking, I would avoid TCP imho and re-run these tests using UDP on a slow night, with maybe 5x 120sec runs, and I bet you will have a different outcome. Try to do this maybe after 12:00PM.

     

    Also, keep in mind that packets flowing from Site A to Site B might not take the exact same path as packets flowing from B to A. If I had to bet, they probably DO NOT flow the same path.

     

    So the path and segments, and the loading of the links locally and in between, will all be factors. This is all before you look at TCP sliding windows, WSCALING, buffering, etc.

     

    just my 2cts.

    kelv1n (Author)
    New Member
    March 7, 2015

    Thanks emnoc.

     

    They are all valid points. The connection is very local, on a network run by a local ISP, with only a couple of hops between Site A and Site B, and like I mentioned, the VIP and VPN tests return different results even when they are run immediately after each other.

     

    Whilst I can get the higher speed across the VPN by running parallel threads, that is not what I was getting before having to redo the VPN config, and this link will be carrying DR traffic, so it will typically be just 1 continuous stream of data (rather than split into parallel streams).

     

    EDIT -

     

    Actually emnoc, sound logic, this is different to last time. I'll try some more out of hours tests.

     

    But if anybody else has any suggestions, I'm open :)

    emnoc
    New Member
    March 8, 2015

    bingo

     

    Also, it's best to conduct tests in and out of the VPN due to the overhead of ESP and any CP/NPU/CPU processing. The MTU is going to have a direct impact on the TCP MSS during the setup and life of the TCP session. This is why you should use UDP for a pure path benchmark, imho.

    kelv1n (Author)
    New Member
    March 8, 2015

    It looks like the NPU is not being used at the site with the bandwidth issue:

     

     

    OFFICE-FG-200D-1 # diagnose vpn ipsec status
    All ipsec crypto devices in use:
    npl
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 0 0
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 0 0
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0
    NPU HARDWARE
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 0 0
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 0 0
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0
    CP8:
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 132977390 97061173
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 132977512 97061173
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0
    SOFTWARE:
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 0 0
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 0 0
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0

     

     

    But it is being used at the other one:

     

     

    DC-FG-200D-1 # diagnose vpn ipsec status
    All ipsec crypto devices in use:
    npl
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 74355968 106491008
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 74355968 106491008
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0
    NPU HARDWARE
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 1760307 0
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 1760307 0
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0
    CP8:
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 19 49937
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 19 49937
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0
    SOFTWARE:
    null: 0 0
    des: 0 0
    3des: 0 0
    aes: 0 0
    aria: 0 0
    seed: 0 0
    null: 0 0
    md5: 0 0
    sha1: 0 0
    sha256: 0 0
    sha384: 0 0
    sha512: 0 0

     

    Any thoughts on why this would happen? I have explicitly set the phase-1 local-gw value via CLI at each end.
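
The two status dumps above can also be compared programmatically. The Python sketch below sums the counters under each crypto engine heading and reports which engine has the largest total. It assumes the output layout shown in this thread (a heading line followed by `name: count count` lines). The function names are hypothetical, and the two counter columns are not interpreted, only added together. The sample strings are trimmed to the `aes` and `sha1` rows from the dumps above.

```python
import re

# Matches counter rows such as "aes: 132977390 97061173".
COUNTER = re.compile(r"^([\w-]+):\s+(\d+)\s+(\d+)$")


def engine_totals(output):
    """Sum both counter columns under each engine heading."""
    totals = {}
    engine = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line or line.startswith("All ipsec"):
            continue  # skip blanks and the preamble line
        match = COUNTER.match(line)
        if match is None:
            # Anything that is not a counter row is treated as a heading.
            engine = line.rstrip(":")
            totals.setdefault(engine, 0)
        elif engine is not None:
            totals[engine] += int(match.group(2)) + int(match.group(3))
    return totals


def busiest_engine(output):
    """Return the heading whose counters add up to the largest value."""
    totals = engine_totals(output)
    return max(totals, key=totals.get)


OFFICE = """All ipsec crypto devices in use:
npl
aes: 0 0
sha1: 0 0
NPU HARDWARE
aes: 0 0
sha1: 0 0
CP8:
aes: 132977390 97061173
sha1: 132977512 97061173
SOFTWARE:
aes: 0 0
sha1: 0 0
"""

DC = """All ipsec crypto devices in use:
npl
aes: 74355968 106491008
sha1: 74355968 106491008
NPU HARDWARE
aes: 1760307 0
sha1: 1760307 0
CP8:
aes: 19 49937
sha1: 19 49937
SOFTWARE:
aes: 0 0
sha1: 0 0
"""

print("OFFICE:", busiest_engine(OFFICE))
print("DC:", busiest_engine(DC))
```

Because the cipher and hash rows both count the same traffic, the totals are only useful for comparing engines with each other, not as packet counts.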

    rwpatterson
    New Member
    March 8, 2015

    Are all interfaces accelerated? (Did you use the same port on each end...)

    emnoc
    New Member
    March 8, 2015

    And we are assuming the same FortiOS?

     

    As for what Bob mentioned earlier, NPU acceleration is between the 2 ports. You need to be cautious of the pairing and of what you have enabled on the policies and ports. This goes back to FP (fast-path) and non-FP. You probably have something enabled that's mixing the 2. But you should still conduct benchmark tests using UDP on a policy-id that has NOTHING enabled except an allow between the 2 hosts.

     

    Do a "get hardware status" and look at the ASIC version. You probably have NP4 Lite, if I had to guess.

     

     

    kelv1n (Author)
    New Member
    March 8, 2015

    Hi Guys

     

    I think I've managed to resolve this. By sheer luck I stumbled across this:

     

    https://forum.fortinet.com/tm.aspx?tree=true&m=121303&mpage=1

     

    Somebody reported sflow causing problems with their NPU offloading. We were playing with sflow and netflow recently, so I unset these configs. I rebooted the Master firewall, but that did not resolve it. Then I remembered that the firewalls on that site are an Active-Active HA pair, so the sessions would sync. I did a simultaneous reboot of both Master/Standby to clear everything.

     

    Both sites are now using the NPU, and we're getting 500Mbps in both directions. I'm quite shocked at the impact the NPU has, especially since it's labelled "NPULite".

     

    Thanks for all your help.