Contributor
February 17, 2011
Question

IPsec VPN latency issues with (not really) large packets.

  Hi folks,

  we're experiencing a weird IPsec latency issue on a

      Version: Fortigate-310B v4.0,build0196,100319 (MR1 Patch 4)

  with policy-based VPN, and we've run out of ideas on how to further diagnose and possibly fix it.

  The problem is that "large" packets aren't properly sent out of the external Fortigate interface into the VPN tunnel. "Large" seems to start at 484 bytes, and the behavior is consistently reproducible, e.g. with pings.

  Example:

      client 95.10.11.12 <-vpn-> 129.20.0.2 firewall <-lan-> 129.24.60.8

      (Note: IP addresses are fake ...)

  The external IPsec client 95.10.11.12 sends a ping with a 418-byte payload through the tunnel into our net

      $ ping -c 1 -s 418 129.24.60.8

  and on the firewall we're seeing

      # diagnose sniffer packet any '( host 95.10.11.12 and esp ) or ( host 129.24.60.8 and icmp )' 4
      interfaces=[any]
      filters=[( host 95.10.11.12 and esp ) or ( host 129.24.60.8 and icmp )]
      2.304990 port2 in 95.10.11.12 -> 129.20.0.2:  ip-proto-50 484
      2.304990 VPN_GW_0 in 129.24.90.10 -> 129.24.60.8: icmp: echo request
      2.305070 port1 out 129.24.90.10 -> 129.24.60.8: icmp: echo request
      2.305393 port1 in 129.24.60.8 -> 129.24.90.10: icmp: echo reply
      2.305417 VPN_GW_0 out 129.24.60.8 -> 129.24.90.10: icmp: echo reply
      2.305454 port2 out 129.20.0.2 -> 95.10.11.12:  ip-proto-50 484

  So far so good. But with a larger packet size the ping response gets stuck in the firewall:

      $ ping -c 1 -s 800 129.24.60.8

      21.638467 port2 in 95.10.11.12 -> 129.20.0.2:  ip-proto-50 868
      21.638467 VPN_GW_0 in 129.24.90.10 -> 129.24.60.8: icmp: echo request
      21.638558 port1 out 129.24.90.10 -> 129.24.60.8: icmp: echo request
      21.638812 port1 in 129.24.60.8 -> 129.24.90.10: icmp: echo reply
      21.638827 VPN_GW_0 out 129.24.60.8 -> 129.24.90.10: icmp: echo reply

  The final ESP response on port2 is missing.

  Interestingly enough, when sending another, smaller ping, the missing packet is pushed out:

      $ ping -c 1 -s 418 129.24.60.8

      54.701092 port2 in 95.10.11.12 -> 129.20.0.2:  ip-proto-50 484
      54.701092 VPN_GW_0 in 129.24.90.10 -> 129.24.60.8: icmp: echo request
      54.701187 port1 out 129.24.90.10 -> 129.24.60.8: icmp: echo request
      54.701546 port1 in 129.24.60.8 -> 129.24.90.10: icmp: echo reply
      54.701561 VPN_GW_0 out 129.24.60.8 -> 129.24.90.10: icmp: echo reply
      54.701606 port2 out 129.20.0.2 -> 95.10.11.12:  ip-proto-50 868
      54.701609 port2 out 129.20.0.2 -> 95.10.11.12:  ip-proto-50 484

  as you can see on the second-to-last line. So the response to the 800-byte payload ping was stuck in the firewall for about half a minute.

  We've dug a little deeper with "debug flow", and this is what we found:

      2011-02-17 14:49:17 id=36870 trace_id=417 func=resolve_ip_tuple_fast line=3276 msg="vd-root received a packet(proto=1, 129.24.90.10:45673->129.24.60.8:8) from VPN_GW_0."
      2011-02-17 14:49:17 id=36870 trace_id=417 func=resolve_ip_tuple line=3398 msg="allocate a new session-15b612f5"
      2011-02-17 14:49:17 id=36870 trace_id=417 func=vf_ip4_route_input line=1609 msg="find a route: gw-129.24.60.8 via port1"
      2011-02-17 14:49:17 id=36870 trace_id=417 func=fw_forward_handler line=440 msg="Allowed by Policy-104:"
      2011-02-17 14:49:17 id=36870 trace_id=418 func=resolve_ip_tuple_fast line=3276 msg="vd-root received a packet(proto=1, 129.24.60.8:45673->129.24.90.10:0) from port1."
      2011-02-17 14:49:17 id=36870 trace_id=418 func=resolve_ip_tuple_fast line=3312 msg="Find an existing session, id-15b612f5, reply direction"
      2011-02-17 14:49:17 id=36870 trace_id=418 func=vf_ip4_route_input line=1609 msg="find a route: gw-129.24.90.10 via VPN_GW_0"
      2011-02-17 14:49:17 id=36870 trace_id=418 func=ipsecdev_hard_start_xmit line=122 msg="enter IPsec interface-VPN_GW_0"
      2011-02-17 14:49:17 id=36870 trace_id=418 func=esp_output4 line=467 msg="encrypted, and send to 95.10.11.12 with source 129.20.0.2"
      # here the delay is happening (in this case 7 seconds)
      2011-02-17 14:49:24 id=36870 trace_id=418 func=ipsec_output_finish line=138 msg="send to 129.20.0.1 via intf-port2"

  This is from a trace where we had a 7-second delay. And it's apparently happening right in "ipsec_output_finish".

  Now, this can't be MTU, which is 1500. Also, we're seeing no "frag needed" response from either party. The route via 129.20.0.1 is static and works properly for all other traffic.

  So, maybe somebody has some advice on how to proceed from here? It'd sure be appreciated, because right now we're at a loss ...

  Thanks a bunch, G.
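For longer "debug flow" captures, a delay like the one above can be located mechanically rather than by eye. The following is only a rough Python sketch: it assumes the line layout seen in the trace above (timestamp, then `trace_id=` and `func=` fields), and the names `find_gaps` and `SAMPLE` are made up for illustration, not part of any Fortinet tooling.

```python
import re
from datetime import datetime

# Matches the layout of the debug flow lines quoted in the post (assumed format).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"trace_id=(?P<trace>\d+) func=(?P<func>\S+)"
)

def find_gaps(lines, min_gap_s=1.0):
    """Return (trace_id, previous_func, func, gap_seconds) for every pair of
    consecutive messages of the same trace_id that are >= min_gap_s apart."""
    last_seen = {}  # trace_id -> (timestamp, func) of the previous message
    gaps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip comments and anything that is not a flow line
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
        trace = int(m.group("trace"))
        func = m.group("func")
        if trace in last_seen:
            prev_ts, prev_func = last_seen[trace]
            gap = (ts - prev_ts).total_seconds()
            if gap >= min_gap_s:
                gaps.append((trace, prev_func, func, gap))
        last_seen[trace] = (ts, func)
    return gaps

# Three lines copied from the trace in the post.
SAMPLE = '''
2011-02-17 14:49:17 id=36870 trace_id=418 func=ipsecdev_hard_start_xmit line=122 msg="enter IPsec interface-VPN_GW_0"
2011-02-17 14:49:17 id=36870 trace_id=418 func=esp_output4 line=467 msg="encrypted, and send to 95.10.11.12 with source 129.20.0.2"
2011-02-17 14:49:24 id=36870 trace_id=418 func=ipsec_output_finish line=138 msg="send to 129.20.0.1 via intf-port2"
'''.splitlines()

for trace, prev_func, func, gap in find_gaps(SAMPLE):
    print(f"trace {trace}: {gap:.0f}s between {prev_func} and {func}")
```

The timestamps in the trace only have one-second resolution, so this can only flag stalls of roughly a second or more.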

    7 replies

    emnoc
    New Member
    February 17, 2011
    Have you tried the following:

      • Upgrading or maybe downgrading
      • Contacting Fortinet TAC for assistance

    Also, how does regular non-ICMP traffic perform over the tunnel? Are any file transfers or other application data going over the tunnel, and do they exhibit any performance issues? An 800-byte packet should be very small.
    Contributor
    February 17, 2011
    Regarding non-ICMP traffic: that's how we initially noticed. E.g. telnet shell sessions over the tunnel with curses applications that redraw the screen and send large packets back were impossible to work with. I hit Enter and nothing happened. When I hit Enter again, the action the previous Enter should have triggered was carried out. So I started sniffing traffic on both ends and saw TCP retransmissions and other irregular behavior. In the end we used ping only for demonstration purposes; TCP, at least, is affected just the same.

    On Linux, I can tell you the ugly workaround I'm using on my workstation inside my home LAN:

        ip route add 129.24.60.0/24 via 192.168.10.1 mtu 446 advmss 406

    And before anybody asks about Linux: we've reproduced the issue with FortiClient on Windows, of course. We've reproduced it on Mac, too.

    As to downgrading, I actually don't know how long this problem has persisted. I joined the place I'm now working at comparatively recently, and all people can tell me is that VPN has always been "slow" with certain applications. Others, like Windows Remote Desktop, work fine of course because they're continuously pumping out small packets.

    And we've looked through the release notes of all the later MR1 patches. None seems to address the issue we're looking at.

    Maybe we'll make this a support call. Although I'm really not looking forward to assembling all the info they want, to maybe help them fix what could very well be a bug on their end instead of an actual support request on ours ...
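The numbers in this thread (ping payloads of 418 and 800 bytes, ESP sizes of 484 and 868 bytes, and the workaround's `mtu 446 advmss 406`) are consistent with each other under some assumptions the thread does not confirm: tunnel-mode ESP without NAT-T, AES-CBC (16-byte block and IV) and a 12-byte truncated HMAC such as HMAC-SHA1-96. A small Python sketch of that arithmetic, with illustrative function names:

```python
IP_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header
TCP_HEADER = 20   # TCP header without options

def inner_packet_len(ping_payload):
    """Size of the IP packet produced by `ping -s <ping_payload>`."""
    return IP_HEADER + ICMP_HEADER + ping_payload

def esp_length(inner_len, block=16, iv_len=16, icv_len=12):
    """ESP length from SPI through ICV for a tunnel-mode packet.

    The defaults are ASSUMPTIONS (AES-CBC plus a 96-bit truncated HMAC);
    the actual proposal used on the tunnel is not stated in the thread.
    """
    esp_header = 8   # SPI (4 bytes) + sequence number (4 bytes)
    trailer = 2      # pad length (1 byte) + next header (1 byte)
    # Round inner packet plus trailer up to a whole number of cipher blocks.
    padded = -(-(inner_len + trailer) // block) * block
    return esp_header + iv_len + padded + icv_len

def mss_for_mtu(mtu):
    """Largest TCP segment that fits into one IPv4 packet of size `mtu`."""
    return mtu - IP_HEADER - TCP_HEADER

for payload in (418, 800):
    inner = inner_packet_len(payload)
    print(f"ping -s {payload}: inner packet {inner} bytes, ESP {esp_length(inner)} bytes")

print("advmss for mtu 446:", mss_for_mtu(446))
```

Under these assumptions a 418-byte ping payload gives a 446-byte inner packet and a 484-byte ESP packet, and 800 bytes gives 868, matching the sniffer output. The workaround's route MTU of 446 is exactly the inner packet size of the largest ping that went through without delay, and 406 is that MTU minus 40 bytes of IP and TCP headers.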
    emnoc
    New Member
    February 17, 2011
    Well, this is new hardware (310) compared to what I'm used to using, so I would not be quick to rule out a bug. Make the call and see if support has a workaround or knows of other issues. Escalate it to priority 1.
    rwpatterson
    New Member
    February 18, 2011
    Take a quick look at the duplex settings and make sure they match on both the FGT and whatever it's connected to.
    Contributor
    February 18, 2011
    Thanks, good pointer. Both full duplex though: Fortigate:
      # diag hardware device nic port2
      Driver Name: NP2
      Version: 0.92
      Chip Revision: 2
      BoardSN: N/A
      Module Name: 310B
      DDR Size: 256 MB
      Bootstrap ID: 11
      PCIX-64bit-@133MHz bus: 03:01.0
      Admin: up
      MAC: 00:09:0f:09:30:02
      Permanent_HWaddr: 00:09:0f:89:15:5e
      Link: up Speed: 1000Mbps Duplex: Full
      Rx Pkts: 4174984716
      Tx Pkts: 1709646918
      Rx Bytes: 3677915136
      Tx Bytes: 2100737024
      MAC0 Rx Errors: 0
      MAC0 Rx Dropped: 0
      MAC0 Tx Dropped: 0
      MAC0 FIFO Overflow: 0
      MAC0 IP Error: 0

      TAE Entry Used: 5
      TSE Entry Used: 0
      Host Dropped: 15
      Shaper Dropped: 85
      EEI0 Dropped: 0
      EEI1 Dropped: 0
      EEI2 Dropped: 0
      EEI3 Dropped: 0
      IPSEC QFIFO Dropped: 0
      IPSEC DFIFO Dropped: 0
      PBA: 123/1019/251
      Forwarding Entry Used: 0
      Offload IPSEC Antireplay ENC Status: Disable
      Offload IPSEC Antireplay DEC Status: Disable
      Offload Host IPSEC Traffic: Disable
    3com 4200G:
      >display interface GigabitEthernet1/0/1
       GigabitEthernet1/0/1 current state : UP
       IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0024-733d-eb63
       Media type is twisted pair, loopback not set
       Port hardware type is 1000_BASE_T
       1000Mbps-speed mode, full-duplex mode
       Link speed type is autonegotiation, link duplex type is autonegotiation
       Flow-control is not enabled
       The Maximum Frame Length is 1522
       Broadcast MAX-pps: 3000
       Unicast MAX-ratio: 100%
       Multicast MAX-ratio: 100%
       Unknown Multicast Packet drop: Disable
       Unknown Unicast Packet drop: Disable
       Forbid jumbo frame to pass
       PVID: 1
       Mdi type: auto
       Port link-type: access
        Tagged   VLAN ID : none
        Untagged VLAN ID : 1
       Last 300 seconds input:  725 packets/sec 280476 bytes/sec
       Last 300 seconds output:  1023 packets/sec 1215396 bytes/sec
       Input(total):  7997813332 packets, - bytes
               - broadcasts, - multicasts, - pauses
       Input(normal):  7997813332 packets, 3653700343084 bytes
               201 broadcasts, 281 multicasts, 0 pauses
       Input:  0 input errors, 0 runts, 0 giants,  - throttles, 0 CRC
               0 frame,  0 overruns, 0 aborts, - ignored, - parity errors
       Output(total): 10065435185 packets, - bytes
               - broadcasts, - multicasts, - pauses
       Output(normal): 10065435185 packets, 10461332485707 bytes
               842269 broadcasts, 117663100 multicasts, 0 pauses
       Output: 0 output errors,  - underruns, - buffer failures
               0 aborts, 0 deferred, 0 collisions, 0 late collisions
               - lost carrier, - no carrier
    FortiRack_Eric
    New Member
    February 18, 2011
    Try setting MTU and TCP-MSS to lower values for the tunnel (on both sides). TCP-MSS is MTU minus 40.
    Contributor
    February 18, 2011
    Well, that would preferably be the MTU, because the issue can already be triggered with ICMP. One might try that if one had a test system, which I don't have, but that would imply hunting for a driver bug where the standard 1500 MTU causes problems specifically in the presence of IPsec. After all, we've already seen that the issue depends on packet sizes and that all non-IPsec traffic is unaffected. I guess I'd really leave such experiments to a support call, if Fortinet had actual reason to believe that we're looking at a driver bug related to MTU, which would justify fiddling with production in such a manner. But thanks anyway.
    Contributor
    March 7, 2011
    Just a brief update, in case anybody stumbles across this and wonders about the solution. It's a known issue, described in Bug ID 118774:

        IPSec AES encryption performance for packets of length greater than 418 bytes is significantly lower than previous

    The issue was fixed in v4.0 MR1 Patch 5. We have now updated to v4.0 MR1 Patch 9, and the problem is indeed gone. Thanks to everybody for help and advice.