Not applicable

IPsec VPN latency issues with (not really) large packets.

 Hi folks
 
 we're experiencing a weird IPsec latency issue on a
  Version: Fortigate-310B v4.0,build0196,100319 (MR1 Patch 4)
 with policy-based VPN, where we've run out of ideas how to further
 diagnose and possibly fix this issue.
 
 The problem is that "large" packets aren't properly sent out the
 external Fortigate interface into the VPN tunnel, where "large" seems
 to start at 484 bytes, and where this behavior is consistently
 reproducible, e.g. with pings.
 
 Example:
 
  client 95.10.11.12 <-vpn-> 129.20.0.2 firewall <-lan-> 129.24.60.8
 
  (Note: IP addresses are fake ...)
 
 External IPsec client 95.10.11.12 sends a ping with a 418-byte payload
 through the tunnel into our net
 
  $ ping -c 1 -s 418 129.24.60.8
 
 and on the firewall we're seeing
 
  # diagnose sniffer packet any '( host 95.10.11.12 and esp ) or ( host 129.24.60.8 and icmp )' 4
  interfaces=[any]
  filters=[( host 95.10.11.12 and esp ) or ( host 129.24.60.8 and icmp )]
  2.304990 port2 in 95.10.11.12 -> 129.20.0.2:  ip-proto-50 484
  2.304990 VPN_GW_0 in 129.24.90.10 -> 129.24.60.8: icmp: echo request
  2.305070 port1 out 129.24.90.10 -> 129.24.60.8: icmp: echo request
  2.305393 port1 in 129.24.60.8 -> 129.24.90.10: icmp: echo reply
  2.305417 VPN_GW_0 out 129.24.60.8 -> 129.24.90.10: icmp: echo reply
  2.305454 port2 out 129.20.0.2 -> 95.10.11.12:  ip-proto-50 484
 
 So far so good. But with a larger packet size the ping response gets
 stuck in the firewall:
 
  $ ping -c 1 -s 800 129.24.60.8
 
  21.638467 port2 in 95.10.11.12 -> 129.20.0.2:  ip-proto-50 868
  21.638467 VPN_GW_0 in 129.24.90.10 -> 129.24.60.8: icmp: echo request
  21.638558 port1 out 129.24.90.10 -> 129.24.60.8: icmp: echo request
  21.638812 port1 in 129.24.60.8 -> 129.24.90.10: icmp: echo reply
  21.638827 VPN_GW_0 out 129.24.60.8 -> 129.24.90.10: icmp: echo reply
 
 The final ESP response on port2 is missing. Interestingly enough, when
 sending another smaller ping, the missing packet is pushed out:
 
  $ ping -c 1 -s 418 129.24.60.8
 
  54.701092 port2 in 95.10.11.12 -> 129.20.0.2:  ip-proto-50 484
  54.701092 VPN_GW_0 in 129.24.90.10 -> 129.24.60.8: icmp: echo request
  54.701187 port1 out 129.24.90.10 -> 129.24.60.8: icmp: echo request
  54.701546 port1 in 129.24.60.8 -> 129.24.90.10: icmp: echo reply
  54.701561 VPN_GW_0 out 129.24.60.8 -> 129.24.90.10: icmp: echo reply
  54.701606 port2 out 129.20.0.2 -> 95.10.11.12:  ip-proto-50 868
  54.701609 port2 out 129.20.0.2 -> 95.10.11.12:  ip-proto-50 484
 
 as you can see on the second-to-last line. So that 800-byte payload ping
 response packet was stuck in the firewall for about half a minute.
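 
 (If anybody wants to pin down the exact threshold themselves, a quick payload sweep from the client side does the job. This is only an illustrative sketch for a Linux client; the payload sizes and the 5-second timeout are arbitrary:)
 
  $ for s in 400 410 418 419 450 500 800; do
  >   printf "payload %s: " $s
  >   ping -c 1 -W 5 -s $s 129.24.60.8 | grep -c 'time='
  > done
 
 A "1" means the echo reply came back within the timeout, a "0" that it was delayed or lost.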
 
 
 We've dug a little deeper with "debug flow", and this is what we
 found:
 
  2011-02-17 14:49:17 id=36870 trace_id=417 func=resolve_ip_tuple_fast line=3276 msg="vd-root received a packet(proto=1, 129.24.90.10:45673->129.24.60.8:8) from VPN_GW_0."
  2011-02-17 14:49:17 id=36870 trace_id=417 func=resolve_ip_tuple line=3398 msg="allocate a new session-15b612f5"
  2011-02-17 14:49:17 id=36870 trace_id=417 func=vf_ip4_route_input line=1609 msg="find a route: gw-129.24.60.8 via port1"
  2011-02-17 14:49:17 id=36870 trace_id=417 func=fw_forward_handler line=440 msg="Allowed by Policy-104:"
  2011-02-17 14:49:17 id=36870 trace_id=418 func=resolve_ip_tuple_fast line=3276 msg="vd-root received a packet(proto=1, 129.24.60.8:45673->129.24.90.10:0) from port1."
  2011-02-17 14:49:17 id=36870 trace_id=418 func=resolve_ip_tuple_fast line=3312 msg="Find an existing session, id-15b612f5, reply direction"
  2011-02-17 14:49:17 id=36870 trace_id=418 func=vf_ip4_route_input line=1609 msg="find a route: gw-129.24.90.10 via VPN_GW_0"
  2011-02-17 14:49:17 id=36870 trace_id=418 func=ipsecdev_hard_start_xmit line=122 msg="enter IPsec interface-VPN_GW_0"
  2011-02-17 14:49:17 id=36870 trace_id=418 func=esp_output4 line=467 msg="encrypted, and send to 95.10.11.12 with source 129.20.0.2"
  # here the delay is happening (in this case 7 seconds)
  2011-02-17 14:49:24 id=36870 trace_id=418 func=ipsec_output_finish line=138 msg="send to 129.20.0.1 via intf-port2"
 
 This is from a trace where we had a 7-second delay. And it's
 apparently happening right in "ipsec_output_finish".
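 
 (For reference, a flow trace like the one above can be captured on the FortiGate roughly as follows; the exact syntax may vary slightly between builds, so check it against your CLI reference:)
 
  # diagnose debug flow filter addr 129.24.60.8
  # diagnose debug flow show console enable
  # diagnose debug enable
  # diagnose debug flow trace start 100
    ... run the test ping ...
  # diagnose debug flow trace stop
  # diagnose debug disable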
 
 Now, this can't be MTU, which is 1500. Also, we're seeing no "frag
 needed" response from either party. The route via 129.20.0.1 is
 static and working properly for all other traffic.
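 
 (If anybody wants to double-check the path MTU side of this, a DF-set ping from a Linux client is a quick test; the size is just an example:)
 
  $ ping -c 1 -M do -s 800 129.24.60.8   # DF bit set: an undersized path MTU should produce a "frag needed" error rather than a silent delay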
 
 So, maybe somebody has some advice on how to proceed from here?
 It'd sure be appreciated, because right now we're at a loss ...
 
 Thanks a bunch, G.
 
emnoc
Esteemed Contributor III

Have you tried the following: upgrading or maybe downgrading, or contacting Fortinet TAC for assistance? Also, how does regular non-ICMP traffic perform over the tunnel? Is any file transfer or other application data going over the tunnel, and does it exhibit any performance issues? An 800-byte packet should be very small.
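
For example, a quick bulk-TCP check through the tunnel would show whether application traffic is hit as well (iperf is just one arbitrary choice of tool):

 # on the LAN host, e.g. 129.24.60.8
 $ iperf -s
 # on the VPN client
 $ iperf -c 129.24.60.8 -t 30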

PCNSE 

NSE 

StrongSwan  

Not applicable

Regarding non-ICMP traffic: that's how we initially noticed. E.g. telnet shell sessions over the tunnel with curses applications that redraw the screen and send large packets back were impossible to work with. I hit Enter and nothing happened; hit Enter again, and the action the previous Enter should have triggered was carried out. So I started sniffing traffic on both ends and saw TCP retransmissions and other irregular behavior. We finally used ping only for demonstration purposes, but TCP is affected just the same.

On Linux, I can tell you the ugly workaround I'm using on my workstation inside my home LAN:

 ip route add 129.24.60.0/24 via 192.168.10.1 mtu 446 advmss 406

And before anybody asks about Linux: we've reproduced the issue with FortiClient on Windows, of course. We've reproduced it on Mac, too.

As to downgrading, I actually don't know how long this problem has persisted. I joined the place I'm now working at comparatively recently, and all people can tell me is that the VPN has always been "slow" with certain applications. Others, like Windows Remote Desktop, work fine of course, because they're continuously pumping out small packets. And we've looked through all the later MR1 patches' release notes; none seems to address the issue we're looking at.

Maybe we make this a support call. Although I'm really not looking forward to assembling all the info they want, to maybe help them fix what could very well be a bug on their end instead of an actual support request on ours ...
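
(For anyone copying that workaround: the 406 is simply 446 minus the 40 bytes of IPv4 and TCP headers, and you can verify the clamp afterwards with something like:)

 $ ip route get 129.24.60.8   # the chosen route should list "mtu 446 advmss 406" among its attributes
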
emnoc
Esteemed Contributor III

Well, this is newer hardware (310) compared to what I'm used to using, so I would not be quick to rule out a bug. Make the call and see if support has a workaround or knows of other issues. Escalate it to priority 1.

PCNSE 

NSE 

StrongSwan  

rwpatterson
Valued Contributor III

Take a quick look at the duplex settings and make sure they match on both the FGT and whatever it's connected to.

Bob - self proclaimed posting junkie!
See my Fortigate related scripts at: http://fortigate.camerabob.com

Not applicable

Thanks, good pointer. Both are full duplex though. Fortigate:
 # diag hardware device nic port2
 Driver Name: NP2
 Version: 0.92
 Chip Revision: 2
 BoardSN: N/A
 Module Name: 310B
 DDR Size: 256 MB
 Bootstrap ID: 11
 PCIX-64bit-@133MHz bus: 03:01.0
 Admin: up
 MAC: 00:09:0f:09:30:02
 Permanent_HWaddr: 00:09:0f:89:15:5e
 Link: up Speed: 1000Mbps Duplex: Full
 Rx Pkts: 4174984716
 Tx Pkts: 1709646918
 Rx Bytes: 3677915136
 Tx Bytes: 2100737024
 MAC0 Rx Errors: 0
 MAC0 Rx Dropped: 0
 MAC0 Tx Dropped: 0
 MAC0 FIFO Overflow: 0
 MAC0 IP Error: 0
 
 TAE Entry Used: 5
 TSE Entry Used: 0
 Host Dropped: 15
 Shaper Dropped: 85
 EEI0 Dropped: 0
 EEI1 Dropped: 0
 EEI2 Dropped: 0
 EEI3 Dropped: 0
 IPSEC QFIFO Dropped: 0
 IPSEC DFIFO Dropped: 0
 PBA: 123/1019/251
 Forwarding Entry Used: 0
 Offload IPSEC Antireplay ENC Status: Disable
 Offload IPSEC Antireplay DEC Status: Disable
 Offload Host IPSEC Traffic: Disable
 
3com 4200G:
 >display interface GigabitEthernet1/0/1
  GigabitEthernet1/0/1 current state : UP
  IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0024-733d-eb63
  Media type is twisted pair, loopback not set
  Port hardware type is 1000_BASE_T
  1000Mbps-speed mode, full-duplex mode
  Link speed type is autonegotiation, link duplex type is autonegotiation
  Flow-control is not enabled
  The Maximum Frame Length is 1522
  Broadcast MAX-pps: 3000
  Unicast MAX-ratio: 100%
  Multicast MAX-ratio: 100%
  Unknown Multicast Packet drop: Disable
  Unknown Unicast Packet drop: Disable
  Forbid jumbo frame to pass
  PVID: 1
  Mdi type: auto
  Port link-type: access
   Tagged   VLAN ID : none
   Untagged VLAN ID : 1
  Last 300 seconds input:  725 packets/sec 280476 bytes/sec
  Last 300 seconds output:  1023 packets/sec 1215396 bytes/sec
  Input(total):  7997813332 packets, - bytes
          - broadcasts, - multicasts, - pauses
  Input(normal):  7997813332 packets, 3653700343084 bytes
          201 broadcasts, 281 multicasts, 0 pauses
  Input:  0 input errors, 0 runts, 0 giants,  - throttles, 0 CRC
          0 frame,  0 overruns, 0 aborts, - ignored, - parity errors
  Output(total): 10065435185 packets, - bytes
          - broadcasts, - multicasts, - pauses
  Output(normal): 10065435185 packets, 10461332485707 bytes
          842269 broadcasts, 117663100 multicasts, 0 pauses
  Output: 0 output errors,  - underruns, - buffer failures
          0 aborts, 0 deferred, 0 collisions, 0 late collisions
          - lost carrier, - no carrier
 
FortiRack_Eric
New Contributor III

Try setting the MTU and TCP-MSS to lower values. TCP-MSS is MTU minus 40 for the tunnel (on both sides).
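
Something along these lines on the FortiGate; I'm quoting the option names from memory and their availability on 4.0 MR1 is an assumption, and 1400/1360 are only illustrative values:

 config system interface
     edit port2
         set mtu-override enable    # allow a non-default MTU on this interface
         set mtu 1400               # illustrative value, below the default 1500
         set tcp-mss 1360           # MSS clamp = MTU - 40; option name and availability vary by build
     next
 end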

Rackmount your Fortinet --> http://www.rackmount.it/fortirack

 

Not applicable

Well, that would be the MTU then, because the issue can already be triggered with ICMP. One might try that if one had a test system - which I haven't - but that'd imply hunting for a driver bug where the standard 1500 MTU causes problems specifically in the presence of IPsec. As we've already seen, the issue depends on packet sizes, and all other non-IPsec traffic is unaffected. I guess I would really leave such experiments to a support call, if Fortinet had actual reason to believe that we're looking at an MTU-related driver bug and could thus justify fiddling with production in such a manner. But thanks anyway.
Not applicable

Just a brief update, in case anybody stumbles across this and wonders about the solution. It's a known issue, described in Bug ID 118774: "IPSec AES encryption performance for packets of length greater than 418 bytes is significantly lower than previous". The issue was fixed in v4.0 MR1 Patch 5. We have now updated to v4.0 MR1 Patch 9, and the problem is indeed gone. Thanks to everybody for the help and advice.