Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
omedirk
New Contributor II

SD-WAN data de-duplication issue for UDP protocol

Dear All,

 

Please help if you have experience with this issue or something similar.

 

We have an SD-WAN setup much like the one described in:

https://community.fortinet.com/t5/FortiGate/Technical-Tip-Configure-FortiGate-SD-WAN-with-an-IPSEC-V...

We changed the following:

- the branch has two WAN interfaces

- crossed VPN tunnels were added for resilience against dual points of failure

 

We run a special automation protocol between the branches. It's a 'dumb' UDP stream from both sides, with no replies or responses.

 

To make the setup as reliable as possible, we tried duplicating the messages over all 4 VPNs. After some issues we got this working (the FGT firewall did not understand the protocol, so we switched to separate sending and receiving NICs on the automation side).

 

However, we now seem to have a problem in the de-duplication: either the packet order or something else is not coming out as it went in. The receiving end reports a CRC error once in a while.

 

How should we proceed in debugging this? Fortinet support says UDP might have these kinds of issues, but I still hope we can fix this.

 

Rgds,

Rene

 

5 REPLIES
AlexC-FTNT
Staff

Hello Rene,

 

The CRC error occurs because the sent file is damaged or incomplete, and I assume that you further isolated this to UDP deduplication.
But the order of packets should not cause a problem on its own; at most you would get retransmissions. And if a packet "is not coming out as it was coming in", I would first check the link quality.
Not to dismiss completely the deduplication issue, but... do you have separate UDP streams/files sent over each link? And isn't it easier to simplify the external monitoring and use ICMP, or the SLA monitor under the SDWAN setup?


omedirk
New Contributor II

Hi Alexc,

 

Thanks for your reply.

I will investigate the UDP packets internally today, to see whether both packets are equal at byte level.

It is not a file, however. It's a safety protocol that CRCs its own packets.
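
A minimal sketch of such a byte-level check, assuming the UDP payloads have already been exported from a capture on the sending side and one on the receiving side into two lists of `bytes`. The function name `compare_payloads` and the sample data are hypothetical and not part of any Fortinet tooling.

```python
import hashlib
from collections import Counter


def compare_payloads(sent, received):
    """Compare two lists of UDP payloads (bytes) by content, ignoring order.

    Returns how many sent payloads never arrived unchanged ("missing") and
    how many received payloads have no matching sent payload ("unexpected",
    i.e. altered packets or duplicates that slipped through de-duplication).
    """
    def digest(payload):
        return hashlib.sha256(payload).hexdigest()

    sent_counts = Counter(digest(p) for p in sent)
    recv_counts = Counter(digest(p) for p in received)

    # Counter subtraction keeps only positive counts.
    missing = sum((sent_counts - recv_counts).values())
    unexpected = sum((recv_counts - sent_counts).values())
    return {"missing": missing, "unexpected": unexpected}


if __name__ == "__main__":
    sent = [b"frame-1", b"frame-2", b"frame-3"]
    received = [b"frame-1", b"frame-3", b"frame-3"]
    print(compare_payloads(sent, received))
```

If both counts are zero, the payloads themselves survive duplication and de-duplication intact, which would point toward ordering or timing rather than corruption.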

 

I am quite sure the problem is in duplication / de-duplication because as soon as we drop duplication rules it works fine.

 

Duplication rules also work for the non-safety traffic, but the safety protocol is more sensitive to packet order and retransmissions.

Duplication is done with SD-WAN rules, which duplicate the packets on one end and de-duplicate them on the other.

I am trying to make duplication work because the SLA-based switching between the 4 connections is slow (apparently a 4 s round trip) before traffic is fully switched to another connection. We still use this for other, less critical traffic.

Making this faster is also an option.

 

PS. I am also in contact with support about this issue, but was hoping to find someone here who has had similar issues.

omedirk
New Contributor II

PS.

The slow switching is now solved by using a 'quality' based selection instead of SLA in the SD-WAN SLA config.

Switching is now done continuously based on quality, and after around 1 s when a link dies.

omedirk
New Contributor II

The issue has been investigated further.

After choosing general duplication rules (not bound to an SD-WAN rule), duplication and de-duplication work as long as nothing changes in the paths.

 

It is then possible to take out links while the connection remains up.

 

However, when links come back up, the safety controller detects (I expect) a wrong order of UDP packets and stops communications.

 

A wrong order also results in a CRC error, so the packet itself is not corrupt; it is just the wrong packet at the wrong time. The safety protocol is strict about these things, and it must be, because lives could depend on this communication stream.

 

So at this point the issue seems to be packet order when the links come back up.
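
One way to confirm a reordering suspicion like this is to compare the arrival order against the send order. The following is only a sketch, assuming payloads captured before duplication and after de-duplication are available as lists of `bytes`, and that each payload is unique within the capture (for example because the protocol carries a counter). The name `find_out_of_order` is hypothetical.

```python
def find_out_of_order(sent, received):
    """Return (receive_position, send_index) for every late or repeated packet.

    A packet counts as late when its position in the sent stream is not
    higher than that of a packet that has already been received.
    """
    # Map each payload to the index at which it was first sent.
    sent_index = {}
    for i, payload in enumerate(sent):
        sent_index.setdefault(payload, i)

    highest = -1
    late = []
    for pos, payload in enumerate(received):
        idx = sent_index.get(payload)
        if idx is None:
            continue  # payload not present in the sent capture
        if idx <= highest:
            late.append((pos, idx))  # reordered, or a duplicate that leaked through
        else:
            highest = idx
    return late


if __name__ == "__main__":
    sent = [b"p0", b"p1", b"p2", b"p3"]
    received = [b"p0", b"p2", b"p1", b"p3"]
    print(find_out_of_order(sent, received))
```

Running this on captures taken around the moment a link comes back up would show whether late packets cluster there.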

DanielLeblanc
New Contributor

Hi Omedirk,

Did you find a final solution? I have the exact same problem with a safety PLC.

Thanks
