Description | This article describes how to fix a performance SLA issue with a VSAT link SD-WAN member configured with default parameters. |
Scope | Any supported version of FortiOS. |
Solution |
VSAT links are prone to high latency, which can sometimes exceed 2000 ms (2 seconds). Some documentation estimates VSAT latency at between 600 ms and 2000 ms, but this is a theoretical range that may be exceeded in a production environment. If a Performance SLA is configured with default parameters on an SD-WAN member that is a VSAT link, both the check interval ('interval') and 'probe-timeout' are set to 500 ms, which is the default for each timer. With these defaults, the target server is probed 4 times in 2 seconds even though the link's latency may be 2000 ms, and any probe that does not receive a response within 500 ms is registered as dead. This situation can lead to high CPU usage by the 'lnkmtd' process and consequently cause the performance SLA on the VSAT link to fail (i.e. fall out of the acceptable SLA window), as packet loss mounts up quickly. When this happens, the link-monitor debug output alerts the administrator to the issue.
The link-monitor debug command:
diagnose debug app link-monitor -1
diagnose debug enable
Example log output (only the relevant lines are shown):
lnkmtd::ping_socket_set(95): ---> Fail to connect ping socket for monitor…
lnkmtd::ping_socket_set(95): ---> Fail to connect ping socket for monitor…
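The effect of the timers on a high-latency link can be illustrated with a small calculation. The following Python sketch is only a simplified model and does not reflect how 'lnkmtd' is implemented: it assumes a constant round-trip time with no jitter, and the function name simulate_probes and its parameters are hypothetical. It counts how many probes are sent in a 2-second window and how many are answered before the timeout, first with the default 500 ms timers and then, for comparison, with a 2000 ms interval and a 3000 ms timeout.

```python
def simulate_probes(rtt_ms, interval_ms, timeout_ms, duration_ms):
    """Return (probes sent, probes answered in time) for a simplified link.

    Assumptions (not taken from FortiOS internals): the round-trip time is
    constant, and a probe counts as answered only if the reply arrives
    within timeout_ms.
    """
    sent = 0
    answered = 0
    t = 0
    while t < duration_ms:
        sent += 1
        # A reply that takes longer than the timeout is treated as a lost probe.
        if rtt_ms <= timeout_ms:
            answered += 1
        t += interval_ms
    return sent, answered


# Default timers on a link with 2000 ms latency: every probe times out.
print(simulate_probes(rtt_ms=2000, interval_ms=500, timeout_ms=500, duration_ms=2000))
# -> (4, 0)

# Larger timers on the same link: the single probe is answered in time.
print(simulate_probes(rtt_ms=2000, interval_ms=2000, timeout_ms=3000, duration_ms=2000))
# -> (1, 1)
```

In this model the default timers produce 100% probe loss on a 2000 ms link, which matches the behaviour described above, while a timeout longer than the link's latency lets each probe complete.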
Restarting the 'lnkmtd' process clears the issue only temporarily. The system can work for a few hours before the issue occurs again.
Use the following command to restart the process:
diagnose sys kill 11 <process ID>
The fix for this issue is to increase both the check interval and the 'probe-timeout' timer. For example, the check interval can be increased to 2000 ms and the probe-timeout to 3000 ms. Note that the probe-timeout timer can only be changed in the CLI:
config system sdwan
    config health-check
        edit <name>
            set probe-timeout 3000
            set interval 2000
        next
    end
end |