FortiGate
alafrance
Staff & Editor
Article Id 396631
Description This article describes an optional performance optimization for high-speed PPPoE connections that can be applied via the affinity-packet-redistribution settings available in FortiOS 7.4.0 and later.
Scope FortiOS 7.4.0+.
Solution

When PPPoE is enabled on an interface, the interface mode presents a potential performance issue: PPPoE frames are not supported for acceleration by the NP6, NP6lite, NP6xlite, NP7, or NP7lite network processors.

As a result, no sessions traversing this link are offloaded; all packets are handled by the FortiGate CPU.
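This can be confirmed per session. As a sketch, filter the session list to traffic traversing the PPPoE link (the destination address below is a placeholder) and inspect the output; the no_ofld_reason field indicates why a session was not hardware-accelerated:

diagnose sys session filter dst 203.0.113.10   <----- Placeholder destination; adjust the filter to match real traffic.
diagnose sys session list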

 

By default, load distribution is primarily done at interface ingress, and the resulting interrupts are distributed across the available CPUs via a hash-based distribution computed from whichever L2/L3/L4 headers are available.

 

Because the NP6, NP6lite, NP6xlite, NP7, and NP7lite cannot parse PPPoE session frames (EtherType 0x8864) and cannot look beyond the PPPoE/PPP headers to the IP headers within, only the L2 headers are used to distribute traffic to the available CPUs. Since the active PPPoE session is sourced solely from the modem's single MAC address toward the FortiGate interface MAC address, all ingress PPPoE packets land on one CPU only.
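This single-CPU hotspot can be observed directly while traffic is running on the PPPoE link. As a sketch (the interval value is an example), per-CPU utilization can be sampled with:

diagnose sys mpstat 2   <----- Prints per-CPU usage every 2 seconds; with the default hash distribution, one CPU will show far higher softirq/usage than the rest.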


Capture_PPPoE.PNG

 

Because of this, performance will vary depending on the single-thread performance of the CPU installed in a given FortiGate.

 

For models with x86-64-based processors (typically FG200 and above), this may be less of a concern depending on the speed of the service; for ARM-based FortiGates (SOC1, SOC2, SOC3, SOC4, SP5), the impact can be more significant, and down-link speeds may not reach expectations. The CPU type can be confirmed with the following command:


FG101FTK1900XYZ# diagnose hardware sysinfo cpu
processor : 0
CPU Frequency : 1400 MHZ
model name : ARMv8 Processor rev 4 (v8l)
BogoMIPS : 100.00
Features : fp asimd aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4

From FortiOS 7.4.0 onward, the load-distribution behavior for packets received on an interface can be changed from hash-based to round-robin. This helps better utilize all available CPUs to handle packets in the down-link direction on interfaces in PPPoE mode.

 

config system affinity-packet-redistribution
    edit 2
        set interface "x2"
        set round-robin enable
        set affinity-cpumask "ff" <----- Signifies all CPUs.
    next
end

 

Note that ingress packet re-ordering can occur, and some applications/protocols are sensitive to this. Round-robin should only be used on the physical interface underlying the PPPoE connection. Ideally, the PPPoE interface should be a dedicated physical port, not an aggregate or trunk shared with other VLANs.
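After applying the change, the configuration and its effect can be verified. As a sketch, re-run a down-link speed test and check per-core usage:

show system affinity-packet-redistribution   <----- Confirms the round-robin entry is present.
get system performance status                <----- Shows per-core CPU usage; with round-robin enabled, ingress load should spread more evenly across cores.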

The following is an example of the default behavior (FG101F, X2 is PPPoE, client on X1, service down: 1.5G / up: 1G), where one core runs significantly higher than the rest and the down-link speed cannot be reached.

 

alafrance_0-1749850362372.png

alafrance_1-1749850362646.png

 

In the same environment as above with round-robin distribution, utilization is more even across all cores and the downlink speed is reached:

 

alafrance_2-1749850362585.png

alafrance_3-1749850362646.png
