Express Data Path
Original author(s): Brenden Blanco, Tom Herbert
The idea behind XDP is to add an early hook in the receive (RX) path of the kernel and let a user-supplied eBPF program decide the fate of each packet. The hook is placed in the network interface controller (NIC) driver, just after interrupt processing and before any memory allocation needed by the network stack itself, because memory allocation can be an expensive operation. Thanks to this design, XDP can drop 26 million packets per second per core on commodity hardware.
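Before being loaded, the eBPF program must pass a preverifier test (the kernel's eBPF verifier), which prevents malicious or unsafe code from executing in kernel space. The verifier checks that the program contains no out-of-bounds accesses, no loops and no global variables; more recent kernels relax the last two restrictions and accept bounded loops and global data.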
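The out-of-bounds rule can be illustrated with a minimal user-space sketch. It is not an actual XDP program: `struct pkt_view` and `read_ethertype` are hypothetical names introduced here, standing in for the start and end pointers that a real program receives from the kernel. The point is only the pattern of comparing against the end of the packet before every read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the packet context: the start pointer and the
 * one-past-the-end pointer of the received frame. */
struct pkt_view {
    const uint8_t *data;
    const uint8_t *data_end;
};

/* Read the 16-bit EtherType stored at byte offset 12 of an Ethernet frame.
 * Returns 0 on success, or -1 if the frame is shorter than the 14-byte
 * Ethernet header, in which case nothing is read. */
int read_ethertype(const struct pkt_view *pkt, uint16_t *ethertype)
{
    const size_t eth_hdr_len = 14;

    /* Bounds check first: prove the header lies inside the packet. */
    if ((size_t)(pkt->data_end - pkt->data) < eth_hdr_len)
        return -1;

    /* Network byte order: high byte first. */
    *ethertype = (uint16_t)((pkt->data[12] << 8) | pkt->data[13]);
    return 0;
}
```

A program that performed the read without the preceding length comparison would be the kind of code the verifier is described as rejecting.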
The program is allowed to edit the packet data and, after the eBPF program returns, an action code determines what to do with the packet:
- XDP_PASS: let the packet continue through the network stack
- XDP_DROP: silently drop the packet
- XDP_ABORTED: drop the packet and raise a tracepoint exception
- XDP_TX: bounce the packet back out of the same NIC it arrived on
- XDP_REDIRECT: redirect the packet to another NIC, or to a user-space socket via the AF_XDP address family
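As an illustration of how a program maps a packet to one of these action codes, the following user-space sketch drops IPv4 UDP frames and passes everything else. The enum mirrors the order of the kernel's `enum xdp_action` but is redefined here so the example compiles without kernel headers. The function `filter_frame` and its drop policy are hypothetical, and the sketch assumes untagged Ethernet frames (no VLAN header).

```c
#include <stddef.h>
#include <stdint.h>

/* Action codes in the same order as the kernel's enum xdp_action,
 * redefined locally so this sketch builds in user space. */
enum xdp_action {
    XDP_ABORTED = 0,
    XDP_DROP,
    XDP_PASS,
    XDP_TX,
    XDP_REDIRECT,
};

/* Hypothetical policy: drop IPv4 UDP, pass everything else.
 * data points at the first byte of the frame, data_end one past the last. */
enum xdp_action filter_frame(const uint8_t *data, const uint8_t *data_end)
{
    /* 14-byte Ethernet header plus the minimal 20-byte IPv4 header. */
    if (data_end - data < 34)
        return XDP_PASS; /* too short to classify; leave it to the stack */

    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != 0x0800)
        return XDP_PASS; /* not IPv4 */

    /* The IPv4 protocol field sits 9 bytes into the IP header. */
    if (data[14 + 9] == 17)
        return XDP_DROP; /* UDP */

    return XDP_PASS;
}
```

In a real XDP program the same decision would be returned from the function attached to the driver hook, and the kernel would then act on the returned code.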
XDP requires support in the NIC driver. Because not all drivers support it, XDP can fall back to a generic implementation that performs the eBPF processing in the network stack, at the cost of lower performance.
XDP has infrastructure to offload the eBPF program to a network interface controller that supports it, reducing the CPU load. At the time of writing, only Netronome cards support it, with Intel and Mellanox working on it.
Along with XDP, a new address family, AF_XDP, entered the Linux kernel in version 4.18. AF_XDP, formerly known as AF_PACKETv4 (which was never included in the mainline kernel), is a raw socket optimized for high-performance packet processing that allows zero-copy transfer between the kernel and applications. Because the socket can be used for both receiving and transmitting, it supports high-performance network applications written purely in user space.