Berkeley Packet Filter


The Berkeley Packet Filter (BPF) provides a raw interface to data link layers, permitting raw link-layer packets to be sent and received. It is available on most Unix-like operating systems. In addition, if the driver for the network interface supports promiscuous mode, BPF allows the interface to be put into that mode so that all packets on the network can be received, even those destined for other hosts.

BPF supports packet filtering: a userspace process can supply a filter program that specifies which packets it wants to receive. For example, a tcpdump process may want to receive only packets that initiate a TCP connection. BPF returns only the packets that pass the filter supplied by the process. This avoids copying unwanted packets from the operating system kernel to the process, which can greatly improve performance.
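As an illustration of the decision such a filter makes, the following is a minimal plain-Python sketch of a "packets that initiate a TCP connection" test. It is not BPF code and does not run in the kernel; it assumes Ethernet framing and IPv4, and the names `initiates_tcp_connection` and `make_test_frame` are hypothetical helpers introduced only for this example.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6
TCP_SYN = 0x02
TCP_ACK = 0x10
ETH_HEADER_LEN = 14


def initiates_tcp_connection(frame: bytes) -> bool:
    """Return True for an Ethernet/IPv4 frame carrying a TCP SYN without ACK."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    ip = ETH_HEADER_LEN
    header_len = (frame[ip] & 0x0F) * 4  # IPv4 header length in bytes
    if header_len < 20 or frame[ip + 9] != IPPROTO_TCP:
        return False
    (frag_field,) = struct.unpack_from("!H", frame, ip + 6)
    if frag_field & 0x1FFF:  # later fragments carry no TCP header
        return False
    flags_at = ip + header_len + 13  # TCP flags byte
    if flags_at >= len(frame):
        return False
    return frame[flags_at] & (TCP_SYN | TCP_ACK) == TCP_SYN


def make_test_frame(tcp_flags, ethertype=ETHERTYPE_IPV4, protocol=IPPROTO_TCP):
    """Hypothetical helper: build a bare Ethernet + IPv4 + TCP frame for testing."""
    ethernet = bytes(12) + struct.pack("!H", ethertype)
    ip_header = bytearray(20)
    ip_header[0] = 0x45  # version 4, header length 5 words
    ip_header[9] = protocol
    tcp_header = bytearray(20)
    tcp_header[13] = tcp_flags
    return ethernet + bytes(ip_header) + bytes(tcp_header)


frames = [
    make_test_frame(TCP_SYN),
    make_test_frame(TCP_SYN | TCP_ACK),
    make_test_frame(TCP_ACK),
]
# Only frames that pass the test would be handed to the capturing process.
delivered = [f for f in frames if initiates_tcp_connection(f)]
```

With BPF the equivalent test is expressed as a filter program and evaluated before any copy to the process takes place, so rejected frames never cross into userspace.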

BPF is sometimes used to refer just to the filtering mechanism, rather than to the entire interface. Some systems, such as Linux and Tru64 UNIX, provide a raw interface to the data link layer other than the BPF raw interface but use the BPF filtering mechanisms for that raw interface.

Raw interface

BPF provides pseudo-devices that can be bound to a network interface; reads from the device will read buffers full of packets received on the network interface, and writes to the device will inject packets on the network interface.

In 2007, Robert Watson and Christian Peron added zero-copy buffer extensions to the BPF implementation in the FreeBSD operating system,[1] allowing kernel packet capture in the device driver interrupt handler to write directly to user process memory in order to avoid the requirement for two copies for all packet data received via the BPF device. While one copy remains in the receipt path for user processes, this preserves the independence of different BPF device consumers, as well as allowing the packing of headers into the BPF buffer rather than copying complete packet data.[2]


Filtering

BPF's filtering capabilities are implemented as an interpreter for the machine language of the BPF virtual machine. Programs in that language can fetch data from the packet, perform arithmetic operations on that data, and compare the results against constants or against other data in the packet, or test bits in the results. The packet is accepted or rejected based on the outcome of those tests.
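The fetch, compare, and accept-or-reject cycle described above can be sketched as a toy interpreter. This is a simplified illustration under stated assumptions, not a kernel implementation: it handles only a handful of instructions, the function name `run_filter` is hypothetical, and the numeric opcodes are intended to match the classic BPF encoding of (code, jt, jf, k) tuples.

```python
import struct

# A small subset of classic BPF instructions (code, jt, jf, k).
LD_W_ABS = 0x20    # A = 32-bit word at packet[k]
LD_H_ABS = 0x28    # A = 16-bit halfword at packet[k]
LD_B_ABS = 0x30    # A = byte at packet[k]
ALU_AND_K = 0x54   # A = A & k
JMP_JEQ_K = 0x15   # skip jt instructions if A == k, else skip jf
JMP_JSET_K = 0x45  # skip jt instructions if A & k, else skip jf
RET_K = 0x06       # return k: 0 rejects, non-zero is the byte count to keep

_LOADS = {LD_W_ABS: ("!I", 4), LD_H_ABS: ("!H", 2), LD_B_ABS: ("!B", 1)}


def run_filter(program, packet):
    """Interpret a filter program against one packet and return its verdict."""
    a = 0   # accumulator register
    pc = 0  # program counter
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code in _LOADS:
            fmt, size = _LOADS[code]
            if k + size > len(packet):
                return 0  # a load past the end of the packet rejects it
            (a,) = struct.unpack_from(fmt, packet, k)
        elif code == ALU_AND_K:
            a &= k
        elif code == JMP_JEQ_K:
            pc += jt if a == k else jf
        elif code == JMP_JSET_K:
            pc += jt if a & k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    return 0  # falling off the end rejects the packet


# Accept Ethernet frames whose EtherType field (offset 12) is IPv4.
IPV4_ONLY = [
    (LD_H_ABS, 0, 0, 12),
    (JMP_JEQ_K, 0, 1, 0x0800),
    (RET_K, 0, 0, 0x40000),  # accept, keeping up to 262144 bytes
    (RET_K, 0, 0, 0),        # reject
]
```

Calling `run_filter(IPV4_ONLY, frame)` returns a non-zero value for an IPv4 frame and 0 for anything else. Jump offsets are relative to the next instruction and only move forward in this sketch, so a program cannot loop.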

Traditional Unix-like BPF implementations can be used in userspace, despite being written for kernel space. This is accomplished with conditional compilation, using preprocessor directives to select the kernel or userspace variant of the code.

Extensions and optimizations

Some projects use BPF instruction sets or execution techniques different from the originals.

Some platforms, including FreeBSD, NetBSD, and WinPcap, use a just-in-time (JIT) compiler to convert BPF instructions into native code in order to improve performance. Linux includes a BPF JIT compiler which is disabled by default.

Kernel-mode interpreters for that same virtual machine language are used in raw data link layer mechanisms in other operating systems, such as Tru64 Unix, and for socket filters in the Linux kernel and in the WinPcap packet capture mechanism. Since version 3.18, the Linux kernel includes an extended BPF virtual machine, termed extended BPF (eBPF). It can be used for non-networking purposes, such as for attaching eBPF programs to various tracepoints.[3][4][5] Since kernel version 3.19, eBPF filters can be attached to sockets,[6][7] and, since kernel version 4.1, to traffic control classifiers for the ingress and egress networking data path.[8][9]

A user-mode interpreter for BPF is provided with the libpcap/WinPcap implementation of the pcap API, so that packets can be filtered in user mode when capturing on systems without kernel-mode support for that filtering mechanism. Code using the pcap API works on both types of systems. On systems where the filtering is done in user mode, however, all packets, including those that will be filtered out, are copied from the kernel to user space. That interpreter can also be used when reading a file containing packets captured using pcap.
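The file-reading case can be sketched as follows. This is a minimal illustration, not libpcap's implementation: it assumes the classic pcap file layout (a 24-byte file header followed by records with a 16-byte header each), it uses an ordinary Python callable in place of a BPF program, and the names `iter_pcap_packets` and `filter_capture` are hypothetical.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap file, microsecond timestamps


def iter_pcap_packets(data: bytes):
    """Yield the raw bytes of each packet record in a classic pcap file image."""
    if len(data) < 24:
        raise ValueError("too short to be a pcap file")
    # The magic number tells us which byte order the writer used.
    for endian in ("<", ">"):
        if struct.unpack_from(endian + "I", data, 0)[0] == PCAP_MAGIC:
            break
    else:
        raise ValueError("not a classic microsecond pcap file")
    offset = 24  # skip the file header
    while offset + 16 <= len(data):
        _ts_sec, _ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += 16
        packet = data[offset:offset + incl_len]
        if len(packet) < incl_len:
            break  # truncated final record
        offset += incl_len
        yield packet


def filter_capture(data: bytes, keep):
    """Apply a user-mode filter to every packet already stored in the file."""
    return [packet for packet in iter_pcap_packets(data) if keep(packet)]
```

Here every stored packet is read before the filter sees it, which mirrors the cost noted above for user-mode filtering: the filter reduces what the application processes, not what is read.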


History

The original paper describing BPF was written by Steven McCanne and Van Jacobson in 1992 while at Lawrence Berkeley Laboratory.[10][11]

In August 2003, the SCO Group publicly claimed that the Linux kernel infringed Unix code that it owned. Programmers quickly discovered that the code in question was the Berkeley Packet Filter, which SCO in fact never owned. SCO has not explained or acknowledged the mistake, but the ongoing legal action may eventually force an answer.

Security concerns

A Spectre attack may leverage the Linux kernel's eBPF JIT compiler to extract data from other kernel processes and allow userspace code to read it.[12]

See also


References

  1. "bpf(4) Berkeley Packet Filter". FreeBSD. 2010-06-15.
  2. Watson, Robert N. M.; Peron, Christian S. J. (2007-03-09). "Zero-Copy BPF" (PDF).
  3. "Linux kernel 3.18, Section 1.3. bpf() syscall for eBFP virtual machine programs". December 7, 2014. Retrieved January 19, 2015.
  4. Jonathan Corbet (September 24, 2014). "The BPF system call API, version 14". Retrieved January 19, 2015.
  5. Jonathan Corbet (July 2, 2014). "Extending extended BPF". Retrieved January 19, 2015.
  6. "Linux kernel 3.19, Section 11. Networking". February 8, 2015. Retrieved February 13, 2015.
  7. Jonathan Corbet (December 10, 2014). "Attaching eBPF programs to sockets". Retrieved February 13, 2015.
  8. "Linux kernel 4.1, Section 11. Networking". June 21, 2015. Retrieved October 17, 2015.
  9. "BPF and XDP Reference Guide". April 24, 2017. Retrieved April 23, 2018.
  10. McCanne, Steven; Jacobson, Van (1992-12-19). "The BSD Packet Filter: A New Architecture for User-level Packet Capture" (PDF).
  11. McCanne, Steven; Jacobson, Van (January 1993). "The BSD Packet Filter: A New Architecture for User-level Packet Capture". USENIX.
  12. "Reading privileged memory with a side-channel". Project Zero team at Google. January 3, 2018. Retrieved January 20, 2018.

External links