An Introduction to the Extended Berkeley Packet Filter
We recently undertook a small project to find out some more about the extended Berkeley Packet Filter (eBPF), what it can do, and about some of the challenges and limitations of working with it.
In this initial article we give a high-level overview of eBPF and offer some advice which might help you to get started with eBPF development projects.
In a subsequent post we will explore a small eBPF program in more depth to give some working examples of eBPF use.
This thing called eBPF
The extended Berkeley Packet Filter (eBPF) first officially appeared in Linux 3.18, which was released in December 2014 -- nearly a decade ago at the time of writing.
At the cutting edge of technology, ten years is a long time.
But on the other hand, in the world of stable software, it's not that long. Operating systems which were contemporary with Linux 3.18 are still being supported today. The cutting edge may move quickly, but not everyone keeps pace with it!
So it is remarkable, perhaps, to see how far eBPF adoption has spread in the relatively short span of a decade.
A visitor to the eBPF website can find a list of prominent users which reads like a who's who of big tech names.
Google, Microsoft, and Netflix use eBPF for monitoring. Meta and Cloudflare use it for load balancing. Apple use it for security monitoring. Android uses it for network and power state tracing.
And yet, for all this success and widespread use, eBPF is a strangely anonymous technology. Unless you've had reason to work with it directly, you'd be forgiven for wondering what it actually is.
So what is eBPF?
Essentially, eBPF is a sandboxed environment in the Linux kernel. Userspace can load more-or-less arbitrary programs as bytecode which can be set to run when certain events occur.
There is a set of predefined events that programs can be attached to (the kind of event a program handles is determined by its "program type"). Should no pre-existing attachment point exist for the functionality you want, kprobes can be used to attach to almost any point of the kernel.
eBPF programs are compiled to bytecode using the clang/LLVM toolchain. The bytecode itself is defined by a special BPF instruction set architecture. The bytecode is executed by a small virtual machine implemented inside the kernel, which on most architectures uses a just-in-time compiler to translate the bytecode to native machine code.
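To make this concrete, here is a minimal sketch of what an eBPF program can look like in C. It is an illustration rather than code from our project: the function name xdp_pass_all is hypothetical, and the SEC macro, which would normally come from libbpf's bpf_helpers.h, is defined locally so that the file depends only on the kernel's linux/bpf.h header.

```c
#include <linux/bpf.h>

/* Normally provided by libbpf's <bpf/bpf_helpers.h>; defined locally here
 * (an assumption made to keep the sketch self-contained). It places a
 * symbol in a named ELF section, which loaders such as libbpf use to
 * work out the program type. */
#define SEC(name) __attribute__((section(name), used))

/* Hypothetical XDP program: it runs for each frame received on the
 * interface it is attached to, and lets every frame continue up the
 * network stack unchanged. */
SEC("xdp")
int xdp_pass_all(struct xdp_md *ctx)
{
    (void)ctx; /* the context is not needed in this trivial example */
    return XDP_PASS;
}

/* Programs declare a license; some helpers are only available to
 * GPL-compatible programs. */
char LICENSE[] SEC("license") = "GPL";
```

Assuming the file is saved as prog.c and kernel headers are installed, a command along the lines of `clang -O2 -g -target bpf -c prog.c -o prog.o` should produce an ELF object containing BPF bytecode, which `llvm-objdump -d prog.o` can disassemble.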
Since eBPF code runs in the kernel, it is necessary to ensure the code won't cause negative side effects. To achieve this, eBPF programs are checked by a verifier which runs as a part of the loading process.
The verifier does extensive checking of the program before allowing it to be used in the kernel. It checks each instruction path, validating register and memory accesses. It also checks for excessive looping. While the verifier is primarily intended to ensure that eBPF programs are safe and cannot compromise the system, it is also useful during development as it helps to catch bugs prior to running the code.
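A common first encounter with the verifier is the packet bounds check. The sketch below, assuming an XDP program and a made-up policy of dropping multicast frames, shows the shape of code the verifier expects: the Ethernet header is only read after the program has proved it lies within the packet. The function name is hypothetical and the SEC macro is defined locally rather than taken from libbpf.

```c
#include <linux/bpf.h>
#include <linux/if_ether.h>

/* Local stand-in for the SEC macro from libbpf's <bpf/bpf_helpers.h>. */
#define SEC(name) __attribute__((section(name), used))

/* Hypothetical XDP program: drop truncated and multicast/broadcast
 * frames, pass everything else. */
SEC("xdp")
int xdp_drop_multicast(struct xdp_md *ctx)
{
    /* The context describes the packet as the range [data, data_end). */
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;

    /* Without this check the verifier would refuse to load the program,
     * because the read below could run past the end of the packet. */
    if ((void *)(eth + 1) > data_end)
        return XDP_DROP;

    /* From here on the whole Ethernet header is known to be in bounds.
     * The low bit of the first destination octet marks a group
     * (multicast or broadcast) address. */
    if (eth->h_dest[0] & 1)
        return XDP_DROP;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```

Removing the bounds check is a quick way to see the verifier in action: the program should then be rejected at load time with a message about an invalid packet access, rather than failing at runtime.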
eBPF code running in kernel context has access to a set of helper APIs. These APIs provide various useful pieces of functionality. For example, there is a helper for logging, and another for access to the network forwarding table. More on the helper API later.
Finally, the kernel allows for interactions with userspace, or between eBPF programs, by means of BPF maps. Maps are persistent data structures which can be populated with arbitrary data by eBPF code or userspace. Different map types exist to fulfil different requirements: for example, there is a simple array type, a hashmap type, etc.
Although this high-level overview may sound relatively dry, it's hard to overstate how powerful eBPF can potentially be. eBPF programs can offer anything from rich custom-defined runtime system metrics to flexible re-imagining of the networking dataplane: the world is your oyster.
Of course, it's not necessary to do anything dramatic with eBPF. Even if the existing Linux kernel out of the box is just fine for your use case, eBPF may be able to provide more sophisticated debugging or analytics tools which would be useful. Getting some insight into eBPF technologies may therefore be beneficial even if it doesn't massively disrupt your existing software stack.
Essential reading for eBPF fundamentals
When we initially started looking at eBPF we found a confusing array of different documents and examples available online.
It can be quite hard to get a sense of how the various moving parts fit together!
Perhaps the best advice for a newcomer starting with eBPF is to start with the basics:
- an understanding of the program types,
- the details of the kernel/userspace interface,
- an understanding of the in-kernel helper API.
eBPF program types
As described earlier, eBPF program types are a way to describe the event that triggers a given eBPF program, and the context that is available to that program when it executes. Discussion of the full set of program types would be a blog post in its own right.
The bpf(2) syscall
Once you have a reasonable sense of what the different program types look like, you need to know how to actually load a program.
At the lowest level, all eBPF interactions with the kernel are driven by the bpf(2) syscall. If you want to load code, you do so with bpf(2). If you want to access a map, you do so with bpf(2).
Therefore a good way to get a fundamental understanding of what the kernel/userspace interface looks like is to read the bpf(2) manpage before delving too deep into the eBPF ecosystem.
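As a taste of what the raw interface looks like, the following sketch creates an array map and reads and writes one of its elements using nothing but bpf(2). glibc provides no wrapper for the syscall, so it is invoked through syscall(2). The wrapper names (sys_bpf, array_map_create, map_update, map_lookup) are our own inventions for illustration; in practice most projects would use libbpf rather than hand-rolling these.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw syscall (hypothetical name). */
static long sys_bpf(int cmd, union bpf_attr *attr)
{
    return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/* Create an array map with 32-bit keys. Returns a file descriptor
 * referring to the map, or -1 with errno set. */
int array_map_create(uint32_t value_size, uint32_t max_entries)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr)); /* unused fields must be zero */
    attr.map_type = BPF_MAP_TYPE_ARRAY;
    attr.key_size = sizeof(uint32_t);
    attr.value_size = value_size;
    attr.max_entries = max_entries;
    return (int)sys_bpf(BPF_MAP_CREATE, &attr);
}

/* Store *value at *key. Returns 0 on success, -1 with errno set. */
int map_update(int fd, const void *key, const void *value)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = fd;
    attr.key = (uint64_t)(uintptr_t)key;
    attr.value = (uint64_t)(uintptr_t)value;
    attr.flags = BPF_ANY; /* create the entry or overwrite it */
    return (int)sys_bpf(BPF_MAP_UPDATE_ELEM, &attr);
}

/* Copy the element at *key into *value. Returns 0 on success. */
int map_lookup(int fd, const void *key, void *value)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = fd;
    attr.key = (uint64_t)(uintptr_t)key;
    attr.value = (uint64_t)(uintptr_t)value;
    return (int)sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr);
}
```

A caller would create the map, pass the returned descriptor to map_update and map_lookup, and close(2) it when finished. Creating maps usually requires elevated privileges (root or CAP_BPF), so on many systems array_map_create will return -1 when run as an ordinary user.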
bpf-helpers
The other very useful manpage to at least skim before beginning your eBPF journey is bpf-helpers(7), which documents the in-kernel helper functions that eBPF programs can call.
A crucial thing to note when exploring the in-kernel helpers API is that not all helpers are compatible with all program types, nor with all map types.
Working out which helpers can be called from which context is a matter of examining the appropriate kernel code: happily, the manpage points you to the specific kernel sources to look at.
Understanding which map types and helper APIs are available to each program type is essential when designing your eBPF program.
Understanding eBPF portability
Since eBPF is compiled to bytecode for a defined ISA, eBPF portability is less about portability across architectures than about portability across different kernels.
The Linux kernel does not guarantee stable internal APIs from one release to the next.
This naturally presents a problem for eBPF code, which gains much of its utility from being able to poke around in kernel internals.
Don't we just end up with much the same issues as we see when trying to maintain kernel modules out-of-tree?
Well, yes and no.
Data structure dependencies
Depending on your program, you may find that the existing eBPF API in the kernel provides a stable view of the data structure you're interested in.
An example of such a view is struct __sk_buff.
This structure is a mirror of the in-kernel struct sk_buff that represents a network frame.
The mirrored version of the structure exposes a subset of fields from the internal structure, and has a stable layout (new fields added over time are added to the end of the structure).
eBPF programs which deal with network frames are passed a pointer to a struct __sk_buff when they execute. When the eBPF program accesses the network frame through the pointer, the kernel rewrites these accesses as required to read and write the underlying internal data structure. This rewriting is done by the verifier when the program is loaded, rather than on each access at runtime.
By implementing this layer of indirection, an eBPF program written to access a field in struct __sk_buff doesn't need to know about changes in the underlying structure. It can rely on the kernel it is loaded into rewriting accesses as necessary, such that reads and writes end up at the correct memory location.
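From the program's point of view, using the stable view is just ordinary structure access. The sketch below assumes a program attached at the traffic control (tc) hook and a made-up policy of dropping frames above an arbitrary length; the function name, the threshold, and the locally defined SEC macro are all illustrative assumptions. The "tc" section name is the one recent libbpf versions recognise; older versions may expect "classifier" instead.

```c
#include <linux/bpf.h>
#include <linux/pkt_cls.h>

/* Local stand-in for the SEC macro from libbpf's <bpf/bpf_helpers.h>. */
#define SEC(name) __attribute__((section(name), used))

/* Hypothetical threshold, chosen only for illustration. */
#define MAX_FRAME_LEN 1500

SEC("tc")
int tc_drop_oversized(struct __sk_buff *skb)
{
    /* skb->len refers to a field of the stable mirror structure. The
     * program never sees the layout of the kernel's own struct sk_buff;
     * the kernel translates this access for us. */
    if (skb->len > MAX_FRAME_LEN)
        return TC_ACT_SHOT; /* drop the frame */

    return TC_ACT_OK; /* let the frame continue */
}

char LICENSE[] SEC("license") = "GPL";
```

Because the program only refers to struct __sk_buff, the same compiled object should keep working even if a later kernel moves the length field within its internal structure.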
However, not all kernel structures have a corresponding stable eBPF view. eBPF programs which access internal structures directly have to accommodate, in some way, internal fields being renamed or relocated.
There have been two approaches to dealing with this to date.
The first can be found in the BPF Compiler Collection (bcc). This project provides a set of tooling which allows eBPF C code to be embedded in a userspace program as a string literal which is then compiled using clang/LLVM, and loaded into the kernel when the userspace program is executed.
This allows you to side-step data structure changes by compiling your eBPF code against the running kernel prior to running it.
The bcc approach does come with some downsides: because the eBPF code is compiled on the end-user machine, the bcc code has to bundle the clang/LLVM toolchain, and the machine must have kernel header packages installed. The build overhead may also be undesirable in some contexts.
More recently, the BPF Compile Once Run Anywhere (BPF CO-RE) initiative has taken an alternative approach to the problem.
BPF CO-RE works by recording type information, in a compact format known as BTF (BPF Type Format), from the kernel the eBPF bytecode is built against, and then using that information to reconcile data structure accesses with the layout of the kernel the bytecode runs on.
Support for CO-RE is provided as a part of libbpf, which is developed as a part of the Linux kernel tree.
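The idea of reconciling accesses by name can be illustrated with a toy model. To be clear, this is not how libbpf is implemented: the real mechanism works on BTF and relocation records embedded in the compiled object. Every type, layout and function name below is hypothetical, and the sketch only aims to show why recording the field name alongside the offset is enough to survive a layout change.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for kernel type information: a field name and its byte
 * offset within some structure, for one particular kernel build. */
struct field_info {
    const char *name;
    long offset;
};

/* Hypothetical layouts of the same structure on two kernels. On the
 * target kernel a new field has been added at the front, which shifts
 * the fields that follow it. */
const struct field_info build_layout[] = {
    { "pid", 0 }, { "comm", 8 },
};
const struct field_info target_layout[] = {
    { "flags", 0 }, { "pid", 8 }, { "comm", 16 },
};

/* A recorded access: the field's name plus the offset assumed when the
 * program was built. */
struct field_access {
    const char *field;
    long offset;
};

/* Look a field up by name. Returns its offset, or -1 if it is absent. */
long find_offset(const struct field_info *layout, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(layout[i].name, name) == 0)
            return layout[i].offset;
    }
    return -1;
}

/* Patch a recorded access to match the given layout. Returns 0 on
 * success, or -1 if the layout has no field of that name. */
int relocate_access(struct field_access *acc,
                    const struct field_info *layout, size_t n)
{
    long off = find_offset(layout, n, acc->field);

    if (off < 0)
        return -1;
    acc->offset = off;
    return 0;
}
```

In this model an access to "pid" built against build_layout starts out at offset 0, and relocating it against target_layout moves it to offset 8. A field that no longer exists on the target cannot be relocated, which mirrors the fact that CO-RE can adapt to moved fields but cannot conjure up data the running kernel does not have.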
Kernel eBPF helper dependencies
Data structure dependencies are one aspect of eBPF portability.
Another important factor is the kernel-side helper API, which is also evolving over time.
This can be an issue if your eBPF program depends on a helper function which isn't available in the run-time kernel, or whose behaviour has changed over time.
Projects which control the deployment environment (for example, embedded devices, or some cloud services) may be able to work around this by ensuring that the target kernel supports the eBPF functionality required.
On the other hand, if you need to deploy to arbitrary kernels, it's a more difficult problem to solve. Realistically you'll need to call out the dependency on a specific minimum kernel version required by the eBPF program.
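One simple way to enforce such a requirement is to have the userspace loader check the running kernel's release before attempting to load anything. The sketch below does this with uname(2); the function names are hypothetical, and any minimum version you pass in is something you would need to establish for your own program. Bear in mind that a version number is only a rough proxy, since distribution kernels often backport features.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Compare a kernel release string such as "5.15.0-91-generic" against a
 * required major.minor version. Returns 1 if the release is at least
 * that version, 0 if it is older, and -1 if it cannot be parsed. */
int release_at_least(const char *release, int major, int minor)
{
    int maj = 0, min = 0;

    if (sscanf(release, "%d.%d", &maj, &min) != 2)
        return -1;
    if (maj != major)
        return maj > major;
    return min >= minor;
}

/* Apply the same check to the kernel we are currently running on.
 * Returns -1 if the release cannot be determined. */
int running_kernel_at_least(int major, int minor)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_at_least(u.release, major, minor);
}
```

A loader might call running_kernel_at_least(5, 8), where 5.8 is an arbitrary example, and exit with a clear error message if the result is not 1, rather than leaving the user to decipher a load failure.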
Dependencies on the kernel eBPF API are certainly worth bearing in mind when working on eBPF code. It makes sense to consider them in the same way you'd consider any other kernel dependency when specifying requirements for the target runtime environment.
In conclusion
In this article we've presented a high-level overview of what eBPF is and what it can offer. We have given some pointers on where to start with eBPF development, and summarised some of the issues you might need to consider when planning to use eBPF in a project.
In our next eBPF post we plan to present a worked example of an eBPF program, and explore the journey we went on when developing the program.