eBPF and API Security with Traceable

Amod Gupta
Dan Gordon
July 15, 2022

You might hear the term “eBPF” (or just “BPF”) come up when talking to DevOps and DevSecOps folks about network, application, infrastructure, or security management. eBPF is based on a well-established kernel technology that has gained renewed interest in today’s containerized world. It opens up new possibilities for how monitoring and other capabilities can be built on top of the operating systems most used in the cloud.

As developers learn to use the capabilities eBPF provides, it promises to radically advance what our infrastructure, application, and security tools will be able to do. Since Traceable is one of those security tools, a fair question is how it is compatible with or leverages eBPF. Here’s some background on what eBPF is, how it works, and why Traceable is the leading API Security solution working with eBPF.

What is eBPF?

eBPF (extended Berkeley Packet Filter) is a kernel feature that has shipped with Linux kernels since 2014, the same year the first Kubernetes commit was made [1, 2, 3]. Unlike most developer code, which runs in user space, eBPF programs run in kernel space, which brings distinct advantages in terms of performance and resource consumption.

eBPF is very popular with teams that need to operate in high-performance environments. For example, Netflix has about 15 eBPF programs running on every server instance. Facebook has about 40 eBPF programs active on every server, with another 100 eBPF programs that get spawned and killed as needed [4].


How does eBPF work?

eBPF provides a low-level, event-driven framework that lets programmers run code in the kernel’s privileged context and capture how the kernel reacts to defined triggers such as network events, system calls, function entries, and kernel tracepoints. Effectively, this means that eBPF lets user space programs read and react to data about what the kernel is doing.
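To make the event-driven model concrete, here is a minimal user-space sketch in Python. It is a toy illustration only, not real eBPF: actual programs are compiled to BPF bytecode and loaded into the kernel. The hook names, the `HookRegistry` class, and the `syscall_counts` dictionary (standing in for an eBPF map) are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Toy user-space model of an event-driven hook framework (NOT real eBPF).
Handler = Callable[[dict], None]


class HookRegistry:
    """Maps hook names (hypothetical labels) to the handlers attached to them."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def attach(self, hook: str, handler: Handler) -> None:
        self._handlers[hook].append(handler)

    def fire(self, hook: str, event: dict) -> None:
        # Only handlers attached to this specific hook run.
        for handler in self._handlers.get(hook, []):
            handler(event)


registry = HookRegistry()
syscall_counts: Dict[str, int] = defaultdict(int)  # stands in for an eBPF map


def count_syscall(event: dict) -> None:
    syscall_counts[event["name"]] += 1


registry.attach("syscall_enter", count_syscall)

# Simulated kernel activity: four system calls and one network event.
for name in ["openat", "read", "read", "close"]:
    registry.fire("syscall_enter", {"name": name})
registry.fire("net_rx", {"bytes": 1500})  # no handler attached, so nothing runs

print(dict(syscall_counts))
```

The point of the sketch is the shape of the interaction: code is attached to a named trigger, runs only when that trigger fires, and leaves its results in a shared structure that a user space reader can inspect later.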

eBPF protects the kernel and the other processes running on it in two ways: loading a program requires the appropriate privileges, and the kernel validates each program before running it in a sandbox. eBPF is native in all modern-day Linux kernels and is also available for Windows, so the framework is already widespread, especially in cloud-based applications.
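As a rough illustration of what validation before execution can mean, the sketch below checks a program written in a made-up four-opcode instruction set and accepts it only if every path is guaranteed to terminate. This is an assumption-laden simplification: the real eBPF verifier does far more (for example, tracking register state and memory access), and the opcodes, the `verify` function, and the instruction cap here are all invented for the example.

```python
from typing import List, Tuple

Instruction = Tuple[str, int]  # (opcode, operand) in a made-up mini instruction set
KNOWN_OPS = {"load", "add", "jmp", "ret"}
MAX_INSTRUCTIONS = 4096  # arbitrary cap chosen for this sketch


def verify(program: List[Instruction]) -> Tuple[bool, str]:
    """Accept a program only if every execution path provably reaches 'ret'."""
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSTRUCTIONS:
        return False, "program too long"
    if program[-1][0] != "ret":
        return False, "last instruction must be ret"
    for index, (op, operand) in enumerate(program):
        if op not in KNOWN_OPS:
            return False, f"unknown opcode at {index}: {op}"
        if op == "jmp":
            # The operand is a relative offset: target = index + 1 + operand.
            if operand < 0:
                return False, f"backward jump at {index}"
            if index + 1 + operand >= len(program):
                return False, f"jump out of bounds at {index}"
    return True, "ok"


safe = [("load", 0), ("jmp", 1), ("add", 1), ("ret", 0)]
looping = [("load", 0), ("jmp", -2), ("ret", 0)]
print(verify(safe))
print(verify(looping))
```

Because jumps may only move forward and must stay in bounds, the instruction index strictly increases on every step, so execution always ends at a `ret`. Rejecting a program at load time, rather than trusting it at run time, is the design idea this sketch is meant to convey.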

Applications in the industry

eBPF was initially used as a way to increase observability and security when filtering network packets. However, over time, it has become a way to make the implementation of user-supplied code safer, more convenient, and better performing. eBPF has become increasingly popular and is now used for many applications.

Big cloud companies like Netflix, Facebook, AWS, Google, and Microsoft are providing new cloud capabilities and tools using eBPF. There are also several newer tools built on eBPF, such as Isovalent’s Cilium for cloud-native networking, security, and observability; New Relic’s Pixie for Kubernetes observability; and Meta's Katran, a high-performance layer 4 load balancer. Examples of other projects using eBPF can be found on the eBPF site at https://ebpf.io/projects.

How Traceable uses eBPF

Traceable is the only API security vendor that provides an eBPF-based solution for capturing API security-related data from application environments. Due to its agentless nature and low resource requirements, it’s a popular option with our customers in production, especially in industries such as FinTech that have high-performance requirements.

The eBPF-based data collection can show deep API traffic data (request/response headers and bodies/payloads) for both North-South and East-West traffic. This data collection is out-of-band, non-invasive, fast, and highly efficient because it runs at the kernel level. What’s more, this efficiency, combined with Traceable technology, results in near-zero overhead (< 1 ms latency difference) on instrumented applications.

The data collected by the eBPF instrumentation is processed by the same Traceable platform (AI/ML, data processing, analysis, etc.) as the data from all the other Traceable data collectors, enabling the same set of capabilities: API discovery and risk management, API protection, and API security analytics.

How does Traceable eBPF-based data collection work?

Traceable eBPF instrumentation is one of our agentless deployment options and can be deployed via Helm charts or scripts as a Kubernetes DaemonSet. It attaches probes to kernel functions and collects data each time those functions execute. The probes are attached to network socket operations such as open, connect, read, write, and close calls.
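One way to picture what happens with the data from those socket probes is the sketch below, which groups a stream of per-call events back into one exchange per connection. It is a simplified illustration under stated assumptions, not Traceable's implementation: the `SocketEvent` record, its field names, and the `reassemble` function are hypothetical, and connections are keyed by process ID and file descriptor purely for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SocketEvent:
    """Hypothetical record a socket probe might emit; field names are assumed."""
    pid: int
    fd: int
    call: str          # "open", "connect", "read", "write", or "close"
    data: bytes = b""


@dataclass
class Connection:
    sent: bytearray = field(default_factory=bytearray)
    received: bytearray = field(default_factory=bytearray)


def reassemble(events: List[SocketEvent]) -> List[Tuple[bytes, bytes]]:
    """Group events by (pid, fd); return (sent, received) for each closed connection."""
    open_conns: Dict[Tuple[int, int], Connection] = {}
    finished: List[Tuple[bytes, bytes]] = []
    for ev in events:
        key = (ev.pid, ev.fd)
        if ev.call in ("open", "connect"):
            open_conns.setdefault(key, Connection())  # start tracking this socket
        elif ev.call == "write" and key in open_conns:
            open_conns[key].sent.extend(ev.data)
        elif ev.call == "read" and key in open_conns:
            open_conns[key].received.extend(ev.data)
        elif ev.call == "close" and key in open_conns:
            conn = open_conns.pop(key)
            finished.append((bytes(conn.sent), bytes(conn.received)))
    return finished


events = [
    SocketEvent(101, 7, "connect"),
    SocketEvent(101, 7, "write", b"GET /users HTTP/1.1\r\n"),
    SocketEvent(101, 9, "connect"),  # a second socket that is never closed
    SocketEvent(101, 7, "write", b"Host: api.example\r\n\r\n"),
    SocketEvent(101, 7, "read", b"HTTP/1.1 200 OK\r\n\r\n"),
    SocketEvent(101, 7, "close"),
]
exchanges = reassemble(events)
print(exchanges)
```

Here "sent" and "received" are from the point of view of the traced process, so for a client the sent bytes are the request and the received bytes are the response. A production collector would also need to handle details this sketch ignores, such as partial reads, file descriptor reuse, and encrypted traffic.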

The eBPF deployment can be configured to capture data from all pods/containers or only specific ones, based on deployment-time configuration (annotations). For instance, Traceable can be set to capture data only from nginx-ingress containers, or only from backend services. It can additionally collect ingress or egress data depending on whether the container in question is a gateway or a backend service.
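Annotation-driven selection of this kind can be sketched as a simple filter over pod metadata. The example below is illustrative only: the annotation key `example.com/capture-mode` and its values are hypothetical placeholders, not Traceable's actual configuration, and the pod dictionaries merely mimic the general shape of Kubernetes pod metadata.

```python
from typing import Dict, List, Optional

CAPTURE_ANNOTATION = "example.com/capture-mode"  # hypothetical key, not Traceable's
VALID_MODES = {"ingress", "egress", "both"}


def capture_mode(pod: Dict) -> Optional[str]:
    """Return the capture mode requested by a pod's annotation, or None to skip it."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    mode = annotations.get(CAPTURE_ANNOTATION)
    return mode if mode in VALID_MODES else None


def select_pods(pods: List[Dict]) -> Dict[str, str]:
    """Map pod name to capture mode for every pod that opts in."""
    selected: Dict[str, str] = {}
    for pod in pods:
        mode = capture_mode(pod)
        if mode is not None:
            selected[pod["metadata"]["name"]] = mode
    return selected


pods = [
    {"metadata": {"name": "nginx-ingress-abc",
                  "annotations": {CAPTURE_ANNOTATION: "ingress"}}},
    {"metadata": {"name": "orders-backend-xyz",
                  "annotations": {CAPTURE_ANNOTATION: "egress"}}},
    {"metadata": {"name": "batch-job-123", "annotations": None}},
    {"metadata": {"name": "cache-456",
                  "annotations": {CAPTURE_ANNOTATION: "everything"}}},  # invalid value
]
print(select_pods(pods))
```

In this sketch a pod without the annotation, or with an unrecognized value, is simply skipped, which keeps the default behavior conservative.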

Unlike pod mirroring or sidecar-based data capture solutions, which need to be deployed in each pod, the eBPF-based DaemonSet runs as one pod per node, significantly reducing the resource footprint across the cluster. For the same reason, it also reduces network chattiness.
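The difference in footprint is easy to estimate with back-of-the-envelope arithmetic. The sketch below compares the number of collector instances under each model, assuming for simplicity that every application pod would carry a sidecar and that pods are spread evenly across nodes. The cluster sizes are made-up inputs, and `collector_count` is a hypothetical helper for this example.

```python
def collector_count(nodes: int, pods_per_node: int) -> dict:
    """Compare collector instances: one per pod (sidecar) vs. one per node (DaemonSet)."""
    if nodes <= 0 or pods_per_node <= 0:
        raise ValueError("nodes and pods_per_node must be positive")
    sidecar = nodes * pods_per_node   # assumes every pod is instrumented
    daemonset = nodes                 # one collector pod per node
    return {"sidecar": sidecar, "daemonset": daemonset, "ratio": sidecar / daemonset}


# Hypothetical cluster: 10 nodes with 30 application pods each.
estimate = collector_count(nodes=10, pods_per_node=30)
print(estimate)
```

Under these assumptions the number of collectors drops by a factor equal to the pod density per node, which is why the savings grow as nodes are packed more tightly.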

The following diagram shows a high-level flow of how Traceable's eBPF-based data collection works:

Where is this headed?

Traceable is continuing to lead in leveraging eBPF to increase the effectiveness and efficiency of API security. We work closely with customers using eBPF in their production and pre-production environments to keep pushing the envelope on how eBPF can help improve their API security posture.

In general, as eBPF continues to become more and more mainstream, and the industry learns more about how to create value on top of it, the list of innovations will continue to grow. With eBPF, the future of infrastructure, application, and security management will be very exciting.

Learn more

Want to learn more about eBPF and Traceable from our CTO and our technical team? Join our upcoming webinar, where we'll dive into the details.

About the Authors

Amod Gupta is Product Manager at Traceable AI.

Dan Gordon is a Technical Evangelist at Traceable AI.

You can learn more about Traceable AI and how it observes, analyzes, and secures APIs. Depending on your role and the needs of your organization, there are multiple options to get started with Traceable AI and its many options for observability and API security.
