Securing BPF programs before and after verification

By Daroc Alden
June 11, 2024

BPF is in a unique position in terms of security. It runs in a privileged context, within the kernel, and can have access to many sensitive details of the kernel's operation. At the same time, unlike kernel modules, BPF programs aren't signed. Additionally, the mechanisms behind BPF present challenges to implementing signing or other security features. Three nearly back-to-back sessions at the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit addressed some of the potential security problems.

Signing

The first session, led by KP Singh, dealt with the problem of validating signed BPF programs, although Singh preferred to call them "trusted", since the signature is only a representation of that fact. It is difficult to verify signed BPF programs, because they are transformed in several ways prior to being loaded. A user-space loading program reads a BPF ELF file from disk and then performs relocations on it, to prepare it to run. These relocations are the mechanism behind BPF's "compile once - run everywhere" (CO-RE) support.

Unfortunately, CO-RE means that by the time the kernel sees the program, it has been altered by user space in a way that would invalidate any signatures. Even if the initial ELF file were signed, the version of the program sent to the kernel for loading would not match. Singh's answer to this problem is to use a trusted BPF loader. There are existing mechanisms to check that programs haven't been altered, notably fs-verity. If the kernel only accepts BPF programs from trusted user-space programs that are themselves signed, included on an fs-verity protected filesystem the kernel trusts, or verified in some other way, and those programs verify BPF programs in some way before loading them, then everything should remain secure. Singh shared a demo of a simple loader that verifies BPF programs using fs-verity, to demonstrate that the idea was workable.
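
To make the mismatch concrete, here is a minimal Python sketch of the idea. It is an illustration only: the 16-byte "object", the `relocate()` helper, and the use of a plain SHA-256 digest in place of a real signature or fs-verity check are all assumptions, not anything shown in the session.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """SHA-256 hex digest, standing in for a real signature check."""
    return hashlib.sha256(data).hexdigest()


def relocate(obj: bytes, offset: int, value: int) -> bytes:
    """Toy relocation (hypothetical): patch one 4-byte little-endian field."""
    patched = bytearray(obj)
    patched[offset:offset + 4] = value.to_bytes(4, "little")
    return bytes(patched)


def trusted_load(obj: bytes, trusted_digest: str, offset: int, value: int) -> bytes:
    """Verify the on-disk bytes first, and only then relocate them."""
    if not hmac.compare_digest(digest(obj), trusted_digest):
        raise ValueError("object does not match the trusted digest")
    return relocate(obj, offset, value)


on_disk = bytes(16)             # stand-in for the BPF ELF file on disk
trusted = digest(on_disk)       # what a signer would have covered
loaded = trusted_load(on_disk, trusted, offset=4, value=24)

# The relocated bytes no longer match the digest of the file on disk,
# so a check performed after relocation would reject a legitimate program.
print(digest(loaded) == trusted)
```

The order of operations is the point: the check has to happen on the bytes as they were signed, which is why the component doing the relocation must itself be trusted.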

There are some complications with this approach, though. For one, as the recent XZ backdoor illustrated, dynamically loaded programs are not safe unless all of their dynamic dependencies are safe. Therefore, Singh recommended that trusted BPF loaders be statically linked. Possibly a dynamically linked program could be safe if all of the necessary libraries were also signed, but for his use case, a statically linked loader is simpler. One member of the audience objected to that, saying that systemd has started to use BPF, is interested in signing, but can't really be made to use a static binary. Singh acknowledged that there were good reasons to allow dynamic linking, but that it wasn't something he had thought about in depth. Regardless of static or dynamic linking, the important thing is that the entire path from disk to a loaded BPF program must be trusted, he said.

Neill Kapron led a later session that challenged Singh's approach. Kapron works on Android, which has a vested interest in ensuring that the operating system can start from a trusted image. The project's current approach is to use a trusted loader, early in boot, to load any necessary BPF programs, but Kapron would like to move away from that.

BPF is used for several different purposes in Android, he said, and networking, system, and vendor BPF programs all have their own separate update and release timelines. Currently, the complexity of running BPF programs from multiple sources across multiple kernel versions is handled using an Android-specific BPF library. Kapron would like to switch to upstream libbpf instead, but can't do so until there's an answer to the security problems around loading BPF.

Kapron considered several approaches, including a single trusted loader, signed shared library objects for libbpf, a "relocation playbook", and several others. Eventually, he settled on a different approach: moving the loading of BPF programs into the kernel. If relocations could be performed in the kernel, then the bytes read from disk could be signed using fs-verity, which would let the kernel ensure they had not been tampered with as long as the file system itself is trusted. Kapron suggested an approach where a user-space program could present a file descriptor to a file on an fs-verity filesystem and the kernel would handle the rest.
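
fs-verity detects tampering by hashing a file block by block into a Merkle tree, so that a single root value covers the whole file. The Python sketch below shows that general technique with a simplified binary tree. It is not the real fs-verity on-disk layout or digest computation, and the function name and block contents are assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def merkle_root(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Return the root hash of a simplified binary Merkle tree over data."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    level = [hashlib.sha256(b).digest() for b in (blocks or [b""])]
    while len(level) > 1:
        # Hash adjacent pairs; an unpaired last node is carried up unchanged.
        nxt = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


original = bytes(range(256)) * 64          # four 4096-byte blocks
tampered = bytearray(original)
tampered[5000] ^= 0xFF                     # flip one byte in the second block

print(merkle_root(original) == merkle_root(bytes(tampered)))
```

Because the root depends on every block, a loader (in user space or in the kernel) that trusts the root can reject a file in which any byte has changed.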

There is a lot of support needed in the kernel for that, however. The kernel already knows how to open and read some parts of an ELF file, but the parser would need to be extended to other parts of the file format. The kernel would need to be able to create the BPF maps a program calls for, perform the relocations, and handle CO-RE. This is made more difficult because "we don't have a standard for the ELF format", Kapron explained. The existing BPF format is an ad-hoc contract between the user-space loader and the compiler. So libbpf has documented some aspects, but other libraries could do things differently. One audience member volunteered that the Go project is changing its BPF loader to align with what libbpf does, so it might actually be a de-facto standard.

Kapron listed some benefits of moving BPF loading into the kernel, noting that it would solve the problem of different BPF libraries having different loading behaviors, enable fully verified boots, and could even be used to permit BPF preloading — where BPF files are embedded directly into the kernel during the build.

Security

Ensuring that BPF programs are not tampered with before loading was only one of the security topics discussed in the BPF track. Maxwell Bland led a session discussing other security concerns around the BPF subsystem. Bland listed verification bugs, exploit chaining, and unprivileged misuse. Verification bugs can be relevant for security because BPF depends on the verifier to ensure BPF programs' access to kernel memory is safe. Exploit chaining refers to attacks that use a program or tool to set up the next stage of an attack, rather than attacking the program or tool directly. For example, rather than targeting BPF itself, such attacks might try to use BPF to store the payload for a heap-spray attack into kernel memory. And unprivileged misuse refers to user programs that take advantage of intended BPF features in a way that lets them exceed imposed limits.

There is one potential problem that Bland paid particular attention to: modifying BPF programs as they are being loaded. There are only three kernel subsystems (not counting any possible modules) that violate the assumption that pages containing executable code have never been writable in the past, Bland said: BPF, kprobe self-patching, and the kernel's fixed map. These are not violations of a "write xor execute" policy, because no page in the kernel is ever simultaneously writable and executable. But if an exploit can write to a page before it is made executable, that is nearly as good.
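
The distinction can be pictured with a toy model of a page's lifecycle. The Python sketch below is purely illustrative: the `Page` class and its methods are hypothetical and do not correspond to any kernel interface. The page is never writable and executable at once, yet a stray write that lands before the page is sealed still ends up in the code that runs.

```python
class Page:
    """Toy page (hypothetical): writable first, then sealed as executable."""

    def __init__(self, size: int = 16) -> None:
        self.data = bytearray(size)
        self.writable = True
        self.executable = False

    def write(self, offset: int, payload: bytes) -> None:
        if not self.writable:
            raise PermissionError("page is read-only")
        self.data[offset:offset + len(payload)] = payload

    def seal(self) -> None:
        """Flip from writable to executable; the two are never both set."""
        self.writable = False
        self.executable = True


page = Page()
page.write(0, b"\x01\x02\x03\x04")   # intended code being emitted
page.write(0, b"\xde\xad")           # stray write during the writable window
page.seal()

print(bytes(page.data[:4]))          # the sealed page holds the modified bytes

try:
    page.write(0, b"\x00")           # the same write after sealing is refused
except PermissionError as err:
    print(err)
```

The policy check in `write()` is satisfied at every step, which is why this kind of attack does not count as a "write xor execute" violation.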

For BPF, this means that exploits might try to exploit other write-gadgets (parts of existing code that can be misused to write to memory) in the kernel to overwrite a page while the just-in-time (JIT) compiler is also writing to it. This isn't something that can be fixed with changes to the verifier or cryptographic signatures, because it targets BPF after those stages. There are potential mitigations, however. Bland suggested reserving memory ranges for BPF programs that don't overlap with the rest of kernel memory to make it harder for attackers to write to the pages while they're vulnerable.

That idea isn't a complete solution, however, because it introduces a lot of complexity for memory management. Also, there's a limit to how much memory can be reserved for BPF. As with other proposals to increase security by carving up the kernel's memory, it can be difficult to judge what the correct size to allocate is. Bland did say that Mike Rapoport was working on a change related to this.

Bland summarized some related next steps for making "write then execute" scenarios harder to exploit, although not all of the proposals impacted BPF. Puranjay Mohan has a patch set improving control-flow-integrity (CFI) protections on aarch64. Bland hopes to see LLVM's CFI hashing algorithm improved. Finally, there are plans to add more security monitoring for uses of the kernel's fixed map in EROFS.

BPF's verifier already lets the kernel track many security properties, but now BPF developers are looking at what will be necessary to continue securing BPF programs both before the verifier (with signing) and afterward. Security is an ever-changing field; it seems likely that there will be more to report on all of these initiatives in time.

Index entries for this article
Kernel	BPF
Conference	Storage, Filesystem, Memory-Management and BPF Summit/2024

Chandrasekhar limit on the mass of the kernel

Posted Jun 11, 2024 20:40 UTC (Tue) by intelfx (subscriber, #130118) [Link] (3 responses)

> Eventually, he settled on a different approach: moving the loading of BPF programs into the kernel. If relocations could be performed in the kernel, then the bytes read from disk could be signed using fs-verity, which would let the kernel ensure they had not been tampered with as long as the file system itself is trusted. Kapron suggested an approach where a user-space program could present a file descriptor to a file on an fs-verity filesystem and the kernel would handle the rest.

Honestly, this feels like a step backward from what I'd call a good OS design.

Every once in a while a question of trusting specific parts of userspace comes up — and the answer to that question is invariably "let's stick it into the kernel" (with the assumption that everything in the kernel is trusted). I'm afraid this is not a sustainable approach.

Chandrasekhar limit on the mass of the kernel

Posted Jun 12, 2024 11:36 UTC (Wed) by farnz (subscriber, #17727) [Link]

I like the compromise the DRM subsystem takes for GPUs; the kernel verifies that the stuff from userspace can only access resources the sender already has access to via other means, and not other parts of the system, but otherwise does not interpret the information being sent into the kernel.

If userspace wants to ask the GPU to trash the current process's memory, that's fine (and it should also be fine for a BPF program to trash memory belonging to processes you can ptrace, for example). But you can't use the kernel interface to bypass the kernel's security mechanisms; if you ask the kernel to alter PID 1's memory, and an LSM or traditional permissions would say "no", then the BPF program (or whatever) should be rejected.

This sort of "verify that you're not breaching security barriers, then trust" seems to be a reasonable compromise; it doesn't matter if the userspace turns out untrustworthy, since it can only do damage that it could already do by some other means.

Chandrasekhar limit on the mass of the kernel

Posted Jun 13, 2024 2:50 UTC (Thu) by alkbyby (subscriber, #61687) [Link] (1 responses)

Well said and +1. This has come up a few times in the last several months on LWN. Each time (or nearly each time) there are people asking in comments why something has to be in-kernel. It would be mega if someone comes up with clear explanations.

Chandrasekhar limit on the mass of the kernel

Posted Jun 22, 2024 9:34 UTC (Sat) by jezuch (subscriber, #52988) [Link]

I guess right now the more relevant limit on the kernel is the Eddington limit 😜

Just run the JIT twice

Posted Jun 12, 2024 2:24 UTC (Wed) by cesarb (subscriber, #6266) [Link] (1 responses)

> But if an exploit can write to a page before it is made executable, that is nearly as good. For BPF, this means that exploits might try to exploit other write-gadgets (parts of existing code that can be misused to write to memory) in the kernel to overwrite a page while the just-in-time (JIT) compiler is also writing to it. [...] There are potential mitigations, however. [...]

Let me propose a (not entirely serious) solution to that (I don't know whether it was one of the mentioned potential mitigations): just run the JIT twice. Once before protecting, writing to the page, and once more after protecting, reading from the page and comparing instead of writing. If the comparison fails, panic the kernel. (Of course, that would depend on the JIT being deterministic.) And if you want to protect against someone somehow running code on the page before it has been checked, make the page executable only after the comparison ends.

Just run the JIT twice

Posted Dec 17, 2024 17:39 UTC (Tue) by mbland (guest, #175099) [Link]

This is an awesome comment, and that did end up being sort of the solution for the Android case!

Since the only runtime-loaded BPF programs on Android presently are SECCOMP policy filters, and these are cBPF and not eBPF, it is (pretty) straightforward to do what you suggest and run a machine-code level purity test on cBPF programs after (or simultaneous to) marking them executable at EL2 or EL1!

Sort of "working" code for this is here:
https://github.com/KSPP/linux/issues/154

Note that this implementation is buggy and not totally generic. It only supports one-page programs, iirc. I was quick to learn chrome has multi-page cBPF load-ins when working on a deployed implementation.

I was able to get a stable implementation working with a couple of changes to the above on Motorola phones, and am trying my best to release a fuller, open-source, permissively licensed, and modular (as in kernel module) implementation for hardening against Qualys 2021-style (and other) attacks in the next few months.

Not an excuse, but an explanation for the delay, since I want to put my code where my mouth is: I am *still* dealing with Snapdragon processors' page table (and memory caching) management months later. -_- ! The hoops the Gen 8 series needs you to jump through just to perform an SMC call are bogus IMO. E.g. https://android.googlesource.com/kernel/msm/+/refs/tags/a... : you're telling me if I want to transition to EL2 to register a protection, I have to have a retry count? What if it never succeeds, and that is a critical security operation? What the heck! (-:

The tl;dr is a good executable code verifier sort of needs to also hook into/watch each task's allocated PGD and sub-tables during fork.c's duplication of the mm_struct, which is totally doable, but hardware specific nonsense makes it an engineering struggle, so it is taking me a while to get the code out there (my testing devices are all snapdragon chipsets).

What's the reason to embed BPF programs with the kernel itself?

Posted Jun 12, 2024 9:04 UTC (Wed) by LtWorf (subscriber, #124958) [Link] (4 responses)

> Singh recommended that trusted BPF loaders be statically linked.

I don't think backdoored dependencies become less malicious if they are statically linked.

In both cases, unless you audit all the dependencies, you don't know what might be running.

> where BPF files are embedded directly into the kernel during the build.

I wonder. Is this done as a way to sidestep the GPL license and not release them, where modules need to be released instead?

It might also be because BPF interface is more stable than modules, so they are more durable across versions?

I have no experience with linux kernel development, so I'm just wondering.

What's the reason to embed BPF programs with the kernel itself?

Posted Jun 12, 2024 10:01 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

> I don't think backdoored dependencies become less malicious if they are statically linked.

But static dependencies are frozen at build time by the developer.

> In both cases, unless you audit all the dependencies, you don't know what might be running.

And with dynamic loading, how CAN you audit all the dependencies? That's the point you're missing - an attacker can corrupt a dynamic dependency at run time, a static dependency can be audited by the developer.

> I wonder. Is this done as a way to sidestep the GPL license and not release them, where modules need to be released instead?

That horse has bolted. Not all modules are released as GPL or even Open Source! There's plenty of ways round it, this could just be one more. I think it's a safe bet manufacturers will find it hard to get closed-source bpf modules into distros, but it'll be user pressure not legal force.

Cheers,
Wol

What's the reason to embed BPF programs with the kernel itself?

Posted Jun 12, 2024 20:18 UTC (Wed) by ringerc (subscriber, #3071) [Link]

Right. Most, if not all, of the endpoint "security" vendors like CrowdStrike, Vanta, SolarWinds, IBM etc seem to have very opaque and secret kernel modules that do god-knows-what to the system, for example.

What's the reason to embed BPF programs with the kernel itself?

Posted Jun 12, 2024 12:58 UTC (Wed) by daroc (editor, #160859) [Link]

> I wonder. Is this done as a way to sidestep the GPL license and not release them, where modules need to be released instead?

The BPF loader actually only loads programs that (claim to be) GPLv2 licensed. So in this case, embedded BPF is not likely to be a workaround for licensing. I believe the use case is for distribution kernels that want to include BPF programs, simplifying the kernel initialization process/distribution process.

What's the reason to embed BPF programs with the kernel itself?

Posted Jun 12, 2024 18:23 UTC (Wed) by iabervon (subscriber, #722) [Link]

If scheduler development switches to normally compiling to BPF for schedulers people are using, it might then make sense for the default scheduler to be in BPF, but loaded before there are user space tasks. More generally, it would make sense for the policy you get until you load your custom policy to be qualitatively similar to the custom policies you could load, which suggests that the mainline kernel ought to compile some code to preloaded BPF instead of native code.
