Software-tag-based KASAN

By Jonathan Corbet
September 26, 2018

The kernel address sanitizer (KASAN) is a kernel debugging tool meant to catch incorrect use of kernel pointers. It is an effective tool, if the number of KASAN-based bug reports showing up on the mailing lists is any indication. The downside of KASAN is a significant increase in the amount of memory used by a running system. The software-tag-based mode proposed by Andrey Konovalov has the potential to address that problem, but it brings some limitations of its own.

KASAN works by allocating a shadow memory map to describe the addressability of the kernel's virtual address space. Each byte in the shadow map corresponds to eight bytes of address space and indicates how many of those eight bytes (if any) are currently accessible to the kernel. When the kernel allocates or frees a range of memory, the shadow map is updated accordingly. Using some instrumentation inserted by the compiler, KASAN checks each kernel pointer dereference against the shadow map to ensure that the kernel is meant to be accessing the pointed-to memory. If the shadow map indicates a problem, an error is raised.
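
The bookkeeping described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `ShadowMap` class, its method names, and the 64-byte arena are invented for illustration, and the shadow-byte encoding (0 for "all eight bytes accessible", 1 to 7 for "only the first N bytes", `0xFF` for "nothing accessible") is an assumption of the sketch.

```python
GRANULE = 8      # bytes of address space described by one shadow byte
POISON = 0xFF    # toy marker: no byte in this granule is accessible


class ShadowMap:
    """Toy model of a shadow map covering a small arena (hypothetical)."""

    def __init__(self, arena_size):
        # One shadow byte per 8-byte granule; everything starts inaccessible.
        self.shadow = bytearray([POISON] * (arena_size // GRANULE))

    def unpoison(self, addr, size):
        """Mark [addr, addr + size) accessible; addr must be granule-aligned."""
        assert addr % GRANULE == 0
        full, rest = divmod(size, GRANULE)
        first = addr // GRANULE
        for i in range(first, first + full):
            self.shadow[i] = 0                 # all eight bytes accessible
        if rest:
            self.shadow[first + full] = rest   # only the first `rest` bytes

    def poison(self, addr, size):
        """Mark every granule touched by [addr, addr + size) inaccessible."""
        first = addr // GRANULE
        last = (addr + size - 1) // GRANULE
        for i in range(first, last + 1):
            self.shadow[i] = POISON

    def access_ok(self, addr):
        """Check a one-byte access the way an inserted check might."""
        value = self.shadow[addr // GRANULE]
        if value == 0:
            return True
        if value == POISON:
            return False
        return (addr % GRANULE) < value


shadow = ShadowMap(arena_size=64)
shadow.unpoison(16, 13)            # "allocate" a 13-byte object at offset 16
print(shadow.access_ok(28))        # last byte of the object
print(shadow.access_ok(29))        # one byte past the end
shadow.poison(16, 13)              # "free" the object
print(shadow.access_ok(16))        # use after free
```

Because each shadow byte records how many bytes of its granule are valid, this model can flag an access that lands just one byte past the end of an object.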

It is an effective technique and, thanks to the support from the compiler, the run-time CPU overhead is tolerable in many settings. But the shadow map requires a great deal of memory, and that does affect the usability of KASAN in the real world, especially when it is used on memory-constrained systems. This overhead is particularly painful for users who would like to run KASAN on production systems as an additional security measure.

The new mode uses a different approach that takes advantage of an ARM64 feature called top-byte ignore (TBI). A 64-bit pointer allows for a large address space, rather larger than is actually needed on current systems, even if a web browser is running. When TBI is enabled, the system's memory-management unit will ignore the top byte of any address, allowing that byte to be used to store eight bits of arbitrary information. One possible use for that byte is to ensure that pointers into memory are pointing where they were intended to.

In the software-tag-based mode, KASAN still allocates the shadow memory map, but with some changes. Each byte in the map now corresponds to 16 bytes of real memory rather than eight, cutting the size of the map in half. Whenever the kernel allocates memory, a random, eight-bit tag value will be chosen. The pointer to the allocated object (which is aligned to a 16-byte boundary) will have that tag value set in the top byte; the tag value is also stored into the shadow memory map at the location(s) corresponding to that object. Whenever the returned pointer is dereferenced, its embedded tag value will be compared (again using instrumentation from the compiler) against the tag stored in the shadow memory map; if they do not match, an error will be logged.
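
A toy model may make the tag handling easier to follow. Everything here is hypothetical: the `TaggedArena` class, the bump allocator, and the helper names are invented for the sketch, 64-bit pointers are modeled as plain integers, and reserving tag 0 for "never allocated" is a choice made by this sketch rather than a description of the patch set.

```python
import random

GRANULE = 16                      # bytes of memory described by one shadow byte
TAG_SHIFT = 56                    # the tag occupies the top byte of the pointer
ADDR_MASK = (1 << TAG_SHIFT) - 1


def set_tag(addr, tag):
    """Place an eight-bit tag in the top byte of a 64-bit pointer value."""
    return ((tag & 0xFF) << TAG_SHIFT) | (addr & ADDR_MASK)


def get_tag(ptr):
    return (ptr >> TAG_SHIFT) & 0xFF


def strip_tag(ptr):
    """What address translation sees when the top byte is ignored."""
    return ptr & ADDR_MASK


class TaggedArena:
    """Toy bump allocator with a one-byte-per-16-bytes tag shadow map."""

    def __init__(self, size, rng=None):
        self.shadow = bytearray(size // GRANULE)   # 0 = never allocated (toy)
        self.rng = rng or random.Random()
        self.next_free = 0

    def alloc(self, size):
        granules = -(-size // GRANULE)             # round up to whole granules
        addr = self.next_free
        if addr + granules * GRANULE > len(self.shadow) * GRANULE:
            raise MemoryError("toy arena exhausted")
        self.next_free += granules * GRANULE
        tag = self.rng.randrange(1, 256)           # random non-zero tag
        first = addr // GRANULE
        for i in range(first, first + granules):
            self.shadow[i] = tag                   # record the tag in the shadow
        return set_tag(addr, tag)

    def check(self, ptr):
        """Compare the pointer's tag with the shadow tag for its address."""
        return self.shadow[strip_tag(ptr) // GRANULE] == get_tag(ptr)


arena = TaggedArena(256)
a = arena.alloc(40)               # occupies three granules, 48 bytes
b = arena.alloc(40)
print(arena.check(a + 39))        # last byte of the first object
print(arena.check(a + 48))        # strayed into b; passes only on a tag collision
```

The last check fails unless the two allocations happened to draw the same tag, which is the collision case discussed further on.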

There are some clear advantages to this mode, starting with the halving of the amount of memory required for the shadow map. Current KASAN can only catch references to memory that the kernel is not meant to access at all; the new mode can catch the use of pointers that have strayed into the wrong part of kernel memory. On the other hand, the new mode will fail to catch a reference just beyond an allocated object if it falls within the 16-byte resolution of the map. There is a small possibility that an errant pointer will hit another region of memory that happened to get the same tag; such an access would not be detected. This mode will also only work on ARM64 processors, and it requires at least version 7 of the Clang compiler.
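
The two blind spots mentioned above come down to simple arithmetic, worked through here as a sketch. The function names are made up for illustration, and the collision figure assumes that tags are drawn uniformly from all 256 possible values.

```python
GRANULE = 16   # bytes covered by one shadow byte in the tag-based mode


def slack_bytes(size, granule=GRANULE):
    """Bytes past the end of an object that share its last shadow granule.

    An overflow into these bytes carries the right tag and lands in a
    granule with the same tag, so the tag comparison cannot flag it.
    """
    return -size % granule


def collision_probability(tag_values=256):
    """Chance that a stray pointer's tag matches an unrelated region's tag."""
    return 1 / tag_values


for size in (1, 40, 48):
    print(size, slack_bytes(size))
print(collision_probability())
```

A 40-byte object, for example, rounds up to three granules, leaving eight trailing bytes where an overflow would go unnoticed in this model.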

There is another potential issue with the use of the software-tag-based mode. Address translation will ignore the top byte of a pointer when TBI is turned on, but other operations, such as pointer arithmetic and pointer comparisons, will not. Subtracting one pointer from another is a common operation in C programs; if those two pointers have different tag values, though, the result is unlikely to be what the developer intended. An erroneous subtraction is likely to make itself known quickly, but a comparison for equality that fails because two otherwise equal pointers have different tags could lead to rather more subtle problems. One can argue that pointers with different tags will have originated from different allocations and should not be compared anyway, but worries about the possibility of breaking things have led to some long discussions after previous postings of this work.
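
The effect of tags on arithmetic and comparison can be seen by modeling pointers as 64-bit integers. This is a sketch only: the helper names and the addresses are invented, and `ptr_diff()` merely mimics what raw 64-bit signed subtraction would yield, which is not a statement about what any particular compiler will generate.

```python
TAG_SHIFT = 56
ADDR_MASK = (1 << TAG_SHIFT) - 1
U64_MASK = (1 << 64) - 1


def tagged(addr, tag):
    """Build a 64-bit pointer value with `tag` in its top byte."""
    return ((tag & 0xFF) << TAG_SHIFT) | (addr & ADDR_MASK)


def untagged(ptr):
    """Clear the top byte, leaving only the address bits."""
    return ptr & ADDR_MASK


def ptr_diff(p, q):
    """Signed 64-bit difference p - q, with the tag bits included."""
    d = (p - q) & U64_MASK
    return d - (1 << 64) if d >> 63 else d


p = tagged(0x1000, 0x2A)
q = tagged(0x1040, 0x7C)                      # 64 bytes higher, different tag
print(ptr_diff(q, p))                         # dominated by the tag difference
print(ptr_diff(untagged(q), untagged(p)))     # the distance the developer wanted

same_addr_a = tagged(0x1000, 0x11)
same_addr_b = tagged(0x1000, 0x99)
print(same_addr_a == same_addr_b)             # same address, unequal pointers
```

The subtraction goes wrong by an enormous amount, which is why it tends to be noticed quickly; the failed equality test is the quieter of the two problems.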

In an attempt to address these concerns, Konovalov ran some extensive tests to try to find potential problems:

All pointer comparisons/subtractions have been instrumented in an LLVM compiler pass and a kernel module that would print a bug report whenever two pointers with different tags are being compared/subtracted (ignoring comparisons with NULL pointers and with pointers obtained by casting an error code to a pointer type) has been used.

The test turned up a number of places where such operations were taking place, but none of them turned out to be situations where the pointer tags changed the kernel's behavior; see the patch posting linked above for the full discussion.

There is a small set of benchmark results included in the patch as well; it shows that software-tag-based KASAN performs similarly to regular KASAN in terms of CPU usage, though network bandwidth does drop somewhat. As expected, the new mode uses quite a bit less memory. KASAN remains far from free in either mode, however: it triples the time required for the test system to boot and reduces networking performance to less than half of what is otherwise possible. So it is still going to be hard to use KASAN in production systems most of the time.

Upcoming technologies, such as Arm's memory tagging, promise to support much of this functionality in hardware, which may change the equation somewhat. For the time being, though, KASAN must be implemented in software. It has found a number of bugs in the kernel, and would certainly find more if it were able to run in more contexts. The software-tag-based mode should make it possible to use KASAN on systems where its memory overhead is currently prohibitive, and that seems like a good thing.

Index entries for this article
Kernel	Development tools
Kernel	KASan
Security	Linux kernel/Tools

Software-tag-based KASAN

Posted Sep 27, 2018 0:01 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> rather larger than is actually needed on current systems, even if a web browser is running.
LOL

Software-tag-based KASAN

Posted Sep 27, 2018 17:16 UTC (Thu) by stonedown (subscriber, #2987) [Link]

I really appreciate articles like this, which are helpful for those of us debugging embedded systems.

Software-tag-based KASAN

Posted Sep 27, 2018 17:26 UTC (Thu) by NHO (guest, #104320) [Link] (1 responses)

I thought ASM goto caused kernel to stop building with Clang. It's restored with Clang 7?

Software-tag-based KASAN

Posted Oct 1, 2018 13:00 UTC (Mon) by cmarinas (subscriber, #39468) [Link]

That's x86; on arm64 we can still build the kernel without asm goto (albeit with a small performance degradation).

Software-tag-based KASAN

Posted Sep 30, 2018 15:42 UTC (Sun) by rweikusat2 (subscriber, #117920) [Link] (3 responses)

I'm pretty certain that the kernel can have more than 256 live allocations at any given time, hence, tag collisions are guaranteed to happen even when regarding "about 4 collisions in 1000 allocations" as "small probability" (doesn't seem small to me). And there's of course Knuth's classic statement that an arbitrarily long sequence of twos is a perfectly random sequence. This means if this mechanism considers an access valid, there's absolutely no way to determine that it's actually valid short of examining the code for correct pointer usages. And if this was such an easy thing to do, surely, there shouldn't be any incorrect pointer uses.

Software-tag-based KASAN

Posted Oct 2, 2018 11:07 UTC (Tue) by mtaht (subscriber, #11087) [Link]

One of the stumbling blocks on the Mill cpu's compiler (which has tagged pointers also), was that llvm considered long ints and pointers to be equivalent in various stages of the optimizer. Is this fixed now in llvm 7?

Software-tag-based KASAN

Posted Oct 4, 2018 22:22 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

> I'm pretty certain that the kernel can have more than 256 live allocations at any given time, hence, tag collisions are guaranteed to happen even when regarding "about 4 collisions in 1000 allocations" as "small probability" (doesn't seem small to me).

If you've only got 256 possible tag values, then I would think that the probability of a collision is more than 50% with just 25 live allocations. Think birthday paradox.

You only need 30 people, chosen at random, for the chances of two of them sharing a birthday to go over 50%. That's 30 people, 365 days, and a collision. Extrapolating (yes I know that's very dangerous with statistics), I would guess that once a hash table is over 10% full, the chances of a collision go over 50%.

Cheers,
Wol

Software-tag-based KASAN

Posted Oct 23, 2018 20:17 UTC (Tue) by mcortese (guest, #52099) [Link]

Right. However, you shouldn't worry about two unrelated pointers sharing the same tag. You should wonder what the probability is of one pointer going rogue, pointing out of its original bounds into the range of another pointer, and *those* two pointers sharing the same tag.
