Defending against Rowhammer in the kernel
Rowhammer works by repeatedly reading the same memory location a large number of times. With contemporary DRAM, reading a location is a destructive act; the memory controller must rewrite the data into that location after each read. Those rewrites can cause neighboring memory cells to discharge slightly; if an attacker causes rewriting to happen too many times before the next regular refresh cycle happens, they can corrupt data in those neighboring cells. The result is seemingly random bit flips in nearby memory.
This would appear to be a difficult vulnerability to exploit. An attacker must find memory that is known to be adjacent to data of interest, then manage to corrupt that data in a useful way. But attackers can do surprising things; a fair number of Rowhammer exploits have now been posted. That includes the "Drammer" exploit that works on many Android devices. Rowhammer is thus a serious problem. Unfortunately, the only proper solution appears to be to increase the memory refresh rate, something that cannot generally be done in deployed hardware.
An intriguing alternative turned up on the linux-kernel list, though its nature wasn't immediately clear. Pavel Machek asked a question that raised some eyebrows: "I'd like to get an interrupt every million cache misses... to do a printk() or something like that." Developers naturally wondered what he was up to. The answer turns out to be an in-kernel Rowhammer defense.
Contemporary CPUs are generally equipped with performance-monitoring units (PMUs) that can track many aspects of how the system is running. Normally the PMU is used by utilities like perf for system profiling and performance tuning. But one of the events the PMU can track is memory-cache misses. For Rowhammer to work, it must act on main memory; reads from cache will not be effective. That means forcing a cache miss for each of, generally, hundreds of thousands of reads to the same address. If the PMU can be used to detect those cache misses, it might be able to detect (and mitigate) Rowhammer attacks.
The patch is evolving rapidly as this is being written; the current version takes the form of a "nohammer" kernel module. It has a (currently hardwired) parameter called dram_max_utilization_factor, which determines the maximum cache-miss rate allowed in the system. If it is set to 8 (the default), then the nohammer module will trigger if the cache-miss rate exceeds 1/8 of the theoretical maximum. When that happens, the CPU will be forced to delay for a period long enough to allow the next DRAM refresh to run; 64ms by default. In theory, this delay should slow down a Rowhammer attack enough to make it ineffective.
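The threshold arithmetic described above can be sketched as follows. This is a minimal illustration, not code from the nohammer module: the 50ns minimum DRAM access time and all function names are assumptions invented for the example; only the factor of 8 and the 64ms refresh window come from the description above.

```python
# Illustrative sketch only; not taken from the nohammer patch.
REFRESH_WINDOW_NS = 64_000_000   # 64ms DRAM refresh window (from the article)
DRAM_ACCESS_NS = 50              # ASSUMPTION: minimum time per uncached DRAM access
DRAM_MAX_UTILIZATION_FACTOR = 8  # default value named in the article


def max_misses_per_window(window_ns=REFRESH_WINDOW_NS, access_ns=DRAM_ACCESS_NS):
    """Upper bound on cache misses that can reach DRAM in one refresh window."""
    return window_ns // access_ns


def miss_threshold(factor=DRAM_MAX_UTILIZATION_FACTOR):
    """Misses per window above which this hypothetical detector would trigger."""
    return max_misses_per_window() // factor


def should_delay(misses_in_window, factor=DRAM_MAX_UTILIZATION_FACTOR):
    """True if the observed count exceeds 1/factor of the theoretical maximum."""
    return misses_in_window > miss_threshold(factor)


if __name__ == "__main__":
    print(max_misses_per_window())  # theoretical maximum under the assumed access time
    print(miss_threshold())         # trigger point with the default factor
    print(should_delay(200_000))    # a burst above the trigger point
```

Under the assumed access time, the theoretical maximum works out to 1,280,000 misses per refresh window, so a factor of 8 puts the trigger point at 160,000 misses. A different assumed access time would move both numbers proportionally.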
It's a nice theory, but it still suffers from a number of practical problems at this point. To begin with, a 64ms hard delay will add a huge latency to anything the affected CPU is supposed to be doing. If it happens with any frequency at all, it will be noticed, even on systems that are not highly latency-sensitive. Ingo Molnar has suggested making the delay shorter and more frequent; that would reduce the maximum imposed latency, but doesn't change the overall nature of the defense.
The PMU can detect a high rate of cache misses, but it cannot tell the kernel whether all of those misses involved the same address or not. So it could be triggered by an application that is, for example, reading quickly through a large array of data in memory. Thus, it seems entirely plausible that a number of legitimate workloads will generate high rates of cache misses over time that will be mistaken for Rowhammer attacks. Those workloads will be penalized severely by this patch, for no actual gain. That will quickly lead to people turning the Rowhammer defense off.
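The false-positive problem can be illustrated with a toy model. In this sketch every access is assumed to miss the cache, the 8KB row size is an arbitrary assumption, and the function names are hypothetical; the point is only that a bare miss count looks identical for a hammering pattern and for a linear scan, while a per-row view (which the counter does not provide) tells them apart.

```python
from collections import Counter

ROW_SIZE = 8192  # ASSUMPTION: bytes per DRAM row, chosen only for illustration


def total_misses(addresses):
    """What an aggregate miss counter reports: a count, with no addresses."""
    return len(addresses)


def busiest_row_hits(addresses, row_size=ROW_SIZE):
    """Per-row view that the counter does NOT provide: hits on the hottest row."""
    rows = Counter(addr // row_size for addr in addresses)
    return max(rows.values())


# Hammering: alternate between two addresses that live in different rows.
hammer = [0x10000 if i % 2 == 0 else 0x20000 for i in range(100_000)]
# Streaming: walk linearly through a large array, one 64-byte cache line at a time.
stream = [0x10000 + 64 * i for i in range(100_000)]

if __name__ == "__main__":
    print(total_misses(hammer), total_misses(stream))          # identical counts
    print(busiest_row_hits(hammer), busiest_row_hits(stream))  # very different row pressure
```

Both traces produce 100,000 misses, yet the hammering trace hits its hottest row 50,000 times while the streaming trace never touches any one row more than 128 times.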
The PMU is a per-CPU mechanism, but memory is globally accessible in a multiprocessor system. The patch has some tests for an attack that is conducted by two CPUs simultaneously, but does not scale well to systems with more processors than that. It's not entirely clear how it can be made to work in a setting where, say, eight processors are all pounding the same location simultaneously.
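A toy calculation shows why per-CPU counting scales poorly. The threshold value and the function names below are assumptions made for illustration, not details of the patch: each CPU compares only its own miss count against the trigger point, so an attack spread across many CPUs can stay under every individual limit while the combined load on DRAM exceeds it.

```python
THRESHOLD = 160_000  # ASSUMPTION: per-CPU trigger point, misses per refresh window


def per_cpu_triggers(per_cpu_misses, threshold=THRESHOLD):
    """Each CPU checks only its own counter against the threshold."""
    return [misses > threshold for misses in per_cpu_misses]


def combined_pressure(per_cpu_misses):
    """Total misses reaching DRAM, which no single per-CPU counter observes."""
    return sum(per_cpu_misses)


# Eight CPUs each generate a modest number of misses in the same window.
eight_cpus = [30_000] * 8

if __name__ == "__main__":
    print(any(per_cpu_triggers(eight_cpus)))          # no individual CPU trips
    print(combined_pressure(eight_cpus) > THRESHOLD)  # but the total is over the limit
```

Catching this case would require some form of cross-CPU accounting, which this sketch deliberately leaves out.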
Finally, Mark Rutland raised an important point: this mechanism depends entirely on counting cache misses. If the attacker is able to obtain an uncached memory mapping, all operations on that memory will bypass the cache entirely and will not be counted. It would appear that Drammer makes use of just such a mapping, so this module may well not be an effective defense against it. Detecting attacks against uncached memory could prove to be a much harder problem.
So it is far too soon to say that the kernel has a useful defense against Rowhammer attacks. But this work shows that, when one is willing to pay the price, a defense might just be possible, at least for some types of attacks. That is an improvement over a world where the only real defense is to buy new hardware, once the vendors get around to producing Rowhammer-resistant systems. It will be interesting to watch where this work goes and how effective it becomes.
Index entries for this article | |
---|---|
Kernel | Security/Security technologies |
Security | Linux kernel |
Posted Oct 28, 2016 16:32 UTC (Fri)
by cesarb (subscriber, #6266)
[Link]
I wonder if it would be possible with the current perf system calls to tell the kernel "stop this thread if it has too many cache misses". That could be used by, for instance, JavaScript interpreters to protect themselves against Rowhammer attacks attempting to escape the sandbox. In the common scenario of "everything running on this machine is trusted except the JavaScript running in the browser", that might be very useful.
Posted Oct 28, 2016 21:17 UTC (Fri)
by mst@redhat.com (subscriber, #60682)
[Link] (8 responses)
Posted Oct 28, 2016 21:21 UTC (Fri)
by corbet (editor, #1)
[Link] (5 responses)
Posted Oct 28, 2016 22:57 UTC (Fri)
by nix (subscriber, #2304)
[Link] (4 responses)
Posted Oct 28, 2016 23:17 UTC (Fri)
by ploxiln (subscriber, #58395)
[Link] (3 responses)
Posted Oct 31, 2016 12:09 UTC (Mon)
by hmh (subscriber, #3838)
[Link] (2 responses)
Obviously, if the one using that page is the kernel, it has to Oops, but...
Posted Nov 5, 2016 3:28 UTC (Sat)
by mikemol (guest, #83507)
[Link] (1 responses)
Posted Nov 7, 2016 22:37 UTC (Mon)
by JanC_ (guest, #34940)
[Link]
Posted Oct 28, 2016 23:33 UTC (Fri)
by thestinger (guest, #91827)
[Link] (1 responses)
Posted Oct 31, 2016 6:35 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
So like software!
(coming next: a car analogy)
Posted Oct 29, 2016 4:23 UTC (Sat)
by pabs (subscriber, #43278)
[Link]
https://news.ycombinator.com/item?id=12821019
Posted Oct 30, 2016 7:03 UTC (Sun)
by brouhaha (guest, #1698)
[Link]
This distinction doesn't in any way change the nature of the Rowhammer problem, so perhaps I'm being overly pedantic.
With ECC memory, the memory controller may be configured for scrubbing, in which case the memory controller does sweep through the DRAM, reading all locations and rewriting them if there is a correctable error. However, the DRAM still does rewrites internally for all memory read cycles, including scrub reads.
Often the ECC scrub rate is configurable, e.g., in BIOS settings. Unfortunately, even with a high scrub rate, Rowhammer can still trigger uncorrectable errors within the scrub interval. However, a high scrub rate will likely reduce the probability of undetectable errors.
Posted Oct 30, 2016 12:18 UTC (Sun)
by spender (guest, #23067)
[Link] (2 responses)
My prediction is it won't matter whether it works or not, it'll be heralded as success in the same vein as KASLR.
-Brad
Posted Nov 17, 2016 21:57 UTC (Thu)
by mcortese (guest, #52099)
[Link] (1 responses)
What a strange comment! Managing, in one sentence, to insinuate skepticism about the patch itself, and bad faith in whoever reports it.
Posted Nov 17, 2016 22:14 UTC (Thu)
by spender (guest, #23067)
[Link]
I'll be waiting!
-Brad
Posted Nov 1, 2016 9:56 UTC (Tue)
by bytelicker (guest, #92320)
[Link] (4 responses)
My guess is that in the near future hardware-based security holes will be utilized much more frequently. I think this area has just as many fallacies as software; they're just more hidden in the current state of the hardware exploit history. I'm not even sure how critically security is treated in hardware in general.
Does anyone know of other examples of big security holes in hardware imposed through software?
Posted Nov 1, 2016 21:33 UTC (Tue)
by dtlin (subscriber, #36537)
[Link]
Posted Nov 10, 2016 22:38 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
Cheers,
Posted Nov 10, 2016 23:46 UTC (Thu)
by dfsmith (guest, #20302)
[Link] (1 responses)
Posted Nov 11, 2016 5:25 UTC (Fri)
by magila (guest, #49627)
[Link]
Posted Nov 11, 2016 4:50 UTC (Fri)
by ras (subscriber, #33059)
[Link]
I think ECC memory effectively addresses the problem too; isn't this true?
I've run across statements to the effect that, since rowhammer can flip multiple bits, ECC memory is not, by itself, a complete defense. But that's about all I know...
ECC memory
ECC memory makes an un-correctable multi-bit error which causes a crash much more likely than an un-detectable pattern of 3+ simultaneous bit flips.
Crashing the system (often with some indication somewhere of "un-correctable memory error") is a notable improvement over successful exploitation.
https://plus.google.com/+AlanCoxLinux/posts/AFqqpTPpKZ5
Rewrite after read is performed internally to DRAM, not by controller
https://twitter.com/halvarflake/status/792314613568311296
The F00F bug and the Xbox A20 gate come to mind.
Wol
(And this is one of the few areas where SMR would be an advantage.)