Fighting Spectre with cache flushes
Speculative-execution vulnerabilities are only exploitable if they leave a sign somewhere else in the system. As a general rule, that "somewhere else" is the CPU's memory cache. Speculative execution can be used to load data into the cache (or not) depending on the value of the data the attacker is trying to exfiltrate; timing attacks can then be employed to query the state of the cache and complete the attack. This side channel is a necessary part of any speculative-execution exploit.
It has thus been clear from the beginning that one way of blocking these attacks is to flush the memory caches at well-chosen times, clearing out the exfiltrated information before the attacker can get to it. That is, unfortunately, an expensive thing to do. Flushing the cache after every system call would likely block a wide range of speculative attacks, but it would also slow the system to the point that users would be looking for ways to turn the mechanism off. Security is all-important — except when you have to get some work done.
Kristen Carlson Accardi recently posted a patch that is based on an interesting observation. Attacks using speculative execution involve convincing the processor to speculate down a path that non-speculative execution will not follow. For example, a kernel function may contain a bounds check that will prevent the code from accessing beyond the end of an array, causing an error to be returned instead. An attack using the Spectre vulnerability will bypass that check speculatively, accessing data that the code was specifically (and correctly) written not to access.
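The shape of such a bounds-checked function can be sketched as follows; the names and sizes are illustrative, not from the posted patch. Architecturally, an out-of-range index yields an error return, but the CPU may speculatively execute the array access anyway, leaving a data-dependent line in the cache:

```c
#include <errno.h>
#include <stddef.h>

#define TABLE_SIZE 16
static unsigned char table[TABLE_SIZE];
static unsigned char probe[256 * 64];	/* attacker-observable cache lines */

int read_entry(size_t index)
{
	if (index >= TABLE_SIZE)
		return -EINVAL;	/* what the caller actually sees... */
	/*
	 * ...but the load below may already have run speculatively with
	 * an out-of-range index, touching a probe line selected by the
	 * secret byte that was read.
	 */
	return probe[table[index] * 64];
}
```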
In other words, the attack is doing something speculatively that, when the speculation is unwound, results in an error return to the calling program; by then, though, the damage is done. The error return is a clue that there may be something inappropriate going on. So Accardi's patch will, in the case of certain error returns from system calls, flush the L1 processor cache before returning to user space. In particular, the core of the change looks like this:
    __visible inline void l1_cache_flush(struct pt_regs *regs)
    {
	if (IS_ENABLED(CONFIG_SYSCALL_FLUSH) &&
	    static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
		if (regs->ax == 0 ||
		    regs->ax == -EAGAIN ||
		    regs->ax == -EEXIST ||
		    regs->ax == -ENOENT ||
		    regs->ax == -EXDEV ||
		    regs->ax == -ETIMEDOUT ||
		    regs->ax == -ENOTCONN ||
		    regs->ax == -EINPROGRESS)
			return;

		wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
	}
    }
The code exempts some of the most common errors from the cache-flush policy, which makes sense. Errors like EAGAIN and ENOENT are common in normal program execution but are not the sort of errors that are likely to be generated by speculative attacks; one would expect an error like EINVAL in such cases. So exempting those errors should significantly reduce the cost of this mitigation without significantly reducing the protection that it provides.
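To see why those errors are considered benign, consider how routinely normal programs generate them; a trivial user-space illustration (probe_missing_file() is a hypothetical helper, not code from the patch):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Opening a nonexistent path fails with ENOENT: an everyday error
 * that says nothing about speculation, so the patch exempts it. */
int probe_missing_file(void)
{
	int fd = open("/no/such/path", O_RDONLY);

	if (fd < 0)
		return errno;	/* ENOENT: routine, not suspicious */
	close(fd);
	return 0;
}
```

A shell, a linker searching library paths, or a PATH lookup can produce dozens of such errors per command; flushing the L1 cache on each would be a real cost.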
(Of course, the code as written above doesn't quite work right, as was pointed out by Thomas Gleixner, but the fix is easy and the posted patch shows the desired result.)
Alan Cox argued in favor of merging this patch.
Andy Lutomirski is not convinced, though; he argued that there are a number of possible ways around this protection. An attacker running on a hyperthreaded sibling could attempt to read the data out of the L1 cache between the speculative exploit and the cache flush, though Cox said that the time window available would be difficult to hit. Fancier techniques could also be attempted, such as loading the cache lines of interest on a different CPU and watching to see when they are "stolen" by the CPU running the attack. Or perhaps the data of interest is still in the L2 cache and could be probed for there.
Answering Lutomirski's criticisms is probably necessary to get this patch set merged. Doing so would require providing some numbers for what the overhead of this change really is; Cox claimed that it is "pretty much zero", but no hard numbers have been posted. The other useful piece would be to show some current exploits that would be blocked by this change.
If that information can be provided, though (and the bug in the patch fixed), flushing the L1 cache could yet prove to be a relatively cheap and effective way to block Spectre exploits that have not yet been blocked by more direct means. As a way of hardening the system overall, it seems worthy of consideration.
Posted Oct 16, 2018 6:46 UTC (Tue) by blackwood (guest, #44174):
We use EINVAL in drm to iteratively discover an optimal configuration in the atomic display API. But it's only one specific source of EINVAL; all others should still be treated as possible exploits. So anything at the global level, or even just at the ioctl level, isn't a fine-grained enough filter. And I suspect there are lots of other places like this.
The flag would also serve as nice documentation for the fast-path error case.
Posted Oct 22, 2018 13:00 UTC (Mon) by johill (subscriber, #25196):
Regarding netlink: the "syscall" exit point is different there (the error is reported in a netlink message, not in the system call's return value), so similar code would have to be added there.
Posted Oct 16, 2018 8:46 UTC (Tue) by anton (subscriber, #25547):
If the non-whitelisted error returns are as rare as Alan Cox suggests, flushing the L2 and the L3 should not impose a significant performance penalty, either.
Posted Oct 17, 2018 8:30 UTC (Wed) by anton (subscriber, #25547):
Concerning DoS attacks, unprivileged programs can thrash the caches anyway (with ordinary memory accesses). Is it easier for a remote attacker to trigger such OS error returns than to induce the attacked program to perform thrashing memory accesses? It could be.
Another, cheaper, thing in the same vein that could be done is to CLFLUSH(OPT) the speculatively accessed memory. Problems: This would need compiler support to make it generally applicable with little programmer effort; CLFLUSH reports page faults, so probably has to be protected from that (how?); it would open a side channel that allows determining which cache lines in one process conflict with which cache lines in another process (can attackers do something with that that they cannot do now?).
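The CLFLUSH idea sketched in that comment amounts to evicting one specific line rather than the whole L1. A minimal x86-only illustration using the _mm_clflush intrinsic (the function name is hypothetical; real use would need the compiler support and fault handling the comment describes):

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

static uint8_t line[64];

/* Write a value, evict its cache line, and read it back: CLFLUSH
 * removes the line from every cache level without changing memory. */
int scrub_and_check(void)
{
	line[0] = 42;
#if defined(__x86_64__) || defined(__i386__)
	_mm_clflush(line);	/* evict this line from the cache hierarchy */
	_mm_mfence();		/* order the flush before the reload */
#endif
	return line[0];		/* reloads from memory; still 42 */
}
```

Note that CLFLUSH itself is one of the primitives used by Flush+Reload attacks, which is part of why exposing it more widely raises the side-channel question the commenter mentions.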
Posted Oct 17, 2018 8:59 UTC (Wed) by matthias (subscriber, #94967):
Triggering this kind of error is trivial, while the CPU's cache-replacement strategy should take care of not throwing away heavily used cache contents. Also, thrashing the cache is very slow, since by the nature of thrashing every memory access is a cache miss.
Posted Oct 16, 2018 12:05 UTC (Tue) by geert (subscriber, #98403): Instead of flushing, return on a different CPU?
Posted Oct 16, 2018 12:28 UTC (Tue) by roc (subscriber, #30627):
* Some applications close all open file descriptors before they exec a subprocess. There are various ways to do this but I've seen code that uses getrlimit() to get the maximum FD number and then simply calls close() on every possible FD value up to that limit. (Maybe they don't want to depend on /proc being mounted.) That could be tens of thousands or maybe even a million close() calls returning EBADF. I guess this patch would make each of those significantly more expensive.
* Some applications want to probe their address space to see if memory is mapped at an address. You can do this using various syscalls that return ENOMEM or EFAULT if the memory is not mapped ... maybe faster and definitely more conveniently than using a signal handler. I guess those operations would get significantly slower with this patch.
I don't think it's safe to assume that a particular error path is not a hot path for any application.
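The close-every-possible-fd pattern described above can be sketched like this (close_from() is a hypothetical helper; the loop is clamped to keep the illustration fast, which real programs doing this do not do):

```c
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Close every descriptor from start_fd up to the RLIMIT_NOFILE soft
 * limit; return how many close() calls failed with EBADF. Under the
 * proposed patch, each of those failures would flush the L1 cache.
 */
long close_from(int start_fd)
{
	struct rlimit rl;
	long failures = 0;
	int limit;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	limit = rl.rlim_cur < 4096 ? (int)rl.rlim_cur : 4096;
	for (int fd = start_fd; fd < limit; fd++) {
		if (close(fd) != 0 && errno == EBADF)
			failures++;
	}
	return failures;
}
```

With a limit in the tens of thousands, nearly every iteration returns EBADF, so nearly every iteration would pay for a cache flush.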
Posted Oct 16, 2018 21:48 UTC (Tue) by roc (subscriber, #30627):
> Doing it with a whitelist is a safer approach than playing whack-a-mole trying to find all the cases where there might be a vulnerability.
Maybe so, if this approach were a comprehensive, bulletproof fix for a defined class of Spectre vulnerabilities. But it's really just a heuristic that makes exploitation harder in some set of cases that is not easily characterized.
Adding patches to the kernel to achieve not-very-well-understood security benefits, in exchange for not-very-well-understood performance costs, should make people nervous.
Lutomirski is probably right that flushing the L1 is not enough; extracting information from the L2 or L3 may take longer because virtual-to-physical mapping distributes the accesses over more potential places, but it's probably still possible.
L2 is per-core on mainstream Intel and AMD CPUs. L3 is per-package. Overall, with 20-30GB/s of memory bandwidth and 2-16MB L3 cache, the flush costs on the order of 1ms for refilling the L3; I doubt that the WBINVD itself makes that much worse. But if there is <1 such error per second, the performance impact will only be <0.1%.
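That estimate can be reproduced with simple arithmetic (refill_us() is an illustrative helper; it ignores the cost of the flush instruction itself, as the comment does):

```c
#include <stdint.h>

/* Time, in microseconds, to refill a fully flushed cache of the given
 * size at the given sustained memory bandwidth. */
static uint64_t refill_us(uint64_t cache_bytes, uint64_t bw_bytes_per_s)
{
	return cache_bytes * 1000000ull / bw_bytes_per_s;
}
```

For a 16MB L3 at 20GB/s this gives roughly 840 microseconds, matching the "on the order of 1ms" figure; at under one such error per second, that is indeed below a 0.1% slowdown.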
And, as was stated above, hyperthreads are still a real problem: on a busy system it is not so hard to get code running in the same time window in which the victim thread is working.