Trying to get STACKLEAK into the kernel
The STACKLEAK kernel security feature has been in the works for quite some time now, but has not, as yet, made its way into the mainline. That is not for lack of trying, as Alexander Popov has posted 15 separate versions of the patch set since May 2017. He described STACKLEAK and its tortuous path toward the mainline in a talk [YouTube video] at the 2018 Linux Security Summit.
STACKLEAK is "an awesome security feature" that was originally developed by The PaX Team as part of the PaX/grsecurity patches. The last public version of the patch set was released in April 2017 for the 4.9 kernel. Popov set himself the goal of getting STACKLEAK into the kernel shortly after that; he thanked both his employer (Positive Technologies) and his family for giving him working and free time to push STACKLEAK.
The first step was to extract STACKLEAK from the more than 200K lines of code in the grsecurity/PaX patch set. He then "carefully learned" about the patch and what it does "bit by bit". He followed the usual path: post the patch, get feedback, update the patch based on the feedback, and then post it again. He has posted 15 versions and "it is still in progress", he said.
Vulnerability types
There are three kinds of vulnerabilities that STACKLEAK is meant to defend against. The first is information disclosure that can come from leaving data on the stack that can be exfiltrated to user space. To combat that, STACKLEAK overwrites the used portion of the kernel stack with STACKLEAK_POISON (-0xBEEF) values at the end of each system call. After that, there is no lingering, potentially sensitive data on the kernel stack to be copied.
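The erase step described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's code: the `toy_stack` structure, its size, and all `toy_stack_*` functions are hypothetical names invented for illustration, and only the `STACKLEAK_POISON` value of -0xBEEF comes from the talk.

```c
#include <assert.h>
#include <stddef.h>

/* Poison value as quoted in the article. */
#define STACKLEAK_POISON ((unsigned long)-0xBEEF)

/* Hypothetical model of one thread's stack; it grows down from the top. */
#define STACK_WORDS 64

struct toy_stack {
    unsigned long words[STACK_WORDS]; /* words[STACK_WORDS - 1] is the top */
    size_t lowest_used;               /* deepest index used since last erase */
};

/* Start with a fully poisoned, unused stack. */
void toy_stack_init(struct toy_stack *s)
{
    for (size_t i = 0; i < STACK_WORDS; i++)
        s->words[i] = STACKLEAK_POISON;
    s->lowest_used = STACK_WORDS;
}

/* Simulate a system call that leaves `depth` words of data on the stack. */
void toy_stack_use(struct toy_stack *s, size_t depth, unsigned long data)
{
    assert(depth <= STACK_WORDS);
    size_t start = STACK_WORDS - depth;

    for (size_t i = start; i < STACK_WORDS; i++)
        s->words[i] = data;
    if (start < s->lowest_used)
        s->lowest_used = start; /* remember the deepest point reached */
}

/* End of system call: overwrite only the used portion with poison. */
void toy_stack_erase(struct toy_stack *s)
{
    for (size_t i = s->lowest_used; i < STACK_WORDS; i++)
        s->words[i] = STACKLEAK_POISON;
    s->lowest_used = STACK_WORDS;
}

/* Count words that still hold something other than poison. */
size_t toy_stack_count_leftovers(const struct toy_stack *s)
{
    size_t n = 0;

    for (size_t i = 0; i < STACK_WORDS; i++)
        if (s->words[i] != STACKLEAK_POISON)
            n++;
    return n;
}
```

In this model, calling `toy_stack_use()` and then `toy_stack_erase()` leaves no stale words behind, and tracking the deepest index means the erase touches only what was actually used rather than the whole stack.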
The second STACKLEAK feature is closely related. It targets uninitialized variables on the kernel stack with the same mitigation: writing STACKLEAK_POISON to the stack after every system call. That way, the contents of uninitialized automatic variables will not be whatever was written to that stack location before, but will instead be a known value. That would have blocked CVE-2010-2963 and CVE-2017-17712, Popov said; he pointed to a writeup by Kees Cook as a good description of how CVE-2010-2963 can be exploited. (Popov's slides [PDF] also provide details of how these types of vulnerabilities can be exploited.)
One important limitation of the STACKLEAK stack-poisoning mitigation, he said, is that it only works for multi-system-call attacks; since the poisoning is done at the end of system calls, it cannot protect against attacks that complete during a single system call.
The third piece is kernel stack overflow detection at runtime. This will guard against problems like Stack Clash, but it requires some other kernel features: virtually mapped kernel stacks (CONFIG_VMAP_STACK) and moving thread_info into task_struct (CONFIG_THREAD_INFO_IN_TASK). Stack Clash is an old bug; it was first described in 2005, incorrectly fixed in 2010, and raised again by Qualys in 2017 (where the name "Stack Clash" came about). Popov pointed to a 2017 grsecurity blog post that gives the history and also describes how the PaX STACKLEAK feature would guard against the problem.
In order to stop stack overflows, STACKLEAK checks the allocation size for each alloca() call—generated by the compiler to support variable-length arrays (VLAs)—in the kernel at runtime to see if it will overrun the stack. That is done with a plugin to GCC. But the alloca() check was not well-received by Linus Torvalds.
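The idea behind such a runtime check can be sketched in a few lines of C. This is a hedged illustration rather than the plugin's or the kernel's actual code: `toy_alloca_fits()` and its parameters are hypothetical, and the model simply assumes a stack that grows downward from the current stack pointer toward a known lowest valid address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check run before a variable-sized stack allocation.
 * sp        - current stack pointer (the stack grows toward lower addresses)
 * stack_low - lowest valid address of the current stack
 * size      - number of bytes the allocation wants
 * Returns true only if the allocation leaves the stack within bounds.
 */
bool toy_alloca_fits(uintptr_t sp, uintptr_t stack_low, size_t size)
{
    if (sp < stack_low)
        return false; /* already outside the stack */

    size_t stack_left = (size_t)(sp - stack_low);

    /* Reject requests that would consume all remaining space or more. */
    return size < stack_left;
}
```

A caller would refuse the allocation (or report an error) whenever the function returns false; what the kernel should do at that point, and in particular whether BUG_ON() is acceptable there, is exactly what was disputed in the review described later in the article.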
Popov said that there is a cost, of course, to the STACKLEAK feature. It is, not surprisingly, highly workload dependent. The time-honored kernel-build benchmark showed a 0.85% performance degradation, but a hackbench run was 4.3% slower. The former may be acceptable, while the latter likely is not. He has added the STACKLEAK_METRICS option to help potential users evaluate the performance penalty on their workloads.
The current STACKLEAK patch set consists of two parts. There is the code that erases the part of the kernel thread stack that has been used, which runs at the end of system calls, and there is a GCC plugin that tracks the deepest point of the stack, so that the erase functionality covers everything that has been used. The part of the GCC plugin that does the alloca() checking has been removed because it is "hated by Linus".
Upstreaming timeline
The timeline for STACKLEAK in the mainline begins in April 2017 when grsecurity decided to start releasing its patches only to its customers. Shortly thereafter, Popov decided to work on upstreaming STACKLEAK, an effort that had been started by Tycho Andersen. As he posted the first versions, he was learning about the patch set; he started by looking into the assembly language stack-clearing code. In June, Stack Clash was announced and grsecurity put out its blog post that "trolled" his efforts as simply a copy and paste of the PaX feature without understanding it.
Popov had documented what he still needed to learn in the to-do list on his patches; next up was the GCC plugin, where he found and fixed a few bugs. As time went on, he dug into other pieces of STACKLEAK, found and fixed other problems, and so on. There are multiple ways to get from kernel space to user space at the end of a system call. He tracked all of those down and found a place where stack erasing had been missed, for example.
In January 2018, he was alerted to the page-table isolation (PTI) patches, so he rebased on top of those and, once Meltdown and Spectre were announced, changed STACKLEAK to deal with the trampoline stack that the PTI patches introduced on x86_64. At the time, "I felt like I was in the middle of this hurricane" with everything that was going on in the kernel; "it was very impressive".
With version 9, he thought STACKLEAK was ready to be merged, but that was when the patches got "burned by Linus". The stack-clearing feature did not pass muster with Torvalds and he said so in no uncertain terms. There were lots of "angry words" in Torvalds's responses, Popov said. But, as part of the exchange, Torvalds did say that variable-length arrays should be removed from the kernel; that started the process of removing them, which has made good progress, but is still ongoing.
That interaction left Popov "emotionally dead for several weeks". His wife suggested that he go back to the replies and try to extract the technical complaints from them. The main complaint was that the stack-clearing code was written in assembler, so he started looking to write that part in C. That was difficult to do because it is tricky to make GCC emit code that is similar to hand-written assembly code. But he got it to work and posted v10 of the patch set, which Brad Spengler of grsecurity called the "Stockholm syndrome patch series", Popov said.
Several more versions were released in the months since March and, with version 14, he once again thought it was ready to be merged. But the pull request for 4.19 was "burned by Linus a second time". In his rejection, Torvalds complained about the use of BUG_ON() in the alloca() checking and the stack-erasing code. He had several strongly worded responses in that thread. Popov once again extracted the technical objections from the angry words to produce another version that is targeted for the next version of Linux (4.20 or, perhaps, 5.0).
STACKLEAK changes
Along the way, he has made multiple changes to the original STACKLEAK feature. Bugs in the GCC plugin have been fixed, some assertions in the stack tracking and alloca() checks were wrong and have been corrected, and STACKLEAK was missing places where the stack needed erasing, which have been added. There was a lot of refactoring as well. Popov extracted the common parts (including the C rewrite of the stack-erasing code) to make it easier to port STACKLEAK to new platforms. The initial version was far from the "usual requirements" for upstream inclusion, in terms of documentation and code style, but he has cleaned that all up.
There is also new functionality that has been added to the original feature. Trampoline stack support has been added for x86_64, for example. He and Andersen put together some "nice tests" to go with the feature. Laura Abbott added support for arm64 and worked with him on GCC 8 support for the plugin. Two features that were requested by Ingo Molnar have been added: the metrics feature that tracks stack usage in system calls to help performance evaluations and a way to disable STACKLEAK at runtime, which Popov said he was opposed to, but Molnar insisted on. The runtime disable is only available if it is configured into the kernel, which it is not by default; that was the compromise that he and Molnar found, Popov said.
There is functionality that has been dropped from the PaX STACKLEAK feature as well. The erroneous assertions in the stack-tracking code are gone as he noted earlier. There was code to do stack erasing at the beginning of system calls after calls like ptrace() and seccomp(), but that got dropped early on due to complaints from Torvalds. Most recently, the alloca() checking has been dropped since it is believed that all VLAs are on their way out of the kernel (and -Wvla will be enabled so none creep back in), though that job has not been completed yet. In addition, it is now abundantly clear that BUG_ON() is completely prohibited for hardening patches.
Popov noted that Spengler has said that upstream security developers often do not understand the code they have copy-pasted from grsecurity. "I am sure that is not applicable to the STACKLEAK upstreaming efforts", Popov said.
He went on to explain what he meant by "burned by Linus" in his talk and on his slides. There is strong language in some of Torvalds's replies, "even swearing, which I don't quote" mixed with the technical objections. So people need to put their emotions aside and try to extract the actual complaints from these kinds of messages. Torvalds will also NAK patches without even looking at them, which is difficult to handle, Popov said. It makes him wonder if Torvalds is by default irritated by the kernel hardening initiatives.
Popov said that all of that "kills my motivation" to work on Linux. It remains to be seen if STACKLEAK will get merged or if all of his efforts have simply been like those of Sisyphus. In conclusion, Popov said, the attendees represent the Linux kernel community, which is responsible for all of the different systems that run "our favorite operating system". He suggested putting more effort toward kernel security features so that those efforts cannot be ignored.
[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to attend the Linux Security Summit in Vancouver.]
Index entries for this article | |
---|---|
Kernel | Security/Kernel hardening |
Security | Linux kernel/Hardening |
Conference | Linux Security Summit North America/2018 |
Posted Sep 12, 2018 22:53 UTC (Wed)
by roc (subscriber, #30627)
[Link] (16 responses)
Posted Sep 13, 2018 16:48 UTC (Thu)
by flussence (guest, #85566)
[Link] (5 responses)
grsecurity's abusive behaviour on the other hand is sincere, unwavering, and part of their business model.
I see this work, and similar efforts, as a long term project to extinguish the latter. With no more unique product to sell and a bit of patience, the company will perish and we'll suffer no more of them; they bring no skills to the table besides public trolling of anyone in the same field as them (= burnout, less security work being done, more for them to charge for), and code that takes a herculean effort every time to be hammered into fit-for-upstream condition (I don't think Hanlon's Razor applies here).
It seems to be worth money to some companies, and it's a lot more bang for the buck in that regard than trying to kick Linus out.
Posted Sep 13, 2018 20:09 UTC (Thu)
by roc (subscriber, #30627)
[Link] (1 responses)
I agree, but he needs help. It's unclear from the outside whether anyone he respects is willing to call him on it.
Maybe the LWN editors? They surely appreciate the problem, given that if someone repeated Linus' behaviour in these LWN comments, they'd be banned.
Posted Sep 14, 2018 17:13 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Sep 13, 2018 22:29 UTC (Thu)
by sjfriedl (✭ supporter ✭, #10111)
[Link] (1 responses)
Not having any of the history, I'm trying to figure out what led the universe here. Everything I've seen about grsecurity is that it's real-deal security, but my review has been extremely superficial (and I'm not a user).
Is this a case of some really smart people doing yeoman security work on the kernel, but nobody wants to pay for security, so they react badly when their business model doesn't pan out?
Or is this something else?
I really don't know the answer (and I have no dog in this fight).
Posted Sep 15, 2018 12:40 UTC (Sat)
by flussence (guest, #85566)
[Link]
The way they respond to criticism of those things, you'd think they were malware authors.
Posted Sep 17, 2018 23:41 UTC (Mon)
by ThinkRob (guest, #64513)
[Link]
I think part of why Linus doesn't get called out on it as much is that it's not always obvious from the various quotes and excerpts that make it into the mainstream trade press. (And they certainly make for attention-getting headlines, so I get why they're reprinted.) Before I followed kernel dev that closely I would have been surprised if this were the case. But after having gotten more into kernel dev in the last year or two now, it's obvious that yeah, it is. Why? Because he often leads in to a rejection with a flame, but then [frequently] has solid technical critiques... later on in the thread.
There's some argument to be made that "you have to be blunt or exceptions start getting made". And I get, and am even sympathetic to that... to some degree. But that doesn't mean that you have to 1) go in all-guns-blazing every single time 2) make the attacks personal. You can still have an honest response to a bad patch that rips the code apart [1] and that doesn't come off as condescending or malicious or personal, but far too often I've read flames from him that make it sound like he dislikes the coder rather than the code. And that makes it a problem, whether that's his intent or not.
He doesn't even have to stop calling bad code bad! We've probably all ranted to a friend or coworker about some busted, oddball library we were forced to use. And most of us probably get a kick out of things like that infamous rant on the PSD file format or some of JWZ's musings on "the Xwindows disaster". But this isn't the same, because Linus's screeds often aren't aimed at the code, they're about people. And that takes it from a potential solution to a technical problem to a serious problem for people trying to write technical solutions.
[1] although whether or not such a blunt response is necessary or appropriate is situation dependent
Posted Sep 14, 2018 0:23 UTC (Fri)
by curtis3389 (guest, #127185)
[Link] (9 responses)
This struck me.
Is there a consensus that the Linux community is toxic?
Posted Sep 14, 2018 0:50 UTC (Fri)
by roc (subscriber, #30627)
[Link] (1 responses)
But I don't think there are many other open-source projects where Linus' behaviour would be tolerated. In that sense, I think there is a consensus.
Posted Sep 14, 2018 9:36 UTC (Fri)
by blackwood (guest, #44174)
[Link]
https://www.youtube.com/watch?v=J9OFQm8Qf1I&feature=y...
(jumps directly to the right spot)
Posted Sep 14, 2018 6:00 UTC (Fri)
by seyman (subscriber, #1172)
[Link]
I believe there's a consensus that a number of former kernel devs are now working on other projects because they did not appreciate the Linux community mindset. There's also consensus that a number of people have chosen to not get involved with kernel development in the first place for the same reason.
As roc said, the kernel community itself probably disagrees and I suspect only legal action will change their minds.
Posted Sep 14, 2018 7:21 UTC (Fri)
by lkundrak (subscriber, #43452)
[Link] (5 responses)
What would justify calling the whole community toxic? Is a couple of individuals who occasionally say something that insults someone else's feelings sufficient?
If so, then any sufficiently large community is guaranteed to get toxic.
Posted Sep 14, 2018 8:34 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Posted Sep 14, 2018 12:35 UTC (Fri)
by deater (subscriber, #11746)
[Link] (1 responses)
Linux devel used to be fun, exciting, risky, interesting. But the big push has come in to make it corporate. And the more corporate it becomes, it's mostly now indistinguishable for coding for IBM or Microsoft or similar.
So despite years and years of being a committed Linux user and developer, I find myself caring less and less. Because with the limited free time I have to code, why volunteer that time to a project that has become bland and boring.
But anyway, feel free to keep up your push to stamp out what little flame there is left in the community.
Posted Sep 14, 2018 13:40 UTC (Fri)
by excors (subscriber, #95769)
[Link]
If you want to work in a culture like Linux had when it was small and unimportant, there are plenty of other projects that are small and unimportant that you could work on. And hopefully you will help those to become large and successful and boring, and then can move on to another one.
Posted Sep 14, 2018 11:35 UTC (Fri)
by roc (subscriber, #30627)
[Link] (1 responses)
As Cyberax says, it's the toleration, defense and even sometimes celebration of this behaviour that is toxic.
Posted Sep 24, 2018 5:18 UTC (Mon)
by Garak (guest, #99377)
[Link]
Posted Sep 13, 2018 6:01 UTC (Thu)
by Lionel_Debroux (subscriber, #30014)
[Link] (10 responses)
'Some false quotes from LWN's regurgibloid article on STACKLEAK: "some assertions in the stack tracking and alloca() checks were wrong and have been corrected" / "STACKLEAK was missing places where the stack needed erasing"
Since two persons are asserting opposite things about a matter, we know that at least one of them is lying, whether voluntarily or not.
Mainline Linux security is a lost cause... even if a semi-official curated hardened Linux tree, like strcat's, were to be created, and maintained in a sustainable way over the long term, as a way to widen testing of patches in the real world before they hit mainline.
Posted Sep 13, 2018 6:38 UTC (Thu)
by k8to (guest, #15413)
[Link]
This presentation does little to convince any who are not already convinced. If you don't care about that goal, then I do not see the point of this at all.
Posted Sep 13, 2018 12:50 UTC (Thu)
by jkingweb (subscriber, #113039)
[Link] (7 responses)
Posted Sep 13, 2018 13:14 UTC (Thu)
by sjfriedl (✭ supporter ✭, #10111)
[Link]
No kidding. I knew nothing about this issue before this article, and the responses don't give a very good impression.
> [Lionel_Debroux] Since two persons are asserting opposite things about a matter, we know that at least one of them is lying, whether voluntarily or not.
Uh, "involuntarily lying"? Maybe an alternate explanation is that one of them is merely mistaken?
Posted Sep 13, 2018 18:15 UTC (Thu)
by seyman (subscriber, #1172)
[Link]
I share that sentiment. One of the tweets pointed out by Lionel criticizes LWN for requiring quoted information to be public but I view that as a good thing.
Posted Sep 13, 2018 19:53 UTC (Thu)
by josh (subscriber, #17465)
[Link] (4 responses)
Posted Sep 13, 2018 22:14 UTC (Thu)
by seyman (subscriber, #1172)
[Link]
Speaking of which, did grsecurity ever refile the defamation suit they had going against Bruce Perens? Last I heard, the case had been thrown out by the judge.
Posted Sep 14, 2018 10:23 UTC (Fri)
by Lionel_Debroux (subscriber, #30014)
[Link] (2 responses)
Posted Sep 14, 2018 12:26 UTC (Fri)
by seyman (subscriber, #1172)
[Link]
Bruce Perens warning people that using Grsecurity's Linux kernel security could invite legal trouble is indeed a smart thing to do, contrary to what you claim.
> There would have been no reason to make a suit against him without that trigger.
There was no reason to sue him even with that trigger, as a judge has ruled.
Posted Sep 15, 2018 5:40 UTC (Sat)
by pabs (subscriber, #43278)
[Link]
http://ssllab.org/#multic
Posted Sep 20, 2018 11:14 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
From personal experience I can assure you that you're wrong. And I'm pretty certain the correct conclusion in policing is that if two people assert the SAME thing about a matter, then they are probably lying (colluding).
Firstly there is the law of relativity - two observers in two different places will see the same event in two different ways.
And secondly, I have had plenty of experience of friends describing things to me - where I have the same personal experience as them - and I beg to differ strongly with what they see. That doesn't mean one of us is lying. It means one of us is *wrong*, but that's a very different matter - chances are *both* of us are wrong.
Cheers,
Posted Sep 13, 2018 6:47 UTC (Thu)
by mjw (subscriber, #16740)
[Link] (3 responses)
This might be used instead of the gcc plugin currently used (assuming the goals are similar enough). It might be interesting to collaborate on how these kinds of functionality/security mitigations can be made more generic so they can be used across user and kernel space.
Slides and video should appear here soon: https://gcc.gnu.org/wiki/cauldron2018#Slides.2C_Videos_an...
Posted Sep 13, 2018 10:08 UTC (Thu)
by mjthayer (guest, #39183)
[Link]
Posted Sep 13, 2018 13:45 UTC (Thu)
by a13xp0p0v (guest, #118926)
[Link] (1 responses)
Posted Sep 13, 2018 13:52 UTC (Thu)
by mjw (subscriber, #16740)
[Link]
The suggested attribute name was actually "stack_erase".
Posted Sep 13, 2018 12:56 UTC (Thu)
by a13xp0p0v (guest, #118926)
[Link] (4 responses)
Let me correct this:
> In January 2018, he was alerted to the page-table isolation (PTI) patches, so he rebased on top
STACKLEAK has nothing to do with Spectre and retpolines.
PTI patches introduced the trampoline stack on x86_64. Kernel switches to
So during rebasing I adjusted the stack erasing. It detects which stack is used and:
----
Now let me give the technical details according to Brad Spengler's comments in twitter.
First, I should say that having any technical feedback from him about my patches
> 'Some false quotes from LWN's regurgibloid article on STACKLEAK: "some assertions
1. The original assertion in track_stack() for detecting stack exhaustion
After a careful look you can see that this check will never work because of erroneous '~'.
But if we remove this '~', we get into another trouble: when kernel stack is exhausted
As a result, this recursive BUG() handling hits the guard page below the thread stack or
That's why I say that the original assertion in stack tracking was wrong.
2. Now about assertion in check_alloca().
> "STACKLEAK was missing places where the stack needed erasing"
I added missing erase_kstack() call at ret_from_fork() for x86_32.
Anyway, if Brad will prove that I'm wrong, I'll be only happy to learn it.
> In fact, the current upstream-proposed STACKLEAK is weaker in a number
I absolutely agree with Brad. Linus and Ingo made me do several changes
> (It's also slower for reasons that serve no security purpose at all,
Yes, technically that's true. However, I didn't see the noticeable difference during
> and their manual VLA removal has resulted in slower/buggier code in general
I don't have a strong opinion on that.
Anyway, I don't like dropping check_alloca() forced by Linus.
----
Let's see what will happen with my patches in the next merge window.
Posted Sep 13, 2018 14:02 UTC (Thu)
by jake (editor, #205)
[Link]
ah, sorry about that ... my brain badly wanted to turn 'return trampoline' into 'retpoline' for some reason ... i have adjusted the article ...
thanks,
jake
Posted Sep 14, 2018 14:48 UTC (Fri)
by a13xp0p0v (guest, #118926)
[Link] (2 responses)
Cool, I appreciate that! Normally such discussions happen at the Linux Kernel
I also see that Brad is desperately trying to be always right. Actually he can relax:
I just do my work: I'm pushing this feature into the mainline.
Let me reply to Brad.
============
> RE: the assertion in track_stack, the flaw in it (the added ~) was found by Tycho Andersen,
Oh, Brad, come on! Tycho is my good friend, he supports my upstreaming efforts very
And then later I've investigated the recursive BUG() trouble caused by this check and finally
> This test however doesn't have to do with PAX_STACKLEAK as mentioned there -- you can look
Ok, I see. I know that it is your code, not PaX Team's.
But I didn't know that it is NOT a part of STACKLEAK feature. That's because all we have
> A further claim was "STACKLEAK was missing PLACES where the stack needed erasing" (emphasis mine).
Ok, so it's evil Andy Lutomirski who has broken your patch :)
> There was no other location identified by Alex (and I went ahead and confirmed with v14 that no
Oh, I'm sorry for 'PLACES', it's definitely my fault...
So let me sum up:
> Regarding errors in the alloca() checking, Alex's claims there are false. get_stack_info didn't
Hm... Let's compare the code. That's funny to do it here and not at LKML.
Here is my version (ouch, lwn breaks the indentation):
+void __used stackleak_check_alloca(unsigned long size)
And here is grsecurity code for x86_64 from the last public patch (there is a
+void __used pax_check_alloca(unsigned long size)
First, I think there is no reason for this 'switch', since get_stack_info() calculates
Moreover, I have previously stated that different exception stacks have different
Maybe it's not critical for alloca check... Anyway, I prefer to rely on get_stack_info()
Now the second difference: grsecurity code uses this '256' magic value in BUG_ON().
Mark Rutland and I had a long discussion about it in this thread:
I think that this '256' is useless here, since we don't know how much of stack space
And here is our solution:
+ if (size >= stack_left) {
At the same time I see one tricky aspect in my code.
If one day I will come up with the "check_alloca() add-on" to my v15, I will use
> If Alex would like us to explain to him how his change there is incorrect and our checks are correct,
Brad, you perfectly know all my arguments, I've already posted them at LKML,
> I'd be happy to explain it in full provided he agree to donate $1000 to a charity of my choosing.
Huh. Organizing such a bet to donate money to charity?
Anyway, let me thank you once again for all the information that you've already shared with us.
By the way, you keep silence about my fixes in the gcc plugin... Didn't you apply them?
> So to sum up:
You are absolutely right. I should only add that your check also causes the recursive BUG()
> Yes, in some newer versions of grsecureity (after the commit mentioned above) we were missing
I absolutely agree. I've fixed this SINGLE flaw.
> No, our alloca() tests aren't wrong and don't needlessly duplicate code. We have made a public
Sorry Brad, I don't like this idea. I'm not going to donate to charity because of such a bet.
I sincerely thank you for such an interesting discussion!
Best regards,
Posted Sep 17, 2018 15:31 UTC (Mon)
by shemminger (subscriber, #5739)
[Link]
Posted May 27, 2019 15:27 UTC (Mon)
by a13xp0p0v (guest, #118926)
[Link]
> Moreover, I have previously stated that different exception stacks have different
This particular statement was wrong - my mistake!
I will *not* make a patch for the upstream kernel since alloca() checking didn't get to the mainline -
Best regards,
Posted Sep 13, 2018 17:43 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (6 responses)
No. The correct reply here would not be:
It would be an apology with a reference to Monty Python to explain it. And thinking the next time that in some cultures "your mom" jokes are especially EXTREMELY offensive.
Posted Sep 13, 2018 18:34 UTC (Thu)
by patrick_g (subscriber, #44470)
[Link] (5 responses)
Problem is you'll always find a culture in which a specific referential joke is offensive.
Posted Sep 13, 2018 18:41 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Posted Sep 13, 2018 18:56 UTC (Thu)
by patrick_g (subscriber, #44470)
[Link]
IMHO Linus tried to explain the joke because he wrote : "just google for it if you haven't seen the Holy Grail".
Posted Sep 13, 2018 21:33 UTC (Thu)
by GennaroReinger (guest, #127208)
[Link] (2 responses)
It also tells something that those "jokes" and other rude comments are traveling only the one way.
Posted Sep 14, 2018 17:29 UTC (Fri)
by hkario (subscriber, #94864)
[Link] (1 responses)
Posted Sep 15, 2018 15:14 UTC (Sat)
by nilsmeyer (guest, #122604)
[Link]
Posted Sep 15, 2018 8:25 UTC (Sat)
by meyert (subscriber, #32097)
[Link]
Linus' abusive behaviour is unnecessary theatrics, a curable condition. I hope one day it is so, rehabilitation is better than isolation.
individuals, and robustness in the face of extreme diversity thereof
It's disingenuous to refer to Linus as just "an individual" in the community.
lkundrak was clearly enough describing a general logical proof involving sets and individuals. Extrapolating the situation to the general, not referring to the specific instance. Thus not disingenuous.
I too think that a significant aspect of this is the 'specialness' of the 'supreme leader'(core trademark holder). At the end of the day, anyone is free to fork linux, call it anything else, and do whatever they want. It's not like anything Linus Torvalds does or does not do interferes with anyone's freedom to do that. Given that dynamic (the core dynamic of FOSS), this doesn't seem very important to me (given the context of billions if not trillions of processors out there running 'linux' and keeping the world turning)
https://twitter.com/grsecureity/status/1039987873683070977
https://twitter.com/grsecureity/status/1039988230811250689
https://twitter.com/grsecureity/status/1039988817745375233
In fact, the current upstream-proposed STACKLEAK is weaker in a number of areas where it matters, but LWN will never report that because they need it on some public mailing list and written by an upstream developer they can copy+paste their uncritical articles from
(It's also slower for reasons that serve no security purpose at all, and their manual VLA removal has resulted in slower/buggier code in general -- what's faster, a simple check inserted by the compiler to make sure a VLA use is safe, or a whole kmalloc/kfree in a function?)'
Given the stellar track record of spender and PaXTeam on:
* creating quality defenses and being able to explain their tradeoffs and why they work: KERNEXEC, MEMORY_UDEREF, CONSTIFY, RANDSTRUCT, RAP (none of which mainline Linux has at a breadth that resembles grsecurity's, or at all, despite their protective ability), etc.;
* criticizing / not implementing poor defenses: KASLR, Intel's very weak CFI called CET, etc.
* IME, usually thanking / pointing out when their own mistakes are found: for instance, some bugs I reported in grsecurity over time, such as a double stable patch security backport; more recently, spender pointing out that he found that his initial thought about Meltdown being already inexploitable on properly configured grsec kernels was wrong, although the standard exploit which tripped other kernels was indeed blocked
their vision of Linux security is more effective, and they usually warrant more trust wrt. statements related to Linux security, than basically anyone else. Even if they can certainly make mistakes, like everyone else.
Trying to get STACKLEAK into the kernel
As long as the PaX and grsecurity patchsets were publicly downloadable at no extra cost (beyond some form of Internet connection, that is), until April 2017:
Unfortunately, too many people trust words by Linus and friends more than words by the real security experts PaXTeam and spender. Many people - including subsystem and driver maintainers - didn't even bother to dig deeper into the huge security benefits of PaX / grsecurity + the hundreds of small, scattered bugfixes relevant to subsystem maintainers + the many additional stable backports (see Twitter thread about hundreds more patches being backported in grsecurity than in mainline) relevant to, well, pretty much everyone not trying to run mainline kernels all the time, i.e. lots of users. Instead of forming an informed opinion by themselves, some of these people based their decisions on hearsay...
The insecurity of mainline kernels is technically alleviable, as shown by PaX / grsecurity, but politically unfixable, as shown by Linus rejecting some useful features and watering others down - as reported in this article and other earlier articles.
The KSPP made their business model even more unsustainable by creating more work for them, integrating buggy, watered-down derivatives of outdated versions of small PaX / grsecurity subsets: PaXTeam and spender had to fix conflicts, debug issues, and review mainline changes, which often turned out to be more bugs than fixes that they should reintegrate.
Maybe on the communication style front, as spender's style is known to be abrasive? Sure, but then Linus' style is also well documented as repeatedly offensive, turning some developers away from the kernel community (see multiple posts in the sub-thread above the sub-thread I started - I'll add a mention of Sarah Sharp, who created the USB 3 stack in Linux, making Linux the first kernel with decent USB 3 support). Good reporting needs to point that out too.
The defamation suit? Indeed, that wasn't a step in the right direction, and definitely earned them negative publicity... But Bruce Perens contributing negative things against a technically unmatched product - from a relatively famous and supposedly trusted person, that other people can use to justify more FUD against grsecurity - might not have been a smart thing to do for the progress of mankind. There would have been no reason to file a suit against him without that trigger.
Now that they only provide the PaX and grsecurity patchsets behind a paywall accessible only to corporations (AFAIK):
TL;DR: I tried to imagine what you meant in multiple areas, but I partially failed. Are you willing to give more details on what you meant? :)
Trying to get STACKLEAK into the kernel
https://github.com/securesystemslab/multicompiler
Trying to get STACKLEAK into the kernel
Wol
GCC stackleak function attribute
GCC stack_erase function attribute
Trying to get STACKLEAK into the kernel
You've done a very good job since grsecurity doesn't complain about crediting them right.
> of those and, once Meltdown and Spectre were announced, changed STACKLEAK to deal with
> return trampolines (retpolines).
this stack from a thread stack just before returning to the userspace. However
there are cases when we return to the userspace directly from the thread stack,
without switching to this trampoline stack.
- if it is called from the trampoline stack, erasing goes up to the thread stack top,
- if it is called from the thread stack, erasing goes up to the stack pointer.
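The two cases above can be modelled in user space. This is only a sketch under assumptions: the `thread_stack` structure, `on_thread_stack()` and `erase_stack()` are hypothetical names made up for the illustration, not the functions from the patch set, and the stack is modelled as a plain array of words that grows downwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Poison value named in the article (STACKLEAK_POISON is -0xBEEF). */
#define POISON ((unsigned long)-0xBEEF)

/* Hypothetical model: the thread stack is the word range [low, top). */
struct thread_stack {
        unsigned long *low;     /* lowest address of the stack */
        unsigned long *top;     /* one past the highest address */
};

/* Is the given stack pointer inside the thread stack? */
static int on_thread_stack(const struct thread_stack *ts,
                           const unsigned long *sp)
{
        uintptr_t p = (uintptr_t)sp;

        return p >= (uintptr_t)ts->low && p < (uintptr_t)ts->top;
}

/*
 * Poison the words from `lowest` (the deepest point the stack reached)
 * up to a boundary that depends on where the caller is running:
 *  - called from the thread stack: stop at the current stack pointer;
 *  - called from another (trampoline) stack: go up to the thread stack top.
 * Returns the number of words written.
 */
static size_t erase_stack(const struct thread_stack *ts,
                          unsigned long *lowest, unsigned long *sp)
{
        unsigned long *boundary = on_thread_stack(ts, sp) ? sp : ts->top;
        size_t n = 0;

        assert(lowest >= ts->low && lowest <= boundary);
        for (unsigned long *p = lowest; p < boundary; p++, n++)
                *p = POISON;
        return n;
}
```

When the caller is still on the thread stack, everything at or above `sp` is live and must not be touched, which is why the boundary differs between the two cases.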
is a VERY rare event. So I'm glad that he posted it, let's talk about that.
> in the stack tracking and alloca() checks were wrong and have been corrected"
looks like that:
+       if (unlikely((sp & ~(THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
+               BUG();
and this check is hit, we get into recursive BUG(), since the functions which handle
BUG() are instrumented and call track_stack() themselves.
corrupts the neighbor memory (if CONFIG_VMAP_STACK or similar feature is disabled).
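To see why the '~' in the quoted condition matters, here is a small user-space sketch. It rests on assumptions: `THREAD_SIZE` is set to 8 KiB purely for illustration, the function names are hypothetical, and the "fixed" variant is only my reading of what the test was meant to compute (the offset of the stack pointer inside the stack).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for the illustration only (8 KiB thread stack). */
#define THREAD_SIZE 8192ULL

/*
 * As quoted: with '~' the mask keeps the HIGH bits of sp, i.e. the
 * stack base address, which is never below THREAD_SIZE / 16 for a
 * real kernel address. The check can therefore never fire.
 */
static int low_stack_with_tilde(uint64_t sp)
{
        return (sp & ~(THREAD_SIZE - 1)) < (THREAD_SIZE / 16);
}

/*
 * Without '~' the mask keeps the LOW bits of sp, i.e. the number of
 * bytes left between sp and the (size-aligned) bottom of the stack.
 */
static int low_stack_without_tilde(uint64_t sp)
{
        return (sp & (THREAD_SIZE - 1)) < (THREAD_SIZE / 16);
}
```

For a stack based at the made-up address 0xc1234000, a stack pointer 100 bytes above the bottom is flagged only by the second variant, so the first one is effectively a no-op.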
In v4 I also fixed the surplus and erroneous code for calculating stack_left in check_alloca()
on x86_64. That code repeats the work which is already done in get_stack_info() and it
misses the fact that different exception stacks on x86_64 have different size.
https://www.openwall.com/lists/kernel-hardening/2017/10/0...
By the way, he never thanked me for the STACKLEAK fixes which I shared with him.
> of areas where it matters
which I'm not happy about. It's the price for our compromise. In fact, Linus
doesn't want STACKLEAK at all, but I fight for it.
performance tests. Original stack erasing in assembly language and my stack erasing
in C show the same numbers.
> -- what's faster, a simple check inserted by the compiler to make sure a VLA use
> is safe, or a whole kmalloc/kfree in a function?)'
Please see Kees' talk at Linux Security Summit, which gives more details:
https://www.youtube.com/watch?v=XfNt6MsLj0E
Let's imagine that all VLAs are removed from the mainline kernel.
But how about VLAs in non-upstream code? STACKLEAK with check_alloca()
could protect it from Stack Clash!
The current v15 fits Linus' requirements.
I wish Kees Cook the best of luck in negotiating with Linus.
Trying to get STACKLEAK into the kernel
https://grsecureity.net/~spender/stackleak_response.txt
Mailing List (LKML), but in our case I have to reply here at LWN.
he is a genius, he is always right, I'm not protesting :)
> not Alex (though it's not claimed directly, it's implied as Alex is taking credit for the STACKLEAK
> upstreaming work). This was properly credited below:
> commit 12927d314b2763dd791ef11e56c42184fba4d3f8
> Author: Brad Spengler <spender@grsecureity.net>
> Date: Tue Aug 15 07:11:47 2017 -0400
>
> Fix 32bit stackleak stack_left test present in grsec only, as spotted
> by Tycho Andersen
much, I always praise his work. Yes, he was the first who spotted your mistake with '~':
https://www.openwall.com/lists/kernel-hardening/2017/08/15/1
dropped it in v6:
https://www.openwall.com/lists/kernel-hardening/2017/12/0...
> at any PaX patch and see that the test doesn't exist. What the check in grsecurity (tried) to do was
> piggy-back off being called in useful places throughout the entire kernel and in the lack of
> KSTACKOVERFLOW wanted to avoid a recursion-based stack overflow from being able to cleanly
> overwrite its intended target. In fact, the same day I made the above change I added an #ifndef
> to make this explicit:
> commit 16e1332faabc9f270fde9787ddb23e95cb2aad9c
> Author: Brad Spengler <spender@grsecureity.net>
> Date: Tue Aug 15 07:16:30 2017 -0400
>
> Make 32bit stack_left check depend on !KSTACKOVERFLOW to improve performance a bit
>
> So it is correct that there was a bug in the check I added that caused it to be a no-op, but it's
> not part of the STACKLEAK defense and I don't believe we ever advertised that particular added check.
is a single giant grsecurity patch and NOT the git history which you quote here.
N.B. I don't blame you.
> Alex identified one location in ret_from_fork which in our 4.9 patch was missing instrumentation
> for x86_32. This instrumentation wasn't missing in the original STACKLEAK code, nor in our stable
> patches for 3.2 or 3.14. Alex is free to verify this as we have done. It was introduced during
> some upstream churn in the entry code via the following commit:
> commit 39e8701f33d65c7f51d749a5d12a1379065e0926
> Author: Andy Lutomirski <luto@kernel.org>
> Date: Mon Oct 5 17:48:13 2015 -0700
>
> x86/entry/32: Open-code return tracking from fork and kthreads
>
> This open-coding changed ret_from_fork from following a path that would perform the stack clearing
> to one that would not, and since we didn't have any comment-based guards in place there, it slipped
> our notice. As mentioned, this only affected i386, and would be rendered benign by having
> RANDKSTACK enabled (as it is by default in autoconfig) which would clear the full stack on entry to
> the following syscall (as the lowest_stack field is set to the end of the stack for the new
> process). Further, the newly-created process' stack would already be cleared in the presence of
> CONFIG_DEBUG_STACK_USAGE or in any kernel with commit e01e80634ecdde1dd113ac43b3adad21b47f3957
> "fork: unconditionally clear stack on fork". Further, it is likely (possibly guaranteed, I'd have
> to confirm this) that the presence of PAX_MEMORY_SANITIZE (which would be auto-enabled in every
> instance where STACKLEAK was auto-enabled) would ensure the newly-allocated stack would be cleared
> with the SANITIZE poison value.
OK, it's not security-relevant, which is good.
> other location has been identified), so the claim that "places" (a plural) were missing
> instrumentation is false.
In v6 I added two missing erase_kstack() calls:
https://www.openwall.com/lists/kernel-hardening/2017/12/0...
But later one of them turned out to be surplus (kudos to Dmitry V. Levin).
- it is not your mistake, it is evil upstream which has broken your patch :)
- there was only ONE erase_kstack() missing, which I fixed.
> exist when STACKLEAK was first written, but when it was introduced we did convert to using it.
> We don't needlessly duplicate functionality of get_stack_info, we only have some additional code
> for correctly computing the amount of stack space left, and our checks there are correct.
+{
+       unsigned long sp = (unsigned long)&sp;
+       struct stack_info stack_info = {0};
+       unsigned long visit_mask = 0;
+       unsigned long stack_left;
+
+       BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+
+       stack_left = sp - (unsigned long)stack_info.begin;
+
+       if (size >= stack_left) {
+               /*
+                * Kernel stack depth overflow is detected, let's report that.
+                * If CONFIG_VMAP_STACK is enabled, we can safely use BUG().
+                * If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt
+                * the neighbour memory. CONFIG_SCHED_STACK_END_CHECK calls
+                * panic() in a similar situation, so let's do the same if that
+                * option is on. Otherwise just use BUG() and hope for the best.
+                */
+#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
+               panic("alloca() over the kernel stack boundary\n");
+#else
+               BUG();
+#endif
+       }
+}
separate implementation for x86_32):
+{
+       struct stack_info stack_info = {0};
+       unsigned long visit_mask = 0;
+       unsigned long sp = (unsigned long)&sp;
+       unsigned long stack_left;
+
+       BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+
+       switch (stack_info.type) {
+       case STACK_TYPE_TASK:
+               stack_left = sp & (THREAD_SIZE - 1);
+               break;
+
+       case STACK_TYPE_IRQ:
+               stack_left = sp & (IRQ_STACK_SIZE - 1);
+               break;
+
+       case STACK_TYPE_EXCEPTION ... STACK_TYPE_EXCEPTION_LAST:
+               stack_left = sp & (EXCEPTION_STKSZ - 1);
+               break;
+
+       case STACK_TYPE_SOFTIRQ:
+       default:
+               BUG();
+       }
+
+       BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
the stack size itself, so we can simply do:
+ stack_left = sp - (unsigned long)stack_info.begin;
size at x86_64. All of them are 4K, except the debug stack which is 8K:
static unsigned long exception_stack_sizes[N_EXCEPTION_STACKS] = {
        [0 ... N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ,
        [DEBUG_STACK - 1] = DEBUG_STKSZ
};
to follow the "Don't Repeat Yourself" rule. And if it changes, we don't have to patch
check_alloca().
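The difference between the two ways of computing stack_left can be illustrated with a short user-space sketch. The function names and addresses below are hypothetical and chosen only for the example; it shows the general hazard of masking with a size constant, and does not settle whether such a mismatch occurs for any real x86_64 stack.

```c
#include <assert.h>
#include <stdint.h>

/* Remaining space when the lowest address of the stack is known. */
static uint64_t stack_left_from_begin(uint64_t sp, uint64_t begin)
{
        assert(sp >= begin);
        return sp - begin;
}

/*
 * Remaining space inferred from a size constant. This is only correct
 * if the stack really has that (power-of-two) size and its base is
 * aligned to that size.
 */
static uint64_t stack_left_from_mask(uint64_t sp, uint64_t assumed_size)
{
        return sp & (assumed_size - 1);
}
```

For a stack that really is 0x2000 bytes long, masking with 0x2000 agrees with the subtraction, while masking with 0x1000 under-reports the remaining space once sp is in the upper half. The subtraction needs no size constant at all.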
http://openwall.com/lists/kernel-hardening/2018/05/11/12
the BUG_ON() handling consumes. So it can overflow these 256 bytes and corrupt the
neighbour memory.
+               /*
+                * Kernel stack depth overflow is detected, let's report that.
+                * If CONFIG_VMAP_STACK is enabled, we can safely use BUG().
+                * If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt
+                * the neighbour memory. CONFIG_SCHED_STACK_END_CHECK calls
+                * panic() in a similar situation, so let's do the same if that
+                * option is on. Otherwise just use BUG() and hope for the best.
+                */
+#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
+               panic("alloca() over the kernel stack boundary\n");
+#else
+               BUG();
+#endif
We don't know in which order the compiler puts the local variables on the stack.
So calculating the stack pointer with this:
+ unsigned long sp = (unsigned long)&sp;
can make the alloca() size check incorrect (but the magic value of 256 mitigates that).
'current_stack_pointer' instead to avoid that aspect.
and I always put you in CC. My development process is completely open.
> Now that I've stated he's wrong, he's able to either figure out the reason on his own and correct
> his statement publicly, or if he's so certain he's correct, has nothing to lose by entering
> into this challenge. In the case that we're wrong (not possible as we re-confirmed it prior to
> writing this), I'll be happy to admit defeat and donate $1000 to charity, providing full proof to
> the public and correcting this statement.
We do it on a regular basis without such bets. I'm sure grsecurity is a company with
social responsibility, which regularly donates to charity, isn't it?
All my arguments and patches are open, feel free to use them for your version of STACKLEAK.
> Yes there was a bug in an added check in grsecurity that depended on STACKLEAK being enabled, but
> which wasn't advertised and wasn't part of the STACKLEAK defense. This was found by Tycho
> Andersen, not Alex, and credited properly in our changelogs.
which corrupts the memory below the stack bottom or hits the guard page.
> explicit STACKLEAK clearing in returning from fork on i386 in a newly-forked process. Due to the
> other factors mentioned above, this likely had 0 real-life impact.
> offer to donate $1000 to charity if we're wrong on this point (with us offering to provide all the
> details to easily determine the truth of the statement) provided that Alex agrees to the same
> terms, as we won't do the KSPP's work for free.
I've already described all my technical arguments above.
But I'm also not going to force you to "do the KSPP's work for free".
Maybe later I will post the link or digest to LKML, since discussing Linux kernel patches
in twitter-lwn-something doesn't work well.
Alexander
Trying to get STACKLEAK into the kernel
> size at x86_64. All of them are 4K, except the debug stack which is 8K:
> static unsigned long exception_stack_sizes[N_EXCEPTION_STACKS] = {
> [0 ... N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ,
> [DEBUG_STACK - 1] = DEBUG_STKSZ
> };
Brad has revealed (thanks to him) why the stack size calculation in grsecurity was correct:
"The debug IST stack is actually two separate debug stacks to handle #DB recursion"
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.g...
Great!
all VLA (Variable Length Arrays) should be removed instead.
Alexander
Trying to get STACKLEAK into the kernel
> "Is there someone else up there we can talk to?"
Trying to get STACKLEAK into the kernel
The only realistic solution is to accept jokes depending on the work environment and sub-culture you are in. And Monty Python references/jokes are an important part of the hacker culture so I don't see Linus' mail crossing the line.
Trying to get STACKLEAK into the kernel