It does sound rather...
...more like a compiler issue to me from what was said in the article: Optimising-out bounds- and/or null-checking code!!!
Not impressed with the kernel devs' responses anyway.
A recently published attack exploiting newer versions of the Linux kernel is getting plenty of notice because it works even when security enhancements are running and the bug is virtually impossible to detect in source code reviews. The exploit code was released Friday by Brad Spengler of grsecurity, a developer of …
Really the bigger issue here is the SELinux vulnerability, as that does exist on all current distributions using SELinux out there right now, and that particular vulnerability likely goes back several years. No vendor yet has mentioned how long exactly the systems have been vulnerable, but both Fedora 10 and 11 are known to be vulnerable. The vulnerability allows anyone to exploit the large class of null pointer dereference bugs in the kernel, which would not be possible with a regular kernel.
-Brad
Come on, El Reg, I really expect better from you.
The following: "Although the code correctly checks to make sure the tun variable doesn't point to NULL, the compiler removes the lines responsible for that inspection during optimization routines." is completely false.
The bug is real, but it is a very simple bug. Not checking a pointer for a NULL value. No, the code does _NOT_ check for NULL and that is what causes the problem. This has been blown way out of proportion.
For those who know C, this is the relevant code:
struct sock *sk = tun->sk;
...
if (!tun) return POLLERR;
The bug is in the 1st line - it uses tun before checking it for NULL. The check is a few lines below. A very simple bug that happens to the best of us.
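For anyone who wants to see the ordering problem in isolation, here is a minimal user-space sketch. Only the two quoted lines come from the kernel source; the struct definitions and the function names poll_buggy_order and poll_checked_first are made up for illustration and are not the real kernel code.

```c
#include <poll.h>    /* POLLERR */
#include <stddef.h>  /* NULL */

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct sock { int id; };
struct tun_struct { struct sock *sk; };

/* The pattern from the quoted fragment: tun is dereferenced first, so the
 * later NULL test comes too late (and the compiler may drop it).
 * Shown for comparison only; do not call it with NULL. */
unsigned int poll_buggy_order(struct tun_struct *tun)
{
    struct sock *sk = tun->sk;   /* dereference happens here */
    if (!tun)                    /* check arrives after the use */
        return POLLERR;
    return sk ? 0u : POLLERR;
}

/* Same logic with the check moved ahead of the first use. */
unsigned int poll_checked_first(struct tun_struct *tun)
{
    struct sock *sk;
    if (!tun)
        return POLLERR;
    sk = tun->sk;
    return sk ? 0u : POLLERR;
}
```

With the check first, passing NULL simply returns POLLERR; with the original ordering the behaviour for NULL is undefined as far as C is concerned.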
Now the exploit is extremely clever, but the bug itself is trivial.
It's not the fault of GCC that the kernel developers failed to use the proper optimizations to build the kernel with. There exists a specific gcc optimization flag, "-fno-delete-null-pointer-checks" that keeps these kinds of bugs with this pattern from turning exploitable like mine did. This flag will be added in the next stable version of the kernel.
-Brad
I want to know whose idea it was to have the compiler remove NULL pointer checks by default. If you are testing for NULL you are doing it for a reason!
The way I see it this is not a failure of the kernel team for not specifying the -fno-delete-null-pointer-checks compiler flag, it is a failure of the gcc team having the compiler do away with such checks by default!
I, for one, would prefer a kernel that spends a few extra cycles testing for bad parameters to getting my system reamed by some pimply script kiddie in China!
Apparently everyone else gets it but Mikov (who posted a similar response on lwn.net). That from a source review, the bug is unexploitable and yet I have exploited it is what makes this 'clever' as every other security expert (and Linus himself) has agreed.
I'm sorry you don't seem to get it, but you don't make yourself look smarter by spamming your response on every site mentioning this vulnerability.
Oh and for reference, Red Hat has marked the SELinux vulnerability I disclosed as "High Severity":
https://bugzilla.redhat.com/show_bug.cgi?id=511143
-Brad
Well, this interested me, so I wanted to check. Here is what "info" says about that gcc flag:
-fdelete-null-pointer-checks
Use global dataflow analysis to identify and eliminate useless checks for null pointers. The compiler assumes that dereferencing a null pointer would have halted the program. If a pointer is checked after it has already been dereferenced, it cannot be null.
In some environments, this assumption is not true, and programs can safely dereference null pointers. Use -fno-delete-null-pointer-checks to disable this optimization for programs which depend on that behavior.
Enabled at levels -O2, -O3, -Os.
I don't know the kernel environment, so I don't know what happens on a NULL pointer dereference there. But, with typical user code, what gcc is doing is reasonable, if a bit extreme.
The bug is in the kernel code, where the check is *after* the dereference. Even if the author knows that that works in the kernel environment, I think it is still a bad idea because it is quite non-obvious. If performance is that critical, then add a comment explaining what is going on. Adding the gcc flag to the kernel compile flags will help.
All IMHO of course - I'm not a kernel developer.
I'm about to demonstrate that I'm not an expert, but why don't non-kernel processes have their (virtual) first page/segment, into which any null (as zero[*]) would point, by default removed from the process' address space? The hardware would then catch it and hand it to the kernel on a plate.
And perhaps get GCC to report check-after-use constructs like this which are clearly wrong.
[*] null != zero in the C spec but in any current machine I'd expect it to be.
Let me check, this is a potential *compiling* problem, so the kernel code is sound. The compiler is OK, too. It's just a matter of passing the right options at compilation time. Hardly a Linux problem then. More like a *potential* vendor problem...
Good to flag, so that self-compiling guys don't get caught pants down, but hardly the end of the world. Especially as, from what I gathered, any exploit would need to run with the setuid bit set, which, let's be honest, is not bloody likely to happen in any standard distribution, let alone hardened ones. Dubious setuid programs are likely to be prevented from running in the first place. It looks suspiciously like an "Oh my dog, if I run exploit code as root my system might be vulnerable!". Wake up people, regardless of the OS if you run exploit code as root you're screwed. And any attack that needs admin privilege to be efficient is a non-attack to begin with. If I get admin access to your system, I am totally not going to try and exploit an obscure vuln in the kernel. There are much easier and more interesting things to be done. I side with Linus on this one. Any program running with the setuid bit set *is* a potential hazard and should be carefully reviewed, that's why it's considered bad practice, and that's why it's forbidden (or triggers massive warnings) in most serious distros. Now if your sysadmin is willing to make his system wide open, it's hardly an OS problem, is it?
It's still a clever attack, one of which might spread using social engineering to root Ubuntu n00bz. Oh, except that Ubuntu doesn't seem to be vulnerable (yet).
Just one more thing, hardening a system doesn't mean running SELinux. It means (amongst other things) that only trusted code is allowed to run, so this attack code is never going to be allowed to run in the first place. In this regard, the article is misleading: hardened systems are completely, absolutely, positively, 100% safe.
For one second I thought some of my systems could be vulnerable, I'll just relax and have a pint or ten now...
Stepping back for a moment there seem to be a few lingering issues that are not really fully resolved.
A null pointer dereference exception is a nice debugging aid. But in code that is supposed to be secure it can never be trusted. It depends upon the system protecting an area of memory at address zero. Typically a page. Clearly it may fail to trap the dereference if the data structure referenced by the pointer is larger than a page. This is not exactly a common thing, but it isn't impossible. Array indexing through a pointer with large array indexes might also come under this failure, and is a much more common thing.
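To put rough numbers on the "larger than a page" point, here is a small sketch that only does the address arithmetic and never dereferences anything. The 4 KiB guard size and the function names are assumptions for illustration; real systems protect different amounts of memory at address zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed guard size: one 4 KiB page at address zero. Real systems differ. */
#define GUARD_BYTES 4096u

/* Address that base[index] would touch, computed as plain integers so that
 * nothing is actually dereferenced. */
uintptr_t element_address(uintptr_t base, size_t index, size_t elem_size)
{
    return base + (uintptr_t)(index * elem_size);
}

/* Returns 1 if an access at addr would land in the guarded region and trap,
 * 0 if it would land beyond it. */
int access_would_trap(uintptr_t addr)
{
    return addr < GUARD_BYTES;
}
```

With a null base and 4-byte elements, index 10 gives address 40, which is inside the guard and would trap, while index 2000 gives address 8000, which is past the guarded page and would not be caught by it.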
The point is that secure code can never rely upon null pointers being trapped. Code must always check. Always. The corollary is clearly that optimising out null pointer checks is always incorrect in secure code. Always.
It seems that someone forgot that in kernel space, you don't have the possibility of protecting memory like this and that, a-priori, this compiler optimisation is invalid. Always, for all code.
This is sort of worrying really. One would have thought that by now the kernel writers would have spent the time to work with gcc and identify all the optimisations and clearly understand which ones are inconsistent with the peculiar constraints of either secure or kernel mode code.
One might hope that no SUID program anywhere in the OS is compiled with this optimisation. It isn't just the kernel, it is the entire OS build process potentially at risk. So this is an issue that reaches to each and every distribution packager.
Indeed, it might be nice if the gcc developers took a moment out to provide a list of known good optimisations that do not rely on assumptions about memory layout, exception behaviour and likely, but not guaranteed, code structure. Maybe adding a --secure-optimisations-only flag would be a good thing. Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough.
The reason that quote was included in my exploit was because of the incredible incorrectness of it, as I was indeed exploiting the kernel in every case, and in the case where SELinux was enabled, there was no setuid binary necessary at all. So Linus' analysis at the time was completely off. Linus is no security expert -- I don't understand why you Linux zealots prop him up as one. If you really want to know what Linus thinks about my exploit, why don't you ask him about it now that he's (presumably) actually seen it? I know what he's said about it in private, and he is most certainly not calling it "trivial bollocks." So with him as your idol, do you now also agree it's not "trivial bollocks" or do you have any critical thinking of your own? You ignore the response of every other legitimate security researcher and point to a quote from Linus in reference to a video of the exploit I posted last week, which was included in the exploit precisely because it was so horribly and hilariously wrong.
"Trivial bollocks" that is currently unfixed and rated by Red Hat as "High Severity."
It's exactly this "let's fix the bug, patch the software and get on with it" that perpetuates the cycle of "fix the bug, patch the software and get on with it." That's such a 1992AD security mentality, which the rest of the world has moved past, while the Linux upstream still lives in the security stone-age.
-Brad
I develop code to run on MS platforms, and I know what most of you 'Linux people' are talking about. But the fact is that all software contains bugs and vulnerabilities. Jumping up and down and crying out about who's better than who is just childish bullshit. Maybe 'we' realize that and 'you' don't? Personally I don't care which platform people choose. Right tool for the right job I say.
I mean really. When there is a bug on Windows, it is better for *everyone* if it is patched. Likewise when one is discovered on Linux. Nobody wants systems falling down or sending spam (unless you are the one doing the toppling or spamming!).
Can't we all just get along? :)
You'll find that most Windows users are quite well-adjusted and not likely to jump on the fanboy bandwagon the way I see some Linux users/zealots do (I am not saying that all are like this, just enough to make it noticeable).
I have grown tired of all the fanboyism that comes with a story that projects Microsoft / Linux / Mac / etc in a less-than-perfect light. I wonder when the time will come that people realize that it is just a personal choice, that nothing posted on a forum will ever change the minds of others, and that extreme thinking only takes away from your argument.
I suppose this is the same as how everyone has seen the Muslims / Christians / Jews: a few extremists will cast the entire religion in a bad light and then everyone else will assume that a person from another religion is a terrorist / bible-thumping racist / money-grubber.
Life has taught me that there will always be trolls harming the adoption of good ideas: people constantly using inflammatory phrases in an attempt to convert others to their side, but only harming their own position. Perhaps the only way to treat people like this is just to ignore them, to effectively deny them the attention that they so crave.
It's a code bug, not a compiler bug. The compiler ignores the redundant NULL check after the pointer's been used (which, assuming the compiler is smart enough to work out if a pointer may have changed its reference, is perfectly reasonable). So, surprise surprise, open source doesn't lead to perfect coding however much some people believe it does. True, this isn't a major issue, but it's still slightly worrying that NULL reference checks aren't checked with a static analysis tool before releases. I would have thought that was pretty standard practice for something as important as an OS kernel.
> "Setuid is well-known as a chronic security hole," Rob Graham, CEO of
> Errata Security wrote in an email. "Torvalds is right, it's not a kernel issue,
> but it is a design 'flaw' that is inherited from Unix. There is no easy
> solution to the problem, though, so it's going to be with us for many
> years to come."
Um, so doesn't this translate as "Linux is known to have a major security hole that is unlikely to be fixed in the near future"?
This 'exploit' requires the user to have root in the first place, to inject a setuid program into the system (which would be caught by the next run of tripwire and SELinux wouldn't let it run anyway, but let's not let facts get in the way of a good story).
If the bad guy gets root = game over. Anything else they do is just icing.. even SELinux isn't an absolute defence against this.
I agree the optimisation flag on gcc is the real bug - it should be flagging these dereferences as errors not deleting the tests.
The simple bug is dereferencing tun in the line "tun->sk". The fact that after that there's a NULL test on "tun" which GCC correctly optimises away, doesn't make it a more serious or unusual bug -- although it would certainly be nice if GCC issued a warning "optimising away NULL test because you've already dereferenced it". In particular, your contention that "from a source review the bug is unexploitable" is wrong, unless the source reviewer in question somehow misses the "tun->sk" line with the bug in.
The bug in PulseAudio, which the Reg article somehow conflates with this one, is of course completely separate.
Peter
It's because the M$ Camp (me included) all have hangovers on Saturday morning as we were all out last night with real women in real pubs not geeking out over some compiler issue.
Obligatory flame
*nix sux - cry yourself to sleep cos some bloke with a beard made a mistake in your shitty OS
(I don't care really - just joining in for the sake of it)
The core problem here is this is a dangerous optimisation that should only be enabled explicitly, not bundled into -O3. It's dangerous because it assumes the privilege level of the code being compiled and the system behaviour of the target. It should default conservatively and doesn't.
The source itself is strictly correct but inherently dangerous: it assumes knowledge the compiler doesn't automatically have and could have been written more robustly. It's sloppy. Being blindsided by gcc gets them off just once; they need to take this much more seriously. I want robust defensive coding in my kernel, not blame shifting.
"Apparently everyone else gets it but Mikov (who posted a similar response on lwn.net). That from a source review, the bug is unexploitable and yet I have exploited it is what makes this 'clever' as every other security expert (and Linus himself) has agreed."
I expect that's because he based his diagnosis on the much-quoted code fragment...
struct sock *sk = tun->sk;
...
if (!tun) return POLLERR;
If this is the vulnerability, then it is indeed a trivial "used before checked" bug. "tun" is clearly used before it is checked and any decent data flow analysis would pick it up even if it is buried in a long and confusing routine. LINT has done this sort of thing for years. I doubt the Linux kernel is marked up with all the annotations required, but to suggest that this can't be found by examining the source code says more about you than the state of the art.
If this is not the vulnerability, perhaps you could enlighten Mikov and the rest of us.
I have not looked at the code in question and have no intention of doing so, but according to the previous comments, the problem is either a compiler bug (though I see that this is disputed) or the code is checking for a NULL reference AFTER it has dereferenced it, which even if it is legal on a particular platform, is a bloody stupid thing to be doing!!!
Either way, this should be trivial to fix (ok - fixing the compiler could take a while if that really is the issue, but it's hardly insurmountable).
But being Linux, I suppose those responsible need to slag each other off and argue and talk shite for a couple of months before anything actually happens. Mmm.... I think I'll stick with BSD, thanks.
You should have just printed the 300 lines of comment as the article, very funny, though perhaps in places not intended.
Fair play though, not a bad piece of code, but this is why we don't use brand spanking kernels. That being said, it is a bit of a non-issue, there are some very specific circumstances and dependencies needed here, and the exploit is a tad flimsy in places- though hey, if it works...
> "This is sort of worrying really. One would have thought that by now the kernel writers would have spent the time to work with gcc and identify all the optimisations and clearly understand which ones are inconsistent with the peculiar constraints of either secure or kernel mode code."
They do, and the gcc team are happy to work with them and add new options or modify optimisations to make the compiler more suitable for their usage. But I don't know of anywhere the kernel team have ever sat down and made a clear list of what they do and don't want the compiler to do; they're a bit reactive rather than proactive, what tends to happen is that some optimisation turns out to cause a problem for some bit of code in the kernel, the kernel team approaches the gcc team and gets the problem addressed, then six months later it all happens again...
>" Indeed, it might be nice if the gcc developers took a moment out to provide a list of known good optimisations that do not rely on assumptions about memory layout, exception behaviour and likely, but not guaranteed, code structure. Maybe adding a --secure-optimisations-only flag would be a good thing. Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough. "
Now you wait just a cotton-pickin' minute there. Kernel development is hard-core stuff, and not suitable for amateurs and dabblers. You need to know how a computer works from top to bottom to do it, you need to understand everything from hardware and busses and memory accesses and caching to low level assembly and synchronisation and threading techniques up to the level of security and usage patterns and efficient algorithm design - and you need to understand how the toolchain works and what it does. Kernel developers have very special and unusual requirements, and a compiler is a general purpose tool for a broad audience. It is for kernel devs to know and clearly explain their requirements, not for non-experts (compiler devs) to attempt to second-guess them. They should absolutely be expected to read the fine print of the optimisation flags they want to use to build their code - it's only one more drop in the ocean of fine print they need to read and understand to write reliable kernel code.
There seems to be confusion here as to whether it was a compiler bug or a coding error, whether the kernel is flawed or not ...
If it is the compiler optimising away something when it shouldn't have, this is a fail for the compiler developers.
If it's a case of the source code being correct and the compiler optimised away a null check this is a fail for the developers who built the kernel and their process, but not a fail for Linux per se.
If the source code is incorrect then that's a fail for the kernel programmers and no amount of buck passing to the compiler or arguments that it's not a bug will wash.
No matter how the bug arose, if it exists, is exploitable and demonstrably so in the field, it's a huge fail for Linux either way, and trying to claim it's not worth worrying over is simply trying to downplay the issue.
If there's no source code error, the compiler did not optimise away something it should not, and no exploitable bug exists then I'll agree it's a storm in a tea-cup. Unfortunately that does not seem to be the case.
Yes, there's one problem with your theory.
The Linux kernel has a known, demonstrably exploitable security problem in the field, and the kernel developers do not wish to fix it.
Trivial or not, apparently it's not so trivial that they'll be fixing it any time soon.
No, the reality is that too many Linux zealots including the kernel developers refuse to ever accept they're wrong on anything.
This is why Linux is never going to make traction whilst this attitude is so prevalent and why it's stuck in a rut. Because Linux developers write the code that Linux developers want to write, usability be damned. Find a security exploit in their code? They'd rather let it stay in there claiming it's not their fault than accept they're not perfect and are equally capable of making simple, blatant mistakes.
There are two problems.
The first, and most critical problem, is a bug in GCC, where it optimizes away null pointer checks in some cases where it should be giving a fatal compile time error.
That's right. I'm saying GCC should bomb on this code, complaining that the pointer is used before the null pointer check.
Note: I think GCC can safely optimize away redundant null pointer checks - and I've certainly seen code that has those. But optimizing away a null pointer check simply because it's already seen broken code is stupid.
The second problem is that some of the people on the Linux kernel do not apparently intuitively grasp the seriousness of this.
The compiler is a red herring. It doesn't delete NULL pointer checks - *UNLESS* you've already dereferenced the pointer, in which case it quite reasonably assumes that you've already crashed before you get to the check anyway - and the exploit proves that it is correct in this assumption, or rather that crashing would be a best case! But testing for a NULL pointer at the end of the routine is just too late: the bug occurs here:
struct sock *sk = tun->sk;
At that moment, because it is allowed for a user process to map memory at address zero, what you have done is inject a user-controlled data structure into the kernel, which implicitly trusts its own data structures. That is the security violation, and it's nothing to do with the compiler, it's more a consequence of a false assumption in the kernel:
1) I can trust all pointers to kernel objects, because they will only point into kernel space and only privileged (i.e. trusted) code can place anything in kernel space.
2) But NULL is a pointer value and it is not in kernel space.
3) The kernel trusts *all* pointer values, including the one that happens not to be in kernel space but is under user control.
Ouch. The false assumption could be stated in a single sentence as "we can trust all pointer values, including NULL, because even though it's technically not a kernel address it will always cause a crash if you access it". But no, it won't, that's just not true.
What might work better would be to build an option in the compiler to use a value like 0xffffffff as a NULL pointer instead of numerical zero. Or for a few pages (maybe even a few meg?) down at the zero end of memory to be declared 'honorary kernel space', protected by the same kind of PTEs that prevent the user accessing kernel space, and not mmap'able, although that might only mitigate rather than fully block the entire class of exploit.
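Whether a user process is actually allowed to map address zero on a given box can be probed with something like the sketch below. It is a rough user-space test written for this discussion, not part of the exploit; the function name can_map_page_zero is made up, and the result depends on the system (on many current Linux systems a setting such as vm.mmap_min_addr makes the request fail for ordinary users).

```c
#define _GNU_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Tries to map one page at address zero.
 * Returns 1 if the kernel allowed it, 0 if the request was refused.
 * The mapping is removed again on success. */
int can_map_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len = page > 0 ? (size_t)page : 4096u;  /* fall back to 4 KiB */

    /* MAP_FIXED with address 0 asks for the mapping to sit exactly at
     * the bottom of the address space. */
    void *p = mmap((void *)0, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    /* Compare against MAP_FAILED, not NULL: on success the returned
     * address is zero, which would compare equal to NULL. */
    if (p == MAP_FAILED)
        return 0;

    munmap(p, len);
    return 1;
}
```

If this returns 1, then a kernel dereference of a NULL pointer on that system can end up reading data the user process placed there, which is the situation described above.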
I don't understand grsecspender's response:
>"from a source review, the bug is unexploitable"
We've known for some time that dereferencing a possibly-NULL pointer is exploitable, it was first shown by that ARM exploit by Barnaby Jack
http://www.theregister.co.uk/2007/06/13/null_exploit_interview/
then there was the SWF/ActionScript null pointer dereference by Mark Dowd
http://documents.iss.net/whitepapers/IBM_X-Force_WP_final.pdf
so frankly, any source code audit that doesn't ask the question "Could this pointer possibly be NULL?" is not asking the right questions at all. I think it's fair to say that there are two bugs: the use-before-NULL-check bug is one, and the user-is-allowed-to-mmap-NULL is the underlying and more serious bug which is what enables this and the whole class of other similar bugs to be exploitable. (You could probably argue that a third bug is in the code that calls this routine while passing it a NULL pointer in the first place.)
That GCC was optimising-out a null-ref check when it could clearly see that the variable had already been used, is *expected behaviour* for the compiler and clearly documented in its manual. If the programmer couldn't be bothered to RTFM, that's *his* fault, not the compiler developers'.
That non-time-critical code for an allegedly modern operating system is being written in a portable assembly language that's getting on for nearly 40 years old is the real bug here. Even my humble Nokia 2630 is more powerful than the computers C was created for.
Oh yes: for those who haven't read the original article the Register's piece was based on, take note of the following quote from grsecurity's own website:
"Due to Linux kernel developers continuing to silently fix exploitable bugs (in particular, trivially exploitable NULL ptr dereference bugs continue to be fixed without any mention of their security implications) we continue to suggest that the 2.6 kernels be avoided if possible."
Note that they're referring to *multiple* instances of these kinds of bugs. The line people enjoy quoting was just one example, which the researcher used to write his proof of concept exploit. That the Linux kernel is *riddled* with such bugs reflects poorly on its developers.
This is also the kind of bug which, had the software been written using half-decent tools, would never have made it into the released code in the first place. For all the criticisms of Microsoft's ".NET" languages, the simple fact is that C# wouldn't have let you write such bad code in the first place. When a user interface—and that's all programming languages are—makes unwanted actions easy to perform, it is time to replace it.
Microsoft may have their flaws, but at least they're trying to do something about the appalling tools this industry insists on using. They're still a long way from development nirvana, but at least it's *something*.
(Oh yes: my computer is a Macbook Pro, not a Windows box. So please don't waste time accusing me of fanboyism. There's no such thing as a "best" platform. Only a "least worst".)
>"The first, and most critical problem, is a bug in GCC, where it optimizes away null pointer checks in some cases where it should be giving a fatal compile time error."
It's not a bug. It is an explicitly documented feature in the manual, which also warns you to take care with it:
> >" `-fdelete-null-pointer-checks'
> >Assume that programs cannot safely dereference null pointers, and that no code or data element resides there. This enables simple constant folding optimizations at all optimization levels. In addition, other optimization passes in GCC use this flag to control global dataflow analyses that eliminate useless checks for null pointers; these assume that if a pointer is checked after it has already been dereferenced, it cannot be null.
> >Note however that in some environments this assumption is not true. Use `-fno-delete-null-pointer-checks' to disable this optimization for programs which depend on that behavior.
> >Some targets, especially embedded ones, disable this option at all levels. Otherwise it is enabled at all levels: `-O0', `-O1', `-O2', `-O3', `-Os'. Passes that use the information are enabled independently at different optimization levels. "
>"That's right. I'm saying GCC should bomb on this code, complaining that the pointer is used before the null pointer check."
I'm saying that when the user explicitly tells GCC to assume that it can do this, GCC should assume that it can do this, and if it is not true, the user should not have told lies to the compiler, and the compiler should do what the user tells it. Maybe you *want* to get a SEGV if the pointer is NULL because you plan to handle that elsewhere? The compiler can't second guess you. You told it that control flow cannot possibly reach that test if the pointer is NULL; it would be stupid of the compiler to bother inserting it.
>"Note: I think GCC can safely optimize away redundant null pointer checks - and I've certainly seen code that has those. But optimizing away a null pointer check simply because it's already seen broken code is stupid."
It's not stupid. The vast majority of these checks are not going to come from buggy code like this, but from places where any/and/or/all of macro expansion, inlining and templating have combined to generate inefficient code. If the compiler wasn't very aggressive with optimising this stuff, C++ templates would still be the hideous bloated monstrosities they used to be back in the 90s - i.e. practically unusable in anything that has to be the least bit efficient.
You can argue about whether this is a "dangerous optimisation", and should only go in -O3 by default (I might agree with you), or whether it shouldn't be turned on by default at all but always left to the user to request (I'd probably disagree), and you can argue that it should generate a warning (I'd certainly agree with you, but I might consider that sufficient reason to leave it enabled at lower -O levels), but calling it "stupid" is simplistic and lacks insight into the issues. As I think I mentioned once before, GCC is a general purpose tool that must work for a huge range of different applications from small realtime embedded to overnight number crunching batch jobs. No single set of optimisations is ever going to be completely right for all those applications, and the -O levels are crude guidelines, but if you have a very specialised need, you need to take control of how you use your compiler.
Sure, get rid of C, that will free up our time to concentrate on reference counting, garbage collection, bytecode inefficiencies, blah blah blah :).
Linux, Windows and Mac all run on C-based kernels. C may have 30-year-old problems, but at least they're *understood*.
It's worth pointing out that there are many demands on software, not just security. Execution speed comes out pretty high on the list, and nobody wants to cripple their PC with a kernel that already killed their performance for them before they launch their first application.
C is 'portable assembly' because that is precisely what is required for an efficient kernel implementation. If kernel developers could write in something else they would - they are not masochists!
Making it an error would pretty much go against the principles of C++, as there are plenty of reasons why you might want it to actually do that. Even putting it in as a warning is a bit dodgy as far as I'm concerned.
A far better way of managing it would be to have a compiler switch that flags them as warnings instead of just optimising them away silently, then when you add new code you could easily keep track of it, especially if you've done some pointer intensive code.
Surely the reason for the optimisation is (among other things) code like this:
static inline char foo(char *p) { if (p == 0) return 0; else return *p; }
char bar(char *p) { *p = 2; return foo(p); }
int main() { char c = 0; return bar(&c); }
If foo gets inlined into bar, the compiler can spot that the null pointer check in the inlined code is unnecessary and remove it. This is a most excellent optimisation (granted, in this example foo and bar do so little work that other optimisations may render it unnecessary).
As far as the C standard is concerned, this optimisation doesn't have to assume that a null pointer dereference would halt the program. The dereference of a pointer which may or may not have been null means that the implementation can thereafter assume it wasn't null. If it was null the behaviour of the rest of the program is undefined anyway, so the tiny detail of the assumption being false doesn't make it invalid. If dereferencing null is valid and is supposed to have predictable behaviour, then you're into non-standard C, so you have to read the compiler docs. GCC's behaviour appears to be (a) standards compliant and (b) documented, so should come as no great surprise to the programmer.
For my example code, the optimisation certainly should not result in a compiler warning or error. There's nothing wrong with either function foo or function bar. It's just that one of them takes the (perfectly reasonable) approach of checking its input, and the other one takes the (also perfectly reasonable) approach of requiring that its callers not pass in null pointers. Standard functions exist taking both approaches - compare for example time() and strlen().
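To make the two approaches concrete, here is a minimal sketch of both contracts side by side. The names checked_length and unchecked_length are made up for illustration and are not from any library. The first accepts a null pointer as a legitimate argument; the second, like strlen(), simply requires that the caller never passes one.

```c
#include <assert.h>
#include <stddef.h>

/* Approach 1: check the input. NULL is an accepted argument. */
size_t checked_length(const char *s)
{
    size_t n = 0;
    if (s == NULL)
        return 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Approach 2: require a non-null argument, as strlen() does.
 * The assert only documents the precondition in debug builds; with
 * NDEBUG defined it disappears and passing NULL is undefined behaviour. */
size_t unchecked_length(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}
```

Neither function is wrong. If checked_length were inlined after a dereference of the same pointer, its null test is exactly the kind of check the compiler is entitled to remove.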
Maybe the point is being missed here.
I doubt that anyone regards the existence of an exploitable bug/feature in Linux to be good. However, ask yourselves why there is argument about whose "fault" it is...
In a complex system, one always tries to put the right solution in the right place. There are several ways this problem might be fixed. Choosing the wrong one might fix it more quickly, but may cause problems later. If speed is not the over-riding issue (and it seems it isn't) then thinking carefully (and this means arguing) about whose responsibility it is to protect against this problem is the correct response.
Once you have the correct protection installed in the correct place and everyone knows whose responsibility it is to look after this in future, then you have a more robust system. Failing to argue this out and fixing it the wrong way just starts you on the path towards a system that's unmanageable from a security point of view. I think you all know the example I'm thinking of...
@spendergrsec: Brad, you should really stop tooting your own horn, and it would also help if you weren't unnecessarily rude. Everybody so far has acknowledged that the exploit is very impressive. Good work. I really mean that and have said it from the start. But please, don't let that go to your head.
I am trying to clarify to readers of El Reg who may not be experts in C or the Linux kernel (unlike the crowd in LWN) that, contrary to what has been said, this is an ordinary run-of-the-mill bug which is easy to spot and fix in a regular code review, and it is not caused by a flaw in GCC.
@BlueGreen: Normally the hardware would catch the NULL pointer reference and it would result in a kernel oops. However part of the exploit is that it (relying on another bug) first maps valid memory at address 0. It really is a very clever exploit relying on unrelated kernel bugs.
The bug in question itself however is trivially noticeable and fixable. Any tool like LINT would have caught it (in theory; in practice it is not so easy to run LINT on the kernel).
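For anyone who wants to see the pattern in isolation, here is a rough sketch. The struct definitions and the SKETCH_POLLERR value are simplified stand-ins I made up for illustration, not the real kernel definitions; only the order of the dereference and the check mirrors the two lines quoted earlier.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct sock { int id; };
struct tun_struct { struct sock *sk; };

#define SKETCH_POLLERR 0x0008u  /* illustrative value only */

/* Buggy order: tun is dereferenced before it is tested. Once the
 * dereference has happened the compiler may assume tun is non-null
 * and drop the test. Calling this with NULL is undefined behaviour. */
unsigned int poll_buggy(struct tun_struct *tun)
{
    struct sock *sk = tun->sk;
    if (!tun)
        return SKETCH_POLLERR;
    return sk ? 1u : 0u;
}

/* Fixed order: test first, dereference afterwards. */
unsigned int poll_fixed(struct tun_struct *tun)
{
    struct sock *sk;
    if (!tun)
        return SKETCH_POLLERR;
    sk = tun->sk;
    return sk ? 1u : 0u;
}
```

Building with GCC's -fno-delete-null-pointer-checks keeps the late test in poll_buggy from being removed, but the dereference still comes first, so reordering as in poll_fixed is the actual fix.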
When an article of this nature comes up for Mac or Windows, we flame each other, etc, but at the end of the day, the company fixes it. When this comes up for Linux, a whole bunch of code monkeys have a pissing contest ("I know more about coding than you, look at this crap I typed" and "mom, mom, he said I couldn't code, tell his mother so he gets spanked") and argue that it's a compiler issue.
And this is why Linux is shite: it's for code monkeys, who actually like to spend their weekends coding and compiling, rather than having a life. As they say, you get what you pay for.
TANSTAAFL
1) Big geek fight over whose fault it is - problem not addressed
2) Much finger pointing - problem not addressed
3) "It's a feature!" (of either the kernel or GCC) - problem not addressed
So Linux out in the field has (or will have...) a critical security flaw and the freetards are too busy waving their pocket protectors about to actually fix the problem. With an attitude like that, is it any wonder most organisations who rely on IT to run their business would not touch Linux with a shitty stick?
Get your act together, you bunch of jumped-up prima donnas; you are not doing yourselves, Linux or the open source community any favours with your public bitch-fest.
I normally wouldn't bother posting on this subject matter because I'm not a fan of Binux, but I had a feeling the Binux geeks wouldn't be able to post without mentioning Microsoft, and you didn't let me down. Great, so you use Binux, but get over your fascination with Microsoft and your hatred for those who prefer it over your free alternative. Yes, Binux is free and people still prefer the paid alternative LMAO