random...
They could seed the RNG with MS' next OS release date
The pseudo-random number generator used by Microsoft in Windows is flawed, according to security researchers. A team of cryptographers led by Dr. Benny Pinkas from the Department of Computer Science at the University of Haifa, Israel were able to unravel how the CryptGenRandom function Windows 2000 worked, without assistance …
There was a Dilbert cartoon where he was touring accounts. Sitting at a desk was an accounting troll(Who according to Dilbert was made of 90% snot), repeatedlly saying 9,9,9. When Dilbert asked the Head of accounting troll what he was doing, he was told,"He's the random number generator, you never really know do you? "Heheh
An age old problem with random generators! Now, I have seen some companies giving the "trivial" task to code the random number generator to, let's say a little inexperienced, developers and never testing it. But today when random nuber generators are so important ?? It is tedious and takes time to test any new or even old generator on new system / platform but the methods are widely available.
From the Pre-Windows days. "Why bother redoing an algorithm for such a trivial thing as an RNG just because we're gluing a new front-end to DOS?"
And then as the OS is further upgraded and rebuilt from Windows 1.0 to the start of the DOS-less lineage, the thought would have been "Well, that code works. Why waste the time?"
And then the DOS-less branch starts up. "Why should we write our own RNG code? Lets just gank the one out of the DOS version."
And then as that grows into 2k, XP, Vista, etc. you get "That code works. Don't waste the time."
The idea of testing a random number stream by compressing it is both true and useless at the same time. A perfect random stream is indeed incompressible. But any pseudo-random stream is always compressible. The problem is in two parts.
To any computer user "compress" means using a standard compression utility. Which won't work. These are all crafted to take advantage of the low entropy in the target source stream. So one can compress text files due to the redundancy in the language and can use even simple coding (such as Huffman.) Similarly, loss-less music compression and loss-less image compression are designed to look for common patterns that arise in the nature of the source.
One could not take a random number stream and try to compress it with any of these. Even if created with a badly broken generator these compressors would not find any traction.
The key concept is algorithmic information content. Working out the minimum sized device that is capable of reproducing the input stream. The Kolmogorov complexity. The issue with any software RNG is that this complexity is low. It is after all nothing but the RNG function itself, plus the seed. This is irrespective of the implementation or algorithm.
What is broken in the MS RNG appears to not be the algorithm itself, but rather the surrounding implementation. This suffers from two flaws. Emitting a very large stream of deterministic data before getting a new seed, and allowing the seed to be seen externally. It matters nought what the algorithm itself is, with these two flaws in implementation it is fatally damaged. The Kolmogorov complexity requires only knowledge of the algorithm (trivially reverse engineered from the machine code) and the current value of the seed, which we hear is subject to attack and discovery. Since the algorithm itself is fixed, we have a situation where 128k numbers are encoded by a single seed value. This shows the relationship between compression and random numbers. We can create a compression of the stream that is 128 thousand to one. But again, this is true of any pseudo-random generator. This is why seed renewal is so important (and keeping the seed secure.)
Creating a good quality seed is also hard. The entropy present in many seed generators is not high. Grabbing "random" values out of various system registers and the like is very unlikely to get you anything approaching a full word width of entropy. Even physical processes are subject to contamination from nearby periodic noise sources, reducing their entropy.
In all it is a hard problem. However the two blunders in the MS implementation are unforgivable. They suggest that the code was written by someone who had little or no understanding of the issues.
If you add up that there is no LOCK mechanism for memory segments to avoid them going to swap, that compromises even more any crypto process in Windoze.
Now that I think it, you don't need to monkey around with RNG's. Just wait for IE to go over to swap space, and wham, you got your decrypted SSL keys right there in the swapping partition. Oops!
Are we sure Vista and XP still do it this way? Security was a real focus for Vista so maybe they changed it.
Also do we know if the SSL and crypto libs use this function or did they cook their own? Or did they do more to randomize the number?
Have there been any successful attacks using this method?
This is a genuine extract from a document for the Finnish Bankers' Association. Note the second line of the first paragraph particularly, regarding key-generation:
------
Appendix 3: Key management
A key common to all banks is used in the calculation of the authentication identifier. The key is generated in the Finnish Bankers’ Association by tossing a coin 64 times and entering the result so that heads is 0 and tails is 1. The 8-bit bytes of the 64-bit key are given an odd parity, the bits are converted into a hexadecimal format and the result is the key common to all banks.
The key is transferred to reliable people within the banks and they enter the keys into the same or equally protected system as the system where the PATU dongles are stored. The technical solution is bank-specific but the security level must be the same.
------
Predict that!
With no real, effective peer review of the algorithm and implementation, are we surprised?
I have reason to believe, based on knowing several of them, that the technical folks working at Microsoft are generally technically competent. Unfortunately, the Microsoft corporate culture doesn't actually allow any of that talent to create quality products, as the competent technical people are, as usual, simply peons to the mostly technically incompetent managers. Managers whose focus is strictly on the next quarter, not on long term quality.
Ah well. Eventually such a system can hopefully only crumble as people realize that they're being ripped off. Until then, it's a good lesson in how fools and their money are soon parted.
Following the recent article on the pet 2001. We noticed this problem with the rnd generator in basic. we developed a work round; seed the random generator with -ti. The negated time value was a relatively random seed. so running the same code sequentially would get a different value from the rnd function each time.
Nearly 30 feckin' years and Microsoft STILL haven't fixed it.
I loved pointing out to my fellow developers how easy it was to work out a 'random' key based on time/ticks/any system count.
Your clock resolution is in the milliseconds. you used the key at 14/11/2007 at 10:45:30 AM, how many seeds to I have to try before i hit on your seed, and this your key......
If this is such an issue for generating SSL keys then why don't those programs use their own random number generators instead of using Windows' one?
Personally I've found it a godsend that the random number generator routine has stayed the same all through every 32 bit Windows. That way you can use it in games to create random maps for example that will be the same on any version of Windows if you use the same seed number.
Don't blame MS if you're using their simple random number routine when you need something a bit more sophisticated. Get off your arse and write your own.
Quote "It's absolutely impossible for a computer to generate truly random numbers. It can generate truly huge numbers but any number generation based on a formula or algorithm cannot be truly random"
Yes, and that is why Linux "collects entropy" from physical events like network packet arrival times, keypresses and some other places to make the random numbers far less predictable. Some VIA CPU:s even contain a hardware random number generator that can be used. The article implies Windows also collects entropy, but not enough to avoid this attack.
So what's new? When I started in computers nearly 30 years ago, I was told the limitations of the so called random number that were available. Even before this I remember the great fanfare that greeted the latest Ernie - the computer that picks the winners of the premium bonds in the UK, but time has shown that it's no truly or pseudo random...
The big problem is that it's been chosen as the feed for SSL encryption, despite well known limitations. Perhaps they thought it was good enough... Perhaps they didn't think.
There's a good wikipedia article on random number generation, suggest people read it, in it there's this quote:
'John von Neumann famously said "Anyone who uses arithmetic methods to produce random numbers is in a state of sin."'
Why not hook up a webcam infront of a lava-lamp, the computer takes the image as it's source data and generates a number from that, using a formula if you like - providing the attacker never sees the lava lamp from the same angle and at the same distance, it would be impossible to crack....
In 30 years of experience with computers I have discovered only 2 truly random numbers ever to be generated...
Firstly: The length of time it took windows 98 to BSoD
Secondly: IF XP was going to install on a formatted drive.
The former gave a random time of anything up to (but it seems not over) 8 hours whilst the latter was the equivelent of a "coin flip".
Not sure why you guys think compression has anything to do anything random. My random number generator just happened to spit out 1111111111111... that should compress just fine. Well ok it didn't really but a "perfect", "truly" random generator is just as likely to spit that out as anything else so back to the drawing board for you both.
An application can add additional entropy when calling CryptGenRan(). MS have been always clear how the entropy is collected thus it is not surprising you can break it. If you have got a better source of entropy you can use it. The generator behind that call is good enough, the real problem is collecting the entropy.
Random numbers are impossible to generate on a computer. The only random things in nature are quantum mechincal based, like the momentum of an electron emitted from a radioactive decay.
Probably the best random number generator on the market at the moment is TRandom3 with a large period of 2**19937-1, found at:
http://root.cern.ch/root/html/TRandom3.html
However, as with all the best tools, this has been GPL'ed. If you need a RNG, try TRandom3.
. . . read "Not Easy". Thank's Francis - an excellent analysis of the situation.
I might disagree that "a perfect random stream is indeed incompressible". A short, random stream may, entirely by chance, contain a sequence of bits that is susceptible to some form of compression, allowing the stream to be compressed somewhat. However, the larger the stream, the less chance of this being possible. And, as you point out, common compression routines rely on the expected nature of the input stream - compressing "random" data has little effect, pseudo- or otherwise.
As far as using the time as a random seed (@wobbly1 and @Matware), using -ti on the Pet WAS good practice - because (AFAIR) the PET, like most early computers, didn't have a real-time clock. The time counter on old machines generally counted ticks since system power-up. If this counted in milliseconds then, after just over a minute, the least-significant 16 bits of this number would provide a reasonably good random seed.
For "Modern" PCs, you could subtract the boot time from the real time and again take the least-significant portion. Knowing what time the generator had been seeded would then be useless without knowing when the system was last booted.
No deterministic machine can produce a truly random number. All computers user pseudo-random number algorithms. Indeed, this is often relied on: seeding the RNG can be used as a development tool (reproduce a "random" input stream that causes progem problems) or as in the example of Sean O'Connor, to consistently re-produce "random" maps.
Pseudo-RNGs are quite acceptable - as long as you know their limitations and the seed is well-chosen and protected. That's why being a good programmer is not just about coding. You need a good grounding in maths (particularly statistics & probability) so you at least know where the traps lie!
@AC:
That's insecure. Probably more insecure than Windows SSL certificates. Here's how you can tell:
"The key is transferred to reliable people within the banks and they enter the keys into the same or equally protected system as the system where the PATU dongles are stored"
They might as well claim that the key is transferred to the Kingdon of the Elves and that a big giant will club anyone who tries to take it away.
Rabbi, actually the longer you go, the *more* likely you are to find a compressible sequence. Suppose your compression is nothing more than run-length encoding, so every time you receive the same character more than once, you can compress it to say "2 As". The longer your random sequence, the greater the chances of finding two or more of the same character next to each other, just as you're more likely to get five consecutive coin-flips landing on heads in a series of a thousand flips than a series of six.
This reverses the claims made by other people above. If you *don't* get some compressible chunks in your sequence, that's a sure-fire indication that the sequence is *not* random, because any random sequence *would* have a few compressible chunks! If there are no compressible chunks, that indicates there's an algorithm behind it which is making sure the same output value isn't repeated until all other possible values have been used. That's a serious weakness in the algorithm, because if you can take a guess at the starting point and the range of values possible, you can predict future values by knowing which previous values can't come up this round - the digital equivalent of card-counting, essentially.
The problem is that humans are so deeply wired for pattern-recognition that we aren't very good at handling non-patterns - in particular, we intuitively see an apparently predictable sequence as a fault in a random system. So a truly random system that throws up "ABCD" or "AAAA" looks to us to be faulty, whereas a non-random system that throws up "ZQTW" looks like it's working.
Deterministic state machines found to behave deterministically. Who would have thought that?
If you want to get random numbers, you need some non-deterministic process. Radioactive decay is commonly used, but nobody is going to want a computer with a radioactive device in it (although I bet they've all got smoke detectors). Feeding amplified static into an A-to-D converter might also work.
there are a number of well-respected test suites for randomness available, here, here and here:
(NIST) http://csrc.nist.gov/groups/ST/toolkit/rng/index.html
(Diehard) http://stat.fsu.edu/pub/diehard/
(l'Ecuyer) http://www.iro.umontreal.ca/~simardr/indexe.html (in English)
(en francais) http://www.iro.umontreal.ca/~simardr/
there is (was?) a French development in the area of collecting entropy from modern CPU internal state, here:
http://www.irisa.fr/caps/projects/hipsor/old/HAVEGE.html (in English)
There are a number of hardware entropy generators, the performance of these varies, see here:
http://www.robertnz.net/true_rng.html
Also, go to Robert's home page for more stuff
> A key common to all banks is used in the calculation of the authentication identifier. The key is generated in the Finnish Bankers’ Association by tossing a coin 64 times
So are we saying that people working in banks are tossers as well as bankers?... (anonymous cos I work for a bank)
Just for the lot here that misses the point. This article contains 3 very important points:
- researchers reconstituted the state of the generator *without* knowing how it worked. The generator must really be a load of crap, then.
- each state is used to generate 128 KB data. That's f***ing *huge*. It means the attacker, after he knows it, will get access to the keys for a lot of the future new sessions (SSH, SSL, whatever else), before they're started !
Linux random generator, if I recall, spits out only 512 bytes (from /dev/random) from a known state, then it needs more entropy.
- also, being able to access to the generator from user privileges opens a lot to attackers.
As usual, very good work from MiKrosoft on the security aspect. Could El Reg send them a copy of Knuth books as Christmas gift ?
PS: for people asking themselves why SSL doesn't implement its own gen, that's because a random gen needs entropy, and the only way to collect that on a computer is to access low level timers/events etc ... which is only accessible from kernel, not userland where SSL runs.