Wednesday, October 31, 2007
Tuesday, October 30, 2007
Funny Vista Tricks with ASLR
Posted by
David Maynor
at
11:32 PM
While doing alot of testing around the implementation of ASLR on both OSX and Vista I noticed something odd. 3rd party dlls in the Internet Explorer don’t seem to change addresses. See the screenshot below. Googletoolbar and flash9d stay at the same address through multiple reboots. I thought this was odd.

Flash9d and Googletoolbar stay at the same address.
Doing some reading it turns out that a linker flag, /dynamicbase, is what tells Vista that it is ok to rebase a DLL. This gave me a bright idea that maybe I could manually enable ASLR support in a DLL. The first step to this is to find out exactly what the /dynamicbase flag does to a binary. I did a couple of things to run this down, but mainly I compared a DLL that utilized ASLR versus one that doesn't. Mshtml.dll takes advantage of ASLR so that the target. Googletoolbar is the binary I want to force to use ASLR. After comparing alot of fields in the PE header i narrowed it to an option in the PE header, DLL Characteristics. Setting this field to 0x40 enables rebasing the DLL.
I was worried that this would not work because of application signing. I thought that once an application is modified it would no longer run. No problems like that occurred. The toolbar seemed to work just fine.
I am not advocating doing this to your system DLLs, I just thought it was interesting.

Flash9d and Googletoolbar stay at the same address.
Doing some reading it turns out that a linker flag, /dynamicbase, is what tells Vista that it is ok to rebase a DLL. This gave me a bright idea that maybe I could manually enable ASLR support in a DLL. The first step to this is to find out exactly what the /dynamicbase flag does to a binary. I did a couple of things to run this down, but mainly I compared a DLL that utilized ASLR versus one that doesn't. Mshtml.dll takes advantage of ASLR so that the target. Googletoolbar is the binary I want to force to use ASLR. After comparing alot of fields in the PE header i narrowed it to an option in the PE header, DLL Characteristics. Setting this field to 0x40 enables rebasing the DLL.
The real difference is the DLL Characteristic field which has to get set to 0x40.
1. The first step in enabling ASLR for this program is to save a copy of the original file so that it can be restored in case of an accident.
2. The next step is to open the Googletoolbar in a hex editor and find the DLL Characteristics field and set it to 0x40.
2. The next step is to open the Googletoolbar in a hex editor and find the DLL Characteristics field and set it to 0x40.
These are images of modifying the googletoolbar dll. The byte that is changed is on the bottom role on the right.
Every reboot changes the load address of GoogleToolbar now.
I was worried that this would not work because of application signing. I thought that once an application is modified it would no longer run. No problems like that occurred. The toolbar seemed to work just fine.
I am not advocating doing this to your system DLLs, I just thought it was interesting.
Saturday, October 27, 2007
Errata goes to the races...
Posted by
David Maynor
at
8:12 PM
Today I spent time in the pits of the NASCAR truck series. It was a fun day, there was a minor accident, but the most surprising was the wireless access.

There were open wifi access points all over the pits. From Direct TV to access points used by reporters, it was ripe for credential theft not to mention people still using unencrypted pop3. Below are some screen shots from my iPhone running stumbler. These were collected just walking up and down the track. Sometimes people need to remember that although people who do security for a living know about these types of problems, the general public doesn't.

We should have a hamster and ferret package for the iPhone available soon.
Friday, October 26, 2007
Wednesday, October 24, 2007
Evil Rob Graham Fact #3
Posted by
David Maynor
at
4:45 PM
Once again I am not pointing fingers, but come on, they were separated at birth. Plus the other day at lunch Rob mentioned wanting to crush the rebellion.
Tuesday, October 23, 2007
Evil Rob Graham Fact #2
Posted by
David Maynor
at
3:37 PM
You remember the meteor that killed all the dinosaurs 65 million years ago? I don't want to point fingers but Rob IS missing a meteor, Vote for Graham getting stunned.
Monday, October 22, 2007
Password Cracking vs. Core2/Athlon64
Posted by
Robert Graham
at
4:30 PM
Today's CPU's can "crack" passwords 8 times faster than they can "check" the passwords. Normal software checks a password using roughly 500 CPU cycles, but a brute force password cracking program can check a password in roughly 65 CPU cycles. Brute force programs do this because they check multiple passwords at once using the parallel features of the CPU.
The United States is currently working on new "Advanced Hash Standard" to replace the current SHA-1 (to be used in things such as passwords). Normally, crypto algorithms are designed ignoring brute-force cracking, because it only makes a small difference overall (it only changes a constant on the complexity of the algorithm). However, since we know that software cracking is the primary way of breaking these things, it might be worthwhile tweaking the algorithm to make cracking more difficult.
In other words, if brute-force software can crack passwords eight times faster, we know that that a 160-bit hash algorithm has only 157-bits of effectiveness. If we tweak the algorithm to defeat the brute-force acceleration, then the algorithm would have a full 160-bits worth of effectiveness.
Today's CPUs are "wide", which means they can execute multiple instructions simultaneously. However, this only works if the instructions are INDEPENDENT. If the two instructions operate on the same registers (DEPENDENT), then the second instruction must wait for the first to complete before it can execute.The following is an example:
Dependent (runs in two clock cycles)
add r1 -> r2
add r2 -> r3
Independent (runs in one clock cycle)
add r1 -> r2
add r7 -> r8
The problem with cryptographic code is that it normally contains a long chain of instructions dependent upon each other. Thus, checking a single key takes 500 instructions, and therefore roughly 500 clock cycles. To get around this, brute force password crackers check multiple passwords at the same time, then interleave the instructions. They check eight passwords in the same 500 clock cycles.
The latest processors from AMD and Intel contain 3 execution units. Thus, in theory, they could check 3 passwords at the same time it would normally take to check a single password.
An even more powerful feature is the "SIMD" units. The acronym SIMD means "single-instruction multiple-data". This widens the registers so that they can hold 4 separate values simultaneously. A single instruction that adds two SIMD registers together would in fact do four simultaneous additions.
The modern processors from Intel and AMD contain 2 SIMD execution units. This means they can, in theory, check 8 passwords at the same time (two sets of registers, each holding four values).
The following is a snippet of code from John the Ripper that cracks Windows passwords. Each SIMD instruction is repeated twice for two different sets of passwords, and each register holds four different values for four different passwords. Thus, eight passwords are checked simultaneously.
paddd (512*base)+(x*32)+nt_buffer8x, aa
paddd (512*base)+(x*32)+16+nt_buffer8x, aa3
movdqa cc, t1
movdqa cc3, t13
pxor dd, t1
pxor dd3, t13
pand bb, t1
pand bb3, t13
pxor dd, t1
pxor dd3, t13
paddd t1, aa
paddd t13, aa3
movdqa aa, t2
movdqa aa3, t23
pslld $s, aa
pslld $s, aa3
psrld $(32-s), t2
psrld $(32-s), t23
por t2, aa
por t23, aa3
Right now, the above John the Ripper code doesn't give the eight-fold increase you would expect. It would take a bit more machine-language zen than simply duplicating the instructions in order to get closer to the theoretical performance. The second reason is that John the Ripper isn't fully optimized for this level of speed: while inner loop can check a password in 65 cycles instead of 500 cycles, it still takes 100 cycles for the program to choose a new password. In other words, if we could make checking passwords infinitely fast, it still would only double the speed of the code.
To get more speed, ee'd have to bring new password selection deeper into the code. For example, we could pass in a single password, and the code could add the constants {0,1,2,3, 4, 5,6,7} onto the end, thus choosing the eight sequential passwords in only a single clock cycle.
In order to defeat parallel cracking of hashes, the next hash standard could include parallel elements itself. They would have to break the long dependency chains in the algorithm into 8 independent chains. This means they could also increase the number of logical operations within the algorithm, making it 8 times stronger. Thus, a single hash could be calculated in the same 500 cycles as today's hashes, but with 8 times the number of logic operations. Brute-force crackers would therefore get no benefit from checking multiple keys at the same time.
An alternate design technique would be to include features that make SIMD processing difficult. Raw cryptographic operations consist of "permutations" and "substitutions". A permutation is an operation like "shift", "xor", "add", etc. A substitution is a table lookup of a new, mathematically unrelated value given a current value. For example, give a byte with a value of 0x61, looking the up in a table might replace that with 0xF7. There are SIMD instructions for all the permutation operators, but there aren't any equivalents for substitution operations.
The hashes we use now for passwords, such as MD4 and MD5, do not use substitutions, which is why they can be accelerated with SIMD. Some passwords still use DES, which contains a lot of substitution operations. Luckily (for crackers), a DES substitution can be replaced with multiple permutations, so it still gets accelerated by SIMD, but not as much acceleration as with MD5.
All of this may become a moot point. You can now get cheap FPGAs (for example, from Pico Computing) and open source code that cracks passwords even faster. You can also exploit graphics cards (and their GPUs) for faster cracking. Finally, botnets can herd millions of machines into a password cracking effort. With all this additional power, making software cracking slightly harder may not be worth it.
CONCLUSION: Today's crypto algorithms ignore the practicalities of software brute-force cracking. As a result, they are a few bits less effective than theory would admit. The reason is that carefully designed cracking code can take advantage of the parallelism in modern processors (multiple execution units, SIMD processors). A few algorithmic changes can defeat this, either by incorporating parallelism into the algorithm design, or by including features that are difficult to make parallel. This may be unnecessary: while the majority of brute-force cracking is done in software today, hardware approaches using FPGAs or graphics-cards may be the way of the future.
FAQ: BTW, by "multiple execution units", I mean in a single CPU core. I'm ignoring multiple cores, which do make cracking faster as well, but at the normal Moore's Law progression.
The United States is currently working on new "Advanced Hash Standard" to replace the current SHA-1 (to be used in things such as passwords). Normally, crypto algorithms are designed ignoring brute-force cracking, because it only makes a small difference overall (it only changes a constant on the complexity of the algorithm). However, since we know that software cracking is the primary way of breaking these things, it might be worthwhile tweaking the algorithm to make cracking more difficult.
In other words, if brute-force software can crack passwords eight times faster, we know that that a 160-bit hash algorithm has only 157-bits of effectiveness. If we tweak the algorithm to defeat the brute-force acceleration, then the algorithm would have a full 160-bits worth of effectiveness.
Today's CPUs are "wide", which means they can execute multiple instructions simultaneously. However, this only works if the instructions are INDEPENDENT. If the two instructions operate on the same registers (DEPENDENT), then the second instruction must wait for the first to complete before it can execute.The following is an example:
Dependent (runs in two clock cycles)
add r1 -> r2
add r2 -> r3
Independent (runs in one clock cycle)
add r1 -> r2
add r7 -> r8
The problem with cryptographic code is that it normally contains a long chain of instructions dependent upon each other. Thus, checking a single key takes 500 instructions, and therefore roughly 500 clock cycles. To get around this, brute force password crackers check multiple passwords at the same time, then interleave the instructions. They check eight passwords in the same 500 clock cycles.
The latest processors from AMD and Intel contain 3 execution units. Thus, in theory, they could check 3 passwords at the same time it would normally take to check a single password.
An even more powerful feature is the "SIMD" units. The acronym SIMD means "single-instruction multiple-data". This widens the registers so that they can hold 4 separate values simultaneously. A single instruction that adds two SIMD registers together would in fact do four simultaneous additions.
The modern processors from Intel and AMD contain 2 SIMD execution units. This means they can, in theory, check 8 passwords at the same time (two sets of registers, each holding four values).
The following is a snippet of code from John the Ripper that cracks Windows passwords. Each SIMD instruction is repeated twice for two different sets of passwords, and each register holds four different values for four different passwords. Thus, eight passwords are checked simultaneously.
paddd (512*base)+(x*32)+nt_buffer8x, aa
paddd (512*base)+(x*32)+16+nt_buffer8x, aa3
movdqa cc, t1
movdqa cc3, t13
pxor dd, t1
pxor dd3, t13
pand bb, t1
pand bb3, t13
pxor dd, t1
pxor dd3, t13
paddd t1, aa
paddd t13, aa3
movdqa aa, t2
movdqa aa3, t23
pslld $s, aa
pslld $s, aa3
psrld $(32-s), t2
psrld $(32-s), t23
por t2, aa
por t23, aa3
Right now, the above John the Ripper code doesn't give the eight-fold increase you would expect. It would take a bit more machine-language zen than simply duplicating the instructions in order to get closer to the theoretical performance. The second reason is that John the Ripper isn't fully optimized for this level of speed: while inner loop can check a password in 65 cycles instead of 500 cycles, it still takes 100 cycles for the program to choose a new password. In other words, if we could make checking passwords infinitely fast, it still would only double the speed of the code.
To get more speed, ee'd have to bring new password selection deeper into the code. For example, we could pass in a single password, and the code could add the constants {0,1,2,3, 4, 5,6,7} onto the end, thus choosing the eight sequential passwords in only a single clock cycle.
In order to defeat parallel cracking of hashes, the next hash standard could include parallel elements itself. They would have to break the long dependency chains in the algorithm into 8 independent chains. This means they could also increase the number of logical operations within the algorithm, making it 8 times stronger. Thus, a single hash could be calculated in the same 500 cycles as today's hashes, but with 8 times the number of logic operations. Brute-force crackers would therefore get no benefit from checking multiple keys at the same time.
An alternate design technique would be to include features that make SIMD processing difficult. Raw cryptographic operations consist of "permutations" and "substitutions". A permutation is an operation like "shift", "xor", "add", etc. A substitution is a table lookup of a new, mathematically unrelated value given a current value. For example, give a byte with a value of 0x61, looking the up in a table might replace that with 0xF7. There are SIMD instructions for all the permutation operators, but there aren't any equivalents for substitution operations.
The hashes we use now for passwords, such as MD4 and MD5, do not use substitutions, which is why they can be accelerated with SIMD. Some passwords still use DES, which contains a lot of substitution operations. Luckily (for crackers), a DES substitution can be replaced with multiple permutations, so it still gets accelerated by SIMD, but not as much acceleration as with MD5.
All of this may become a moot point. You can now get cheap FPGAs (for example, from Pico Computing) and open source code that cracks passwords even faster. You can also exploit graphics cards (and their GPUs) for faster cracking. Finally, botnets can herd millions of machines into a password cracking effort. With all this additional power, making software cracking slightly harder may not be worth it.
CONCLUSION: Today's crypto algorithms ignore the practicalities of software brute-force cracking. As a result, they are a few bits less effective than theory would admit. The reason is that carefully designed cracking code can take advantage of the parallelism in modern processors (multiple execution units, SIMD processors). A few algorithmic changes can defeat this, either by incorporating parallelism into the algorithm design, or by including features that are difficult to make parallel. This may be unnecessary: while the majority of brute-force cracking is done in software today, hardware approaches using FPGAs or graphics-cards may be the way of the future.
FAQ: BTW, by "multiple execution units", I mean in a single CPU core. I'm ignoring multiple cores, which do make cracking faster as well, but at the normal Moore's Law progression.
Evil Rob Fact #1
Posted by
David Maynor
at
12:26 AM
You know all those ships that went missing in the Bermuda triangle? That was Rob Graham. Vote for Rob to be tazed.
Sunday, October 21, 2007
Vote which of us gets tazed
Posted by
David Maynor
at
6:06 PM
Tazers and stun guns have been in the news lately, from intelligence agencies electrocuting suspects to student demonstrators getting tazed by campus police. In order to celebrate the 1-year founding of Errata Security, we have decided therefore that it's time that ONE of the founders gets tazed by a 100,000 volt stun gun.
We have set up a poll (part of a google blog feature) on the right-hand-side of this page. Vote for which founder you would like to see tazed. We will post the results Friday afternoon, with a video of the "winner" getting his just reward.
Dave would claim that Rob hates puppies and often has fields of daffodils paved over. However, Rob would like point out something evil about Dave Maynor, but can't figure out anything more evil than DAVE MAYNOR.
We have set up a poll (part of a google blog feature) on the right-hand-side of this page. Vote for which founder you would like to see tazed. We will post the results Friday afternoon, with a video of the "winner" getting his just reward.
Dave would claim that Rob hates puppies and often has fields of daffodils paved over. However, Rob would like point out something evil about Dave Maynor, but can't figure out anything more evil than DAVE MAYNOR.
Wednesday, October 10, 2007
If you want to hack something today...
Posted by
Robert Graham
at
4:28 PM
In this post, I point out that it takes only moments to find a vulnerable system to hack with Google and SQL injection. This blog post by "pdp" shows another way, this time using Citrix instead of SQL. If you search for filetype:ica you'll find hundreds of systems that you can hack.
Citrix is a remote GUI, like VNC or X Windows or Microsoft Remote Desktop. It is a popular way for people to "host" applications. Usually its a way to provide remote access to a Windows application that was originally written to be local.
In most cases, you'll connect to an application with no specific user credentials. The security rests with the application Citrix is connecting you to. Most of these have trivial or no security, and will allow you to gain control of the entire system with just a few clicks. The blog by "pdp" shows a video of Citrix connecting to the "calc.exe" program (the Windows Calculator accessory) and then gaining a command-prompt.
However, you can usually edit the ".ica" file that the server gives you and enter a different application to run, such as "explorer.exe". You can also edit the user credentials. Google for filetype:ica ClearPassword for some extra special fun. If you read the content of the ".ica" files, you'll quickly find other tricks you can do in order to hack systems.
Last month, major news outlets reported that the Chinese had hacked the Pentagon. My mother asked me how this could be. The answer is: a teenager can find and hack a .mil or .gov system in minutes using Citrix, SQL injection, or a dozen other well-known techniques.
Citrix is a remote GUI, like VNC or X Windows or Microsoft Remote Desktop. It is a popular way for people to "host" applications. Usually its a way to provide remote access to a Windows application that was originally written to be local.
In most cases, you'll connect to an application with no specific user credentials. The security rests with the application Citrix is connecting you to. Most of these have trivial or no security, and will allow you to gain control of the entire system with just a few clicks. The blog by "pdp" shows a video of Citrix connecting to the "calc.exe" program (the Windows Calculator accessory) and then gaining a command-prompt.
However, you can usually edit the ".ica" file that the server gives you and enter a different application to run, such as "explorer.exe". You can also edit the user credentials. Google for filetype:ica ClearPassword for some extra special fun. If you read the content of the ".ica" files, you'll quickly find other tricks you can do in order to hack systems.
Last month, major news outlets reported that the Chinese had hacked the Pentagon. My mother asked me how this could be. The answer is: a teenager can find and hack a .mil or .gov system in minutes using Citrix, SQL injection, or a dozen other well-known techniques.
Thursday, October 04, 2007
The Cost of Security
Posted by
Robert Graham
at
8:16 PM
Airplanes carry flight-recorders made of an indestructible material that will survive a crash. A common joke is to ask "Why can't the entire airplane be made from the same material?" The answer is, of course, that this would make the airplane too heavy to lift off the ground. Planes need to be dangerously flimsy to fly.
Cybersecurity has the same issues. People point out "obvious" solutions to cybersecurity while ignoring costs they would entail.
A good example is this story at The Register where former cyberczar Richard Clarke outlines his 5-point plan for securing the Internet. All his points make as much sense as building airplanes from cast iron. An example is his quote:
"We should look, as an industry, at improving the quality of secure code, so that we don't need to issue software patches, so there aren't trap doors - intentional or otherwise. This is not a revolutionary idea. We put this in place a long time ago for electrical appliances."
Is this as cost-free and uncontroversial as Clarke pretends? No, of course not, or else such laws would already have been passed.
"Safety" is not "security". Safety protects against ACCIDENTAL problems, security protects against INTENTIONAL problems. Regulations are designed to protect your gas stove from accidentally exploding, but they can't protect you if somebody intentionally rigs your stove to explode. In much the same way, automobile safety protects against accidental problems, but does nothing to stop your tires being slashed or your break lines from being punctured intentionally. Product safety protects against the INANIMATE effects of bad design, bad construction, and wear-and-tear. Security protects against the ANIMATE and unpredictable adversary who is likely more clever than the engineers who designed the product.
In other words, the analogy between "product safety" and "software security" is faulty.
Not only would regulations work less well, they would cost more. Government regulations have hidden costs. That's why economists and politicians spend so much time trying to get rid of them. While you can't see the costs directly, you can see their effects. For example, what appliances do you own that were NOT created by a huge multinational corporation? This is because small companies cannot afford the heavy costs of regulation. Countries that regulate more innovate less. Regulation favors large companies over small companies.
Think of this another way. Right now, you can start your own software company and (hopefully) make a million dollars. That's because you don't have to worry about government regulations. If Clarke were to get his way, it would take several full time employees and high-priced lawyers just to deal with the regulations, leaving nobody left to create your software. Software innovation would come to a stop in the name of cybersecurity. Only megacorporations would then be able to ship software (which is why companies like Microsoft support such regulation). A good example of this is the "Common Criteria" certification for government software, where it takes at least a million dollars to get certified (and which still fails to produce secure software).
Richard Clarke's remaining proposals are even worse. Clarke's solution to cybersecurity is to convert our free society into a totalitarian police state. He's right, of course, doing so WILL improve cybersecurity, but at a huge cost to civil liberties. Even today, we live in a slight police state: you are more likely to be arrested falsely by the police than you are to be attacked by terrorists. There are many of us who believe that the slight improvement in security is not worth the huge cost in liberty.
Anybody can pass themselves off as a security expert by proposing extreme solutions and silencing their opponents with the accusations that they aren't taking security "seriously" enough. That's not what an expert is. Instead, an expert is somebody who can find solutions WITHIN a reasonable set of costs/sacrifices.
Cybersecurity has the same issues. People point out "obvious" solutions to cybersecurity while ignoring costs they would entail.
A good example is this story at The Register where former cyberczar Richard Clarke outlines his 5-point plan for securing the Internet. All his points make as much sense as building airplanes from cast iron. An example is his quote:
"We should look, as an industry, at improving the quality of secure code, so that we don't need to issue software patches, so there aren't trap doors - intentional or otherwise. This is not a revolutionary idea. We put this in place a long time ago for electrical appliances."
Is this as cost-free and uncontroversial as Clarke pretends? No, of course not, or else such laws would already have been passed.
"Safety" is not "security". Safety protects against ACCIDENTAL problems, security protects against INTENTIONAL problems. Regulations are designed to protect your gas stove from accidentally exploding, but they can't protect you if somebody intentionally rigs your stove to explode. In much the same way, automobile safety protects against accidental problems, but does nothing to stop your tires being slashed or your break lines from being punctured intentionally. Product safety protects against the INANIMATE effects of bad design, bad construction, and wear-and-tear. Security protects against the ANIMATE and unpredictable adversary who is likely more clever than the engineers who designed the product.
In other words, the analogy between "product safety" and "software security" is faulty.
Not only would regulations work less well, they would cost more. Government regulations have hidden costs. That's why economists and politicians spend so much time trying to get rid of them. While you can't see the costs directly, you can see their effects. For example, what appliances do you own that were NOT created by a huge multinational corporation? This is because small companies cannot afford the heavy costs of regulation. Countries that regulate more innovate less. Regulation favors large companies over small companies.
Think of this another way. Right now, you can start your own software company and (hopefully) make a million dollars. That's because you don't have to worry about government regulations. If Clarke were to get his way, it would take several full time employees and high-priced lawyers just to deal with the regulations, leaving nobody left to create your software. Software innovation would come to a stop in the name of cybersecurity. Only megacorporations would then be able to ship software (which is why companies like Microsoft support such regulation). A good example of this is the "Common Criteria" certification for government software, where it takes at least a million dollars to get certified (and which still fails to produce secure software).
Richard Clarke's remaining proposals are even worse. Clarke's solution to cybersecurity is to convert our free society into a totalitarian police state. He's right, of course, doing so WILL improve cybersecurity, but at a huge cost to civil liberties. Even today, we live in a slight police state: you are more likely to be arrested falsely by the police than you are to be attacked by terrorists. There are many of us who believe that the slight improvement in security is not worth the huge cost in liberty.
Anybody can pass themselves off as a security expert by proposing extreme solutions and silencing their opponents with the accusations that they aren't taking security "seriously" enough. That's not what an expert is. Instead, an expert is somebody who can find solutions WITHIN a reasonable set of costs/sacrifices.
Subscribe to:
Posts (Atom)










