I decided to research shellcode obfuscation to see how hard it would be to develop a tool that can take any binary x86 shellcode and generate a completely unique version of it.
Please note that such a tool is most useful in scenarios where the shellcode has to be saved on disk (e.g. embedded inside an executable file). Obfuscated shellcode grows considerably in size, which may not be desirable, especially if it is to be embedded inside an exploit payload, where small shellcode size is often the highest priority.
The main reason for having a shellcode obfuscator is to bypass static or run-time signature detection implemented by IDS or AV products. As an example, take Metasploit. Its shellcode payloads have been public for many years, and by now most major IDS/AV solutions are able to detect them by searching their vast databases of malware signatures.
Manually modifying shellcodes or writing new ones from scratch in order to avoid detection is a never-ending job and a waste of time in the long run. My focus will be to write a tool that will take any shellcode in binary form and modify the instructions in such a way that will preserve its functionality, but make the code unique and thus harder to analyze.
I need to state here that this article won't cover the subject of preventing detection through behavioral analysis or code emulation implemented by various security products.
Some people may say that a good way to make shellcode bypass signature detection is to use the Shikata Ga Nai encoder. The encoded shellcode decrypts itself at run time using XOR. This means the memory block where the shellcode resides needs to be writable, which is often not the case. This method is also not immune to run-time signature detection during code emulation by security software.
Shellcode is normally written so that it executes correctly from any memory location. That means all jumps and memory references are relative to its current position; there are no fixed memory locations.
There are exceptions, though, like this one that comes directly from one of Metasploit's payloads:
    0000008d: 5d              POP EBP
    0000008e: 6a01            PUSH 0x1
    00000090: 8d85b2000000    LEA EAX, [EBP+0xb2]   ;<-- fixed offset that will break things when we insert code after this line!
    00000096: 50              PUSH EAX
    00000097: 68318b6f87      PUSH DWORD 0x876f8b31
    0000009c: ffd5            CALL EBP
In order to generate obfuscated code, the tool needs to:
- Be able to insert new instructions before, after and between existing ones
- Be able to fix relative jump offsets when instructions are inserted or deleted
- Be able to replace existing instructions with alternative ones, making sure that no static data is left behind that may aid in creation of detection signatures
- Be able to insert jump instructions that will change the execution flow and randomly divide the shellcode into separate blocks
The first requirement is not that hard to satisfy, but can be a little bit tricky. All call/jump instructions refer to a relative memory location, which means something like "jump X bytes forward or X bytes backward from this instruction". So if we insert any bytes between a jump instruction and its destination, the relative jump offset needs to be corrected by the number of bytes we have inserted.
The second requirement involves keeping track of all relative jumps in the shellcode and adjusting their offsets whenever we increase or decrease the amount of code between a jump and its destination.
The third requirement calls for proper fingerprinting of the disassembled instructions and recreating them as different instructions while preserving the original functionality.
The fourth requirement lets us change the execution flow of the shellcode by dividing it into many separate blocks of code and completely changing the order in which the instructions were originally placed.
You may be thinking at this moment that fixing the jump instructions shouldn't be that hard. We just need to adjust the relative memory offset and that's it. The issue is that jumps may be short or far. A short jump's memory offset is written as 1 byte, whereas a far jump's memory offset is 4 bytes. (Intel's manuals call the 4-byte relative form a near jump; this article uses "far" for it throughout.) Take a look at the example of a jnz 401020 instruction that resides at memory location 401000:
    00401000: 75 1E
        75 - JNZ opcode
        1E - jump 0x1E bytes forward (0x401020 - 0x401000 - instruction length [2 bytes])

    00401000: 0F 85 1A 00 00 00
        0F 85       - JNZ opcode
        1A 00 00 00 - jump 0x1A bytes forward (0x401020 - 0x401000 - instruction length [6 bytes])
As you can see, both instructions perform the same operation, but are written differently. Also the relative memory offset used with the instruction is affected by the instruction length itself. This is important to keep in mind.
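The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming 32-bit code; the function name `encode_jnz` and its `force_near` parameter are made up for this example and are not part of the tool.

```python
import struct

JNZ_SHORT = b"\x75"        # JNZ with a 1-byte relative offset
JNZ_LONG = b"\x0f\x85"     # JNZ with a 4-byte relative offset


def encode_jnz(address, target, force_near=False):
    """Encode a JNZ located at `address` that jumps to `target`."""
    rel8 = target - (address + 2)       # the short form is 2 bytes long
    if not force_near and -128 <= rel8 <= 127:
        return JNZ_SHORT + struct.pack("<b", rel8)
    rel32 = target - (address + 6)      # the long form is 6 bytes long
    return JNZ_LONG + struct.pack("<i", rel32)


print(encode_jnz(0x401000, 0x401020).hex(" "))                   # 75 1e
print(encode_jnz(0x401000, 0x401020, force_near=True).hex(" "))  # 0f 85 1a 00 00 00
```

In both forms the offset is counted from the end of the jump instruction, which is why the instruction length has to be subtracted.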
Now let's take a look at the following code:
    00401000: 75 7E : jnz 00401080
    ...
    00401080: 90    : nop
The JNZ instruction would be 75 7E, but what will happen if we insert 4 bytes between the 00401000 and 00401080 addresses? It would make sense to increase the relative memory offset by 4, making the instruction 75 82. See what happens to the instruction after it is fixed like this:
    00401000: 75 82 : jnz 00400F84
    ...
    -- added 4 bytes --
    ...
    00401084: nop
The instruction jumps backwards now? Yes, that's right. A short jump's relative memory offset is a single signed byte, which means its value ranges from -128 to 127. 0x82 in this example is in fact treated as -126.
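To make the sign issue concrete, here is a small sketch of how a short jump's offset byte is interpreted. The helper names are hypothetical and only serve this example.

```python
def rel8_to_signed(byte):
    """Interpret a raw offset byte as a signed 8-bit value."""
    return byte - 0x100 if byte & 0x80 else byte


def short_jump_target(address, rel8_byte):
    """Destination of a 2-byte short jump located at `address`."""
    return address + 2 + rel8_to_signed(rel8_byte)


print(rel8_to_signed(0x82))                      # -126
print(hex(short_jump_target(0x401000, 0x7E)))    # 0x401080
print(hex(short_jump_target(0x401000, 0x82)))    # 0x400f84
```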
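We need to be very careful when inserting bytes. The tool also needs to detect when an instruction has to be converted from short to far. Keep in mind that the longer encoding is itself 4 bytes bigger, so the destination moves by another 4 bytes, to 00401088, while the offset, counted from the end of the jump, grows to 0x7E + 4 = 0x82. The proper jump fix for the previous example should look as follows:
    00401000: 0F 85 82 00 00 00 : jnz 00401088
    ...
    -- added 4 bytes --
    ...
    00401088: nop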
Every time bytes are inserted, the tool needs to look for affected jump instructions and detect if fixing the relative address also involves replacing the affected jump instruction with the longer alternative.
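One way to approach this bookkeeping is to keep instructions as objects that point at their destination instruction, then recompute the layout until no short jump is out of range. The sketch below is a simplified model and not the tool's actual code: the `Insn` class and the `layout`, `rel_offset` and `relax` helpers are hypothetical, and it assumes only conditional jumps (2 bytes short, 6 bytes long).

```python
from dataclasses import dataclass
from typing import Optional

SHORT_SIZE, LONG_SIZE = 2, 6   # Jcc rel8 vs Jcc rel32 (JMP would be 2 and 5)


@dataclass(eq=False)           # eq=False keeps instances hashable by identity
class Insn:
    size: int                          # encoded length in bytes
    target: Optional["Insn"] = None    # destination instruction, for jumps
    near: bool = False                 # True once promoted to the 4-byte offset


def layout(insns):
    """Map every instruction to its start offset."""
    offsets, pos = {}, 0
    for insn in insns:
        offsets[insn] = pos
        pos += insn.size
    return offsets


def rel_offset(insn, offsets):
    """Relative offset, counted from the end of the jump instruction."""
    return offsets[insn.target] - (offsets[insn] + insn.size)


def relax(insns):
    """Promote out-of-range short jumps until the layout is stable."""
    while True:
        offsets = layout(insns)
        too_far = [i for i in insns
                   if i.target is not None and not i.near
                   and not -128 <= rel_offset(i, offsets) <= 127]
        if not too_far:
            return offsets
        for insn in too_far:
            insn.near, insn.size = True, LONG_SIZE


nop = Insn(size=1)
jnz = Insn(size=SHORT_SIZE, target=nop)
code = [jnz, Insn(size=0x7E), nop]     # 0x7E bytes of filler before the nop
code.insert(1, Insn(size=4))           # insert 4 bytes after the jump
offsets = relax(code)
print(jnz.near, hex(rel_offset(jnz, offsets)), hex(offsets[nop]))
```

The loop is needed because promoting one jump makes the code longer, which can push another short jump out of range.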
Unfortunately there is also another complication that needs handling.
Several instructions that operate on short relative memory offsets don't have far alternatives, so we cannot simply swap one instruction for another. Instead, to handle them properly, we need to replace one instruction with several others while retaining the original instruction's functionality.
Problematic instructions are LOOP, LOOPZ, LOOPNZ and JECXZ.
Fortunately, I found a very helpful excerpt from the book The Art of Assembly Language Programming by Randall Hyde that covers the replacement of the aforementioned instructions:
    ;JECXZ Target
    test ecx, ecx    ;Sets the zero flag if ecx=0
    jz Target

    ;LOOP Target
    dec ecx
    jnz Target

    ;LOOPZ Target (use jnz quit) / LOOPNZ Target (use jz quit)
    jnz/jz quit
    dec ecx
    jz quit2
    jmp Target
    quit:
    dec ecx
    quit2:
In the obfuscation tool, I decided to replace the problematic instructions with their longer alternatives at the beginning in order to avoid issues later.
Most shellcodes embed some form of binary data that is not code. This may be a command to execute, an IP address for a reverse shell to connect back to, or anything else that shouldn't be treated as code. The disassembler cannot distinguish between code and data: any binary data can and will be interpreted as code.
Before the tool performs any obfuscation, the user needs a way to specify where the real code instructions are, so that the tool treats the rest as data to be skipped during the obfuscation process.
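As a rough illustration of that idea, the buffer can be split into code and data chunks from a list of user-supplied offsets. The function name `split_regions` and the `(start, end)` range format (end exclusive) are assumptions made for this sketch, not the tool's actual interface.

```python
def split_regions(shellcode, code_ranges):
    """Split `shellcode` into ("code" | "data", bytes) chunks.

    `code_ranges` is a list of (start, end) offsets, end exclusive,
    marking the parts the user says are real instructions.
    """
    regions, pos = [], 0
    for start, end in sorted(code_ranges):
        if start < pos or end > len(shellcode) or start >= end:
            raise ValueError(f"bad code range: {start:#x}-{end:#x}")
        if start > pos:
            regions.append(("data", shellcode[pos:start]))
        regions.append(("code", shellcode[start:end]))
        pos = end
    if pos < len(shellcode):
        regions.append(("data", shellcode[pos:]))
    return regions


blob = bytes.fromhex("fcb105") + b"hello\x00"   # 3 bytes of code, then a string
for kind, chunk in split_regions(blob, [(0, 3)]):
    print(kind, chunk.hex(" "))
```

Only the chunks tagged as code would be handed to the disassembler; the data chunks are copied through unchanged.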
Now, this is where the fun part starts.
The tool is able to properly parse the disassembled code. It is able to insert and delete instructions while fixing any relative jumps that may become affected in the process. It is now time to make the tool do the real work and start replacing instructions with random and unique equivalents.
The more support we add for different instructions, the more unique the output shellcode will become.
I will start slowly with one instruction and add support for more in future parts of this article series.
MOV REG, IMM
(moves immediate static value to register)
    B8 EF BE AD DE : MOV EAX, 0xDEADBEEF
    66 BA FE CA    : MOV DX, 0xCAFE
    B1 77          : MOV CL, 0x77
This is one of the simplest instructions, but if we don't replace these, they may become a great source of information for writing effective signatures to detect our shellcode.
I won't go into much detail here about how x86 instructions are encoded. If you want to learn more (and you should), you can find useful information by following these links:
The MOV R16/32, IMM16/32 instruction starts with the 0xB8 opcode. The register used with the instruction is encoded in the lowest 3 bits of this opcode.
The register values are as follows:
    EAX/AX/AL : 000b / 00h
    ECX/CX/CL : 001b / 01h
    EDX/DX/DL : 010b / 02h
    EBX/BX/BL : 011b / 03h
    ESP/SP/AH : 100b / 04h
    EBP/BP/CH : 101b / 05h
    ESI/SI/DH : 110b / 06h
    EDI/DI/BH : 111b / 07h
To form the opcode for the register we want, we add the register value to the opcode starting value (0xB8):
    B8 44 33 22 11 : MOV EAX, 11223344h
    BA 44 33 22 11 : MOV EDX, 11223344h
If we want to use 16-bit registers and values instead of 32-bit ones, the trick is to prefix the opcode with 0x66:
    66 B8 22 11 : MOV AX, 1122h
    66 BA 22 11 : MOV DX, 1122h
MOV R8, IMM8 is very similar to the previous instruction. The only difference is that the opcode starting value is 0xB0:
    B0 11 : MOV AL, 11h
    B2 11 : MOV DL, 11h
To summarize, in order to properly detect MOV REG, IMM instructions, we need to look for opcodes whose first byte is in the range 0xB0 to 0xBF, keeping in mind that this byte may be prefixed with 0x66, which puts the instruction in 16-bit mode.
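The detection rules above can be sketched as a small decoder. This is a simplified illustration, assuming 32-bit code and that `code` already starts at an instruction boundary found by a real disassembler; the function name `decode_mov_reg_imm` is hypothetical.

```python
import struct

REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def decode_mov_reg_imm(code):
    """Return (register, immediate, length), or None if not MOV REG, IMM."""
    prefixed = len(code) > 0 and code[0] == 0x66
    pos = 1 if prefixed else 0
    if pos >= len(code):
        return None
    opcode = code[pos]
    reg = opcode & 0x07                 # register lives in the low 3 bits
    if 0xB0 <= opcode <= 0xB7:
        names, fmt = REG8, "<B"         # 8-bit form, prefix changes nothing
    elif 0xB8 <= opcode <= 0xBF:
        names, fmt = (REG16, "<H") if prefixed else (REG32, "<I")
    else:
        return None
    size = struct.calcsize(fmt)
    imm_bytes = code[pos + 1:pos + 1 + size]
    if len(imm_bytes) < size:
        return None                     # truncated instruction
    (imm,) = struct.unpack(fmt, imm_bytes)
    return names[reg], imm, pos + 1 + size


for sample in ("b8efbeadde", "66bafeca", "b177"):
    reg, imm, length = decode_mov_reg_imm(bytes.fromhex(sample))
    print(f"MOV {reg}, {imm:#x} ({length} bytes)")
```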
We will convert each instruction into several ADD, SUB or XOR instructions that will perform computation of the original immediate value.
As an example, this is what the obfuscation will look like. The original instruction:
    BA 44 33 22 11    : MOV EDX, 0x11223344
becomes:
    BA 38 2C 30 A2    : MOV EDX, 0xA2302C38
    81 C2 BD 85 4F D8 : ADD EDX, 0xD84F85BD
    81 EA E0 5C 59 BF : SUB EDX, 0xBF595CE0
    81 F2 1C 9A 82 23 : XOR EDX, 0x23829A1C
    81 F2 4D FC 86 89 : XOR EDX, 0x8986FC4D
    EDX = (((0xA2302C38 + 0xD84F85BD - 0xBF595CE0) mod 2^32) ^ 0x23829A1C) ^ 0x8986FC4D = 0x11223344
As you can see, the instruction is now completely different, but after the last instruction is executed, EDX will still hold the original value of 0x11223344. We can generate as many computation instructions as we want.
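The arithmetic behind this can be checked, and random chains generated, with a short sketch. It works purely on numbers and does not emit any machine code; the names `apply_chain` and `split_immediate` are hypothetical, and the way the tool actually picks its constants may differ.

```python
import random

OPS = {
    "ADD": lambda a, b, mask: (a + b) & mask,
    "SUB": lambda a, b, mask: (a - b) & mask,
    "XOR": lambda a, b, mask: a ^ b,
}
INVERSE = {"ADD": "SUB", "SUB": "ADD", "XOR": "XOR"}


def apply_chain(start, steps, bits=32):
    """Run the (operation, operand) steps on `start`, wrapping at `bits`."""
    mask = (1 << bits) - 1
    value = start & mask
    for op, operand in steps:
        value = OPS[op](value, operand, mask)
    return value


def split_immediate(value, count, bits=32, rng=random):
    """Pick `count` random steps and the start value that ends at `value`."""
    mask = (1 << bits) - 1
    steps = [(rng.choice(list(OPS)), rng.getrandbits(bits)) for _ in range(count)]
    start = value & mask
    for op, operand in reversed(steps):     # undo the steps, last one first
        start = OPS[INVERSE[op]](start, operand, mask)
    return start, steps


# The chain from the example above.
article_steps = [("ADD", 0xD84F85BD), ("SUB", 0xBF595CE0),
                 ("XOR", 0x23829A1C), ("XOR", 0x8986FC4D)]
print(hex(apply_chain(0xA2302C38, article_steps)))   # 0x11223344

# A fresh random chain for the same value.
start, steps = split_immediate(0x11223344, 4)
print(hex(start), [(op, hex(x)) for op, x in steps])
```

Working backwards from the target value means any random operands can be used: the start value is simply whatever makes the chain come out right.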
I decided to write this tool in Python, considering it would be more approachable for a wider audience. Although I performed a bit of simple disassembly myself to fingerprint several instruction types, I needed a full-blown disassembler library that would detect the length of each disassembled instruction.
The current version of the tool obfuscates only MOV REG, IMM instructions, but support for more instructions is coming soon.
Here is a dud example shellcode that I've prepared for testing purposes to demonstrate the obfuscation:
    00000000: fc              CLD
    00000001: b105            MOV CL, 0x5
    00000003: 80fc02          CMP AH, 0x2
    00000006: 7504            JNZ 0xc
    00000008: 66b88888        MOV AX, 0x8888
    0000000c: ba44332211      MOV EDX, 0x11223344
    00000011: e2f9            LOOP 0xc
After obfuscation:
    00000000: fc              CLD
    00000001: b1a2            MOV CL, 0xa2
    00000003: 82c1c4          ADD CL, 0xc4
    00000006: 82e961          SUB CL, 0x61
    00000009: 80fc02          CMP AH, 0x2
    0000000c: 7518            JNZ 0x26
    0000000e: 66b80596        MOV AX, 0x9605
    00000012: 6681c0ce9a      ADD AX, 0x9ace
    00000017: 6681c0a039      ADD AX, 0x39a0
    0000001c: 6681f04371      XOR AX, 0x7143
    00000021: 6681c0886d      ADD AX, 0x6d88
    00000026: ba44351577      MOV EDX, 0x77153544
    0000002b: 81f2fbd13236    XOR EDX, 0x3632d1fb
    00000031: 81f299f2d3ed    XOR EDX, 0xedd3f299
    00000037: 81c22c9ffb09    ADD EDX, 0x9fb9f2c
    0000003d: 81eae2468266    SUB EDX, 0x668246e2
    00000043: 81f2345d4f41    XOR EDX, 0x414f5d34
    00000049: 49              DEC ECX
    0000004a: 75da            JNZ 0x26
The source code is fully available on GitHub.
To be continued...
This is the end of Part 1 of this blog series. Next part will cover obfuscation of other instructions.
If you have any questions or suggestions, please post them in the comments below.
Part 2 is out!