X86 Shellcode Obfuscation - Part 3

Hello and welcome back to the shellcode obfuscation series! If you've missed the previous episodes, take your time and catch up here:

X86 Shellcode Obfuscation - Part 1

X86 Shellcode Obfuscation - Part 2

Last time, I added obfuscation support for the most common x86 instructions, which made it possible to run the obfuscator over its own output several times for even better results. The output is now pretty well obfuscated, but it is still easy to navigate because the execution flow is unchanged. I will fix that in this episode by explaining how to implement full-blown execution flow obfuscation, injecting dozens of jumps to make the output unrecognizable.

Execution Flow Deception

Any assembly program consists of ordinary instructions interspersed with calls and jumps. Calls and jumps are commonly called branch instructions, as they allow the code to branch out in multiple directions.

Let's take a look at the following code:

00390000   B8 02000000      MOV EAX,2  
00390005   B9 06000000      MOV ECX,6  
0039000A   03C1             ADD EAX,ECX  
0039000C   85C0             TEST EAX,EAX  
0039000E   74 05            JE SHORT 00390015  
00390010   83F8 08          CMP EAX,8  
00390013  ^75 EB            JNZ SHORT 00390000  
00390015   C3               RETN  

This simple code puts some values in the EAX and ECX registers. It then adds ECX to EAX and jumps to 00390015 if EAX == 0 (which is never the case here). Next, it compares EAX with 8 and, if they differ, jumps back to the beginning. If EAX == 8, the jump is not taken and RETN is executed.
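For readers who prefer a high-level view, the control flow of the sample can be paraphrased in Python. This is only a rough sketch: the function name `sample` is made up for illustration, and the 32-bit wrap-around of ADD is modelled with a mask.

```python
def sample():
    """Rough Python paraphrase of the sample's control flow (illustrative only)."""
    while True:
        eax = 2                           # MOV EAX,2
        ecx = 6                           # MOV ECX,6
        eax = (eax + ecx) & 0xFFFFFFFF    # ADD EAX,ECX
        if eax == 0:                      # TEST EAX,EAX / JE SHORT 00390015
            break
        if eax == 8:                      # CMP EAX,8 / JNZ not taken
            break
        # JNZ taken: loop back to 00390000
    return eax                            # value left in EAX at RETN
```

Calling `sample()` returns 8, since 2 + 6 is neither zero nor different from 8, so neither jump is taken and execution reaches RETN on the first pass.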

That sample code will be used for demonstration purposes. You can see the code is pretty straightforward and easy to navigate. Is there any way we can disturb its execution flow to make it both harder to read and harder to step through? There sure is!

The purpose of execution flow obfuscation is to add as many unnecessary jumps as possible while retaining the original functionality of the code. The obfuscation tool goes over the instructions one by one and decides at each step whether a jump should be placed at that spot. If so, it picks a random place in the shellcode to jump to, and the code that follows the new jump is inserted there. As there is already other code in that place, we need to split it with another jump that skips over the inserted code.
This may be a bit hard to explain in words, but hopefully it gets clearer with the following example.
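The idea can also be sketched on symbolic assembly, where labels stand in for addresses so that no offsets need fixing up. This is not the actual tool's code: `split_flow`, `random_split`, the `moved_`/`resume_` label names and the list-of-strings representation are all assumptions made for illustration. The sketch moves the whole tail of the code and assumes that tail ends in RET or an unconditional jump.

```python
import random

def split_flow(code, at, into, tag="0"):
    """Move code[at:] to index `into` and stitch the flow back together with jumps.

    `code` is a list of symbolic lines; entries ending in ':' are labels.
    Assumes the moved tail ends in RET or an unconditional jump.
    """
    if not 0 <= into < at < len(code):
        raise ValueError("need 0 <= into < at < len(code)")
    moved, resume = f"moved_{tag}", f"resume_{tag}"   # hypothetical label names
    return (
        code[:into]
        + [f"jmp {resume}"]               # jump split: skip over the inserted code
        + [f"{moved}:"] + code[at:]       # inserted code (the relocated tail)
        + [f"{resume}:"] + code[into:at]  # code that originally lived here
        + [f"jmp {moved}"]                # inserted jump to the relocated tail
    )

def random_split(code, rng, tag="0"):
    """Pick the jump position and the jump-to position at random."""
    at = rng.randrange(1, len(code))
    into = rng.randrange(0, at)
    return split_flow(code, at, into, tag)

SAMPLE = [
    "start:", "mov eax, 2", "mov ecx, 6", "add eax, ecx", "test eax, eax",
    "je done", "cmp eax, 8", "jnz start", "done:", "ret",
]
```

`split_flow(SAMPLE, at=6, into=3)` produces the same layout as the hand-worked example that follows. Because every result ends in an unconditional jump, `random_split` can be applied repeatedly (with a fresh `tag` each time, e.g. `random_split(code, random.Random(), tag="1")`) to pile up more jumps.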

Let's say we are inserting a jump at 00390010 and we randomly picked the instruction at address 0039000A as the place to insert into. This is what our sample code looks like after the jump is inserted:

00390000   B8 02000000      MOV EAX,2  
00390005   B9 06000000      MOV ECX,6  
0039000A   EB 06            JMP SHORT 00390012 <-- inserted code with a jump split, jumping over the inserted code  
0039000C   83F8 08          CMP EAX,8  
0039000F  ^75 EF            JNZ SHORT 00390000  
00390011   C3               RETN               <-- inserted code ends here  
00390012   03C1             ADD EAX,ECX  
00390014   85C0             TEST EAX,EAX  
00390016  ^74 F9            JE SHORT 00390011  
00390018  ^EB F2            JMP SHORT 0039000C <-- inserted jump  

You can see that the code was split at 0039000A with a jump that skips over the inserted code. The execution order is preserved, but the layout in memory is now different. If you follow the instructions one by one, you will see that they execute in exactly the same order as in the original sample.
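Moving code around means every relative branch has to be re-encoded, because a short jump stores its displacement relative to the address of the next instruction. A minimal sketch of that calculation follows; the helper name `rel8` is made up for illustration.

```python
def rel8(jump_addr, target_addr):
    """Displacement byte for a 2-byte short jump (e.g. EB xx, 74 xx, 75 xx)."""
    disp = target_addr - (jump_addr + 2)   # relative to the next instruction
    if not -128 <= disp <= 127:
        raise ValueError("target out of short-jump range")
    return disp & 0xFF                     # two's complement, one byte

# The four branches from the rearranged listing above:
print(hex(rel8(0x0039000A, 0x00390012)))   # JMP SHORT 00390012
print(hex(rel8(0x0039000F, 0x00390000)))   # JNZ SHORT 00390000
print(hex(rel8(0x00390016, 0x00390011)))   # JE SHORT 00390011
print(hex(rel8(0x00390018, 0x0039000C)))   # JMP SHORT 0039000C
```

The results, 0x6, 0xef, 0xf9 and 0xf2, match the second byte of each branch in the listing. When a target falls outside the signed 8-bit range, the longer 5-byte `E9` form with a 32-bit displacement is needed instead.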

This is basically the whole idea. Not much to add here, but what if we can up the game with...

Anti-disassembly Tricks

I will focus here on a very popular method that involves inserting a wild byte between instructions. The byte is never executed, but it makes byte-by-byte disassemblers trip and fail miserably. Smart disassemblers that follow the execution flow won't be fooled by this (IDA handles it very well), but debuggers and disassemblers that decode instructions one by one from top to bottom (like OllyDbg) will have problems.

Let's imagine we have the following code:

00390000   B9 01000000      MOV ECX,1  
00390005   03C1             ADD EAX,ECX  

Now we insert a jump at 00390000 that will point to the next instruction:

00390000   EB 00            JMP SHORT 00390002  
00390002   B9 01000000      MOV ECX,1  
00390007   03C1             ADD EAX,ECX  

Simple, right? Now that the jump is in place and set to jump to the next instruction, what happens if we insert a 0xE8 byte between JMP SHORT and MOV ECX,1, with the jump adjusted to skip over it?

00390000   EB 01            JMP SHORT 00390003  
00390002   E8 B9010000      CALL 003901C0  
00390007   0003             ADD BYTE PTR DS:[EBX],AL  
00390009   C100 00          ROL DWORD PTR DS:[EAX],0  

As you can see, the disassembler, decoding instructions one by one, came across our 0xE8 byte and interpreted it as the opcode of the next instruction: a CALL followed by a 32-bit relative address. The disassembler doesn't know that the inserted 0xE8 wild byte is never executed and instead tries its best to make sense of the machine code it sees. This one-byte shift also throws off the disassembly of the instructions that follow.
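The difference between the two disassembly strategies can be shown with a toy decoder. This is a sketch under heavy assumptions: `LENGTHS` is a hand-made table that only covers the opcodes appearing in this example (in the operand forms used here), not a real x86 length decoder, and the function names are made up. The byte string is the snippet above, with the jump encoded as `EB 01` so that it skips the wild byte.

```python
# Instruction lengths for the few opcodes in this example only (assumption).
LENGTHS = {0xEB: 2, 0xE8: 5, 0xB9: 5, 0x03: 2, 0x00: 2, 0xC1: 3}

CODE = bytes.fromhex("EB01" "E8" "B901000000" "03C1")

def linear_sweep(code):
    """Decode top to bottom, ignoring where jumps lead."""
    offsets, pos = [], 0
    while pos < len(code):
        offsets.append(pos)
        pos += LENGTHS[code[pos]]
    return offsets

def follow_flow(code):
    """Decode along the execution path, following short jumps."""
    offsets, pos = [], 0
    while 0 <= pos < len(code) and pos not in offsets:
        offsets.append(pos)
        opcode = code[pos]
        if opcode == 0xEB:   # JMP rel8: continue at the jump target
            disp = int.from_bytes(code[pos + 1:pos + 2], "little", signed=True)
            pos += 2 + disp
        else:
            pos += LENGTHS[opcode]
    return offsets

print(linear_sweep(CODE))   # instruction starts seen by a top-to-bottom decoder
print(follow_flow(CODE))    # instruction starts that are actually executed
```

The linear sweep reports instruction starts at offsets 0, 2, 7 and 9 (the bogus CALL, ADD and ROL), while the flow-following pass reports 0, 3 and 8 (the real JMP, MOV and ADD).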

The obfuscation tool inserts a lot of jumps in order to change the execution flow, and it would be a waste not to put some wild bytes after those jumps here and there, making the disassembled code practically unreadable in debuggers and disassemblers that do not analyze the execution flow.

Here are wild bytes that I decided to use, with their corresponding instruction opcodes (the obfuscated output further below also contains others, such as 81, 83, D8 and EA):

68 - PUSH IMM32  
E8 - CALL  
E9 - JMP  
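All three opcodes above expect four more bytes of operand, so a top-to-bottom disassembler swallows the start of the real instruction that follows. Emitting such a decoy could look roughly like this; the helper name `jump_over_wild_byte` and the tuple `WILD_BYTES` are assumptions for illustration, not the tool's actual API.

```python
import random

WILD_BYTES = (0x68, 0xE8, 0xE9)   # PUSH imm32, CALL rel32, JMP rel32

def jump_over_wild_byte(rng):
    """Return a short jump that skips one randomly chosen, never-executed byte."""
    wild = rng.choice(WILD_BYTES)
    return bytes([0xEB, 0x01, wild])   # JMP SHORT +1, then the wild byte

decoy = jump_over_wild_byte(random.Random())
print(decoy.hex())
```

The three returned bytes can be placed in front of any instruction: execution lands right after the wild byte, so the behavior of the code is unchanged.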


With all of the obfuscation features in place, let's see what happens if we run the sample code through a 1-pass obfuscation process with the highest level of control flow mixing. For reference, this is the untouched sample code in its original form:

0    00000000: b802000000                       MOV EAX, 0x2  
1    00000005: b906000000                       MOV ECX, 0x6  
2    0000000a: 01c8                             ADD EAX, ECX  
3    0000000c: 85c0                             TEST EAX, EAX  
4    0000000e: 7405                             JZ 0x15  
5    00000010: 83f808                           CMP EAX, 0x8  
6    00000013: 75eb                             JNZ 0x0  
7    00000015: c3                               RET  

And here is the same code after obfuscation:

0    00000000: b837272a61                       MOV EAX, 0x612a2737  
1    00000005: eb3a                             JMP 0x41  
2    00000007: eb2f                             JMP 0x38  
3    00000009: 81c06de78ae1                     ADD EAX, 0xe18ae76d  
4    0000000f: eb16                             JMP 0x27  
5    00000011: 81e87daf9577                     SUB EAX, 0x7795af7d  
6    00000017: e994000000                       JMP 0xb0  
7    0000001c: 83  
7    0000001d: eb08                             JMP 0x27  
8    0000001f: bb0089c4e5                       MOV EBX, 0xe5c48900  
9    00000024: eb53                             JMP 0x79  
10   00000026: 68  
10   00000027: ebe8                             JMP 0x11  
11   00000029: f7  
11   0000002a: eb0c                             JMP 0x38  
12   0000002c: 81e9699dec08                     SUB ECX, 0x8ec9d69  
13   00000032: e9ab000000                       JMP 0xe2  
14   00000037: 83  
14   00000038: 81c038e2d5f6                     ADD EAX, 0xf6d5e238  
15   0000003e: ebc9                             JMP 0x9  
16   00000040: d8  
16   00000041: eb56                             JMP 0x99  
17   00000043: eb4b                             JMP 0x90  
18   00000045: eb05                             JMP 0x4c  
19   00000047: 75b7                             JNZ 0x0  
20   00000049: eb2d                             JMP 0x78  
21   0000004b: e9  
21   0000004c: b995f3fde5                       MOV ECX, 0xe5fdf395  
22   00000051: eb1a                             JMP 0x6d  
23   00000053: eb13                             JMP 0x68  
24   00000055: 81ebef03dbd0                     SUB EBX, 0xd0db03ef  
25   0000005b: eb08                             JMP 0x65  
26   0000005d: 39d8                             CMP EAX, EBX  
27   0000005f: e98f000000                       JMP 0xf3  
28   00000064: 81  
28   00000065: eb51                             JMP 0xb8  
29   00000067: ea  
29   00000068: 01c8                             ADD EAX, ECX  
30   0000006a: eb06                             JMP 0x72  
31   0000006c: d8  
31   0000006d: ebbd                             JMP 0x2c  
32   0000006f: 81  
32   00000070: eb13                             JMP 0x85  
33   00000072: 85c0                             TEST EAX, EAX  
34   00000074: eb0c                             JMP 0x82  
35   00000076: eb01                             JMP 0x79  
36   00000078: c3                               RET  
37   00000079: 81ebeb56863b                     SUB EBX, 0x3b8656eb  
38   0000007f: eb52                             JMP 0xd3  
39   00000081: d8  
39   00000082: eb27                             JMP 0xab  
40   00000084: f7  
40   00000085: eb09                             JMP 0x90  
41   00000087: 81e93cdbac3f                     SUB ECX, 0x3facdb3c  
42   0000008d: ebd9                             JMP 0x68  
43   0000008f: ea  
43   00000090: 81c0c757b8d5                     ADD EAX, 0xd5b857c7  
44   00000096: eba0                             JMP 0x38  
45   00000098: ea  
45   00000099: ebf5                             JMP 0x90  
46   0000009b: dc  
46   0000009c: eb09                             JMP 0xa7  
47   0000009e: 81c388775b66                     ADD EBX, 0x665b7788  
48   000000a4: ebaf                             JMP 0x55  
49   000000a6: d8  
49   000000a7: eb00                             JMP 0xa9  
50   000000a9: eb05                             JMP 0xb0  
51   000000ab: 74cb                             JZ 0x78  
52   000000ad: eb21                             JMP 0xd0  
53   000000af: d8  
53   000000b0: 81c0dc665268                     ADD EAX, 0x685266dc  
54   000000b6: eb09                             JMP 0xc1  
55   000000b8: 81c396e99ded                     ADD EBX, 0xed9de996  
56   000000be: eb9d                             JMP 0x5d  
57   000000c0: 83  
57   000000c1: eb09                             JMP 0xcc  
58   000000c3: 81f14ffbe7e8                     XOR ECX, 0xe8e7fb4f  
59   000000c9: ebbc                             JMP 0x87  
60   000000cb: d8  
60   000000cc: eb1d                             JMP 0xeb  
61   000000ce: eb12                             JMP 0xe2  
62   000000d0: 53                               PUSH EBX  
63   000000d1: eb09                             JMP 0xdc  
64   000000d3: 81eb3c8f5c2d                     SUB EBX, 0x2d5c8f3c  
65   000000d9: ebc3                             JMP 0x9e  
66   000000db: 81  
66   000000dc: e93effffff                       JMP 0x1f  
67   000000e1: e8  
67   000000e2: 81c1e1c939fa                     ADD ECX, 0xfa39c9e1  
68   000000e8: ebd9                             JMP 0xc3  
69   000000ea: da  
69   000000eb: e95cffffff                       JMP 0x4c  
70   000000f0: 83  
70   000000f1: eb00                             JMP 0xf3  
71   000000f3: 5b                               POP EBX  
72   000000f4: e94effffff                       JMP 0x47  
73   000000f9: da  

Please keep in mind that this code is disassembled properly only because we know where the wild bytes are and can filter them out during the disassembly process.
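To follow the listing by hand, it helps to compute where each jump lands. The sketch below decodes the target of the relative jumps that appear in it; the function name `jump_target` is made up, and only the opcodes `EB`, `74`, `75` and `E9` are handled.

```python
def jump_target(offset, raw):
    """Return the destination offset of a relative jump encoded in `raw`."""
    opcode = raw[0]
    if opcode in (0xEB, 0x74, 0x75):   # JMP / JZ / JNZ with an 8-bit displacement
        size = 2
    elif opcode == 0xE9:               # JMP with a 32-bit displacement
        size = 5
    else:
        raise ValueError("not a jump this sketch understands")
    disp = int.from_bytes(raw[1:size], "little", signed=True)
    return offset + size + disp        # relative to the next instruction

print(hex(jump_target(0x17, bytes.fromhex("e994000000"))))   # forward, long form
print(hex(jump_target(0xDC, bytes.fromhex("e93effffff"))))   # backward, long form
print(hex(jump_target(0x27, bytes.fromhex("ebe8"))))         # backward, short form
```

These three calls give 0xb0, 0x1f and 0x11, matching the targets shown at offsets 0x17, 0xdc and 0x27 in the listing above.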

Unaware of the wild bytes, poor OllyDbg disassembles the code like this:

00390000   B8 37272A61      MOV EAX,612A2737  
00390005   EB 3A            JMP SHORT 00390041  
00390007   EB 2F            JMP SHORT 00390038  
00390009   81C0 6DE78AE1    ADD EAX,E18AE76D  
0039000F   EB 16            JMP SHORT 00390027  
00390011   81E8 7DAF9577    SUB EAX,7795AF7D  
00390017   E9 94000000      JMP 003900B0  
0039001C   83EB 08          SUB EBX,8  
0039001F   BB 0089C4E5      MOV EBX,E5C48900  
00390024   EB 53            JMP SHORT 00390079  
00390026   68 EBE8F7EB      PUSH EBF7E8EB  
0039002B   0C 81            OR AL,81  
0039002D  -E9 699DEC08      JMP 09259D9B  
00390032   E9 AB000000      JMP 003900E2  
00390037   8381 C038E2D5 F6 ADD DWORD PTR DS:[ECX+D5E238C0],-0A  
0039003E  ^EB C9            JMP SHORT 00390009  
00390040   D8EB             FSUBR ST,ST(3)  
00390042   56               PUSH ESI  
00390043   EB 4B            JMP SHORT 00390090  
00390045   EB 05            JMP SHORT 0039004C  
00390047  ^75 B7            JNZ SHORT 00390000  
00390049   EB 2D            JMP SHORT 00390078  
0039004B  -E9 B995F3FD      JMP FE2C9609  
00390050   E5 EB            IN EAX,0EB                               ; I/O command  
00390052   1AEB             SBB CH,BL  
00390054   1381 EBEF03DB    ADC EAX,DWORD PTR DS:[ECX+DB03EFEB]  
0039005A   D0EB             SHR BL,1  
0039005C   0839             OR BYTE PTR DS:[ECX],BH  
0039005E   D8E9             FSUBR ST,ST(1)  
00390060   8F00             POP DWORD PTR DS:[EAX]  
00390062   0000             ADD BYTE PTR DS:[EAX],AL  
00390064   81EB 51EA01C8    SUB EBX,C801EA51  
0039006A   EB 06            JMP SHORT 00390072  
0039006C   D8EB             FSUBR ST,ST(3)  
0039006E   BD 81EB1385      MOV EBP,8513EB81  
00390073   C0EB 0C          SHR BL,0C  
00390076   EB 01            JMP SHORT 00390079  
00390078   C3               RETN  
00390079   81EB EB56863B    SUB EBX,3B8656EB  
0039007F   EB 52            JMP SHORT 003900D3  
00390081   D8EB             FSUBR ST,ST(3)  
00390083   27               DAA  
00390084   F7EB             IMUL EBX  
00390086   0981 E93CDBAC    OR DWORD PTR DS:[ECX+ACDB3CE9],EAX  
0039008C   3F               AAS  
0039008D  ^EB D9            JMP SHORT 00390068  
0039008F   EA 81C0C757 B8D5 JMP FAR D5B8:57C7C081                    ; Far jump  
00390096  ^EB A0            JMP SHORT 00390038  
00390098   EA EBF5DCEB 0981 JMP FAR 8109:EBDCF5EB                    ; Far jump  
0039009F   C3               RETN  
003900A0   8877 5B          MOV BYTE PTR DS:[EDI+5B],DH  
003900A3  -66:EB AF         JMP SHORT 00000055  
003900A6   D8EB             FSUBR ST,ST(3)  
003900A8   00EB             ADD BL,CH  
003900AA   05 74CBEB21      ADD EAX,21EBCB74  
003900AF   D881 C0DC6652    FADD DWORD PTR DS:[ECX+5266DCC0]  
003900B5   68 EB0981C3      PUSH C38109EB  
003900BA   96               XCHG EAX,ESI  
003900BB  -E9 9DEDEB9D      JMP 9E24EE5D  
003900C0   83EB 09          SUB EBX,9  
003900C3   81F1 4FFBE7E8    XOR ECX,E8E7FB4F  
003900C9  ^EB BC            JMP SHORT 00390087  
003900CB   D8EB             FSUBR ST,ST(3)  
003900CD   1D EB1253EB      SBB EAX,EB5312EB  
003900D2   0981 EB3C8F5C    OR DWORD PTR DS:[ECX+5C8F3CEB],EAX  
003900D8   2D EBC381E9      SUB EAX,E981C3EB  
003900DD   3E:FFFF          ???                                      ; Unknown command  
003900E0   FFE8             JMP FAR EAX                              ; Illegal use of register  
003900E2   81C1 E1C939FA    ADD ECX,FA39C9E1  
003900E8  ^EB D9            JMP SHORT 003900C3  
003900EA   DAE9             FUCOMPP  
003900EC   5C               POP ESP  
003900ED   FFFF             ???                                      ; Unknown command  
003900EF   FF83 EB005BE9    INC DWORD PTR DS:[EBX+E95B00EB]  
003900F5   4E               DEC ESI  
003900F6   FFFF             ???                                      ; Unknown command  
003900F8   FFDA             CALL FAR EDX                             ; Illegal use of register  

Have fun reversing that!

This part, covering the execution flow obfuscation feature, wraps up the subject of x86 shellcode obfuscation. I have learned a lot in the process of doing this research and I hope I've managed to show you something new.

From this point forward, I may start porting the Python obfuscator to C/C++ and developing a stand-alone x86 obfuscation library that could later be used in more advanced projects I have in mind.

Such an obfuscation engine would make a strong foundation for developing software anti-cracking protections. My next research may be about finding ways to translate x86 instructions into the instructions of another, virtual CPU and writing an emulator engine in assembly language that would execute the translated virtual instructions on the x86 architecture. This is, as far as I know, the best way to protect your critical code from prying eyes.

The tool in its current form should do perfectly fine at preventing AV software from detecting known shellcodes embedded in executable files. Metasploit payloads, though, require some work first, as they are not fully prepared for obfuscation. To keep their size down, generated payloads use fixed relative offsets, which assumes that their execution flow and length won't be modified.

Source code

As always, you can get the latest version of the tool on GitHub.

If you liked this project, have suggestions, or just want to say hi, you can find me on Twitter @mrgretzky or Google+.

Stay tuned for more posts!


Kuba Gretzky

I am a reverse engineer and software developer. When I'm not working on my own projects, I seek jobs related to my interests. I would say I'm most proficient in C/C++ and low-level tinkering.