Welcome back to the series where I research the subject of shellcode obfuscation. If you missed the last episode, feel free to catch up by following this link:
Last time I created the tool's backbone for obfuscation. The ability to insert and delete instructions while fixing the relative jumps between them made it possible to modify the shellcode freely. That part is done and we don't have to worry about it anymore.
In this episode I will explain a bit about the format of x86 instructions. There is no running away from it if anything in this part is to make sense.
X86 Instruction Format
On the following image you can see the structure of every instruction for Intel processors:
Prefix (0 to 4 bytes) - here you will find bytes that modify the operation of the instruction. There are several different prefix bytes; here are just a few examples:
0x66 : Operand-size override, which switches the instruction's operands from 32-bit to 16-bit. Yes, in 32-bit code it is enough to put this prefix in front of an instruction to change, for example, MOV EAX, EDX into MOV AX, DX.
0x64 : FS segment override, which allows you to access data from the FS memory segment. Useful when you want to access the Thread Information Block of the current thread, for example to retrieve the address of the PEB (MOV EAX, [FS:0x30]).
Opcode (1 to 2 bytes) - this is the main set of instruction bytes. They are always present and indicate the main purpose of the instruction.
Mod-R/M (0 to 1 bytes) - this operand is present only if the instruction supports register or memory operands. There is no way to tell from the opcode bits whether the instruction supports this operand. Only the documentation will answer that. Examples:
8B D1 : MOV EDX, ECX # MOD-R/M is present (0xD1)
90 : NOP # MOD-R/M not present
52 : PUSH EDX # MOD-R/M not present
SIB (0 to 1 bytes) - this operand is present only if it is specifically indicated in the Mod-R/M byte. SIB handles any form of advanced memory addressing that combines register values. Examples:
8B 04 31 : MOV EAX, DWORD [ESI+ECX] (SIB says that ESI is offset by ECX in this instruction)
C7 04 8F BE BA FE CA : MOV [EDI+ECX*4], 0xCAFEBABE (SIB says that EDI is offset by ECX multiplied by 4)
Displacement (0, 1, 2 or 4 bytes) - these bytes are present only when the top bits of Mod-R/M say so. Displacement is present whenever there is a need to offset the register operand by a fixed value. It can be used together with the SIB byte. Examples:
8B 84 31 FE CA 00 00 : MOV EAX, DWORD [ECX+ESI+0xCAFE] (offset by the 0xCAFE displacement)
8B 45 F0 : MOV EAX, DWORD [EBP-0x10] (offset by the -0x10 displacement)
Immediate (0, 1, 2 or 4 bytes) - these bytes are present only when the instruction supports an immediate operand. Their presence and size can only be found by reading the documentation. Examples:
C7 04 8F BE BA FE CA : MOV [EDI+ECX*4], 0xCAFEBABE (0xCAFEBABE is a 4-byte immediate)
In later parts of this article, there will be places with embedded 3 bit register values, described as REG. For future reference, here is the table that explains what value is what register in different bit sizes:
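|REG||Register (8 bits)||Register (16 bits)||Register (32 bits)|
|000||AL||AX||EAX|
|001||CL||CX||ECX|
|010||DL||DX||EDX|
|011||BL||BX||EBX|
|100||AH||SP||ESP|
|101||CH||BP||EBP|
|110||DH||SI||ESI|
|111||BH||DI||EDI|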
Believe me, this is a lot of information to process at once and it took me quite some time to organize it all in my head. Let's take it slow, step by step.
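MOD-R/M
This byte contains information about the presence of the SIB operand, the presence and size of the Displacement operand, and which register or mode is used with the instruction. Its top 2 bits form the MOD field, the middle 3 bits the REG field and the bottom 3 bits the R/M field.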
|00||No displacement, register as pointer addressing|
|01||1-byte displacement, register as pointer addressing|
|10||4-byte displacement, register as pointer addressing|
|11||No displacement, register addressing|
REG - source or destination register used in the instruction. Depending on the instruction opcode, it may instead select the instruction mode.
R/M - register affected by the SIB and Displacement operands.
If R/M equals 100 and MOD is not 11, the SIB operand is present and the register is defined in the Base field of the SIB operand.
If MOD equals 00 and R/M equals 101, only a 32-bit displacement value will be used (e.g. MOV EAX, [0xDEADBEEF]).
Here is the full table with various combinations of MOD and R/M bits:
MOD R/M Addressing Mode
=== === ================================
00  000 [ eax ]
01  000 [ eax + disp8 ]
10  000 [ eax + disp32 ]
11  000 register ( al / ax / eax )
00  001 [ ecx ]
01  001 [ ecx + disp8 ]
10  001 [ ecx + disp32 ]
11  001 register ( cl / cx / ecx )
00  010 [ edx ]
01  010 [ edx + disp8 ]
10  010 [ edx + disp32 ]
11  010 register ( dl / dx / edx )
00  011 [ ebx ]
01  011 [ ebx + disp8 ]
10  011 [ ebx + disp32 ]
11  011 register ( bl / bx / ebx )
00  100 SIB Mode
01  100 SIB + disp8 Mode
10  100 SIB + disp32 Mode
11  100 register ( ah / sp / esp )
00  101 32-bit Displacement-Only Mode
01  101 [ ebp + disp8 ]
10  101 [ ebp + disp32 ]
11  101 register ( ch / bp / ebp )
00  110 [ esi ]
01  110 [ esi + disp8 ]
10  110 [ esi + disp32 ]
11  110 register ( dh / si / esi )
00  111 [ edi ]
01  111 [ edi + disp8 ]
10  111 [ edi + disp32 ]
11  111 register ( bh / di / edi )
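To make the table a bit more tangible, here is a minimal Python sketch that splits a MOD-R/M byte into its three fields and names the selected addressing mode. It assumes 32-bit addressing, and the helper names (split_modrm, describe_modrm) are made up for illustration; they are not part of the obfuscator.

```python
# Illustrative helpers only; names are hypothetical, 32-bit addressing assumed.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def split_modrm(modrm):
    """Split a MOD-R/M byte into its (MOD, REG, R/M) bit fields."""
    mod = (modrm >> 6) & 0b11   # top 2 bits
    reg = (modrm >> 3) & 0b111  # middle 3 bits
    rm = modrm & 0b111          # bottom 3 bits
    return mod, reg, rm


def describe_modrm(modrm):
    """Name the addressing mode selected by the MOD and R/M fields."""
    mod, _reg, rm = split_modrm(modrm)
    if mod == 0b11:
        return "register " + REG32[rm]
    if mod == 0b00 and rm == 0b101:
        return "[disp32]"        # displacement-only mode
    base = "SIB" if rm == 0b100 else REG32[rm]
    disp = {0b00: "", 0b01: " + disp8", 0b10: " + disp32"}[mod]
    return "[" + base + disp + "]"


if __name__ == "__main__":
    for byte in (0xD1, 0x45, 0x04, 0x84, 0x05):
        print("%02X" % byte, split_modrm(byte), describe_modrm(byte))
```

Running it on 0xD1 (from MOV EDX, ECX) gives MOD 11, REG 010 and R/M 001, which matches the "register" row for ECX in the table.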
SIB
This operand describes a more advanced addressing mode, where the base register is offset by an index register that is multiplied by 1, 2, 4 or 8. The multiplier is stored in the top 2 bits of the byte, followed by 3 bits of Index and 3 bits of Base. The operand is always present if the R/M value in the MOD-R/M operand equals 100 and MOD is not 11.
Index - REG value of the index register that acts as an offset to the Base register described below. The value 100 means that no index register is used, as the ESP register is not supported as an index in SIB.
Base - REG value of the main register that acts as the base of the addressing mode. If you don't want to use the Base register in the instruction and just focus on the Index register, the Base bits have to be set to 101 with MOD equal to 00. In that case a 32-bit displacement takes the place of the base register.
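As a hedged sketch of how these fields combine, the following Python function decodes a SIB byte into the register part of the address expression. The function name decode_sib is hypothetical, and the sketch relies on two rules of the 32-bit encoding: an Index value of 100 means "no index", and a Base value of 101 with MOD equal to 00 means "no base, a 32-bit displacement follows".

```python
# Illustrative only; decode_sib is a made-up name, not part of the obfuscator.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_sib(sib, mod):
    """Return the base/index part of the address described by a SIB byte.

    `mod` is the MOD field of the preceding MOD-R/M byte. Any disp8/disp32
    selected by MOD 01/10 is not included in the returned string.
    """
    scale = 1 << ((sib >> 6) & 0b11)  # 00, 01, 10, 11 -> 1, 2, 4, 8
    index = (sib >> 3) & 0b111
    base = sib & 0b111

    parts = []
    if base == 0b101 and mod == 0b00:
        parts.append("disp32")        # no base register at all
    else:
        parts.append(REG32[base])
    if index != 0b100:                # 100 means no index register
        suffix = "*%d" % scale if scale > 1 else ""
        parts.append(REG32[index] + suffix)
    return "[" + "+".join(parts) + "]"


if __name__ == "__main__":
    print(decode_sib(0x31, 0))  # SIB byte of 8B 04 31
    print(decode_sib(0x8F, 0))  # SIB byte of C7 04 8F BE BA FE CA
```

The two SIB bytes used in the examples earlier, 0x31 and 0x8F, decode to [ECX+ESI] and [EDI+ECX*4].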
Displacement
This is a 1, 2 or 4 byte representation of the memory offset used in the instruction. A 1-byte displacement is present if the MOD value in the MOD-R/M operand equals 01 and a 4-byte displacement is present if the MOD value equals 10. A 2-byte displacement appears when the 0x67 address-size override prefix switches the instruction to 16-bit addressing and the MOD bits are 10. Note that this is a different prefix from the 0x66 operand-size override.
There is one exception where this operand is present even though the MOD bits say otherwise. When MOD equals 00 and R/M equals 101 in the MOD-R/M operand, the instruction will use a 4-byte Displacement as a direct memory pointer (e.g. MOV [0xCAFEBABE], EDX).
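The rules above can be sketched as a small Python function that returns the displacement size for a given MOD-R/M byte. This is an illustration under stated assumptions: 32-bit addressing only (so 2-byte displacements are not handled), and the function name displacement_size is hypothetical. It also covers the related SIB case where Base equals 101 and MOD equals 00, which carries a 4-byte displacement as well.

```python
# Illustrative only; assumes 32-bit addressing, so disp16 is never returned.
def displacement_size(modrm, sib=None):
    """Return the displacement size in bytes (0, 1 or 4).

    `sib` is the SIB byte if one follows the MOD-R/M byte, otherwise None.
    """
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111

    if mod == 0b01:
        return 1
    if mod == 0b10:
        return 4
    if mod == 0b00:
        if rm == 0b101:
            return 4  # displacement-only mode, e.g. MOV [0xCAFEBABE], EDX
        if rm == 0b100 and sib is not None and (sib & 0b111) == 0b101:
            return 4  # SIB without a base register
    return 0          # MOD 11, or MOD 00 with a plain register pointer


if __name__ == "__main__":
    print(displacement_size(0x45))        # 8B 45 F0
    print(displacement_size(0x84, 0x31))  # 8B 84 31 FE CA 00 00
    print(displacement_size(0xD1))        # 8B D1
```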
Immediate
This value is present only if the specific instruction opcode allows it. The size of the immediate value is also controlled by the opcode itself. To find out if this operand is present in any given instruction, one has to look it up in an opcode reference like this one:
Everything I've written above may seem complicated at first, but the more time you spend analyzing different instructions, the more it clicks in your brain. What I do is fire up OllyDbg or IDA, disassemble any executable and just look around, analyzing the machine code of different instructions by pasting the hex values into the calculator and converting them to binary bits. From the binary data I try to disassemble the instruction by hand.
If you get lost or want to read more on the subject of formatting x86 instructions, check out these links:
With that out of the way, we can now focus on the main subject of this article, which is...
Our goal, as before, is to obfuscate as many instructions as possible that contain static data which can be easily identified by signature detection. The next instruction I decided to focus on is:
PUSH IMM
(pushes immediate static value onto the stack)
68 BE BA FE CA : PUSH 0xCAFEBABE
6A 40 : PUSH 0x40
We can detect these instructions by looking for the 0x68 and 0x6A opcode values.
We will use the current calculations generator that we used with the previous instruction. We will take the immediate value and obfuscate it with several random calculations that will store the result in one of the randomly picked registers.
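Before the immediate can be fed to the calculations generator, it has to be pulled out of the instruction bytes. Here is a minimal Python sketch for the two PUSH encodings shown above. The function name parse_push_imm is hypothetical, and the sketch assumes that `offset` already points at the start of an instruction with no prefix bytes (scanning raw bytes for these opcode values without tracking instruction boundaries would produce false matches).

```python
import struct


# Illustrative only; parse_push_imm is a made-up name, not the tool's code.
def parse_push_imm(code, offset=0):
    """Return (immediate, instruction_length) for PUSH imm8/imm32, else None."""
    opcode = code[offset]
    if opcode == 0x68:  # PUSH imm32
        (imm,) = struct.unpack_from("<I", code, offset + 1)
        return imm, 5
    if opcode == 0x6A:  # PUSH imm8, sign-extended to 32 bits by the CPU
        (imm8,) = struct.unpack_from("<b", code, offset + 1)
        return imm8 & 0xFFFFFFFF, 2
    return None


if __name__ == "__main__":
    print(parse_push_imm(bytes.fromhex("68BEBAFECA")))
    print(parse_push_imm(bytes.fromhex("6A40")))
```

Note the sign extension in the one-byte form: PUSH 0xFF encoded as 6A FF actually pushes 0xFFFFFFFF, so that is the value the calculations would have to reproduce.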
The tricky part here is that we need to PUSH the value of our temporary register onto the stack in order to preserve it, as the shellcode may be using that register for something else at the place where we insert our obfuscation block. The thing is, the obfuscation flow must also put the obfuscated immediate value onto the stack, and we need to make sure it is placed before our preserved register value, so that it is still there after the register is restored.
To get this done, we first PUSH a placeholder (any register will do) to reserve the stack slot for the result, and only then PUSH the temporary register to preserve it. After the calculations, and before we POP the temporary register, we execute MOV [ESP+4], REG, which writes the result directly into the reserved slot on the stack. After we POP the preserved register value, our result stays on top of the stack, which is exactly the behavior of the original instruction.
This is what it looks like after implementation:
68 BE BA FE CA : PUSH 0xCAFEBABE

54 : PUSH ESP <-- pushing ANY random register to prepare place on the stack
57 : PUSH EDI
BF AE 4A 6F 25 : MOV EDI, 0x256F4AAE
81 F7 50 D0 63 DE : XOR EDI, 0xDE63D050
81 EF 25 51 61 D1 : SUB EDI, 0xD1615125
81 C7 EF F7 27 62 : ADD EDI, 0x6227F7EF
81 F7 76 FB 2D 41 : XOR EDI, 0x412DFB76
89 7C 24 04 : MOV [ESP+0x04], EDI
5F : POP EDI
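To convince yourself that the obfuscated block behaves like the original PUSH, you can replay it on a toy model. The Python sketch below is only an illustration: it models the stack as a list and EDI as a variable, and the function name and the sample register values are made up.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits, like a real register


# Illustrative model of the listing above; not part of the obfuscator.
def simulate_push_obfuscation(edi_before, placeholder):
    """Replay the obfuscated PUSH 0xCAFEBABE block on a toy stack.

    The last list element is the top of the stack, i.e. [ESP].
    Returns (stack, edi) as they look after the block has run.
    """
    stack = []
    edi = edi_before

    stack.append(placeholder)          # PUSH ESP (reserves the result slot)
    stack.append(edi)                  # PUSH EDI (preserves the register)
    edi = 0x256F4AAE                   # MOV EDI, 0x256F4AAE
    edi ^= 0xDE63D050                  # XOR EDI, 0xDE63D050
    edi = (edi - 0xD1615125) & MASK    # SUB EDI, 0xD1615125
    edi = (edi + 0x6227F7EF) & MASK    # ADD EDI, 0x6227F7EF
    edi ^= 0x412DFB76                  # XOR EDI, 0x412DFB76
    stack[-2] = edi                    # MOV [ESP+0x04], EDI
    edi = stack.pop()                  # POP EDI
    return stack, edi


if __name__ == "__main__":
    stack, edi = simulate_push_obfuscation(0x11223344, 0x0012FF00)
    print([hex(v) for v in stack], hex(edi))
```

After the block runs, the only value left on the model stack is 0xCAFEBABE and EDI holds whatever it held before, which is the same visible effect as the single PUSH 0xCAFEBABE.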
MOV REG, [REG+DISP]
(moves value into register from memory pointer formed by registers with displacement values)
64 8B 50 30 : MOV EDX, [FS:EAX+0x30]
8B 52 0C : MOV EDX, [EDX+0x0C]
8B 52 14 : MOV EDX, [EDX+0x14]
8B 72 28 : MOV ESI, [EDX+0x28]
With this instruction, we want to get rid of the static displacement values, which are very tasty bits for signature generators.
First we need to detect the instructions by looking for 0x8B opcode values. We are interested only in instructions that contain the displacement operand, so we make sure the top 2 bits of the MOD-R/M operand are 01 or 10.
Having detected the proper instruction for obfuscation, we extract the displacement value by carefully disassembling the instruction by hand, making sure we detect whether the displacement is 1, 2 or 4 bytes long. We put the value into our calculations generator and after that we modify the instruction so that, instead of the displacement operand (MOV EDX, [EDX+0x0C]), it takes the temporary register holding our calculations result as an index (MOV EDX, [EDX+EDI]).
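The detection and extraction steps can be sketched in Python as follows. This is a simplified illustration, not the tool's code: the function name mov_displacement is hypothetical, `offset` must point at the 0x8B opcode itself (past any prefix bytes), and 32-bit addressing is assumed, so 2-byte displacements are not handled.

```python
import struct


# Illustrative only; assumes 32-bit addressing and a known instruction start.
def mov_displacement(code, offset=0):
    """Return the signed displacement of MOV r32, r/m32, or None.

    None is returned when the opcode is not 0x8B or when the MOD bits
    are not 01 or 10, i.e. when there is no displacement to obfuscate.
    """
    if code[offset] != 0x8B:
        return None
    modrm = code[offset + 1]
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod not in (0b01, 0b10):
        return None

    pos = offset + 2
    if rm == 0b100:  # a SIB byte sits between MOD-R/M and the displacement
        pos += 1
    fmt = "<b" if mod == 0b01 else "<i"  # disp8 or disp32, both signed
    (disp,) = struct.unpack_from(fmt, code, pos)
    return disp


if __name__ == "__main__":
    print(mov_displacement(bytes.fromhex("8B520C")))
    print(mov_displacement(bytes.fromhex("8B45F0")))
    print(mov_displacement(bytes.fromhex("648B5030"), offset=1))
```

The displacement is read as a signed value on purpose, so 8B 45 F0 yields -0x10. Since address arithmetic wraps at 32 bits, the calculations can target the same value masked with 0xFFFFFFFF.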
We do this by carefully crafting the new MOV instruction from scratch, with a new SIB operand that holds our temporary register as the index. A problem appears if the original instruction already used a SIB operand with an index register, as we can't put more index registers into such an instruction. That's when we need to insert another instruction that adds the value of the original index register (with the original scale included) to our temporary register holding the calculations result.
You can see exactly what I mean by looking at the following obfuscation example. The following instruction already includes the SIB operand with an index register.
8B 8C 72 88 00 00 00 : MOV ECX, [EDX+ESI*2+0x88]

57 : PUSH EDI
BF 1C 62 1D 50 : MOV EDI, 0x501D621C
81 F7 1C 55 89 FE : XOR EDI, 0xFE89551C
81 F7 E4 4F FA 3A : XOR EDI, 0x3AFA4FE4
81 C7 16 F8 17 9F : ADD EDI, 0x9F17F816
81 F7 72 70 86 33 : XOR EDI, 0x33867072
8D 3C 77 : LEA EDI, [EDI+ESI*2]
8B 0C 3A : MOV ECX, [EDX+EDI]
5F : POP EDI
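Crafting the last two instructions of that block from scratch comes down to packing the MOD-R/M and SIB bit fields. The Python sketch below shows one way to do it; all function and constant names are hypothetical and the code is not taken from the obfuscator. It refuses EBP as a base and ESP as an index, because those bit patterns have special meanings in the encoding.

```python
# Illustrative encoder; names are made up, 32-bit registers only.
EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI = range(8)


def encode_modrm(mod, reg, rm):
    """Pack the MOD (2 bits), REG (3 bits) and R/M (3 bits) fields."""
    return (mod << 6) | (reg << 3) | rm


def encode_sib(scale, index, base):
    """Pack the scale (1, 2, 4 or 8), index and base fields."""
    scale_bits = {1: 0, 2: 1, 4: 2, 8: 3}[scale]
    return (scale_bits << 6) | (index << 3) | base


def _check(base, index):
    if base == EBP:
        raise ValueError("base 101 with MOD 00 means disp32, not EBP")
    if index == ESP:
        raise ValueError("ESP cannot be used as an index register")


def mov_reg_base_index(dst, base, index):
    """Encode MOV dst, [base+index] (opcode 0x8B, MOD 00, R/M 100)."""
    _check(base, index)
    return bytes([0x8B, encode_modrm(0, dst, 0b100), encode_sib(1, index, base)])


def lea_add_scaled(tmp, index, scale):
    """Encode LEA tmp, [tmp+index*scale] (opcode 0x8D)."""
    _check(tmp, index)
    return bytes([0x8D, encode_modrm(0, tmp, 0b100), encode_sib(scale, index, tmp)])


if __name__ == "__main__":
    print(lea_add_scaled(EDI, ESI, 2).hex(" "))
    print(mov_reg_base_index(ECX, EDX, EDI).hex(" "))
```

With EDI as the temporary register, the two calls reproduce the 8D 3C 77 and 8B 0C 3A bytes from the example. A real implementation would also have to deal with the cases this sketch rejects, for instance by falling back to MOD 01 with a zero displacement when the base is EBP.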
ADD/OR/ADC/SBB/AND/SUB/XOR/CMP REG, IMM
(arithmetic or logical operation on register with immediate static value as an argument)
This obfuscation takes into account several different instructions, but they actually have a lot in common. They all have a very similar structure and share only 4 different opcodes. We will also handle any combination of SIB or displacement operands to make this obfuscation pretty universal and applicable in the majority of cases.
80 7C B2 50 FF : CMP BYTE [EDX+ESI*4+0x50], 0xFF
66 83 7C B2 50 88 : CMP WORD [EDX+ESI*4+0x50], -0x78
81 34 32 FF FF 00 00 : XOR DWORD [EDX+ESI], 0xFFFF
81 02 FF FF 00 00 : ADD DWORD [EDX], 0xFFFF
81 EA 88 00 00 00 : SUB EDX, 0x88
The opcodes for these instructions are 0x80, 0x81, 0x82 and 0x83. What makes them different from the previously obfuscated ones is that the type of the operation (XOR etc.) is encoded in the REG field of the MOD-R/M operand.
What we want to do is extract the immediate value, put it into the calculations generator and then modify the instruction, preserving the prefix, SIB and displacement, so that only the immediate value is replaced with a source register. For example, CMP [EDX+ESI*4+0x50], -0x78 will become CMP [EDX+ESI*4+0x50], EDI. The only things that change are the opcode and the REG field of the MOD-R/M operand, and the immediate bytes are dropped.
In order to see how the opcode transformation works, take a look at the excerpt from the obfuscator's source code:
# opcode - current instruction opcode
# mode - REG field from MOD-R/M field indicating type of operation (ADD, CMP, XOR etc.)
# nopcode - new opcode after transformation
if mode == 0: # add
    if opcode in [0x80, 0x82]:
        nopcode = 0x00
    elif opcode in [0x81, 0x83]:
        nopcode = 0x01
elif mode == 1: # or
    if opcode in [0x80, 0x82]:
        nopcode = 0x08
    elif opcode in [0x81, 0x83]:
        nopcode = 0x09
elif mode == 2: # adc
    if opcode in [0x80, 0x82]:
        nopcode = 0x10
    elif opcode in [0x81, 0x83]:
        nopcode = 0x11
elif mode == 3: # sbb
    if opcode in [0x80, 0x82]:
        nopcode = 0x18
    elif opcode in [0x81, 0x83]:
        nopcode = 0x19
elif mode == 4: # and
    if opcode in [0x80, 0x82]:
        nopcode = 0x20
    elif opcode in [0x81, 0x83]:
        nopcode = 0x21
elif mode == 5: # sub
    if opcode in [0x80, 0x82]:
        nopcode = 0x28
    elif opcode in [0x81, 0x83]:
        nopcode = 0x29
elif mode == 6: # xor
    if opcode in [0x80, 0x82]:
        nopcode = 0x30
    elif opcode in [0x81, 0x83]:
        nopcode = 0x31
elif mode == 7: # cmp
    if opcode in [0x80, 0x82]:
        nopcode = 0x38
    elif opcode in [0x81, 0x83]:
        nopcode = 0x39
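The mapping in the excerpt follows a regular pattern: the new opcode is the operation number multiplied by 8, plus 0 for the byte-sized forms (0x80, 0x82) or 1 for the full-sized forms (0x81, 0x83). As an illustration of that observation only, here is a compact equivalent; the function name reg_form_opcode is made up and this is not how the tool itself is written.

```python
# Compact equivalent of the excerpt above, written as a sketch.
def reg_form_opcode(opcode, mode):
    """Map an immediate-form opcode (0x80-0x83) and its REG-field
    operation number (0-7) to the matching 'op r/m, reg' opcode."""
    if opcode not in (0x80, 0x81, 0x82, 0x83):
        raise ValueError("unsupported opcode: 0x%02X" % opcode)
    if not 0 <= mode <= 7:
        raise ValueError("mode must be a 3-bit value")
    # Bit 0 of the opcode separates byte-sized from full-sized operands.
    return (mode << 3) | (opcode & 1)


if __name__ == "__main__":
    names = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]
    for mode, name in enumerate(names):
        print(name, hex(reg_form_opcode(0x80, mode)), hex(reg_form_opcode(0x81, mode)))
```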
After the opcode is changed, we need to replace the REG field in MOD-R/M with the temporary register value that we used in calculations generator, like this:
# treg - temporary register value
# modrm - current MOD-R/M
# nmodrm - new MOD-R/M
nmodrm = (modrm & 0xc7) | (treg << 3) # 0xc7 = 11000111b
Here is an example of successful obfuscation:
83 7C B2 50 EE : CMP DWORD [EDX+ESI*4+0x50], -0x12

50 : PUSH EAX
B8 35 80 50 58 : MOV EAX, 0x58508035
81 C0 73 E7 46 14 : ADD EAX, 0x1446E773
81 E8 B1 4E D2 1D : SUB EAX, 0x1DD24EB1
81 F0 B9 13 B4 DF : XOR EAX, 0xDFB413B9
81 F0 A0 F4 8E 6E : XOR EAX, 0x6E8EF4A0
39 44 B2 50 : CMP [EDX+ESI*4+0x50], EAX
58 : POP EAX
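Putting the opcode change and the REG field replacement together, the rewrite of the instruction itself can be sketched in Python like this. The function name imm_to_reg_form and its parameters are hypothetical; the caller is assumed to pass exactly one complete instruction and to know how many prefix bytes it starts with.

```python
# Illustrative only; not the obfuscator's actual implementation.
def imm_to_reg_form(insn, treg, prefix_len=0):
    """Rewrite 'op r/m, imm' (opcodes 0x80-0x83) as 'op r/m, treg'.

    Prefix, SIB and displacement bytes are kept; the immediate is dropped.
    """
    opcode = insn[prefix_len]
    if opcode not in (0x80, 0x81, 0x82, 0x83):
        raise ValueError("not an immediate-group instruction")
    modrm = insn[prefix_len + 1]
    mode = (modrm >> 3) & 0b111  # operation type (ADD, CMP, XOR etc.)

    # 0x81 carries a full-size immediate (2 bytes under the 0x66 prefix),
    # the other three opcodes carry a single byte.
    if opcode == 0x81:
        imm_len = 2 if 0x66 in insn[:prefix_len] else 4
    else:
        imm_len = 1

    nopcode = (mode << 3) | (opcode & 1)
    nmodrm = (modrm & 0xC7) | (treg << 3)  # 0xC7 = 11000111b
    middle = insn[prefix_len + 2:len(insn) - imm_len]  # SIB + displacement
    return insn[:prefix_len] + bytes([nopcode, nmodrm]) + middle


if __name__ == "__main__":
    original = bytes.fromhex("837CB250EE")       # CMP DWORD [EDX+ESI*4+0x50], -0x12
    print(imm_to_reg_form(original, 0).hex(" "))  # temporary register EAX = 000
```

Two details are worth keeping in mind. With opcode 0x83 the one-byte immediate is sign-extended, so the calculations have to produce 0xFFFFFFEE for the example above, not 0xEE. And for the byte-sized forms (0x80, 0x82) the REG values 100 to 111 select AH, CH, DH and BH, so only EAX, ECX, EDX or EBX make sense as the temporary register there.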
As you may have noticed, this obfuscation obfuscates the logical and arithmetic operations that are spewed out by our calculations generator. That means now you can also obfuscate the obfuscation output. Behold the obfuscception!
I heard you like obfuscation so I put an obfuscator in your obfuscator, so you can obfuscate when you obfuscate
I have added a command-line parameter that tells the tool how many obfuscation passes it should do. Unfortunately Python's limitations start showing after the 3rd obfuscation pass when the process gets exponentially more time consuming. Ideally, one day, I will create this tool in C/C++ for maximum efficiency. This Python implementation should be treated as an experiment or PoC.
Please do send me suggestions on Twitter or in the comments below.
The next part will cover execution flow obfuscation. There is nothing more annoying for reverse engineers than jumps inserted every 1-3 instructions to different parts of the code.
As always the source code is available on GitHub.
Stay tuned for the next episode!
Part 3 is out!