X86 Shellcode Obfuscation - Part 2

Kuba Gretzky

May 19, 2016 • 11 min read

Welcome back to the series where I research the subject of shellcode obfuscation. If you missed the last episode, feel free to catch up by following this link:

X86 Shellcode Obfuscation - Part 1

Last time I created the backbone of the obfuscation tool. The ability to insert and delete instructions, while fixing up the relative jumps between them, made it possible to modify the shellcode freely. That part is done and we don't have to worry about it anymore.

In this episode I will explain a bit about the format of x86 instructions. There is no way around it if this part is to make any sense.

X86 Instruction Format

In the following image you can see the structure of every instruction for Intel processors:

Blocks with instruction format

Prefix (0 to 4 bytes) - here you will find bytes that modify how the instruction operates. There are several different prefix bytes; here are just a few examples:

0x66 : Operand-size override, which switches the operand size from 32-bit to 16-bit. Yes, putting this prefix in front of an instruction is enough to change, for example, MOV EAX, EDX into MOV AX, DX.
0x64 : FS segment override, which lets you access data in the FS memory segment. Useful when you want to access the Thread Information Block of the current thread, for example to retrieve the address of the PEB (MOV EAX, [FS:0x30]).

Opcode (1 to 2 bytes) - this is the main set of instruction bytes that are always present and indicate the main instruction purpose.

Mod-R/M (0 to 1 bytes) - this byte is present only if the instruction takes register or memory operands. There is no way to tell from the opcode bits alone whether the byte follows; only the documentation will provide this answer. Examples:

8B D1       : MOV EDX, ECX    # MOD-R/M is present (0xD1)
90          : NOP             # MOD-R/M not present
52          : PUSH EDX        # MOD-R/M not present

SIB (0 to 1 bytes) - this operand is present only if it is specifically indicated in the Mod-R/M byte. SIB handles any form of advanced memory addressing with relation to register values. Examples:

8B 04 31             : MOV EAX, DWORD [ECX+ESI] (SIB says that ECX is offset by ESI in this instruction)
C7 04 8F BE BA FE CA : MOV [EDI+ECX*4], 0xCAFEBABE (SIB says that EDI is offset by ECX multiplied by 4)

Displacement (0, 1, 2 or 4 bytes) - these bytes are present only when the top bits of Mod-R/M say so. A displacement is present whenever the register operand needs to be offset by a fixed value. It can be combined with the SIB byte. Examples:

8B 84 31 FE CA 00 00 : MOV EAX, DWORD [ECX+ESI+0xCAFE] (offset by the 0xCAFE displacement)
8B 45 F0             : MOV EAX, DWORD [EBP-0x10] (offset by the -0x10 displacement)

Immediate (0, 1, 2 or 4 bytes) - these bytes are present only when the instruction takes an immediate operand. Whether they are present, and how many of them there are, can only be found by reading the documentation. Examples:

C7 04 8F BE BA FE CA : MOV [EDI+ECX*4], 0xCAFEBABE (0xCAFEBABE is a 4-byte immediate)

In later parts of this article, there will be places with embedded 3-bit register values, described as REG. For future reference, here is the table that shows which value maps to which register at each operand size:

REG	Register (8 bits)	Register (16 bits)	Register (32 bits)
000	AL	AX	EAX
001	CL	CX	ECX
010	DL	DX	EDX
011	BL	BX	EBX
100	AH	SP	ESP
101	CH	BP	EBP
110	DH	SI	ESI
111	BH	DI	EDI
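
The table above maps directly onto a small lookup. Here is a minimal sketch of it in Python; the names REGISTERS and reg_name are my own for illustration and are not taken from the tool's source:

```python
# REG field value -> register name, keyed by operand size in bits.
# The list index is the 3-bit REG value from the table above.
REGISTERS = {
    8:  ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"],
    16: ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"],
    32: ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"],
}

def reg_name(reg: int, bits: int = 32) -> str:
    """Return the register name for a 3-bit REG value (hypothetical helper)."""
    return REGISTERS[bits][reg & 0b111]

print(reg_name(0b010))     # EDX
print(reg_name(0b100, 8))  # AH
```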

Believe me, this is a lot of information to process at once and it took me quite some time to organize it all in my head. Let's take it slowly, step by step.

Mod-R/M

This byte contains the information about the presence of SIB operand, presence and size of Displacement operand and also about which register or mode is used with the instruction.

Byte layout of Mod-R/M operand

MOD	Meaning
00	No displacement, register as pointer addressing (e.g. `MOV EAX, [EDX]`)
01	1-byte displacement, register as pointer addressing (e.g. `MOV EAX, [EDX+0x10]`)
10	4-byte displacement, register as pointer addressing (e.g. `MOV EAX, [EDX+0xDEADBEEF]`)
11	No displacement, register addressing (e.g. `MOV EAX, EDX`)

REG

Source or destination register used in the instruction. For some opcodes it selects the operation instead of a register. Which one applies depends on the instruction opcode.

R/M

Register affected by SIB and Displacement operands.
If R/M equals 100 the SIB operand is present and register is defined in SIB operand in Base field.
If MOD equals 00 and R/M equals 101 only 32-bit displacement value will be used (e.g. MOV EAX, [0xDEADBEEF])

Here is the full table with various combinations of MOD and R/M bits:

MOD R/M Addressing Mode
=== === ================================
 00 000 [ eax ]
 01 000 [ eax + disp8 ]
 10 000 [ eax + disp32 ]
 11 000 register  ( al / ax / eax )
 00 001 [ ecx ]
 01 001 [ ecx + disp8 ]
 10 001 [ ecx + disp32 ]
 11 001 register  ( cl / cx / ecx )
 00 010 [ edx ]
 01 010 [ edx + disp8 ]
 10 010 [ edx + disp32 ]
 11 010 register  ( dl / dx / edx )
 00 011 [ ebx ]
 01 011 [ ebx + disp8 ]
 10 011 [ ebx + disp32 ]
 11 011 register  ( bl / bx / ebx )
 00 100 SIB Mode
 01 100 SIB + disp8 Mode
 10 100 SIB + disp32 Mode
 11 100 register  ( ah / sp / esp )
 00 101 32-bit Displacement-Only Mode
 01 101 [ ebp + disp8 ]
 10 101 [ ebp + disp32 ]
 11 101 register  ( ch / bp / ebp )
 00 110 [ esi ]
 01 110 [ esi + disp8 ]
 10 110 [ esi + disp32 ]
 11 110 register  ( dh / si / esi )
 00 111 [ edi ]
 01 111 [ edi + disp8 ]
 10 111 [ edi + disp32 ]
 11 111 register  ( bh / di / edi )
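
To make the table easier to play with, here is a minimal sketch that splits a Mod-R/M byte into its fields and names the addressing mode. It assumes the usual layout (MOD in bits 7-6, REG in bits 5-3, R/M in bits 2-0) and 32-bit addressing. The function names are hypothetical and only for illustration:

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_modrm(modrm: int):
    """Split a Mod-R/M byte into its (MOD, REG, R/M) fields."""
    return (modrm >> 6) & 0b11, (modrm >> 3) & 0b111, modrm & 0b111

def addressing_mode(modrm: int) -> str:
    """Describe the addressing mode selected by MOD and R/M (32-bit addressing)."""
    mod, _reg, rm = split_modrm(modrm)
    if mod == 0b11:
        return "register ( %s )" % REG32[rm]
    if rm == 0b100:
        # R/M = 100 never means ESP as a pointer: a SIB byte follows instead.
        return ["SIB", "SIB + disp8", "SIB + disp32"][mod]
    if mod == 0b00 and rm == 0b101:
        # Special case: no register at all, just a 32-bit displacement.
        return "disp32 only"
    suffix = ["", " + disp8", " + disp32"][mod]
    return "[ %s%s ]" % (REG32[rm], suffix)

print(split_modrm(0xD1))      # (3, 2, 1) -> MOV EDX, ECX from the earlier example
print(addressing_mode(0x45))  # [ ebp + disp8 ]
```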

SIB

This byte describes the more advanced addressing modes, such as offsetting the base register with another register multiplied by 1, 2, 4 or 8. It is always present if the R/M value in the MOD-R/M byte equals 100 and MOD is not 11.

Byte layout of SIB operand

Scale	Index*Scale
00	Index*1
01	Index*2
10	Index*4
11	Index*8

Index

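REG value of the index register, which is multiplied by the scale and added as an offset to the Base register described below. The value 100 means that no index register is used, since ESP cannot act as an index.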

Base

REG value of the main register that acts as the base of the addressing mode. If you don't want to use a Base register in the instruction and just want the Index register, the Base bits have to be set to 101 while MOD equals 00. In that case no base register is used and a 32-bit displacement follows the SIB byte (set it to zero if you don't need an offset).
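
Here is a minimal sketch of how the three SIB fields can be pulled apart, assuming the usual layout (Scale in bits 7-6, Index in bits 5-3, Base in bits 2-0). The name split_sib is hypothetical:

```python
def split_sib(sib: int):
    """Split a SIB byte into (scale multiplier, index REG, base REG)."""
    scale = 1 << ((sib >> 6) & 0b11)  # 00 -> 1, 01 -> 2, 10 -> 4, 11 -> 8
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    return scale, index, base

# SIB bytes taken from the examples earlier in the article:
print(split_sib(0x31))  # (1, 6, 1) -> index ESI * 1, base ECX
print(split_sib(0x8F))  # (4, 1, 7) -> index ECX * 4, base EDI
```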

Displacement

This is a 1, 2 or 4 byte representation of the memory offset used in the instruction. A 1-byte displacement is present if the MOD value in the MOD-R/M byte equals 01 and a 4-byte displacement is present if the MOD value equals 10. A 2-byte displacement appears only in 16-bit addressing mode, which in 32-bit code is selected by the 0x67 address-size override prefix (not by the 0x66 operand-size prefix); there MOD bits 10 select a 16-bit displacement.

There is one exception, where the displacement is present even though the MOD bits suggest it shouldn't be. When MOD equals 00 and R/M equals 101 in the MOD-R/M byte, the instruction uses a 4-byte Displacement as a direct memory pointer (e.g. MOV [0xCAFEBABE], EDX).
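
The rules above can be condensed into a few lines. This is only a sketch under the assumption of 32-bit addressing (no address-size prefix), and displacement_size is a name I made up for illustration. It also covers the related case where a SIB byte with Base bits 101 and MOD 00 is followed by a 4-byte displacement:

```python
from typing import Optional

def displacement_size(modrm: int, sib: Optional[int] = None) -> int:
    """Return the displacement length in bytes for 32-bit addressing."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b01:
        return 1
    if mod == 0b10:
        return 4
    if mod == 0b00:
        if rm == 0b101:
            return 4  # displacement-only mode, e.g. MOV [0xCAFEBABE], EDX
        if rm == 0b100 and sib is not None and (sib & 0b111) == 0b101:
            return 4  # SIB without a base register
    return 0          # MOD = 11 (register) or a plain [reg] pointer

print(displacement_size(0x45))        # 1 -> 8B 45 F0 : MOV EAX, [EBP-0x10]
print(displacement_size(0x84, 0x31))  # 4 -> 8B 84 31 FE CA 00 00
```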

Immediate

This value is present only if specific instruction opcode allows it. The size of the immediate value is also controlled by the opcode itself. To find out if this operand is present in any given instruction, one has to look up the opcode reference like this one:
http://ref.x86asm.net/coder32.html

Everything I've written above may seem complicated at first, but the more time you spend analyzing different instructions, the more it clicks. What I do is fire up OllyDbg or IDA, disassemble any executable and just look around, analyzing the machine code of different instructions by pasting the hex values into the calculator and converting them to binary. From the binary data I try to disassemble the instruction by hand.

If you get lost or want to read more on the subject of formatting x86 instructions, check out these links:

Encoding Real x86 Instructions

x86 Instruction Encoding Revealed: Bit Twiddling for Fun and Profit

With that out of the way, we can now focus on the main subject of this article, which is...

Obfuscation

Our goal, as before, is to obfuscate as many instructions as possible that contain static data which can be easily identified by signature detection. The next instruction I decided to focus on is PUSH IMM.

PUSH IMM

(pushes immediate static value onto the stack)

Examples:

68 BE BA FE CA : PUSH 0xCAFEBABE
6A 40          : PUSH 0x40

We can detect these instructions by looking for 0x68 and 0x6A opcodes.

We will reuse the calculations generator that we used with the previous instruction. We will take the immediate value and obfuscate it with several random calculations that store the result in a randomly picked register.
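
For readers who skipped Part 1, here is a rough sketch of what such a calculations generator could look like. This is not the tool's actual code. The function names and the choice of ADD/SUB/XOR with 32-bit immediates are my assumptions, based only on the example listings in this article. The idea is to pick random operations first and then work backwards from the target value to find the initial MOV constant:

```python
import random

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits

def generate_chain(target: int, steps: int = 4):
    """Return (initial, ops) such that applying ops to initial yields target."""
    ops = [(random.choice(("ADD", "SUB", "XOR")), random.getrandbits(32))
           for _ in range(steps)]
    value = target
    # Undo the operations in reverse order to find the starting constant.
    for op, imm in reversed(ops):
        if op == "ADD":
            value = (value - imm) & MASK
        elif op == "SUB":
            value = (value + imm) & MASK
        else:  # XOR is its own inverse
            value ^= imm
    return value, ops

def evaluate_chain(initial: int, ops) -> int:
    """Apply the operations forward, the way the CPU would."""
    value = initial
    for op, imm in ops:
        if op == "ADD":
            value = (value + imm) & MASK
        elif op == "SUB":
            value = (value - imm) & MASK
        else:
            value ^= imm
    return value

initial, ops = generate_chain(0xCAFEBABE)
print("MOV REG, 0x%08X" % initial)
for op, imm in ops:
    print("%s REG, 0x%08X" % (op, imm))
```

Working backwards keeps the generator simple: every operation used here is trivially invertible, so any random sequence can be made to land on the desired value.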

The tricky part is that we need to PUSH the value of our temporary register onto the stack to preserve it, as the shellcode may be using that register for something else at the point where we insert our obfuscation block. The obfuscated flow must also leave the immediate value on the stack, and it has to sit below our preserved register value so that it remains there after the register is restored.

To get this done, we first PUSH an arbitrary register to reserve a stack slot for the result, and only then PUSH the temporary register to preserve it. Once the calculations are finished, and before we POP the temporary register, we MOV [ESP+4], REG, which writes the result directly into the reserved slot. After we POP the preserved register, the result stays on top of the stack, which is exactly what the original instruction would have done.

This is what it looks like after implementation:

Before:

68 BE BA FE CA : PUSH 0xCAFEBABE

After:

54                : PUSH ESP <-- pushing ANY random register to prepare place on the stack
57                : PUSH EDI
BF AE 4A 6F 25    : MOV EDI, 0x256F4AAE
81 F7 50 D0 63 DE : XOR EDI, 0xDE63D050
81 EF 25 51 61 D1 : SUB EDI, 0xD1615125
81 C7 EF F7 27 62 : ADD EDI, 0x6227F7EF
81 F7 76 FB 2D 41 : XOR EDI, 0x412DFB76
89 7C 24 04       : MOV [ESP+0x04], EDI
5F                : POP EDI
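
If you want to convince yourself that the listing above really behaves like PUSH 0xCAFEBABE, here is a small simulation of it. It is only a sketch: the stack is modeled as a Python list whose last element is the top, and run_obfuscated_push is a hypothetical name:

```python
MASK = 0xFFFFFFFF

def run_obfuscated_push(edi_before: int):
    """Simulate the obfuscated block; return (EDI afterwards, stack contents)."""
    stack = []                       # last element is the top of the stack
    stack.append(0)                  # PUSH ESP: reserves a slot, value is irrelevant
    stack.append(edi_before)         # PUSH EDI: preserve the temporary register
    edi = 0x256F4AAE                 # MOV EDI, 0x256F4AAE
    edi ^= 0xDE63D050                # XOR EDI, 0xDE63D050
    edi = (edi - 0xD1615125) & MASK  # SUB EDI, 0xD1615125
    edi = (edi + 0x6227F7EF) & MASK  # ADD EDI, 0x6227F7EF
    edi ^= 0x412DFB76                # XOR EDI, 0x412DFB76
    stack[-2] = edi                  # MOV [ESP+0x04], EDI: fill the reserved slot
    edi = stack.pop()                # POP EDI: restore the preserved value
    return edi, stack

edi, stack = run_obfuscated_push(0x11223344)
print(hex(edi))        # 0x11223344 -> EDI is unchanged
print(hex(stack[-1]))  # 0xcafebabe -> same stack top as the original PUSH
```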

MOV REG, [REG+DISP]

(moves value into register from memory pointer formed by registers with displacement values)

Examples:

64 8B 50 30 : MOV EDX, [FS:EAX+0x30]
8B 52 0C    : MOV EDX, [EDX+0x0C]
8B 52 14    : MOV EDX, [EDX+0x14]
8B 72 28    : MOV ESI, [EDX+0x28]

With this instruction, we want to get rid of the static displacement values, which are very tasty bits for signature generators.

First we need to detect the instructions by looking for 0x8A or 0x8B opcode values. We are interested only in instructions that contain the displacement operand, so we make sure the top 2 bits of the MOD-R/M operand are 01 or 10.
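
The detection step could be sketched like this. I am assuming here that any prefix bytes have already been skipped, so that the offset points at the opcode. The helper name is hypothetical:

```python
MOV_LOAD_OPCODES = (0x8A, 0x8B)  # MOV r8, r/m8 and MOV r32, r/m32

def is_mov_with_displacement(code: bytes, offset: int = 0) -> bool:
    """Check for MOV REG, [..+DISP] at offset (prefixes assumed already skipped)."""
    if len(code) - offset < 2:
        return False
    opcode, modrm = code[offset], code[offset + 1]
    mod = modrm >> 6  # top 2 bits of MOD-R/M
    return opcode in MOV_LOAD_OPCODES and mod in (0b01, 0b10)

print(is_mov_with_displacement(bytes.fromhex("8B520C")))  # True  (MOV EDX, [EDX+0x0C])
print(is_mov_with_displacement(bytes.fromhex("8BD1")))    # False (MOV EDX, ECX)
```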

Having detected the proper instruction for obfuscation, we extract the displacement value by carefully disassembling the instruction by hand, making sure we detect if the displacement is 1, 2 or 4 bytes long. We put the value into our calculations generator and after that, we modify the instruction in such a way that instead of the displacement operand (MOV EDX, [EDX+0x0C]) it takes the temporary register, with our calculations result, as an index (MOV EDX, [EDX+EDI]).

We do this by carefully crafting the new MOV instruction from scratch, with a new SIB byte that holds our temporary register as the index. An issue appears if the original instruction already used a SIB byte with an index register, as we can't put more than one index register into an instruction. That's when we need to insert another instruction that adds the value of the original index register (with the original scale included) to our temporary register holding the calculations result.

You can see exactly what I mean by looking at the following obfuscation example. The following instruction already includes the SIB operand with an index register.

Before:

8B 8C 72 88 00 00 00 : MOV ECX, [EDX+ESI*2+0x88]

After:

57                : PUSH EDI
BF 1C 62 1D 50    : MOV EDI, 0x501D621C
81 F7 1C 55 89 FE : XOR EDI, 0xFE89551C
81 F7 E4 4F FA 3A : XOR EDI, 0x3AFA4FE4
81 C7 16 F8 17 9F : ADD EDI, 0x9F17F816
81 F7 72 70 86 33 : XOR EDI, 0x33867072
8D 3C 77          : LEA EDI, [EDI+ESI*2]
8B 0C 3A          : MOV ECX, [EDX+EDI]
5F                : POP EDI
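
The last two instructions before the POP are the ones crafted from scratch. Here is a minimal sketch of how their bytes could be assembled. It handles only the simple cases and rejects the ones where the encoding rules get in the way (ESP as an index, or EBP as a base with MOD equal to 00). The function names are made up for this example:

```python
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}
ESP, EBP = 0b100, 0b101

def mov_reg_base_index(dst: int, base: int, treg: int) -> bytes:
    """Encode MOV dst, [base+treg] using MOD=00, R/M=100 and a SIB byte."""
    if treg == ESP:
        raise ValueError("ESP cannot be used as an index register")
    if base == EBP:
        raise ValueError("base EBP with MOD=00 means disp32, not handled here")
    modrm = (0b00 << 6) | (dst << 3) | 0b100
    sib = (SCALE_BITS[1] << 6) | (treg << 3) | base
    return bytes([0x8B, modrm, sib])

def lea_add_index(treg: int, index: int, scale: int) -> bytes:
    """Encode LEA treg, [treg+index*scale], folding the original index into treg."""
    if index == ESP:
        raise ValueError("ESP cannot be used as an index register")
    if treg == EBP:
        raise ValueError("base EBP with MOD=00 means disp32, not handled here")
    modrm = (0b00 << 6) | (treg << 3) | 0b100
    sib = (SCALE_BITS[scale] << 6) | (index << 3) | treg
    return bytes([0x8D, modrm, sib])

EDI, ESI, EDX, ECX = 7, 6, 2, 1
print(lea_add_index(EDI, ESI, 2).hex())         # 8d3c77 -> LEA EDI, [EDI+ESI*2]
print(mov_reg_base_index(ECX, EDX, EDI).hex())  # 8b0c3a -> MOV ECX, [EDX+EDI]
```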

ADD/OR/ADC/SBB/AND/SUB/XOR/CMP REG, IMM

(arithmetic or logical operation on register with immediate static value as an argument)

This obfuscation covers several different instructions, but they actually have a lot in common. They all have a very similar structure and share only 4 different opcodes. We will also handle any combination of SIB and displacement bytes to make this obfuscation pretty universal and applicable in the majority of cases.

Examples:

80 7C B2 50 FF          : CMP BYTE [EDX+ESI*4+0x50], 0xFF
66 83 7C B2 50 88       : CMP WORD [EDX+ESI*4+0x50], -0x78
81 34 32 FF FF 00 00    : XOR DWORD [EDX+ESI], 0xFFFF
81 02 FF FF 00 00       : ADD DWORD [EDX], 0xFFFF
81 EA 88 00 00 00       : SUB EDX, 0x88

The opcodes for these instructions are 0x80, 0x81, 0x82 or 0x83. What makes them different from the previously obfuscated ones is that the type of the operation (AND, ADD, XOR etc.) is encoded in the REG field of the MOD-R/M byte.

What we want to do is extract the immediate value, put it into the calculations generator and then modify the instruction, preserving the prefix, SIB and displacement, so that the immediate value is replaced with a source register. For example CMP [EDX+ESI*4+0x50], -0x78 will become CMP [EDX+ESI*4+0x50], EDI. The only things that change are the opcode and the REG field of the MOD-R/M byte, while the immediate bytes are dropped.

In order to see how the opcode transformation works, take a look at the excerpt from the obfuscator's source code:

# opcode - current instruction opcode
# mode - REG field from MOD-R/M field indicating type of operation (ADD, CMP, XOR etc.)
# nopcode - new opcode after transformation

if mode == 0: # add
	if opcode in [0x80, 0x82]:
		nopcode = 0x00
	elif opcode in [0x81, 0x83]:
		nopcode = 0x01
elif mode == 1: # or
	if opcode in [0x80, 0x82]:
		nopcode = 0x08
	elif opcode in [0x81, 0x83]:
		nopcode = 0x09
elif mode == 2: # adc
	if opcode in [0x80, 0x82]:
		nopcode = 0x10
	elif opcode in [0x81, 0x83]:
		nopcode = 0x11
elif mode == 3: # sbb
	if opcode in [0x80, 0x82]:
		nopcode = 0x18
	elif opcode in [0x81, 0x83]:
		nopcode = 0x19
elif mode == 4: # and
	if opcode in [0x80, 0x82]:
		nopcode = 0x20
	elif opcode in [0x81, 0x83]:
		nopcode = 0x21
elif mode == 5: # sub
	if opcode in [0x80, 0x82]:
		nopcode = 0x28
	elif opcode in [0x81, 0x83]:
		nopcode = 0x29
elif mode == 6: # xor
	if opcode in [0x80, 0x82]:
		nopcode = 0x30
	elif opcode in [0x81, 0x83]:
		nopcode = 0x31
elif mode == 7: # cmp
	if opcode in [0x80, 0x82]:
		nopcode = 0x38
	elif opcode in [0x81, 0x83]:
		nopcode = 0x39
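
If you look closely at the values, there is a pattern: the new opcode is just the operation number shifted left by 3 bits, with the lowest bit of the old opcode telling whether the operand is a byte or not. A more compact sketch of the same mapping could look like this (transform_opcode is a hypothetical name, not a function from the tool):

```python
def transform_opcode(opcode: int, mode: int) -> int:
    """Map an 0x80-0x83 opcode and its REG 'mode' to the register-source opcode."""
    if opcode not in (0x80, 0x81, 0x82, 0x83):
        raise ValueError("not an immediate-group opcode: 0x%02X" % opcode)
    # mode << 3 picks the operation (0x00 add, 0x08 or, ... 0x38 cmp);
    # opcode & 1 is 0 for the byte forms (0x80, 0x82) and 1 for 0x81, 0x83.
    return (mode << 3) | (opcode & 1)

print(hex(transform_opcode(0x83, 7)))  # 0x39 -> CMP r/m32, r32
print(hex(transform_opcode(0x80, 6)))  # 0x30 -> XOR r/m8, r8
```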

After the opcode is changed, we need to replace the REG field in MOD-R/M with the temporary register value that we used in calculations generator, like this:

# treg - temporary register value
# modrm - current MOD-R/M
# nmodrm - new MOD-R/M

nmodrm = (modrm & 0xc7) | (treg << 3) # 0xc7 = 11000111b

Here is an example of successful obfuscation:

Before:

83 7C B2 50 EE : CMP DWORD [EDX+ESI*4+0x50], -0x12

After:

50                : PUSH EAX
B8 35 80 50 58    : MOV EAX, 0x58508035
81 C0 73 E7 46 14 : ADD EAX, 0x1446E773
81 E8 B1 4E D2 1D : SUB EAX, 0x1DD24EB1
81 F0 B9 13 B4 DF : XOR EAX, 0xDFB413B9
81 F0 A0 F4 8E 6E : XOR EAX, 0x6E8EF4A0
39 44 B2 50       : CMP [EDX+ESI*4+0x50], EAX
58                : POP EAX
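
To tie the pieces together, here is a rough sketch that rewrites a single instruction of this kind. It assumes there is no prefix (so 0x81 carries a 4-byte immediate and the other opcodes a 1-byte one) and that the input is exactly one instruction. The function name is hypothetical. Note that the 1-byte immediate is sign-extended, which is why -0x12 turns into 0xFFFFFFEE for the calculations generator:

```python
def rewrite_imm_to_reg(insn: bytes, treg: int):
    """Return (new instruction bytes, 32-bit immediate to feed the generator)."""
    opcode, modrm = insn[0], insn[1]
    if opcode not in (0x80, 0x81, 0x82, 0x83):
        raise ValueError("not an immediate-group opcode")
    imm_size = 4 if opcode == 0x81 else 1
    # Sign-extend the immediate to 32 bits, as the CPU does for opcode 0x83.
    imm = int.from_bytes(insn[-imm_size:], "little", signed=True) & 0xFFFFFFFF
    mode = (modrm >> 3) & 0b111              # operation stored in the REG field
    nopcode = (mode << 3) | (opcode & 1)     # same mapping as the if/elif chain
    nmodrm = (modrm & 0xC7) | (treg << 3)    # put the temporary register in REG
    tail = insn[2:-imm_size]                 # SIB and displacement stay untouched
    return bytes([nopcode, nmodrm]) + tail, imm

new_insn, imm = rewrite_imm_to_reg(bytes.fromhex("837CB250EE"), 0)  # treg = EAX
print(new_insn.hex())  # 3944b250 -> CMP [EDX+ESI*4+0x50], EAX
print(hex(imm))        # 0xffffffee
```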

As you may have noticed, this obfuscation obfuscates the logical and arithmetic operations that are spewed out by our calculations generator. That means now you can also obfuscate the obfuscation output. Behold the obfuscception!

I heard you like obfuscation so I put an obfuscator in your obfuscator, so you can obfuscate when you obfuscate

I have added a command-line parameter that tells the tool how many obfuscation passes it should do. Unfortunately Python's limitations start showing after the 3rd obfuscation pass when the process gets exponentially more time consuming. Ideally, one day, I will create this tool in C/C++ for maximum efficiency. This Python implementation should be treated as an experiment or PoC.

Wrapping up

I hope you enjoyed the second part of the obfuscation series. If you liked it and want to be first to know when the next part is out, follow me on Twitter @mrgretzky or Google+.

Please do send me suggestions on Twitter or in the comments below.

The next part will cover execution flow obfuscation. There is nothing more annoying for reverse engineers than jumps to different parts of the code inserted every 1-3 instructions.

Source code

As always the source code is available on GitHub.

Stay tuned for the next episode!

Update:

Part 3 is out!

X86 Shellcode Obfuscation - Part 3

EOF
