Unicorn VS. Malware


It has been a while since my last post, so I figured it’s time to write something. Recently I stumbled on a piece of malware called Ponmocup. It is an interesting strain, but since plenty has been written about it already I won’t go into its details. While analyzing the malware I noticed that all the strings it uses are encrypted and only decrypted at runtime. The decryption loops are inlined all over the code and use various methods to decrypt the strings, whereas other malware uses the same routine/algorithm most of the time, since programmers are lazy (fact!). Normally I would write an IDA script to decrypt them, but this time I chose a different approach: the Unicorn emulator.

String decryption loop

1000B7C1   > \B9 30897B61              MOV ECX,617B8930
1000B7C6   .  898D 90F1FFFF            MOV DWORD PTR SS:[EBP-E70],ECX
1000B7CC   .  33C0                     XOR EAX,EAX
1000B7CE   >  8985 88F1FFFF            MOV DWORD PTR SS:[EBP-E78],EAX
1000B7D4   .  83F8 1D                  CMP EAX,1D
1000B7D7   .  73 1E                    JNB SHORT 1100_300.1000B7F7
1000B7D9   .  81C1 5F6BA82E            ADD ECX,2EA86B5F
1000B7DF   .  898D 90F1FFFF            MOV DWORD PTR SS:[EBP-E70],ECX
1000B7E5   .  33D2                     XOR EDX,EDX
1000B7E7   .  8A1445 14070110          MOV DL,BYTE PTR DS:[EAX*2+10010714]
1000B7EE   .  03D1                     ADD EDX,ECX
1000B7F0   .  885405 C4                MOV BYTE PTR SS:[EBP+EAX-3C],DL
1000B7F4   .  40                       INC EAX
1000B7F5   .^ EB D7                    JMP SHORT 1100_300.1000B7CE

This is a sample of such a decryption loop. It has a pretty simple code flow: some registers are initialized at the beginning, then it enters a loop that is closed by an unconditional jump upwards (the JMP at 1000B7F5) and left through a conditional jump (the JNB at 1000B7D7). The encrypted data is loaded from 10010714, one byte out of every two (EAX*2), and the decrypted bytes are written to the stack starting at EBP-3C.
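To make the arithmetic of this particular loop concrete, here is a minimal pure-Python model of it. The constants come straight from the listing; the `encrypt` helper and the blob it produces are made up for the demo (the real encrypted bytes live in the binary at 10010714), so treat this as a sketch rather than a decryptor for the sample.

```python
MASK32 = 0xFFFFFFFF

def decrypt(blob, length=0x1D, seed=0x617B8930, step=0x2EA86B5F):
    """Mirror the listing: advance the key, then add it to every second byte."""
    key = seed                                   # MOV ECX,617B8930
    out = bytearray()
    for i in range(length):                      # CMP EAX,1D / JNB
        key = (key + step) & MASK32              # ADD ECX,2EA86B5F
        out.append((blob[i * 2] + key) & 0xFF)   # MOV DL,[EAX*2+..] / ADD EDX,ECX / MOV [EBP+EAX-3C],DL
    return bytes(out)

def encrypt(plain, seed=0x617B8930, step=0x2EA86B5F):
    """Hypothetical inverse, only used to build input for the demo."""
    key = seed
    out = bytearray()
    for ch in bytearray(plain):
        key = (key + step) & MASK32
        out.append((ch - key) & 0xFF)
        out.append(0)                            # filler byte, skipped by the EAX*2 indexing
    return bytes(out)

plain = b"%u.%u.%u.%u.%u.%u.%u.%u.%s.%i"         # 0x1D characters
blob = bytearray(encrypt(plain))
print(decrypt(blob).decode("ascii"))
```

Only the low byte of the key ends up in the output, because the loop stores DL and throws the rest of EDX away.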

So the idea is to use the Capstone disassembler library to analyze the decryption loop, starting from a given virtual address and locating the first unconditional jump backwards. Along the way we keep track of memory operands that use EBP as the base register. Whenever such an instruction is found we log its displacement value (the offset into the stack), which helps us locate the decrypted string later.
Once we have traced the code with the disassembler library, we map the target binary into memory at its known imagebase and copy each section to memory, so that the virtual addresses in the binary match up with what’s in memory.
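The reason for copying every section to its own virtual address is that the decryption loops use absolute addresses such as 10010714. A rough stdlib-only sketch of that layout follows; the `(rva, data)` pairs and all values are made up for illustration, and with pefile they would come from `section.VirtualAddress` and `section.get_data()`.

```python
def build_image(imagesize, sections):
    """Lay each section's data out at its RVA inside one flat buffer."""
    image = bytearray(imagesize)
    for rva, data in sections:
        image[rva:rva + len(data)] = data
    return image

def read_va(image, imagebase, va, size):
    """Read `size` bytes at a virtual address, like the emulated code would."""
    off = va - imagebase
    return bytes(image[off:off + size])

# made-up image: a code section and a data section holding 4 marker bytes
imagebase = 0x10000000
sections = [
    (0x1000, b"\x90" * 16),
    (0x10000, b"\x00" * 0x714 + b"\xde\xad\xbe\xef"),
]
image = build_image(0x20000, sections)
marker = read_va(image, imagebase, 0x10010714, 4)
print(marker == b"\xde\xad\xbe\xef")
```

With this layout an instruction like MOV DL,BYTE PTR DS:[EAX*2+10010714] finds its data without any relocation or address translation.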

Code analyzer

from capstone import *
from capstone.x86 import *

# note: verbose, stacksize and format_disasembly() are defined elsewhere in the full script
def code_analyzer(pe, virtualaddress, max_instructions=128):
	# get the raw offset from the virtualaddress
	a_off = pe.get_offset_from_rva(virtualaddress - pe.OPTIONAL_HEADER.ImageBase)
	# init disassembler lib (detail mode is needed for operands and groups)
	caps = Cs(CS_ARCH_X86, CS_MODE_32)
	caps.detail = True
	# init vars
	code_len = 0
	stack_offsets = []
	jmpfound = False
	# disassemble code and analyze the instructions
	for ins in caps.disasm(pe.__data__[a_off:], virtualaddress, max_instructions):
		# increase code_len with current instruction size
		code_len += ins.size
		if verbose:
			print(format_disasembly(ins))
		# process operands
		if ins.operands:
			for ops in ins.operands:
				# memory access operands
				if ops.type == X86_OP_MEM:
					# ebp base register and disp value not 0
					if ops.value.mem.base == X86_REG_EBP and ops.value.mem.disp != 0:
						disp = abs(ops.value.mem.disp)
						# add new disp value
						if disp not in stack_offsets:
							stack_offsets.append(disp)
		# process groups
		if ins.groups:
			# jump types
			if ins.group(CS_GRP_JUMP):
				# JMP backwards; only direct jumps have an immediate target,
				# so check the operand type instead of parsing op_str
				target = ins.operands[0]
				if ins.id == X86_INS_JMP and target.type == X86_OP_IMM and target.imm < ins.address:
					jmpfound = True
					# code_len now covers everything up to and including this JMP
					break
			# return types
			elif ins.group(CS_GRP_RET):
				# end of the function reached without finding the loop
				break
	# false if max instructions reached
	if not jmpfound:
		print("End decryption loop not found")
		return 0,[]

	# paranoid mode
	if len(stack_offsets) == 0:
		print("No stack offsets found")
		return 0,[]

	# make sure every offset fits inside the emulated stack
	for offset in stack_offsets:
		if offset > stacksize:
			print("Stack offset 0x%08x is larger than the stacksize 0x%08x" %(offset, stacksize))
			return 0,[]

	# return code length and stackoffsets sorted descending
	return code_len, sorted(stack_offsets, reverse=True)

This code returns the number of bytes of all instructions up to and including the JMP, and a list of all displacement values where the EBP register was involved, e.g. 0xE78 for MOV DWORD PTR SS:[EBP-E78],EAX.
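The `abs()` in the analyzer is there because these displacements are negative: locals live below EBP. As a small illustration that does not need Capstone, the displacement of the example instruction can be decoded by hand from its bytes in the listing (89 85 88 F1 FF FF).

```python
import struct

# bytes of MOV DWORD PTR SS:[EBP-E78],EAX as shown in the listing
ins = bytes(bytearray.fromhex("898588F1FFFF"))

# 89 = MOV r/m32,r32; ModRM 85 = mod 10, reg 000 (EAX), rm 101 ([EBP+disp32])
modrm = bytearray(ins)[1]
mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7

# the remaining 4 bytes are the little-endian signed displacement
disp = struct.unpack("<i", ins[2:6])[0]
print(hex(disp), hex(abs(disp)))
```

Capstone reports the same signed value in `mem.disp`, and the analyzer stores its absolute value as the distance below EBP.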


from unicorn import *
from unicorn.x86_const import *

# Initialize emulator
emu = Uc(UC_ARCH_X86, UC_MODE_32)

# map memory at the imagebase and copy each section's
# data to its virtualaddress (Unicorn wants base and size 4KB aligned)
emu.mem_map(imagebase, imagesize + stacksize)
for section in pe.sections:
	emu.mem_write(imagebase + section.VirtualAddress, section.get_data())

# initialize stack registers ebp and esp
emu.reg_write(UC_X86_REG_ESP, stackaddress + stacksize)
emu.reg_write(UC_X86_REG_EBP, stackaddress + stacksize)

# start emulator
emu.emu_start(virtualaddress, virtualaddress + code_len)

# use the largest stack_offset value to define the min.
# amount of stack data to read
ebp_addr = stackaddress + stacksize - stack_offsets[0]

# read stack memory, largest stack_offset as size
data = emu.mem_read(ebp_addr, stack_offsets[0])

This snippet maps the target binary into memory as explained earlier, sets up some stack memory and the stack registers, and then starts the emulator. Once the emulator is done, it reads the stack memory and processes it by trying to locate strings at the known displacement offsets.
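The string lookup itself is not shown here, so the following is only a sketch of how it could work, not the code from the full script. It assumes `data` starts at EBP minus the largest offset, exactly as read above, so the byte for [EBP - disp] sits at index `stack_offsets[0] - disp`. The helper names, the minimum length and the fake stack dump are all made up.

```python
def string_at(data, max_offset, disp):
    """Return the printable ASCII run that starts at [EBP - disp]."""
    start = max_offset - disp
    end = start
    while end < len(data) and 0x20 <= data[end] < 0x7F:
        end += 1
    return bytes(data[start:end]).decode("ascii")

def report(data, stack_offsets, min_len=4):
    """Collect (offset, type, length, content) rows for every offset holding a string."""
    max_offset = stack_offsets[0]          # list is sorted descending
    rows = []
    for disp in stack_offsets:
        text = string_at(data, max_offset, disp)
        if len(text) >= min_len:
            rows.append((disp, "ASCII", len(text), text))
    return rows

# fake stack dump: two zeroed loop variables and one decrypted string at EBP-0x3C
stack_offsets = [0xE78, 0xE70, 0x3C]
data = bytearray(0xE78)
data[0xE78 - 0x3C:0xE78 - 0x3C + 15] = b"/images2/%s.swf"

print("offset   type   length   content")
for disp, kind, length, text in report(data, stack_offsets):
    print("%06x   %-5s  %6d   %s" % (disp, kind, length, text))
```

Offsets that only hold loop counters or key values fall out through the minimum length check. A UTF-16 variant would step two bytes at a time and expect a zero high byte.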

Some results

C:\>pomno_decrstr.py 1100.3002.dll 0x1000b7c1
1000B7C1  B930897B61            mov ecx, 0x617b8930
1000B7C6  898D90F1FFFF          mov dword ptr [ebp - 0xe70], ecx
1000B7CC  33C0                  xor eax, eax
1000B7CE  898588F1FFFF          mov dword ptr [ebp - 0xe78], eax
1000B7D4  83F81D                cmp eax, 0x1d
1000B7D7  731E                  jae 0x1000b7f7
1000B7D9  81C15F6BA82E          add ecx, 0x2ea86b5f
1000B7DF  898D90F1FFFF          mov dword ptr [ebp - 0xe70], ecx
1000B7E5  33D2                  xor edx, edx
1000B7E7  8A144514070110        mov dl, byte ptr [eax*2 + 0x10010714]
1000B7EE  03D1                  add edx, ecx
1000B7F0  885405C4              mov byte ptr [ebp + eax - 0x3c], dl
1000B7F4  40                    inc eax
1000B7F5  EBD7                  jmp 0x1000b7ce
offset   type   length   content
00003c   ASCII      29   %u.%u.%u.%u.%u.%u.%u.%u.%s.%i

C:\>pomno_decrstr.py 1100.3002.dll 0x1000b547
1000B547  BA0F000000            mov edx, 0xf
1000B54C  899520E8FFFF          mov dword ptr [ebp - 0x17e0], edx
1000B552  B8EB000000            mov eax, 0xeb
1000B557  898524E8FFFF          mov dword ptr [ebp - 0x17dc], eax
1000B55D  B9E5000000            mov ecx, 0xe5
1000B562  898D28E8FFFF          mov dword ptr [ebp - 0x17d8], ecx
1000B568  898D2CE8FFFF          mov dword ptr [ebp - 0x17d4], ecx
1000B56E  C78530E8FFFF19000000  mov dword ptr [ebp - 0x17d0], 0x19
1000B578  C78534E8FFFF23000000  mov dword ptr [ebp - 0x17cc], 0x23
1000B582  C78538E8FFFFE1000000  mov dword ptr [ebp - 0x17c8], 0xe1
1000B58C  89853CE8FFFF          mov dword ptr [ebp - 0x17c4], eax
1000B592  C78540E8FFFFEA000000  mov dword ptr [ebp - 0x17c0], 0xea
1000B59C  898544E8FFFF          mov dword ptr [ebp - 0x17bc], eax
1000B5A2  C78548E8FFFFA1000000  mov dword ptr [ebp - 0x17b8], 0xa1
1000B5AC  89854CE8FFFF          mov dword ptr [ebp - 0x17b4], eax
1000B5B2  C78550E8FFFFE6000000  mov dword ptr [ebp - 0x17b0], 0xe6
1000B5BC  C78554E8FFFF2F000000  mov dword ptr [ebp - 0x17ac], 0x2f
1000B5C6  C78558E8FFFFF3000000  mov dword ptr [ebp - 0x17a8], 0xf3
1000B5D0  C7855CE8FFFFDE000000  mov dword ptr [ebp - 0x17a4], 0xde
1000B5DA  B848840000            mov eax, 0x8448
1000B5DF  898560E8FFFF          mov dword ptr [ebp - 0x17a0], eax
1000B5E5  32C9                  xor cl, cl
1000B5E7  888D9EE8FFFF          mov byte ptr [ebp - 0x1762], cl
1000B5ED  899518E8FFFF          mov dword ptr [ebp - 0x17e8], edx
1000B5F3  3ACA                  cmp cl, dl
1000B5F5  732F                  jae 0x1000b626
1000B5F7  0FB7C0                movzx eax, ax
1000B5FA  8BF0                  mov esi, eax
1000B5FC  C1EE04                shr esi, 4
1000B5FF  C1E00C                shl eax, 0xc
1000B602  0BC6                  or eax, esi
1000B604  898560E8FFFF          mov dword ptr [ebp - 0x17a0], eax
1000B60A  0FB6F1                movzx esi, cl
1000B60D  33DB                  xor ebx, ebx
1000B60F  8A9CB524E8FFFF        mov bl, byte ptr [ebp + esi*4 - 0x17dc]
1000B616  03D8                  add ebx, eax
1000B618  885C35D4              mov byte ptr [ebp + esi - 0x2c], bl
1000B61C  FEC1                  inc cl
1000B61E  888D9EE8FFFF          mov byte ptr [ebp - 0x1762], cl
1000B624  EBCD                  jmp 0x1000b5f3
offset   type   length   content
00002c   ASCII      15   /images2/%s.swf

C:\>pomno_decrstr.py 1100.3002.dll 0x1000b20d
1000B20D  C685A7E8FFFF1C        mov byte ptr [ebp - 0x1759], 0x1c
1000B214  33C9                  xor ecx, ecx
1000B216  898D74E8FFFF          mov dword ptr [ebp - 0x178c], ecx
1000B21C  BF3F000000            mov edi, 0x3f
1000B221  89BD14E8FFFF          mov dword ptr [ebp - 0x17ec], edi
1000B227  0FBFC7                movsx eax, di
1000B22A  3BC8                  cmp ecx, eax
1000B22C  7D36                  jge 0x1000b264
1000B22E  0FB685A7E8FFFF        movzx eax, byte ptr [ebp - 0x1759]
1000B235  8BD0                  mov edx, eax
1000B237  C1EA02                shr edx, 2
1000B23A  C1E006                shl eax, 6
1000B23D  33D0                  xor edx, eax
1000B23F  33C0                  xor eax, eax
1000B241  8AC2                  mov al, dl
1000B243  8885A7E8FFFF          mov byte ptr [ebp - 0x1759], al
1000B249  33D2                  xor edx, edx
1000B24B  8A144D60C40010        mov dl, byte ptr [ecx*2 + 0x1000c460]
1000B252  2BD0                  sub edx, eax
1000B254  88940D44FFFFFF        mov byte ptr [ebp + ecx - 0xbc], dl
1000B25B  41                    inc ecx
1000B25C  898D74E8FFFF          mov dword ptr [ebp - 0x178c], ecx
1000B262  EBC3                  jmp 0x1000b227
offset   type   length   content
0000bc   ASCII      62   %u&%04X&%02X&%u.%u&%u&%s&%s&%u.%u&%u&%x.%x.%x&%s&%04x.%04x&%s&

C:\>pomno_decrstr.py 1100.3002.dll 0x1000afe5
1000AFE5  B9B8690000            mov ecx, 0x69b8
1000AFEA  894D84                mov dword ptr [ebp - 0x7c], ecx
1000AFED  32C0                  xor al, al
1000AFEF  8845A7                mov byte ptr [ebp - 0x59], al
1000AFF2  3C39                  cmp al, 0x39
1000AFF4  7326                  jae 0x1000b01c
1000AFF6  0FB7C9                movzx ecx, cx
1000AFF9  8BD1                  mov edx, ecx
1000AFFB  C1EA0E                shr edx, 0xe
1000AFFE  C1E102                shl ecx, 2
1000B001  0BCA                  or ecx, edx
1000B003  894D84                mov dword ptr [ebp - 0x7c], ecx
1000B006  0FB6F0                movzx esi, al
1000B009  33D2                  xor edx, edx
1000B00B  8A14B530060110        mov dl, byte ptr [esi*4 + 0x10010630]
1000B012  03D1                  add edx, ecx
1000B014  885435A8              mov byte ptr [ebp + esi - 0x58], dl
1000B018  FEC0                  inc al
1000B01A  EBD3                  jmp 0x1000afef
offset   type   length   content
000058   ASCII      57   Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)

C:\>pomno_decrstr.py 1100.3002.dll 0x1000ae47
1000AE47  B8BF1293F9            mov eax, 0xf99312bf
1000AE4C  8945AC                mov dword ptr [ebp - 0x54], eax
1000AE4F  32C9                  xor cl, cl
1000AE51  884DBE                mov byte ptr [ebp - 0x42], cl
1000AE54  33D2                  xor edx, edx
1000AE56  8A150C060110          mov dl, byte ptr [0x1001060c]
1000AE5C  81F2BF000000          xor edx, 0xbf
1000AE62  8855BF                mov byte ptr [ebp - 0x41], dl
1000AE65  3ACA                  cmp cl, dl
1000AE67  7329                  jae 0x1000ae92
1000AE69  8BD0                  mov edx, eax
1000AE6B  C1EA19                shr edx, 0x19
1000AE6E  C1E007                shl eax, 7
1000AE71  0BC2                  or eax, edx
1000AE73  8945AC                mov dword ptr [ebp - 0x54], eax
1000AE76  0FB6F9                movzx edi, cl
1000AE79  33D2                  xor edx, edx
1000AE7B  8A147D0E060110        mov dl, byte ptr [edi*2 + 0x1001060e]
1000AE82  33D0                  xor edx, eax
1000AE84  88543DC0              mov byte ptr [ebp + edi - 0x40], dl
1000AE88  FEC1                  inc cl
1000AE8A  884DBE                mov byte ptr [ebp - 0x42], cl
1000AE8D  8A55BF                mov dl, byte ptr [ebp - 0x41]
1000AE90  EBD3                  jmp 0x1000ae65
offset   type   length   content
000040   ASCII      16   jAhX4n4xQfx8p9P3

I couldn’t find a Unicode sample, but those are handled as well (don’t mind my string lookup code, it sucks, I know).

I fell in love with the Unicorn emulator library. I tried a few others in the past and this one is by far the best out there currently. This malware sample was a simple example of how such an emulator can be used in a rather simple way to reach your goals.

The full script can be found here

