Synopsis
To get the flag of course ;-). More seriously, this post presents one way to solve the third challenge of Flare-On 2017. The solution makes extensive use of Python to statically extract the flag.
Setup
We start by opening the file in IDA and jumping to the entry point. The entry point simply calls sub_401008.
Screen Shot 1: Entry Point
Following this lead, open the function and take a look around. Voila! Scrolling towards the bottom we see text that looks like a success message. If you're new to reverse engineering CTFs, the standard process is to find where the success path indicator is and work your way back from there.
Screen Shot 2: Success Message
Walking backwards from this point, we find a comparison check at 0x40105E, where eax is compared to the value 0xFB5E.
Screen Shot 3: EAX Comparison Check
If eax matches, the program proceeds to execute the success branch. So how do we get this to match?
Taking a look inside sub_4011E6, we see a bunch of mov, add, shl, and shr instructions, which can be indicative of some form of obfuscation routine. Let's take a look at the parameters passed to the function.
.text:00401047 mov eax, offset loc_40107C
.text:0040104C mov [ebp+var_C], eax
.text:0040104F push 79h
.text:00401051 push [ebp+var_C]
.text:00401054 call sub_4011E6
Code Snippet 1: sub_4011E6 Function Call
Reading this from the bottom, the first parameter is what was stored in eax and the second parameter is the value 0x79. Notice that eax contains the offset 0x40107C. The second parameter, 0x79, looks suspiciously like a length parameter, and further inspection of sub_4011E6 confirms this. Thus, the snippet of assembly code below is what is going to be modified by this program.
Screen Shot 4: Obfuscated Assembly Code
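To make the argument order concrete, here is a toy model of the stack around the call at 0x401054. It is only a sketch: the return address assumes the call instruction is the usual 5-byte form, and the helper names are made up for illustration. It also shows why the two arguments later sit at esp+4 and esp+8 once execution is inside sub_4011E6.

```python
# Toy model of the 32-bit stack for the call at 0x401054 (illustrative only).
stack = []  # the top of the stack is the end of the list


def push(value):
    """Model a 4-byte push."""
    stack.append(value)


push(0x79)       # push 79h          -> second argument (length)
push(0x40107C)   # push [ebp+var_C]  -> first argument (buffer address)
push(0x401059)   # call pushes the return address (assumed 5-byte call)


def read_dword(esp_offset):
    """Read the dword at [esp + esp_offset] in this toy model."""
    return stack[-1 - esp_offset // 4]


first_arg = read_dword(4)    # buffer address
second_arg = read_dword(8)   # length
print(hex(first_arg), hex(second_arg))
```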
No user-controlled value is passed into this de-obfuscation routine, yet there presumably needs to be some form of user input, so let's keep looking.
Going to the next code block, we see that the same section of assembly code is modified with xor and add instructions before it is passed to sub_4011E6.
Screen Shot 5: First De-Obfuscation Routine
The xor value is stored in dl, and it is moved into dl from [ebp+buf]. This looks like it might be our user input. Tracing this further up the function, we see that it is passed as a parameter into sub_401121.
Screen Shot 6: sub_401121 Function Call
Taking a quick peek inside that function, we see that [ebp+buf] is set at 0x4011BC.
Screen Shot 7: Call to recv
The rest of the function just spins up a server which waits to receive this input.
That looks like all we need!
Summarizing:
- Found success text at: 0x4010FE
- Comparison check at: 0x40105E
  - Must match 0xFB5E to take the success route
- Obfuscated code starting point: 0x40107C
- Length of obfuscated code: 0x79
- Second level de-obfuscation function called at: 0x401054
- First round of de-obfuscation from 0x401029 to 0x401045
- User input happens in the function called at: 0x401015
- User input is from the network
- Length of the input buffer is 0x4
Solving
Before we can start scripting our solution, we need to extract the obfuscated bytes.
Here we extract the bytes using IDAPython:
import idaapi

# Dump the 0x79 obfuscated bytes starting at 0x40107C to disk
with open('greek_to_me_buffer.asm', 'wb') as f:
    f.write(idaapi.get_many_bytes(0x40107C, 0x79))
Code Snippet 2: Copy Bytes Out Using IDA
Now we can move onto the fun part. Scripting!
User Input
We know the length of the values received from the network is 4 bytes, but the astute reader will have noticed that the value of buf is moved into dl and not edx, so the actual range of values is from 0x0 to 0xff (dl is one byte in size). The start of our script will look something like this.
for buf in xrange(0x100):
    print("Using {0}".format(buf))
Code Snippet 3: Loop Through Possible Input Values
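As a quick sanity check on why 256 candidates are enough, the sketch below models the one-byte read. It assumes the byte that lands in dl is the first byte of the received buffer; key_from_input is a hypothetical helper, not something from the binary. The post's own snippets are Python 2, while this sketch is written for Python 3.

```python
def key_from_input(received):
    """Model a one-byte read from the start of the receive buffer.

    Assumption: the byte loaded into dl is the first byte of buf.
    """
    return received[0]


# Two different 4-byte inputs that share a first byte give the same key ...
same_key = key_from_input(b"\xa2AAA") == key_from_input(b"\xa2ZZZ")

# ... so the search space collapses from 2**32 inputs to 0x100 keys.
distinct_keys = {key_from_input(bytes([b, 0, 0, 0])) for b in range(0x100)}
print(same_key, len(distinct_keys))
```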
Brute Force
Next we need to modify the bytes extracted to disk to get the comparison check to pass. This involves passing them through the two de-obfuscation routines.
De-obfuscation Step 1
For the first de-obfuscation routine (0x401039) we can easily write the "decoder" in Python.
# Variable to store the bytes written to disk using IDA
asm = None
# Store the output from the first de-obfuscation routine
b2 = []
# Read in bytes written to file from IDA
with open('greek_to_me_buffer.asm', 'rb') as f:
    asm = f.read()
# Re-implement loc_401039
dl = buf
for byte in asm:
    bl = ord(byte)
    bl = bl ^ dl
    bl = bl & 0xff
    bl = bl + 0x22
    bl = bl & 0xff
    b2.append(bl)
Code Snippet 4: First De-Obfuscation Routine
Remember this should be kept inside the for loop block.
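For readers on Python 3, the same xor-then-add transform can be sketched as a small function. This is a minimal sketch rather than the script used in this post; the function names are my own, and the inverse is only there to show that the transform round-trips.

```python
def deobfuscate_step1(data, key):
    """XOR each byte with the one-byte key, then add 0x22 modulo 256."""
    return bytes(((b ^ key) + 0x22) & 0xFF for b in data)


def obfuscate_step1(data, key):
    """Inverse of deobfuscate_step1, handy for building test inputs."""
    return bytes(((b - 0x22) & 0xFF) ^ key for b in data)


sample = deobfuscate_step1(b"\x00\xff", 0x01)
print(sample.hex())
```

Because both steps work on single bytes, masking with 0xFF after the addition is all that is needed to mimic the 8-bit register.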
De-obfuscation Step 2
With Angr's[1] help we will work our way through the second de-obfuscation routine (0x4011E6). It kind of feels like cheating, but who wants to re-write the routine in Python or C!
Declare an angr project instance just prior to the for loop, so that it won't be re-created each time the for loop is executed.
p = angr.Project('greek_to_me.exe', load_options={'auto_load_libs': False})
Code Snippet 5: Angr Project
Set up Angr to simulate sub_4011E6. Unlike the project instance, this part needs to be inside our for loop.
# Set up angr to "run" sub_4011E6
s = p.factory.blank_state(addr=0x4011E6)
s.mem[s.regs.esp+4:].dword = 1 # Angr memory location to hold the xor'ed and add'ed bytes
s.mem[s.regs.esp+8:].dword = 0x79 # Length of ASM
# Copy bytes output from loc_401039 into address 0x1 so Angr can run it
asm = ''.join(map(lambda x: chr(x), b2))
s.memory.store(1, s.se.BVV(int(asm.encode('hex'), 16), 0x79 * 8 ))
# Create a simulation manager...
simgr = p.factory.simulation_manager(s)
# Tell Angr where to go, though there is only one way through this function,
# we just need to stop after ax is set
simgr.explore(find=0x401268)
Code Snippet 6: Angr Simulating Function
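The int(asm.encode('hex'), 16) trick above is Python 2 only. A rough Python 3 equivalent for packing the byte list into one big integer plus a bit width (the two values a bitvector constructor needs) might look like the sketch below; pack_for_bitvector is a name I made up for illustration.

```python
def pack_for_bitvector(values):
    """Pack a list of byte values into (integer, width_in_bits).

    Big-endian packing keeps the first byte in the most significant
    position, matching what hex-encoding the string and parsing it does.
    """
    data = bytes(values)
    return int.from_bytes(data, "big"), len(data) * 8


value, width = pack_for_bitvector([0xB3, 0x65, 0x88])
print(hex(value), width)
```

The big-endian order matters: stored that way, the first de-obfuscated byte ends up at the lowest address, which is the order the routine reads the buffer in.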
While I realize that using Angr here might be overkill, it was the newest tool in my belt so all problems had to be solved using Angr in some fashion ;-)
Input Validation Check
Next we need to check the output of ax to see if it matches 0xfb5e.
# Once ax is set, check to see if the value in ax matches the comparison value
for found in simgr.found:
    ax = found.state.solver.eval(found.state.regs.ax)
    print(hex(ax))
    # Comparison check (compare integers rather than hex strings)
    if ax == 0xfb5e:
        # Will cover what to do here in the next section
        pass
Code Snippet 7: Check Angr ax
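Stripped of the Angr plumbing, the overall brute force has the shape sketched below. To keep it runnable on its own, the checksum is passed in as a callable, and toy_checksum is a stand-in I invented for the demo; it is NOT a re-implementation of sub_4011E6. The sketch is Python 3 and collects every matching key, since a 16-bit check could in principle match more than one candidate.

```python
def find_keys(data, checksum, target=0xFB5E):
    """Try every one-byte key and return all (key, candidate) matches."""
    matches = []
    for key in range(0x100):
        # First de-obfuscation round: xor with the key, then add 0x22
        candidate = bytes(((b ^ key) + 0x22) & 0xFF for b in data)
        # Second round: only the low 16 bits (ax) are compared
        if checksum(candidate) & 0xFFFF == target:
            matches.append((key, candidate))
    return matches


def toy_checksum(buf):
    """Hypothetical stand-in for sub_4011E6, used only for this demo."""
    return sum(buf) & 0xFFFF


# Build a demo input whose correct key is 0x5A
plain = bytes([1, 2, 3])
obfuscated = bytes(((b - 0x22) & 0xFF) ^ 0x5A for b in plain)
matches = find_keys(obfuscated, toy_checksum, target=toy_checksum(plain))
print(matches)
```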
Result
De-Obfuscated Code
Now that we have a match, we end up with this when printing it to the screen.
�e�]��E�t�_�U��E�t�E�u�U��E�b�E�r�E�u�E�t�]߈U��E�f�E�o�E�r�E�c�]��E�@�E�f�E�l�E�a�E�r�]��E�-�E�o�E�n�E�.�E�c�E�o�E�m�E�
Code Snippet 8: Illegible Bytes
My guess is that this is machine code. Using Capstone, we can disassemble it.
from capstone import *
# code holds the de-obfuscated bytes for the matching input (the string built from b2)
md = Cs(CS_ARCH_X86, CS_MODE_32)
for i in md.disasm(code, 0x1000):
    print("0x%x\t%s\t%s" %(i.address, i.mnemonic, i.op_str))
Code Snippet 9: Disassemble Code
Running our script again, we confirm that it is assembly code and the values being added to the buffer appear to be ASCII characters.
0x1000 mov bl, 0x65 None
0x1002 mov byte ptr [ebp - 0x2b], bl
0x1005 mov byte ptr [ebp - 0x2a], 0x74
0x1009 mov dl, 0x5f None
0x100b mov byte ptr [ebp - 0x29], dl
0x100e mov byte ptr [ebp - 0x28], 0x74
0x1012 mov byte ptr [ebp - 0x27], 0x75
0x1016 mov byte ptr [ebp - 0x26], dl
0x1019 mov byte ptr [ebp - 0x25], 0x62
0x101d mov byte ptr [ebp - 0x24], 0x72
0x1021 mov byte ptr [ebp - 0x23], 0x75
0x1025 mov byte ptr [ebp - 0x22], 0x74
0x1029 mov byte ptr [ebp - 0x21], bl
0x102c mov byte ptr [ebp - 0x20], dl
0x102f mov byte ptr [ebp - 0x1f], 0x66
0x1033 mov byte ptr [ebp - 0x1e], 0x6f
0x1037 mov byte ptr [ebp - 0x1d], 0x72
0x103b mov byte ptr [ebp - 0x1c], 0x63
0x103f mov byte ptr [ebp - 0x1b], bl
0x1042 mov byte ptr [ebp - 0x1a], 0x40
0x1046 mov byte ptr [ebp - 0x19], 0x66
0x104a mov byte ptr [ebp - 0x18], 0x6c
0x104e mov byte ptr [ebp - 0x17], 0x61
0x1052 mov byte ptr [ebp - 0x16], 0x72
0x1056 mov byte ptr [ebp - 0x15], bl
0x1059 mov byte ptr [ebp - 0x14], 0x2d
0x105d mov byte ptr [ebp - 0x13], 0x6f
0x1061 mov byte ptr [ebp - 0x12], 0x6e
0x1065 mov byte ptr [ebp - 0x11], 0x2e
0x1069 mov byte ptr [ebp - 0x10], 0x63
0x106d mov byte ptr [ebp - 0xf], 0x6f
0x1071 mov byte ptr [ebp - 0xe], 0x6d
0x1075 mov byte ptr [ebp - 0xd], 0
Code Snippet 10: Disassembled Code
Embedded ASCII
While we could do the hex to ASCII mapping manually, why don't we get the script to do the work for us? Modifying the for loop we get:
bl = None
dl = None
flag = []
# Using capstone, interpret the ASM
from capstone import *
md = Cs(CS_ARCH_X86, CS_MODE_32)
for i in md.disasm(code, 0x1000):
    flag_char = None
    # Split "destination, source"; the source is a register name or a hex immediate
    dest, src = [op.strip() for op in i.op_str.split(',')]
    if dest == 'bl':
        bl = chr(int(src, 16))
    elif dest == 'dl':
        dl = chr(int(src, 16))
    elif dest.startswith('byte ptr'):
        # The stored byte is either a previously loaded register or an immediate
        if src == 'bl':
            flag_char = bl
        elif src == 'dl':
            flag_char = dl
        else:
            flag_char = chr(int(src, 16))
    if flag_char:
        flag.append(flag_char.strip())
    print("0x%x\t%s\t%s\t%s" %(i.address, i.mnemonic, i.op_str, flag_char))
print(''.join(flag))
Code Snippet 11: Code to Print the Flag
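If Capstone is not at hand, the same idea also works directly on the text of the listing. The sketch below is Python 3 and is not the script used in this post: it parses a short excerpt of the disassembly with a regular expression, tracks bl and dl, and orders the stored bytes by their ebp offset instead of by instruction order. recover_string and LISTING are names I made up.

```python
import re

# First few instructions from the disassembled output
LISTING = """\
mov bl, 0x65
mov byte ptr [ebp - 0x2b], bl
mov byte ptr [ebp - 0x2a], 0x74
mov dl, 0x5f
mov byte ptr [ebp - 0x29], dl
mov byte ptr [ebp - 0x28], 0x74
mov byte ptr [ebp - 0x27], 0x75
"""


def recover_string(listing):
    """Rebuild the string written to the stack by a run of mov instructions."""
    regs = {}          # last value loaded into bl / dl
    stack_bytes = {}   # ebp-relative offset -> byte value
    for line in listing.splitlines():
        m = re.match(r"mov\s+(.+?),\s*(\S+)$", line.strip())
        if not m:
            continue
        dest, src = m.group(1), m.group(2)
        value = regs[src] if src in ("bl", "dl") else int(src, 16)
        if dest in ("bl", "dl"):
            regs[dest] = value
            continue
        slot = re.search(r"\[ebp - (0x[0-9a-f]+)\]", dest)
        if slot:
            stack_bytes[-int(slot.group(1), 16)] = value
    # Lowest address first; stop at the terminating NUL if present
    chars = []
    for offset in sorted(stack_bytes):
        if stack_bytes[offset] == 0:
            break
        chars.append(chr(stack_bytes[offset]))
    return "".join(chars)


print(recover_string(LISTING))
```

Keying on the stack offset makes the result independent of the order in which the compiler emitted the stores, at the cost of a slightly longer parser.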
Finally, running the script we get the flag!
et_tu_brute_force@flare-on.com
Code Snippet 12: Flag
Conclusion
Overall this was a fun challenge to solve statically while getting a chance to use Angr and Capstone for the first time in a CTF. Originally I had solved this challenge using Angr 6; however, the script here targets Angr 7 as Angr was updated in the middle of the CTF. I'll update this post if I go back and solve it using the unicorn-engine versus Angr.
The complete script can be found here [2].
Below is an Asciinema[3] recording for those that just want to see the script in action; though it is a lot more fun to run it yourself!