Flare-On 2017: Challenge 3

Synopsis

To get the flag, of course ;-). More seriously, this post presents one way to solve the third challenge of Flare-On 2017. The solution makes extensive use of Python to extract the flag statically.

Setup

We start by opening the file in IDA and jumping to the entry point. The entry point simply calls sub_401008.


Screen Shot 1: Entry Point

Following this lead, open the function and take a look around. Voila! Scrolling towards the bottom we see text that looks like a success message. If you're new to reverse engineering CTFs, the standard process is to find where the success path indicator is and work your way back from there.


Screen Shot 2: Success Message

Walking backwards from this point, we find a comparison check at 0x40105E, where eax is compared to the value 0xFB5E.


Screen Shot 3: EAX Comparison Check

If eax matches, then the program will proceed to execute the success branch. So how do we get this to match?

Taking a look inside sub_4011E6 we see a bunch of mov, add, shl, and shr instructions which can be indicative of some form of obfuscation routine. Let's take a look at the parameters passed to the function.

.text:00401047 mov     eax, offset loc_40107C
.text:0040104C mov     [ebp+var_C], eax
.text:0040104F push    79h
.text:00401051 push    [ebp+var_C]
.text:00401054 call    sub_4011E6

Code Snippet 1: sub_4011E6 Function Call

Reading this from the bottom, the first parameter is what was stored in eax and the second parameter is the value 0x79. Notice that eax holds the address 0x40107C. The second parameter, 0x79, looks suspiciously like a length parameter, and further inspection of sub_4011E6 confirms this. Thus, the snippet of assembly code below is what is going to be modified by this program.

Screen Shot 4: Obfuscated Assembly Code
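The parameter order read from Code Snippet 1 can be sanity-checked with a tiny stack model. This is only a sketch written for Python 3: `ESP_START` and `simulate_call` are names made up for illustration, and the return address is simply the call address plus the 5 bytes of a near call.

```python
# Minimal model of the x86 stack around the call at 0x401054 (cdecl).
# ESP_START is an arbitrary, illustrative starting stack pointer.
ESP_START = 0x1000


def simulate_call(pushes, return_addr):
    """Push 4-byte values in program order, then the return address.

    Returns (esp, memory) as seen on entry to the callee.
    """
    esp = ESP_START
    memory = {}
    for value in list(pushes) + [return_addr]:
        esp -= 4              # the stack grows toward lower addresses
        memory[esp] = value
    return esp, memory


# push 79h ; push [ebp+var_C] (= 0x40107C) ; call sub_4011E6
esp, mem = simulate_call([0x79, 0x40107C], return_addr=0x401054 + 5)

first_arg = mem[esp + 4]      # pushed last, so it sits next to the return address
second_arg = mem[esp + 8]
print(hex(first_arg), hex(second_arg))
```

On entry to the callee, the first argument sits at esp+4 and the second at esp+8, which is why the value pushed last is the first parameter.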

So far, nothing we control is fed into the de-obfuscation routine. There presumably needs to be some form of user input, so let's keep looking.

Going to the next code block, we see that the same section of assembly code is modified with xor and add instructions before it is passed to sub_4011E6.


Screen Shot 5: First De-Obfuscation Routine

The xor value is held in dl, which is loaded from [ebp+buf]. This looks like it might be our user input. Tracing this further up the function, we see that buf is passed as a parameter into sub_401121.


Screen Shot 6: sub_401121 Function Call

Taking a quick peek inside that function, we see that [ebp+buf] is set at 0x4011BC.


Screen Shot 7: Call to recv

The rest of the function just spins up a server which waits to receive this input.

That looks like all we need!

Summarizing:

  • Found success text at: 0x4010FE
  • Comparison check at: 0x40105E
    • Must match 0xFB5E to take the success route
  • Obfuscated code starting point: 0x40107C
  • Length of obfuscated code: 0x79
  • Second level de-obfuscation function called at: 0x401054
  • First round of de-obfuscation from 0x401029 to 0x401045
  • User input happens in the function called at: 0x401015
    • User input is from the network
    • Length of the input buffer is 0x4

Solving

Before we can start scripting our solution, we need to extract the obfuscated bytes.

Here we extract the bytes using IDAPython:

with open('greek_to_me_buffer.asm', 'wb') as f:
  f.write(idaapi.get_many_bytes(0x40107C, 0x79))

Code Snippet 2: Copy Bytes Out Using IDA

Now we can move onto the fun part. Scripting!

User Input

We know that 4 bytes are received from the network, but the astute reader will have noticed that the value of buf is moved into dl and not edx. Since dl is one byte in size, the actual range of values is 0x0 to 0xff. The start of our script will look something like this.

for buf in xrange(0x100):
    print("Using {0}".format(buf))

Code Snippet 3: Loop Through Possible Input Values
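To see why only 256 candidates are needed, here is a small sketch of what ends up in dl when a 4-byte buffer is received. It is written for Python 3, unlike the Python 2 snippets in this post, and `low_byte` is a hypothetical helper name.

```python
import struct


def low_byte(received):
    """Return the value dl would hold for a 4-byte receive buffer."""
    (value,) = struct.unpack('<I', received)  # x86 dwords are little-endian
    return value & 0xFF                       # dl is the low 8 bits of edx


# Only the first byte in memory matters; the other three are ignored.
print(hex(low_byte(b'\x41\x42\x43\x44')))

candidates = range(0x100)                     # every value dl can take
```

Whatever the remaining three bytes are, the xor key is the same, so the search space is a single byte.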

Brute Force

Next we need to modify the bytes extracted to disk to get the comparison check to pass. This involves passing them through the two de-obfuscation routines.

De-obfuscation Step 1

For the first de-obfuscation routine (0x401039) we can easily write the "decoder" in Python.

    # Variable to store the bytes written to disk using IDA
    asm = None
    # Store the output from the first de-obfuscation routine
    b2 = []
    # Read in bytes written to file from IDA
    with open('greek_to_me_buffer.asm', 'rb') as f:
        asm = f.read()

    # Re-implement loc_401039
    dl = buf
    for byte in asm:
        bl = ord(byte)
        bl = bl ^ dl
        bl = bl & 0xff
        bl = bl + 0x22
        bl = bl & 0xff
        b2.append(bl)

Code Snippet 4: First De-Obfuscation Routine

Remember this should be kept inside the for loop block.
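For readers on Python 3, the same xor-then-add transform can be written as a one-liner over a bytes object. This is a sketch rather than the script used for the challenge: `deobfuscate_step1` and its test-only inverse `obfuscate_step1` are hypothetical names.

```python
def deobfuscate_step1(blob, key):
    """Apply loc_401039 to every byte: xor with the key, then add 0x22 (mod 256)."""
    return bytes(((b ^ key) + 0x22) & 0xFF for b in blob)


def obfuscate_step1(plain, key):
    """Inverse transform, only used here to check the round trip."""
    return bytes(((b - 0x22) & 0xFF) ^ key for b in plain)


sample = bytes(range(256))
for key in range(0x100):
    # Every key must round-trip, otherwise the masking is wrong somewhere.
    assert deobfuscate_step1(obfuscate_step1(sample, key), key) == sample

print(deobfuscate_step1(b'\x00\xff', 0x10).hex())
```

The `& 0xFF` mask matters because the add can overflow a byte: 0xff ^ 0x10 is 0xef, and 0xef + 0x22 wraps around to 0x11.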

De-obfuscation Step 2

With Angr's[1] help we will work our way through the second de-obfuscation routine (0x4011E6). It kind of feels like cheating, but who wants to re-write the routine in Python or C!

Declare an angr project instance just prior to the for loop, so that it won't be re-created each time the for loop is executed.

import angr

p = angr.Project('greek_to_me.exe', load_options={'auto_load_libs': False})

Code Snippet 5: Angr Project

Next, set up Angr to simulate sub_4011E6. Unlike the project, this part needs to be inside our for loop.

    # Set up angr to "run" sub_4011E6 
    s = p.factory.blank_state(addr=0x4011E6)
    s.mem[s.regs.esp+4:].dword = 1    # Angr memory location to hold the xor'ed and add'ed bytes
    s.mem[s.regs.esp+8:].dword = 0x79 # Length of ASM

    # Copy bytes output from loc_401039 into address 0x1 so Angr can run it
    asm = ''.join(map(lambda x: chr(x), b2))
    s.memory.store(1, s.se.BVV(int(asm.encode('hex'), 16), 0x79 * 8 ))

    # Create a simulation manager...
    simgr = p.factory.simulation_manager(s)

    # Tell Angr where to go, though there is only one way through this function, 
    # we just need to stop after ax is set
    simgr.explore(find=0x401268)

Code Snippet 6: Angr Simulating Function
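The `int(asm.encode('hex'), 16)` idiom in Code Snippet 6 only works on Python 2. Below is a sketch of the same byte-to-integer packing on Python 3 using `int.from_bytes`. The helper name `bytes_to_bvv_args` is made up, and the sample bytes are just the encodings of the first two instructions of the decoded listing.

```python
def bytes_to_bvv_args(data):
    """Return (value, bit_width) describing data as one big-endian integer."""
    return int.from_bytes(data, 'big'), len(data) * 8


# mov bl, 0x65 ; mov byte ptr [ebp - 0x2b], bl
sample = bytes([0xB3, 0x65, 0x88, 0x5D, 0xD5])
value, width = bytes_to_bvv_args(sample)

# Converting back with the same width reproduces the bytes in order.
assert value.to_bytes(width // 8, 'big') == sample

# Leading zero bytes vanish from the integer itself, so the explicit
# width is what preserves them.
zero_led, zero_width = bytes_to_bvv_args(b'\x00\x01')
print(zero_led, zero_width)
```

This is why the bitvector is created with an explicit width of 0x79 * 8 bits: the integer alone does not remember how many bytes it came from.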

While I realize that using Angr here might be overkill, it was the newest tool in my belt so all problems had to be solved using Angr in some fashion ;-)

Input Validation Check

Next we need to check the value left in ax to see if it matches 0xfb5e.

    # Once ax is set, check to see if the value in ax matches the comparison value
    for found in simgr.found:
        ax = found.state.solver.eval(found.state.regs.ax)
        print(hex(ax))
        # Comparison check (compare integers, since hex() of a Python 2 long
        # carries a trailing 'L')
        if ax == 0xfb5e:
            # Will cover what to do here in the next section
            pass

Code Snippet 7: Check Angr ax Result
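Putting the pieces together, the overall search has the shape sketched below in Python 3. The second routine is passed in as a callable, so nothing here depends on Angr. `find_keys` and `toy_checksum` are hypothetical names, and `toy_checksum` is a deliberately simple stand-in, not a re-implementation of sub_4011E6.

```python
def find_keys(blob, second_routine, target):
    """Try every one-byte key and keep those whose result matches target."""
    hits = []
    for key in range(0x100):
        # First routine (loc_401039): xor with the key, add 0x22, keep one byte.
        decoded = bytes(((b ^ key) + 0x22) & 0xFF for b in blob)
        # Second routine: whatever produces the 16-bit value compared at 0x40105E.
        if second_routine(decoded) == target:
            hits.append((key, decoded))
    return hits


def toy_checksum(data):
    """Stand-in for the real routine: a plain 16-bit sum of the bytes."""
    return sum(data) & 0xFFFF


# Build a test blob by inverting the first routine with a known key.
plain = b'example payload'
known_key = 0x5A
blob = bytes(((b - 0x22) & 0xFF) ^ known_key for b in plain)

hits = find_keys(blob, toy_checksum, toy_checksum(plain))
print((known_key, plain) in hits)
```

In the real script, the callable would wrap the Angr simulation from Code Snippet 6 and the target would be 0xFB5E. A weak stand-in like the plain sum can match for more than one key, which is why the function returns a list.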

De-Obfuscated Code

Now that we have a match, we end up with this when printing it to the screen.

�e�]��E�t�_�U��E�t�E�u�U��E�b�E�r�E�u�E�t�]߈U��E�f�E�o�E�r�E�c�]��E�@�E�f�E�l�E�a�E�r�]��E�-�E�o�E�n�E�.�E�c�E�o�E�m�E�

Code Snippet 8: Illegible Bytes

Given where these bytes came from, our guess is that this is assembly code. Using Capstone, we can disassemble it.

from capstone import *
md = Cs(CS_ARCH_X86, CS_MODE_32)
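# `code` holds the de-obfuscated bytes that passed the comparison check;
# 0x1000 is just an arbitrary base address for the listing
for i in md.disasm(code, 0x1000):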
    print("0x%x\t%s\t%s" %(i.address, i.mnemonic, i.op_str))

Code Snippet 9: Disassemble Code

Running our script again, we confirm that it is assembly code and the values being added to the buffer appear to be ASCII characters.

0x1000	mov	bl, 0x65	None
0x1002	mov	byte ptr [ebp - 0x2b], bl
0x1005	mov	byte ptr [ebp - 0x2a], 0x74
0x1009	mov	dl, 0x5f	None
0x100b	mov	byte ptr [ebp - 0x29], dl
0x100e	mov	byte ptr [ebp - 0x28], 0x74
0x1012	mov	byte ptr [ebp - 0x27], 0x75
0x1016	mov	byte ptr [ebp - 0x26], dl
0x1019	mov	byte ptr [ebp - 0x25], 0x62
0x101d	mov	byte ptr [ebp - 0x24], 0x72
0x1021	mov	byte ptr [ebp - 0x23], 0x75
0x1025	mov	byte ptr [ebp - 0x22], 0x74
0x1029	mov	byte ptr [ebp - 0x21], bl
0x102c	mov	byte ptr [ebp - 0x20], dl
0x102f	mov	byte ptr [ebp - 0x1f], 0x66
0x1033	mov	byte ptr [ebp - 0x1e], 0x6f
0x1037	mov	byte ptr [ebp - 0x1d], 0x72
0x103b	mov	byte ptr [ebp - 0x1c], 0x63
0x103f	mov	byte ptr [ebp - 0x1b], bl
0x1042	mov	byte ptr [ebp - 0x1a], 0x40
0x1046	mov	byte ptr [ebp - 0x19], 0x66
0x104a	mov	byte ptr [ebp - 0x18], 0x6c
0x104e	mov	byte ptr [ebp - 0x17], 0x61
0x1052	mov	byte ptr [ebp - 0x16], 0x72
0x1056	mov	byte ptr [ebp - 0x15], bl
0x1059	mov	byte ptr [ebp - 0x14], 0x2d
0x105d	mov	byte ptr [ebp - 0x13], 0x6f
0x1061	mov	byte ptr [ebp - 0x12], 0x6e
0x1065	mov	byte ptr [ebp - 0x11], 0x2e
0x1069	mov	byte ptr [ebp - 0x10], 0x63
0x106d	mov	byte ptr [ebp - 0xf], 0x6f
0x1071	mov	byte ptr [ebp - 0xe], 0x6d
0x1075	mov	byte ptr [ebp - 0xd], 0	

Code Snippet 10: Disassembled Code
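As a quick check that those immediates really are ASCII, the first few lines of the listing can be decoded with nothing but the standard library. This is a Python 3 sketch that works on the printed text rather than on Capstone objects. `decode_listing` is a hypothetical helper, and the addresses are omitted for brevity.

```python
import re

# First eight instructions of the listing, without the address column.
LISTING = """\
mov bl, 0x65
mov byte ptr [ebp - 0x2b], bl
mov byte ptr [ebp - 0x2a], 0x74
mov dl, 0x5f
mov byte ptr [ebp - 0x29], dl
mov byte ptr [ebp - 0x28], 0x74
mov byte ptr [ebp - 0x27], 0x75
mov byte ptr [ebp - 0x26], dl
"""


def decode_listing(text):
    """Track register loads and collect the characters stored to the stack."""
    regs = {}
    chars = []
    for line in text.splitlines():
        match = re.match(r'mov\s+(.+),\s*(\S+)$', line.strip())
        if not match:
            continue
        dest, src = match.group(1), match.group(2)
        # The source is either a register loaded earlier or an immediate.
        value = regs[src] if src in regs else int(src, 0)
        if dest.startswith('byte ptr'):
            chars.append(chr(value))   # a store into the flag buffer
        else:
            regs[dest] = value         # a register load such as mov bl, 0x65
    return ''.join(chars)


print(decode_listing(LISTING))
```

The stores happen in increasing address order, so reading them top to bottom gives the string in the right order.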

Embedded ASCII

While we could do the hex to ASCII mapping manually, why not get the script to do the work for us? Modifying the for loop, we get:

bl = None
dl = None
flag = []
# Using capstone, interpret the ASM
from capstone import *
md = Cs(CS_ARCH_X86, CS_MODE_32)
for i in md.disasm(code, 0x1000):
    flag_char = None
    # The if statements do the work of interpreting the ASCII codes to their value counterpart
    if i.op_str.split(',')[0].startswith("byte ptr"):
        flag_char = chr(long(i.op_str.split(',')[1], 16))
    if i.op_str.split(',')[0].startswith('bl'):
        bl = chr(long(i.op_str.split(',')[1], 16))
    if i.op_str.split(',')[0].startswith('dl'):
        dl = chr(long(i.op_str.split(',')[1], 16))
    if i.op_str.split(',')[1].strip() == 'dl':
        flag_char = dl
    if i.op_str.split(',')[1].strip() == 'bl':
        flag_char = bl

    if (flag_char):
        flag.append(flag_char.strip())

    print("0x%x\t%s\t%s\t%s" %(i.address, i.mnemonic, i.op_str, flag_char))

print(''.join(flag))

Code Snippet 11: Code to Print the Flag

Finally, running the script we get the flag!

et_tu_brute_force@flare-on.com

Code Snippet 12: Flag

Conclusion

Overall this was a fun challenge to solve statically, and a chance to use Angr and Capstone for the first time in a CTF. I originally solved this challenge using Angr 6; however, the script here targets Angr 7, as Angr was updated in the middle of the CTF. I'll update this post if I go back and solve it using the unicorn-engine instead of Angr.

The complete script can be found here [2].

Below is an Asciinema[3] recording for those that just want to see the script in action; though it is a lot more fun to run it yourself!


  1. angr

  2. Snippet

  3. Asciinema
