Cheatsheet - Resolving Windows API Calls

Synopsis

I recently came across a sample that obfuscated its Windows API calls, making static analysis challenging. I know; obfuscated Windows functions are nothing new. This article's goal is to outline a method that may or may not work to assist with de-obfuscation. It all hinges on how the obfuscation is implemented.

Steps

The steps outlined in this article strive to focus on methodology versus analyzing a particular sample; however, we'll be using an example as well so the delineation might get a little blurry. Let's get started!

Obfuscated Windows API Clues

Using IDA or your favorite PE viewer, check whether the list of imports is reasonably long. A short list of imported functions is one clue that the binary probably resolves Windows API calls at run time (the presence of LoadLibrary and GetProcAddress is another, but those are not always there).

Next we'll want to look through the code (using IDA in this case) to find calls or jumps where the destination is held in a register and is not resolved to a Windows API call by the disassembler.

Use the binary search feature (Alt + b) in IDA to look for the following.

  • jmp eax -> FF E0
  • call eax -> FF D0
  • jmp edi -> FF E7
  • call edi -> FF D7

If these turn up nothing, expand the search criteria to include additional registers.
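To expand the search, the byte patterns for the other registers follow the same scheme as the four above: the second byte is 0xE0 (jmp) or 0xD0 (call) plus the register number. The sketch below is one way to generate all sixteen patterns and scan a raw byte buffer for them. The function names are made up for illustration, and a plain byte search like this can also hit data or the middle of a longer instruction, so treat each hit as a lead to confirm in IDA.

```python
# x86 register numbers as encoded in the low three bits of the ModRM byte.
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def indirect_branch_patterns():
    """Map 'jmp reg' / 'call reg' to its two-byte encoding for each 32-bit register."""
    patterns = {}
    for number, reg in enumerate(REGISTERS):
        patterns["jmp " + reg] = bytes([0xFF, 0xE0 | number])   # FF /4, register form
        patterns["call " + reg] = bytes([0xFF, 0xD0 | number])  # FF /2, register form
    return patterns


def find_indirect_branches(code):
    """Return sorted (offset, mnemonic) pairs for every pattern found in a bytes buffer."""
    hits = []
    for mnemonic, pattern in indirect_branch_patterns().items():
        start = code.find(pattern)
        while start != -1:
            hits.append((start, mnemonic))
            start = code.find(pattern, start + 1)
    return sorted(hits)
```

The generated values can be pasted straight into IDA's binary search, for example FF E1 for jmp ecx.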

In our example, FF E0 did the trick: we found one instance. Checking the cross-references (XRefs) to that location turns up 72 references. Looking closely at these, a pattern emerges:

mov eax, dword_<address>
jmp loc_<address>
jmp eax

Code Snippet 1 - Windows API Calling Convention

Here is a screenshot of one.


Screenshot 1 - Windows API Calling Convention Example in IDA

Next we check the dword's value to see what it contains. IDA shows us nothing useful when looking at the dword's value (see Screenshot 2 below).

Based upon our observations, the value stored at dword_<address> appears to be dynamically resolved at run time just prior to the code's execution switching to that location.
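The three-instruction pattern from Code Snippet 1 can also be located mechanically. The sketch below scans a raw code buffer for it and returns the address of each stub together with the dword it reads. It rests on assumptions not confirmed by the sample: that mov eax, [addr] is encoded as A1 imm32 and that the intermediate jmp is a near jump (E9 rel32). A compiler could just as well emit 8B 05 for the mov or a short EB jump, which this sketch would miss. The function name and the addresses in the usage example are hypothetical.

```python
import struct


def find_api_stubs(code, base):
    """Return (stub_address, dword_address) for each 'mov eax, [dword]; jmp X' where X is 'jmp eax'.

    code: raw bytes of a code region; base: virtual address of code[0].
    """
    stubs = []
    for i in range(len(code) - 9):
        # A1 imm32 = mov eax, [imm32]; E9 rel32 = near jmp
        if code[i] != 0xA1 or code[i + 5] != 0xE9:
            continue
        (dword_addr,) = struct.unpack_from("<I", code, i + 1)
        (rel,) = struct.unpack_from("<i", code, i + 6)
        target = i + 10 + rel  # rel32 is relative to the end of the jmp
        if 0 <= target <= len(code) - 2 and code[target:target + 2] == b"\xff\xe0":
            stubs.append((base + i, dword_addr))
    return stubs
```

Usage, with made-up addresses: a stub at offset 0 that reads 0x0040A010 and jumps forward to a jmp eax at offset 16.

```python
stub = b"\xa1" + struct.pack("<I", 0x0040A010) + b"\xe9" + struct.pack("<i", 6)
code = stub + b"\x90" * 6 + b"\xff\xe0"
print(find_api_stubs(code, 0x401000))
```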

Hypothesis

These dwords hold Windows API function addresses. Not only that, but we can get the malware to de-obfuscate these calls in a debugger and map them into IDA.

Additional Observations

The following observations along with some deductions further support our theory. And some simply make proving it easier.

  1. All the dword_<address> appear to be global variables
    1. Thus the memory location used will be the same in the debugger as in the disassembler (provided the image loads at the same base address).
    2. Also the address written to those locations should correlate to the same Windows API function every single time unless the malware author did something really crazy.
  2. Looking at each dword_<address> address, they seem to be located in a contiguous memory block in the .data section
    1. One place to look in the debugger
  3. The size of each is 4 bytes
    1. Address length on a 32-bit machine
So we should be able to let the debugger de-obfuscate the Windows API calls and map them back into IDA based upon the address holding the Windows API function's address.
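Because the slots are contiguous and 4 bytes each, a raw copy of that memory region can be read as a simple table. A minimal sketch, assuming the block has been saved as raw bytes and that we know the address of its first slot (the function name and the example values are hypothetical):

```python
import struct


def read_pointer_table(raw, base):
    """Split a raw memory dump into (slot_address, stored_value) pairs of little-endian dwords."""
    count = len(raw) // 4
    values = struct.unpack_from("<%dI" % count, raw)
    return [(base + 4 * i, value) for i, value in enumerate(values)]
```

Each slot address in the result is the same address IDA shows as dword_<address>, which is what lets us carry names from the debugger back to the disassembly.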

Proof Part One - Debugger De-Obfuscation

  1. Take an address of a dword_<address> value
    • Screenshot 2 - List of Global Variable Dwords
    • Pick one towards the bottom, on the assumption that it will be one of the last ones resolved
    • Confirm that it is used to assign a value to eax before a jmp eax to ensure we have not scrolled too far down the list
      • Screenshot 3 - Confirming the use of the dword
      • Recall from Screenshot 1 that jmp loc_4066C0 goes to a jmp eax
  2. In x64dbg, set a Hardware, Write -> Dword breakpoint at that location in the dump window.
    • Screenshot 4 - Setting hardware breakpoint
  3. Run the program (F9), letting it reach that breakpoint
    • Screenshot 5 - Execution breaks on write
    • Notice it breaks a few instructions after a call to GetProcAddress
    • The value in the dump window has also changed
      • Screenshot 6 - Dump window when the breakpoint is hit
  4. Run until the binary has finished resolving all of the API addresses
    • In this example it appears that the purpose of this function is to resolve API calls, so we let it run until the function exits (Ctrl + F9).
      • Screenshot 7 - The dump window once the function returned
  5. In the Dump Window, change the view to Address.
    • Screenshot 8 - Adjusting the Dump window's view
    • And the Windows function names are shown!
    • Screenshot 9 - Dump window in Address View mode

Voilà, the first half of our hypothesis is correct and we see all of the dynamically resolved Windows API calls contiguously listed.

Proof Part Two - IDA Pro Scripting

Great! Now for the second half of the hypothesis, getting that information into IDA.

Enter IDAPython.

Before we can start the scripting process, we need to copy the contents of x64dbg's dump window into a text file. Scroll up to find the start of the list, highlight all of the entries, and click Select lines from the context menu.


Screenshot 10 - Copy dump window

Paste this into a text file. Make sure there are no duplicate names. For each duplicate add a suffix to make it unique. In our example we need to do this for htonl. Now we are ready for IDAPython!
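For a long list, the suffixing can be scripted instead of done by hand. A minimal sketch, assuming the names have already been pulled out of the third column into a list; the function name and the _1, _2 suffix style are my own choices, not something the sample or IDA requires:

```python
def make_unique(names):
    """Return names in order, appending _1, _2, ... to repeats so every entry is distinct."""
    counters = {}
    used = set()
    unique = []
    for name in names:
        candidate = name
        # Keep bumping the suffix until the candidate has not been handed out yet
        while candidate in used:
            counters[name] = counters.get(name, 0) + 1
            candidate = "%s_%d" % (name, counters[name])
        used.add(candidate)
        unique.append(candidate)
    return unique
```

The first occurrence keeps its original name, so only the repeats get a suffix.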

The goal here is to rename each dword in IDA to the value in the third (comment) column, using the address in the first column as the key. The middle column can be ignored.

Here is the script. I won't explain it line by line, but it is commented inline.

import idc

# Uses the older idc names (Name, MakeName, Jump, ItemSize, MakeDword).
# Newer IDA releases (7.x) name these get_name, set_name, jumpto,
# get_item_size and create_dword.

# Function to do the actual renaming of the dword
def rename_global_dword(ea, new_name):
  print('Old Name %s' % idc.Name(ea))
  idc.MakeName(ea, 'dw_' + new_name)
  print('New Name %s' % idc.Name(ea))

# Iterate through each line of the text file
with open('decoded_api_calls.txt', 'r') as f:
  for line in f:
    columns = line.split()

    # Skip blank or incomplete lines
    if len(columns) < 3:
      continue

    # Get the address from the first column
    ea = int(columns[0], 16)

    # Go to that memory location in the UI
    idc.Jump(ea)

    # Get the Comment (DLL + Function Name) from the third column
    api_name = columns[2]

    # Make sure the size at that location is a dword
    if idc.ItemSize(ea) < 4:
      print('Making 0x%X a dword' % ea)
      idc.MakeDword(ea)

    # Call our custom function to rename the dword
    rename_global_dword(ea, api_name)

Code Snippet 2 - IDAPython script renaming the dword variables.

The values in IDA now look like this after running the script.


Screenshot 11 - Dwords renamed

Done!!

Conclusion

Today we saw how to let the debugger do the work to de-obfuscate Windows API calls and take that output and rename variables in IDA Pro to assist with static analysis.

This method works due to how the Windows API calls were obfuscated in this piece of malware. This will not work for all obfuscation techniques.

Also, don't worry, I have not forgotten about our Virus Share: Random Sample #1 series. We'll be resuming that soon!

Appendix

IDAPython Function Calls Used

  • ItemSize : The size in bytes of the item
  • Jump : Go to that location in IDA Pro's UI
  • MakeDword : Converts size of the value to a dword (4 bytes)
  • MakeName : For our case, used to rename the dword_<address> names
  • Name : The current name of the variable at that location
