xorhex

Focus on Threat Research Things.

Day 7 of 100 Discontiguous Days of YARA

Improving YARA writing skills by writing more YARA rules.

xorhex

#100DaysOfDiscontiguousDaysOfYARA

Summary

I’m partaking in Greg’s #100DaysOfYARA, but to be honest it’s more likely to be #100DiscontiguousDaysOfYARA for me, if I make it that far.

I doubt that the rules shared these 100 days will contain any truly original ideas, but I’d still like to share what I’ve learned.

Day 7

This is the second post in a series of at least 3 (maybe 4) parts targeting RAR 4.x SFX files. Yes, I’m absolutely stretching my #100DaysOfYARA content.

The 4 parts are:

  1. Identifying RAR 4.x file headers <- Prior article
  2. Targeting RAR 4.x embedded file names <- This article
  3. Targeting RAR 4.x embedded file content (non-compressed)
  4. Targeting RAR 4.x sfx script content (maybe)

The Rule

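import "math"

rule sfx_4x_file_filename {
    meta:
        author = "xorhex"
        credit = "Credit for $chk goes to @notareverser"
    strings:
        // RAR 4.x archive signature: "Rar!" 1A 07 00
        $rar4x = { 52 61 72 21 1A 07 00 }
        // File header block, matched from header_type (74) through FileAttr (20 00 00 00)
        $file_header = { 74 [12] 02 [8] (0f | 14 | 1A | 1D | 24 | 30 ) ( 30 | 31 | 32 | 33 | 34 | 35) [2] 20 00 00 00 }
        // x86 cmp/jnz sequence testing buffer bytes 1..5 against 'a', 'r', '!', 1A, 07
        $chk = {80 7? 01 61 75 ?? 80 7? 02 72 75 ?? 80 7? 03 21 75 ?? 80 7? 04 1a 75 ?? 80 7? 05 07 75}
        // Embedded file name to hunt for
        $name = "crslm.exe"
    condition:
            $chk
        and
            #rar4x > 0
        and
            // Inspect at most the first 500 file header matches
            for any i in (1..math.min(500, #file_header)) : (
                    // The header must come after the first archive signature
                    @file_header[i] > @rar4x
                and
                    // A $name match must start inside the FileName window
                    $name in (@file_header[i] + 30 .. @file_header[i]+30 + uint16(@file_header[i] + 24 ))
            )
}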

This rule identifies WinRAR SFX files built on RAR version 4.x in which one of the $file_header entries carries an embedded file name matching $name.

The secret to this rule lies in this line:

$name in (@file_header[i] + 30 .. @file_header[i]+30 + uint16(@file_header[i] + 24 )) 

@file_header[i] gives the file offset where the i-th $file_header match starts. Add 30 to @file_header[i] to get the start position of the embedded file name.

The .. operator defines a range from the start position (to the left of the dots) to the end position (to the right of the dots). $name in (...) is true when a $name match begins at an offset inside that range.

The end position is defined as @file_header[i]+30 + uint16(@file_header[i] + 24 ), so the start of the file name plus the length of the file name gives us the ending offset. The file name length (NameSize) is a 2-byte little-endian value found 24 bytes from the start of the $file_header[i] hit. In the $file_header definition, it sits at the only [2] jump in the string.

$file_header = { 74 [12] 02 [8] (0f | 14 | 1A | 1D | 24 | 30 ) ( 30 | 31 | 32 | 33 | 34 | 35) [2] 20 00 00 00 }
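
The same offset math can be sketched outside of YARA. This is a minimal Python sketch, not part of the rule: the regex is a hand translation of $file_header, and the names FILE_HEADER, name_range, name_in_header and fake_header are made up for illustration. fake_header builds a synthetic header, not a real archive.

```python
import re
import struct

# Hand translation of the $file_header hex string into a bytes regex.
# DOTALL lets "." stand in for any byte inside the [12], [8] and [2] jumps.
FILE_HEADER = re.compile(
    rb"\x74.{12}\x02.{8}[\x0f\x14\x1a\x1d\x24\x30][\x30-\x35].{2}\x20\x00\x00\x00",
    re.DOTALL,
)

NAME_SIZE_OFFSET = 24  # the [2] jump: little-endian u16 NameSize
NAME_OFFSET = 30       # first byte after the 4-byte FileAttr


def name_range(data, hit):
    """Return (start, end) of the window the rule searches for $name."""
    (name_size,) = struct.unpack_from("<H", data, hit + NAME_SIZE_OFFSET)
    start = hit + NAME_OFFSET
    return start, start + name_size


def name_in_header(data, hit, name):
    """Mimic `$name in (start..end)`: a match must begin inside the window."""
    start, end = name_range(data, hit)
    pos = data.find(name, start)  # first occurrence at or after start, or -1
    return start <= pos <= end


def fake_header(name):
    """Build a synthetic header from header_type up to FileName (test data only)."""
    fixed = struct.pack(
        "<BHHIIBIIBBHI",
        0x74,            # header_type
        0x8000,          # header_flags (LHD_LARGE not set)
        32 + len(name),  # header_size
        0, 0,            # PackSize, UnpSize
        2,               # HostOS (Win32)
        0, 0,            # FileCRC, FileTime
        0x1D, 0x30,      # UnpVer, Method
        len(name),       # NameSize
        0x20,            # FileAttr
    )
    return fixed + name


data = b"Rar!\x1a\x07\x00" + fake_header(b"crslm.exe")
hit = FILE_HEADER.search(data).start()
print(hit, name_range(data, hit), name_in_header(data, hit, b"crslm.exe"))
```

The fixed part of the header packs to exactly 30 bytes, which is where the + 30 in the rule comes from.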

Here is the same thing but mapped out visually:

[Image: ImHex file_header matching]

As seen below, the match is between these two locations in the file.

[Image: ImHex YARA Hit]

Caution

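The rule is not perfect: it doesn’t account for the fields below, which are present when the LHD_LARGE flag is set. They add 8 bytes between FileAttr and FileName, so the file name then starts 38 bytes from the start of the $file_header hit rather than 30.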

  if (header_flags.LHD_LARGE == 1){
    u32               HighPackSize;
    u32               HighUnpSize;
  }

Here is the full file header struct; notice that the fields above come before FileName:

struct rar_4_file_header {
    u16                 header_crc;
    Header_Type         header_type;
    File_Header_Flags   header_flags;
    u16                 header_size;
    u32                 PackSize;
    u32                 UnpSize;
    OSType              HostOS;
    u32                 FileCRC;
    u32                 FileTime;
    Unpack_Algo         UnpVer;
    PackMethod          Method;
    u16                 NameSize;

    if (HostOS == OSType::Win32) {
        WinFileAttributes   FileAttr;
    }
    else {
        u32                 FileAttr;
    }

    if (header_flags.LHD_LARGE == 1) {
        u32                 HighPackSize;
        u32                 HighUnpSize;
    }

    char                FileName[NameSize];

    if (header_flags.LHD_SALT == 1) {
        u64                 Salt;
    }

    if (header_flags.LHD_EXTTIME == 1) {
        u16                 ExtTime;
    }

    u8                  PackedData[PackSize];
};
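
To make the LHD_LARGE shift concrete, here is a rough Python sketch of the adjusted offset math (it is not a YARA rule). It assumes LHD_LARGE is bit 0x0100 of header_flags, which is the value used in the unrar sources and not something stated in this post. The function name and the test bytes are hypothetical.

```python
import struct

LHD_LARGE = 0x0100  # assumed header_flags bit: HighPackSize/HighUnpSize present


def name_range_large_aware(data, hit):
    """Return (start, end) of FileName for a header starting at header_type."""
    (flags,) = struct.unpack_from("<H", data, hit + 1)       # header_flags
    (name_size,) = struct.unpack_from("<H", data, hit + 24)  # NameSize
    start = hit + 30
    if flags & LHD_LARGE:
        start += 8  # skip HighPackSize + HighUnpSize (two u32 fields)
    return start, start + name_size


# Synthetic header with LHD_LARGE set: 30 fixed bytes, 8 extra bytes, then the name.
fixed = struct.pack(
    "<BHHIIBIIBBHI",
    0x74, 0x8100, 40 + 9, 0, 0, 2, 0, 0, 0x1D, 0x30, 9, 0x20,
)
large = fixed + bytes(8) + b"crslm.exe"
start, end = name_range_large_aware(large, 0)
print(start, end, large[start:end])
```

In YARA terms, the header_flags word sits at @file_header[i] + 1, so the same test could be built from uint16 and a bitwise AND.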

Challenge

Anyone want to try their hand at updating this rule to also handle the case when the LHD_LARGE bit is set?
