Day 7 of 100 Discontiguous Days of YARA
Improving YARA writing skills by writing more YARA rules.
Summary
Partaking in Greg’s #100DaysOfYARA, but to be honest it’s more likely to be #100DiscontiguousDaysOfYARA for me - if I make it that far.
I doubt that the rules shared these 100 days will contain any truly original ideas, but I’d still like to share what I’ve learned.
Day 7
This is the second part of a 3 (maybe 4) part series targeting RAR 4.x SFX files. Yes, I’m absolutely stretching my #100DaysOfYARA content.
The 4 parts are:
- Identifying RAR 4.x file headers <- Prior article
- Targeting RAR 4.x embedded file names <- This Article
- Targeting RAR 4.x embedded file content (non-compressed)
- Targeting RAR 4.x sfx script content (maybe)
The Rule
import "math"
rule sfx_4x_file_filename {
    meta:
        author = "xorhex"
        credit = "Credit for $chk goes to @notareverser"
    strings:
        $rar4x = { 52 61 72 21 1A 07 00 }
        $file_header = { 74 [12] 02 [8] (0f | 14 | 1A | 1D | 24 | 30 ) ( 30 | 31 | 32 | 33 | 34 | 35) [2] 20 00 00 00 }
        $chk = {80 7? 01 61 75 ?? 80 7? 02 72 75 ?? 80 7? 03 21 75 ?? 80 7? 04 1a 75 ?? 80 7? 05 07 75}
        $name = "crslm.exe"
    condition:
        $chk
        and
        #rar4x > 0
        and
        for any i in (1..math.min(500, #file_header)) : (
            @file_header[i] > @rar4x
            and
            $name in (@file_header[i] + 30 .. @file_header[i]+30 + uint16(@file_header[i] + 24 ))
        )
}
This rule identifies WinRAR SFX files built on RAR version 4.x where the embedded file name matches $name in one of the $file_header entries.
The secret with this rule lies in this:
$name in (@file_header[i] + 30 .. @file_header[i]+30 + uint16(@file_header[i] + 24 ))
@file_header[i] gives the offset of the start of the i-th $file_header match. Add 30 to @file_header[i] to get the start position of the embedded file name.
The .. defines a range: from the start position (to the left of the dots) to the end position (to the right of the dots), check whether a $name match begins inside this range.
The end position is defined as @file_header[i]+30 + uint16(@file_header[i] + 24 ), so the start of the match plus 30 plus the length of the file name gives the ending offset. The file name length is a 16-bit little-endian value found 24 bytes from the start of the $file_header match. Looking at the $file_header definition, the length sits at the only [2] jump in the string definition.
$file_header = { 74 [12] 02 [8] (0f | 14 | 1A | 1D | 24 | 30 ) ( 30 | 31 | 32 | 33 | 34 | 35) [2] 20 00 00 00 }
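To double-check the offset arithmetic outside of YARA, here is a minimal Python sketch. It builds a synthetic header that starts at the 0x74 byte and mimics the range check from the condition. The helper names (build_file_header, name_window, name_in_window) are made up for illustration and the header bytes are invented test data, not taken from a real sample.

```python
import struct
from typing import Tuple

NAME_SIZE_OFFSET = 24  # NameSize, counted from the 0x74 header-type byte
FILE_NAME_OFFSET = 30  # FileName, counted from the 0x74 header-type byte


def build_file_header(name: bytes) -> bytes:
    """Build a synthetic file header starting at the 0x74 byte (hypothetical helper)."""
    return (
        b"\x74"                         # header_type: file header
        + b"\x00" * 12                  # flags, header_size, PackSize, UnpSize -> [12]
        + b"\x02"                       # HostOS: Win32
        + b"\x00" * 8                   # FileCRC, FileTime -> [8]
        + b"\x1d"                       # UnpVer
        + b"\x33"                       # Method
        + struct.pack("<H", len(name))  # NameSize, little-endian u16 -> [2]
        + b"\x20\x00\x00\x00"           # FileAttr
        + name                          # FileName
    )


def name_window(data: bytes, header_at: int) -> Tuple[int, int]:
    """Return the (start, end) offsets used by the rule's range expression."""
    (name_size,) = struct.unpack_from("<H", data, header_at + NAME_SIZE_OFFSET)
    start = header_at + FILE_NAME_OFFSET
    return start, start + name_size


def name_in_window(data: bytes, header_at: int, name: bytes) -> bool:
    """Mimic `$name in (start..end)`: a match of name must begin inside the window."""
    start, end = name_window(data, header_at)
    pos = data.find(name, start)
    return pos != -1 and pos <= end


if __name__ == "__main__":
    data = b"\x00" * 7 + build_file_header(b"crslm.exe")
    print(name_window(data, 7))
    print(name_in_window(data, 7, b"crslm.exe"))
```

The constants 24 and 30 fall out of counting the bytes in the hex string: 1 + 12 + 1 + 8 + 1 + 1 = 24 bytes precede the [2] jump, and the 2-byte length plus the 4-byte attribute field bring the file name to offset 30.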
Here is the same thing but mapped out visually:
As seen below, the match is between these two locations in the file.
Caution
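The rule is not perfect: it doesn’t take into account the two extra fields that are present when the LHD_LARGE flag is set. They add 8 bytes in front of the file name, so the + 30 offset no longer points at it.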
if (header_flags.LHD_LARGE == 1){
    u32 HighPackSize;
    u32 HighUnpSize;
}
Here is the full file header struct. Notice that the above fields come before FileName:
struct rar_4_file_header{
    u16 header_crc;
    Header_Type header_type;
    File_Header_Flags header_flags;
    u16 header_size;
    u32 PackSize;
    u32 UnpSize;
    OSType HostOS;
    u32 FileCRC;
    u32 FileTime;
    Unpack_Algo UnpVer;
    PackMethod Method;
    u16 NameSize;
    if (HostOS == OSType::Win32){
        WinFileAttributes FileAttr;
    }
    else {
        u32 FileAttr;
    }
    if (header_flags.LHD_LARGE == 1){
        u32 HighPackSize;
        u32 HighUnpSize;
    }
    char FileName[NameSize];
    if (header_flags.LHD_SALT == 1){
        u64 Salt;
    }
    if (header_flags.LHD_EXTTIME == 1){
        u16 ExtTime;
    }
    u8 PackedData[PackSize];
};
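To see how LHD_LARGE moves the file name, here is a rough Python sketch (not YARA) that unpacks the fixed part of the struct and skips the two optional u32 fields when the flag is set. It assumes LHD_LARGE is bit 0x0100 of header_flags, which is the value I recall from the unrar sources, so verify it against your own reference. The build_header helper is hypothetical and only produces test data.

```python
import struct

# Assumption: LHD_LARGE is bit 0x0100 of header_flags (value recalled from the unrar sources).
LHD_LARGE = 0x0100

# Fixed part of the header, starting at header_crc (little-endian, no padding):
# crc(u16) type(u8) flags(u16) size(u16) PackSize(u32) UnpSize(u32) HostOS(u8)
# FileCRC(u32) FileTime(u32) UnpVer(u8) Method(u8) NameSize(u16) FileAttr(u32)
FIXED = struct.Struct("<HBHHIIBIIBBHI")


def name_offset_from_type_byte(flags: int) -> int:
    """Offset of FileName counted from the 0x74 byte, as the YARA rule counts it."""
    offset = FIXED.size - 2  # drop the 2-byte header_crc that precedes 0x74
    if flags & LHD_LARGE:
        offset += 8          # HighPackSize + HighUnpSize
    return offset


def file_name(data: bytes, offset: int = 0) -> bytes:
    """Extract FileName from a header that starts at header_crc."""
    fields = FIXED.unpack_from(data, offset)
    flags, name_size = fields[2], fields[11]
    name_at = offset + 2 + name_offset_from_type_byte(flags)
    return data[name_at:name_at + name_size]


def build_header(name: bytes, large: bool) -> bytes:
    """Build a synthetic header for testing (hypothetical helper, invented values)."""
    flags = LHD_LARGE if large else 0
    fixed = FIXED.pack(0, 0x74, flags, 0, 0, 0, 2, 0, 0, 0x1D, 0x33, len(name), 0x20)
    high = b"\x00" * 8 if large else b""
    return fixed + high + name


if __name__ == "__main__":
    for large in (False, True):
        header = build_header(b"crslm.exe", large)
        print(large, name_offset_from_type_byte(LHD_LARGE if large else 0), file_name(header))
```

Under these assumptions the file name starts 30 bytes after the 0x74 byte without the flag and 38 bytes after it with the flag. Note also that with LHD_LARGE set, the bytes right after NameSize and FileAttr are no longer followed directly by the name, which is the part the YARA rule would need to account for.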
Challenge
Anyone want to try their hand at updating this rule to also handle the case when the LHD_LARGE bit is set?